Scaling Websites (RE: Large sites powered by Java web frameworks and Tiles + WebWork)
Matt Raible is looking for large sites powered by Java web frameworks.
Maybe I am in a cranky mood, but I wouldn’t worry about scaling out the web tier.
You can scale JSF just fine (although I may not like JSF at all ;)
Slashdot does just fine with Perl + memcached.
Your scalability concerns are going to come from your architecture and your caching.
If the bottleneck is in taking the HTTP info in, parsing it into objects, and then doing the opposite on the way back out, then you have done an amazing job scaling the database and the network layers.
Of course we need to test our architectures etc, but I have just seen a PHP application that handles a massive load, with an average architecture :)
It handled everything just fine because of the caching they use.
This is why TheServerSide.com running on EJB etc was/is such a hilarious thing. Tapestry isn’t the bottleneck ;)
And with 64 processors, each with 64 cores, CPU-bound scalability problems probably won’t come up often either!
Use the tech that gets out of your way and feels right to you. This is why Rails is doing so well.
I was on a new Web 2.0 application last night. The Ajax was flowing nicely through this app, and it was sure pretty.
But then I started to notice some weird behaviour. If I added something it showed up fine on one page, but didn’t show up on another. As I navigated around this world I kept seeing inconsistencies from area to area.
I see this from time to time, and normally it smells like aggressive page caching.
I have nothing against caching at the page level. It makes a LOT of sense for many things, as the closer you get to the user, the less work you are repeating.
However, you always pay a price in this balanced world of performance and scalability. In this case, there is a lot more to keep in sync, and a lot of people ignore that side of the equation.
This is why I really like to have a caching layer for my applications that sits closer to the DB than the web page itself. This cache does the hard work of keeping all of the info that I need in sync, and when the data does change, the dynamic web pages automatically pick up that update.
This means that you get a nice balance of all worlds:
- Data is cached closer to the user, yet not too far from the DB
- Access to this data cache runs at close to in-memory speed, so it is very fast
- You have consistent data showing up on all of your pages
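
To make that concrete, here is a minimal sketch of the idea in Python. Everything in it is hypothetical and only for illustration: the `DataCache` class, the two `render_*` functions, and the plain dict standing in for the database are my assumptions, not code from any real app. The point is just that the cache lives at the data level and is invalidated on write, while the pages are rendered fresh from it every time.

```python
class DataCache:
    """Cache-aside layer sitting in front of a (hypothetical) database."""

    def __init__(self, db):
        self._db = db        # stand-in for the real database: a plain dict
        self._entries = {}   # cached values, held in memory
        self.db_reads = 0    # how many times we had to go to the database

    def get(self, key):
        # Only hit the database on a cache miss.
        if key not in self._entries:
            self.db_reads += 1
            self._entries[key] = list(self._db.get(key, []))
        return self._entries[key]

    def add(self, key, item):
        # Write to the database first, then drop the stale cache entry
        # so the next read from any page reloads the fresh data.
        self._db.setdefault(key, []).append(item)
        self._entries.pop(key, None)


def render_list_page(cache, user):
    # Rendered dynamically on every request; only the data is cached.
    return "Items: " + ", ".join(cache.get(user))


def render_summary_page(cache, user):
    return f"{user} has {len(cache.get(user))} item(s)"


db = {"alice": ["first post"]}
cache = DataCache(db)

render_list_page(cache, "alice")      # cache miss: one database read
render_summary_page(cache, "alice")   # served from the cache

cache.add("alice", "second post")     # write invalidates the cached entry

list_page = render_list_page(cache, "alice")        # reloads once
summary_page = render_summary_page(cache, "alice")  # served from the cache
print(list_page)
print(summary_page)
print("database reads:", cache.db_reads)
```

Both pages see the new item straight after the write, and the database was only read twice across four page renders. With page-level caching you would instead have to remember to expire every cached page that happens to show that data.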
As always, this will depend on what you are doing, and it is a tricky balancing act, but let’s try not to just turn on page caching and walk away, expecting everything to Just Work™.