Monday, July 14, 2008

Do you Scale?

Great post by Dare, "Scalability: I Don't Think That Word Means What You Think It Does",
following up on a post from Scott Loganbill at Google about their choice of "scalable" services.

Scalability is a funny question. Every application is designed to handle a certain size of workload - it's usually one of those fundamental sales analysis questions: "And how many orders do you typically process a day? How many people do you have working for you?"

While scalability problems can be due to both technology and design, it's typically design that is the major culprit. (OK, yes, the guy who said he can run a 15,000-transaction-a-day system in Access likely chose the WRONG technology.)

But every time I think of scalability, I think of 37signals' approach to it - in short, "Do you really need 12 servers now if you can run on two for a year?" Note the comments from David Hansson: "... Not worrying too much about scaling doesn’t mean building Basecamp on an Access database with a spaghetti soup of PHP making 250 database calls per request. It simply means having a profitable price per request, chilling the f*ck out, and dealing with problems as they arise."

Actually, their better quote is here. "If you've got a huge number of people overloading your system then huzzah! That's one swell problem to have."

Many of these posts and statements are about large systems designed for mass usage, which don't necessarily face the same issues as the systems that small or even medium-sized businesses need. But still, just thinking about scalability when designing your databases can help in a big way.

I see silly design decisions that limit scalability all the time, and they are generally made for no reason at all. A limit of 2 characters for sequence numbers - what?!? You're never going to get over 99 items on a single order? Regardless of how unrealistic it may seem, chances are someone will hit it.

And then there are those that become requirements: a 5-character code for a customer ID - then the customer comes up and says "I need to use 6." Is that really a scalability issue? Not really - not until you've used up all 26^5 (about 11.9 million) letter-only customer codes - but the customer may perceive it as such. Still, the whole 26^5 argument does show there is a scalability design flaw in that definition. Will it ever show up? Hard to say.
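To put numbers on those limits, here is a minimal Python sketch. The `code_capacity` helper is a name made up for illustration, and the alphabets (digits only, uppercase letters only, letters plus digits) are assumptions about what such a field might allow.

```python
import string


def code_capacity(alphabet_size: int, width: int) -> int:
    """Number of distinct fixed-width codes over an alphabet of the given size."""
    return alphabet_size ** width


# A 2-character numeric sequence field: 00 through 99.
two_digit_sequence = code_capacity(len(string.digits), 2)

# A 5-character customer code using only the 26 uppercase letters.
five_letter_code = code_capacity(len(string.ascii_uppercase), 5)

# The same field widened to 6 characters, or opened up to letters plus digits.
six_letter_code = code_capacity(len(string.ascii_uppercase), 6)
five_alnum_code = code_capacity(len(string.ascii_uppercase + string.digits), 5)

print(f"2-digit sequence numbers: {two_digit_sequence:,}")
print(f"5-letter customer codes:  {five_letter_code:,}")
print(f"6-letter customer codes:  {six_letter_code:,}")
print(f"5-char letters+digits:    {five_alnum_code:,}")
```

The 2-digit field tops out at 100 values (99 if numbering starts at 1), while the 5-letter code has room for 11,881,376 customers. One field is a limit someone will realistically hit; the other is mostly a perceived one.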

Two Real Factors
So consider two different factors when it comes to scalability: hardware uptime/access (can your hardware infrastructure handle a large number of users/transactions?) and design (can your application handle a large number of users/transactions?). The two go hand in hand yet demand different solutions.

The infrastructure factor can play off the design - if your application is designed for small groups but has lots of small groups, you can always put one group onto another server, if need be.
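As a rough sketch of that idea in Python: the `GroupDirectory` class, the group names, and the server names below are all hypothetical, and a real application would keep this mapping in configuration or a lookup table rather than in memory.

```python
class GroupDirectory:
    """Maps each group to the server that hosts its data."""

    def __init__(self, default_server: str) -> None:
        self.default_server = default_server
        self._overrides: dict[str, str] = {}

    def server_for(self, group_id: str) -> str:
        # Groups that were never moved stay on the default server.
        return self._overrides.get(group_id, self.default_server)

    def move_group(self, group_id: str, new_server: str) -> None:
        # Only the routing changes here; copying the group's data is a separate job.
        self._overrides[group_id] = new_server


directory = GroupDirectory(default_server="db-1")
directory.move_group("acme", "db-2")  # one busy group gets its own server

print(directory.server_for("acme"))
print(directory.server_for("globex"))
```

This only works if the design already keeps each group's data separable, which is exactly the point where infrastructure and design meet.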

The design can detrimentally affect the infrastructure. You can have as many servers as you want, but if you constantly pull down lots of records when you only need one, you're asking for headaches.
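Here is a small, hedged illustration of that "lots of records when you only need one" pattern, using Python's built-in `sqlite3` module. The `orders` table, its columns, and the function names are invented for the example.

```python
import sqlite3


def build_demo_db(n_orders: int = 1000) -> sqlite3.Connection:
    """Create an in-memory database with a made-up orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (id, customer, total) VALUES (?, ?, ?)",
        [(i, f"CUST{i % 50:02d}", i * 1.5) for i in range(1, n_orders + 1)],
    )
    return conn


def find_order_wasteful(conn: sqlite3.Connection, order_id: int):
    # Pulls every row into the application, then throws almost all of them away.
    rows = conn.execute("SELECT id, customer, total FROM orders").fetchall()
    for row in rows:
        if row[0] == order_id:
            return row
    return None


def find_order_targeted(conn: sqlite3.Connection, order_id: int):
    # Asks the database for exactly one row, looked up by primary key.
    return conn.execute(
        "SELECT id, customer, total FROM orders WHERE id = ?", (order_id,)
    ).fetchone()


conn = build_demo_db()
# Both functions return the same row; only the amount of data moved differs.
assert find_order_wasteful(conn, 42) == find_order_targeted(conn, 42)
```

With a thousand rows nobody notices the difference. With a few million rows and a network between the application and the database, the first version is the one that has people asking for more servers.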

And One more...
There is one more scalability factor to consider, and it's one that I think many people disregard because it's a design issue, just not an architectural one: user experience scalability. Many applications were (or are) built for single transactions (the first web banking site for my bank only allowed one bill to be paid at a time). Just as much as uptime/downtime can play into it, if your application doesn't make it EASY for users to handle the increase in use, you can just forget about your other scalability problems.

Purists will say "All software should scale" (and scale infinitely). Pragmatists will say "There will always be SOME scalability flaw in any design."

Think of this another way...even scales have scalability problems in their design. Many only go to 300 lbs. With the obesity epidemic these days, I'm not sure that's a limitation that users can overlook.

1 comment:

Eiso Kant said...

Hi Andrew,

Your post on scalability even describes my wrong use of Excel ;-). I enjoy the categorization you make when it comes down to scalability problems and I couldn't agree more that the 37signals quote is often most applicable.

Eiso