Scaling modern web applications
Author: Andrea Rákosfalvy
Date published: 5/11/2021
Time to read: 10 minutes
Horizontal and vertical scaling - which one is the right choice for your app?
When your application is your business and your livelihood, scalability can mean the difference between success and failure. Your application needs to be able to scale up to meet growing demands as well as scale down and save you from overspending when you hit hard times.
Both horizontal and vertical scaling offer a way to handle such fluctuations, and the evolution of technology has led to various types of scalable hosting solutions that accommodate the needs of modern businesses and applications.
In this article series we present vertical and horizontal scaling in depth. While we will also demystify the inner workings of application scaling with Enscale, we hope that the concepts presented within the series will be of use to you regardless of your choice of hosting provider.
Let’s dive in by defining the terms:
Vertical scaling: adding or removing resources within the same server.
Horizontal scaling: adding same-role servers to, or removing them from, your server architecture.
The transportation analogy
Damien, our Operations Director and co-founder of Layershift, famously has a transportation analogy for all aspects of web hosting, so naturally he has one for scaling as well. Here it goes:
Consider your site as an important away game for your favourite football team. Your site visitors are the fans that need to travel to the stadium, and the server is the transportation method to get there.
You - the developer - are president of the official supporters club, responsible for arranging the ideal transportation to the big match!
Depending on how many fans your favourite team has, they might even fit into a single car - maybe they're a non-league side! It's ok, just you and a mate is a cosy fan club. On a good day, maybe you can convince a few more people to come along, so you need a minibus, or even a double decker - fitting more people into a single vehicle is vertical scaling. One vehicle, but a bigger one with the capacity to carry more people.
On the other hand, if your team is top of the European Super League (ahem!), you might have a few more fans trying to get to that important away game. It'd be crazy to squash thousands of people into one vehicle! Instead, you hire several buses or coaches - horizontal scaling. Many vehicles (usually fairly large ones), each carrying lots of people, to give a greater overall capacity than is available in a single vehicle.
Money is tight in your supporters' club, so obviously you wouldn't hire a fleet of coaches if it's only you and Dave going to the match! In the same way, if you were to make multiple trips carrying everyone as a passenger in your own car, there's no way you could drive thousands of people to the match in time for kick-off!
The same considerations apply to scaling your application. Horizontal and vertical scaling techniques both exist, but which to use depends on your circumstances.
Physical vs virtual server scaling
As in the analogy, before virtualisation an application that needed to scale vertically required a different type of “vehicle”: you can only fit so many people in a car, so if you need more space you need a bus instead. Translating this to physical servers: each physical server has a limited capacity, and if you need resources beyond what it offers you need to replace parts of it at best, or otherwise replace the entire thing.
The entire task of purchasing, upgrading and migrating involves a lot of time and effort, and will also mean downtime for your application. To prevent this disruption it’s quite normal to deliberately over-specify hardware that you can grow into over time - but the ROI depends on whether your estimates prove to be correct. Scaling down to cut costs is effectively impossible, as that’d require purchasing new (smaller) hardware (at an additional cost).
Scaling horizontally also requires investing in new infrastructure, setting it up, and configuring your application in a way that works with the new infrastructure.
If you don’t want to purchase your own servers, you could always lease them, but due to the nature of the beast, hosting providers generally require a longer lock-in period or offer these servers at a higher cost. (Here you should also factor in the cost of maintenance, backups, restores, patching etc - cheaper providers generally make you handle all of this.)
With the evolution of virtualisation technology, both vertical and horizontal scaling became a lot easier, and whilst physical scaling is still suitable for certain projects, most modern projects benefit from the flexibility provided by virtualisation.
In the hypervisor-based virtualisation model a bare metal server is split up into several virtual servers (virtual machines, or VMs). Resizing VMs is easier, provided the underlying hardware has spare resources, but it typically still requires a reboot, so it needs to be done sparingly to avoid frequent service disruption. Container-based solutions, on the other hand, can perform these resource allocations dynamically and without requiring restarts, so they can be performed at any time. There’s an excellent article about hypervisor- vs. container-based virtualization by Michael Eder, in case you want to read more.
Both technological advances come in handy for horizontal scaling models as well, but you will need to configure your application so that it can handle the multiple-server setup. We will be going into the practical considerations of horizontal scaling in future posts.
There's a lot of cool engineering content about how the likes of Facebook, Netflix, Shopify and Twitter scale - but the rest of us mere mortals face very different challenges that in turn require different solutions.
As you can see, there are multiple ways to achieve scalability for your application, so how can you choose the right one? In our experience of hosting and scaling applications for well over a decade, there is no universal “best way”. You have to analyse your needs and figure out what's most appropriate for your particular case, but we've put together some guidelines to highlight some of the most important considerations that we hope will help in making this decision for your own app.
What to consider when scaling your application?
The financial impact
Naturally you want your application to be redundant and highly available, but that comes at a cost and adds considerable complexity to your application. Always ask yourself: do I really need it? Aim for minimum viable complexity. To put it in perspective, consider the worst-case scenario and how you’d be able to cope with it.
If that scenario would make your life a living hell and put you under serious financial strain, you probably should do everything you can to avoid it, including paying more for a hosting setup, which means one or more of the following:
- Horizontally scaled servers for redundancy and high availability
- A regularly rehearsed, detailed disaster recovery plan
- Hardened security solutions (WAF, DDoS protection etc).
Each of the above requires careful planning and doesn’t come cheap. While Shopify probably has them all, the investment isn’t worth it for you unless it’s vital for your business.
On the other hand, if the impact of an outage is better described as inconvenient than financially critical, a costly setup will just drain your budget - surely you wouldn’t pay a thousand pounds for hosting Refinery if it’s only for personal use. For that, it’s enough to have an easy way to spin up a new server to deploy your code to, and a fairly recent backup to restore from. If push comes to shove, you can also start from scratch. Sure, it’ll be frustrating, but it wouldn’t be the end of the world.
The app’s requirements (aka. the “numbers”)
The question in the diagram about the number of users is oversimplified, and the definition of many and few is pretty vague, but I assure you there is a perfectly good reason for it.
The question in fact has to do with thread count and concurrency requirements, and we can’t really give an informed number or even a range that is valid for everyone. It depends on a lot of things, such as the complexity of your code, whether processes can be queued without damaging the user experience and so on.
The fewer processes you need to run concurrently, the more likely it is that your app will do perfectly fine on a single server. While vertical scaling also increases the number of concurrent processes you can run (you get more CPU power), this is not always enough. With horizontally scaled servers, on the other hand, processes are split between multiple servers, giving you the concurrency you need.
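To make the concurrency question a little more concrete, here is a rough back-of-the-envelope sketch in Python. It uses Little's law (average concurrency = arrival rate × average time per request) to estimate how many processes are busy at once, then checks how many same-role servers that would take. Every name and number in it (`requests_per_second`, `avg_ms_per_request`, `processes_per_server`) is an illustrative assumption, not a measurement from any real application or hosting platform.

```python
import math


def required_concurrency(requests_per_second: int, avg_ms_per_request: int) -> int:
    """Estimate how many processes are busy at the same time (Little's law), rounded up."""
    return math.ceil(requests_per_second * avg_ms_per_request / 1000)


def servers_needed(concurrency: int, processes_per_server: int) -> int:
    """How many same-role servers it takes to run that many processes at once."""
    if processes_per_server <= 0:
        raise ValueError("processes_per_server must be positive")
    return max(1, math.ceil(concurrency / processes_per_server))


# Hypothetical traffic figures, purely for illustration.
quiet = required_concurrency(requests_per_second=20, avg_ms_per_request=200)
peak = required_concurrency(requests_per_second=400, avg_ms_per_request=200)

print(quiet, servers_needed(quiet, processes_per_server=16))  # 4 1
print(peak, servers_needed(peak, processes_per_server=16))    # 80 5
```

With these made-up numbers the quiet period fits comfortably on one server, while the peak would need either a much bigger server or several of them. Your own figures will differ, which is exactly why there is no universal threshold.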
The performance question on the low budget impact side of the diagram refers to the tradeoff between a high-performing app during peak periods (horizontal scaling costs more) and a cheaper setup (inconsistent user experience at high loads).
Let’s see a couple of examples of how this works.
Say you have an online streaming service - you have the primary processes serving the users currently visiting, and you have background processes that upload new content to the server, to be made available next week. The prioritization is already handled by Sidekiq, and background processes are queued for non-peak periods when there’s spare capacity to handle them - these are negligible from a concurrency point of view. However, as nobody is likely to subscribe to a clunky streaming service, the primary processes still need to be able to run concurrently to keep your site visitors happy, and the way to achieve this is by horizontally scaling your application - or at the very least preparing for this in advance.
In our second example you have a news site - the primary focus is serving site visitors, but you also need to create new content. Both process types are equally important, so to avoid performance issues, you’d need enough processing power to handle everything concurrently. As long as your content is good, users are likely to forgive decreased performance during peak periods though, unlike in the previous case. So instead of focusing on great performance during rush hour, you should keep an eye on the average performance and scale up vertically as you grow.
Consistency of the load
If your application’s load is consistent at all times and you require the same amount of resources, you can opt for a server with sufficient resources to handle said load. When running a promotion, higher load is anticipated and naturally you will want to scale your server up for the promotional period. Once the load returns to normal you can scale down. For both operations a maintenance window can be scheduled to let users know of the expected downtime and avoid negative user experiences.
On the other hand, imagine that your site gets loads of visitors during lunchtime and for the rest of the day it’s just tumbleweed. As the scaling process normally comes with downtime or requires some sort of manual input, it’s not feasible to keep scaling the server up and down every single day, so you’d need a server sufficiently powerful to handle the lunchtime crowd even though it’s overkill for the rest of the day.
There are hosting providers (*wink wink*) who offer dynamic vertical scaling. Basically you get a “large” server, but charges are usage-based - so you only pay for the resources you’re using during specific intervals. For instance Enscale (and Jelastic) charges you by the hour for the resources consumed for that hour, so during those tumbleweed periods you only pay for the absolute minimum required to keep your site running.
The larger the difference between the baseline usage of your app and the resource requirements during peak periods, the more money you save using a dynamic vertical scaling model. And you can use it in combination with horizontal scaling as well, of course.
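As a rough illustration of that saving, the sketch below compares a fixed-size server, paid for at its full size every hour, with hourly usage-based billing. The price, the resource "units" and the daily load profile are all made-up assumptions for the example; they are not Enscale's or Jelastic's actual pricing or units.

```python
from typing import Sequence


def fixed_daily_cost(peak_units: int, price_per_unit_hour: int) -> int:
    """A server sized for the peak is paid for at full size all 24 hours."""
    return 24 * peak_units * price_per_unit_hour


def usage_based_daily_cost(hourly_units: Sequence[int], price_per_unit_hour: int) -> int:
    """With hourly usage-based billing you pay only for what each hour consumed."""
    return sum(hourly_units) * price_per_unit_hour


# Hypothetical profile: 2 units at baseline, 16 units during a 2-hour lunchtime rush.
hourly_units = [2] * 24
hourly_units[12] = 16
hourly_units[13] = 16

price = 1  # assumed price per resource unit per hour, in pence

fixed = fixed_daily_cost(max(hourly_units), price)
dynamic = usage_based_daily_cost(hourly_units, price)
print(f"fixed: {fixed}p, usage-based: {dynamic}p, saved: {fixed - dynamic}p")
# fixed: 384p, usage-based: 76p, saved: 308p
```

If the lunchtime peak in this toy profile doubled to 32 units, the fixed cost would double to 768p while the usage-based cost would only rise to 108p, which is the "larger difference, larger saving" effect described above.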
The diagram sums up any hosting experience, but it’s true of scaling as well. Fast and reliable hosting requires horizontal scaling and is never cheap. Fast and cheap can be achieved with vertical scaling, but a single server means a single point of failure, which makes it unreliable. Reliable and cheap would be small horizontally scaled servers, but those naturally won’t be as performant as hosting on a single server for the same cost.
There’s always a tradeoff to be made and I hope this article has provided some useful guidance on how to choose the most efficient solution for your particular application. In upcoming articles we will continue to explore both scaling models more in depth, so be sure to check back here or follow us on Twitter.
- Scaling has become increasingly important for modern web applications
- Vertical scaling is like switching from a car to a minibus to a double-decker. Horizontal scaling is like getting an entire fleet to transport people. Both are viable transportation options, but the number of people determines which one is more efficient and cost-effective.
- Scalability is easier to achieve with virtualisation, but there are still limitations (downtime, complexity etc)
- The three deciding factors:
  - Money - your financial dependence on your application primarily determines how much you should spend on hosting
  - Size and performance requirements - the number of visitors (aka. required processes and resources) impacts whether it’s more feasible to split your application up in a horizontally scaled model or keep it on a single server
  - Load consistency - the more inconsistent your load, the more you should investigate automatic scaling solutions (such as those offered by Enscale)
- The tradeoff between cost, reliability and performance is as real for scaling as it is for hosting in general.
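For readers who think in code, the three deciding factors can be loosely sketched as a small Python helper. This is only one simplified reading of the guidelines in this article: the `AppProfile` fields and the wording of the suggestions are hypothetical names chosen for illustration, and a real decision involves far more nuance than three booleans.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AppProfile:
    # All three fields are simplifying assumptions made for this sketch.
    outage_is_financially_critical: bool  # "Money"
    needs_high_concurrency_at_peak: bool  # "Size and performance requirements"
    load_is_consistent: bool              # "Load consistency"


def suggest_scaling(profile: AppProfile) -> List[str]:
    """Return rough suggestions; a starting point for discussion, not a rule."""
    suggestions = []
    if profile.outage_is_financially_critical or profile.needs_high_concurrency_at_peak:
        suggestions.append("horizontal scaling")
    else:
        suggestions.append("single server, scaled vertically as you grow")
    if not profile.load_is_consistent:
        suggestions.append("automatic (dynamic) scaling")
    return suggestions


# A personal site with steady traffic versus a business-critical app with a daily rush.
print(suggest_scaling(AppProfile(False, False, True)))
print(suggest_scaling(AppProfile(True, True, False)))
```

The two suggestions are deliberately independent of each other, reflecting the point made earlier that dynamic vertical scaling can be combined with horizontal scaling.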