What workloads you must move to the Cloud – Part 1 (for application scalability)


Many organizations, including a good number of the Fortune 500, still run all or most of their business applications on-premises. Because they have not yet moved these applications to the public cloud, they are unable to take advantage of the strategic benefits the public cloud offers in performance and application scalability, reliability, cost, innovation, and operational excellence.

As a result, these organizations still grapple with application scalability and availability issues, high infrastructure costs, and limitations in business capabilities.

So if you are in charge of managing and modernizing IT systems for such an organization, it is high time to start thinking about the public cloud.

The worldwide public cloud services market is projected to grow 17.5 percent in 2019 to a total of $214.3 billion, up from $182.4 billion in 2018, according to Gartner, Inc.

Which workloads should you consider moving to the cloud?

Moving every application to the cloud is not always the right idea. Depending on the nature of the organization's work, for example, regulations may prohibit migrating specific data and applications to the cloud. It is therefore worth taking a moment to consider which workloads work best on public cloud platforms. Technically, the following four typical workloads are the most suitable for hosting on the public cloud, provided they carry no regulatory or legal constraints:

  • External facing web applications that have high variability in their workloads
  • Business-critical applications that require high resilience
  • Traditional “batch” applications that require high throughput
  • Analytic workloads with heavy-duty data processing


This blog post series takes a deep dive into each of the above workloads in four parts.

The first part takes a look at how the cloud helps to support workloads with high variability, or in other words, workloads that need frequent scaling to serve critical business functions.

What are scalable applications (what is application scalability)?

These are applications whose workloads go up and down over time. Depending on the magnitude of the workload at a given point, their need for system resources, such as physical servers, memory, CPUs, and storage, varies as well, which translates into variable capacity requirements over time.

According to Wikipedia’s definition, scalability is the property of a system to handle a growing amount of work by adding resources to the system. These resources can be added either vertically, or horizontally, or in combination.

When resources are added vertically, the process is called vertical scalability, or scale-up: additional resources, for example more CPUs or memory, are added to the same box or machine.

In contrast, when resources are added horizontally, the process is called horizontal scalability, or scale-out, which increases the number of boxes.

Vertical scalability is constrained by hardware limits and cannot scale beyond a certain point, whereas horizontal scalability can, at least in theory, scale indefinitely.
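To make the distinction concrete, here is a minimal, purely illustrative Python sketch (the numbers are hypothetical): scale-up grows the capacity of each box, while scale-out grows the number of boxes, and both raise the total capacity of the system.

```python
# A purely illustrative sketch: both strategies raise total capacity,
# but they do it along different dimensions.

def total_capacity(node_count: int, capacity_per_node: int) -> int:
    """Total workload the system can absorb, in arbitrary requests/second."""
    return node_count * capacity_per_node

# Baseline: 2 boxes, each handling 500 requests/second.
baseline = total_capacity(node_count=2, capacity_per_node=500)    # 1000 rps

# Vertical scaling (scale-up): same 2 boxes, each made twice as powerful.
scaled_up = total_capacity(node_count=2, capacity_per_node=1000)  # 2000 rps

# Horizontal scaling (scale-out): same box size, twice as many boxes.
scaled_out = total_capacity(node_count=4, capacity_per_node=500)  # 2000 rps

print(baseline, scaled_up, scaled_out)
```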

What kinds of systems typically require high application scalability?

The applications that require high scalability are not limited to the examples below, but this is a common, generalized list:

  • Websites that attract a lot of users on the weekend
  • Travel websites that have seasonal spikes in user demand
  • Applications whose demand varies with calendar events, like Thanksgiving sales or health insurance enrollment periods
  • Applications whose demand increases at certain times of the day (like, gaming apps that experience enhanced usage in the evenings)
  • eCommerce websites that have a sudden rise in user visits due to advertising campaigns
  • Graphics or video rendering applications that have frequent ups and downs in workload

What are the challenges in achieving high application scalability on-premise?

Managing application scalability on-premises, the traditional way, is painful and demands extensive capacity planning. IT capacity planning is an expert's job, almost a work of art, and if it is not done right the results can be far from accurate, hurting both IT efficiency and the budget.

With no knowledge of a variable workload's future trajectory, capacity planning is always a prediction, and significant gaps between estimates and actuals lead to either over-allocation or under-allocation of resources. What's more, allocating or de-allocating resources on-premises to match changing workloads carries high latency and cost.

The result is an inability to ramp up resources in time during a workload spike, which can make the application unavailable, a high business impact.

Conversely, excess resources during periods of low workload sit idle and add significant cost.

How does the cloud help with application scalability?

Modern cloud platforms provide architecture frameworks that automate application scalability.

These frameworks can be server-based (provisioned) or serverless; they are fully configurable and support both vertical and horizontal scaling.

Let’s take a deeper look.

Application scalability in server-based (or provisioned) architectures

Cloud platforms provide auto-scaling capabilities out of the box that monitor your applications' workload and automatically adjust capacity to maintain steady, predictable performance at the lowest possible cost. With auto-scaling, it is easy to set up application scaling for multiple resources across multiple services in minutes.

Auto-scaling in the cloud lets you allocate or deallocate exactly the resources the application needs to run optimally, and pay only for what you use.

AWS provides EC2 auto-scaling, which lets EC2 instances (virtual servers in the AWS platform) be part of auto-scaling groups. An auto-scaling group is a collection of EC2 instances whose size can grow or shrink automatically.

Below is an EC2 topology with auto-scaling groups in a multi-AZ configuration behind an AWS Elastic Load Balancer.

Figure: EC2 auto-scaling with load balancing (source: aws.amazon.com)

EC2 auto-scaling ensures that an auto-scaling group always has the correct number of EC2 instances available, so the application hosted on those instances can handle its workload optimally at any point in time.

As part of its configuration, each auto-scaling group can be set with a maximum and a minimum number of instances. The auto-scaling group ensures that the group never goes above the maximum size or below the minimum size.

Similarly, if the group is configured with a desired number of instances, either when it is created or at any time afterwards, the auto-scaling group ensures that it always maintains that many instances.
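As a concrete illustration, here is a minimal sketch using boto3, the AWS SDK for Python, that creates such a group; the group name, launch template name, and subnet IDs are hypothetical placeholders, not values from this post.

```python
import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")

# Create an auto-scaling group from an existing launch template.
# The group never shrinks below MinSize or grows beyond MaxSize, and it
# continuously keeps DesiredCapacity healthy instances running.
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="web-app-asg",                   # hypothetical group name
    LaunchTemplate={
        "LaunchTemplateName": "web-app-template",         # hypothetical template
        "Version": "$Latest",
    },
    MinSize=1,
    DesiredCapacity=2,
    MaxSize=4,
    VPCZoneIdentifier="subnet-aaaa1111,subnet-bbbb2222",  # hypothetical subnets
    HealthCheckType="ELB",
    HealthCheckGracePeriod=300,
)
```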

What is dynamic auto-scaling?

If a scaling policy is applied to the auto-scaling group, the group can launch additional instances or terminate existing ones elastically as the application's demand increases or decreases.

Such a scaling policy dynamically increases and decreases the number of running instances in the group to meet changing workload conditions. This is called "dynamic scaling". See the AWS documentation on dynamic scaling for a more detailed study.

Below is an example of dynamic scaling.

The following auto-scaling group has a minimum size of one instance, a desired capacity of two instances, and a maximum size of four instances. The scaling policy keeps adjusting the number of instances between the minimum and the maximum, based on the specified criteria.

Figure: Elastic behavior of an AWS auto-scaling group with dynamic scaling enabled through a scaling policy (image from AWS documentation)
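Here is a minimal boto3 sketch of how such a scaling policy might be attached to the hypothetical group from the earlier example; it uses target tracking on average CPU utilization, which is one of several policy types AWS supports.

```python
import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")

# Attach a target-tracking scaling policy to the group created earlier.
# The group adds or removes instances (within MinSize/MaxSize) so that
# average CPU utilization across the group stays close to 50%.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-app-asg",      # hypothetical group from before
    PolicyName="keep-cpu-near-50-percent",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization",
        },
        "TargetValue": 50.0,
    },
)
```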

Application scalability in serverless architectures

Let’s look at the serverless side of things. Modern cloud platforms offer serverless computing services that have built-in scalability. These services scale automatically on-demand and are highly cost-effective.

In the serverless model, you do not have to manage servers at all. The cloud provider administers and runs the servers, and dynamically manages the allocation of machine resources. This results in a flexible pricing model based on the actual amount of resources consumed by an application, rather than on pre-purchased units of capacity.

You focus entirely on your business logic, upload your code and data to the AWS platform, and use those serverless services as is.
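For example, a Lambda function is just the handler code below; it is a minimal, hypothetical sketch, and everything about provisioning and scaling the execution environments is handled by the platform.

```python
# handler.py -- a minimal, hypothetical AWS Lambda function (Python runtime).
# The code contains no server or scaling configuration: Lambda runs one copy
# of this handler per concurrent request and adds or removes execution
# environments as traffic rises and falls.
import json


def handler(event, context):
    name = (event or {}).get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```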

Here is an interesting article on how to scale your web application on the cloud!

How serverless services scale on the AWS cloud platform

A wide range of serverless services is available on public cloud platforms. Below is a reference list of commonly used serverless services on the AWS platform, together with a description of how each service scales automatically; the AWS documentation for each service covers its overview, features, pricing, and setup procedures.

This information will help you become conversant with these services, how they can help in your context, how to set them up, and their pricing models.

| Service type | AWS service | Scaling behavior |
| --- | --- | --- |
| Compute service (Function as a Service) | Lambda | Supports extensive concurrent executions; details are in the Lambda scaling documentation. |
| Content Delivery Network (CDN) | CloudFront | Uses a global network of 187 Points of Presence (176 Edge Locations and 11 Regional Edge Caches) in 69 cities across 30 countries, and scales automatically. |
| Key-value and document database | DynamoDB | Manages throughput capacity automatically using auto-scaling; see the DynamoDB auto-scaling documentation. |
| Object storage and static website hosting | S3 | Per the AWS documentation: "Your applications can easily achieve thousands of transactions per second in request performance when uploading and retrieving storage from Amazon S3. Amazon S3 automatically scales to high request rates. For example, your application can achieve at least 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix in a bucket. There are no limits to the number of prefixes in a bucket. You can increase your read or write performance by parallelizing reads. For example, if you create 10 prefixes in an Amazon S3 bucket to parallelize reads, you could scale your read performance to 55,000 read requests per second." |
| API proxy service | API Gateway | Scales automatically up to a default limit of 10,000 requests per second (RPS), with additional burst capacity; these limits can be increased on request. |
| Message queues | SQS | Per the AWS documentation: "Amazon SQS works on a massive scale, processing billions of messages per day. You can scale the amount of traffic you send to Amazon SQS up or down without any configuration. Amazon SQS also provides extremely high message durability, giving you and your stakeholders added confidence." |
| Notification service | SNS | Scales automatically, and almost infinitely. |
| ETL service | Glue | Fully managed ETL service in AWS that scales automatically and indefinitely. |
| Workflow service | Step Functions | Scales automatically. |
| Data query and analytics service | Athena | Scales automatically, with no servers to provision or manage. |
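Taking DynamoDB from the list above as an example, provisioned throughput can be placed under auto-scaling through the Application Auto Scaling service. The sketch below, using boto3, shows one way this might look; the table name and capacity limits are hypothetical.

```python
import boto3

app_autoscaling = boto3.client("application-autoscaling", region_name="us-east-1")

# Register the table's read capacity as a scalable target (5 to 500 read units)...
app_autoscaling.register_scalable_target(
    ServiceNamespace="dynamodb",
    ResourceId="table/Orders",                            # hypothetical table
    ScalableDimension="dynamodb:table:ReadCapacityUnits",
    MinCapacity=5,
    MaxCapacity=500,
)

# ...and keep consumed reads at roughly 70% of provisioned reads.
app_autoscaling.put_scaling_policy(
    PolicyName="orders-read-capacity-tracking",
    ServiceNamespace="dynamodb",
    ResourceId="table/Orders",
    ScalableDimension="dynamodb:table:ReadCapacityUnits",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 70.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "DynamoDBReadCapacityUtilization",
        },
    },
)
```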

Scalability in CI/CD

While not limited to these, the core components for automating CI/CD on the AWS cloud platform include AWS CodeCommit, CodeBuild, CodeDeploy, and CodePipeline. All of these are managed services and scale automatically to meet your needs, as the sketch below illustrates.
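Here is a minimal boto3 sketch that starts a build on a hypothetical CodeBuild project; there is no build fleet to size or manage, since CodeBuild provisions a managed build environment per build.

```python
import boto3

codebuild = boto3.client("codebuild", region_name="us-east-1")

# Kick off a build of an existing CodeBuild project. CodeBuild provisions a
# fresh, fully managed build container for each build, so concurrent builds
# scale without any build-fleet capacity planning on your side.
response = codebuild.start_build(projectName="web-app-build")  # hypothetical project
print(response["build"]["id"], response["build"]["buildStatus"])
```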

Scalability in the Relational Database Service (RDS)

Relational database services on cloud platforms like AWS offer both vertical and horizontal scalability to keep up with varying demand.

In this context, while horizontal scalability is an effective strategy for read-heavy RDS applications, vertical scalability works well for applications with nearly equal reads and writes.

From the AWS perspective, databases can be scaled up vertically quite easily, with the push of a button, by allocating additional resources to the database instance.

Read replicas are an effective way to scale databases horizontally for better performance and durability. The diagram below shows how read replicas help to scale AWS RDS.

Figure: Scalability in Amazon RDS (source: aws.amazon.com)
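Here is a minimal boto3 sketch of both approaches for a hypothetical RDS instance: scaling up by changing the instance class, and scaling out reads by adding a read replica. The instance identifiers and instance class are placeholders.

```python
import boto3

rds = boto3.client("rds", region_name="us-east-1")

# Vertical scaling (scale-up): move the primary instance to a larger class.
rds.modify_db_instance(
    DBInstanceIdentifier="orders-db",        # hypothetical instance name
    DBInstanceClass="db.m5.xlarge",
    ApplyImmediately=True,
)

# Horizontal scaling (scale-out) for reads: add a read replica and point
# read-heavy application traffic at the replica's endpoint.
rds.create_db_instance_read_replica(
    DBInstanceIdentifier="orders-db-replica-1",
    SourceDBInstanceIdentifier="orders-db",
)
```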

Please refer to this AWS blog post for more information on scalability in the AWS RDS service.

Effective June 20, 2019, AWS RDS added support for storage auto-scaling for all the database engines it supports. The AWS documentation describes it as follows:

RDS Storage Auto Scaling continuously monitors actual storage consumption, and automatically scales up capacity when actual utilization approaches provisioned storage capacity. Auto Scaling works with new and existing database instances. You can enable Auto Scaling with just a few clicks in the AWS Management Console. There is no additional cost for RDS Storage Auto Scaling. You pay only for the RDS resources needed to run your applications.

Here is the link to AWS documentation about RDS storage auto-scaling in the AWS platform.
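In boto3 terms, RDS storage auto-scaling is enabled by setting a maximum allocated storage that is higher than the currently allocated storage; the sketch below does this for the same hypothetical instance used earlier.

```python
import boto3

rds = boto3.client("rds", region_name="us-east-1")

# Enable RDS Storage Auto Scaling on an existing instance by setting a
# MaxAllocatedStorage ceiling (in GiB). Storage then grows automatically
# as actual consumption approaches the currently allocated capacity.
rds.modify_db_instance(
    DBInstanceIdentifier="orders-db",   # hypothetical instance from earlier
    MaxAllocatedStorage=1000,
    ApplyImmediately=True,
)
```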

Scalability in Amazon Aurora

Figure: Scalability in Amazon Aurora (source: aws.amazon.com)

Refer to the AWS documentation on Aurora scaling.
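Aurora scales in several ways: cluster storage grows automatically, and the number of Aurora Replicas can be adjusted automatically through Application Auto Scaling. Below is a minimal boto3 sketch of the latter; the cluster name, replica limits, and CPU target are hypothetical placeholders.

```python
import boto3

app_autoscaling = boto3.client("application-autoscaling", region_name="us-east-1")

# Let the cluster add or remove Aurora Replicas (between 1 and 8) so that
# average CPU utilization across the replicas stays near 60%.
app_autoscaling.register_scalable_target(
    ServiceNamespace="rds",
    ResourceId="cluster:orders-aurora-cluster",   # hypothetical cluster name
    ScalableDimension="rds:cluster:ReadReplicaCount",
    MinCapacity=1,
    MaxCapacity=8,
)

app_autoscaling.put_scaling_policy(
    PolicyName="aurora-replica-cpu-tracking",
    ServiceNamespace="rds",
    ResourceId="cluster:orders-aurora-cluster",
    ScalableDimension="rds:cluster:ReadReplicaCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 60.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "RDSReaderAverageCPUUtilization",
        },
    },
)
```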

Auto-scaling in Microsoft Azure

Check here to learn about Microsoft Azure’s auto-scaling capability.

Check how Azure Serverless can help build applications faster without managing infrastructure.

Summary for application scalability

Cloud platforms provide application scalability and elasticity out-of-the-box. This eliminates the need for upfront capacity planning and any investment in infrastructure for continuous resource monitoring or resource provisioning for changing workloads.

It takes just a few clicks to get all of this heavy-duty infrastructure management up and running in the cloud. The cloud platform then automatically and continuously balances demand and capacity for the applications running on it.

It also simplifies infrastructure management by allowing infrastructure to be built and managed as software code, which shifts "application scalability" from the domain of systems operations to the domain of software development.

That makes a clear case for hosting variable workloads on the cloud, doesn't it?

Related article

You can find Part 2 of this four-part blog series in What workloads you must move to the Cloud – Part 2 (for application resilience)

Here is another related article on platform modernization

