What workloads you must move to the Cloud – Part 3 (for Batch workloads)

Cover photo by Dylan Gillis on Unsplash

Batch workloads date back to the early days of computing, when mainframes dominated the computing world. They still play a significant role in many disciplines, including business, engineering, healthcare, and other areas.

Batch workloads are normally designed to run in the background and are meant to process huge amounts of data.

What are the challenges of running batch workloads on-premise?

1. Poor customer experience

Reduced availability of business functionality

In on-premise IT ecosystems, batch workloads typically need an extensive amount of compute and storage provisioning. Real-time workloads therefore cannot share environments with batch workloads without competing for resources, and as a result they have to be kept suspended while batch workloads run. This makes systems unavailable to customers and users for extended periods of time.

Such an arrangement is not viable in today’s digital world, as most businesses rely heavily on 24/7/365 availability of real-time workloads to provide a seamless customer experience. Suspending real-time workloads so that large, compute-intensive batch workloads can run significantly degrades the customer experience.

High information latency

What’s more, large batch processes are typically scheduled to run only once or twice a day, and such a low frequency increases the latency of information processing. That latency hurts the customer experience even more.

2. High costs

Batch processes are compute-heavy, with high throughput and data storage requirements. Such resource needs drive up infrastructure costs in an on-premise configuration.

What can we do to alleviate these challenges in batch workloads?

There are different ways to address these challenges.

If the existing long-running batch jobs must be retained as-is, they can be hosted and executed on cloud resources instead of on-premise, taking advantage of the cloud’s economy of scale and low storage costs. That approach is advantageous from a cost-reduction standpoint, but it may not help from a customer experience or modernization standpoint.

Alternatively, we can take a more transformative and innovative approach that modernizes batch processes in a way that saves costs while also improving customer experience and operational efficiency. Think of re-imagining and re-architecting an existing batch process so that it becomes part of real-time operations. For example, bulk processing takes on a whole new meaning when it happens in small (mini or micro) batches instead of the typical large batches of data, and when it is triggered on business demand to run in real-time or near-real-time. Splitting up batch processes and triggering them on events, in the interest of real-time and on-demand processing, can generate business value never seen before from a cost and customer experience standpoint.

Why can cloud platforms help?

Cloud platforms provide economy of scale and very cheap storage, which makes them ideal for running heavy-duty batch jobs that process terabytes of data. By migrating on-premise batch workloads to the cloud, you free up expensive on-premise resources so that they can be utilized more effectively and efficiently.

Cloud platforms provide very cheap storage that is highly available, durable, and secure.

For example, AWS S3 object storage offers 99.999999999% durability of objects across multiple Availability Zones, 99.99% availability over a given year, and comprehensive security and compliance capabilities, all at very low cost.

Public cloud platforms also provide tools and services that support innovation out-of-the-box. They provide architecture frameworks and components that can help transform traditional time-windowed, long-running batch processes into on-demand, event-driven, real-time (or near-real-time) processing pipelines that bring great benefits from a customer experience, cost reduction, and analytics standpoint.

What do cloud platforms provide?

Cloud platforms provide high-throughput storage and high-efficiency compute services built on the foundations of modern architecture, including clustering, map-reduce frameworks, auto-scaling, caching, parallel processing, and decoupling of processes, all of which drive high performance.

Alongside these, cloud platforms provide event-driven choreography and orchestration patterns that help batch processes behave in a real-time or near-real-time manner.

What is an event-driven pattern?

An event-driven application responds to actions generated by users or by the system. These include user-generated actions such as mouse clicks and keystrokes, as well as system-generated events such as the upload of a data file into the system.

The advantage of this pattern is that it enables bulk processing to be triggered and run on business demand, in contrast to a time-scheduled trigger mechanism, which normally introduces high latency. The pattern makes existing batch processes more relevant in a business context.

How do you configure events that create triggers? There are many options; a couple of them are described below:

1. Event-driven bulk processing:

When bulk data in the form of a file object, which may exist in different formats such as CSV, JSON, XML, etc., is uploaded to an AWS S3 bucket, S3 can trigger a Lambda function to process that data and/or kick off the next process (or set of processes) in the processing pipeline.
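To make this concrete, here is a minimal sketch of what such an S3-triggered Lambda handler could look like in Python. It assumes the function is subscribed to the bucket’s ObjectCreated notifications and that the uploaded object is a CSV file; the row counting at the end is a placeholder for whatever real processing the pipeline needs.

```python
# Minimal sketch of an S3-triggered AWS Lambda handler (Python).
# Assumes the function is wired to the bucket's ObjectCreated events;
# the per-row processing is a hypothetical placeholder.
import csv
import io

import boto3

s3 = boto3.client("s3")


def handler(event, context):
    # An S3 event can carry one or more records, one per uploaded object.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        # Fetch the uploaded file object and decode it as CSV text.
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        rows = csv.DictReader(io.StringIO(body.decode("utf-8")))

        # Placeholder processing: in a real pipeline this could transform
        # the rows, load them into a database, or trigger the next stage.
        processed = sum(1 for _ in rows)
        print(f"Processed {processed} rows from s3://{bucket}/{key}")
```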

A conceptual diagram of event-driven batch processing in the AWS cloud is shown below.

Event-driven batch processing (image credit: Medium.com and aws.amazon.com)

Data can be uploaded to the S3 bucket directly as a file object, or with the help of streaming platforms such as AWS Kinesis Data Firehose or Kafka, which support data buffering to create micro- or mini-batches before dispatching the batched data to an S3 bucket as an object.
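As an illustration, this buffering behavior can be configured through Kinesis Data Firehose buffering hints. The sketch below, written against the AWS SDK for Python (boto3), creates a delivery stream that flushes a micro-batch to S3 when roughly 5 MB has accumulated or 5 minutes have passed, whichever comes first; the stream name, role ARN, and bucket ARN are placeholders.

```python
# Sketch: a Kinesis Data Firehose delivery stream whose buffering hints
# produce micro-batches in S3 by size (~5 MB) or time (300 seconds).
# The stream name, role ARN, and bucket ARN below are placeholders.
import boto3

firehose = boto3.client("firehose")

firehose.create_delivery_stream(
    DeliveryStreamName="micro-batch-to-s3",
    DeliveryStreamType="DirectPut",
    ExtendedS3DestinationConfiguration={
        "RoleARN": "arn:aws:iam::123456789012:role/firehose-delivery-role",
        "BucketARN": "arn:aws:s3:::my-batch-landing-bucket",
        "BufferingHints": {
            "SizeInMBs": 5,            # flush when about 5 MB has accumulated
            "IntervalInSeconds": 300,  # or after 5 minutes, whichever comes first
        },
    },
)
```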

This event-driven pattern is often implemented using a micro-batching technique, where the overall batch dataset is split into a set of tiny (micro), independent batches of data of more-or-less similar size. These tiny batches are then processed either in sequence or in parallel, in contrast to processing the whole batch of data up-front.

Micro-batches can be created by data volume (i.e., the number of records/rows each batch in the set contains), by data size (usually 1 MB to 5 MB), or by the time interval over which data accumulates into a given batch (normally 1 or 5 minutes, up to 60 minutes).

The micro-batching technique gains a performance advantage over traditional batch processing through decoupled and parallel processing, which leads to low latency and fast turnaround, and it also supports high availability of business applications for customers and users.
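For volume-based splitting, the idea can be sketched in a few lines of plain Python: slice the full dataset into fixed-size micro-batches and hand them to a worker pool. The batch size and the process_batch function below are illustrative placeholders.

```python
# Illustrative sketch: split a large dataset into fixed-size micro-batches
# and process them in parallel instead of as one monolithic batch.
from concurrent.futures import ThreadPoolExecutor


def micro_batches(records, batch_size=1000):
    """Yield successive micro-batches of at most batch_size records."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


def process_batch(batch):
    # Placeholder for the real per-batch work (transform, validate, load, etc.).
    return len(batch)


def run(records):
    # Independent micro-batches can be processed in parallel, which shortens
    # turnaround time compared with processing the whole batch up-front.
    with ThreadPoolExecutor(max_workers=8) as pool:
        return sum(pool.map(process_batch, micro_batches(records)))


if __name__ == "__main__":
    print(run(list(range(10_500))))  # 11 micro-batches of at most 1,000 records
```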


A conceptual diagram of micro-batching (image credit: SlideShare)

2. Streaming for real-time data processing:

Batched data can be streamed one record at a time, with the help of streaming platforms such as AWS Kinesis Data Streams, into a DynamoDB table that has streams enabled. The DynamoDB stream can then trigger a Lambda function to process the data one record at a time, in a real-time fashion.
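A minimal sketch of such a stream-triggered handler is shown below. It assumes the Lambda function is subscribed to the table’s stream with a view type that includes the new item image (NEW_IMAGE); the per-record business logic is a placeholder.

```python
# Sketch of a Lambda handler subscribed to a DynamoDB stream.
# Assumes the stream view type includes the new item image (NEW_IMAGE);
# the per-record business logic is a hypothetical placeholder.
def handler(event, context):
    for record in event["Records"]:
        # Act only on newly inserted or updated items.
        if record["eventName"] not in ("INSERT", "MODIFY"):
            continue

        # DynamoDB stream records carry typed attribute values,
        # e.g. {"orderId": {"S": "123"}, "amount": {"N": "42.5"}}.
        keys = record["dynamodb"]["Keys"]
        new_image = record["dynamodb"].get("NewImage", {})

        # Placeholder: process one record at a time in near-real-time.
        print(f"Processing change for {keys}: {new_image}")
```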

Different use cases and design patterns for DynamoDB Streams are available in the AWS documentation.

DynamoDB streaming (image courtesy of SlideShare.com)

Lambda functions are part of AWS’s serverless architecture, which supports near-unlimited scalability out-of-the-box. Lambda’s limits are described in the AWS documentation.

Out-of-the-box batch services on the cloud:

All major cloud platforms offer batch services. Below is AWS’s definition of its batch service as available in the AWS documentation.

AWS provides the AWS Batch service, which dynamically provisions the optimal quantity and type of compute resources (e.g., CPU- or memory-optimized instances) based on the volume and specific resource requirements of the batch jobs submitted. With AWS Batch, there is no need to install and manage batch computing software or server clusters to run your jobs, allowing you to focus on analyzing results and solving problems. AWS Batch plans, schedules, and executes your batch computing workloads across the full range of AWS compute services and features, such as Amazon EC2 and Spot Instances.
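As a quick illustration of how work is handed to AWS Batch, the sketch below submits a job with boto3. The job queue and job definition names are placeholders assumed to be already registered in the account, and the command override is purely illustrative.

```python
# Sketch: submitting a job to AWS Batch with boto3.
# The job queue, job definition, and command below are placeholders.
import boto3

batch = boto3.client("batch")

response = batch.submit_job(
    jobName="nightly-claims-aggregation",
    jobQueue="my-batch-job-queue",
    jobDefinition="my-batch-job-definition",
    containerOverrides={
        "command": ["python", "aggregate.py", "--date", "2021-01-01"],
    },
)

# AWS Batch plans, schedules, and runs the job on managed compute resources.
print(response["jobId"])
```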

Part 1 of this four-part blog series is available at What workloads you must move to the cloud – Part 1 (for application scalability).

Part 2 of this four-part blog series is available at What workloads you must move to the Cloud – Part 2 (for application resilience).

Suvo Dutta

I have over 22 years of IT experience in strategy, advisory, innovations, and cloud-based solutions in the Insurance domain. I advise clients in transforming their IT ecosystems to future-ready architectures that can provide exemplary customer experience, improve operating efficiency, enable faster product development and unlock the power of data.
