Cloud computing has been around for quite a few years now. Amazon popularized the concept in 2006, before the other tech giants followed in 2008 (Google) and 2010 (Microsoft). At the heart of this technological shift is the idea that you shouldn't have to bother setting up your own IT systems unless that is your job in the first place. How is this possible? Let's get straight into it!
Cloud computing definition
First, we should define what the whole fuss is about. Cloud computing is a service allowing you to access IT resources and applications:
- on-demand,
- through the Internet,
- with pay-as-you-go pricing.
It basically means that you could create the next killer app using Amazon's servers (IT resources), sitting at home (through the Internet), scaling your infrastructure up or down as your app grows or shrinks (on-demand), while only ever paying for what you actually use (pay-as-you-go).
Needless to say, this setup allows you to really focus on your product rather than on procurement, IT maintenance or scaling. You benefit from the economies of scale achieved by the tech giants. As an example, AWS, Amazon's cloud computing business, hosts almost all of Netflix's backend infrastructure. When you link this to the fact that Netflix accounts for almost 15% of downstream traffic on the internet (see the 2019 Sandvine report), it becomes clear that AWS most probably negotiates hard when buying its servers, electricity, etc., allowing its end customers to enjoy lower prices.
Last but not least, cloud providers spread their data centers all over the world. Concretely, this means you can serve users throughout the world at roughly the same speed, since it is very likely that they are close to one of the provider's facilities. This lets you create applications for a global audience, within minutes, without the hassle of setting up hardware in different geographies.
Let us now focus on AWS, the largest cloud provider out there (33% market share according to Synergy Research Group data, Q2 2019), and see how it has built its worldwide IT infrastructure in a resilient manner to actually deliver what we just described.
AWS infrastructure setup
So far, AWS has defined 22 regions throughout the world (see map below) where it has installed clusters of IT infrastructure. Each cluster is called an Availability Zone (AZ), and each region contains at least two of them (usually three). In turn, each AZ is made up of multiple data centers, and each data center houses potentially thousands of servers.
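If you are curious, you can list this layout yourself through the AWS SDK. Below is a minimal sketch using boto3, the AWS SDK for Python; it assumes your AWS credentials are already configured (e.g. via `aws configure`):

```python
import boto3

ec2 = boto3.client("ec2", region_name="ap-southeast-1")  # the Singapore region

# List the regions available to your account.
regions = ec2.describe_regions()["Regions"]
print([r["RegionName"] for r in regions])

# List the Availability Zones inside the Singapore region.
for az in ec2.describe_availability_zones()["AvailabilityZones"]:
    print(az["ZoneName"], az["State"])
```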
For instance, one of the regions AWS has defined is Singapore, where I reside. There are currently three AZs; let's assume they are located in Changi (East), the business district (South) and Jurong (West). Each one has its own building, guarded 24/7, with separate power and network connectivity in order to limit disruption if the region is hit by a storm or, say, a lightning strike (a frequent occurrence in Singapore). However, the three AZs are connected to each other through low-latency (optic fiber) links. This way your app can be physically duplicated to serve more traffic. If you choose this setup, your app is said to be a High Availability app. This redundancy also makes your app more resilient, with traffic being routed to any remaining functional AZ in case of an outage.
What if your user is located in the Philippines? There is no AZ there, so your app may not be as fast as in Singapore. That's why AWS has also built data centers at many Edge Locations. An Edge Location is basically a cache data center that brings static content closer to users for low-latency connectivity. There are more than 180 such locations all over the world today, allowing you to quickly download your Netflix movies even when you are far from an AWS region. This service is powered by Amazon CloudFront, a Content Delivery Network (CDN).
By looking at the map, you can still notice areas with no AWS presence. The AWS network is still a work in progress, and its expansion is obviously driven by market forces. However, it is worth mentioning that AWS has stringent requirements regarding, for instance, power providers, which bars it from installing regions just anywhere. This is the flip side of the high levels of service AWS guarantees. For instance, its storage service S3 (Simple Storage Service) boasts 99.999999999% durability. Concretely, this means that if you store 10 million objects on S3 (each object can hold up to 5TB of data), you can expect to lose one object every 10,000 years. With these levels of guarantees, it makes sense to build regions on solid foundations.
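If you want to check where the "one object every 10,000 years" figure comes from, here is the back-of-the-envelope arithmetic in Python, treating the eleven-nines durability as an annual per-object figure (which is how AWS states it):

```python
# Expected yearly object loss with eleven nines of durability.
objects_stored = 10_000_000
durability = 0.99999999999  # 99.999999999%, per object, per year

expected_losses_per_year = objects_stored * (1 - durability)
print(expected_losses_per_year)      # ~0.0001 objects lost per year
print(1 / expected_losses_per_year)  # ~10,000 years per expected loss
```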
Let’s now explore the AWS services that are needed to build a web application.
Basic setup
In the simplest setting, you would place your web application, your database, etc. on a single Elastic Compute Cloud (EC2) instance. This is the most popular service on AWS: basically a virtual computer running on a server in one of AWS's regions. It is said to be Elastic because you can start and stop your instance (put simply, your computer) as needed. You select the hardware specifications (memory, CPU, GPU…) as well as the software configuration, called an Amazon Machine Image (AMI), which includes the Operating System (Windows, Linux, etc.). After setting up your compute instance, you will need to take care of a few other things as well, mainly Networking, Storage and Security. We will explore the Networking and Security aspects in more detail in my next post, stay tuned!
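To make this concrete, here is a minimal sketch of what launching (and later stopping) an instance looks like with boto3; the AMI ID and key pair name are placeholders you would replace with your own:

```python
import boto3

ec2 = boto3.client("ec2", region_name="ap-southeast-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder: an AMI available in your region
    InstanceType="t2.micro",          # hardware specification (1 vCPU, 1 GiB RAM)
    KeyName="my-key-pair",            # placeholder: an existing key pair for SSH access
    MinCount=1,
    MaxCount=1,
)
instance_id = response["Instances"][0]["InstanceId"]
print(f"Launched instance {instance_id}")

# Wait until the instance is running, then stop it when you no longer
# need it -- this start/stop flexibility is the "elastic" part.
ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
ec2.stop_instances(InstanceIds=[instance_id])
```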
Once your EC2 instance is up and running, you will launch your app on a specific port (say 5000 for a Flask app). Knowing the IP address of your properly configured instance, people can now access your app from the internet. However, in a browser, we usually use human-readable addresses like medium.com. That is why you need a Domain Name System (DNS), a web service that translates the site name (medium.com) into an IP address (104.16.120.127). As with everything in the cloud, AWS provides this service; it is called Route 53.
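For reference, the Flask app in question can be as small as this:

```python
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello from my EC2 instance!"

if __name__ == "__main__":
    # 0.0.0.0 makes the app reachable from outside the instance,
    # provided the security group allows inbound traffic on port 5000.
    app.run(host="0.0.0.0", port=5000)
```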
Solving the availability problem
This is great; however, if you need to shut down your EC2 instance, or in case of a power outage, you will lose all the data generated by your web application, not to mention that your website will be down. How do you make your data persistent, then? This is where you want a database instance, separate from the initial EC2 instance. You could use a second EC2 instance and set everything up yourself, manage the database, etc., but this is troublesome. Instead, AWS offers fully-managed database services. For structured data (like tabular data), you could go for Amazon Relational Database Service (Amazon RDS), on which you can run several database engines such as MySQL, PostgreSQL or Amazon Aurora (said to be 3 times faster than PostgreSQL).
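To give you an idea, here is a sketch of creating such a managed PostgreSQL instance with boto3; the identifiers and password are placeholders (in practice you would keep credentials out of your code, e.g. in AWS Secrets Manager):

```python
import boto3

rds = boto3.client("rds", region_name="ap-southeast-1")

rds.create_db_instance(
    DBInstanceIdentifier="my-app-db",       # placeholder name
    Engine="postgres",                      # could also be mysql, aurora, etc.
    DBInstanceClass="db.t3.micro",          # hardware specification
    AllocatedStorage=20,                    # in GiB
    MasterUsername="appuser",               # placeholder
    MasterUserPassword="change-me-please",  # placeholder: never hardcode in real code
    MultiAZ=True,  # keep a synchronized standby copy in a second AZ (more below)
)
```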
So with this, your data is safe. Awesome! However, your website is still down in case of an outage. Let's make your app more resilient, or HA (High Availability) in AWS jargon.
If your EC2 instance is down, it means that the AZ it sits in is experiencing issues, right? Then let's use a second AZ, which is unlikely to be affected at the same time. You simply replicate the previous architecture across several AZs inside a region. The main tool to stitch these architectures together is a load balancer. And not just any load balancer: an Elastic Load Balancer (ELB), as everything in the cloud is elastic. As its name suggests, the ELB distributes traffic to your web servers and performs health checks (are your servers running, at what speed, etc.). If one server goes down, the ELB automatically routes traffic to the others. This also helps spread traffic evenly across your instances, avoiding overload in case of a sudden surge in traffic. You may also use two databases, one actively used while the other maintains a copy at all times.
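Here is a minimal sketch of that wiring with boto3: a target group defines which instances receive traffic and how they are health-checked. All IDs are placeholders, and a full setup would also create the load balancer itself plus a listener forwarding incoming traffic to this target group:

```python
import boto3

elb = boto3.client("elbv2", region_name="ap-southeast-1")

# The target group: where traffic goes, and how instances are health-checked.
target_group = elb.create_target_group(
    Name="my-app-targets",
    Protocol="HTTP",
    Port=5000,                      # the port our Flask app listens on
    VpcId="vpc-0123456789abcdef0",  # placeholder
    HealthCheckPath="/",            # the ELB pings this path on each instance
)
tg_arn = target_group["TargetGroups"][0]["TargetGroupArn"]

# Register one instance per AZ; unhealthy ones are taken out of rotation.
elb.register_targets(
    TargetGroupArn=tg_arn,
    Targets=[
        {"Id": "i-0123456789abcdef0"},  # placeholder: instance in AZ 1
        {"Id": "i-0fedcba9876543210"},  # placeholder: instance in AZ 2
    ],
)
```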
So far, you have made your app resilient and highly available. What if now, you want your app to be performant and efficient, even with millions of users?
Solving the volume problem
It is time to take advantage of the Edge Locations we talked about earlier! On your website, you may have large static content like movies or pictures that is not going to be edited but needs to be highly available and highly durable (you don't wanna lose that pic of Uncle Brad). You are going to store this kind of data in a Simple Storage Service (S3) bucket and have it located closer to your users, thanks to Amazon CloudFront (a Content Delivery Network). The content is stored in your initial region of choice (e.g. Singapore) and replicated in the edge locations, all managed by CloudFront for you. The advantage of using S3 is that it is an object storage solution, as opposed to the block storage solution used in your EC2 instance (called Elastic Block Store or EBS, which you can think of as a hard drive). Without going into the details of the difference between the two, you can remember that object storage came as a solution to the explosion of data generated these past few years (see the sketch after this list):
- It is much more durable than block storage, and it is easier to increase storage capacity over time while keeping costs under control
- It is well suited for storing objects that don't need incremental changes, like text (ideal for backups, for instance)
- Storage costs are lower when data is accessed infrequently. As an example, a standard S3 bucket costs around 2 US cents per GB per month, while an S3 Glacier bucket (for long-term archiving) costs around 0.2 US cents (prices vary by region).
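As announced above, here is a minimal sketch of storing static content in S3 with boto3; the bucket name is a placeholder and must be globally unique:

```python
import boto3

s3 = boto3.client("s3", region_name="ap-southeast-1")

bucket = "my-app-static-content"  # placeholder: bucket names are globally unique
s3.create_bucket(
    Bucket=bucket,
    CreateBucketConfiguration={"LocationConstraint": "ap-southeast-1"},
)

# Upload Uncle Brad's picture; CloudFront can then serve it from edge locations.
s3.upload_file("uncle_brad.jpg", bucket, "photos/uncle_brad.jpg")

# Infrequently accessed objects can go straight to a cheaper storage class.
s3.upload_file(
    "old_backup.tar.gz", bucket, "backups/old_backup.tar.gz",
    ExtraArgs={"StorageClass": "GLACIER"},
)
```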
Congrats! Your application is now robust and can serve your users with large content and low-latency connection throughout the world!
What happens now when your needs vary over time? A sudden hype puts your web application under the spotlight and thousands of users flock to your website. Or inversely, you have just passed a seasonal peak and face a drastic drop in daily users. That's when you need some automation!
Autoscaling
Varying traffic on your website translates into very concrete changes in certain metrics: storage and CPU usage, for instance. With Amazon CloudWatch, you can monitor these metrics on a predefined group of instances, say the one below with four EC2 instances spread over two AZs:
In a simple case, you want to add or remove instances from this group depending on current needs. This is easily done with Auto Scaling, with which you can set up alarms and define a scaling policy. For instance, let's say you define an alarm that triggers when average CPU utilization goes beyond 80% of the available capacity. You may choose to add 30% capacity to your existing CPU resources when this alarm goes off. It is this flexibility that allows you to adapt to any variation in user traffic.
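Here is a sketch of that exact policy with boto3: a simple scaling policy that grows the group by 30%, triggered by a CloudWatch alarm on average CPU utilization (the Auto Scaling group name is a placeholder):

```python
import boto3

autoscaling = boto3.client("autoscaling", region_name="ap-southeast-1")
cloudwatch = boto3.client("cloudwatch", region_name="ap-southeast-1")

# Scaling policy: add 30% more instances when triggered.
policy = autoscaling.put_scaling_policy(
    AutoScalingGroupName="my-app-asg",  # placeholder: the group of EC2 instances
    PolicyName="scale-out-30-percent",
    PolicyType="SimpleScaling",
    AdjustmentType="PercentChangeInCapacity",
    ScalingAdjustment=30,               # grow the group by 30%
    Cooldown=300,                       # wait 5 minutes between scaling actions
)

# CloudWatch alarm: fire when average CPU exceeds 80% for two 5-minute periods.
cloudwatch.put_metric_alarm(
    AlarmName="high-cpu",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "AutoScalingGroupName", "Value": "my-app-asg"}],
    Statistic="Average",
    Period=300,
    EvaluationPeriods=2,
    Threshold=80.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=[policy["PolicyARN"]],  # trigger the scaling policy above
)
```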
And here is the whole architecture of your web application:
This is a lot to process, and there are even more AWS services out there! Take storage, for instance: you could use DynamoDB for storing unstructured data, or ElastiCache for extremely fast access to data. You could automate deployment using Elastic Beanstalk, and so on. We cannot cover them all here, but feel free to browse the AWS documentation to learn more about them.