Cloud Fundamentals for Solutions Architects
To recap our intro module, solutions architects (SAs) are key drivers of the sales cycle for IaaS/PaaS/SaaS companies, as they ensure potential customers understand the features and benefits of a new enterprise product. SAs come from diverse backgrounds, including engineering, consulting, and even sales.
Irrespective of the background, however, every SA must know cloud basics. Demonstrated cloud experience signals that an SA has the required foundation on which to build product and industry-specific knowledge needed to sell technical solutions.
Why Should I Learn About Cloud Architecture?
Beginner SAs benefit from learning cloud architecture because most of the products SAs sell are deployed on the cloud. For more experienced SAs, specific insight into how big, complex organizations leverage cloud capabilities to satisfy business goals cements their image as “trusted advisors” for customers.
Tip: Check out the cloud architecture recap in Exponent's Fundamentals of System Design Module for more in-depth coverage of the reasons big companies switch to cloud solutions.
An early-career SA candidate might be asked general questions on commonly used cloud services like storage, compute products, etc. An experienced candidate might be asked more in-depth or more solution-specific questions.
For example, an experienced SA interviewing for Databricks might first be asked how to deploy a distributed data pipeline on Databricks. From there, they may get follow-up questions on how to make the pipeline resilient and secure, how to ensure high availability, how to do proper fault-handling, and how to ensure data at rest and in motion is secure.
Before we deep-dive into specifics, let’s get a feel for how cloud technologies work.
Why Use Cloud Solutions?
Before cloud computing, every organization hosted its data and servers in its own private data center (a physical location hosting servers that provide computing power). Depending on requirements, a data center might host just a few servers or millions. As you might imagine, this was expensive, logistically difficult, and posed challenges to scalability. Companies with widely varying usage might not be able to handle traffic spikes with just a few servers, while during low-usage periods, excess capacity might sit idle.
Cloud providers like AWS, Azure, and GCP solve these problems by providing centralized data centers in multiple locations around the world. These locations are called regions. Any given region typically has three or four availability zones: smaller data centers, each hosting a fraction of the total server capacity in the region. Why split them up? To ensure business continuity.
When single availability zones go down or don’t perform to expected standards (due to natural disasters, electrical outages, technical reasons, etc.), other availability zones can take on the load of the affected availability zone and continue serving traffic.
The main benefit of using cloud computing is that each organization does not need to procure and maintain its own servers. Instead, it pays cloud providers for exactly what's needed, adding or removing capacity as traffic grows and shrinks, while taking advantage of built-in security and out-of-the-box services like AI/ML and advanced analytics.
An Introduction to Cloud Architecture
Let's take a look at a high-level diagram of one of the most common cloud architectures.

We'll review these components one by one.
Storage
One of the key differentiators of cloud systems is the cheap, almost-infinite storage that providers such as AWS, Azure, and GCP offer. Leveraging economies of scale, they deliver distributed, resilient storage at low cost, regardless of the volume of data you need. Connecting to cloud storage is simple, whether through APIs or other means, and granular access control enables data access only to those who require it.
AWS’s cloud storage product is called S3 (Simple Storage Service); Azure’s is Blob Storage, and GCP’s is Cloud Storage. These can be used to store everything from audio and video files to text and .csv files: almost anything you can think of.
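To get a feel for the bucket/key/body model these services share, here is a toy in-memory object store. This is a local stand-in for illustration only; real code would use the provider's SDK (e.g., boto3 for AWS), and the bucket and key names below are made up.

```python
# Toy in-memory object store mimicking the bucket/key/body model of
# S3-style storage services. Illustrative only; not a real cloud client.

class ToyObjectStore:
    def __init__(self):
        self._buckets = {}  # bucket name -> {key: bytes}

    def create_bucket(self, bucket: str) -> None:
        self._buckets.setdefault(bucket, {})

    def put_object(self, bucket: str, key: str, body: bytes) -> None:
        self._buckets[bucket][key] = body

    def get_object(self, bucket: str, key: str) -> bytes:
        return self._buckets[bucket][key]

    def list_objects(self, bucket: str, prefix: str = "") -> list:
        # Like S3, "folders" are really just key prefixes such as "videos/".
        return sorted(k for k in self._buckets[bucket] if k.startswith(prefix))

store = ToyObjectStore()
store.create_bucket("media")
store.put_object("media", "videos/intro.mp4", b"\x00\x01")
store.put_object("media", "docs/report.csv", b"a,b\n1,2\n")
print(store.list_objects("media", prefix="videos/"))  # ['videos/intro.mp4']
```

Note how any kind of binary payload (video, CSV, etc.) fits the same put/get interface; that uniformity is what makes object storage so broadly useful.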
- Further reading for beginners: Explore the storage capabilities provided by AWS here. Check out the various kinds of storage, and take a moment to brainstorm use cases for each type.
- Further reading for experienced candidates: Check out these engineering blogs from LinkedIn, Meta, and Twitter to learn how each company effectively stores petabyte-scale data.
Databases
Cloud vendors take the complexity out of running and managing your own database, no matter what your needs are. In a few clicks, you can provision your database, create schemas, enable security, and connect to these databases. All cloud providers like AWS, Azure, and GCP provide their own database products and support running other databases like MongoDB on their platforms, thus ensuring a minimal amount of friction when an existing service needs to be moved to the cloud.
Both SQL and NoSQL databases, from the cloud vendors themselves and from third parties, are widely available on cloud platforms, so virtually any workload can run seamlessly in the cloud.
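The "provision, create schemas, connect, query" workflow looks the same whichever managed SQL product you choose. As a self-contained sketch, the snippet below uses Python's built-in sqlite3 module as a stand-in for a managed cloud database; the table and data are hypothetical.

```python
import sqlite3

# In the cloud you'd provision a managed database and connect over the
# network; sqlite3 stands in here so the example runs anywhere.
conn = sqlite3.connect(":memory:")

# Create a schema, just as you would after provisioning a managed SQL database.
conn.execute("""
    CREATE TABLE orders (
        id       INTEGER PRIMARY KEY,
        customer TEXT NOT NULL,
        cents    INTEGER NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO orders (customer, cents) VALUES (?, ?)",
    [("alice", 1999), ("bob", 500), ("alice", 1250)],
)

# Query it: total spend per customer, in cents.
rows = conn.execute(
    "SELECT customer, SUM(cents) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
print(rows)  # [('alice', 3249), ('bob', 500)]
```

With a managed service, only the connection step changes; the schema design and SQL you write stay the same, which is why moving an existing database workload to the cloud involves minimal friction.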
Tip: Need a refresher on SQL vs. NoSQL? Check out Exponent's refresher here.
- Further reading for beginners: Continue exploring AWS here. This time, think about when you'd want to store data in a SQL vs. NoSQL database. As a bonus, check out more exotic databases (like graph databases) and brainstorm use cases for those.
- Further reading for experienced candidates: Read these two Uber engineering blogs covering common database migration projects from start to finish. The first covers Uber's switch from Postgres to MySQL, and the second covers a later switch from the default MySQL storage engine to MyRocks.
Compute
On-demand server capacity is a key driver of the widespread adoption of cloud platforms. Before cloud providers were available, procuring servers took weeks (if not months). The cycle of requesting servers, physically delivering them to a company-owned data center, and configuring and maintaining them cost a great deal of time and money. With cloud providers, server capacity is available in a matter of minutes instead of weeks or months, with the option of choosing from a wide range of configurations.
- Further reading for beginners: Check out the various compute products offered by AWS here. Think about the benefits/disadvantages of traditional compute (such as EC2) vs. newer offerings like Serverless and brainstorm ideal use cases for both. Bonus: Read up on containerization and when it should be used.
- Further reading for experienced candidates: Read these engineering blogs from Airbnb and Meta.
Tip: Airbnb makes use of Kubernetes, an open-source orchestration engine to unify and scale its architecture. Unfamiliar with orchestration services like Kubernetes? Start with the Exponent explainer for a quick overview, then dive deep with Kubernetes documentation.
Web Servers
Servers are a category of compute technology, and still the most widely used. For beginners, it helps to understand the various types of servers and the situations in which each is used. A web server is the first point of contact for an incoming request. For example, consider a very simple website with a single page of static content. Behind the scenes, a web server holds the static HTML page and responds with it when it receives a request. When you access a website over the web, the requesting application is your browser; the web server is what supplies the information you eventually see there.
For a high-traffic website, a single web server cannot handle all traffic, so multiple web servers must be provisioned to handle the load together.
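The request/response loop above can be seen end to end with Python's built-in http.server module. This is a minimal sketch, not a production web server: it serves one hypothetical static HTML page, and the "browser" is simulated with urllib.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# The static content our toy web server holds (hypothetical page).
PAGE = b"<html><body><h1>Hello from the web server</h1></body></html>"

class StaticHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Respond to every GET with the static HTML page.
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, *args):  # silence per-request logging
        pass

# Port 0 asks the OS for any free port; run the server in a background thread.
server = HTTPServer(("127.0.0.1", 0), StaticHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Play the role of the browser: request the page over HTTP.
with urllib.request.urlopen(f"http://127.0.0.1:{server.server_port}/") as resp:
    status = resp.status
    body = resp.read()
server.shutdown()
print(status)  # 200
```

In practice the same loop happens between your browser and a fleet of web servers; the next sections cover how that fleet is fronted and balanced.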
- Further reading for beginners: Check out this great guide to EC2 servers and their different features.
- Further reading for experienced candidates: Check out this blog by Uber Engineering to understand how Uber manages hundreds of thousands of servers across cloud and on-premise infrastructure.
API Gateways
API Gateways have proven to be one of the most essential components of any cloud architecture. They ensure that services built on the cloud can be managed from a central location, rather than each service implementing capabilities like security and surge protection on its own, which makes service management much simpler across the entire organization.
With API Gateways, service monetization is also greatly simplified. For example, Twitter uses API Gateways to monetize its APIs, with different tiers linked to service usage.
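The "one front door, shared policy" idea can be sketched in a few lines. Below is a toy gateway that routes requests to registered backend services and applies a shared token-bucket rate limit for surge protection. The service path, limits, and handler are hypothetical; real deployments would use a managed product like Amazon API Gateway.

```python
import time

class TokenBucket:
    """Classic token-bucket rate limiter: each request spends one token."""

    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill_per_sec = refill_per_sec
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

class ApiGateway:
    def __init__(self, rate_limit: TokenBucket):
        self.routes = {}          # path -> backend handler
        self.rate_limit = rate_limit

    def register(self, path, handler):
        self.routes[path] = handler

    def handle(self, path, request):
        if not self.rate_limit.allow():
            return 429, "Too Many Requests"   # shared surge protection
        if path not in self.routes:
            return 404, "Not Found"
        return 200, self.routes[path](request)

# Refill rate 0 makes the example deterministic: 3 requests, then throttled.
gateway = ApiGateway(TokenBucket(capacity=3, refill_per_sec=0.0))
gateway.register("/orders", lambda req: f"order placed for {req}")

results = [gateway.handle("/orders", "alice") for _ in range(5)]
print([status for status, _ in results])  # [200, 200, 200, 429, 429]
```

Because throttling lives in the gateway, none of the backend services need their own rate-limiting code, and usage counts taken at this choke point are also what make tiered API monetization straightforward.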
- Further reading for beginners: Check out this great introduction to API Gateways on AWS, detailing reference architectures on how API Gateways can be used in the real world.
- Further reading for experienced candidates: Learn how LinkedIn and Uber use API gateways to handle millions of API requests.
Caching
Caching is another essential component of any large-scale distributed system. Querying a system such as a database is typically very computationally expensive, so a cache can store the responses to frequent queries made to the database, preventing it from being overloaded by a high number of requests. Accessing and maintaining a cache is usually computationally cheap, and caches are very lightweight, requiring far fewer resources to run than a database.
Caches come in multiple flavors, depending on the use case. Some caches store only key-value pairs, while others support more complex data structures. Two of the most widely used caches are Redis and Memcached.
Tip: Review caching in more detail here.
- Further reading for beginners: Check out this overview of what caching is and the options available in AWS to enable caching for your applications.
- Further reading for experienced candidates: Learn how Netflix and Airbnb use caching to deliver a great user experience.
Load Balancers
Load balancers are used to distribute incoming requests across a number of servers. In the Web Servers section, we talked about the need for multiple web servers to handle incoming requests to a high-traffic website. To effectively manage all of those web servers, a load balancer is typically used. The load balancer ensures that requests are evenly distributed to all web servers, and keeps tabs on the health of each web server through automated checks.
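Those two responsibilities, even distribution and health checking, can be sketched together. Below is a toy round-robin load balancer that skips servers failing a health check; the server names and probe function are illustrative.

```python
import itertools

class LoadBalancer:
    """Round-robin distribution across servers that pass a health check."""

    def __init__(self, servers):
        self.servers = servers
        self.healthy = set(servers)          # assume all healthy at startup
        self._cycle = itertools.cycle(servers)

    def health_check(self, probe):
        # `probe(server)` returns True if the server responds; a real load
        # balancer runs this automatically on a schedule.
        self.healthy = {s for s in self.servers if probe(s)}

    def next_server(self):
        # Advance round-robin, skipping any server currently marked unhealthy.
        for _ in range(len(self.servers)):
            server = next(self._cycle)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")

lb = LoadBalancer(["web-1", "web-2", "web-3"])
lb.health_check(lambda s: s != "web-2")      # pretend web-2 has gone down
picks = [lb.next_server() for _ in range(4)]
print(picks)  # web-2 is skipped: ['web-1', 'web-3', 'web-1', 'web-3']
```

Production load balancers (e.g., AWS Elastic Load Balancing) offer more strategies than round-robin, such as least-connections routing, but the skip-the-unhealthy-server behavior is exactly what keeps a single server failure invisible to users.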
- Further reading for beginners: Check out this comprehensive guide to load balancing in AWS.
- Further reading for experienced candidates: Learn how Meta and LinkedIn leverage load balancers to serve high loads with minimum latency.
High Availability
Earlier we talked about availability zones. Almost all organizations deploy their applications/services in multiple availability zones to ensure business continuity in case one zone goes down. Some organizations require multi-region high availability, i.e., applications are deployed not just in multiple availability zones within a region, but also across multiple regions, like us-east-1 and us-west-1. This ensures business continuity even if an entire region (a large geographic area) is affected by a natural calamity or technical problems.
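A toy failover routine makes the multi-AZ and multi-region distinction concrete. The zone names below mirror AWS conventions, but the deployment map and health data are entirely made up for illustration.

```python
# Hypothetical deployment: the app runs in three AZs in the home region
# and two AZs in a second region for disaster recovery.
DEPLOYMENTS = {
    "us-east-1": ["us-east-1a", "us-east-1b", "us-east-1c"],
    "us-west-1": ["us-west-1a", "us-west-1b"],
}

def pick_zone(healthy, home_region="us-east-1"):
    """Route to a healthy AZ, preferring the home region; `healthy` maps
    zone name -> bool (zones absent from the map are treated as down)."""
    regions = [home_region] + [r for r in DEPLOYMENTS if r != home_region]
    for region in regions:
        for zone in DEPLOYMENTS[region]:
            if healthy.get(zone, False):
                return zone
    raise RuntimeError("total outage: no healthy zone in any region")

# One AZ down: traffic stays in-region, shifted to a sibling AZ.
print(pick_zone({"us-east-1a": False, "us-east-1b": True}))  # us-east-1b
# Entire home region down: traffic fails over to the other region.
print(pick_zone({"us-west-1a": True}))                       # us-west-1a
```

Real cloud platforms automate this via health-checked DNS and load balancing rather than application code, but the routing preference is the same: stay in-zone, then in-region, then cross-region.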
- Further reading for beginners: Check out this great blog on how to get started with high availability in AWS.
- Further reading for experienced candidates: Learn how Meta and Netflix ensure high availability.
Scalability
One of the most important reasons to use cloud solutions is to increase scalability, which is a critical concern for modern tech companies.
Earlier, we talked about how easy it is to request server capacity on a cloud platform. Consider a retailer who needs 5x server capacity to handle transactions on its website during a once-a-year "Black Friday" (the biggest shopping day in the US) sale. What would they do with that capacity for the rest of the year? Cloud providers unlock the ability to automatically increase/decrease server capacity, ensuring that organizations do not overspend on server capacity. Scaling can be either vertical (increasing a server's CPU power or memory) or horizontal (multiple servers working as a group to handle incoming requests).
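A back-of-the-envelope autoscaling rule shows how horizontal scaling absorbs a Black Friday spike without paying for idle capacity all year. The traffic figures and per-server capacity below are hypothetical.

```python
import math

def desired_servers(requests_per_sec, capacity_per_server,
                    min_servers=2, max_servers=50):
    """How many servers to run: enough for current traffic, but never
    below a small floor (for availability) or above a budget ceiling."""
    needed = math.ceil(requests_per_sec / capacity_per_server)
    return max(min_servers, min(needed, max_servers))

print(desired_servers(900, 500))     # normal day: 2 servers
print(desired_servers(24_000, 500))  # Black Friday spike: 48 servers
print(desired_servers(100, 500))     # quiet period: floor of 2 still applies
```

Managed autoscaling services apply essentially this logic on metrics like CPU utilization or request count, adding and removing instances behind the load balancer automatically, so the retailer pays for 48 servers only on the day it needs them.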
Tip: Dive deeper into what scalability means in tech here.
- Further reading for beginners: Check out this comprehensive blog on how AWS suggests handling scalability when designing and deploying applications. Take a moment to brainstorm relevant scenarios to use horizontal or vertical scaling.
- Further reading for experienced candidates: Check out these engineering blogs from Spotify on solutions for scaling its event-delivery system. Part 1 gives an overview of Spotify's "road to the cloud" and Part 2 covers the specific challenges of autoscaling Pub/Sub messaging.
How Can I Learn More about Cloud Architecture?
A great way to learn the details about cloud solutions is to work towards cloud certification(s).
Cloud providers like AWS, Azure, and GCP offer different certification paths, which enable you to start with the basics and choose a specialty as you go forward, like networking, security, etc.
For beginner SAs, we recommend starting with a “Cloud Fundamentals” certification and then achieving the “Cloud Architect” certification. A fundamentals certification gets you comfortable with using cloud products, and a cloud architecture certification gives you a complete overview of the most commonly used cloud services, creating a strong knowledge base for SA roles. We suggest starting with either AWS or Azure, as these are the two most widely used cloud providers.
AWS Certifications
Azure Certifications
Once you've completed one (or more) of these, the next step is for you to determine the domain in which you want to specialize.
For example, if you want to apply for SA roles in data engineering and AI/ML companies like Snowflake and Databricks, it would be great to have some data engineering and AI/ML certifications/knowledge under your belt.
Another option is to consider Solutions Architect roles at AWS, Azure, or GCP themselves; these roles would require you to go deeper into how cloud products work and how organizations use them to solve business problems.