As with on-premises, there is no single standard cloud solution, but many cloud architectures share common traits: they rely on cloud-based storage systems and face Big Data pressures. Many therefore incorporate layers like a data lake and services like Azure Data Factory for ingestion or Azure Databricks for data preparation.
When dealing with big data scenarios, we often see a cloud-based implementation incorporating a data lake. While there are technically variations of a data lake (and even Apache Spark) that can run on-premises, implementing them is typically much more complicated. Some related services like Azure Data Factory only exist in the cloud, even though they can connect to on-premises data sources.
For the data warehouse and data visualization layers, there are often cloud resources comparable to their on-prem counterparts. For example, you can run SQL Server Database and Analysis Services on-prem, or Azure SQL Database and Azure Analysis Services in the cloud. However, there is not complete parity yet (although support for SSIS in Azure Data Factory and SSRS in Power BI service helps close this gap).
Pros of the Cloud Solution
Now that we have a feel for basic cloud architecture, we can dig into some of the pros and cons of this solution.
Time to Market
One of the biggest benefits of a cloud solution is time to market. You can quickly spin up environments. If you need another QA environment, especially if it’s only a temporary environment, your ability to provision that resource almost immediately is a great asset. This can also be useful for longer-term feature development.
Scalability
Another of the biggest pros is scalability. That’s not just the ability to scale up as demand grows; it’s also the ability to scale down to control costs. Often, we think about scalability as the long-term growth of an application. What’s great about the cloud is that you can also think of it on a smaller timescale. You might need a higher scale during the day but not at night. Or maybe you need a higher scale during the week but not on the weekends. You can start to think of scalability not just in terms of this year vs. last year, but as matching the on-demand needs of your system.
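To make the day/night and weekday/weekend idea concrete, here is a minimal sketch of a schedule-based scaling rule. Everything in it is an assumption for illustration: the capacity numbers, the 08:00 to 18:00 business hours, and the function name `target_capacity` are hypothetical and not tied to any particular Azure service.

```python
from datetime import datetime

# Hypothetical capacity levels; real services use their own tier names and units.
PEAK_UNITS = 8
OFF_PEAK_UNITS = 2


def target_capacity(now: datetime) -> int:
    """Return the capacity to run at for a given moment.

    Assumes peak demand on weekdays between 08:00 and 18:00 (an illustrative
    schedule, not a recommendation).
    """
    is_weekday = now.weekday() < 5           # Monday=0 ... Friday=4
    is_business_hours = 8 <= now.hour < 18   # 08:00 inclusive to 18:00 exclusive
    if is_weekday and is_business_hours:
        return PEAK_UNITS
    return OFF_PEAK_UNITS
```

In practice, a rule like this would be evaluated on a schedule and its result passed to whatever scaling mechanism your chosen service provides.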
Consumption-Based Pricing
With the traditional on-premises solution and some cloud payment plans, you have capacity-based pricing. That means you’re provisioning a server that has a certain amount of bandwidth or processing power, and you’re paying for that capacity whether you use it or not. With some Platform-as-a-Service or Software-as-a-Service offerings, you get consumption-based pricing, so you only pay for what you use. With Azure Data Factory, for example, if you’re not sending any data through a pipeline, then you’re paying little or nothing just to have it defined. Used this way, the cloud can give you cost-effective solutions.
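The difference between the two pricing models comes down to simple arithmetic. The sketch below compares them using made-up rates; the prices, the 720-hour month, and the function names are all hypothetical and do not reflect actual Azure pricing.

```python
def capacity_cost(hourly_rate: float, hours_provisioned: float) -> float:
    """Capacity-based: pay for every provisioned hour, used or not."""
    return hourly_rate * hours_provisioned


def consumption_cost(unit_rate: float, units_used: float) -> float:
    """Consumption-based: pay only for the units actually used."""
    return unit_rate * units_used


def break_even_units(hourly_rate: float, hours_provisioned: float,
                     unit_rate: float) -> float:
    """Usage level at which both models cost the same."""
    return capacity_cost(hourly_rate, hours_provisioned) / unit_rate


# Hypothetical rates: a server at 0.50/hour for a 720-hour month,
# versus a service billed at 0.75 per hour of actual processing.
monthly_capacity = capacity_cost(0.50, 720)        # 360.0
monthly_consumption = consumption_cost(0.75, 120)  # 90.0 for 120 busy hours
crossover = break_even_units(0.50, 720, 0.75)      # 480.0 busy hours
```

With these assumed numbers, consumption-based pricing is cheaper until the workload is busy for more than 480 hours in the month. After that point, the provisioned server would cost less, which is why it helps to estimate your actual usage before choosing a plan.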
Security
In my previous blog post, I mentioned that on-premises is perceived as being more secure. However, the cloud can make a strong claim for security as well. We’ve all seen news headlines about companies having data breaches, and Microsoft and other cloud providers don’t want to be in those headlines. Microsoft takes a lot of steps to make sure its datacenters are secure. Although the cloud gives you the option to open everything up to make access easier, it starts with security in mind and provides tools to help evaluate and monitor security. Ultimately, your architect or designer is responsible for maintaining security.
Maintenance
With the various services that are in the cloud, you don’t have to worry about getting and implementing software updates. Instead, a lot of that maintenance comes automatically with the service.
Service and Feature Options
Whatever kind of service or feature you need, there is likely a cloud service specialized to handle it. Big Data? A low-cost solution? Data science? Data ingestion? Microsoft Azure has 150+ services right now designed to meet different needs.
Cons of the Cloud Solution
These are impressive advantages, but the cloud approach comes with some challenges as well. One of those is simply the fact that it’s a new paradigm.
New Paradigm
We’ve talked about on-premises as acquiring infrastructure or hardware. Now we have options like Infrastructure-as-a-Service, Platform-as-a-Service, and Software-as-a-Service. That means we don’t have to provision the entire server; we might have options for consumption-based pricing. But making those changes can be a challenge.
Options
I listed options as a positive, but they can also be a negative. With 150+ services, it can be hard to know which service best balances cost against your performance and functionality needs.
Skillset
Again, due to the growing number of tools in the cloud, there is also a growing number of skillsets that might be required to work with those different tools.
Automatic Updates
Automatic updates are another point that shows up as both a pro and a con. Staying up-to-date happens automatically, which can be great from a security and functionality perspective. In general, most of those updates are also backward-compatible. However, there can be breaking changes. In those cases, notices are usually sent out with adequate time to address the issues. But in many cases you aren’t given a choice about whether to update, because the cloud is a shared service.
And the Winner Is…
Despite these drawbacks (which you absolutely should be careful about), in my opinion the cloud is the clear winner. I may be a bit biased, but its scalability, affordability, and dynamic performance make it a good choice for many processes.