By Ashish Dubey, VP Solutions Architecture, Qubole
As more and more businesses move from on-premise infrastructure to the cloud to benefit from cost savings, efficiency, speed and a democratization of data access, CFOs are glancing nervously at rising cloud cost overruns. Cloud sprawl, the unfettered proliferation of an organization’s cloud instances due to a lack of real-time control over cloud computing resources, is a real problem plaguing data-driven organizations.
While recent competition between cloud service providers (Amazon, Google and Microsoft) has benefited customers, failing financial governance of cloud computing costs threatens to dilute the gains companies have seen from harnessing the potential of big data, such as improved customer experience and better product development roadmaps.
The Case for Strong Financial Governance
A marked difference between on-premise infrastructure costs (large upfront commitments for long-term savings) and cloud infrastructure costs is the on-demand, per-instance usage of cloud computing resources. A rather simplified comparison is signing up for a highly optimized data package from your ISP but then burning through large pools of bandwidth without real-time checks and filters. This can lead to unwanted surprises in your cloud bill. Governance keeps the checks and balances in place; it is essentially a series of everyday tasks that are critical to maintaining accountability and control over cloud spend.
Moving to the cloud has fewer risks today. The move, done with proper planning and POCs, is easy and not very time consuming. Most cloud payment models are pay-as-you-go and on-demand, so organizations do not see a hefty upfront bill. However, as cloud projects mature and use cases and instances become layered and more complex, the danger of a runaway cloud bill goes up. It’s easy to bucket some of the reasons for this. Because application requests are not known in advance, servers are allocated ahead of time, which increases server running time. Most web applications are engineered to reduce latency (for better customer experience) rather than costs. This means we have to forgo the on-demand advantage that cloud servers offer for changing workloads, and it leads to poor performance optimization.
While most applications are designed assuming a gradual increase and decrease in data processing requirements, real-world data can be bursty, which sharply increases the need for servers. Idle time is the other side of the same problem. Most web applications have a steady traffic flow, but large workloads can be scattered through the day, leaving idle periods when usage is much lower.
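The effect of bursty demand and idle time on the bill can be illustrated with some simple arithmetic. The sketch below compares a fleet sized once for the peak hour against one resized every hour. All figures (the demand profile, per-server capacity and hourly price) are made-up assumptions for illustration, not data from any provider.

```python
import math

# Hypothetical demand (requests per hour) over one day, with two short bursts.
hourly_demand = [200] * 8 + [1800] * 2 + [300] * 6 + [2000] * 2 + [250] * 6

REQUESTS_PER_SERVER = 500     # assumed hourly capacity of one server
PRICE_PER_SERVER_HOUR = 0.40  # assumed on-demand price in USD


def servers_needed(demand):
    """Smallest number of servers that can serve the given hourly demand."""
    return max(1, math.ceil(demand / REQUESTS_PER_SERVER))


def static_cost(demand_by_hour):
    """Cost when the fleet is sized for the peak hour and never scaled down."""
    peak = max(servers_needed(d) for d in demand_by_hour)
    return peak * len(demand_by_hour) * PRICE_PER_SERVER_HOUR


def elastic_cost(demand_by_hour):
    """Cost when the fleet is resized every hour to match demand."""
    return sum(servers_needed(d) for d in demand_by_hour) * PRICE_PER_SERVER_HOUR


def idle_fraction(demand_by_hour):
    """Share of the static fleet's paid server-hours that an elastic fleet would not need."""
    peak = max(servers_needed(d) for d in demand_by_hour)
    paid = peak * len(demand_by_hour)
    used = sum(servers_needed(d) for d in demand_by_hour)
    return 1 - used / paid


if __name__ == "__main__":
    print(f"static:  ${static_cost(hourly_demand):.2f} per day")
    print(f"elastic: ${elastic_cost(hourly_demand):.2f} per day")
    print(f"idle share of static fleet: {idle_fraction(hourly_demand):.1%}")
```

With these assumed numbers, the peak-sized fleet costs about $38.40 a day against about $14.40 for the hourly-resized one, and 62.5% of the static fleet's paid server-hours go unused. Real workloads also incur scaling delays and minimum billing increments, which this sketch ignores.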
How Can Organisations Strengthen Financial Governance?
One IDC report showed that cloud infrastructure spending went up 23.8% in 2019 from the year before, and estimates that it could go up by as much as another 43% by 2022. While traceability and predictability are important elements of financial governance policies, cost control and expense reduction are usually the starting focus of any financial governance exercise. There are ways for organizations to mitigate these costs:
– Optimize for performance while accounting for speed of query execution as well as timeliness of execution
– Prioritize capacity management as an ongoing exercise: build systems with financial guard rails that let teams build faster without worrying about unexpected costs, and that maintain traceability and predictability at the user, cluster and job cost metrics level.
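A financial guard rail of the kind described above can be as simple as rolling up spend at the user, cluster or job level and comparing it against a budget. The following is a minimal sketch under stated assumptions: the record layout, the names, the prices and the 80% warning threshold are all hypothetical and do not reflect any particular platform's API.

```python
from collections import defaultdict

# Hypothetical usage records: (user, cluster, job, node_hours, price_per_node_hour)
usage_records = [
    ("alice", "etl-cluster", "nightly-load", 120.0, 0.5),
    ("alice", "etl-cluster", "backfill", 300.0, 0.5),
    ("bob", "adhoc-cluster", "exploration", 40.0, 0.8),
]

# Position of each reporting level inside a usage record.
LEVELS = {"user": 0, "cluster": 1, "job": 2}


def cost_by(level, records):
    """Total cost per user, cluster or job."""
    index = LEVELS[level]
    totals = defaultdict(float)
    for record in records:
        node_hours, price = record[3], record[4]
        totals[record[index]] += node_hours * price
    return dict(totals)


def check_guard_rails(totals, budgets, warn_ratio=0.8):
    """Classify each entry as 'ok', 'warn', 'over' or 'no-budget'."""
    statuses = {}
    for name, spent in totals.items():
        budget = budgets.get(name)
        if budget is None:
            statuses[name] = "no-budget"
        elif spent > budget:
            statuses[name] = "over"
        elif spent >= warn_ratio * budget:
            statuses[name] = "warn"
        else:
            statuses[name] = "ok"
    return statuses


if __name__ == "__main__":
    user_budgets = {"alice": 200.0, "bob": 100.0}  # assumed monthly budgets in USD
    per_user = cost_by("user", usage_records)
    print(per_user)
    print(check_guard_rails(per_user, user_budgets))
```

In this made-up data, alice's two jobs add up to more than her budget and would be flagged as over, while bob stays well within his. The same roll-up at the cluster or job level supports show-back reporting.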
Adopting Data Platforms with Built-In Financial Governance Metrics
More mature applications can leverage modern technology platforms involving AI/ML to drive stronger governance. Enterprises should look at platforms that enable Workload Aware Autoscaling in order to strengthen financial governance within the organization. This helps multiple teams run big data workloads in a shared cloud environment, or in separate environments that can be combined, to deliver more savings without compromising performance. Such a platform should also include Optimized Upscaling to reclaim and reallocate unused resources, Aggressive Downscaling to prevent cost overruns due to idle nodes, Container Packing as a resource allocation strategy, and Diversified Spot, which reduces the chances of bulk interruption of Spot nodes by your cloud provider.
To summarise, it is important to have granular visibility of your infrastructure spend at the job, cluster, or cluster instance level. This brings myriad benefits: it helps you track costs, monitor show-back, justify business plans, prepare budgets, and build ROI analyses. The foundation of robust financial governance of data costs is built by giving data teams the tools to view infrastructure needs and correct them immediately, irrespective of application complexity and data analytics requirements.
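To make the idea of container packing followed by downscaling more concrete, here is a generic sketch that uses first-fit decreasing bin packing, a textbook heuristic. It is not the algorithm of any specific platform; the container sizes, node capacity and function names are assumptions chosen for illustration.

```python
def pack_containers(container_sizes, node_capacity):
    """First-fit decreasing: place each container on the first node with room."""
    nodes = []  # each node is a list of the container sizes placed on it
    for size in sorted(container_sizes, reverse=True):
        if size > node_capacity:
            raise ValueError(f"container of size {size} exceeds node capacity")
        for node in nodes:
            if sum(node) + size <= node_capacity:
                node.append(size)
                break
        else:
            # No existing node had room, so open a new one.
            nodes.append([size])
    return nodes


def nodes_to_release(current_node_count, packed_nodes, min_nodes=1):
    """Number of nodes that could be shut down once containers are packed."""
    needed = max(min_nodes, len(packed_nodes))
    return max(0, current_node_count - needed)


if __name__ == "__main__":
    # Hypothetical container memory requests (GB) spread over 5 nodes of 8 GB each.
    containers = [4, 2, 6, 3, 1]
    packed = pack_containers(containers, node_capacity=8)
    print(packed)                       # [[6, 2], [4, 3, 1]]
    print(nodes_to_release(5, packed))  # 3
```

Packing the five containers onto two nodes leaves three nodes empty, and those are the candidates an aggressive downscaling policy would release. A production scheduler would also weigh CPU, data locality and in-flight work before removing a node.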