Keeping operational costs optimized in the cloud is one of the main pillars of a cloud-based company's success, alongside resilience, agility, and security. There are many resources discussing cost optimization, but they often lack a critical component: the human factor. Your people and processes will determine your financial success much more than the tools you use. Today we’ll focus on the former.
Let’s look at a basic scenario that probably happens at your company as well. The FinOps (Financial Operations) team, which takes care of cost control and optimization for your cloud operations, picks up on a component or a process that requires a change to reduce cost and improve operations. For example, it could be changing your instance types from the existing generation to the latest one. Moving to the new instance type can cut cost by up to 20%.
The FinOps team reaches out to the relevant development (or DevOps) teams with recommendations to change the instance types they are using. This is where the optimization process can break. Sometimes the development team simply does not respond, or they state that they have challenges (technical, time-related, or other) in addressing the required changes. There are cases where development teams accept and execute changes in a timely manner, but there is a lack of visibility into the change and its impact on cost. In practice, the cost cut disappears into the “void”. Maybe it will be back sometime, probably too late, much like the passengers of flight 828 in the TV drama “Manifest”, and that has not ended well for them (so far, as of season 3 :-)).
During the many journeys I have led with companies, I found the following process valuable in addressing the cost optimization challenges described here.
First, make sure you have continuous monitoring of your application workloads across their environments (development, QA, staging, pre-production, production). In each environment, your application workload (the resources it uses) may change. Tag your application deployments with their change number (revision, version, etc.). That way, you can observe the performance and resource use of your workloads across their change numbers and compare them.
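As a rough illustration of comparing workloads across change numbers, the sketch below groups daily cost samples by environment and revision tag and averages them. The sample data, the tag values, and the function name `average_cost_by_revision` are all hypothetical; in practice the samples would come from whatever monitoring or billing export you already use.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical daily cost samples: (environment, revision tag, daily cost in USD).
# Real data would come from your own monitoring or billing export.
samples = [
    ("production", "v41", 120.0),
    ("production", "v41", 124.0),
    ("production", "v42", 100.0),
    ("production", "v42", 98.0),
    ("staging", "v42", 30.0),
]


def average_cost_by_revision(samples):
    """Return the average daily cost for each (environment, revision) pair."""
    grouped = defaultdict(list)
    for environment, revision, daily_cost in samples:
        grouped[(environment, revision)].append(daily_cost)
    return {key: mean(costs) for key, costs in grouped.items()}


averages = average_cost_by_revision(samples)
for (environment, revision), cost in sorted(averages.items()):
    print(f"{environment} {revision}: {cost:.2f} USD/day")
```

Keeping the environment in the grouping key matters, because the same revision can consume very different resources in staging and in production.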
Your FinOps team should be able to open a software or configuration change request for the development teams when they find a workload that needs a change to reduce its cost. That change request should enter the development team's task queue, and the task should have an estimated cost reduction assigned to it. You can compute the estimated cost reduction by taking the workload's current cost and applying the estimated percentage cut to it.
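The estimate described above is simple arithmetic, and a minimal sketch might look like the following. The function name, the monthly granularity, and the example figures (a 5,000 USD monthly workload and a 20% cut) are assumptions for illustration, not numbers from a real workload.

```python
def estimated_saving(current_cost, estimated_cut_percent):
    """Estimate the saving from applying a percentage cut to the current cost.

    current_cost: what the workload costs today (for example, USD per month).
    estimated_cut_percent: expected reduction, as a percentage between 0 and 100.
    """
    if current_cost < 0:
        raise ValueError("current_cost must not be negative")
    if not 0 <= estimated_cut_percent <= 100:
        raise ValueError("estimated_cut_percent must be between 0 and 100")
    return current_cost * estimated_cut_percent / 100


# Hypothetical example: a workload costing 5,000 USD per month with an
# estimated 20% cut from moving to a newer instance generation.
monthly_saving = estimated_saving(5000.0, 20)
print(f"Estimated saving: {monthly_saving:.2f} USD per month")
```

Attaching this figure to the change request gives the development team a concrete number to weigh against its other tasks.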
If you are a developer, you may say, “but my development team’s tasks don’t have any financial figure attached to them, so how would I prioritize the cost reduction task versus a marketing or customer request?”. My answer is that most of your tasks could have a financial gain figure assigned to them. Adding functionality could have a “5% income gain” assigned to it, which could then be compared to a “cost cut of 10%” once both are expressed in the same terms, such as money per month. Lacking that, you should still strive to find a way to prioritize cost-cut-related tasks.
Once the development team's priorities allow it, they develop the code to implement the cost reduction. Since the task is in the development queue, everyone knows about it, the FinOps team included, and we have visibility and accountability. Once the code change deploys into an environment (provided it is NOT mixed with other changes), you can clearly observe its effect on cost in your monitoring system by comparing the cost of your workload before and after the change.
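To make the before-and-after comparison concrete, here is a minimal sketch that computes the realized saving from daily cost samples taken around a single deployment. The sample values and the name `realized_saving` are hypothetical, and the sketch assumes the deployment contained only the cost-reduction change.

```python
from statistics import mean


def realized_saving(daily_costs_before, daily_costs_after):
    """Return (absolute saving per day, saving as a percentage of the old cost)."""
    before = mean(daily_costs_before)
    after = mean(daily_costs_after)
    if before <= 0:
        raise ValueError("average cost before the change must be positive")
    saving = before - after
    return saving, saving / before * 100


# Hypothetical daily costs (USD) for the revisions before and after the change.
before_change = [120.0, 124.0, 122.0]
after_change = [100.0, 98.0, 99.0]

saving, saving_percent = realized_saving(before_change, after_change)
print(f"Daily saving: {saving:.2f} USD ({saving_percent:.1f}%)")
```

Comparing this realized figure with the estimate attached to the original task shows the FinOps and development teams whether the change delivered what was expected.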
Of course, this process is still challenging in many cases, and putting all the components I described in place can take time and effort to set up and maintain, but I believe aiming for it brings valuable benefits.
We are all eager to learn from your experience on this topic, so please share!
Jacky Bezalel, Senior Technical Leader at Amazon Web Services; Teams and Senior Management Career Coach.