Optimizing operational costs through cloud (opens in new tab)
Coupang’s Finance and Engineering teams collaborated to optimize cloud expenditures by focusing on resource efficiency and the company's "Hate Waste" leadership principle. Through a dedicated optimization project team and the implementation of data-driven analytics, the company successfully reduced on-demand costs by millions of dollars without compromising business growth. This initiative transformed cloud management from a reactive expense into a proactive engineering culture centered on financial accountability and technical efficiency. ### Forming the Optimization Project Team * A specialized team consisting of Cloud Infrastructure Engineers and Technical Program Managers (TPMs) was established to bridge the gap between finance and engineering. * The project team focused on educating domain teams about the variable cost model of cloud services, moving away from a fixed-cost mindset. * Technical experts helped domain teams identify opportunities to use cost-efficient technologies, such as ARM-based AWS Graviton processors and AWS Spot Instances for data processing. * The initiative established clear ownership, ensuring that each domain team understood and managed their specific cloud resource usage. ### Analytics and Dashboards for Visibility * Engineers developed custom dashboards using Amazon Athena to process Amazon CloudWatch data, providing deep insights into resource performance. * The team utilized AWS Cost & Usage Reports (CUR) within internal Business Intelligence (BI) tools to provide granular visibility into spending patterns. * Finance teams worked alongside engineers to align technical roadmaps with monthly and quarterly budget goals, making cost management a shared responsibility. ### Strategies for Usage and Cost Reduction * **Spend Less (Usage Reduction):** Coupang implemented automation to ensure that non-production environment resources were only active when needed, resulting in a 25% cost saving for those environments. * **Pay Less (Right-sizing):** The team analyzed usage patterns to manually identify and decommission unused EC2 resources across all domain teams. * **Instance and Storage Optimization:** The project prioritized migrating workloads to the latest instance generations and optimizing Amazon S3 storage structures to reduce costs for data at rest. To achieve sustainable cloud efficiency, organizations should move beyond simple monitoring and foster an engineering culture where resource management is a core technical discipline. Prioritizing automated resource scheduling and adopting modern, high-efficiency hardware like Graviton instances are essential steps for any large-scale cloud operation looking to maximize its return on investment.