All cloud objects are elastic and ephemeral, which makes it a real problem to understand, analyze, and predict their behavior for cost optimization and capacity management. Observability tools collect the raw data, but it is big and messy. We explain how to turn that mess into information.
Cloud cost optimization and capacity management depend on system performance data about objects such as clusters, containers, serverless objects, databases, and virtual disks. This presentation explains and demonstrates how to clean that data with anomaly and change point detection without generating false positives from normal seasonality; how to aggregate it so that workload jumping from one cluster to another due to “rehydration”, releases, and failovers does not distort the picture; how to summarize it to avoid sinking in granularity; and how to interpret it for cost and capacity usage assessments. Finally, it shows how to use the clean, aggregated, and summarized data for capacity and cost planning with ML and predictive analytics.
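The steps named in the abstract can be illustrated with a minimal sketch. This is not the presenter's method or tooling. It assumes hourly utilization samples, a simple per-slot mean and standard deviation band as the seasonal baseline, summation across clusters as the aggregation, daily peaks as the summary, and a least-squares straight line as the forecast. All function names and parameters (seasonal_baseline, flag_anomalies, aggregate_by_app, daily_peaks, linear_forecast, period, k, min_band) are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def seasonal_baseline(history, period=168):
    """Return {slot: (mean, std)} for each position in the seasonal cycle.

    period=168 assumes hourly samples and a weekly cycle (an assumption).
    """
    slots = defaultdict(list)
    for i, value in enumerate(history):
        slots[i % period].append(value)
    return {slot: (mean(vals), pstdev(vals)) for slot, vals in slots.items()}


def flag_anomalies(values, baseline, period=168, k=3.0, min_band=1.0):
    """Flag samples outside mean +/- max(k * std, min_band) for their slot.

    Comparing each sample with its own slot keeps a regular seasonal
    peak from being reported as an anomaly.
    """
    flags = []
    for i, value in enumerate(values):
        slot_mean, slot_std = baseline[i % period]
        band = max(k * slot_std, min_band)
        flags.append(abs(value - slot_mean) > band)
    return flags


def aggregate_by_app(samples):
    """Sum (hour, cluster, value) samples across clusters for each hour.

    Summing at the application level hides workload that merely moved
    between clusters (rehydration, release, failover).
    """
    totals = defaultdict(float)
    for hour, _cluster, value in samples:
        totals[hour] += value
    return [totals[hour] for hour in sorted(totals)]


def daily_peaks(hourly, hours_per_day=24):
    """Summarize an hourly series as one peak value per day."""
    return [max(hourly[i:i + hours_per_day])
            for i in range(0, len(hourly), hours_per_day)]


def linear_forecast(series, steps_ahead):
    """Extrapolate a least-squares straight line steps_ahead points."""
    n = len(series)
    x_mean = (n - 1) / 2
    y_mean = mean(series)
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in enumerate(series))
             / sum((x - x_mean) ** 2 for x in range(n)))
    intercept = y_mean - slope * x_mean
    return intercept + slope * (n - 1 + steps_ahead)
```

A typical flow under these assumptions would aggregate the samples per application, flag and exclude anomalous points against the seasonal baseline, reduce the cleaned series to daily peaks, and fit the trend to those peaks. A production pipeline would also need change point handling, so that a sustained level shift resets the baseline instead of being flagged indefinitely.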
Presented by
Igor Trubin, Lead Data Engineer at Capital One Bank
Igor Trubin started his career in 1979 as an IBM/370 system engineer. In 1986 he received his PhD in Robotics from St. Petersburg Technical University (Russia) and then worked as a professor teaching CAD/CAM and Robotics for about 12 years. He published 30 papers and gave several conference presentations in the fields of Robotics and Artificial Intelligence. In 1999 he moved to the US and worked at Capital One bank as a Capacity Planner. His first www.CMG.org paper was written and presented in 2001. The next one, “Exception Detection System Based on MASF Technique”, won a Best Paper award at CMG 2002 and was presented at UKCMG 2003 in Oxford, England. He has given other technical presentations at IBM z/Series Expo, Southern CMG, Central Europe CMG, and ICPE/WOSP-C 2020 in Canada, and has run several workshops covering his original method of Anomaly and Change Point Detection (www.Perfomalist.com). He is the author of the online class “Performance Anomaly Detection”. After more than 2 years as the Capacity team lead for IBM, he worked for SunTrust Bank for 3 years and then at IBM for 2+ years as Sr. IT Architect. He now works for Capital One bank as IT Manager in the Cloud Engineering department, and he has been a member of the www.CMG.org Board of Directors since 2015. He runs his tech blog at www.Trub.in and a YouTube channel at https://www.youtube.com/iTrubin
Interview:
IMPACT 2023 Proceeding Session Video: