How to Apply Modeling and Optimization to Select the Appropriate Cloud Platform


From the Southwest CMG virtual event, April 20, 2020

Speaker: Dr. Boris Zibitsker, CEO of BEZNext

Abstract: Organizations want to take advantage of the flexibility and scalability of Cloud platforms. By migrating to the Cloud, they hope to develop and implement new applications faster and at lower cost. Amazon AWS, Microsoft Azure, Google, IBM, Oracle and other Cloud providers support different DBMSs, such as Snowflake, Redshift, Teradata Vantage, and others. These platforms differ in architecture, in their mechanisms for allocating and managing resources, and in the sophistication of their DBMS optimizers, all of which affect performance, scalability and cost. As a result, the response time, CPU service time and number of I/Os for the same query accessing a similar table can be significantly different in the Cloud than On Prem.

In order to select the appropriate Cloud platform, we use modeling and optimization.

  • First, we perform a Workload Characterization for the On Prem Data Warehouse. Each Data Warehouse workload represents a specific line of business and includes the activity of many users concurrently generating simple and complex queries that access data from different tables. Each workload has a different demand for resources and different Response Time and Throughput Service Level Goals (SLGs).
  • Second, we collect measurement data for standard TPC-DS benchmark tests performed on the Vantage, Redshift and Snowflake Cloud platforms in AWS, for different data set sizes and different numbers of concurrent users.
  • Third, we use the results of the workload characterization and the measurement data collected during the benchmark to modify the BEZNext On Prem Closed Queueing model so that it models the individual Clouds.
  • Finally, in the fourth step, we use the model to take into consideration differences in concurrency, priorities and resource allocation among the workloads. BEZNext Capacity Planning optimization algorithms incorporate a gradient search mechanism to find the AWS instance type and the minimum number of instances required to meet the SLGs for each of the workloads. Publicly available information about the cost of the different AWS instances is used to predict the cost of supporting the workloads in the Cloud month by month over the next 12 months.
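
The four steps above can be illustrated with a small sketch. This is not the BEZNext model or its search algorithm; it is a minimal, hypothetical example that assumes a single-class closed queueing network solved with exact Mean Value Analysis (MVA), service demands that scale down linearly with the number of instances, a simple linear scan in place of the optimization search, and roughly 730 billable hours per month. All names, parameters and prices are invented for illustration.

```python
from dataclasses import dataclass, replace

HOURS_PER_MONTH = 730  # assumption: average hours in a month (8760 / 12)


@dataclass
class Workload:
    name: str
    users: int             # concurrent users (population of the closed model)
    think_time_s: float    # average think time between queries
    cpu_demand_s: float    # CPU service demand per query on one instance
    io_demand_s: float     # I/O service demand per query on one instance
    slg_response_s: float  # response-time service level goal


def mva_response_time(demands, users, think_time):
    """Exact single-class MVA: average response time with `users` customers."""
    queue = [0.0] * len(demands)  # mean queue length at each station
    response = 0.0
    for n in range(1, users + 1):
        # An arriving query sees the queue left by a network with n - 1 users.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        response = sum(residence)
        throughput = n / (think_time + response)
        queue = [throughput * r for r in residence]
    return response


def min_instances(w, max_instances=64):
    """Smallest instance count that meets the SLG, or None if none does.

    Assumption: demand per query divides evenly across k instances.
    """
    for k in range(1, max_instances + 1):
        demands = [w.cpu_demand_s / k, w.io_demand_s / k]
        if mva_response_time(demands, w.users, w.think_time_s) <= w.slg_response_s:
            return k
    return None


def monthly_costs(w, hourly_price, monthly_growth, months=12):
    """Projected cost per month as the user population grows."""
    costs = []
    for m in range(months):
        grown = replace(w, users=round(w.users * (1 + monthly_growth) ** m))
        k = min_instances(grown)
        costs.append(None if k is None else k * hourly_price * HOURS_PER_MONTH)
    return costs


if __name__ == "__main__":
    sales = Workload("sales", users=50, think_time_s=10.0,
                     cpu_demand_s=0.5, io_demand_s=0.3, slg_response_s=2.0)
    print("instances needed now:", min_instances(sales))
    for month, cost in enumerate(monthly_costs(sales, 1.0, 0.05), start=1):
        print(f"month {month:2d}: {cost}")
```

In practice the service demands would come from the workload characterization and the TPC-DS measurements for each platform, and the search would also range over instance types and workload priorities rather than a single instance count.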

About the Speaker: Dr. Boris Zibitsker is the CEO of BEZNext. His focus is on the development of performance assurance, performance engineering, dynamic performance management and long-term capacity planning software tools for big data, data warehouse and cloud applications. He is a member of the SPEC Big Data Research Group. Boris consults with many Fortune 500 companies, and he manages Capstone projects for graduate students in the MS in Analytics program at the University of Chicago. Boris is an Honorary Doctor of BGUIR, and for the last 5 years he has been a co-chairman of the Big Data Advanced Analytics Conference.

