Resolution of Hidden Issues of ML Algorithms Using Geometrical Projection of the Data Transformations



The session uses geometrical projections of the data-space transformations that occur during the execution of the families of algorithms represented by Isolation Forest, Random Forest, and Neural Networks to expose hidden weaknesses of the conventional approach, which uses hyperplanes as decision boundaries.

In this session, the family of algorithms represented by Isolation Forest is presented as a case study to highlight the hidden weaknesses that manifest as false identification of outliers in numerical data, such as performance metrics.
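As background for the case study, the conventional approach can be sketched in a few lines. The block below is a minimal, standard-library-only illustration of a textbook isolation forest in which every split is an axis-parallel hyperplane; it is not the presenters' code, and it does not implement their proposed modification. The function names, the Gaussian test data, and the three query points are assumptions made for illustration. Two of the query points lie at the same Euclidean distance from the cluster center, one on a coordinate axis and one on the diagonal, so the reader can check whether the orientation of the splits affects the score.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over
    n points; used to normalize tree depths in the anomaly score."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Split on one random feature at a random threshold.
    Each split is a hyperplane perpendicular to a coordinate axis."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(x, node):
    """Depth at which x lands in a leaf, plus a correction for unsplit points."""
    depth = 0
    while "dim" in node:
        node = node["left"] if x[node["dim"]] < node["split"] else node["right"]
        depth += 1
    return depth + c_factor(node["size"])


def fit_forest(data, n_trees=100, sample_size=256, seed=0):
    """Build n_trees isolation trees, each on a random subsample of the data."""
    rng = random.Random(seed)
    psi = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(psi))
    forest = [build_tree(rng.sample(data, psi), 0, max_depth, rng)
              for _ in range(n_trees)]
    return forest, psi


def anomaly_score(x, forest, psi):
    """Score in (0, 1); shorter average paths give scores closer to 1."""
    mean_depth = sum(path_length(x, tree) for tree in forest) / len(forest)
    return 2.0 ** (-mean_depth / c_factor(psi))


# Hypothetical synthetic data: one isotropic Gaussian cluster at the origin.
data_rng = random.Random(42)
data = [(data_rng.gauss(0.0, 1.0), data_rng.gauss(0.0, 1.0)) for _ in range(500)]
forest, psi = fit_forest(data)

r = 6.0  # both off-center query points sit at distance r from the origin
queries = {
    "center": (0.0, 0.0),
    "on_axis": (r, 0.0),
    "diagonal": (r / math.sqrt(2.0), r / math.sqrt(2.0)),
}
scores = {name: anomaly_score(x, forest, psi) for name, x in queries.items()}
for name, score in scores.items():
    print(f"{name:9s} score = {score:.3f}")
```

Because the two off-center points are equally far from the cluster, a rotation-invariant detector would score them alike. With axis-parallel splits the on-axis point tends to receive the lower score, since splits on the second feature rarely separate it from the bulk of the data. This is one commonly reported artifact of hyperplane splits, offered here only as an example of the kind of weakness the session discusses.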

Synthetic data is used to amplify the hidden weaknesses of the conventional use of hyperplanes as decision boundaries. The improvements from the proposed modification, which uses geometrically appropriate decision boundaries, are confirmed by comparing results from applying the algorithms to an online credit card fraud dataset.

Using only 7 features of the credit card fraud dataset, those whose properties are suitable for the metric, the modified algorithm outperforms the conventional hyperplane-based algorithms given all 28 features as input, demonstrating the strength of the proposed modifications.


Presented by

Jayanta Choudhury, Principal Data Scientist, Ericsson

Jayanta Choudhury published several research articles on the Universal Scalability Law, with theoretical explanations of issues with the model and with its parameter estimation methods, at the CMG International Conferences from 2012 to 2014. In 2021, he presented results from the successful application of a hybrid method, consisting of a Deep Neural Network and a Self-Organizing Map, to detect anomalies in Packet Core Network performance data. He is a Principal Data Scientist at Ericsson Global AI Accelerator (GAIA), Santa Clara, California, USA. He has published several peer-reviewed research articles in journals and conference proceedings. He has more than 10 years of experience in applying AI/ML techniques to solve problems relevant to industry applications in various data science roles. He holds a PhD in Applied Mathematics and an MS in Computer Engineering.

Chenhua Shi, Data Scientist, Ericsson

Chenhua Shi is a contributor to the development of the hybrid method, consisting of a Deep Neural Network and Self-Organizing Maps, to detect anomalies in Packet Core Network performance data. He is a Data Scientist at Ericsson Global AI Accelerator (GAIA), Santa Clara, California, USA. He has worked on several critical projects as a member of various GAIA teams. He received the Best Employee award at Ericsson in 2020.


IMPACT 2023 Proceeding Session Video:
To view the proceeding session video you must have a CMG Membership.