Hi! I am Rohan Gonjari.
A Data Scientist, Data Analyst & Machine Learning Engineer.
With a Master's in Data Science & a proven 4-year track record in Data Analytics roles, I have worked on optimizing pricing models and customer retention in the finance and insurance sector as a Data Scientist. As an ML Researcher, I specialized in imbalanced classification of multimodal data & hyperparameter tuning of deep neural network models. As a Data Analyst, I leveraged regression models & Tableau dashboards to identify targets to increase sales. I have executed projects encompassing end-to-end ML pipeline implementation, database schema design, statistical testing & interactive visualization. Through my research, I have contributed to publications on multimodal classification using GNNs.
Experience
Data Scientist
Legal & General America
Contract | Remote
Oct 2023 - Present
• Conducted statistical A/B testing (t-tests, ANOVA) to evaluate the effectiveness of dynamic pricing strategies.
• Applied Bayesian statistics, causal inference & XGBoost models to dynamic pricing, generating $5 million in additional ARR.
• Improved ad-hoc SQL queries & reporting to support pricing adjustments & strategic decisions.
• Optimized ETL pipelines with GCP tools (Cloud Storage, Dataflow) & Apache Spark for scalable healthcare & financial data management.
• Leveraged Snowflake & MS-SQL for data querying & management, improving data quality & analysis speed.
• Utilized Power BI to develop interactive dashboards & automated data processes to optimize report generation.
ML Researcher
University of Massachusetts Dartmouth - MIND Lab
Full-time | Dartmouth, MA
Aug 2022 - Sep 2023
• Utilized GNNs, neural networks, K-means clustering, Support Vector Machines (SVMs) & Decision Tree models to implement machine learning on graph data.
• Performed dimensionality reduction (PCA, t-SNE) to help visualize graph nodes and edges using Seaborn.
• Designed ML architectures to efficiently fuse information from multimodal data (EEG, fNIRS) to improve BCI systems.
• The proposed model improved classification by 16.25% & 21.65% in two distinct studies, indicating potential impact on patient care (Master’s Thesis).
Data Analyst
Destek Infosolutions
Full-time | India
Aug 2020 - Jul 2022
• Collaborated with 120+ clients to implement GA4 via GTM to meet project requirements with a 95% success rate.
• Implemented A/B testing to ensure accuracy & reliability of data collected in GA4 when updating event triggers.
• Led a data sourcing project to establish data pipelines & data warehouse, utilizing GCP services & SQLite.
• Applied regression models for targeted customer segmentation, resulting in a substantial 18% sales boost.
• Developed Tableau dashboards to improve visibility into companies’ sales portfolios & other KPIs.
Projects
Sentiment Analysis of 2022 FIFA World Cup
Extracted real-time tweets from Twitter's API, categorized FIFA World Cup tweets using VADER sentiment analysis, and deployed a scalable data pipeline using Apache Airflow & Amazon EC2 for processing, storing results in S3.
- Python
- Airflow
- EC2
- S3
Hospital Management System
Established a MySQL data architecture for a Hospital Management System (HMS), performed ETL using Selenium for NHS surveys, and transformed prescription data with NumPy and Pandas for loading into the HMS database.
- Python
- Selenium
- MySQL
Evaluating Medical Condition
Assessed patient health from predicted health scores using EDA and modeling. Predicted the scores using a regression model with cross-validation & recursive feature elimination over significant & engineered features.
- R
- XGBoost
Visualizing Olympics Performance
Leveraged D3.js, HTML, and CSS to create a visualization featuring interactive geospatial and scatter plots of Olympics athlete data, facilitating insights into medal-winning factors and country-level correlations.
- D3.js
- HTML
- CSS
- JavaScript
Skills
Technologies
- Python
- MATLAB
- R
- SQL
- MySQL
- SAS
- Tableau
- Git
- Power BI
- CUDA
- Docker
- CI/CD
- HTML
- CSS
- PowerShell
- JavaScript
- TypeScript
- Google Analytics
- Google Tag Manager
- Linux
Libraries
- PyTorch
- TensorFlow
- Pandas
- NumPy
- Spark
- XGBoost
- NLTK
- OpenCV
- ggplot
- Selenium
Cloud
- AWS
- SageMaker
- S3
- EC2
- Airflow
Expertise
- Statistical Modeling
- Market Mix Modeling
- Predictive Analytics
- ETL Tools
- Deep Learning
- Data Wrangling
- Data Analysis
- Demand Forecasting
Education
Master of Science
Data Science - 2023
• Coursework: High-Performance Parallel Computing, Advanced Data Mining, Deep Learning, Data Visualization, Data Architecture & Design, Business Analytics, Graph Neural Networks
Bachelor of Technology
Electronics & Communication Engineering - 2016
• Coursework: Numerical Analysis, Discrete Mathematics, Data Structures & Algorithms, Statistical Analysis