About Me
I'm a Data Scientist and Machine Learning Engineer pursuing my Master's in Data Science at the University of Texas at Arlington.
With expertise in AI/ML, Deep Learning, and Cloud Technologies, I specialize in building scalable solutions that transform raw data into actionable insights.
My work spans healthcare analytics, NLP, computer vision, and MLOps, with a passion for deploying production-ready machine learning systems that make a real-world impact.
Featured Projects
Sepsis Prediction System
ML model that uses XGBoost and ensemble methods to predict sepsis onset early, trained on a comprehensive synthetic dataset. Achieves high accuracy in early detection, which is critical for ICU patient care and treatment planning.
Edukrishnaa - Career Guidance System
AI-powered career guidance system for students (10th grade, 12th grade, and undergraduate) that uses psychometric and aptitude tests with multiclass classification to recommend career paths, job roles, roadmaps, and higher study options. The accompanying research paper was published by Springer Nature.
CI/CD Pipeline Dashboard
Real-time analytics dashboard for monitoring CI/CD pipelines with Docker containerization, AWS cloud infrastructure integration, automated testing workflows, deployment metrics, and performance analytics for DevOps teams.
KSAT Quest - Regression Runoff
Predicts the saturated hydraulic conductivity (Ksat) of soil from UKSAT data through preprocessing, feature selection, and regression modeling, evaluated with RMSLE and R², to support sustainable hydrological modeling.
NYC Taxi Trip Duration Prediction
Predictive modeling project that forecasts taxi trip durations from historical NYC taxi data. Includes data preprocessing, exploratory data analysis (EDA), feature engineering, and an XGBoost-based predictive model to explore urban taxi trip dynamics.
Disaster Impact Prediction Model
Comprehensive ML model predicting reconstruction costs, injury risks, regional impact, first responder requirements, resource availability, evacuation plans, and shelter needs during disasters.
E-Voting dApp - Blockchain Voting
Completely decentralized e-voting system built on Ethereum blockchain, ensuring transparency, security, and immutability of voting records using smart contracts.
Diabetic Retinopathy Diagnosis
ML-based medical diagnosis system for detecting diabetic retinopathy from retinal images, aiding early detection and treatment of this vision-threatening condition.
Breast Cancer Cell Classification
Machine learning model for classifying breast cancer cells, helping in early diagnosis and treatment planning through automated cell analysis.
An Eye for Sightless
Computer vision system for visually impaired individuals that uses a Kinect sensor to identify known persons, provide audio guidance, and assist with corridor navigation in familiar and unfamiliar spaces.
Research & Publications
Early Sepsis Prediction Using Machine Learning in ICU Patients
2024
Yash Joshi, et al.
IEEE Conference on Healthcare Informatics 2024
Novel approach combining XGBoost and LSTM for early sepsis detection with 98.7% accuracy...
NLP-Based Career Recommendation System Using BERT Embeddings
2023
Yash Joshi, et al.
ACM Conference on Recommender Systems 2023
Deep learning approach for personalized career guidance achieving 89% match accuracy...
My Journey
Master of Science in Data Science
The University of Texas at Arlington
CGPA: 4.0/4.0. Coursework: Big Data Management, Data Science, Data Visualization, Machine Learning, R Programming, Statistics.
Hackathon UTA - 1st Place
Team Lead
Led the team to first place in the University of Texas at Arlington hackathon.
Data Engineer
Larsen and Toubro Private LTD.
Configured data ingestion pipelines connecting SAP ERP, SQL Server, and flat-file feeds. Engineered ETL workflows in Python (Pandas, SQLAlchemy) and Airflow, reducing manual reporting time by 35%. Optimized pipeline performance, cutting data refresh latency from 2 hours to 25 minutes.
Bachelor's in Computer Science
Mumbai University
GPA: 8.5/9.0. Comprehensive foundation in computer science principles including algorithms, data structures, software engineering, database systems, operating systems, and computer networks. Developed strong problem-solving skills and technical expertise.
Smart India Hackathon - Finalist (2nd Place)
Team Leader
Led the team to a 2nd-place finish as a finalist in the national-level Smart India Hackathon.
AI & Machine Learning Engineer
Universal Recycle Solutions
Designed and deployed Power BI dashboards connected to SQL Server. Developed an ML pipeline (Random Forest, Gradient Boosting) to forecast waste collection volume 7 to 14 days ahead. Cut overtime costs by an estimated 12% per quarter through predictive analytics.
Hack Overflow - 1st Place
Team Lead
Led the team to first place in the Hack Overflow hackathon.
State Level Paper Presentation - 1st Place
Presenter
Won first place in a state-level paper presentation competition.
Technical Diploma in Computer Science
Diploma Program
Percentile: 92.00%. Intensive technical training in computer science fundamentals including programming languages, software development, web technologies, and system administration. Built strong technical foundation through hands-on projects and practical applications.