Hello, I'm

Rahul Shelke

Full Stack Data Scientist

I'm passionate about turning raw data into actionable insights and building scalable AI solutions that solve real-world problems.

LinkedIn | GitHub

Get To Know More

About Me

Experience

1.7 years
Data Scientist

Education

B.E. (CS)
University of Mumbai


I’m Rahul Shelke, a skilled Data Scientist with hands-on experience in Python, NLP, and machine learning technologies. I specialize in developing advanced models and enhancing product functionality through data-driven solutions.


With a strong academic foundation in Computer Science and Engineering from the University of Mumbai, and practical experience across diverse projects, I bring robust analytical and problem-solving skills to the table. I’m passionate about transforming data into actionable insights and am eager to contribute to cutting-edge projects within a dynamic, collaborative team environment.


Tools & Technologies

Data Collection

Selenium
Beautiful Soup (BS4)
Kafka

Data Engineering

Airflow
PySpark

Data Storage

S3
MongoDB
PostgreSQL

EDA & Stats

NumPy
Pandas
SciPy
Statsmodels

Visualization & Reporting

Matplotlib
Plotly
Tableau

Natural Language Processing

NLTK
spaCy
HuggingFace

Computer Vision

OpenCV
YOLO

Model Development

Python
scikit-learn
TensorFlow
PyTorch
Optuna

MLOps

MLflow
DVC
GitHub Actions (CI/CD)
Docker
Kubernetes

Deployment

FastAPI
Flask
Streamlit

Monitoring

Prometheus
Grafana
Loki
Evidently AI

Generative AI (Models)

Gemini
OpenAI
Mistral
Llama

Vector Databases

FAISS
Pinecone
ChromaDB

LLM Integration Tools

LangChain
Chainlit

Cloud & Infrastructure / Middleware

AWS
RabbitMQ
Terraform

Other Tools

Git
Jupyter
VS Code
pytest
MkDocs
Draw.io
Shell

Browse My Recent

Projects

Heart Stroke Prediction

Project Overview

  • End-to-end ML project predicting heart-stroke risk
  • Automated pipeline: data → model → deployment → monitoring
  • Production-grade with containerized deployment and CI/CD

Highlights

  • Accuracy: 92%, F1-Score: 0.89
  • Automated CI/CD & containerized deployment
  • End-to-end ML pipeline demonstration on portfolio

Key Contributions

  • Data acquisition & cleaning (MongoDB, PySpark, Pandas)
  • EDA & feature engineering (Matplotlib, Seaborn, SciPy)
  • Model building & tuning (scikit-learn, FastAPI)
  • Deployment & monitoring (Docker, Terraform, MLflow, GitHub Actions)

Challenges / Design Patterns

  • Handled imbalanced dataset using SMOTE.
  • Implemented modular code with MVC design pattern.
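
The SMOTE step mentioned above can be illustrated with a minimal sketch. This is not the project's code: it is a simplified, standard-library-only version of the idea (creating synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours). The function name `smote_like_oversample`, the toy data points, and the parameters `k` and `seed` are all assumptions made for illustration. In practice, a library implementation such as the `SMOTE` class in imbalanced-learn would normally be used instead.

```python
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (simplified SMOTE)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: 4 minority samples against 10 majority samples.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
majority_count = 10

new_points = smote_like_oversample(minority, majority_count - len(minority))
balanced_minority = minority + new_points
```

Because every synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the region the minority data already occupies. Oversampling of this kind is usually applied to the training split only, so that evaluation data remains untouched.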

Data Collection / Storage

MongoDB, AWS

Data Cleaning / Preprocessing

Pandas, NumPy, SciPy

EDA & Visualization

Matplotlib, Seaborn

Feature Engineering

Pandas, NumPy

Modeling & Training

Python, scikit-learn, PySpark

Model Tuning & Evaluation

MLflow, Optuna

Deployment / CI-CD / Automation

FastAPI, Docker, Terraform, GitHub Actions, pytest, YAML, Shell Script

Monitoring & Logging

Evidently AI, Prometheus, Grafana, Loki, Mimir, Alloy

Design & Architecture

Project Deployment Demo

Impact & Performance

  • Accuracy: 92%
  • F1-Score: 0.89
  • Training Time: 12 min
Model Performance

Developed an end-to-end ML pipeline with automated CI/CD, containerized deployment, and monitoring scripts. Achieved 92% accuracy, outperforming baseline models by 15%.
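
For readers unfamiliar with the two metrics quoted above, here is a small sketch of how accuracy and F1 are computed from confusion-matrix counts. The function name `accuracy_and_f1` and the counts in the example are hypothetical and are not taken from this project. The second example hints at why F1 is worth reporting alongside accuracy when classes are imbalanced.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Return (accuracy, f1) for a binary classifier from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1


# Hypothetical balanced case: accuracy and F1 tell the same story.
balanced = accuracy_and_f1(tp=40, fp=10, fn=10, tn=40)

# Hypothetical imbalanced case: a model that always predicts the negative
# class scores high accuracy but finds no positives, so its F1 is zero.
always_negative = accuracy_and_f1(tp=0, fp=0, fn=5, tn=95)
```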

Sensor Fault Prediction

Shipment Price Prediction

Customer Segmentation

NYC Taxi Fare Prediction

Churn Prediction

News ETL Pipeline

Weather ETL Pipeline

Market Data Pipeline

Train Scraper + ETL

E-commerce ETL Pipeline

YT Data Harvester

InstaETL

Synthetic IoT Generator

PillLens

SQL Data Warehouse

    📂 Domain : Retail / E-commerce

    🎯 Learning : Data Warehousing, SQL Queries, ETL, Analytics

    🔢 Type : OLAP (Analytical Project)

🛠 Technologies Used :

SQL, Notion, draw.io, Git, SQL Server

Exploratory Data Analysis

    📂 Domain : [None]

    🎯 Learning : [None]

    🔢 Type : [None]

🛠 Technologies Used :

SQL, Notion, draw.io, Git, SQL Server

Advanced Analytics

    📂 Domain : [None]

    🎯 Learning : [None]

    🔢 Type : [None]

🛠 Technologies Used :

SQL, Notion, draw.io, Git, SQL Server

Sales Insights

    📂 Domain : Retail / Business Intelligence

    🎯 Learning : SQL, Power BI, Data Modeling, Visualization

    🔢 Type : Dashboard / Exploratory Data Analysis

🛠 Technologies Used :

Power BI, SQL, SQL Server

ANN From Scratch

    📂 Domain : Health Care

    🎯 Learning : Supervised

    🔢 Type : Binary Classification

🛠 Technologies Used :

Python, scikit-learn, Flask, Pandas, NumPy, Matplotlib, Seaborn, SciPy, MongoDB, AWS EC2, AWS ECR, AWS S3

RNN From Scratch

    📂 Domain : Health Care

    🎯 Learning : Supervised

    🔢 Type : Binary Classification

🛠 Technologies Used :

Python

CNN From Scratch

    📂 Domain : Health Care

    🎯 Learning : Supervised

    🔢 Type : Binary Classification

🛠 Technologies Used :

Python

Foundational Topics

    📂 Domain : Health Care

    🎯 Learning : Supervised

    🔢 Type : Binary Classification

🛠 Technologies Used :

Python

Intermediate Topics

    📂 Domain : Health Care

    🎯 Learning : Supervised

    🔢 Type : Binary Classification

🛠 Technologies Used :

Python

Advanced Topics

    📂 Domain : Health Care

    🎯 Learning : Supervised

    🔢 Type : Binary Classification

🛠 Technologies Used :

Python

Classification Using RNN

    📂 Domain : Human Resources

    🎯 Learning : Supervised

    🔢 Type : Binary Classification

🛠 Technologies Used :

Python, Transformer, Machine Learning, AWS EC2, AWS ECR, AWS S3

Classification Using CNN

    📂 Domain : Health Care

    🎯 Learning : Supervised

    🔢 Type : Binary Classification

🛠 Technologies Used :

Python

Transformers

    📂 Domain : Health Care

    🎯 Learning : Supervised

    🔢 Type : Binary Classification

🛠 Technologies Used :

Python

Complaint Classification

  • Title : Financial Dispute

  • Learning : Text Classification

  • Highlights :

  • Technologies :

    Python, PySpark, FastAPI, SciPy,
    Pandas, NumPy, Matplotlib,
    Seaborn, Docker, MongoDB, AWS,
    GitHub Actions, Airflow

Text Classification

  • Title : BBC News Classification

  • Learning : Multi-class Classification

  • Highlights :

  • Technologies :

    Python, TensorFlow, Pandas,
    NumPy, GloVe / Word2Vec,
    RNNs, Docker, MongoDB, AWS,
    GitHub Actions, Streamlit

Sentiment Analysis

  • Title : Twitter Sentiment Analysis

  • Learning : Transfer Learning

  • Highlights :

    • X% improvement over baseline
    • reduced model size (~X%)
    • faster inference (~Y% speedup)
  • Technologies :

    Python, BERT, Transformers (🤗),
    TensorFlow, Docker, Streamlit,
    MongoDB, GitHub Actions, AWS

Named Entity Recognition

  • Title : Named Entity Recognition

  • Learning : Sequence Labeling

  • Highlights :

  • Technologies :

    [None]

Machine Translation

  • Title : English to Hinglish

  • Learning : Seq-2-Seq Learning

  • Highlights :

  • Technologies :

    [None]

Text Generation

  • Title : Code Auto-Completion

  • Learning : Sequence Generation

  • Highlights :

  • Technologies :

    [None]

Q&A System

  • Title : QnA ChatBot

  • Learning : Conversational AI

  • Highlights :

  • Technologies :

    [None]

Text Summarization

  • Title : Text Summarizer

  • Learning : Sequence Summarization

  • Highlights :

  • Technologies :

    [None]

Speech to Text

  • Title : Hinglish to English

  • Learning : Sequence Transduction

  • Highlights :

  • Technologies :

    [None]

My Recent

Certificates

Full Stack Data Science Masters

iNeuron (by Krish Naik) — 2024

View Certificate

Get in Touch

Contact Me

Thank you for visiting my portfolio! I'm always excited to connect with fellow professionals, collaborators, and technology enthusiasts. You can reach me through the following platforms: