Projects

Current Project: MCP Server-based LLM AI Agent for Diabetes Monitoring (2025-present)

Built a Diabetes Monitoring AI agent that connects real patient-generated data from Apple Health to an Amazon RDS MySQL backend and exposes it to AI clients through the Model Context Protocol (MCP). The server provides five callable MCP tools that let AI clients such as Claude Desktop retrieve glucose, sleep, and exercise records, detect temporal and behavioral patterns, and compute correlations between lifestyle factors and glucose outcomes using techniques such as Pearson’s correlation. It uses Python, SQLAlchemy ORM, PyMySQL, and environment-based configuration to securely query timestamped CGM readings, sleep-stage breakdowns, and workout durations. The system covers the full path from Apple Health data ingestion through typed MySQL storage to AI-driven insight generation, without dashboards or manual preprocessing, which speeds up experimentation in HealthTech workflows.
[GitHub]
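
The correlation step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name, the pairing of nightly sleep hours with next-day mean glucose, and the sample values are all assumptions made for the example.

```python
import math

def pearson(xs, ys):
    """Return Pearson's r for two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and of squared deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Made-up values for illustration only (hypothetical, not patient data)
sleep_hours = [6.0, 7.5, 5.0, 8.0, 6.5]
mean_glucose = [150.0, 128.0, 162.0, 120.0, 141.0]
r = pearson(sleep_hours, mean_glucose)
```

With data shaped like this, a negative r would suggest that longer sleep tends to go with lower glucose, though a correlation alone says nothing about cause.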

Producer-Consumer (Dec 2025)
Built a production-style producer-consumer system using Apache Kafka and Go, with Kafka running in Docker in KRaft (Kafka Raft) mode for metadata management. The system uses partitioned Kafka topics, consumer groups, and hash-based message partitioning to enable parallel, load-balanced processing. Because metadata in KRaft mode becomes visible to clients only eventually, the design includes broker readiness checks, controller-aware topic initialization, and retry and backoff logic for transient coordinator failures. Compared to an in-memory queue, this architecture provides durable, disk-backed storage, fault tolerance across restarts, and the ability for consumers to resume processing from committed offsets or replay data when needed.
[GitHub] Screenshot 1

Screenshot 2
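
Two of the ideas in the description above, hash-based partitioning and retry with backoff, can be sketched in a few lines. The project itself is written in Go; the Python below is only an illustration. The text does not say which hash function the project's client uses, so FNV-1a is an assumption here, and all function names and parameters are hypothetical.

```python
def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a hash (assumed for illustration; the real client may differ)."""
    h = 0x811C9DC5  # FNV offset basis
    for b in data:
        h ^= b
        h = (h * 0x01000193) & 0xFFFFFFFF  # multiply by FNV prime, keep 32 bits
    return h

def partition_for(key: str, num_partitions: int) -> int:
    """Map a message key to a partition; equal keys always land together."""
    return fnv1a_32(key.encode("utf-8")) % num_partitions

def backoff_delays_ms(base_ms: int, attempts: int, cap_ms: int) -> list[int]:
    """Exponential backoff schedule: base, 2*base, 4*base, ... capped at cap_ms."""
    return [min(cap_ms, base_ms * 2 ** i) for i in range(attempts)]

# Hypothetical usage: route a key, then plan retries for a transient failure
partition = partition_for("patient-42", 4)
delays = backoff_delays_ms(base_ms=100, attempts=5, cap_ms=1000)
```

Keying by a stable identifier keeps all messages for that key in one partition, which preserves their relative order for whichever consumer in the group owns that partition.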

Health Data Export System (Aug 2025 - Present)
Health Data Export System is a scalable data ingestion and analytics pipeline designed to process real-time health data exported from the iPhone Auto Export (Apple Health) app. The system supports two ingestion modes: a Flask-based REST API server for direct JSON uploads and a FastAPI gateway that forwards requests to a backend gRPC service for high-throughput, low-latency processing. Incoming health metrics are validated, transformed, and written to an AWS RDS MySQL database, with Pandas used for time-series processing of glucose and insulin events.
[GitHub]

AI-powered HackerRank Candidate Screening Agent (Nov 2025 - Dec 2025)
Built an AI-powered candidate screening agent for HackerRank, an end-to-end automation system that streamlines technical hiring by orchestrating production workflows through Anthropic’s Model Context Protocol (MCP). The agent fetches and paginates candidate results via HackerRank’s REST APIs, applies configurable pass thresholds and ranking logic, automatically advances top performers to harder assessments, sends personalized email notifications, and schedules recruiter interviews using Google Calendar. Core screening actions are exposed as MCP tools, enabling assistants like Claude Desktop to safely trigger backend workflows.
[GitHub]
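
The threshold-and-ranking step described above might look something like the sketch below. The record fields, function name, and tie-breaking rule are assumptions for illustration; they are not taken from the project or from HackerRank's API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CandidateResult:
    # Hypothetical record shape; the real API response will differ
    email: str
    score: float

def shortlist(results, pass_threshold, top_n):
    """Keep candidates at or above the threshold, rank by score, return the top N."""
    passed = [r for r in results if r.score >= pass_threshold]
    # Highest score first; ties broken alphabetically by email for determinism
    ranked = sorted(passed, key=lambda r: (-r.score, r.email))
    return ranked[:top_n]

# Made-up sample data
results = [
    CandidateResult("a@example.com", 72.0),
    CandidateResult("b@example.com", 91.5),
    CandidateResult("c@example.com", 58.0),
    CandidateResult("d@example.com", 91.5),
]
advanced = shortlist(results, pass_threshold=70.0, top_n=2)
```

Keeping the threshold and the cutoff as parameters is what makes the screening configurable per assessment.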

Interactive Data Visualization Platform for U.S. Suicide Statistics
An interactive web-based data visualization platform was developed to explore U.S. suicide statistics from 2014–2023 at the state level, with an emphasis on clarity, comparability, and responsible communication of sensitive public-health data. The system integrates multiple coordinated views, including an interactive choropleth map, time-series comparisons, animated bar and line chart races, scroll-driven map narratives, scatter plots, heatmaps, and percentage-change analysis to reveal trends and disparities over time. All visualizations default to age-adjusted rates to enable fair comparisons across states with differing population structures. Built using D3.js, TopoJSON, and vanilla JavaScript, the platform is fully static, easily deployable on services such as GitHub Pages, and designed to make complex public-health data more accessible and interpretable.
[GitHub] [Live Demo]
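
The percentage-change view mentioned above rests on simple arithmetic, sketched here. The platform itself is written in JavaScript; this Python snippet only illustrates the calculation, and the function name and sample rates are hypothetical.

```python
def percent_change(old_rate: float, new_rate: float) -> float:
    """Percentage change from old_rate to new_rate."""
    if old_rate == 0:
        raise ValueError("percentage change is undefined when the baseline is 0")
    return (new_rate - old_rate) / old_rate * 100.0

# Made-up age-adjusted rates per 100,000 for one state in two years
change = percent_change(10.0, 12.5)
```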

Platform Migration (2024-2025)
This project examines the impact of platform migration on a specific community during its transition from Twitter to alternative platforms such as Parler and Dotwin, following initial widespread account bans on Twitter. We examine how community dynamics, including user roles and activities, evolved during this migration by analyzing the community’s activity on Twitter, Parler, and Dotwin in the months leading up to their ban. We assess changes in user engagement and influence, categorizing users into five distinct roles: ‘common users,’ ‘broadcasters,’ ‘influentials,’ ‘hidden influentials,’ and ‘lurkers.’ We conduct temporal analyses of weekly fluctuations in user activity, which reveal significant trends and patterns in engagement across mainstream and alternative platforms. We also use lexical analysis to examine the nature of conversations, categorizing content into themes such as violence and conspiracy to uncover underlying narratives within the community.
[GitHub]

The findings of this project were selected for a poster presentation at ACM GROUP 2025 [PDF], and a full paper was submitted to ACM CSCW 2025 [PDF], which is under review.

Machine Learning project (2024)
Ranked 22nd out of 142 students in a Kaggle competition centered on the Old Bailey dataset, a binary classification problem. Implemented various machine learning models, including perceptrons, support vector machines (SVMs), logistic regression, and ensemble methods like AdaBoost. Focused on data preprocessing by integrating multiple feature sets (e.g., bag-of-words, TF-IDF, and GloVe embeddings), handling missing values, and applying label encoding to categorical features. Tuned hyperparameters extensively and improved performance by combining feature sets and using ensemble techniques. This project demonstrated my ability to apply machine learning concepts to a complex real-world dataset.
[GitHub]

Assessing the Influence of the COVID-19 Pandemic on Indian Pharmaceutical Companies (2022)
Our objective is to study, analyze, and draw inferences about how the stock prices of Indian pharmaceutical companies moved in response to the COVID-19 pandemic in India. We specifically targeted pharmaceutical stocks because their share prices depend more directly on the pandemic than those of companies in other sectors. Since demand for the primary products sold by pharmaceutical companies, i.e., medicines, depends directly on the pandemic, a common hypothesis is that pharmaceutical stock prices at a given time are significantly contingent upon the state of the pandemic. We tested this hypothesis by calculating the correlation between pharmaceutical stock prices and COVID-19 variables that measure the severity and overall course of the pandemic. Furthermore, as human emotion plays a significant part in determining share prices, we accounted for public fear and awareness by considering the frequency with which the terms “Covid 19” and “Covid medicines” are searched on Google. The data for our study covers the period from 15th March 2020 to 17th February 2022.
[GitHub] [PDF]
