Selected Projects

RAG for Accreditation (Semi‑Automated)

Mar 2025 – Present

Automates data extraction and verification to accelerate non‑credit program accreditation reviews. A Vue UI and TypeScript services orchestrate Retrieval‑Augmented Generation over Neo4j and Pinecone, cross‑checking user submissions against external sources.

TypeScript · Vue · AWS · Neo4j · Pinecone · LLMs

View code on GitHub →

End‑to‑End ML Pipeline for Classification

Dec 2024

Data prep → feature engineering → hyperparameter tuning → evaluation with Gradient Boosting, Random Forest, and Logistic Regression. Used cross‑validation to check generalization and feature importance to interpret the models.

Python · Pandas · scikit‑learn · XGBoost

Repository / notebooks →

Chicago Crime — Geospatial & Temporal Analytics

Nov 2024

Processed a decade of data (2.5M rows), engineered features, and delivered Tableau dashboards mapping spatial clusters and seasonal trends to inform policy and resource allocation.

Python · Pandas · Tableau · Geo

Dashboards / code →

Advanced Analytics & Predictive Modeling

Jun 2024

Built regression and classification models with feature engineering, scaling, and encoding; improved accuracy by ~15‑20% through iterative tuning and metrics‑driven validation.

Python · scikit‑learn · NumPy · Matplotlib

Notebooks →

Impact of Vaccination on COVID‑19 Mortality (R)

May 2024

Built a data analytics pipeline in R to transform messy datasets and model outcomes; increased processing efficiency and model accuracy using statistical tests and visualization.

R · tidyverse · ggplot2

R scripts →

Customer Data Warehouse — Market Basket & Time Series

Apr 2024

Managed 10k+ customer records and surfaced behavior insights via SQL analytics including basket analysis and basic time‑series techniques.

Python · SQL · Pandas

Code / ERD →

Monthly Billing Automation

Dec 2023

Automated a recurring billing process, cutting 90 minutes from the task, using REST API ingestion and Excel transformations to standardize outputs for finance.

Python · REST · Excel

Scripts →

IBM Message Queues — Metrics Plugin

Sep 2022

Python + JSON plugin to pull queue metrics hourly and publish to dashboards; reduced anomaly triage time during P1 events.

Python · JSON · Automation

Source →

Hybrid Infrastructure Transition — BI Dashboards

2019–2021

Analyzed migration candidates for a hybrid cloud model and delivered Power BI visualizations after cleansing large server datasets.

Power BI · Python · Data Cleaning

Artifacts →