As a data scientist and researcher, I bridge the gap between complex business problems and data solutions.
My data science career has spanned the healthcare, finance, and technology industries, with projects ranging from ad hoc statistical analysis to production-grade machine learning systems.
As a Ph.D. student specializing in deep learning-based computer vision and image compression, I aim to build ethical and energy-conscious machine learning solutions that help autonomous systems better perceive the world around them.
If you’d like to learn more about my approach to technical problems, project design, or team development, please feel free to reach out.
Education
-
University of Missouri - Kansas City
Research Area: Multi-sensor fusion for deep learning-based image compression
Courses:
Computer Vision
Deep Learning
Compression
Algorithms
NLP
-
Kansas State University
Thesis: The Application and Interpretation of the Two-Parameter Item Response Model in the Context of Replicated Preference Testing
Courses:
Statistical Theory
Statistical Computing
Experimental Design
Consulting
Linear Models
Survival Analysis
-
Kansas State University
Research Programs:
Summer Institute for Training in Biostatistics (University of Pittsburgh)
Mount Holyoke College REU
Courses:
Statistics
Calculus
C++ Programming
Linear Algebra
Experimental Design
Real Analysis
Topology
Number Theory
Engineering Physics
Work Experience
-
LADDER
Summary: Device error remediation through a scalable system with multiple standalone components - includes anomaly detection for 15+ metrics, risk modeling for 4 failure types, risk attribution, and proactive remediation
Role: Architect, Project Manager, & Lead Developer
Tech: PySpark, Pandas, scikit-survival, Airflow, EMR
Smart DEX
Summary: Reimagining of device experience score from weighted averages to supervised user sentiment prediction
Role: Architect & Lead Developer
Tech: PySpark, PyTorch, EMR
Personas
Summary: Software/hardware usage clustering for HP device recommendation
Role: Architect & Project Manager
Tech: PySpark, MLlib, Airflow, EMR
Fleet Simulator
Summary: Suite of telemetry and failure models to simulate/optimize performance based on device specifications
Role: Architect, Project Manager, & Lead Developer
Tech: PySpark, Pandas, statsmodels, scikit-learn, hyperopt, matplotlib, Airflow, EMR
xlr8
Summary: Utilities for data science acceleration, including feature engineering, data I/O, and logging
Role: Architect & Lead Developer
Tech: PySpark, Pandas, AWS SDK, Keras, pymsteams
-
Algorithm Optimization
Summary: Research and experimentation on the constrained 2-D knapsack problem for revenue optimization - discovered a small adjustment worth $350k/year and identified future research paths for transformational improvements
Role: Lead Developer
Tech: Python, Redshift
Customer Targeting
Summary: Refreshed targeting system with improved conversion rates and efficiency - this required a paradigm shift in customer prioritization from predicted conversion likelihood to predicted effects of engagement
Role: Architect & Lead Developer
Tech: Python, SageMaker AutoPilot, Pandas, Redshift, Salesforce
Revenue Forecasting
Summary: Ensemble forecasting model to improve on manual budgeting process
Role: Architect & Lead Developer
Tech: XGBoost, Prophet, sktime, Pandas, Redshift
-
Cognitive Processing Engine
Summary: Data extraction from unstructured PDFs, using CNN-based clustering and object detection - projected to reduce operational costs by more than $1M per year
Role: Architect & Lead Developer
Tech: Python, Azure Cognitive Services, Keras, scikit-learn, tkinter, SQL Server
Customer Targeting
Summary: Tableau dashboard and analysis to prioritize customers with greatest predicted engagement impact
Role: Analyst
Tech: Tableau, SQL Server, R
Sales Assistant
Summary: Automated prospecting, web scraping, and data aggregation for pre-sales intelligence
Role: Lead Developer
Tech: SQL Server, Python, Salesforce
-
Affordable Care Act Utilization
Summary: Intervention analysis to investigate the effects of the ACA (prepared for the Clinton 2016 campaign)
Role: Developer
Tech: R, SQL Server
Executive-Level Claims Dashboard
Summary: Comprehensive claims and utilization dashboard for large healthcare organizations - this was Epic's first Tableau product offering and won an internal award for achievement
Role: Data Specialist (Simulated dataset for product demo)
Tech: Tableau, Excel
Instructional Experience
-
Kansas State University - Salina
Advise on the structure and desired outcomes of the Machine Learning and Autonomous Systems program
-
Kansas State University - Salina
Instructed two short courses comprising the Applied Data Science & Machine Learning Professional Certificate, designed to teach industry professionals the fundamentals of DS/ML
Introduction to Data Science - Data management, analysis, and feature engineering using SQL
Introduction to Machine Learning - Supervised and unsupervised learning with Python
-
Kansas State University - College of Business
Led a series of workshops for students in the M.S. Data Analytics program
Machine Learning - Classification (2022)
Data Wrangling (2021)
Careers in Data Science (2020-2021)
Machine Learning - Forecasting (2020)
-
Kansas State University - Dept. of Statistics
Advised university researchers on a wide variety of statistical topics through the Statistical Consulting Lab, including experimental design, repeated measures analysis, logistic regression analysis, and model interpretation
-
Kansas State University - Dept. of Statistics
Designed curriculum, instructed, and tutored for Business and Economics Statistics I & II
-
Kansas State University
Tutored undergraduates in various mathematics and statistics courses through the Scholars Assisting Scholars program, the Statistics Help Lab, and the K-State Athletic Department
-
Kansas State University - Choral Music Division
Served as Lead Counselor for the highly selective Summer Choral Institute, designed to build leadership and musical skills in top-performing high school students
Toolbox
Traditional Data Science Methods
Linear Models (Polynomial, Logistic, Probit)
Tree-Based Models (Random Forest, XGBoost)
Classification (kNN, SVM)
Forecasting (ARIMA, LSTM, Decomposition)
Clustering (K-Means, Hierarchical, DBSCAN, GMM)
Anomaly Detection (Isolation Forest, Mahalanobis)
Dimensionality Reduction (PCA, t-SNE, Autoencoder)
Scoring (Predictive, Normalization, Weighted Avg)
Hypothesis Testing (t-test, ANOVA, Chi-Square)
Feature Importance (Shapley, LIME, Permutation)
Ensembles (Bagging, Boosting, Stacking, Blending)
Advanced Techniques
Deep Learning (CNN, RNN, LSTM, GAN, AE, Transformer)
Computer Vision (Detection, Tracking, Classification)
NLP (Classification, Sentiment, Topic Modeling)
Image Compression (DL-Based Codecs, JPEG, HEIF)
Risk Modeling (Cox Model, Kaplan-Meier Curve)
Ordinal Regression (GLM, CORAL)
Cardinality Reduction (Embedding, Encoding)
Optimization (Gradient Descent, Weighted Sum, TPE)
Data Augmentation (SMOTE, Image Transforms, Noise)
Recommendations (Clustering, Predictive Modeling)
Web Scraping (HTML, XPath, Regex)
Tech Stack

Python
GitHub
VS Code
PyTorch
scikit-learn
Keras
Jupyter
AWS
SageMaker
PySpark
Pandas
NumPy
R
PostgreSQL
SQL Server
OpenCV
XGBoost
statsmodels
Tableau
Airflow
Jira
SciPy
Prophet
NLTK
Arc
Obsidian
Figma
matplotlib
BeautifulSoup
Selenium