Description: Developed and deployed a full OCR pipeline using Google Cloud Vision and OpenCV to extract key fields (e.g., airport codes, tracking numbers) from scanned shipping labels, applied Gemini to classify the extracted fields, and automatically assigned routing decisions across UPS freight operations.
Tools/Technologies Used:
- Python, OpenCV, Pandas
- Google Cloud Vision API
- Vertex AI, Flask (for API deployment)
Methodology:
1. Preprocessed scanned label images using image cleaning and contour detection.
2. Extracted text using Google OCR and parsed it into structured data.
3. Applied business logic and Gemini-based interpretation to determine routing codes.
4. Packaged into a deployable API for automated sort decisions.
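A minimal sketch of steps 1–2, assuming a local label scan and Application Default Credentials for the Vision client; the field patterns at the end are illustrative placeholders, not the production parsing rules:

```python
import re
import cv2
from google.cloud import vision

def preprocess(path: str) -> bytes:
    """Clean a scanned label: grayscale, denoise, and binarize before OCR."""
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    denoised = cv2.fastNlMeansDenoising(gray, h=10)
    _, binary = cv2.threshold(denoised, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return cv2.imencode(".png", binary)[1].tobytes()

def extract_fields(path: str) -> dict:
    """Run Cloud Vision text detection and pull candidate fields from the result."""
    client = vision.ImageAnnotatorClient()
    response = client.text_detection(image=vision.Image(content=preprocess(path)))
    text = response.text_annotations[0].description if response.text_annotations else ""
    # Placeholder patterns -- the real business rules are more involved.
    tracking = re.search(r"\b1Z[0-9A-Z]{16}\b", text)   # common UPS tracking format
    airport = re.search(r"\b[A-Z]{3}\b", text)          # candidate IATA airport code
    return {"tracking": tracking and tracking.group(), "airport": airport and airport.group()}
```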
Results: Automated 90%+ of routing tasks for freight labels, improving accuracy and operational speed at UPS sort hubs.
Description: Built a voice-enabled AI agent that allows UPS Freight Flow Controllers to ask natural-language questions like “Where is trailer 123456?” and receive spoken answers with real-time trailer/bay assignments based on data scraped from the TMS.
Tools/Technologies Used:
- Google Speech-to-Text & Text-to-Speech APIs
- Gemini 1.5 Flash (LLM) for intent parsing
- Vertex AI, Python, FastAPI
Methodology:
1. Converted audio input to text using Google STT with custom context boosting.
2. Parsed intent and entities (e.g., trailer numbers) with Gemini.
3. Queried the UPS TMS system and responded via Google TTS.
4. Deployed as a cloud API and tested across real hub workflows.
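A condensed sketch of the speech legs (steps 1 and 3), assuming 16 kHz LINEAR16 audio and illustrative boost phrases; the Gemini intent parsing and TMS lookup sit between these two calls:

```python
from google.cloud import speech, texttospeech

def transcribe(audio_bytes: bytes) -> str:
    """Google STT with context boosting so yard vocabulary outranks similar words."""
    client = speech.SpeechClient()
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code="en-US",
        # Boost values and phrases here are assumptions, not the tuned settings.
        speech_contexts=[speech.SpeechContext(phrases=["trailer", "bay", "dock"], boost=15.0)],
    )
    response = client.recognize(config=config, audio=speech.RecognitionAudio(content=audio_bytes))
    return " ".join(r.alternatives[0].transcript for r in response.results)

def speak(answer: str) -> bytes:
    """Render the looked-up assignment as spoken audio via Google TTS."""
    client = texttospeech.TextToSpeechClient()
    response = client.synthesize_speech(
        input=texttospeech.SynthesisInput(text=answer),
        voice=texttospeech.VoiceSelectionParams(language_code="en-US"),
        audio_config=texttospeech.AudioConfig(audio_encoding=texttospeech.AudioEncoding.MP3),
    )
    return response.audio_content
```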
Results: Reduced manual trailer lookup time by over 70%, enabled hands-free interaction for control room staff, and served as a proof-of-concept for conversational AI in logistics.
Description: This project explores student performance data to identify relationships between parental education, economic indicators, and prior academic results, aiming to understand factors influencing graduation success.
Tools/Technologies Used:
- Python, Pandas, NumPy
- Matplotlib, Seaborn, Statsmodels
Methodology:
1. Cleaned and prepared the data by handling missing values and outliers.
2. Conducted exploratory data analysis (EDA) to identify patterns and distributions.
3. Performed regression analysis to evaluate relationships between features and graduation rates.
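A minimal sketch of the step 3 regression with statsmodels; the file and column names are assumptions, not the dataset's actual schema:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical file and column names standing in for the real dataset.
df = pd.read_csv("student_performance.csv").dropna()
model = smf.ols("graduation_rate ~ parental_education + family_income + prior_gpa", data=df).fit()
print(model.summary())  # coefficients, p-values, and R-squared per factor
```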
Results: Parental education and economic indicators significantly impact student graduation rates.
Description: This project uses machine learning models to detect fraudulent financial transactions by analyzing patterns and identifying anomalies.
Tools/Technologies Used:
- Python, Scikit-learn, Pandas, NumPy
- Matplotlib
Methodology:
1. Cleaned and combined credit card, online retail, and mobile transaction datasets.
2. Implemented classification models such as Logistic Regression and Random Forest.
3. Evaluated model performance using precision, recall, and AUC scores.
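A sketch of steps 2–3 for the Random Forest branch, assuming a combined file and an is_fraud label column (both names are hypothetical):

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, roc_auc_score

df = pd.read_csv("transactions.csv")                     # hypothetical combined dataset
X, y = df.drop(columns="is_fraud"), df["is_fraud"]
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42)
clf = RandomForestClassifier(n_estimators=200, class_weight="balanced", random_state=42)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))        # precision and recall
print(roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))    # AUC
```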
Results: The Random Forest model achieved an AUC score of 0.98, effectively detecting fraudulent activities.
Description: Analyzes video game sales data to identify top-performing games, platforms, and regional trends.
Tools/Technologies Used:
- Python, Pandas, Matplotlib
Methodology:
1. Cleaned the dataset and removed irrelevant entries.
2. Performed EDA to analyze sales trends across platforms, regions, and publishers.
3. Visualized insights using bar charts and line graphs.
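A sketch of the step 2–3 aggregation, assuming the commonly used vgsales.csv column names (Publisher, Global_Sales):

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("vgsales.csv")  # assumed columns: Publisher, Platform, Global_Sales
df.groupby("Publisher")["Global_Sales"].sum().nlargest(10).plot(kind="bar")
plt.ylabel("Global sales (millions of units)")
plt.title("Top 10 publishers by global sales")
plt.tight_layout()
plt.show()
```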
Results: The analysis showed Nintendo as a dominant publisher, with the Wii leading in platform-specific global sales.
Description: Explores relationships between vehicle specifications (horsepower, weight, fuel efficiency) and builds a predictive model for fuel efficiency.
Tools/Technologies Used:
- Python, Pandas, Matplotlib, Seaborn
Methodology:
1. Cleaned and preprocessed automotive data.
2. Explored relationships between features using heatmaps and scatter plots.
3. Built a regression model to predict fuel efficiency (mpg).
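A sketch of steps 2–3, using statsmodels as a stand-in for the regression step and assumed column names:

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import statsmodels.formula.api as smf

df = pd.read_csv("auto_mpg.csv").dropna()  # assumed columns: mpg, weight, horsepower
sns.heatmap(df[["mpg", "weight", "horsepower"]].corr(), annot=True)
plt.show()
model = smf.ols("mpg ~ weight + horsepower", data=df).fit()
print(model.params)  # negative coefficients: heavier, more powerful cars get fewer mpg
```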
Results: The regression model highlighted weight and horsepower as key factors negatively impacting fuel efficiency.
Description: Explores the Boston Housing dataset to identify socioeconomic factors influencing housing prices.
Tools/Technologies Used:
- Python, Pandas, Matplotlib, Seaborn
Methodology:
1. Performed data cleaning and handled missing values.
2. Conducted EDA to find correlations between crime rate, rooms per dwelling, and housing prices.
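A minimal sketch of step 2, assuming the dataset's conventional column names (CRIM for crime rate, RM for rooms per dwelling, MEDV for median price):

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.read_csv("boston_housing.csv")  # file name is an assumption
sns.heatmap(df[["CRIM", "RM", "MEDV"]].corr(), annot=True, cmap="coolwarm")
plt.title("Crime rate, rooms per dwelling, and median home price")
plt.show()
```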
Results: The number of rooms per dwelling had a strong positive correlation with housing prices, while the crime rate negatively impacted prices.
Description: Uses web scraping techniques to extract the top 100 eBooks from Project Gutenberg.
Tools/Technologies Used:
- Python, BeautifulSoup, Requests
Methodology:
1. Scraped data (titles, authors, and book IDs) from Project Gutenberg.
2. Cleaned and formatted the data using regular expressions.
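A minimal sketch of both steps; the CSS selector reflects the top-100 page's layout at the time of writing and may need adjusting if the markup changes:

```python
import re
import requests
from bs4 import BeautifulSoup

URL = "https://www.gutenberg.org/browse/scores/top"
soup = BeautifulSoup(requests.get(URL, timeout=30).text, "html.parser")

books = []
for link in soup.select("li a[href^='/ebooks/']")[:100]:
    # Entries read like "Title by Author (12345)"; strip the trailing download count.
    title = re.sub(r"\s*\(\d+\)\s*$", "", link.get_text(strip=True))
    books.append({"id": link["href"].split("/")[-1], "title": title})
print(books[:5])
```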
Results: Successfully extracted and cleaned a list of the top 100 eBooks, providing a ready-to-use dataset.
Description: Analyzes TSA complaint data to uncover trends and patterns across categories and airports.
Tools/Technologies Used:
- Python, Pandas, Seaborn, Matplotlib
Methodology:
1. Cleaned and combined TSA complaint datasets.
2. Visualized complaints by airport, time period, and complaint category.
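A sketch of the step 2 category breakdown, with hypothetical file and column names:

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.read_csv("tsa_complaints.csv")  # hypothetical merged dataset
top = df["category"].value_counts().nlargest(10)
sns.barplot(x=top.values, y=top.index)
plt.xlabel("Number of complaints")
plt.title("Most frequent TSA complaint categories")
plt.tight_layout()
plt.show()
```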
Results: The analysis revealed mishandling of passenger property as the most frequent complaint.
Description: Uses time series analysis to model and predict monthly retail sales trends in the United States, identifying key patterns, trends, and disruptions.
Tools/Technologies Used:
- Python, Pandas, Statsmodels
- ARIMA (AutoRegressive Integrated Moving Average), Matplotlib
Methodology:
1. Cleaned monthly retail sales data and handled inconsistencies.
2. Visualized retail sales trends over time.
3. Applied ARIMA models for time series forecasting.
4. Evaluated model accuracy.
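A minimal sketch of steps 3–4, assuming a monthly series in a hypothetical retail_sales.csv; the (1, 1, 1) order is illustrative, not the tuned one:

```python
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

sales = pd.read_csv("retail_sales.csv", parse_dates=["date"], index_col="date")["sales"]
sales = sales.asfreq("MS")                    # month-start frequency
model = ARIMA(sales, order=(1, 1, 1)).fit()   # illustrative order
print(f"AIC: {model.aic:.1f}")                # rough in-sample fit check (step 4)
print(model.forecast(steps=12))               # 12-month-ahead forecast
```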
Results: Successfully predicted retail sales trends, highlighting major economic disruptions like the 2008 financial crisis and COVID-19.
Description: Builds a predictive model to identify customers likely to churn from a telecom service provider.
Tools/Technologies Used:
- Python, Pandas, NumPy, Scikit-learn
- Logistic Regression, Gradient Boosting, Matplotlib, Seaborn
Methodology:
1. Cleaned data, handled missing values, and prepared features for modeling.
2. Visualized usage patterns and billing trends across churned and retained customers.
3. Implemented Logistic Regression and Gradient Boosting models.
4. Evaluated performance using accuracy, precision, recall, and AUC metrics.
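A sketch of steps 3–4 for the Gradient Boosting branch, with a hypothetical prepared dataset and churn label:

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, roc_auc_score

df = pd.read_csv("telecom_churn.csv")               # hypothetical prepared dataset
X, y = df.drop(columns="churn"), df["churn"]
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42)
gb = GradientBoostingClassifier(random_state=42).fit(X_train, y_train)
print(accuracy_score(y_test, gb.predict(X_test)))             # overall accuracy
print(roc_auc_score(y_test, gb.predict_proba(X_test)[:, 1]))  # ROC-AUC
```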
Results: The Gradient Boosting model achieved 85% accuracy with a strong ROC-AUC score, successfully identifying high-risk churn customers.
Description: Analyzes global mental health data to identify prevalence rates, risk factors, and trends in disorders like depression and anxiety.
Tools/Technologies Used:
- R, ggplot2, dplyr, caret
Methodology:
1. Cleaned and organized global mental health datasets.
2. Visualized prevalence rates across demographics, regions, and time periods.
3. Applied regression models to identify risk factors influencing mental health disorders.
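The project itself was done in R (ggplot2, caret); purely to illustrate the step 3 regression idea, here is a Python sketch with assumed file and column names:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Column names are assumptions standing in for the dataset's actual fields.
df = pd.read_csv("mental_health.csv").dropna()
model = smf.ols("depression_prevalence ~ gdp_per_capita + unemployment_rate", data=df).fit()
print(model.summary())  # which economic factors relate to prevalence, and how strongly
```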
Results: Significant relationships were found between economic conditions, unemployment, and mental health disorder prevalence.
Description: Explores U.S. childcare costs from 2008 to 2018, highlighting trends, regional disparities, and their socioeconomic impacts.
Tools/Technologies Used:
- Python, Tableau, Pandas, Excel
Methodology:
1. Cleaned missing values and preprocessed variables in the childcare costs dataset.
2. Conducted EDA to identify key trends and regional disparities.
3. Designed a Tableau dashboard for interactive exploration.
4. Documented findings in a final report.
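A sketch of the step 2 trend check behind the headline figure, with hypothetical file and column names:

```python
import pandas as pd

df = pd.read_csv("childcare_costs.csv")  # assumed columns: year, region, annual_cost
trend = df.groupby("year")["annual_cost"].median()
print(f"Median cost change, first to last year: {(trend.iloc[-1] / trend.iloc[0] - 1) * 100:.0f}%")
print(df.groupby("region")["annual_cost"].median().sort_values(ascending=False))
```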
Results: Childcare costs increased by 21% over the decade, with the Northeast and West Coast having the highest costs.