Marketing Analytics
This page showcases my marketing analytics and data engineering work — from building cloud-to-local pipelines on live campaign data, to statistical modeling of conversion drivers, to interactive operational analytics for stakeholder decision-making. My target audience is analytics managers, data-driven marketing teams, and academic search committees looking for someone who can own the full analytics stack.
GA4 → BigQuery → DuckDB Conversion Modeling Pipeline
This covers the data engineering and modeling layer of the Rising Stars engagement. For the website build, GTM/GA4 implementation, and campaign analytics, see Digital Marketing → Rising Stars.
For the Rising Stars Muay Thai engagement, I designed and built a multi-stage data pipeline to process raw GA4 event exports at scale and power downstream predictive modeling.
Pipeline Architecture
```
GA4 Raw Events
      ↓
BigQuery (Cloud Export)
      ↓
Apache Arrow (In-Memory Transfer)
      ↓
DuckDB (Local Query Engine)
      ↓
Session-Level Aggregations → R Modeling Environment
```
- BigQuery export — automated extraction of raw GA4 event data (sessions, events, user properties, ecommerce) to cloud storage.
- GA4 Data API backfill — used the GA4 Data API to recover pre-BigQuery-export data, ensuring complete event-window coverage across both event cycles.
- Apache Arrow transfer — zero-copy in-memory transport between cloud and local environments, eliminating intermediate CSV serialization.
- DuckDB local engine — SQL-based session-level aggregation, feature engineering (page depth, scroll events, device type, traffic medium), and query optimization for fast iteration during modeling.
Predictive Modeling — Conversion Drivers
Built three logistic regression models — ticket conversion, PPV conversion, and combined conversion — to quantify what independently predicts a purchase.
- Features: page depth, device category, traffic medium, pre/post-event timing, content engagement signals
- Outputs: odds ratios and average marginal effects for each predictor — giving stakeholders an interpretable answer to “what moves the needle?”
- Validation: compared in-sample fit against holdout periods to assess model stability across event cycles
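The odds-ratio and average-marginal-effect outputs can be illustrated with a minimal logistic regression fit by Newton-Raphson on synthetic data (the client's features and figures are confidential; the predictors, coefficients, and sample here are invented, and the production models were fit in R):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for session features: page depth and a
# mobile-device indicator driving a binary conversion outcome.
n = 2000
page_depth = rng.poisson(3.0, n).astype(float)
mobile = rng.integers(0, 2, n).astype(float)
X = np.column_stack([np.ones(n), page_depth, mobile])
true_beta = np.array([-2.0, 0.4, -0.3])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ true_beta))).astype(float)

# Fit logistic regression by Newton-Raphson (equivalent to IRLS).
beta = np.zeros(3)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    grad = X.T @ (y - p)
    hess = X.T @ (X * (p * (1.0 - p))[:, None])
    beta += np.linalg.solve(hess, grad)

# Odds ratio per predictor: exp(coefficient).
odds_ratios = np.exp(beta)

# Average marginal effect: mean over sessions of p*(1-p)*beta_j,
# i.e. the average change in conversion probability per unit of x_j.
p = 1.0 / (1.0 + np.exp(-X @ beta))
ame = (p * (1.0 - p)).mean() * beta

print(dict(zip(["intercept", "page_depth", "mobile"], odds_ratios)))
print(dict(zip(["intercept", "page_depth", "mobile"], ame)))
```

Odds ratios answer "how much more likely," while average marginal effects answer "how many percentage points," which is usually the version stakeholders act on.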
All client metrics, session counts, and conversion figures are confidential. The pipeline architecture and modeling approach are described without proprietary data.
Operations Analytics — Ontario International Airport Parking
As part of the Cal Poly Pomona CEO Business Challenge, I conducted a multi-year operational time-series analysis for Ontario International Airport (ONT), exploring how parking revenue, passenger volumes, and ride-hailing (TNC) activity interact — with the goal of informing future-proof parking strategy and non-aeronautical revenue growth.
Data Engineering
Working under strict data governance requirements (NDA), I:
- Integrated multiple datasets: parking revenue and transactions by lot, passenger counts by airline and direction, TNC trip volumes, and lot occupancy.
- Reshaped disparate sources into consistent monthly and weekly time series with engineered date features.
- Standardized categories across inconsistent vendor naming conventions.
- Generated a non-sensitive YAML schema describing table structures and missingness — enabling safe internal documentation without exposing raw values.
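The non-sensitive schema idea can be sketched in a few lines of Python: describe each column's name, inferred type, and missingness share while never serializing a raw value (the table name and rows below are hypothetical, and the YAML is emitted by hand to avoid a dependency):

```python
from typing import Any

def schema_yaml(name: str, rows: list[dict[str, Any]]) -> str:
    """Describe a table's structure and missingness without exposing values."""
    cols: dict[str, dict[str, Any]] = {}
    n = len(rows)
    for row in rows:
        for col, val in row.items():
            info = cols.setdefault(col, {"types": set(), "missing": 0})
            if val is None:
                info["missing"] += 1
            else:
                info["types"].add(type(val).__name__)
    lines = [f"table: {name}", f"rows: {n}", "columns:"]
    for col, info in cols.items():
        typ = "/".join(sorted(info["types"])) or "unknown"
        lines.append(f"  - name: {col}")
        lines.append(f"    type: {typ}")
        lines.append(f"    missing_share: {info['missing'] / n:.2f}")
    return "\n".join(lines)

# Hypothetical rows standing in for a parking-revenue extract.
doc = schema_yaml("parking_revenue", [
    {"lot": "A", "month": "2023-01", "revenue": 1200.0},
    {"lot": "B", "month": "2023-01", "revenue": None},
])
print(doc)
```

Because only names, types, and missingness ratios appear in the output, the schema can circulate internally without touching the NDA boundary.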
Interactive Visualizations
Built in R with Plotly and Quarto:
| Visualization | Purpose |
|---|---|
| Stacked time-series (monthly revenue by lot) | Reveal product mix shifts over time |
| Rank trajectory chart | Track lot performance standings across periods |
| Inbound vs. outbound flow chart | Identify directional asymmetries in passenger volumes |
| TNC Sankey diagram | Map ride-hail company → trip type relationships |
| Weekly occupancy heatmap | Expose day-of-week and seasonal occupancy patterns |
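The pivot behind the weekly occupancy heatmap is worth showing on its own: daily observations become a week-by-day-of-week matrix that the plotting layer then colors (the dates and occupancy rates below are invented, and the real chart was built in R with Plotly):

```python
from datetime import date

# Hypothetical daily occupancy rates; real ONT figures are confidential.
daily = {
    date(2023, 5, 1): 0.62,   # Monday
    date(2023, 5, 2): 0.58,
    date(2023, 5, 6): 0.91,   # Saturday
    date(2023, 5, 8): 0.65,
    date(2023, 5, 13): 0.94,
}

# Pivot into the (ISO week) x (day-of-week) matrix a heatmap renders:
# one row per week, columns Monday..Sunday, None where no observation.
weeks = sorted({d.isocalendar().week for d in daily})
matrix = {w: [None] * 7 for w in weeks}
for d, occ in daily.items():
    matrix[d.isocalendar().week][d.weekday()] = occ

for w in weeks:
    print(w, matrix[w])
```

Laying the data out this way makes day-of-week effects read across each row and seasonal drift read down each column, which is exactly the pattern the heatmap is meant to expose.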
All ONT data and figures are confidential; this describes methods and visualization types, not proprietary results.
Geospatial Analytics — Market Area Mapping with Shiny & APIs
In my role as an Associate Analyst at Robert D. Niehaus, Inc., I replaced a manual field data collection process that posed safety risks to analysts with a Shiny R application that automates GPS-based market area delineation for military installations.
The Problem
Determining housing “market areas” for 300+ U.S. military installations previously required analysts to manually note observations at major intersections — a slow, inaccurate, and unsafe process.
The Solution
- Strava API integration — the app captures GPS data passively during field drives, eliminating manual recording.
- `sf` + `leaflet` mapping — automatically generates an interactive map of the driven routes for immediate review and QA.
- HERE API integration — provides independent market area estimates as a verification layer against the GPS-driven boundaries.
- Deployed as a self-contained Shiny app used operationally across the Housing Market Analysis team.
Impact: eliminated safety risks, improved spatial accuracy, and compressed the field data collection cycle for 300+ installation analyses.
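The core geometric step, turning a cloud of GPS fixes into a candidate boundary, can be sketched with a convex hull (the production app works with `sf` geometries rendered in `leaflet`, not this hand-rolled monotone-chain hull, and the coordinates below are invented):

```python
# Monotone-chain convex hull over (lon, lat) GPS fixes: a simplified
# stand-in for the boundary-delineation step in the Shiny app.
def convex_hull(points: list[tuple[float, float]]) -> list[tuple[float, float]]:
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when o->a->b turns counterclockwise.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]  # counterclockwise boundary vertices

# Hypothetical GPS fixes from a field drive; the interior point drops out.
drive = [(-117.60, 34.05), (-117.55, 34.05), (-117.55, 34.10),
         (-117.60, 34.10), (-117.575, 34.075)]
hull = convex_hull(drive)
print(hull)
```

Points recorded passively via the Strava API feed a step like this, and the HERE API estimate then serves as an independent check on the resulting boundary.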
Team Performance Dashboard
While managing a large flagging and quality-control workflow at Robert D. Niehaus, I built a real-time progress dashboard in R to address declining team morale from an overwhelming, opaque workload.
The dashboard visualized:
- Total flagged records resolved vs. remaining
- Team throughput over time (daily and weekly trends)
- Individual contribution breakdowns
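All three views reduce to simple aggregations over a resolution log, sketched here in Python with invented names and counts (the dashboard itself was built in R):

```python
from collections import Counter
from datetime import date

# Hypothetical flag-resolution log: (analyst, date resolved).
resolved = [
    ("ana", date(2024, 3, 4)), ("ben", date(2024, 3, 4)),
    ("ana", date(2024, 3, 5)), ("ana", date(2024, 3, 5)),
    ("cam", date(2024, 3, 5)),
]
total_flagged = 12

remaining = total_flagged - len(resolved)                # resolved vs remaining
daily_throughput = Counter(day for _, day in resolved)   # trend over time
by_analyst = Counter(name for name, _ in resolved)       # individual breakdown

print(remaining, dict(daily_throughput), dict(by_analyst))
```

The point is less the computation than the exposure: refreshing these three numbers on a schedule is what turned an opaque backlog into a visible finish line.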
Outcome: within days of deployment, the team had clear visibility into their progress. Morale improved measurably, and throughput increased as the path to completion became concrete and visible. A small analytics intervention with a direct operational impact.
Analytics Stack
| Layer | Tools |
|---|---|
| Tracking & Tag Management | Google Tag Manager, GA4 |
| Cloud Data | BigQuery, GA4 Data API |
| Data Transport | Apache Arrow, Parquet |
| Local Query Engine | DuckDB, SQL |
| Analysis & Modeling | R (tidyverse, lubridate, broom), Python |
| Visualization | Plotly, ggplot2, leaflet, DT |
| Reporting | Quarto (HTML, PDF, interactive) |
| Dashboards | Shiny |