gem09lo/README.md

Hey there, I'm Gem Lo 👋

I'm passionate about turning raw data into actionable insights. Trained at Sigma Labs XYZ, I build end-to-end data pipelines and analytical tools using Python, SQL, and AWS. I enjoy working across the data lifecycle, from processing and analysis to visualisation and stakeholder reporting, and I use tools like Tableau and Streamlit to solve real-world problems and support data-driven decision-making.

🎯 Key Projects

🏛 Liverpool Museum Visitor Feedback Pipeline

Goal: Improve visitor engagement and operational efficiency at a natural history museum by analysing real-time and historical survey data.

Task:

  • Built two parallel ETL pipelines to process kiosk feedback: one for historical data (Python + PostgreSQL) and one for real-time streams (Kafka + AWS EC2).
  • Provisioned cloud infrastructure using Terraform and stored data in AWS RDS.
  • Created an interactive Tableau dashboard to provide museum staff with real-time insights and recommendations on exhibit popularity and visitor satisfaction.
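As a rough illustration of the validation a kiosk feedback pipeline like this might run before loading a message, here is a minimal sketch. The message shape (`at`, `site`, `val`), the 0 to 4 rating range, and the function name are all hypothetical and are not taken from the repository.

```python
from datetime import datetime

# Hypothetical message shape: {"at": ISO timestamp, "site": exhibit id, "val": rating 0-4}.
VALID_RATINGS = range(0, 5)


def validate_message(message: dict) -> tuple[bool, str]:
    """Return (is_valid, reason) for one kiosk feedback message."""
    for key in ("at", "site", "val"):
        if key not in message:
            return False, f"missing key: {key}"
    try:
        # fromisoformat raises ValueError for bad strings and TypeError for non-strings.
        datetime.fromisoformat(message["at"])
    except (TypeError, ValueError):
        return False, "invalid timestamp"
    if not isinstance(message["val"], int) or message["val"] not in VALID_RATINGS:
        return False, "rating out of range"
    return True, "ok"
```

Returning a reason alongside the flag makes it easy to log why a message was rejected instead of silently dropping it.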

Key Tools: Python, SQL, Kafka, Tableau, AWS (EC2, RDS), Terraform

🔗 View Repository


🍔 Tasty Truck Treats – Automated Sales Analytics

Goal: Help a mobile catering business use data to optimise menus, routes, and marketing strategy.

Task:

  • Designed a two-layer pipeline: one for processing historical sales data, another for ingesting daily transactions in real time using AWS Lambda and Step Functions.
  • Built a Streamlit dashboard to track performance KPIs and configured AWS SES to email daily reports to leadership.
  • Cleaned and transformed data using Pandas, and stored insights in Amazon Redshift.
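The cleaning and KPI step could look roughly like the sketch below. The project used Pandas; this version swaps in the standard library so it runs without dependencies, and the record fields (`truck_id`, `total`) and function name are hypothetical.

```python
from collections import defaultdict


def summarise_daily_sales(transactions):
    """Aggregate total revenue and transaction count per truck."""
    summary = defaultdict(lambda: {"revenue": 0.0, "transactions": 0})
    for row in transactions:
        total = row.get("total")
        # Skip rows with missing or non-positive totals (a stand-in for the cleaning step).
        if not isinstance(total, (int, float)) or total <= 0:
            continue
        truck = summary[row["truck_id"]]
        truck["revenue"] += total
        truck["transactions"] += 1
    return dict(summary)
```

A summary in this shape could feed both a dashboard and a daily email report, since each only needs per-truck totals.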

Key Tools: Python (Pandas), SQL, AWS (S3, Lambda, Redshift, SES, Step Functions), Terraform, Docker, Streamlit

🔗 View Repository


🛍️ It’s On Sale – Price Tracking & Alert System

Role: Project Manager

Goal: Empower users to get notified when products they care about go on sale.

Task:

  • Built a web scraping tool (Python + BeautifulSoup) to track price changes across multiple e-commerce sites.
  • Developed a Streamlit front end for user input and settings.
  • Integrated AWS SES for real-time email alerts when tracked products dropped below user-defined thresholds.
  • Presented a live demo to non-technical stakeholders and discussed how businesses could use the tool for competitor pricing analysis.
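The threshold check behind the alerts can be sketched as follows. This is a minimal illustration, not the project's code: the function name and the two dictionaries (scraped prices and user thresholds, both keyed by product name) are assumptions.

```python
def products_to_alert(current_prices, thresholds):
    """Return sorted product names whose current price is below the user's threshold."""
    alerts = []
    for product, threshold in thresholds.items():
        price = current_prices.get(product)
        # A product with no scraped price yet is skipped rather than alerted on.
        if price is not None and price < threshold:
            alerts.append(product)
    return sorted(alerts)
```

The comparison is strict, so a price exactly equal to the threshold does not trigger an alert; whether that matches the real tool is an assumption.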

Key Tools: Python, Streamlit, BeautifulSoup, Pandas, AWS SES, Terraform

🔗 View Repository


🧬 PubMed Articles – Institution Matching & Metadata Extraction (Machine Learning)

Goal: Automate the extraction and matching of institutional affiliations from PubMed research articles to support pharmaceutical research.

Task:

  • Built a multi-step ETL pipeline to extract XML data from S3, transform and clean it using spaCy and RapidFuzz, and match institutions to the GRID dataset.
  • Processed over 1 million PubMed articles with author metadata, keywords, and institutional affiliations.
  • Integrated NLP (spaCy) for named entity recognition (GPE and ORG) and applied fuzzy matching (RapidFuzz) to identify institutions.
  • Dockerised the pipeline and deployed it via AWS ECS Fargate, with automated task triggering using CloudWatch and EventBridge.
  • Managed infrastructure using Terraform.
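To illustrate the fuzzy-matching idea, the sketch below scores an affiliation string against a list of known institution names. It deliberately uses `difflib.SequenceMatcher` from the standard library as a stand-in for RapidFuzz, so the scores will differ from the real pipeline; the function name, the 0.8 cut-off, and the sample institution list are hypothetical, and the spaCy entity-recognition step is not shown.

```python
from difflib import SequenceMatcher


def best_institution_match(affiliation, known_institutions, min_score=0.8):
    """Return (name, score) of the closest known institution, or None below min_score."""
    best_name, best_score = None, 0.0
    for name in known_institutions:
        # Compare case-insensitively; ratio() is a similarity score in [0, 1].
        score = SequenceMatcher(None, affiliation.lower(), name.lower()).ratio()
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= min_score:
        return best_name, best_score
    return None


known = ["University of Oxford", "University of Cambridge", "Imperial College London"]
match = best_institution_match("The University of Oxford", known)
```

A linear scan like this is fine for a small list; at the scale of a full institution dataset, a dedicated library such as RapidFuzz is the more practical choice.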

Key Tools: Python (Pandas, spaCy, RapidFuzz), XML, AWS (S3, ECS Fargate, EventBridge, SES), Docker, Terraform

🔗 View Repository



📚 Additional Projects

Software Development Fundamentals

  • 🃏 Blackjack Game - Procedural - CLI game demonstrating clean code principles, TDD with pytest, and Python best practices (9.5/10 Pylint score).
  • 🃏 Blackjack Game - OOP Refactor - Extension of the procedural Blackjack game, rewritten using object-oriented principles. Focuses on modular class design (Card, Deck, Hand), unit testing, and clean architecture for future gameplay expansion.
  • 🎥 Blockbuster Rental System - OOP - Simulates a video rental store using object-oriented design (Video, Customer, Rental, VideoStore). Demonstrates clean code, TDD with pytest, and real-world logic (due dates, fines, age validation).
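For a flavour of the class design mentioned in the Blackjack OOP refactor, here is a minimal sketch of `Card` and `Hand` with ace-aware scoring. Only the class names come from the description above; the attributes, properties, and scoring rule are assumptions, and `Deck` is omitted for brevity.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Card:
    rank: str  # "2" to "10", "J", "Q", "K", or "A"
    suit: str

    @property
    def base_value(self) -> int:
        if self.rank in ("J", "Q", "K"):
            return 10
        if self.rank == "A":
            return 11
        return int(self.rank)


@dataclass
class Hand:
    cards: list = field(default_factory=list)

    def add(self, card: Card) -> None:
        self.cards.append(card)

    @property
    def value(self) -> int:
        total = sum(card.base_value for card in self.cards)
        aces = sum(1 for card in self.cards if card.rank == "A")
        # Count an ace as 1 instead of 11 while the hand would otherwise bust.
        while total > 21 and aces:
            total -= 10
            aces -= 1
        return total

    @property
    def is_bust(self) -> bool:
        return self.value > 21
```

Keeping the scoring inside `Hand` means tests can build a hand card by card and assert on `value` without touching any game loop.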

Networking Fundamentals

  • ✈️ Airport Departure Board CLI - Real-time flight board that pulls live data from two public APIs (Airlabs & WeatherAPI) to show upcoming departures and weather at each destination. Built using Python and requests, with error handling, CLI arguments, and rich for styled terminal output. Final output can be exported to JSON or HTML.

Key Tools: Python, RESTful APIs, HTTP requests, HTTP responses, argparse, requests, rich, JSON/HTML
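The CLI-argument and JSON-export parts of a tool like the departure board might be wired up as in this sketch. The argument names, the `--export` flag, and the shape of the flight records are hypothetical; no API calls are made here.

```python
import argparse
import json


def build_parser() -> argparse.ArgumentParser:
    """Build the command-line interface for the (hypothetical) departure board."""
    parser = argparse.ArgumentParser(description="Show upcoming departures for an airport.")
    parser.add_argument("airport", help="IATA code of the departure airport, e.g. LHR")
    parser.add_argument(
        "--export",
        choices=["json", "html"],
        default=None,
        help="optional export format",
    )
    return parser


def export_json(flights: list[dict]) -> str:
    """Serialise flight records to a pretty-printed JSON string."""
    return json.dumps(flights, indent=2)


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["LHR", "--export", "json"])
```

Using `choices` lets argparse reject unsupported export formats before any network request is made.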


Backend Fundamentals

  • 📦 Pokémon API - Flask-based RESTful API that serves data on Pokémon species and their types. Includes full CRUD functionality, robust filtering/search logic, and detailed unit tests. Ideal for exploring REST principles and dynamic routing with Python.

Key Tools: Python, Flask, psycopg2-binary, REST API, JSON, Unit Testing, CRUD
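The filtering and search logic of an API like this can be illustrated independently of Flask and the database. In the sketch below, the function name, its parameters, and the in-memory record shape (`name`, `types`) are assumptions made for the example.

```python
def filter_species(species, type_name=None, search=None):
    """Filter species records by type and/or a case-insensitive name substring."""
    results = species
    if type_name is not None:
        wanted = type_name.lower()
        results = [s for s in results if wanted in (t.lower() for t in s["types"])]
    if search is not None:
        needle = search.lower()
        results = [s for s in results if needle in s["name"].lower()]
    return results


species = [
    {"name": "Bulbasaur", "types": ["Grass", "Poison"]},
    {"name": "Charmander", "types": ["Fire"]},
    {"name": "Charizard", "types": ["Fire", "Flying"]},
]
fire_names = [s["name"] for s in filter_species(species, type_name="fire")]
```

In a real endpoint the two optional arguments would typically come from query-string parameters, with the filtering pushed into SQL once the data set grows.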

  • 📰 Social News Website API - Reddit-style social news aggregator built with Flask. Users can submit stories, vote, search, and sort posts via a front-end web interface. Includes BBC web scraper, persistent JSON storage, and full test coverage with pytest.

Key Tools: Python, Flask, HTML/CSS (frontend), REST API, JSON, BeautifulSoup, Pytest

Popular repositories

  1. T3-Trucks-Project (Public)

    Python 1

  2. PriceSlashTrack (Public)

    This repository is a price tracking and notification system designed to help users monitor product prices and receive alerts when prices drop below their specified thresholds. By integrating with A…

    Python 1 3

  3. Woke-Content-Detector-Analysis (Public)

    Forked from indiah444/Woke-Content-Detector-Analysis

    Jupyter Notebook 1

  4. Liverpool-Museum-of-Natural-History-LMNH-Project (Public)

    Python 1

  5. PubMed-Articles (Public)

    Jupyter Notebook 1

  6. hello-world (Public)

    Git-it challenge