I'm a data-driven professional passionate about turning raw data into actionable insights. Trained at Sigma Labs XYZ, I build end-to-end data pipelines and analytical tools using Python, SQL, and AWS. I enjoy working across the data lifecycle, from processing and analysis through to visualisation and stakeholder reporting, using tools like Tableau and Streamlit to solve real-world problems and support decision-making.
Goal: Improve visitor engagement and operational efficiency at a natural history museum by analysing real-time and historical survey data.
Task:
- Built two parallel ETL pipelines to process kiosk feedback: one for historical data (Python + PostgreSQL) and one for real-time streams (Kafka + AWS EC2).
- Provisioned cloud infrastructure using Terraform and stored data in AWS RDS.
- Created an interactive Tableau dashboard to provide museum staff with real-time insights and recommendations on exhibit popularity and visitor satisfaction.
Key Tools: Python, SQL, Kafka, Tableau, AWS (EC2, RDS), Terraform
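A minimal sketch of the kind of cleaning step such a pipeline might apply before loading records into the database (field names, the 1-5 rating scale, and the record shape are illustrative assumptions, not the project's actual schema):

```python
from datetime import datetime
from typing import Optional

VALID_RATINGS = range(1, 6)  # assumes kiosk ratings are 1-5

def clean_kiosk_record(raw: dict) -> Optional[dict]:
    """Validate and normalise one kiosk feedback record.

    Returns a cleaned record, or None if the record is unusable,
    so a malformed message never crashes the stream.
    """
    try:
        rating = int(raw["rating"])
        site_id = str(raw["site_id"]).strip().upper()
        at = datetime.fromisoformat(raw["at"])
    except (KeyError, ValueError, TypeError):
        return None
    if rating not in VALID_RATINGS:
        return None
    return {"site_id": site_id, "rating": rating, "at": at.isoformat()}
```

In a streaming setup this function would sit between the Kafka consumer and the PostgreSQL insert, dropping bad messages rather than raising.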
Goal: Help a mobile catering business use data to optimise menus, routes, and marketing strategy.
Task:
- Designed a two-layer pipeline: one for processing historical sales data, another for ingesting daily transactions in real time using AWS Lambda and Step Functions.
- Built a Streamlit dashboard to track performance KPIs and configured AWS SES to email daily reports to leadership.
- Cleaned and transformed data using Pandas, and stored insights in Amazon Redshift.
Key Tools: Python (Pandas), SQL, AWS (S3, Lambda, Redshift, SES, Step Functions), Terraform, Docker, Streamlit
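The project aggregated KPIs with Pandas and Redshift; as a hedged stand-in, the core per-truck aggregation can be sketched with plain Python (the `(truck_id, amount)` transaction shape is an illustrative assumption):

```python
from collections import defaultdict

def daily_kpis(transactions):
    """Aggregate per-truck revenue and transaction counts from one day's sales.

    Each transaction is assumed to be a (truck_id, amount) pair.
    """
    revenue = defaultdict(float)
    counts = defaultdict(int)
    for truck_id, amount in transactions:
        revenue[truck_id] += amount
        counts[truck_id] += 1
    return {
        truck: {"revenue": round(revenue[truck], 2), "transactions": counts[truck]}
        for truck in revenue
    }
```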
Role: Project Manager
Goal: Empower users to get notified when products they care about go on sale.
Task:
- Built a web scraping tool (Python + BeautifulSoup) to track price changes across multiple e-commerce sites.
- Developed a Streamlit front end for user input and settings.
- Integrated AWS SES for real-time email alerts when tracked products dropped below user-defined thresholds.
- Presented a live demo to non-technical stakeholders and discussed how businesses could use the tool for competitor pricing analysis.
Key Tools: Python, Streamlit, BeautifulSoup, Pandas, AWS SES, Terraform
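The alerting decision at the heart of the tracker can be sketched as a small pure function (names and the dict shapes are illustrative; the real tool scraped prices and sent alerts via SES):

```python
def products_to_alert(tracked, latest_prices):
    """Return product IDs whose latest scraped price dropped below the
    user-defined threshold.

    `tracked` maps product_id -> threshold price; `latest_prices` maps
    product_id -> most recently scraped price. Products with no fresh
    price this run are skipped rather than alerted.
    """
    alerts = []
    for product_id, threshold in tracked.items():
        price = latest_prices.get(product_id)
        if price is not None and price < threshold:
            alerts.append(product_id)
    return alerts
```

Each ID returned would then be handed to the SES email step.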
Goal: Automate the extraction and matching of institutional affiliations from PubMed research articles to support pharmaceutical research.
Task:
- Built a multi-step ETL pipeline to extract XML data from S3, transform and clean it using spaCy and RapidFuzz, and match institutions to the GRID dataset.
- Processed over 1 million PubMed articles with author metadata, keywords, and institutional affiliations.
- Integrated NLP (spaCy) for named entity recognition (GPE and ORG) and applied fuzzy matching (RapidFuzz) to identify institutions.
- Dockerised the pipeline and deployed it via AWS ECS Fargate, with automated task triggering using CloudWatch and EventBridge.
- Managed infrastructure using Terraform.
Key Tools: Python (Pandas, spaCy, RapidFuzz), XML, AWS (S3, ECS Fargate, EventBridge, SES), Docker, Terraform
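The matching idea can be sketched without the real dependencies: the pipeline scored candidates with RapidFuzz against GRID, but stdlib `difflib` works as a stand-in to show the score-and-threshold pattern (the 0.85 cutoff and the name list are illustrative assumptions):

```python
from difflib import SequenceMatcher

def best_grid_match(affiliation, grid_names, threshold=0.85):
    """Return the closest canonical institution name, or None below threshold.

    A stdlib stand-in for the RapidFuzz scoring used in the real pipeline.
    """
    affiliation = affiliation.lower().strip()
    best_name, best_score = None, 0.0
    for name in grid_names:
        score = SequenceMatcher(None, affiliation, name.lower()).ratio()
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None
```

In the production pipeline, spaCy's ORG/GPE entities supply the `affiliation` strings fed into this matching step.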
- Blackjack Game - Procedural - CLI game demonstrating clean code principles, TDD with pytest, and Python best practices (9.5/10 Pylint score).
- Blackjack Game - OOP Refactor - Extension of the procedural Blackjack game, rewritten using object-oriented principles. Focuses on modular class design (Card, Deck, Hand), unit testing, and clean architecture for future gameplay expansion.
- Blockbuster Rental System - OOP - Simulates a video rental store using object-oriented design (Video, Customer, Rental, VideoStore). Demonstrates clean code, TDD with pytest, and real-world logic (due dates, fines, age validation).
- Airport Departure Board CLI - Real-time flight board that pulls live data from two public APIs (Airlabs & WeatherAPI) to show upcoming departures and weather at each destination. Built using Python and requests, with error handling, CLI arguments, and rich for styled terminal output. Final output can be exported to JSON or HTML.
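A hedged sketch of the class design the OOP Blackjack refactor describes (class and method names are illustrative, not the repo's actual API), including the ace-demotion logic every blackjack hand needs:

```python
class Card:
    """A playing card; face cards count 10, aces start at 11."""
    def __init__(self, rank):
        self.rank = rank  # "2".."10", "J", "Q", "K", "A"

    @property
    def value(self):
        if self.rank == "A":
            return 11
        if self.rank in ("J", "Q", "K"):
            return 10
        return int(self.rank)


class Hand:
    """A blackjack hand that demotes aces from 11 to 1 to avoid busting."""
    def __init__(self):
        self.cards = []

    def add(self, card):
        self.cards.append(card)

    @property
    def total(self):
        total = sum(card.value for card in self.cards)
        aces = sum(1 for card in self.cards if card.rank == "A")
        while total > 21 and aces:
            total -= 10  # count one ace as 1 instead of 11
            aces -= 1
        return total
```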
Key Tools: Python, RESTful APIs, HTTP requests, HTTP responses, argparse, requests, rich, JSON/HTML
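The board's central join, combining a departure from one API with the destination forecast from the other, can be sketched as a pure function (the dict shapes are illustrative assumptions about the parsed API responses, not Airlabs' or WeatherAPI's actual payloads):

```python
def attach_weather(departures, weather_by_city):
    """Join each departure with the forecast for its destination city.

    Cities with no forecast get None rather than raising, mirroring the
    error handling the CLI needs when one API call fails.
    """
    board = []
    for dep in departures:
        city = dep["dest_city"]
        board.append({
            "flight": dep["flight"],
            "dest_city": city,
            "weather": weather_by_city.get(city),
        })
    return board
```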
- Pokémon API - Flask-based RESTful API that serves data on Pokémon species and their types. Includes full CRUD functionality, robust filtering/search logic, and detailed unit tests. Ideal for exploring REST principles and dynamic routing with Python.
Key Tools: Python, Flask, psycopg2-binary, REST API, JSON, Unit Testing, CRUD
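The CRUD core behind such an API can be sketched as a framework-free, in-memory store that Flask routes would wrap (class, field names, and the type-based search are illustrative assumptions, not the project's actual schema):

```python
class PokemonStore:
    """In-memory CRUD store of the kind a Flask API's routes might delegate to."""

    def __init__(self):
        self._data = {}
        self._next_id = 1

    def create(self, name, types):
        pokemon = {"id": self._next_id, "name": name, "types": list(types)}
        self._data[self._next_id] = pokemon
        self._next_id += 1
        return pokemon

    def read(self, pokemon_id):
        return self._data.get(pokemon_id)

    def update(self, pokemon_id, **fields):
        pokemon = self._data.get(pokemon_id)
        if pokemon is None:
            return None
        for key in ("name", "types"):
            if key in fields:
                pokemon[key] = fields[key]
        return pokemon

    def delete(self, pokemon_id):
        return self._data.pop(pokemon_id, None) is not None

    def search(self, type_filter):
        return [p for p in self._data.values() if type_filter in p["types"]]
```

Keeping the store separate from the routing layer is what makes the detailed unit tests straightforward: the logic is testable without spinning up Flask.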
- Social News Website API - Reddit-style social news aggregator built with Flask. Users can submit stories, vote, search, and sort posts via a front-end web interface. Includes a BBC web scraper, persistent JSON storage, and full test coverage with pytest.
Key Tools: Python, Flask, HTML/CSS (frontend), REST API, JSON, BeautifulSoup, Pytest


