Data Engineer | Spark • Flink • Airflow • AWS • Streaming • AI Orchestration
I design and operate reliable, scalable batch and real-time data platforms with a strong focus on:
- correctness
- performance
- failure handling
- automation
I enjoy turning complex distributed systems into predictable, operable platforms.
- Large-scale batch & streaming pipelines
- Retry-safe, idempotent data workflows
- Low-latency analytics systems
- AI-driven orchestration using metadata & MCP
- Cloud-native data platforms on AWS
Languages
Python · SQL · Java
Data & Streaming
Apache Spark · Apache Flink · Kafka · Iceberg
Orchestration & Cloud
Airflow · AWS (S3, EMR, EMR Serverless, Glue, Athena) · Docker
AI & Agents
Model Context Protocol (MCP) · AI Agents · LLMs
%%{init: { 'theme': 'dark', 'themeVariables': { 'pie1': '#7aa2f7', 'pie2': '#bb9af7', 'pie3': '#7dcfff', 'pie4': '#ff9e64', 'pieTitleTextSize': '20px', 'pieLegendTextSize': '16px', 'fontFamily': 'Fira Code' } } }%%
pie showData
title STACK ALLOCATION
"Python" : 65
"SQL" : 30
"Java" : 5
