⚡ LiveSense QoE: Real-time Livestream Analytics & AI Moderation System

"Turning Chaos into Insights": a real-time analytics system that helps streamers and moderators understand their audience, detect toxic behavior, and catch viral moments as they happen.

📖 Project Overview

LiveSense QoE (Quality of Experience) is an end-to-end MLOps solution designed to address information overload in large-scale livestreams. Instead of leaving the streamer to lose track of a fast-scrolling chat, or forcing moderators to read every single message, the system automatically collects and analyzes thousands of messages per second and turns them into visual Operational Signals.

🎯 Core goals:

Real-time Monitoring: Provide a real-time dashboard with low latency (< 5s).
AI-Powered Moderation: Automatically detect and raise alerts on waves of abusive language (Toxic Attack).
Engagement Tracking: Identify peak moments (Viral Moments) to support the editing team.
Historical Analysis: Store data long term to analyze audience trends over time.

🏗️ 2. System Architecture

Overall data-flow architecture diagram: Producer -> Kafka -> Spark Streaming -> Redis/PostgreSQL -> Dashboard/Metabase.
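
To make the flow above concrete, here is a minimal in-memory sketch of one processing cycle. It is not the project's code: a `deque` stands in for the Kafka topic, a dict for Redis (hot storage, overwritten on every batch), and a list for PostgreSQL (cold storage, append-only). The names `run_micro_batch` and `keyword_classifier` are hypothetical, and the keyword rule is only a placeholder for the ONNX models.

```python
from collections import Counter, deque


def keyword_classifier(text):
    """Placeholder for the real ONNX classifier (assumption: one label per message)."""
    return "technical_issue" if "lag" in text.lower() else "other"


def run_micro_batch(broker, hot_store, cold_store, classify, batch_id):
    """Drain the queue, label each message, then update both stores."""
    batch = []
    while broker:
        batch.append(broker.popleft())

    counts = dict(Counter(classify(message) for message in batch))
    summary = {"batch_id": batch_id, "total": len(batch), "counts": counts}

    # Hot path: the dashboard only needs the latest snapshot, so overwrite it.
    hot_store["latest"] = summary
    # Cold path: history is kept for later analysis, so append a copy.
    cold_store.append(dict(summary))
    return len(batch)
```

Calling `run_micro_batch` repeatedly mimics the streaming trigger: each call replaces the hot snapshot and adds one more row to the history.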

📺 Dashboard Preview

Real-time monitoring interface for the 6 operational signals, fed by Redis and Spark Streaming.

🛠️ Tech Stack

Layer	Technology	Purpose
Ingestion	Apache Kafka (KRaft mode)	Event streaming & message broker
Processing	Apache Spark 3.5+ (Structured Streaming)	Distributed stream processing with ML integration
ML/AI	ONNX Runtime, Transformers	Real-time toxicity & emotion classification
Storage (Hot)	Redis	In-memory cache for real-time dashboard
Storage (Cold)	PostgreSQL	Time-series data & historical analytics
Visualization	Streamlit, Metabase	Real-time dashboard & BI analytics
Infrastructure	Docker Compose	Containerized microservices orchestration
Runtime	Python 3.9+, PySpark	Data pipeline, transformations & ML inference

📊 3. The 6 Operational Signals

These signals are the heart of LiveSense: they turn audience emotion and behavior into numbers that can be acted on.

Signal	Name	Meaning & Use	Formula (Demo)
S1	Chat Load	"The stream's heartbeat". Measures the rate of incoming messages. Indicates how "hot" the stream is overall.	`Total_Msg / 60s`
S2	Tech Health	"The technical doctor". Detects when viewers complain about lag, lost audio, or dropped frames.	`% Technical_Issue`
S3	Demand Pressure	"Request pressure". Measures how demanding the audience is (asking for a different game, a music change, and so on).	`Request_Count / 60s`
S4	Backseat Pressure	"The backseat index". Measures how much viewers criticize or direct the gameplay (backseating).	`% Performance_Feedback`
S5	Toxic Pressure	"The security system". Raises a RED alert when a wave of attacks, abuse, or insults appears.	`Toxic_Count / 60s`
S6	Engagement Heat	"The highlight detector". Identifies bursts of emotion (viral moments) to support automatic highlight clipping.	`Excitement_Count / 60s`
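
The demo formulas in the table can be sketched in a few lines of Python. This is an illustration under assumptions, not the project's implementation: it assumes each message in a 60-second window has already been given exactly one label, and the label strings (`technical_issue`, `request`, `performance_feedback`, `toxic`, `excitement`) are hypothetical names chosen to mirror the formula column.

```python
from collections import Counter

WINDOW_SECONDS = 60  # window length used by the demo formulas


def compute_signals(labels, window_seconds=WINDOW_SECONDS):
    """Compute S1..S6 from the labels of all messages seen in one window."""
    total = len(labels)
    counts = Counter(labels)

    def rate(label):
        # Count-based signals are expressed as messages per second.
        return counts[label] / window_seconds

    def share(label):
        # Percentage-based signals are relative to all messages in the window.
        return 100.0 * counts[label] / total if total else 0.0

    return {
        "S1_chat_load": total / window_seconds,
        "S2_tech_health": share("technical_issue"),
        "S3_demand_pressure": rate("request"),
        "S4_backseat_pressure": share("performance_feedback"),
        "S5_toxic_pressure": rate("toxic"),
        "S6_engagement_heat": rate("excitement"),
    }
```

For a window of 120 messages of which 30 are labeled `toxic`, S1 is 2.0 messages per second and S5 is 0.5 toxic messages per second. The threshold at which S5 triggers a red alert is not given in the source, so it is left out of the sketch.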

📚 Documentation

Detailed project documentation: docs/SE363_Q11.pdf

🚀 Installation & Usage

⚡ One-Command Setup (Windows PowerShell)

powershell -ExecutionPolicy Bypass -File .\setup.ps1 -ModelRepoId "Phatthachdau123/livesense-qoe-models"

Or, if MODEL_REPO_ID is already set in .env, just run:

powershell -ExecutionPolicy Bypass -File .\setup.ps1

The command above automatically:

Creates .env from .env.example (if it does not exist yet)
Installs dependencies from requirements.txt
Downloads onnx_models/ from the Hugging Face model repo

Then continue with the infrastructure and pipeline steps below.
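
The two ways of supplying the model repo (the `-ModelRepoId` parameter or `MODEL_REPO_ID` in .env) can be pictured with the sketch below. It is not taken from setup.ps1 or download_models.py; the function names are hypothetical, and the rule that an explicit parameter wins over .env is an assumption.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_model_repo(cli_value, env_text):
    """Return the repo id from the explicit value, else from .env content."""
    if cli_value:
        return cli_value  # assumption: an explicit parameter takes precedence
    repo_id = parse_env(env_text).get("MODEL_REPO_ID")
    if not repo_id:
        raise ValueError("MODEL_REPO_ID is not set: pass it explicitly or add it to .env")
    return repo_id
```

With .env containing `MODEL_REPO_ID=Phatthachdau123/livesense-qoe-models`, calling `resolve_model_repo(None, env_text)` returns that repo id, which matches the shorter form of the setup command.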

Prerequisites:

Docker & Docker Compose
Python 3.9+
Git

Step 1: Bring up the infrastructure

Start all services (Spark, Kafka, Redis, Postgres, Metabase) with Docker.

# From the project root
docker-compose up -d

Wait about 30 seconds to 1 minute for the containers to finish starting.
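
Instead of waiting a fixed amount of time, a small script can poll until the service ports accept TCP connections. The helper below is a sketch, not part of the repository. An open port only shows that the process is listening, not that the service is fully initialized, so treat it as a lower bound.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: the service is not up yet, retry shortly.
            time.sleep(interval)
    return False


def wait_for_services(services, timeout=60.0):
    """Check each (name -> (host, port)) entry and report which ones are reachable."""
    return {
        name: wait_for_port(host, port, timeout=timeout)
        for name, (host, port) in services.items()
    }
```

A call such as `wait_for_services({"kafka": ("localhost", 9092)})` uses the Kafka address from the producer command in this README. Ports for Redis and Postgres depend on docker-compose.yml; 6379 and 5432 are the usual defaults but are assumptions here.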

Step 2: Install the Python libraries (client side)

Install the libraries needed to run the Producer and the Dashboard on your local machine.

pip install -r requirements.txt

Step 3: Start the system (in this order)

1. Start the Spark Consumer (the processing brain): the consumer listens to Kafka, processes the data, and pushes the results to Redis/Postgres.

docker exec -it spark-master python3 /app/consumer.py --topic test --trigger-seconds 2

2. Start the Streamlit Dashboard (the monitoring screen): open a new terminal:

streamlit run dashboard.py

Open: http://localhost:8501

3. Start the data feed (Data Generator): open a new terminal to push simulated data into the system:

python producer.py --video_id <youtube_url_or_id> --topic test --server localhost:9092

Note: the Kafka topic must be identical across the Producer, Consumer, and Dashboard (for example, test).
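
One way to keep the topic consistent is to build the launch commands from a single variable. The sketch below only uses the flags shown in the commands above; the helper name `launch_commands` is hypothetical. How dashboard.py receives its topic is not shown in this README, so that command is left unchanged.

```python
import shlex


def launch_commands(topic, video_id, server="localhost:9092", trigger_seconds=2):
    """Build the three Step 3 commands so consumer and producer share one topic."""
    return {
        "consumer": [
            "docker", "exec", "-it", "spark-master",
            "python3", "/app/consumer.py",
            "--topic", topic,
            "--trigger-seconds", str(trigger_seconds),
        ],
        # The dashboard's topic setting is not documented here, so no flag is added.
        "dashboard": ["streamlit", "run", "dashboard.py"],
        "producer": [
            "python", "producer.py",
            "--video_id", video_id,
            "--topic", topic,
            "--server", server,
        ],
    }


def as_shell(command):
    """Render an argument list as a copy-pasteable shell line."""
    return shlex.join(command)
```

Printing `as_shell(...)` for each entry gives one line per terminal, in the order consumer, dashboard, producer.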
