VCAT (Video Content Analysis Tool) is a digital forensic solution designed to assist analysts and law enforcement in processing and analyzing video evidence using AI-powered modules. It integrates object detection, OCR, and speech recognition into one streamlined pipeline.
- Object Detection via GroundingDINO – Detects visual targets in video frames using text-based prompts.
- Optical Character Recognition (OCR) via PaddleOCR – Extracts text from video frames (e.g., signs, billboards).
- Speech Recognition via Whisper – Transcribes spoken content in video audio.
- AI Prompt Search – Uses natural-language prompts to focus the search on specific events or objects (see the module sketch after this list).
- Court-Admissible Reporting – Structured output with timestamps, GPS, confidence score, and hash verification.
- Built on Google Colab for easy deployment and reproducibility.
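The three AI modules are standard open-source libraries, and the prompt-driven search is what ties them to an investigative question. The sketch below is a minimal, illustrative example of invoking each library's reference inference interface on one sampled frame and one extracted audio track; the model paths, thresholds, prompt, and file names are assumptions, not VCAT's actual configuration.

```python
# A minimal, illustrative sketch (not VCAT's source) of calling the three AI
# modules directly. Paths, thresholds, and the prompt are assumed values.
import whisper                                        # openai-whisper
from paddleocr import PaddleOCR                       # classic ocr() interface
from groundingdino.util.inference import load_model, load_image, predict

# 1) Prompt-driven object detection with GroundingDINO
dino = load_model(
    "GroundingDINO/groundingdino/config/GroundingDINO_SwinT_OGC.py",  # assumed paths
    "weights/groundingdino_swint_ogc.pth",
)
image_source, image = load_image("frames/frame_000123.jpg")
boxes, logits, phrases = predict(
    model=dino,
    image=image,
    caption="billboard . white car",   # the text prompt drives the search
    box_threshold=0.35,
    text_threshold=0.25,
)

# 2) Scene-text extraction with PaddleOCR (signs, billboards)
ocr = PaddleOCR(use_angle_cls=True, lang="en")
ocr_result = ocr.ocr("frames/frame_000123.jpg", cls=True)

# 3) Speech transcription with Whisper
asr = whisper.load_model("base")
transcript = asr.transcribe("audio/audio.wav")
print(transcript["text"])
```

GroundingDINO returns per-phrase confidence scores (the `logits`) alongside the boxes, which is what the later confidence-based filtering and the report rely on.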
- Input
  - Case info, forensic image, and analyst data
  - Keywords/prompts, e.g. "billboard, white car, #StopTheGenocide, Palestine"
- Video Processing
  - Extract frames and audio
  - Run them through the AI modules (GroundingDINO, PaddleOCR, Whisper)
- Filtering & Grouping
  - Based on frame index and confidence score (see the extraction and filtering sketch after this list)
- Output Report
  - Includes parsed results: timestamps, artifacts, confidence scores, GPS
  - Format aligned with forensic standards (NIST, SWGDE)
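As a rough illustration of the extraction and filtering/grouping steps above, the sketch below samples frames with OpenCV, dumps the audio track with ffmpeg, and groups detections by frame index after a confidence cut-off. The sampling interval, ffmpeg options, and record fields are illustrative assumptions rather than VCAT's exact implementation.

```python
# Simplified sketch of "extract frames and audio" and "filtering & grouping".
# Sampling interval, ffmpeg options, and record fields are assumed values.
import os
import subprocess
from collections import defaultdict

import cv2  # opencv-python


def extract_frames(video_path, every_n=30, out_dir="frames"):
    """Save every Nth frame and return (frame_index, image_path) pairs."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    saved, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % every_n == 0:
            path = os.path.join(out_dir, f"frame_{idx:06d}.jpg")
            cv2.imwrite(path, frame)
            saved.append((idx, path))
        idx += 1
    cap.release()
    return saved


def extract_audio(video_path, audio_path="audio.wav"):
    """Dump a mono 16 kHz WAV track with ffmpeg, ready for Whisper."""
    subprocess.run(
        ["ffmpeg", "-y", "-i", video_path, "-vn", "-ac", "1", "-ar", "16000", audio_path],
        check=True,
    )
    return audio_path


def filter_and_group(detections, min_conf=0.35):
    """Drop low-confidence hits and group the rest by frame index."""
    grouped = defaultdict(list)
    for det in detections:  # e.g. {"frame_index": 30, "label": "white car", "confidence": 0.87}
        if det["confidence"] >= min_conf:
            grouped[det["frame_index"]].append(det)
    return dict(grouped)
```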
- Methodology
  - Google Colab form interface (a form sketch follows this list)
  - Input fields for paths, case info, and prompts
  - Checkboxes to activate individual modules
  - One-click report generation
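Google Colab builds a form from `#@param` annotations on ordinary assignments, which is how a notebook can expose input fields and module checkboxes without extra UI code. The field names below are hypothetical and only illustrate the pattern; they are not the tool's actual form.

```python
#@title VCAT case setup (hypothetical field names, for illustration only)
case_number   = "2024-001"                 #@param {type:"string"}
analyst_name  = "J. Doe"                   #@param {type:"string"}
video_path    = "/content/evidence.mp4"    #@param {type:"string"}
search_prompt = "billboard, white car"     #@param {type:"string"}

run_object_detection = True    #@param {type:"boolean"}
run_ocr              = True    #@param {type:"boolean"}
run_speech_to_text   = True    #@param {type:"boolean"}
generate_report      = True    #@param {type:"boolean"}
```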
- Inputs
  - Video files (from the forensic image)
  - Analyst and case metadata
  - Prompt keywords to focus the search
- Outputs
  - JSON/CSV structured result files
  - Auto-generated forensic report via ReportLab (see the report sketch after this list)
  - Confidence-ranked evidence per frame/audio segment
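To make the output side concrete, the sketch below writes one illustrative result record to JSON and renders it into a simple PDF with ReportLab. The record fields, values, and layout are assumptions for illustration and do not reproduce the report template described in the thesis.

```python
# Illustrative only: record fields and report layout are assumptions,
# not VCAT's exact schema or template.
import json

from reportlab.lib.pagesizes import A4
from reportlab.lib.styles import getSampleStyleSheet
from reportlab.platypus import Paragraph, SimpleDocTemplate, Spacer

results = [
    {"frame_index": 123, "timestamp": "00:00:04.100", "artifact": "white car",
     "confidence": 0.87, "gps": "31.952, 35.233"},
]

with open("results.json", "w") as fh:   # structured result file
    json.dump(results, fh, indent=2)

styles = getSampleStyleSheet()
doc = SimpleDocTemplate("vcat_report.pdf", pagesize=A4)
story = [Paragraph("VCAT Forensic Report", styles["Title"]), Spacer(1, 12)]
for r in results:
    story.append(Paragraph(
        f"Frame {r['frame_index']} ({r['timestamp']}): {r['artifact']} "
        f"(confidence {r['confidence']:.2f}, GPS {r['gps']})",
        styles["Normal"],
    ))
doc.build(story)
```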
- All hashes are calculated over the video files to ensure authenticity (a hashing sketch follows below).
- Adheres to NIST SP 800-86 and SWGDE guidelines.
- Supports chain-of-custody documentation.
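A minimal sketch of the hash-verification step, assuming SHA-256 via Python's standard `hashlib`; the thesis specifies which algorithm(s) VCAT actually records.

```python
# Illustrative hashing of an evidence video in streamed chunks so large
# files never need to fit in memory. SHA-256 is an assumption here.
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


print(sha256_of_file("/content/evidence.mp4"))  # log alongside chain-of-custody records
```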
This tool is part of the master's thesis:
“A Proposed Digital Forensic Tool for Video Content Analysis in the Investigation Process”
by Ruwa’ Fayeq Suleiman Abu Hweidi – PTUK, 2024
[V-CAT](<Forensic_Tool_for_Video_Content_Analysis_Ruwa_thesis (20).pdf>)
This project is licensed under the MIT License – see the LICENSE file for details.