[4.0.1] Pipeline architecture, multi-series isolation & search CLI #132
Open: dam2452 wants to merge 90 commits into main from Multi-Series-Support-&-Data-Isolation-(Preprocessor)
90 commits
a93f4d8 Add multi-series output paths and refactor processors (dam2452)
a05c153 Hoist imports to top-level; adjust pylint formatting (dam2452)
2804a73 Add safe resize and use in reference processor (dam2452)
540336f refaktor (dam2452)
9cfce98 . (dam2452)
e622921 Restructure preprocessor and add ES mappings (dam2452)
3afc6c0 Lower scene min length and raise beam size (dam2452)
0250bd2 Support per-series configs and selective pipelines (dam2452)
fa146b2 Make FFmpegWrapper helpers private; fix typings (dam2452)
023720b Add interlacing detection and refactor scene code (dam2452)
5ea6a7c Add force_deinterlace and improve detection (dam2452)
6dba0a5 Refactor pipeline and add search CLI (dam2452)
1589b50 Switch to Qwen3-VL-Embedding & update descriptions (dam2452)
de75fc8 Remove redundant docstrings and comments (dam2452)
289e04d Privatize helper methods & cleanup dead code (dam2452)
a7b6193 Add dataclass fixer; refactor pipeline and configs (dam2452)
6e245e0 Refactor BaseProcessor flow and defaults (dam2452)
5b5599f Restructure packages and update processors (dam2452)
641bf2f Move lib to services and add validation step (dam2452)
84bcf48 Standardize step module names with _step suffix (dam2452)
e3cd018 Add resolution analysis step and refactor CLI search (dam2452)
5e8e75b Support global pipeline steps; drop _executed flags (dam2452)
deeb26b Refactor resolution analysis and update step modules (dam2452)
b7dc97e Improve FFmpeg, interlace detection & transcode (dam2452)
83ee831 Update kiepscy.json (dam2452)
9043390 Refactor config, IO and search CLI (dam2452)
2c8b165 Update transcoding_step.py (dam2452)
078fa85 Refactor pipeline, executor, and CLI internals (dam2452)
8663d2b Refactor: static methods, typing and renames (dam2452)
a163ee8 Add batch processing and model pool support (dam2452)
1860252 Use attribute and fix DDGS import (dam2452)
5924d49 Use config.video_bitrate_mbps property (dam2452)
c677d6e Add thread-safety and multi in-progress state (dam2452)
98c7114 Refine bitrate scaling; lower parallel episodes (dam2452)
e02b0fd Increase default max_parallel_episodes to 3 (dam2452)
2f114ba Parallelize video resolution scanning (dam2452)
320a9d8 Refactor transcode bitrates & add resolution check (dam2452)
4588e37 Add OutputDescriptor system and refactor steps (dam2452)
56e1704 Set default frame export to 1 frame & 1080p (dam2452)
937c218 Register object_detections earlier in pipeline (dam2452)
aa146d6 Refactor PipelineStep flow and caching (dam2452)
0ffaa8d Add source_video_path and fix threadpool order (dam2452)
b1e44f6 Add artifact registry and timestamp frames (dam2452)
22f7653 Add state sync CLI and filesystem reconstructor (dam2452)
2232aef Make get_output_descriptors public (dam2452)
61fe081 Make output subdir optional and snap to keyframes (dam2452)
7cb1d7a Refactor FFmpeg usage and bitrate config (dam2452)
692385a Improve ffmpeg logging and add batch info log (dam2452)
b737e88 Silence ffmpeg and improve interlace logs (dam2452)
de9dacf Merge branch 'main' into Multi-Series-Support-&-Data-Isolation-(Prepr… (dam2452)
bc15db4 Remove ES index mappings and add local types (dam2452)
410c27a Parallelize frame export, improve ffmpeg (dam2452)
4a55ea1 Reduce default frames_per_scene to 1 (dam2452)
86915b9 Add segment filter steps and refactor transcription (dam2452)
32bac73 Add image hasher device & hex hashes (dam2452)
d72f841 Skip completed episode steps and load cache (dam2452)
3753a3f Add char ref processor and infra updates (dam2452)
48b97da Add embedding steps, face clusterer, and episode fallbacks (dam2452)
8fa2611 Replace Qwen-VL with vLLM embedding backend (dam2452)
ed5abe6 Use pooling runner and allow remote code in LLM (dam2452)
3a4261c Use embed() instead of encode() for embeddings (dam2452)
0b00a3e Split transcriptions, document outputs & archives (dam2452)
9877625 Fix cv2 lint, hashing path, and segment_range (dam2452)
5172ac0 Add deploy_to_nas script and package init (dam2452)
fb9752d Update deploy_to_nas.py (dam2452)
e06b918 vLLM: switch model, adjust sampling, 256K context (dam2452)
759c5e3 Add transcription import step and config (dam2452)
1ac92d4 Preserve bitrate for same-res, improve logs (dam2452)
92f8646 Update vLLM install, client and configs (dam2452)
98ac7da Skip character images; improve models & clustering (dam2452)
7f32017 Use per-episode file layout & refactor validators (dam2452)
b1ddf23 Add BaseTranscriptionStep and refactor steps (dam2452)
3bf23be Add sejm_demo series config (dam2452)
b9f0ed9 Add pipeline_mode and lower missing-image error (dam2452)
f8fdd0c Add global-completion flag and exhausted marker handling (dam2452)
03fa082 Update reference_downloader.py (dam2452)
fd1611b Add search_query_template to scraping config (dam2452)
eee5f11 Use RapidAPI for Google Search; add SerpAPI (dam2452)
69e21ab Add search engines and lower min image size (dam2452)
f26e38f Refactor image search and reference downloader (dam2452)
c4a8a46 Use browser-based DuckDuckGo, refactor image scraping (dam2452)
2dc3ffa Add chunked Whisper transcription for long audio (dam2452)
07617a6 Remove output_data dirs from preprocessor Dockerfile (dam2452)
9a519ee Add script to split double-episode videos (dam2452)
d770797 Add scribe comparison script and ElevenLabs tweaks (dam2452)
04c8a56 Add series face clustering and cluster-based refs (dam2452)
ef0afb9 Remove legacy face clustering step and validator (dam2452)
439156f Update config.py (dam2452)
9512662 Face clustering: stricter filters, noise & labels (dam2452)
e3132ef Update cluster_folder_manager.py (dam2452)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -0,0 +1,6 @@ |
| | 1 | + * text=auto |
| | 2 | + |
| | 3 | + *.sh text eol=lf |
| | 4 | + *.py text eol=lf |
| | 5 | + Dockerfile text eol=lf |
| | 6 | + *.dockerignore text eol=lf |
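
The attribute lines above use Git attributes syntax to request LF line endings for shell scripts, Python files, Dockerfiles and dockerignore files. A minimal sketch of how a checkout could be spot-checked against those rules is given below. The helper names `wants_lf` and `has_crlf` are hypothetical and not part of this PR, and the basename matching via `fnmatchcase` only approximates Git's own pattern matching.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns copied from the diff above; each one is marked `text eol=lf`.
LF_PATTERNS = ["*.sh", "*.py", "Dockerfile", "*.dockerignore"]


def wants_lf(path: str) -> bool:
    """Return True if the file's basename matches one of the eol=lf patterns.

    Hypothetical helper: approximates Git's matching by comparing the
    basename only, which is how slash-free patterns are usually applied.
    """
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in LF_PATTERNS)


def has_crlf(data: bytes) -> bool:
    """Return True if the content contains a Windows-style (CRLF) line ending."""
    return b"\r\n" in data
```

A file for which `wants_lf(path)` is true and `has_crlf(content)` is also true would be a candidate for renormalization under these rules.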
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -42,3 +42,4 @@ cookies.txt |
| 42 | 42 | test_episodes.json |
| 43 | 43 | /models |
| 44 | 44 | /tmp |
| | 45 | + /preprocessor/scripts/scribe_compare |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -0,0 +1 @@ |
| | 1 | + 4.0.1 |