[CI]【Hackathon 9th Sprint Example NO 16】Add unit tests for functional module fastdeploy/input/ernie4_5_vl_processor/process.py #5264
base: develop
Thanks for your contribution!
Pull request overview
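This PR adds comprehensive unit tests for the fastdeploy/input/ernie4_5_vl_processor/process.py module in the FastDeploy project, as task No. 16 of the Hackathon 9th Sprint. The tests cover the main functionality of the DataProcessor class, including text processing, image processing, video processing, and multimodal input processing.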
Key Changes
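- Added about 1858 lines of comprehensive unit tests covering the core functionality of the DataProcessor class
- Tests cover normal scenarios, edge cases, and error handling
- Mock objects are used to isolate dependencies and keep the tests independent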
tests/input/test_process.py
Outdated
| with patch.object(self.processor, "update_processor_cache") as mock_update_cache: | ||
| outputs = self.processor.request2ids(request) | ||
| self.assertIn("input_ids", outputs) | ||
| # Should call update_processor_cache, as img1 is not in missing_idx |
Copilot
AI
Nov 27, 2025
[nitpick] Comment in Chinese. Please translate to English:
# Should call update_processor_cache, as img1 is not in missing_idx
tests/input/test_process.py
Outdated
| def test_text2ids_with_video_placeholder(self): | ||
| """Test text conversion with video placeholder""" | ||
| text = "Hello <|video@placeholder|> world" | ||
| mock_frames = [Image.new("RGB", (224, 224)) for _ in range(4)] | ||
| with patch("fastdeploy.input.ernie4_5_vl_processor.process.read_video_decord") as mock_read: | ||
| mock_read.return_value = (None, {"duration": 2.0}, "test_path") | ||
| with patch("fastdeploy.input.ernie4_5_vl_processor.process.read_frames_decord") as mock_frames_read: | ||
| mock_frames_read.return_value = ( | ||
| [np.array(f) for f in mock_frames], | ||
| None, | ||
| [0.0, 0.5, 1.0, 1.5], | ||
| ) | ||
| with patch("fastdeploy.input.ernie4_5_vl_processor.process.render_frame_timestamp") as mock_render: | ||
| mock_render.side_effect = lambda img, ts: ( | ||
| Image.fromarray(img) if isinstance(img, np.ndarray) else img | ||
| ) | ||
| # Mock preprocess to return correct keys | ||
| self.mock_image_preprocessor.preprocess.return_value = { | ||
| "pixel_values_videos": np.random.rand(4, 256, 3 * 14 * 14).astype(np.float32), | ||
| "video_grid_thw": np.array([[4, 16, 16]]), | ||
| } | ||
| outputs = self.processor.text2ids(text, videos=["test_video.mp4"]) | ||
| self.assertGreater(len(outputs["input_ids"]), 0) | ||
|
|
Copilot
AI
Nov 27, 2025
This test contains significant code duplication with other video tests. The pattern of patching read_video_decord, read_frames_decord, and render_frame_timestamp is repeated many times throughout the test file.
Consider extracting this into a helper method or fixture to improve maintainability:
@contextmanager
def mock_video_processing(self, num_frames=4, duration=2.0):
    """Context manager to mock video processing with standard defaults"""
    mock_frames = [Image.new("RGB", (224, 224)) for _ in range(num_frames)]
    with patch("fastdeploy.input.ernie4_5_vl_processor.process.read_video_decord") as mock_read, \
         patch("fastdeploy.input.ernie4_5_vl_processor.process.read_frames_decord") as mock_frames_read, \
         patch("fastdeploy.input.ernie4_5_vl_processor.process.render_frame_timestamp") as mock_render:
        mock_read.return_value = (None, {"duration": duration}, "test_path")
        mock_frames_read.return_value = (
            [np.array(f) for f in mock_frames],
            None,
            [i * duration / num_frames for i in range(num_frames)],
        )
        mock_render.side_effect = lambda img, ts: (
            Image.fromarray(img) if isinstance(img, np.ndarray) else img
        )
        yield
This would reduce the ~300+ lines of repeated mocking code.
tests/input/test_process.py
Outdated
| self.processor._add_video(frames, outputs, None) | ||
| self.assertGreater(len(outputs["input_ids"]), 0) | ||
| self.assertGreater(len(outputs["images"]), 0) | ||
| # image_type_ids is extended with [1] * num_frames for each frame |
Copilot
AI
Nov 27, 2025
[nitpick] Comment in Chinese. Please translate to English:
# image_type_ids is extended with [1] * num_frames for each frame
tests/input/test_process.py
Outdated
| def test_set_video_frame_args_target_frames_positive_fps_error(self): | ||
| """Test _set_video_frame_args with target_frames > 0 and fps >= 0 (line 514)""" | ||
| video_meta = {"duration": 10.0} | ||
| video_frame_args = { | ||
| "target_frames": 10, | ||
| "fps": 2, # Positive fps should raise error | ||
| "min_frames": 1, | ||
| "max_frames": 100, | ||
| } | ||
| with self.assertRaises(ValueError) as context: | ||
| self.processor._set_video_frame_args(video_frame_args, video_meta) | ||
| self.assertIn("fps must be negative", str(context.exception)) | ||
|
|
||
| def test_set_video_frame_args_target_frames_below_min(self): | ||
| """Test _set_video_frame_args with target_frames < min_frames (line 519)""" | ||
| video_meta = {"duration": 10.0} | ||
| video_frame_args = { | ||
| "target_frames": 5, | ||
| "fps": -1, | ||
| "min_frames": 10, | ||
| "max_frames": 100, | ||
| } | ||
| with self.assertRaises(ValueError) as context: | ||
| self.processor._set_video_frame_args(video_frame_args, video_meta) | ||
| self.assertIn("target_frames must be larger", str(context.exception)) | ||
|
|
||
| def test_set_video_frame_args_target_frames_above_max(self): | ||
| """Test _set_video_frame_args with target_frames > max_frames (line 523)""" | ||
| video_meta = {"duration": 10.0} | ||
| video_frame_args = { | ||
| "target_frames": 200, | ||
| "fps": -1, | ||
| "min_frames": 1, | ||
| "max_frames": 100, | ||
| } | ||
| with self.assertRaises(ValueError) as context: | ||
| self.processor._set_video_frame_args(video_frame_args, video_meta) | ||
| self.assertIn("target_frames must be smaller", str(context.exception)) | ||
|
|
||
| def test_set_video_frame_args_fps_negative_without_target_frames(self): | ||
| """Test _set_video_frame_args with fps < 0 and target_frames <= 0 (line 527)""" | ||
| video_meta = {"duration": 10.0} | ||
| video_frame_args = { | ||
| "target_frames": -1, | ||
| "fps": -1, | ||
| "min_frames": 1, | ||
| "max_frames": 100, | ||
| } | ||
| with self.assertRaises(ValueError) as context: | ||
| self.processor._set_video_frame_args(video_frame_args, video_meta) | ||
| self.assertIn("Must provide either positive target_fps", str(context.exception)) | ||
|
|
||
| def test_set_video_frame_args_min_max_invalid(self): | ||
| """Test _set_video_frame_args with min_frames > max_frames (line 535)""" | ||
| video_meta = {"duration": 10.0} | ||
| video_frame_args = { | ||
| "target_frames": -1, | ||
| "fps": 2, | ||
| "min_frames": 100, | ||
| "max_frames": 10, | ||
| } | ||
| with self.assertRaises(ValueError) as context: | ||
| self.processor._set_video_frame_args(video_frame_args, video_meta) | ||
| self.assertIn("min_frames must be smaller", str(context.exception)) | ||
|
|
||
| def test_set_video_frame_args_frames_too_few_adjustment(self): | ||
| """Test _set_video_frame_args when frames_to_extract < min_frames (line 538)""" | ||
| video_meta = {"duration": 1.0} # Short duration | ||
| video_frame_args = { | ||
| "target_frames": -1, | ||
| "fps": 1, # Will extract only 1 frame | ||
| "min_frames": 10, # But min is 10 | ||
| "max_frames": 100, | ||
| } | ||
| result = self.processor._set_video_frame_args(video_frame_args, video_meta) | ||
| self.assertEqual(result["target_frames"], 10) | ||
| self.assertEqual(result["fps"], -1) | ||
|
|
||
| def test_set_video_frame_args_frames_too_many_adjustment(self): | ||
| """Test _set_video_frame_args when frames_to_extract > max_frames (line 541)""" | ||
| video_meta = {"duration": 100.0} # Long duration | ||
| video_frame_args = { | ||
| "target_frames": -1, | ||
| "fps": 10, # Will extract 1000 frames | ||
| "min_frames": 1, | ||
| "max_frames": 100, # But max is 100 | ||
| } | ||
| result = self.processor._set_video_frame_args(video_frame_args, video_meta) | ||
| self.assertEqual(result["target_frames"], 100) | ||
| self.assertEqual(result["fps"], -1) | ||
|
|
Copilot
AI
Nov 27, 2025
The test file structure shows significant duplication in test patterns. Multiple tests follow nearly identical patterns with only minor variations. Consider using parameterized tests to reduce duplication and improve maintainability.
For example, the video frame argument tests (test_set_video_frame_args_*) could be consolidated:
@parameterized.expand([
    ("target_frames_with_positive_fps", {"target_frames": 10, "fps": 2, ...}, ValueError, "fps must be negative"),
    ("target_frames_below_min", {"target_frames": 5, "fps": -1, ...}, ValueError, "must be larger"),
    ("target_frames_above_max", {"target_frames": 200, "fps": -1, ...}, ValueError, "must be smaller"),
    # ... more cases
])
def test_set_video_frame_args_validation(self, name, args, expected_error, expected_msg):
    """Test video frame argument validation with various error conditions"""
    with self.assertRaises(expected_error) as context:
        self.processor._set_video_frame_args(args, {"duration": 10.0})
    self.assertIn(expected_msg, str(context.exception))
This would reduce ~200 lines of similar test code while improving clarity.
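If adding the third-party `parameterized` package is not desirable, the same consolidation can be sketched with the standard library's `unittest.TestCase.subTest`. The `set_video_frame_args` function below is a toy stand-in and an assumption: its rules and messages are inferred only from the assertions in the tests quoted above, not copied from `process.py`.

```python
import unittest


def set_video_frame_args(args, video_meta):
    """Toy stand-in (assumption), inferred from the quoted test assertions."""
    args = dict(args)
    if args["target_frames"] > 0:
        if args["fps"] >= 0:
            raise ValueError("fps must be negative if target_frames is given")
        if args["target_frames"] < args["min_frames"]:
            raise ValueError("target_frames must be larger than min_frames")
        if args["target_frames"] > args["max_frames"]:
            raise ValueError("target_frames must be smaller than max_frames")
        return args
    if args["fps"] < 0:
        raise ValueError("Must provide either positive target_fps or positive target_frames")
    if args["min_frames"] > args["max_frames"]:
        raise ValueError("min_frames must be smaller than max_frames")
    frames = int(video_meta["duration"] * args["fps"])
    clamped = max(args["min_frames"], min(frames, args["max_frames"]))
    if clamped != frames:
        # Out-of-range counts switch to an explicit frame target.
        args["target_frames"], args["fps"] = clamped, -1
    return args


BASE = {"target_frames": -1, "fps": -1, "min_frames": 1, "max_frames": 100}

ERROR_CASES = [
    ("target_frames_with_positive_fps", {"target_frames": 10, "fps": 2}, "fps must be negative"),
    ("target_frames_below_min", {"target_frames": 5, "min_frames": 10}, "target_frames must be larger"),
    ("target_frames_above_max", {"target_frames": 200}, "target_frames must be smaller"),
    ("no_fps_and_no_target_frames", {}, "Must provide either positive target_fps"),
    ("min_above_max", {"fps": 2, "min_frames": 100, "max_frames": 10}, "min_frames must be smaller"),
]

CLAMP_CASES = [
    # (duration, fps, min_frames, max_frames, expected target_frames)
    (1.0, 1, 10, 100, 10),
    (100.0, 10, 1, 100, 100),
]


class SetVideoFrameArgsTest(unittest.TestCase):
    def test_validation_errors(self):
        for name, overrides, expected_msg in ERROR_CASES:
            with self.subTest(name):  # each case is reported separately
                with self.assertRaises(ValueError) as context:
                    set_video_frame_args({**BASE, **overrides}, {"duration": 10.0})
                self.assertIn(expected_msg, str(context.exception))

    def test_frame_count_is_clamped(self):
        for duration, fps, lo, hi, expected in CLAMP_CASES:
            with self.subTest(duration=duration, fps=fps):
                args = {**BASE, "fps": fps, "min_frames": lo, "max_frames": hi}
                result = set_video_frame_args(args, {"duration": duration})
                self.assertEqual(result["target_frames"], expected)
                self.assertEqual(result["fps"], -1)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SetVideoFrameArgsTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Compared with `parameterized.expand`, `subTest` keeps all cases inside one test method, so the runner counts one test, but a failing case is still reported with its own label and does not stop the remaining cases.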
tests/input/test_process.py
Outdated
| self.assertIn("not found in processor cache", str(context.exception)) | ||
|
|
||
| def test_request2ids_video_type(self): | ||
| """Test request2ids handling video type (lines 293-295)""" |
Copilot
AI
Nov 27, 2025
Test docstring references specific line numbers (e.g., "lines 293-295"). This creates maintenance burden as these line numbers become outdated when source code changes. Use descriptive test names without line references:
def test_request2ids_video_type(self):
    """Test request2ids properly handles video content type in messages"""
This pattern appears in many tests throughout the file and should be corrected consistently.
| """Test request2ids handling video type (lines 293-295)""" | |
| """Test that request2ids properly handles messages containing video content type.""" |
tests/input/test_process.py
Outdated
| # Note: This test requires dependencies like paddleformers, paddle, etc. | ||
| # In CI environment, these should be available. | ||
| # For local testing without dependencies, you may need to install them or use CI. |
Copilot
AI
Nov 27, 2025
[nitpick] This comment mentions "paddleformers" but should be "paddlenformers" or similar package name based on the project structure. However, more importantly, this comment is too specific about implementation details. Consider making it more general:
# Note: This test requires dependencies like paddleformers, paddle, etc.
# Ensure these are available in the test environment.
| # Note: This test requires dependencies like paddleformers, paddle, etc. | |
| # In CI environment, these should be available. | |
| # For local testing without dependencies, you may need to install them or use CI. | |
| # Note: This test requires all necessary dependencies to be installed in the test environment. |
tests/input/test_process.py
Outdated
| "cur_position": 0, | ||
| } | ||
| mock_frames = [Image.new("RGB", (224, 224)) for _ in range(4)] | ||
| # Mock preprocess 返回正确的键 |
Copilot
AI
Nov 27, 2025
[nitpick] Comment in Chinese mixed with English code. While the code works correctly, for consistency and maintainability, consider translating the comment to English:
# Mock preprocess to return correct keys
This keeps the codebase language consistent.
| # Mock preprocess 返回正确的键 | |
| # Mock preprocess to return correct keys |
tests/input/test_process.py
Outdated
| ], | ||
| } | ||
| ] | ||
| # Mock get_processor_cache, as request2ids will check missing_hashes |
Copilot
AI
Nov 27, 2025
[nitpick] Comment in Chinese. For consistency with the rest of the English codebase, please translate:
# Mock get_processor_cache, as request2ids will check missing_hashes
tests/input/test_process.py
Outdated
| pos_ids = self.processor._compute_3d_positions(t=2, h=32, w=32, start_idx=100) | ||
| gh = 32 // self.processor.spatial_conv_size | ||
| gw = 32 // self.processor.spatial_conv_size | ||
| t_eff = 2 // self.processor.temporal_conv_size if 2 != 1 else 1 |
Copilot
AI
Nov 27, 2025
Comparison of constants; use 'True' or 'False' instead.
| t_eff = 2 // self.processor.temporal_conv_size if 2 != 1 else 1 | |
| t_eff = t // self.processor.temporal_conv_size if t != 1 else 1 |
tests/input/test_process.py
Outdated
| pos_ids = self.processor._compute_3d_positions(t=2, h=32, w=32, start_idx=100) | ||
| gh = 32 // self.processor.spatial_conv_size | ||
| gw = 32 // self.processor.spatial_conv_size | ||
| t_eff = 2 // self.processor.temporal_conv_size if 2 != 1 else 1 |
Copilot
AI
Nov 27, 2025
Testing a constant will always give the same result.
| t_eff = 2 // self.processor.temporal_conv_size if 2 != 1 else 1 | |
| t_eff = 2 // self.processor.temporal_conv_size |
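Both warnings point at the same issue: with the literal `2` inlined, the `2 != 1` branch can never be taken. A small sketch of the variable-based form follows. `effective_temporal_size` is a hypothetical helper that mirrors the expression in the quoted test line, and `temporal_conv_size = 2` is an assumed value used only for illustration.

```python
def effective_temporal_size(t, temporal_conv_size):
    """Hypothetical helper mirroring the expression in the quoted test line."""
    # A single frame (t == 1) is kept as-is; otherwise frames are grouped.
    return t // temporal_conv_size if t != 1 else 1


temporal_conv_size = 2  # assumed value, for illustration only

# Binding t to a variable makes the conditional meaningful across inputs.
expected = {t: effective_temporal_size(t, temporal_conv_size) for t in (1, 2, 4)}
print(expected)  # {1: 1, 2: 1, 4: 2}
```

Driving the expected value from the same `t` that is passed to the method under test also keeps the assertion correct if the test input is later changed.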
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## develop #5264 +/- ##
==========================================
Coverage ? 59.74%
==========================================
Files ? 325
Lines ? 40220
Branches ? 6089
==========================================
Hits ? 24031
Misses ? 14354
Partials ? 1835
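@luotao1 Sister Tao, does this count as passing?
After CI passes, please don't repeatedly click "merge develop"; just wait patiently for review.
Thanks for your contribution!
Changes completed.
Adding some more review comments.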
68a3c77 to 69f919e
I have already trimmed it down in this revision.
69f919e to 26ca5a8

Motivation
No. 16: add unit tests for the functional module fastdeploy/input/ernie4_5_vl_processor/process.py
develop branch: 47% coverage, 206 missed lines (47-70, 169, 189-246, 249-294, 303-336, 371, 383, 395, 407, 413, 428, 430, 433-443, 448-453, 456-464, 497, 549, 587-602, 605-632, 639-671, 696->700, 697->696, 706-712, 718-720)
This PR: 94% coverage, 19 missed lines
Lines newly covered by unit tests: 206 - 19 = 187
Modifications
add unittest tests/input/test_ernie4_5_vl_processor_process.py
Usage or Command
no need
Accuracy Tests
no need
Checklist
- [FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
- pre-commit before commit.
- release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.