in_tail: add Windows UTF-8 path encoding mode #12020
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 94eb294748
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@plugins/in_tail/tail_file.c`:
- Around line 2369-2371: The wide-path prefix stripping in tail_file.c is
copying too many wchar_t elements from wide_buf after detecting the "\\\\?\\"
prefix, which can read past the allocated PATH_MAX buffer. Update the memmove in
the wide-path handling block to copy only the remaining string including the
null terminator (from wide_buf + 4 into wide_buf), using the correct element
count derived from len so the source range stays within bounds.
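The bounds concern in this comment can be illustrated with a small, portable sketch. The helper name `strip_extended_prefix` and its signature are hypothetical and are not taken from the PR; the sketch only assumes that `len` is the number of wide characters in the buffer, excluding the terminator.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (not from the PR): strip a leading "\\?\" prefix
 * from a NUL-terminated wide string in place.
 * len is the number of characters in buf, excluding the terminator.
 * Returns the new length.
 */
static size_t strip_extended_prefix(wchar_t *buf, size_t len)
{
    static const wchar_t prefix[] = L"\\\\?\\";
    const size_t prefix_len = 4;

    /* Leave the string untouched if it is too short or has no prefix. */
    if (len < prefix_len || wcsncmp(buf, prefix, prefix_len) != 0) {
        return len;
    }

    /*
     * Copy only the remaining characters plus the terminator:
     * (len - prefix_len + 1) elements, so the source range ends
     * exactly at buf[len] and never reads past the string.
     */
    memmove(buf, buf + prefix_len,
            (len - prefix_len + 1) * sizeof(wchar_t));

    return len - prefix_len;
}
```

The element count is derived from `len` rather than from the buffer capacity, which is what keeps the source range inside the allocation regardless of how large the buffer is.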
In `@plugins/in_tail/win32/io.c`:
- Around line 56-79: win32_open_utf8 currently returns -1 without setting errno
on failure, so it should mirror the error handling used by win32_stat_utf8 and
win32_lstat_utf8. Update win32_open_utf8 to translate a NULL result from
win32_utf8_to_wide into EINVAL, and to call propagate_last_error_to_errno() when
CreateFileW returns INVALID_HANDLE_VALUE before returning -1. Keep the
successful _open_osfhandle path unchanged.
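As a rough illustration of what "propagating the last error to errno" could mean, the sketch below maps a few Win32 error codes to errno values. It is an assumption-laden example, not the PR's `propagate_last_error_to_errno()`: the `sketch_` names are hypothetical, the error codes are redefined locally so the example builds on any platform, and the choice of errno for each code is only one plausible mapping.

```c
#include <errno.h>

/*
 * Win32 error code values as defined in winerror.h, redefined here
 * under sketch-only names so the example does not need windows.h.
 */
enum {
    SKETCH_ERROR_FILE_NOT_FOUND    = 2,
    SKETCH_ERROR_PATH_NOT_FOUND    = 3,
    SKETCH_ERROR_ACCESS_DENIED     = 5,
    SKETCH_ERROR_SHARING_VIOLATION = 32
};

/* Hypothetical mapping; the real helper in the PR may differ. */
static int sketch_win32_error_to_errno(unsigned long code)
{
    switch (code) {
    case SKETCH_ERROR_FILE_NOT_FOUND:
    case SKETCH_ERROR_PATH_NOT_FOUND:
        return ENOENT;
    case SKETCH_ERROR_ACCESS_DENIED:
    case SKETCH_ERROR_SHARING_VIOLATION:
        return EACCES;
    default:
        return EIO;   /* assumed fallback for unmapped codes */
    }
}

/*
 * Failure path shape: set errno from the Win32 code, then return -1,
 * so callers that inspect errno see a meaningful value.
 */
static int sketch_fail_with(unsigned long code)
{
    errno = sketch_win32_error_to_errno(code);
    return -1;
}
```

In an open wrapper, the code passed to a function like `sketch_fail_with` would come from `GetLastError()` after `CreateFileW` returns `INVALID_HANDLE_VALUE`, while a failed UTF-8 to wide conversion would set `EINVAL` directly, as the comment suggests.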
In `@tests/integration/scenarios/in_tail/tests/test_in_tail_001.py`:
- Line 955: Remove the `@skip_on_windows` decorator from the two affected in_tail
tests so they run on Windows as well. Specifically, update
test_in_tail_db_schema_upgrade_is_automatic and
test_in_tail_ignore_older_skips_stale_files in test_in_tail_001.py to no longer
be skipped, since their bodies exercise the stat/discovery path and do not rely
on Windows-incompatible rename, symlink, or permission behavior.
- Around line 354-360: The UTF-8 Windows path test in test_in_tail_001 still
uses characters that CP932 can encode, so it does not uniquely exercise the utf8
path handling. Update the setup in this scenario to use a log directory and/or
filename with characters outside the legacy ANSI/CP932 repertoire, or add a
separate negative case that confirms discovery fails under ansi; keep the
changes localized to the existing path-encoding test setup.
📒 Files selected for processing (12)
- plugins/in_tail/CMakeLists.txt
- plugins/in_tail/tail.c
- plugins/in_tail/tail_config.c
- plugins/in_tail/tail_config.h
- plugins/in_tail/tail_file.c
- plugins/in_tail/tail_fs_stat.c
- plugins/in_tail/tail_scan_win32.c
- plugins/in_tail/win32/interface.h
- plugins/in_tail/win32/io.c
- plugins/in_tail/win32/path.c
- plugins/in_tail/win32/stat.c
- tests/integration/scenarios/in_tail/tests/test_in_tail_001.py
Summary
This adds an opt-in Windows path encoding mode for the `tail` input.

By default, Windows keeps the existing ANSI-code-page behavior (`ansi`) for compatibility with CP932 and other legacy Windows locales. When `utf-8` is selected, `in_tail` treats configured paths as UTF-8, converts them to UTF-16, and uses Windows wide-character file APIs for discovery, open/stat, exclude matching, and final path resolution.

Implementation
- `windows.path_encoding` with default `ansi`.
- `FindFirstFileW` / `FindNextFileW`
- `PathMatchSpecW`
- `CreateFileW`
- `GetFinalPathNameByHandleW`
- `path_key` and DB filename comparison remain consistent.

Tests
- `windows.path_encoding: utf-8`.
- `in_tail` integration scenarios as skipped where they depend on POSIX rotation, symlink, permission, or inode behavior.

Verification
- `cmake --build build --target flb-plugin-in_tail`: passed
- `cmake --build build --target fluent-bit-bin`: passed
- `tests\integration\.venv\Scripts\python.exe -m pytest tests\integration\scenarios\in_tail\tests\test_in_tail_001.py -q`: 15 passed, 15 skipped
- `valgrind` is not available.

Enter `[N/A]` in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
- `ok-package-test` label to test for all targets (requires maintainer to do).

Documentation
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.
Summary by CodeRabbit
New Features
- `windows.path_encoding` (default `ansi`, with `utf-8`/`utf8` support).
Bug Fixes