-
Notifications
You must be signed in to change notification settings - Fork 16
[CmdOpt] add utf8 support check #1190
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary of Changes
Hello @tdavidcl, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces a new feature to the shamcmdopt library, enabling it to detect and report whether the current terminal environment supports UTF-8 encoding. This is achieved by checking relevant locale environment variables like LC_ALL, LC_CTYPE, and LANG, providing users with immediate feedback on their terminal's character encoding capabilities.
Highlights
- UTF-8 Support Detection: A new function, check_utf8_support(), has been added to determine if the terminal environment is configured for UTF-8.
- Locale Variable Inspection: The check_utf8_support() function inspects LC_ALL, LC_CTYPE, and LANG environment variables, prioritizing them in that order, to identify UTF-8 indicators.
- Output Integration: The shamcmdopt output now includes a line indicating whether UTF-8 is "supported" or "not supported" based on the new check.
- Environment Variable Documentation: The LC_ALL, LC_CTYPE, and LANG environment variables are now registered for documentation within the command-line options processing.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request adds a check for UTF-8 support in the terminal by examining locale environment variables. The implementation has a correctness issue in how it handles the precedence of these variables (LC_ALL, LC_CTYPE, LANG), which could lead to inaccurate results. I've provided a suggestion to fix this and also to make the check more robust by being case-insensitive. The rest of the changes for declaring and using the new function look good.
| auto check_utf8_in_locale = [](const std::optional<std::string> &locale_str) -> bool { | ||
| if (!locale_str) { | ||
| return false; | ||
| } | ||
| const std::string &str = *locale_str; | ||
| return str.find("utf8") != std::string::npos || str.find("utf-8") != std::string::npos | ||
| || str.find("UTF8") != std::string::npos | ||
| || str.find("UTF-8") != std::string::npos; | ||
| }; | ||
|
|
||
| // LC_ALL takes precedence over LC_CTYPE, which takes precedence over LANG | ||
| if (lc_all && check_utf8_in_locale(lc_all)) { | ||
| return true; | ||
| } | ||
| if (lc_ctype && check_utf8_in_locale(lc_ctype)) { | ||
| return true; | ||
| } | ||
| if (lang && check_utf8_in_locale(lang)) { | ||
| return true; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation of check_utf8_support has a couple of areas for improvement:
-
Incorrect Precedence Logic: The current logic for checking locale environment variables doesn't correctly handle precedence. If
LC_ALLis set but does not specify UTF-8, the function incorrectly proceeds to checkLC_CTYPEandLANG. According to POSIX standards, ifLC_ALLis set, its value is definitive. The same rule applies toLC_CTYPEoverLANG. This can lead to incorrect detection of UTF-8 support. -
Verbose and Case-Sensitive Check: The check for "UTF-8" is case-sensitive and verbose. It can be made more robust and maintainable by converting the locale string to lowercase before checking.
The suggested change below addresses both points by correcting the precedence logic and implementing a case-insensitive check. Note that this change requires including <algorithm> and <cctype> headers at the top of the file.
// Check if any of these contain UTF-8 in a case-insensitive way
auto check_utf8_in_locale = [](const std::string &locale_str) -> bool {
std::string lower_str = locale_str;
std::transform(lower_str.begin(), lower_str.end(), lower_str.begin(),
[](unsigned char c){ return std::tolower(c); });
return lower_str.find("utf-8") != std::string::npos || lower_str.find("utf8") != std::string::npos;
};
// LC_ALL takes precedence over LC_CTYPE, which takes precedence over LANG
if (lc_all) {
return check_utf8_in_locale(*lc_all);
}
if (lc_ctype) {
return check_utf8_in_locale(*lc_ctype);
}
if (lang) {
return check_utf8_in_locale(*lang);
}
Workflow reportworkflow report corresponding to commit 9e50ee8 Pre-commit check reportSome failures were detected in base source checks checks. ❌ trailing-whitespaceSuggested changesDetailed changes :diff --git a/src/shamcmdopt/src/details/generic_opts.cpp b/src/shamcmdopt/src/details/generic_opts.cpp
index 697b7cfc..823073ca 100644
--- a/src/shamcmdopt/src/details/generic_opts.cpp
+++ b/src/shamcmdopt/src/details/generic_opts.cpp
@@ -202,11 +202,11 @@ namespace shamcmdopt {
bool check_utf8_support() {
static bool result_cached = false;
static bool cached_result = false;
-
+
if (result_cached) {
return cached_result;
}
-
+
cached_result = check_utf8_support_internal();
result_cached = true;
return cached_result;
|
No description provided.