feat(genai): Add local tokenizer samples for Count and Compute #13602
Summary of Changes
Hello @msampathkumar, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request enhances the […]
Code Review
This pull request introduces new samples for local tokenization with GenAI, adds the sentencepiece dependency, and adds new test cases. The changes are generally good, but I've identified two areas for improvement in the new test code. In both new sample modules, the function name is identical to the module name, which is unconventional and can be confusing. I've suggested renaming the functions to be more descriptive and consistent with the existing codebase, which will improve readability and maintainability.
def test_counttoken_localtokenizer_with_txt() -> None:
    assert counttoken_localtokenizer_with_txt.counttoken_localtokenizer_with_txt()
The function name counttoken_localtokenizer_with_txt is the same as its module name, which can be confusing. For better readability and consistency with other examples in this file (e.g., counttoken_with_txt.count_tokens()), consider renaming the function within the counttoken_localtokenizer_with_txt module to something more descriptive, like count_tokens().
def test_counttoken_localtokenizer_compute_with_txt() -> None:
    assert counttoken_localtokenizer_compute_with_txt.counttoken_localtokenizer_compute_with_txt()
Similar to the other new test, the function name counttoken_localtokenizer_compute_with_txt is identical to its module name. This is unconventional and reduces clarity. To maintain consistency with other examples like counttoken_compute_with_txt.compute_tokens_example(), please consider renaming the function inside the counttoken_localtokenizer_compute_with_txt module to something like compute_tokens_example() or compute_tokens_locally().
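To illustrate the two rename suggestions, here is a minimal, self-contained sketch of how the tests might read afterwards. The two modules are built in-memory as hypothetical stand-ins, and the function bodies use a whitespace split as a placeholder; they are assumptions for illustration only and do not reproduce the PR's actual sample code or the real local tokenizer.

```python
import types


def _make_module(name: str, **functions) -> types.ModuleType:
    """Build an in-memory stand-in for a sample module (illustration only)."""
    module = types.ModuleType(name)
    for func_name, func in functions.items():
        setattr(module, func_name, func)
    return module


def count_tokens() -> int:
    # Placeholder body: a whitespace split stands in for the real tokenizer.
    return len("What's the highest mountain in Africa?".split())


def compute_tokens_example() -> list:
    # Placeholder body: returns the whitespace-separated pieces of the prompt.
    return "What's the longest word in the English language?".split()


# Hypothetical stand-ins for the two new sample modules, exposing the
# renamed functions suggested in the review comments.
counttoken_localtokenizer_with_txt = _make_module(
    "counttoken_localtokenizer_with_txt", count_tokens=count_tokens
)
counttoken_localtokenizer_compute_with_txt = _make_module(
    "counttoken_localtokenizer_compute_with_txt",
    compute_tokens_example=compute_tokens_example,
)


def test_counttoken_localtokenizer_with_txt() -> None:
    # The call now names the action rather than repeating the module name.
    assert counttoken_localtokenizer_with_txt.count_tokens()


def test_counttoken_localtokenizer_compute_with_txt() -> None:
    assert counttoken_localtokenizer_compute_with_txt.compute_tokens_example()
```

With this shape, each call site names the module once and the action once, which matches the existing counttoken_with_txt.count_tokens() pattern the reviewer points to.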
Here is the summary of changes. You are about to add 2 region tags.
This comment is generated by snippet-bot.
Description
Fixes b_443755237
Note: Before submitting a pull request, please open an issue for discussion if you are not associated with Google.
Checklist
nox -s py-3.9 (see Test Environment Setup)
nox -s lint (see Test Environment Setup)