@luiztosk luiztosk commented Nov 18, 2025

I generated this list using Fabricio C Zuardi's ptbr-wordlist. It takes the PT-BR dictionary from the LatinIME (AOSP) project, filters out short, accented, and overly similar words, and keeps the 10k most frequent of the remaining words.
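For reviewers who want a feel for what that filtering does, here is a rough Python sketch of the same idea. It is not the ptbr-wordlist code: the function name `filter_wordlist`, the `(word, frequency)` input shape, the length threshold, and the shared-prefix test for "too similar" are all assumptions made for illustration.

```python
def has_accent(word: str) -> bool:
    """Return True if the word contains any non-ASCII character (e.g. 'ã', 'ç')."""
    return not word.isascii()


def filter_wordlist(entries, min_len=4, prefix_len=4, limit=10_000):
    """Filter (word, frequency) pairs into a short, easy-to-type word list.

    All thresholds are illustrative assumptions, not the values
    used by ptbr-wordlist.
    """
    kept = []
    seen_prefixes = set()
    # Visit the most frequent words first so they win similarity ties.
    for word, _freq in sorted(entries, key=lambda e: e[1], reverse=True):
        word = word.lower()
        # Drop short words, non-alphabetic tokens, and accented words.
        if len(word) < min_len or not word.isalpha() or has_accent(word):
            continue
        # Crude "too similar" check (assumption): same leading characters.
        prefix = word[:prefix_len]
        if prefix in seen_prefixes:
            continue
        seen_prefixes.add(prefix)
        kept.append(word)
        if len(kept) == limit:
            break
    return kept
```

With a toy input such as `[("casa", 100), ("casas", 90), ("não", 500), ("livro", 80)]`, this keeps `casa` and `livro`: `não` is too short (and accented), and `casas` shares a prefix with the more frequent `casa`.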

AOSP is licensed under the Apache License, which, as I understand it, is compatible with the BSD License. A copy of the license is provided in the file LICENSE-ptbr-aosp-10k.md, along with a description of the edits made to the list.

This is a very handy improvement for PT-BR speakers. Thank you @redacted for this project, and let me know if I need to make any changes to this PR. If it is not possible to include the list, I can write a helper script to download and filter it (perhaps during setup?).
