Skip to content

luiztosk/pyredact

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

PyRedact

A tool to redact personal info from text files.

Instalation

This project uses the python utility uv to install two scripts to the user's $PATH, the main script pyredact and a bash wrapper, redact, that simply pipes the output of pyredact's diff into less for convenience.

Clone the repository with

git clone https://github.com/luiztosk/pyredact

and install it using

uv tool install ./pyredact

Usage

usage: pyredact [-h] [-pi PERSONAL_INFO_FILE] filename

positional arguments:
  filename

options:
  -h, --help            show this help message and exit
  -pi, --personal_info_file PERSONAL_INFO_FILE

It reads a list of pairs of strings to be replaced, and tags to replace them with, from a file in the user's home directory. The default location is at ~/.config/pyredact/personal_info.yml. This can be changed in config.py, or by using the command line argument, -pi [PERSONAL INFO FILE].

Typically you would create the dir, then copy the file sample_files/personal_info.yml there, and add the info you want redacted. Pay attention to the order, larger strings should come first, since if you put at the top shorter strings, containing for example only your first name, it would be impossible to replace your full name later.

A required argument is the filename for the file being redacted, which can be in relative or absolute paths, or in the current directory.

Example

When using the sample files provided in the sample_files directory, the personal_info.yml looks like so:

- ["gonzaga_jr_dev", "github username"]
- ["Luiz Gonzaga do Nascimento Júnior", "full name"]
- ["Luiz Gonzaga Jr", "first and last name"]
- ["[email protected]", "email address"]
- ["Luiz", "first name"]
- ["Odaléia Guedes dos Santos", "full mother's name"]
- ["Luiz Gonzaga do Nascimento", "full father's name"]
- ["simples_desejo.dev", "domain name"]
- ["Moleque Doido Inc", "business name"]
- ["Asa Branca Systems", "company name"]
- ["Rio de Janeiro", "city"]
- ["UniPirapora", "university"]
- ["September 22, 1945", "birth date"]
- ["22/09/1945", "birth date"]

and we can run pyredact by invoking:

redact -pi ./sample_files/personal_info.yml ./sample_files/original_text.txt

It will write the redacted file, containing the text in original_text.txt minus the replacements, into ./sample_files/REDACTED_original_text.txt. In our terminal, less will show a diff with the changes:

Com a chegada da internet comercial nos anos 90, [-Luiz Gonzaga do Nascimento Júnior-] {+[full name]+} viu uma nova fronteira. Ele percebeu cedo que a identidade digital seria tão importante quanto a identidade civil. Foi um dos primeiros a registrar domínios e estabelecer uma presença online. Seu portfólio pessoal, onde publicava manifestos sobre código aberto e liberdade digital, foi hospedado no domínio [-simples_desejo.dev.-] {+[domain name].+} Até hoje, o [-simples_desejo.dev-] {+[domain name]+} serve como um arquivo histórico de seus pensamentos, contendo tutoriais que vão desde Assembly até as modernas arquiteturas de microsserviços.

About

A tool to redact personal info from text files.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published