A Python library for anonymizing, masking, and encrypting sensitive data with a small, focused API.
- Text and pattern anonymization (free-form text replacement, IPv4, email, phone)
- Localized identifiers (US SSN, Brazil CPF/CNPJ)
- Symmetric encryption and decryption (Fernet)
- PII detection via spaCy (optional extra)
Planned: richer masking helpers and reversible transforms.
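To make the idea of pattern anonymization concrete, here is a minimal standard-library sketch of regex-based masking for IPv4 addresses, emails, and US-style phone numbers. It is not shadow_data's implementation: the patterns, the `mask_patterns` helper, and the replacement tags are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical, deliberately simple patterns (assumptions, not shadow_data's own).
# The IPv4 pattern does not check that each octet is in the 0-255 range.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
US_PHONE_RE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")


def mask_patterns(text: str) -> str:
    """Replace each matched email, IPv4 address, and phone number with a fixed tag."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = IPV4_RE.sub("[IPV4]", text)
    text = US_PHONE_RE.sub("[PHONE]", text)
    return text


sample = "Contact ana@example.com or 415-555-0199. Server: 10.0.0.1"
masked = mask_patterns(sample)
print(masked)
```

Fixed tags keep the sketch short. A real anonymizer may instead preserve format or length, which this example does not attempt.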
pip install shadow_data

Optional spaCy support:

pip install shadow_data[spacy]

spaCy models are downloaded automatically at runtime when needed. To install manually:

python -m spacy download en_core_web_trf

from shadow_data.anonymization import (
EmailAnonymization,
Ipv4Anonymization,
PhoneNumberAnonymization,
TextProcessor,
)
from shadow_data.cryptohash.symmetric_cipher import Symmetric
from shadow_data.l10n.usa import IdentifierAnonymizer

# Anonymize the IPv4 address in free-form text, then replace a chosen word.
text = "Contact me at user@example.com or 415-555-0199. Server: 10.0.0.1"
anonymized_text = Ipv4Anonymization.anonymize_ipv4(text)
anonymized_text = TextProcessor.replace_text("Contact", "Reach", anonymized_text)
# Anonymize a standalone email address and phone number.
email = EmailAnonymization.anonymize_email("user@example.com")
phone = PhoneNumberAnonymization.anonymize_phone_number("415-555-0199")
print(anonymized_text, email, phone)
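
# Anonymize a US SSN embedded in a sentence; the result is read from cleaned_content.
ssn = "Billy's SSN is 479-92-5042."
ssn_anonymizer = IdentifierAnonymizer(ssn)
ssn_anonymizer.anonymize()
print(ssn_anonymizer.cleaned_content)

# Create a symmetric (Fernet) key, then encrypt and decrypt a string.
symmetric = Symmetric()
key = symmetric.create_key()
ciphertext = symmetric.encrypt("hello")
plaintext = symmetric.decrypt(ciphertext)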
print(key, ciphertext, plaintext)

Documentation:
- docs/README.md
- docs/usage.md
- docs/cryptography.md
- docs/pii.md

Examples:
- examples/quickstart.py
- examples/anonymization.md
- examples/i10n_us.md
- examples/i10n_brazil.md
- examples/pii_nlp.md
- examples/symmetric_cipher.md

Run the tests:

poetry run pytest -vvv

To contribute:
- Fork the repository.
- Create a new branch for your feature (git checkout -b my-new-feature).
- Commit your changes (git commit -am 'Add new feature').
- Push the branch (git push origin my-new-feature).
- Open a pull request.
This project is licensed under the MIT License - see LICENSE for details.