TrasgoDP implements several mechanisms for ε-differential privacy and (ε, δ)-differential privacy. The mechanisms are designed for the local approach, in which noise is added directly to the raw data. Two types of mechanisms are implemented:
- For numerical records: Laplace and Gaussian mechanisms. The implementation includes a final clipping step applied to the privatized data.
- For categorical records: Exponential mechanism and Randomized Response (both for binary attributes and the k-ary version).
This library provides dedicated functions that can be applied to both pandas dataframes and lists/numpy arrays.
You can install trasgoDP using pip. We recommend using Python 3 with virtualenv:
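To make the idea of a numerical mechanism with final clipping more concrete, here is a minimal, dependency-free sketch of a local Laplace mechanism. It is not the trasgoDP implementation: the function names `laplace_noise` and `laplace_with_clipping` are hypothetical, and the sketch assumes that each value lies in a known range `[lower, upper]`, so that the sensitivity of a single record is `upper - lower`.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential draws with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def clip(x, lower, upper):
    # Restrict x to the closed interval [lower, upper].
    return max(lower, min(upper, x))


def laplace_with_clipping(values, epsilon, lower, upper, rng=None):
    """Hypothetical helper (not the trasgoDP API): local Laplace mechanism."""
    rng = random.Random() if rng is None else rng
    # In the local setting each record is privatized on its own, so two
    # possible inputs can differ by at most the width of the range.
    scale = (upper - lower) / epsilon
    privatized = []
    for v in values:
        noisy = clip(v, lower, upper) + laplace_noise(scale, rng)
        # The final clipping is post-processing of the noisy value.
        privatized.append(clip(noisy, lower, upper))
    return privatized


ages = [23, 35, 47, 61]
noisy_ages = laplace_with_clipping(
    ages, epsilon=1.0, lower=17, upper=90, rng=random.Random(0)
)
```

Because the clipping is applied after the noise is added, it only post-processes the output and keeps every released value inside the declared range.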
virtualenv .venv -p python3
source .venv/bin/activate
pip install trasgoDP

| Mechanism | Type of the attribute | Function in trasgoDP |
|---|---|---|
| Laplace | Numerical | numerical.dp_clip_laplace() |
| Gaussian | Numerical | numerical.dp_clip_gaussian() |
| Exponential | Categorical | categorical.dp_exponential() |
| Randomized response | Categorical (binary) | categorical.dp_randomized_response_binary() |
| k-ary randomized response | Categorical | categorical.dp_randomized_response_kary() |
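As an illustration of what the categorical mechanisms in the table do conceptually, the following sketch shows the standard k-ary randomized response: the true value is kept with probability e^ε / (e^ε + k - 1) and otherwise replaced by one of the other k - 1 values chosen uniformly at random. The binary case is the same rule with k = 2. The function name `kary_randomized_response` and its signature are hypothetical and are not taken from trasgoDP.

```python
import math
import random


def kary_randomized_response(value, domain, epsilon, rng):
    """Hypothetical sketch (not the trasgoDP API) of k-ary randomized response."""
    k = len(domain)
    # Probability of reporting the true value.
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p_keep:
        return value
    # Otherwise report one of the remaining k - 1 values uniformly at random.
    others = [v for v in domain if v != value]
    return rng.choice(others)


rng = random.Random(42)
domain = ["Male", "Female"]  # k = 2 gives binary randomized response
sexes = ["Male", "Female", "Female", "Male"]
private_sexes = [kary_randomized_response(s, domain, 1.0, rng) for s in sexes]
```

Smaller values of ε push the keep probability towards 1/k (more privacy, less utility), while larger values push it towards 1.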
To apply a DP mechanism to a column of a dataframe, you need to provide:
- The pandas dataframe with the data.
- The column in the dataframe to be privatized.
- The privacy budget (ε).
- The probability of exceeding the privacy budget (δ), required only for numerical attributes with the Gaussian mechanism.
- The upper and lower bounds for numerical attributes (optional).
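To show how the parameters listed above interact, here is a sketch of the classical Gaussian mechanism calibration, where the noise standard deviation is σ = Δ · sqrt(2 · ln(1.25 / δ)) / ε for 0 < ε < 1, with Δ taken as the width of the bounds. This is an assumption for illustration only: trasgoDP may calibrate its noise differently, and the names `gaussian_sigma` and `gaussian_with_clipping` are hypothetical.

```python
import math
import random


def gaussian_sigma(epsilon, delta, sensitivity):
    """Classical calibration of the Gaussian mechanism (valid for 0 < epsilon < 1)."""
    if not 0 < epsilon < 1:
        raise ValueError("the classical bound requires 0 < epsilon < 1")
    if not 0 < delta < 1:
        raise ValueError("delta must be in (0, 1)")
    return sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / epsilon


def gaussian_with_clipping(values, epsilon, delta, lower, upper, rng=None):
    """Hypothetical helper (not the trasgoDP API): local Gaussian mechanism."""
    rng = random.Random() if rng is None else rng
    # The bounds define the sensitivity of a single record.
    sigma = gaussian_sigma(epsilon, delta, sensitivity=upper - lower)
    privatized = []
    for v in values:
        clipped = min(max(v, lower), upper)
        noisy = clipped + rng.gauss(0.0, sigma)
        # Final clipping keeps the output inside the declared bounds.
        privatized.append(min(max(noisy, lower), upper))
    return privatized


hours = [40, 13, 50, 38]
noisy_hours = gaussian_with_clipping(
    hours, epsilon=0.5, delta=1e-5, lower=1, upper=99, rng=random.Random(0)
)
```

Tighter bounds reduce the sensitivity and therefore the amount of noise, which is one reason to supply them when they are known.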
Example: apply DP to the adult dataset with the Laplace mechanism for the column age and the Exponential mechanism for the column workclass:
import pandas as pd
from trasgodp.numerical import dp_clip_laplace
from trasgodp.categorical import dp_exponential
# Read and process the data
data = pd.read_csv("examples/adult.csv")
data.columns = data.columns.str.strip()
cols = [
    "workclass",
    "education",
    "marital-status",
    "occupation",
    "sex",
    "native-country",
]
for col in cols:
    data[col] = data[col].str.strip()
# Apply DP for the attribute age:
column_num = "age"
epsilon1 = 10
df = dp_clip_laplace(data, column_num, epsilon1, new_column=True)
# Apply DP for the attribute workclass:
column_cat = "workclass"
epsilon2 = 5
df = dp_exponential(data, column_cat, epsilon2, new_column=True)

This project is under active development.
This project is licensed under the Apache 2.0 license.
If you are using trasgoDP, you may also be interested in:
- pyCANON: a Python library for checking the level of anonymity of a dataset.
- anjana: a Python library for anonymizing tabular datasets.
This work is funded by the European Union through the SIESTA project (Horizon Europe) under grant number 101131957.


