Transparent tools and standardized benchmarks for fair, explainable, and accountable generative AI.
The rapid expansion of GenAI magnifies long-standing concerns around bias, fairness, and representation. This project enables systematic, controlled experimentation so researchers can identify when and why bias occurs, and test what mitigates it.
The AI Fairness Data Generation and Question Answering System is part of Vector Institute's contribution to the broader AIXPERT Project, a multi-institutional initiative to develop tools and benchmarks for fairness-aware data generation and evaluation in generative AI.
It provides:
- 🧩 Controlled synthetic datasets — safely isolate bias-inducing factors.
- 🤖 Agentic automation using CrewAI and custom LLM agents.
- 📊 Fairness metrics & explainers to visualize disparities.
- ⚙️ Configurable, reproducible pipelines for responsible AI research.
📘 Documentation: Project website
📂 Data: Hugging Face
🧮 Code: GitHub Page
| Path | Description |
|---|---|
| `src/aixpert/controlled_images/` | Controlled image generation (baseline vs fairness-aware). |
| `src/aixpert/data_generation/synthetic_data_generation/images/` | Domain- and risk-specific image + VQA generation. |
| `src/aixpert/data_generation/synthetic_data_generation/nlp/` | Domain- and risk-specific scene + MCQ generation. |
| `src/aixpert/data_generation/synthetic_data_generation/videos/` | Video synthesis using the Google Veo / Gemini API. |
| `src/aixpert/data_generation/agent_pipeline/` | Single-agent CrewAI pipeline for multimodal orchestration. |
| `src/aixpert/toxicity_fairness_analysis/` | Fairness metrics and zero-shot explainability (integrated gradients). |
| `docs/` | MkDocs documentation sources. |
| `tests/` | Tests using pytest. |
New to the project? Follow the steps below to set up your development environment and explore key modules.
Ensure you have `uv` installed (the recommended environment manager).

```bash
# 1) Create the environment
uv sync
source .venv/bin/activate

# 2) (Optional) Install dev tools
uv sync --dev

# 3) (Optional) Install docs dependencies and serve the docs
uv sync --group docs
uv run mkdocs serve
```

Each module below has its own README with exact commands, configs, and outputs.
- Controlled Images — Generate matched baseline vs fairness-aware images across professions. ➜ `src/aixpert/controlled_images/README.md`
- Agent Pipeline (CrewAI) — Single-agent orchestration for prompt/image/metadata generation. ➜ `src/aixpert/data_generation/agent_pipeline/README.md`
- Synthetic Data · Images — Domain/risk-specific image prompts and VQA pairs. ➜ `src/aixpert/data_generation/synthetic_data_generation/images/README.md`
- Synthetic Data · NLP — Scene descriptions and MCQ generation for text pipelines. ➜ `src/aixpert/data_generation/synthetic_data_generation/nlp/README.md`
- Fairness & Explainability (Toxicity Fairness Analysis) — Metrics (Statistical Parity, Equal Opportunity) + zero-shot explainers (integrated gradients). ➜ `src/aixpert/toxicity_fairness_analysis/README.md`
- Documentation — MkDocs site sources; how to extend and publish docs. ➜ `CONTRIBUTING.md`
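For intuition about the zero-shot explainers mentioned above, integrated gradients attributes a model's output to its input features by integrating gradients along a straight path from a baseline to the input. The sketch below is a generic numeric approximation (a Riemann sum), not the project's actual implementation; `grad_fn` stands in for whatever gradient oracle the model provides.

```python
def integrated_gradients(grad_fn, x, baseline, steps=100):
    """Approximate integrated gradients for each feature of x.

    grad_fn(point) must return the gradient of the model output at
    `point` (a list of floats). Generic sketch, not project code.
    """
    n = len(x)
    accum = [0.0] * n
    for k in range(1, steps + 1):
        alpha = k / steps
        # Point on the straight-line path from baseline to x.
        point = [baseline[i] + alpha * (x[i] - baseline[i]) for i in range(n)]
        g = grad_fn(point)
        for i in range(n):
            accum[i] += g[i]
    # Scale by (x - baseline) / steps to complete the path integral.
    return [(x[i] - baseline[i]) * accum[i] / steps for i in range(n)]

# Toy model F(x) = sum(x_i^2), so grad F(x) = 2x; by the completeness
# property, attributions should sum to roughly F(x) - F(baseline) = 5.
grad = lambda p: [2 * v for v in p]
attrs = integrated_gradients(grad, x=[1.0, 2.0], baseline=[0.0, 0.0])
```

The completeness property (attributions summing to the output difference) is a quick sanity check when wiring an explainer into a real pipeline.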
- 🎨 Controlled Image Generation: Produces matched baseline vs fairness-aware images across professions.
- 🤖 Agentic AI (CrewAI): LLM-based prompt, image, and metadata orchestration.
- 🧾 Synthetic Data Generation: Domain/risk-specific image prompts, VQA pairs, scenes, and MCQs.
- 🎬 Video Generation: Uses Google Veo/Gemini APIs with checkpoint and resume logic.
- ⚖️ Fairness Metrics & Explainability: Statistical Parity, Equal Opportunity, and zero-shot explainers with integrated gradients.
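For a concrete sense of the two group-fairness metrics named above, here is a minimal plain-Python sketch (a simplified illustration with binary predictions and two groups encoded as 0/1; the actual module may compute these differently):

```python
def statistical_parity_diff(y_pred, group):
    """P(pred=1 | group 1) - P(pred=1 | group 0)."""
    rate = lambda g: sum(p for p, gr in zip(y_pred, group) if gr == g) / group.count(g)
    return rate(1) - rate(0)

def equal_opportunity_diff(y_true, y_pred, group):
    """Difference in true-positive rates between the two groups."""
    def tpr(g):
        # Predictions on actual positives belonging to group g.
        pos = [p for p, t, gr in zip(y_pred, y_true, group) if gr == g and t == 1]
        return sum(pos) / len(pos)
    return tpr(1) - tpr(0)

y_true = [1, 1, 0, 1, 1, 0]
y_pred = [1, 0, 1, 1, 1, 0]
group  = [0, 0, 0, 1, 1, 1]
spd = statistical_parity_diff(y_pred, group)         # 2/3 - 2/3 = 0.0
eod = equal_opportunity_diff(y_true, y_pred, group)  # 1.0 - 0.5 = 0.5
```

A value of 0 on either metric indicates parity between the groups; the sign shows which group is favored.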
- Unit and integration tests via `pytest`.
- Code quality enforced via `pre-commit` hooks:
  - `ruff` — linting & formatting
  - `mypy` — type checks
  - `typos` — spell checks
  - `nbQA` — notebook linting
- Continuous checks through GitHub Actions (see badges above).
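A `.pre-commit-config.yaml` wiring these hooks together typically looks like the sketch below. The hook repositories and ids are the standard upstream ones, but the pinned versions are illustrative; check the repo's actual config for the authoritative setup.

```yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.0  # illustrative version
    hooks:
      - id: ruff
      - id: ruff-format
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.11.0
    hooks:
      - id: mypy
  - repo: https://github.com/crate-ci/typos
    rev: v1.24.0
    hooks:
      - id: typos
  - repo: https://github.com/nbQA-dev/nbQA
    rev: 1.8.0
    hooks:
      - id: nbqa-ruff
```

Running `pre-commit install` once registers the hooks so they execute on every commit.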
- 🧩 Bias in the Picture: Benchmarking VLMs with Social-Cue News Images, NeurIPS LLM Evals Workshop 2025
- 📜 TRiSM for Agentic AI, Preprint
- 📘 Responsible Agentic Reasoning and AI Agents, TechRxiv
- 🧠 Single-Agent TRiSM Poster (NeurIPS LAW Workshop 2025)
We welcome community contributions! See CONTRIBUTING.md for coding standards, dev setup, and workflow conventions.
The code in this repo is released under the MIT License.
The AIXPERT Project unites 17 partners across Europe and Canada under the EU Horizon Europe Programme (Grant No. 101214389) and the Swiss SERI to advance explainable, fair, and accountable AI.
🌐 Project Website · LinkedIn · X/Twitter · YouTube
The AIXPERT Project has received funding from the European Union’s Horizon Europe Research and Innovation Programme under Grant No. 101214389, and from the Swiss State Secretariat for Education, Research and Innovation (SERI). Views expressed are those of the authors and do not necessarily reflect those of the European Union or funding authorities.