VectorInstitute/vector-aixpert

🧠 AI Fairness Data Generation and Question Answering System

Badges: code checks · unit tests · integration tests · docs


Transparent tools and standardized benchmarks for fair, explainable, and accountable generative AI.

The rapid expansion of GenAI magnifies long-standing concerns around bias, fairness, and representation. This project enables systematic, controlled experimentation so researchers can identify when and why bias occurs, and test what mitigates it.


🌍 What is the project about?

The AI Fairness Data Generation and Question Answering System is part of Vector Institute's contribution to the broader AIXPERT Project, a multi-institutional initiative that develops tools and benchmarks for fairness-aware data generation and evaluation in generative AI.

It provides:

  • 🧩 Controlled synthetic datasets — safely isolate bias-inducing factors.
  • 🤖 Agentic automation using CrewAI and custom LLM agents.
  • 📊 Fairness metrics & explainers to visualize disparities.
  • ⚙️ Configurable, reproducible pipelines for responsible AI research.
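
To illustrate what isolating a bias-inducing factor can look like in practice, here is a minimal sketch that builds matched prompts: the template and profession are held fixed and only one attribute varies. The function name, template, and attribute values are hypothetical and are not taken from this repository.

```python
from itertools import product


def build_matched_prompts(template, professions, attribute_values):
    """Return one record per (profession, attribute value) pair.

    Within a profession only the attribute changes, so differences in
    downstream outputs can be traced back to that single factor.
    """
    records = []
    for profession, value in product(professions, attribute_values):
        records.append(
            {
                "profession": profession,
                "attribute": value,
                "prompt": template.format(attribute=value, profession=profession),
            }
        )
    return records


# Hypothetical template and values, for illustration only.
prompts = build_matched_prompts(
    template="A portrait photo of a {attribute} {profession} at work",
    professions=["nurse", "engineer"],
    attribute_values=["young", "middle-aged"],
)
for record in prompts:
    print(record["prompt"])
```

Keeping the varied attribute as its own field in each record makes it straightforward to group results by that attribute later.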

📘 Documentation: Project website

📂 Data: Hugging Face

🧮 Code: GitHub Page


🧱 Repository Structure

Path                                                            Description
src/aixpert/controlled_images/                                  Controlled image generation (baseline vs fairness-aware).
src/aixpert/data_generation/synthetic_data_generation/images/   Domain- and risk-specific image + VQA generation.
src/aixpert/data_generation/synthetic_data_generation/nlp/      Domain- and risk-specific Scene + MCQ generation.
src/aixpert/data_generation/synthetic_data_generation/videos/   Video synthesis using Google Veo / Gemini API.
src/aixpert/data_generation/agent_pipeline/                     Single-agent CrewAI pipeline for multimodal orchestration.
src/aixpert/toxicity_fairness_analysis/                         Fairness metrics and zero-shot explainability (integrated gradients).
docs/                                                           MkDocs documentation sources.
tests/                                                          Tests using pytest.

🚀 Getting Started

New to the project? Follow the steps below to set up your development environment and explore key modules.

Prerequisites

Ensure you have uv installed (recommended environment manager).

Quick setup

# 1) Create the environment
uv sync
source .venv/bin/activate

# 2) (Optional) Install dev tools
uv sync --dev

# 3) (Optional) Install and serve docs
uv sync --group docs
uv run mkdocs serve

Module quick start (one-liners + deep links)

Each module below has its own README with exact commands, configs, and outputs.


🧠 Key Components

  • 🎨 Controlled Image Generation: Produces matched baseline vs fairness-aware images across professions.
  • 🤖 Agentic AI (CrewAI): LLM-based prompt, image, and metadata orchestration.
  • 🧾 Synthetic Data Generation: Domain/risk-specific image prompts, VQA pairs, scenes, and MCQs.
  • 🎬 Video Generation: Uses Google Veo/Gemini APIs with checkpoint and resume logic.
  • ⚖️ Fairness Metrics & Explainability: Statistical Parity, Equal Opportunity, and zero-shot explainers with integrated gradients.
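
As a rough illustration of the two group fairness metrics named above, the sketch below computes them as differences between two groups on binary predictions. The function names and toy data are assumptions made for this example; the repository's own implementation may differ, for instance in sign convention or in using ratios instead of differences.

```python
def selection_rate(y_pred, groups, group):
    """Fraction of positive predictions within one group."""
    preds = [p for p, g in zip(y_pred, groups) if g == group]
    return sum(preds) / len(preds)


def true_positive_rate(y_true, y_pred, groups, group):
    """Fraction of a group's actual positives that are predicted positive."""
    preds = [p for t, p, g in zip(y_true, y_pred, groups) if g == group and t == 1]
    return sum(preds) / len(preds)


def statistical_parity_difference(y_pred, groups, group_a, group_b):
    """Gap in positive prediction rates; 0 means parity."""
    return selection_rate(y_pred, groups, group_a) - selection_rate(y_pred, groups, group_b)


def equal_opportunity_difference(y_true, y_pred, groups, group_a, group_b):
    """Gap in true positive rates; 0 means equal opportunity."""
    return true_positive_rate(y_true, y_pred, groups, group_a) - true_positive_rate(
        y_true, y_pred, groups, group_b
    )


# Toy data (made up for illustration): 1 = positive outcome.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

print(statistical_parity_difference(y_pred, groups, "a", "b"))  # 0.5
print(equal_opportunity_difference(y_true, y_pred, groups, "a", "b"))  # 0.5
```

This sketch assumes every group has at least one member and at least one actual positive; a production version would need to handle empty groups before dividing.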

🧪 Testing & CI/CD

  • Unit and integration tests via pytest.

  • Code quality enforced via pre-commit hooks:

    • ruff — linting & formatting
    • mypy — type checks
    • typos — spell checks
    • nbQA — notebook linting
  • Continuous checks through GitHub Actions (see badges above).


📚 Publications & Outputs


🤝 Contributing

We welcome community contributions! See CONTRIBUTING.md for coding standards, dev setup, and workflow conventions.


📄 License

The code in this repository is released under the MIT License.


💡 About AIXPERT

The AIXPERT Project unites 17 partners across Europe and Canada under the EU Horizon Europe Programme (Grant No. 101214389) and the Swiss SERI to advance explainable, fair, and accountable AI.

🌐 Project Website · LinkedIn · X/Twitter · YouTube


💰 Funding Acknowledgment

The AIXPERT Project has received funding from the European Union’s Horizon Europe Research and Innovation Programme under Grant No. 101214389, and from the Swiss State Secretariat for Education, Research and Innovation (SERI). Views expressed are those of the authors and do not necessarily reflect those of the European Union or funding authorities.
