Skip to content

nik21hil/auto-audience-generator

Repository files navigation

🧠 Auto Audience Generator

The Auto Audience Generator is a smart, LLM-powered tool designed to automatically generate targeted user audiences from natural language prompts using a structured Knowledge Graph (KG), rule-based filtering, and semantic matching.

Streamlit App


✨ Overview

This tool allows marketers, analysts, and product teams to:

  • Input natural-language prompts like:
    "Find crypto enthusiasts"
    "Show users interested in fitness and wellness"
    "Find near graduting studnets"

  • Behind the scenes, it:

    1. Builds a Knowledge Graph using user-product-content interaction data.
    2. Uses LLM to extract logical audience filtering rules (e.g., age > 18, tag IN [“crypto”, “blockchain”]).
    3. Expands matching fields using semantic embeddings for robust keyword-to-field-value matching.
    4. Applies rules on the KG and outputs the list of matched users.
    5. Visualizes user-item relationships using a dynamic subgraph display.

⚙️ Key Components

Module Description
graph_builder.py Constructs the Knowledge Graph from CSVs based on a JSON schema
prompt_to_rules.py Uses OpenRouter + LLM to turn prompts into executable logical rules
graph_queries.py Evaluates logical rules (AND/OR/nested) on the KG to select audience
semantic_matcher.py Uses embedding similarity to expand keywords like "crypto" → "blockchain"
app.py Streamlit interface to run everything in one click

✅ Current Capabilities

  • Multi-source Knowledge Graph with user/product/content data
  • Natural language → LLM-based rules (AND, OR, nested)
  • Embedding-based synonym matching (e.g., “crypto” → “blockchain”)
  • Graph subvisualization of user-to-interest relationships
  • Streamlit UI to demo everything end-to-end

📊 Sample Data Overview

Dataset Description
users.csv User demographics: age, gender, location
products.csv Product info with tags/categories
orders.csv User-product purchase history
streaming.csv User-content interaction & genres

📁 Folder Structure

auto-audience-generator/
├── assets/            # To store logo images or any other artifact
├── data/              # Sample CSVs
├── notebooks/         # Jupyter demo notebooks
├── src/               # Modular Python code
│   ├── graph_builder.py # Knowledge Graph builder
│   ├── graph_queries.py # Rule execution engine
│   ├── prompt_to_rules.py # LLM-based rule extractor
│   ├── semantic_matcher.py # Embedding-based semantic expander
├── app.py             # Main Streamlit app
├── requirements.txt
│── README.md

🛠️ Setup Instructions

  • Clone the repo:
  • (Optional) Create virtual environment:
    • python3 -m venv venv
    • source venv/bin/activate
  • Install dependencies:
    • pip install -r requirements.txt
  • Add your OpenRouter API Key (in .streamlit/secrets.toml):
    • OPENROUTER_API_KEY = "your-key-here"
  • Run the Streamlit app:
    • streamlit run app.py

🛠️ Planned Enhancements

Category Planned Feature
🔄 Rule Intelligence Score each rule with confidence / prompt follow-ups to relax or tighten rules
✍️ Rule Editor In-app manual rule editing + live rule preview
💾 Rule History Save, re-use, and manage frequently used prompts and rules
🧠 Smarter Matching Expand KG and embeddings to support domain-specific synonyms
🧩 Auto Schema Build Ingest raw CSV + data dictionary → auto-create graph schema
⚡ UX Improvements Tag clouds, field highlights, advanced visualizations

🤝 Contributing

Contributions are welcome. Please open an issue or pull request to discuss improvements, features, or bug fixes.


📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


🌐 Author

Nikhil Singh
GitHub | LinkedIn


Enjoy building! 🎯

About

LLM-powered audience segment builder using Knowledge Graphs & natural language rules

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors