The Auto Audience Generator is an LLM-powered tool that generates targeted user audiences from natural-language prompts using a structured Knowledge Graph (KG), rule-based filtering, and semantic matching.
This tool allows marketers, analysts, and product teams to input natural-language prompts like:
- "Find crypto enthusiasts"
- "Show users interested in fitness and wellness"
- "Find near-graduating students"
Behind the scenes, it:
- Builds a Knowledge Graph using user-product-content interaction data.
- Uses an LLM to extract logical audience-filtering rules (e.g., age > 18, tag IN [“crypto”, “blockchain”]).
- Expands matching fields using semantic embeddings for robust keyword-to-field-value matching.
- Applies rules on the KG and outputs the list of matched users.
- Visualizes user-item relationships using a dynamic subgraph display.
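
The rule-application step above can be illustrated with a minimal sketch. The rule format here (leaf rules with `field`, `op`, `value`, and `all`/`any` groups for AND/OR) is a hypothetical one chosen for illustration, not necessarily what `graph_queries.py` uses, and the user records are made up.

```python
from typing import Any

# Hypothetical rule format (an assumption, not the project's actual schema):
# a leaf is {"field", "op", "value"}; a group is {"all": [...]} for AND
# or {"any": [...]} for OR, and groups can be nested.
OPS = {
    ">": lambda a, b: a > b,
    "<": lambda a, b: a < b,
    "==": lambda a, b: a == b,
    # "IN" matches when the user's value(s) overlap the allowed values.
    "IN": lambda a, b: bool(
        set(a if isinstance(a, (list, set, tuple)) else [a]) & set(b)
    ),
}

def matches(user: dict[str, Any], rule: dict[str, Any]) -> bool:
    """Recursively evaluate a (possibly nested) rule against one user record."""
    if "all" in rule:
        return all(matches(user, r) for r in rule["all"])
    if "any" in rule:
        return any(matches(user, r) for r in rule["any"])
    value = user.get(rule["field"])
    if value is None:
        return False  # a missing field never satisfies a leaf rule
    return OPS[rule["op"]](value, rule["value"])

# Made-up user attributes, as they might be read off the graph.
users = {
    "u1": {"age": 25, "tag": ["crypto", "gaming"]},
    "u2": {"age": 17, "tag": ["blockchain"]},
    "u3": {"age": 40, "tag": ["fitness"]},
}

# age > 18 AND tag IN ["crypto", "blockchain"]
rule = {"all": [
    {"field": "age", "op": ">", "value": 18},
    {"field": "tag", "op": "IN", "value": ["crypto", "blockchain"]},
]}

audience = [uid for uid, u in users.items() if matches(u, rule)]
print(audience)  # ['u1']
```

Keeping rules as plain nested dictionaries means an LLM can emit them as JSON, and the evaluator stays a short recursive function.
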
| Module | Description |
|---|---|
| graph_builder.py | Constructs the Knowledge Graph from CSVs based on a JSON schema |
| prompt_to_rules.py | Uses OpenRouter + LLM to turn prompts into executable logical rules |
| graph_queries.py | Evaluates logical rules (AND/OR/nested) on the KG to select audience |
| semantic_matcher.py | Uses embedding similarity to expand keywords like "crypto" → "blockchain" |
| app.py | Streamlit interface to run everything in one click |
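
As a rough illustration of the embedding-based expansion that `semantic_matcher.py` is described as doing, the sketch below ranks candidate field values by cosine similarity to a keyword. The 3-dimensional vectors, the `expand` helper, and the 0.9 threshold are all made up for this example; a real run would presumably obtain vectors from an embedding model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Toy vectors invented for illustration only (assumption: not real embeddings).
EMBEDDINGS = {
    "crypto": [0.9, 0.1, 0.0],
    "blockchain": [0.8, 0.2, 0.1],
    "fitness": [0.0, 0.9, 0.3],
    "wellness": [0.1, 0.8, 0.4],
}

def expand(keyword, field_values, threshold=0.9):
    """Return the field values whose embedding is close to the keyword's."""
    query = EMBEDDINGS[keyword]
    scored = [(v, cosine(query, EMBEDDINGS[v])) for v in field_values]
    return sorted(v for v, score in scored if score >= threshold)

print(expand("crypto", ["blockchain", "fitness", "wellness"]))  # ['blockchain']
```

The threshold trades recall for precision: lowering it pulls in looser matches, which may or may not be desirable for a given audience.
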
- Multi-source Knowledge Graph with user/product/content data
- Natural language → LLM-based rules (AND, OR, nested)
- Embedding-based synonym matching (e.g., “crypto” → “blockchain”)
- Subgraph visualization of user-to-interest relationships
- Streamlit UI to demo everything end-to-end
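
One way to pick the nodes for a per-user subgraph view is a bounded breadth-first search from the user node. The sketch below is an assumption about how such a view could be computed, not the app's actual code; the adjacency list, node naming scheme, and `ego_subgraph` helper are hypothetical.

```python
from collections import deque

# Hypothetical adjacency list; node names use a "type:id" convention
# chosen for this example only.
GRAPH = {
    "user:u1": ["product:p1", "content:c1"],
    "product:p1": ["user:u1", "tag:crypto"],
    "content:c1": ["user:u1", "genre:finance"],
    "tag:crypto": ["product:p1"],
    "genre:finance": ["content:c1"],
    "user:u2": ["product:p2"],
    "product:p2": ["user:u2"],
}

def ego_subgraph(graph, start, max_hops=2):
    """Collect nodes and undirected edges within max_hops of start (BFS)."""
    depth = {start: 0}
    queue = deque([start])
    edges = set()
    while queue:
        node = queue.popleft()
        if depth[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, []):
            edges.add(tuple(sorted((node, neighbor))))
            if neighbor not in depth:
                depth[neighbor] = depth[node] + 1
                queue.append(neighbor)
    return set(depth), edges

nodes, edges = ego_subgraph(GRAPH, "user:u1")
print(sorted(nodes))
print(len(edges))  # 4
```

The resulting node and edge sets can then be handed to whichever graph-drawing component the UI uses.
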
| Dataset | Description |
|---|---|
| users.csv | User demographics: age, gender, location |
| products.csv | Product info with tags/categories |
| orders.csv | User-product purchase history |
| streaming.csv | User-content interaction & genres |
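
To show how datasets like these could be linked into a graph, here is a minimal sketch using only the standard library. The column names (`user_id`, `product_id`, `tags`), the `;` tag separator, and the inline sample rows are assumptions for illustration; the real files and the JSON schema used by `graph_builder.py` may differ.

```python
import csv
import io
from collections import defaultdict

# Inline stand-ins for orders.csv and products.csv (made-up rows and columns).
ORDERS_CSV = """user_id,product_id
u1,p1
u2,p2
u1,p2
"""
PRODUCTS_CSV = """product_id,tags
p1,crypto;blockchain
p2,fitness
"""

def build_graph(orders_csv, products_csv):
    """Build an undirected graph as adjacency sets keyed by (type, id)."""
    graph = defaultdict(set)

    def add_edge(a, b):
        graph[a].add(b)
        graph[b].add(a)

    # user -> product edges from purchase history
    for row in csv.DictReader(io.StringIO(orders_csv)):
        add_edge(("user", row["user_id"]), ("product", row["product_id"]))
    # product -> tag edges from product metadata
    for row in csv.DictReader(io.StringIO(products_csv)):
        for tag in row["tags"].split(";"):
            add_edge(("product", row["product_id"]), ("tag", tag))
    return graph

def user_tags(graph, user_id):
    """Tags reachable from a user through the products they bought."""
    tags = set()
    for product in graph[("user", user_id)]:
        tags |= {node_id for kind, node_id in graph[product] if kind == "tag"}
    return tags

graph = build_graph(ORDERS_CSV, PRODUCTS_CSV)
print(sorted(user_tags(graph, "u1")))  # ['blockchain', 'crypto', 'fitness']
```

Typing each node as `(type, id)` keeps users, products, and tags from colliding even when their raw IDs overlap.
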
auto-audience-generator/
├── assets/ # To store logo images or any other artifact
├── data/ # Sample CSVs
├── notebooks/ # Jupyter demo notebooks
├── src/ # Modular Python code
│ ├── graph_builder.py # Knowledge Graph builder
│ ├── graph_queries.py # Rule execution engine
│ ├── prompt_to_rules.py # LLM-based rule extractor
│ └── semantic_matcher.py # Embedding-based semantic expander
├── app.py # Main Streamlit app
├── requirements.txt
└── README.md
- Clone the repo:
  - `git clone https://github.com/nik21hil/auto-audience-generator.git`
  - `cd auto-audience-generator`
- (Optional) Create virtual environment:
  - `python3 -m venv venv`
  - `source venv/bin/activate`
- Install dependencies:
  - `pip install -r requirements.txt`
- Add your OpenRouter API Key (in .streamlit/secrets.toml):
  - `OPENROUTER_API_KEY = "your-key-here"`
- Run the Streamlit app:
  - `streamlit run app.py`
| Category | Planned Feature |
|---|---|
| 🔄 Rule Intelligence | Score each rule with confidence / prompt follow-ups to relax or tighten rules |
| ✍️ Rule Editor | In-app manual rule editing + live rule preview |
| 💾 Rule History | Save, re-use, and manage frequently used prompts and rules |
| 🧠 Smarter Matching | Expand KG and embeddings to support domain-specific synonyms |
| 🧩 Auto Schema Build | Ingest raw CSV + data dictionary → auto-create graph schema |
| ⚡ UX Improvements | Tag clouds, field highlights, advanced visualizations |
Contributions are welcome. Please open an issue or pull request to discuss improvements, features, or bug fixes.
This project is licensed under the MIT License - see the LICENSE file for details.
Nikhil Singh
GitHub | LinkedIn
Enjoy building! 🎯