Create unified vector embeddings for customer profiling and segmentation
Customer Vector Lab is a no-code tool that lets you:
- Upload raw customer data
- Automatically clean and standardize numeric features
- Generate vector embeddings using PCA
- Cluster customers using KMeans
- Visualize personas using:
  - 📊 PCA scatter plot
  - 📈 UMAP + t-SNE projections
  - 🕸 Radar charts of cluster traits
- Explore customer distribution across clusters
It is aimed at data scientists, marketers, and business analysts who want to quickly identify customer segments and personas for personalization, targeting, or storytelling.
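The steps listed above (standardize, embed with PCA, cluster with KMeans) can be sketched with scikit-learn roughly as follows. This is an illustrative sketch, not the app's actual code: the function name `embed_and_cluster`, its defaults, the synthetic data, and the choice to cluster on the PCA output rather than on the scaled features are all assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def embed_and_cluster(features, n_components=2, n_clusters=3, seed=0):
    """Standardize features, project them with PCA, and cluster with KMeans.

    Hypothetical helper for illustration; not taken from the repository.
    """
    # Scale every column to zero mean and unit variance so no feature dominates.
    scaled = StandardScaler().fit_transform(features)
    # The PCA projection serves as the customer embedding.
    embeddings = PCA(n_components=n_components, random_state=seed).fit_transform(scaled)
    # Assumption: clustering runs on the embedding, not on the scaled features.
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(embeddings)
    return embeddings, labels


# Synthetic data: three groups of 50 customers (age, income, monthly spend).
rng = np.random.default_rng(0)
centers = np.array([[25, 30_000, 200], [45, 80_000, 900], [65, 50_000, 400]])
features = np.vstack(
    [c + rng.normal(scale=[3, 4_000, 50], size=(50, 3)) for c in centers]
)

embeddings, labels = embed_and_cluster(features)
# Customer distribution across clusters.
cluster_sizes = np.bincount(labels)
print(embeddings.shape, cluster_sizes)
```

Standardizing first matters here because income is several orders of magnitude larger than age; without scaling, PCA would be dominated by the income column.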
- CSV format
- Works best with customer records that include:
  - Demographics (age, income, location)
  - Behavioral signals (spending, visits)
  - Transaction data (LTV, frequency)
- Categorical variables are automatically one-hot encoded
- ID columns are excluded from clustering
- The final dataset includes all original columns plus cluster labels and the first two principal components (PC1, PC2)
- Ready for persona marketing, analysis, or targeted campaigns
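As a rough illustration of the notes above, the pandas sketch below drops ID columns, one-hot encodes categoricals, and then attaches cluster labels and PC1/PC2 to the original data. The helper names `prepare_features` and `attach_results`, the `cluster` column name, and the sample data are hypothetical; the app may detect ID columns and name its outputs differently.

```python
import pandas as pd


def prepare_features(df, id_columns):
    """Drop ID columns and one-hot encode categoricals (hypothetical helper)."""
    features = df.drop(columns=id_columns)
    # get_dummies expands string/category columns and leaves numeric ones as-is.
    return pd.get_dummies(features, dtype=float)


def attach_results(df, labels, components):
    """Return the original columns plus cluster labels and PC1/PC2."""
    out = df.copy()
    out["cluster"] = labels  # assumed column name
    out["PC1"] = [row[0] for row in components]
    out["PC2"] = [row[1] for row in components]
    return out


customers = pd.DataFrame(
    {
        "customer_id": [101, 102, 103],
        "age": [34, 52, 27],
        "location": ["NY", "CA", "NY"],
    }
)

features = prepare_features(customers, id_columns=["customer_id"])
print(list(features.columns))

# Placeholder labels and components stand in for real KMeans/PCA output.
result = attach_results(
    customers,
    labels=[0, 1, 0],
    components=[(0.1, -0.2), (1.3, 0.4), (-0.5, 0.9)],
)
print(list(result.columns))
```

Keeping the ID column in the output but out of the feature matrix means each row can still be traced back to a customer without the identifier influencing the clusters.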
# 1. Clone the repo
git clone https://github.com/nik21hil/customer-vector-lab.git
cd customer-vector-lab
# 2. (Optional) Create virtual environment
python -m venv env
source env/bin/activate  # On Windows: env\Scripts\activate
# 3. Install dependencies
pip install -r requirements.txt
# 4. Run the Streamlit app
streamlit run app.py

customer-vector-lab/
├── assets/            # To store logo images or any other artifact
├── data/              # Sample customer CSVs
├── notebooks/         # Jupyter demo notebooks
├── src/               # Modular Python code
│   ├── preprocess.py
│   ├── embeddings.py
│   ├── clustering.py
│   └── visualize.py
├── app.py             # Main Streamlit app
├── requirements.txt
└── README.md
MIT License — feel free to fork, remix, and use.
Built by @nik21hil
For issues or suggestions, feel free to open a GitHub issue or connect via LinkedIn.
Enjoy building! 🎯