⚠️ Project Under Development
This project is currently under development and supports only limited functionality. Please check the release notes for stable features.
A project by Persona Lab at ModuLabs.
PSYCTL is a tool for steering LLMs to exhibit specific personalities. The goal is to generate datasets automatically, so that the tool works from just a model and a personality specification.
- Build Steering Dataset - Generate steering datasets for vector extraction
- Extract Steering Vectors - Extract steering vectors using mean_diff or BiPO methods
- Steering Experiment - Apply steering vectors using CAA (Contrastive Activation Addition)
- Configuration - Environment variables and performance tuning
- OpenRouter Integration - Use cloud APIs instead of local GPU
- Community Datasets - Pre-built datasets and registry
- Troubleshooting - Common issues and solutions
- Contributing - Development guide and contribution guidelines
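For orientation, the mean-difference extraction and CAA-style addition named in the list above can be sketched in plain Python. This is a toy illustration on hand-written 3-dimensional activations, not PSYCTL's implementation: the helper names `mean_diff_vector` and `apply_steering` are hypothetical, and real activations would come from a model layer rather than literal lists.

```python
from typing import List, Sequence

Vector = List[float]


def mean_vector(rows: Sequence[Sequence[float]]) -> Vector:
    """Element-wise mean of equally sized activation vectors."""
    if not rows:
        raise ValueError("need at least one activation vector")
    dim = len(rows[0])
    return [sum(row[i] for row in rows) / len(rows) for i in range(dim)]


def mean_diff_vector(
    positive: Sequence[Sequence[float]], negative: Sequence[Sequence[float]]
) -> Vector:
    """Steering vector = mean(positive activations) - mean(negative activations)."""
    pos_mean = mean_vector(positive)
    neg_mean = mean_vector(negative)
    return [p - n for p, n in zip(pos_mean, neg_mean)]


def apply_steering(
    hidden: Sequence[float], steering: Sequence[float], strength: float = 1.0
) -> Vector:
    """CAA-style addition: shift a hidden state along the steering direction."""
    return [h + strength * s for h, s in zip(hidden, steering)]


# Toy activations (assumed values): two "trait present" and two "trait absent" samples.
positive = [[1.0, 2.0, 0.0], [3.0, 2.0, 0.0]]
negative = [[0.0, 2.0, 1.0], [0.0, 2.0, 3.0]]

vector = mean_diff_vector(positive, negative)
steered = apply_steering([0.5, 0.5, 0.5], vector, strength=2.0)
print(vector)
print(steered)
```

The `strength` multiplier here plays the role of a steering coefficient: larger values push the hidden state further along the extracted direction.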
Basic Installation (CPU Version)
# Install uv (Windows)
Invoke-WebRequest -Uri "https://astral.sh/uv/install.ps1" -OutFile "install_uv.ps1"
& .\install_uv.ps1
# Project setup
uv venv
& .\.venv\Scripts\Activate.ps1
uv sync
Installation in Google Colab
# Install directly from GitHub
!pip install git+https://github.com/modulabs-personalab/psyctl.git
# Or install from specific branch
!pip install git+https://github.com/modulabs-personalab/psyctl.git@main
# Set environment variables
import os
os.environ['HF_TOKEN'] = 'your_huggingface_token_here'
os.environ['PSYCTL_LOG_LEVEL'] = 'INFO'
# Usage example
from psyctl import DatasetBuilder, P2, LLMLoader
GPU Acceleration Installation (CUDA Support)
# Install CUDA-enabled PyTorch after basic installation
uv pip install torch --index-url https://download.pytorch.org/whl/cu121
# Verify installation
python -c "import torch; print('CUDA available:', torch.cuda.is_available())"
Important: The `transformers` package has `torch` as a dependency, so running `uv sync` will automatically install the CPU version. For GPU usage, you need to run the CUDA installation command above again.
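Because a later `uv sync` can put the CPU wheel back, it may help to check which PyTorch build is currently active. The following is a minimal sketch that relies only on the standard `torch.__version__`, `torch.version.cuda`, and `torch.cuda` attributes; the helper name `describe_torch_build` is made up for illustration and is not part of PSYCTL.

```python
import importlib.util


def describe_torch_build() -> str:
    """Return a one-line summary of the installed PyTorch build."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"

    import torch

    cuda_build = torch.version.cuda  # None on CPU-only wheels
    if cuda_build is None:
        return f"torch {torch.__version__}: CPU-only build"
    if not torch.cuda.is_available():
        return (
            f"torch {torch.__version__}: built for CUDA {cuda_build}, "
            "but no GPU is visible"
        )
    device_name = torch.cuda.get_device_name(0)
    return f"torch {torch.__version__}: CUDA {cuda_build}, device {device_name}"


print(describe_torch_build())
```

If the summary reports a CPU-only build after syncing, rerunning the CUDA installation command above should restore the GPU wheel.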
# 1. Generate dataset
psyctl dataset.build.steer \
--model "google/gemma-3-27b-it" \
--personality "Extroversion, Machiavellianism" \
--output "./dataset/steering"
# 2. Upload dataset to HuggingFace Hub (optional)
psyctl dataset.upload \
--dataset-file "./dataset/steering/steering_dataset_*.jsonl" \
--repo-id "username/extroversion-steering"
# 3. Extract steering vector
psyctl extract.steering \
--model "meta-llama/Llama-3.2-3B-Instruct" \
--layer "model.layers[13].mlp.down_proj" \
--dataset "./dataset/steering" \
--output "./steering_vector/out.safetensors"
# 4. Steering experiment
psyctl steering \
--model "meta-llama/Llama-3.2-3B-Instruct" \
--steering-vector "./steering_vector/out.safetensors" \
--input-text "Tell me about yourself"
# 5. Inventory test
psyctl benchmark inventory \
--model "meta-llama/Llama-3.2-3B-Instruct" \
--steering-vector "./steering_vector/out.safetensors" \
--inventory "ipip_neo_120" \
--trait "Neuroticism"
PSYCTL provides the following main commands. See the documentation links above for detailed usage.
| Command | Description | Documentation |
|---|---|---|
| `dataset.build.steer` | Generate steering datasets | Guide |
| `dataset.upload` | Upload datasets to HuggingFace | Guide |
| `extract.steering` | Extract steering vectors | Guide |
| `steering` | Apply steering to generation | Guide |
| `benchmark inventory` | Test with psychological inventories (logit-based) | See below |
| `benchmark llm-as-judge` | Test with LLM as Judge (situation-based questions) | See below |
| `inventory.list` | List available inventories | See below |
Benchmark Methods:
- Inventory: Uses standardized psychological inventories (e.g., IPIP-NEO) with logit-based scoring. More objective and reproducible.
- LLM as Judge: Generates situation-based questions and uses an LLM to evaluate responses. More flexible and context-aware.
  - For API-based judges (OpenAI, OpenRouter), set environment variables: `OPENAI_API_KEY` for OpenAI models, `OPENROUTER_API_KEY` for OpenRouter models
  - For local models, use `local-default` (reuses target model) or configure a custom model path in `benchmark_config.json`
  - For custom API servers, edit `benchmark_config.json` to add your server configuration
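One common way to do logit-based scoring for a Likert item is to read the model's logits for each answer option and take the probability-weighted average, instead of sampling a single answer. The sketch below illustrates that general idea only; it is not taken from PSYCTL's code, and the function name `expected_likert_score` and the example logits are invented.

```python
import math
from typing import Dict


def expected_likert_score(option_logits: Dict[int, float]) -> float:
    """Softmax the per-option logits and return the expected Likert value."""
    if not option_logits:
        raise ValueError("need logits for at least one option")
    # Subtract the maximum logit before exponentiating for numerical stability.
    max_logit = max(option_logits.values())
    weights = {
        option: math.exp(logit - max_logit)
        for option, logit in option_logits.items()
    }
    total = sum(weights.values())
    return sum(option * weight for option, weight in weights.items()) / total


# Assumed logits for the answer tokens "1" to "5" of a single inventory item.
uniform = expected_likert_score({1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0})
skewed = expected_likert_score({1: -2.0, 2: -1.0, 3: 0.0, 4: 1.0, 5: 2.0})
print(round(uniform, 3), round(skewed, 3))
```

Because the score is computed from the full distribution over options, repeated runs on the same prompt give the same value, which is consistent with the reproducibility advantage noted for the Inventory method.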
| Inventory | Domain | License | Notes |
|---|---|---|---|
| IPIP-NEO-300/120 | Big Five | Public Domain | Full & short forms |
| NPI-40 | Narcissism | Free research use | Forced-choice |
| PNI-52 | Pathological narcissism | CC-BY-SA | Likert 1–6 |
| NARQ-18 | Admiration & Rivalry | CC-BY-NC | Two sub-scales |
| MACH-IV | Machiavellianism | Public Domain | Likert 1–5 |
| LSRP-26 | Psychopathy | Public Domain | Primary & secondary |
| PPI-56 | Psychopathy | Free research use | Short form |
PSYCTL uses environment variables for configuration. Required:
# Get your token from https://huggingface.co/settings/tokens
export HF_TOKEN="your_huggingface_token_here" # Linux/macOS
$env:HF_TOKEN = "your_token_here" # Windows
For detailed configuration options (directories, performance tuning, logging), see the Configuration Guide.
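Before launching a long run, it can be worth confirming that the variables are actually set. The sketch below uses only variable names that appear in this README (`HF_TOKEN`, `PSYCTL_LOG_LEVEL`, `PSYCTL_INFERENCE_BATCH_SIZE`); the helper `check_environment` and its fallback values are illustrative assumptions, not documented PSYCTL defaults.

```python
import os
from typing import Dict, Mapping


def check_environment(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Validate the variables this README mentions and return a printable summary."""
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set; create one at https://huggingface.co/settings/tokens"
        )

    # "16" and "INFO" mirror the example values in this README (assumed fallbacks).
    batch_size = env.get("PSYCTL_INFERENCE_BATCH_SIZE", "16")
    if not batch_size.isdigit() or int(batch_size) < 1:
        raise ValueError(
            f"PSYCTL_INFERENCE_BATCH_SIZE must be a positive integer, got {batch_size!r}"
        )

    return {
        "HF_TOKEN": "set (hidden)",  # never echo the token itself
        "PSYCTL_LOG_LEVEL": env.get("PSYCTL_LOG_LEVEL", "INFO"),
        "PSYCTL_INFERENCE_BATCH_SIZE": batch_size,
    }


# Demonstration with an explicit mapping; call check_environment() to read os.environ.
print(check_environment({"HF_TOKEN": "hf_example", "PSYCTL_LOG_LEVEL": "DEBUG"}))
```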
# 1. Generate dataset for extroversion personality
# Set batch size for optimal performance
export PSYCTL_INFERENCE_BATCH_SIZE="16"
psyctl dataset.build.steer \
--model "meta-llama/Llama-3.2-3B-Instruct" \
--personality "Extroversion" \
--output "./dataset/extroversion" \
--limit-samples 1000
# 2. Extract steering vector
psyctl extract.steering \
--model "meta-llama/Llama-3.2-3B-Instruct" \
--layer "model.layers[13].mlp.down_proj" \
--dataset "./dataset/extroversion" \
--output "./steering_vector/extroversion.safetensors"
# 3. Apply steering to generate text
psyctl steering \
--model "meta-llama/Llama-3.2-3B-Instruct" \
--steering-vector "./steering_vector/extroversion.safetensors" \
--input-text "Tell me about yourself"
# 4. Measure personality changes with inventory
psyctl benchmark inventory \
--model "meta-llama/Llama-3.2-3B-Instruct" \
--steering-vector "./steering_vector/extroversion.safetensors" \
--inventory "ipip_neo_120" \
--trait "Extraversion"
# 5. Measure personality changes with LLM as Judge
# Note: For API-based judges, set environment variables:
# export OPENAI_API_KEY="your-key" # For OpenAI models
# export OPENROUTER_API_KEY="your-key" # For OpenRouter models
# Or configure custom API server in benchmark_config.json
psyctl benchmark llm-as-judge \
--model "meta-llama/Llama-3.2-3B-Instruct" \
--steering-vector "./steering_vector/extroversion.safetensors" \
--trait "Extraversion" \
--judge-model "local-default" \
--num-questions 10 \
--strengths "1.0,2.0,3.0"
More Examples:
- See examples/ directory for Python library usage
- Check documentation links above for detailed guides
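The workflow steps above can also be driven from a script. The sketch below only assembles and prints the argument lists for two of the commands shown in this README; it does not call PSYCTL's Python API. The helper names are hypothetical, the flags are copied from the examples above, and actually running a command (for instance with `subprocess.run(cmd, check=True)`) assumes `psyctl` is on the PATH.

```python
import shlex
from typing import List

# Values copied from the workflow example in this README.
MODEL = "meta-llama/Llama-3.2-3B-Instruct"
LAYER = "model.layers[13].mlp.down_proj"


def extract_command(dataset_dir: str, vector_path: str) -> List[str]:
    """Argument list for `psyctl extract.steering`."""
    return [
        "psyctl", "extract.steering",
        "--model", MODEL,
        "--layer", LAYER,
        "--dataset", dataset_dir,
        "--output", vector_path,
    ]


def inventory_command(vector_path: str, inventory: str, trait: str) -> List[str]:
    """Argument list for `psyctl benchmark inventory`."""
    return [
        "psyctl", "benchmark", "inventory",
        "--model", MODEL,
        "--steering-vector", vector_path,
        "--inventory", inventory,
        "--trait", trait,
    ]


vector_path = "./steering_vector/extroversion.safetensors"
commands = [
    extract_command("./dataset/extroversion", vector_path),
    inventory_command(vector_path, "ipip_neo_120", "Extraversion"),
]

for cmd in commands:
    # shlex.join quotes arguments such as the bracketed layer path for the shell.
    print(shlex.join(cmd))
```

Passing arguments as a list rather than one string avoids shell-quoting problems with values such as `model.layers[13].mlp.down_proj`.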
Contributions are welcome! See Contributing Guide for:
- Development environment setup
- Code style and standards
- Testing guidelines
- Pull request process
- Evaluating and Inducing Personality in Pre-trained Language Models
- Refusal in Language Models Is Mediated by a Single Direction
- Steering Llama 2 via Contrastive Activation Addition
- Steering Large Language Model Activations in Sparse Spaces
- Identifying and Manipulating Personality Traits in LLMs Through Activation Engineering
- Toy Models of Superposition
- Personalized Steering of LLMs: Versatile Steering Vectors via Bi-directional Preference Optimization
- The dark core of personality
- The Dark Triad of personality: Narcissism, Machiavellianism, and psychopathy. Journal of Research in Personality
- Style-Specific Neurons for Steering LLMs in Text Style Transfer
- Between facets and domains: 10 aspects of the Big Five. Journal of Personality and Social Psychology
