Cell-free System Efficient Expression Regulatory Sequence Design Tool Based on NVIDIA EVO2-40B and Zhipu GLM-4.5
Integrates a three-stage design workflow, an LLM Agent optimization system, and a natural language interface to give bioengineers practical DNA sequence design capabilities.
This project is a Python program built on the interaction between the deep learning model EVO2 and the large language model GLM-4.5-x. It is designed specifically for creating regulatory sequences that drive efficient expression in cell-free systems. A three-stage iterative optimization process generates, validates, and optimizes DNA sequences to produce functional genetic regulatory elements.
- Three-stage Iterative Design Process: Unconstrained Exploration → Constrained Generation → Modular Validation
- Dual Model Collaboration: NVIDIA EVO2-40B + Zhipu GLM-4.5-x
- LLM Agent Automatic Optimization System: Agents collaborate to optimize sequences automatically
- Professional Sequence Analysis: GC content, functional element identification, secondary structure prediction
- Cell-free System Optimization: Specialized sequence optimization for cell-free expression systems
- GFP Expression Experiment Demo: Complete GFP expression regulatory sequence design workflow
- Interactive User Interface: Beautiful command-line interface and menu system based on Rich
- FASTA Sequence Export: Support for standard format sequence file export
- Natural Language Interaction: Multi-turn dialogue for sequence design requirements
- Agent Intelligent Optimization: Quality-driven automatic iteration and parameter adjustment
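GC content is the simplest of the sequence analysis metrics listed above, and it can be computed in a few lines of plain Python. The following is a minimal standalone sketch, not the project's SequenceAnalyzer implementation; the function name `gc_content` is hypothetical.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence (0.0 to 1.0)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    # Count bases that are either G or C
    gc = sum(1 for base in seq if base in "GC")
    return gc / len(seq)


# The T7 promoter prompt used elsewhere in this README
print(f"GC fraction: {gc_content('TAATACGACTCACTATAGGG'):.2f}")
```

A real analyzer would likely also handle ambiguous bases such as N; this sketch simply counts them as non-GC.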
macOS:
# Using Homebrew (recommended)
brew install uv
# Or using curl
curl -LsSf https://astral.sh/uv/install.sh | sh
Linux:
# Using curl
curl -LsSf https://astral.sh/uv/install.sh | sh
# Or using pip
pip install uv
Windows:
# Using PowerShell
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
# Or using pip
pip install uv
# Clone the project
git clone https://github.com/SmartisanNaive/LLM_for_EVO2.git
cd LLM_for_EVO2
# Create virtual environment and install dependencies using uv
uv sync
# Install project to virtual environment
uv pip install -e .
# Configure API keys
uv run evo2-designer setup
# Manage API configuration
uv run evo2-designer api-config
# Test API connection
uv run evo2-designer test
- API Configuration: api_config.json (project root)
Example configuration:
{
  "nvidia_api_key": "your_nvidia_api_key_here",
  "glm_api_key": "your_glm_api_key_here",
  "evo2_base_url": "https://health.api.nvidia.com/v1/biology/nvidia/evo",
  "glm_base_url": "https://open.bigmodel.cn/api/paas/v4/"
}
# Launch interactive menu
uv run python main.py
# or
uv run evo2-designer interactive
# Basic commands
uv run evo2-designer setup # Initialize configuration
uv run evo2-designer test # Test API connection
uv run evo2-designer design --prompt "TAATACGACTCACTATAGGG" --length 99
uv run evo2-designer analyze --sequence "ATCGATCGATCG"
uv run evo2-designer list-projects # View project list
uv run evo2-designer agent-config # Configure Agent parameters
# Run GFP experiment demo
uv run evo2-designer gfp-demo --max-length 140 --target-length 120 --export
The GFP demo designs DNA regulatory sequences for GFP protein expression in cell-free systems, including the T7 promoter, a 5'UTR with RBS, and the start codon ATG.
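The element order described above (T7 promoter, then RBS, then start codon) can be checked with simple string searches. This is a rough sketch under stated assumptions: the motif constants are commonly cited consensus sequences rather than values taken from the project, and `find_elements` is a hypothetical helper, not part of evo2_sequence_designer.

```python
from typing import Optional

# Assumed motifs (not taken from the project source)
T7_PROMOTER = "TAATACGACTCACTATAG"  # commonly cited T7 promoter consensus
RBS_CORE = "AGGAGG"                 # Shine-Dalgarno core consensus
START_CODON = "ATG"


def find_elements(sequence: str) -> Optional[dict]:
    """Locate promoter, RBS, and start codon in 5'->3' order.

    Returns the 0-based start index of each element, or None if any
    element is missing or appears out of order.
    """
    seq = sequence.upper()
    promoter = seq.find(T7_PROMOTER)
    if promoter == -1:
        return None
    # The RBS must start after the end of the promoter
    rbs = seq.find(RBS_CORE, promoter + len(T7_PROMOTER))
    if rbs == -1:
        return None
    # The start codon must follow the RBS
    start = seq.find(START_CODON, rbs + len(RBS_CORE))
    if start == -1:
        return None
    return {"t7_promoter": promoter, "rbs": rbs, "start_codon": start}


# Illustrative candidate: promoter + short 5'UTR containing an RBS + ATG
candidate = "TAATACGACTCACTATAGGG" + "AAATAAGGAGGTAAACAT" + "ATG"
print(find_elements(candidate))
```

Exact matching is deliberately strict. Functional RBS variants differ from the consensus, so a practical check would probably allow mismatches and also score the spacing between the RBS and the start codon.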
from evo2_sequence_designer import (
    Evo2Client, GLMClient, ThreeStageDesigner,
    DesignParameters
)
# Config classes are imported from the model modules, as in the client examples below
from evo2_sequence_designer.models.evo2_client import Evo2Config
from evo2_sequence_designer.models.glm_client import GLMConfig
# Initialize clients
evo2_client = Evo2Client(Evo2Config(api_key="your_nvidia_api_key"))
glm_client = GLMClient(GLMConfig(api_key="your_glm_api_key"))
# Create designer
designer = ThreeStageDesigner(evo2_client, glm_client)
# Run design
parameters = DesignParameters(
    initial_prompt="TAATACGACTCACTATAGGG",
    target_length=99
)
project = designer.run_complete_design(parameters)
print(f"Final sequence: {project.final_sequence}")

from evo2_sequence_designer.models.evo2_client import Evo2Client, Evo2Config
config = Evo2Config(api_key="your_nvidia_api_key")
client = Evo2Client(config)
response = client.generate_sequence(
    prompt="TAATACGACTCACTATAGGG",
    max_tokens=100,
    temperature=0.7
)

from evo2_sequence_designer.models.glm_client import GLMClient, GLMConfig
config = GLMConfig(api_key="your_glm_api_key")
client = GLMClient(config)
analysis = client.analyze_sequence(
    sequence="ATCGATCGATCG",
    analysis_type="optimization"
)

from evo2_sequence_designer.analysis import SequenceAnalyzer
analyzer = SequenceAnalyzer()
analysis = analyzer.analyze_sequence("ATCGATCGATCG")
print(f"Quality score: {analysis.quality_score}")
- NVIDIA EVO2: https://build.nvidia.com/arc/evo2-40b (DNA sequence generation)
- Zhipu GLM: https://open.bigmodel.cn/ (Biological analysis)
evo2-sequence-designer/
├── main.py                      # Program entry
├── src/evo2_sequence_designer/
│   ├── main.py                  # Main module
│   ├── models/                  # Model interfaces
│   │   ├── evo2_client.py       # EVO2 client
│   │   └── glm_client.py        # GLM client
│   ├── analysis/                # Sequence analysis
│   ├── design/                  # Design workflow
│   ├── agents/                  # Agent system
│   └── demos/                   # Demo modules
├── api_config.json              # API configuration
└── pyproject.toml
- Python 3.11+
- uv package manager
- NVIDIA API access
- Zhipu AI API access
- biopython: Bioinformatics analysis
- requests: HTTP requests
- zhipuai: Zhipu AI SDK
- rich: CLI interface
- typer: Command line framework
Extend the BaseAgent class in the agents/ directory to implement custom optimization logic.
Extend SequenceAnalyzer in the analysis/ directory to add new sequence analysis features.
Add new model clients in the models/ directory, following the existing interface pattern.
Add new commands to the main application using the typer framework.
- API call caching
- Batch processing
- Intelligent retry mechanism
- Asynchronous requests
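To illustrate one item from the list above, a retry mechanism for API calls is often built on exponential backoff. The sketch below shows the general technique only and does not reflect how the project implements it; `call_with_retry` and its parameters are hypothetical names.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on any exception with exponential backoff.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... between attempts
    and re-raises the last exception once max_attempts is exhausted.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            # Out of attempts: propagate the original error
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the backoff schedule can be tested without waiting. A production version would typically retry only on transient failures such as timeouts or rate-limit responses, rather than on every exception.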
- Fork the project
- Create a feature branch
- Commit changes
- Submit a Pull Request
MIT License. See the LICENSE file for details.
- NVIDIA: EVO2-40B model API
- Zhipu AI: GLM-4.5-x model support
- BioPython: Bioinformatics tools
Note: Valid API keys are required. Please comply with each service's terms of use.