GewoonJaap · Copilot · Nov 4, 2025
diff --git a/.dev.vars.example b/.dev.vars.example
@@ -2,8 +2,13 @@
 
 # Required: OAuth2 credentials JSON from Gemini CLI authentication
 # Get this by running `gemini auth` and copying the contents of ~/.gemini/oauth_creds.json
+# Supports single account (object) or multiple accounts (array) for rate limit avoidance
+# Single account example:
 GCP_SERVICE_ACCOUNT={"access_token":"ya29.a0AS3H6Nx...","refresh_token":"1//09FtpJYpxOd...","scope":"https://www.googleapis.com/auth/cloud-platform ...","token_type":"Bearer","id_token":"eyJhbGciOiJSUzI1NiIs...","expiry_date":1750927763467}
 
+# Multiple accounts example (for rate limit avoidance):
+# GCP_SERVICE_ACCOUNT=[{"access_token":"ya29...","refresh_token":"1//...","scope":"...","token_type":"Bearer","id_token":"eyJ...","expiry_date":1750927763467},{"access_token":"ya29...","refresh_token":"1//...","scope":"...","token_type":"Bearer","id_token":"eyJ...","expiry_date":1750927763467}]
+
 # Optional: Google Cloud Project ID (auto-discovered if not set)
 # GEMINI_PROJECT_ID=your-project-id
 
@@ -12,6 +17,11 @@ GCP_SERVICE_ACCOUNT={"access_token":"ya29.a0AS3H6Nx...","refresh_token":"1//09Ft
 # Example: sk-1234567890abcdef1234567890abcdef
 OPENAI_API_KEY=sk-your-secret-api-key-here
 
+# Optional: Enable multi-account rotation for rate limit avoidance (set to "true" to enable)
+# When enabled with multiple accounts in GCP_SERVICE_ACCOUNT, the system automatically
+# rotates to another account when a rate limit is hit, as long as one is still available
+ENABLE_MULTI_ACCOUNT=true
+
 # Optional: Enable fake thinking output for thinking models (set to "true" to enable)
 # When enabled, models marked with thinking: true will generate synthetic reasoning text
 # before providing their actual response, similar to OpenAI's o3 model behavior
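
The object-or-array handling described in the comments above could be sketched roughly as follows. This is a minimal illustration, not code from this PR: the names `OAuthCredentials`, `parseAccounts`, and `isMultiAccountEnabled` are hypothetical, and checking only `refresh_token` and requiring more than one account are assumptions.

```typescript
// Field names mirror the credential JSON shown in .dev.vars.example.
interface OAuthCredentials {
  access_token: string;
  refresh_token: string;
  scope: string;
  token_type: string;
  id_token: string;
  expiry_date: number;
}

// Hypothetical helper: accepts a single object or an array and always returns an array.
function parseAccounts(raw: string): OAuthCredentials[] {
  const parsed: unknown = JSON.parse(raw);
  const list: unknown[] = Array.isArray(parsed) ? parsed : [parsed];
  if (list.length === 0) {
    throw new Error("GCP_SERVICE_ACCOUNT contains no accounts");
  }
  list.forEach((entry, index) => {
    const candidate = entry as Partial<OAuthCredentials> | null;
    // Assumption: a refresh token is the minimum needed to use an account.
    if (typeof candidate !== "object" || candidate === null || typeof candidate.refresh_token !== "string") {
      throw new Error(`Account ${index} is missing a refresh_token`);
    }
  });
  return list as OAuthCredentials[];
}

// Assumption: rotation only makes sense when the flag is "true" and there is more than one account.
function isMultiAccountEnabled(accounts: OAuthCredentials[], flag: string | undefined): boolean {
  return flag === "true" && accounts.length > 1;
}
```

With this shape, a single-object value behaves like an array of one account, so the existing single-account setup needs no changes.
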

diff --git a/MULTI_ACCOUNT_TESTING.md b/MULTI_ACCOUNT_TESTING.md
@@ -0,0 +1,168 @@
+# Multi-Account Support Testing Guide
+
+This document explains how to test the multi-account support feature for rate limit avoidance.
+
+## Setup for Testing
+
+### 1. Prepare Multiple Google Accounts
+
+You'll need at least 2 Google accounts authenticated with Gemini CLI:
+
+```bash
+# Account 1
+gemini auth
+cp ~/.gemini/oauth_creds.json account1.json
+
+# Account 2 (use a different Google account)
+rm ~/.gemini/oauth_creds.json
+gemini auth
+cp ~/.gemini/oauth_creds.json account2.json
+```
+
+### 2. Create Multi-Account Configuration
+
+Combine the credentials into a JSON array:
+
+```bash
+# Merge both credential files into one JSON array.
+# jq -s (--slurp) reads every input file and wraps the results in an array.
+jq -s '.' account1.json account2.json > combined.json
+
+# Sanity check: this should print 2
+jq 'length' combined.json
+
+# Minify for the environment variable (single line, no extra whitespace)
+jq -c '.' combined.json > credentials.json
+```
+
+### 3. Configure Environment Variables
+
+In your `.dev.vars` file:
+
+```bash
+# Multi-account credentials
+GCP_SERVICE_ACCOUNT=<paste content from credentials.json>
+
+# Enable multi-account rotation
+ENABLE_MULTI_ACCOUNT=true
+
+# Optional: Your API key
+OPENAI_API_KEY=sk-your-test-key
+```
+
+## Testing Scenarios
+
+### Test 1: Basic Account Rotation
+
+1. Start the development server:
+   ```bash
+   npm run dev
+   ```
+
+2. Make a request to the chat completions endpoint:
+   ```bash
+   curl -X POST http://localhost:8787/v1/chat/completions \
+     -H "Content-Type: application/json" \
+     -H "Authorization: Bearer sk-your-test-key" \
+     -d '{
+       "model": "gemini-2.5-flash",
+       "messages": [{"role": "user", "content": "Hello"}]
+     }'
+   ```
+
+3. Check the console logs - you should see:
+   - `Loaded 2 accounts. Multi-account mode: true`
+   - `Found available account at index X`
+
+### Test 2: Rate Limit Fallback
+
+To test rate limit handling:
+
+1. Send enough requests to hit the rate limit on the first account (index 0)
+2. Confirm that the system automatically switches to the second account (index 1)
+3. Check logs for:
+   - `Got rate limit error (429) for account 0`
+   - `Marking account 0 as rate-limited`
+   - `Switching from account 0 to account 1`
+
+### Test 3: Account Health Tracking
+
+1. Check the KV storage for account health data:
+   ```bash
+   wrangler kv key list --binding=GEMINI_CLI_KV  # older Wrangler releases: wrangler kv:key list
+   ```
+
+2. You should see keys like:
+   - `oauth_token_cache_account_0`
+   - `oauth_token_cache_account_1`
+   - `account_rotation_state`
+   - `account_health_0` (if an account was rate-limited)
+
+### Test 4: Single Account Compatibility
+
+To verify backward compatibility:
+
+1. Configure a single account (not an array):
+   ```bash
+   GCP_SERVICE_ACCOUNT={"access_token":"...","refresh_token":"...","scope":"...","token_type":"Bearer","id_token":"...","expiry_date":...}
+   ENABLE_MULTI_ACCOUNT=false
+   ```
+
+2. The system should work exactly as before with no multi-account logic
+
+## Monitoring in Production
+
+When deployed to Cloudflare Workers, monitor the logs:
+
+```bash
+wrangler tail
+```
+
+Look for:
+- Account rotation events
+- Rate limit detections
+- Successful failovers
+- Account health updates
+
+## Expected Behavior
+
+### Normal Operation
+- Requests use accounts in round-robin rotation
+- Each account's token is cached independently
+- Rotation state is persisted in KV storage
+
+### Rate Limit Scenario
+1. Request fails with HTTP 429 or 503
+2. Current account is marked as rate-limited
+3. System switches to next available account
+4. Request is retried (up to 3 times)
+5. The rate-limited account stays in cooldown for 60 seconds before it is tried again
+
+### All Accounts Rate-Limited
+- System will return an error after exhausting all accounts
+- Error message: "All accounts are rate-limited. Please try again later."
+
+## Troubleshooting
+
+### Issue: "Authentication failed"
+- Verify all accounts have valid refresh tokens
+- Check that the credentials are formatted as a valid JSON array
+- Ensure `ENABLE_MULTI_ACCOUNT=true` is set
+
+### Issue: Not switching accounts on rate limit
+- Verify `ENABLE_MULTI_ACCOUNT=true` is set
+- Check that you have multiple accounts in the array
+- Review worker logs for error messages
+
+### Issue: Accounts not recovering from rate limit
+- Check the KV storage TTL settings (default cooldown: 60 seconds)
+- Verify account health keys expire properly
+- Review timestamp calculations in logs
+
+## Performance Metrics
+
+Expected improvements with N accounts:
+- Rate limit capacity: ~N × single account limit
+- Failover time: < 100ms (KV lookup + auth)
+- Additional storage: ~1KB per account in KV
+- Request overhead: Minimal (~10ms for account selection)
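
The "Expected Behavior" section above (round-robin rotation, a 60-second cooldown, and an error once every account is rate-limited) can be illustrated with the sketch below. It is not the PR's implementation: `AccountRotator` is a hypothetical name, and an in-memory `Map` stands in for the Cloudflare KV storage the guide describes. The clock is injectable so the cooldown can be tested without waiting.

```typescript
const COOLDOWN_MS = 60_000; // 60-second cooldown, as stated in the guide

// Hypothetical class: picks accounts round-robin and skips those in cooldown.
class AccountRotator {
  private nextIndex = 0;
  // Maps account index to the timestamp (ms) at which its cooldown ends.
  private readonly limitedUntil = new Map<number, number>();
  private readonly accountCount: number;
  private readonly now: () => number;

  constructor(accountCount: number, now: () => number = () => Date.now()) {
    this.accountCount = accountCount;
    this.now = now;
  }

  // Returns the next available account index, starting from the rotation pointer.
  pick(): number {
    for (let step = 0; step < this.accountCount; step++) {
      const index = (this.nextIndex + step) % this.accountCount;
      const cooldownEnds = this.limitedUntil.get(index) ?? 0;
      if (cooldownEnds <= this.now()) {
        this.nextIndex = (index + 1) % this.accountCount;
        return index;
      }
    }
    // Error text taken from the "All Accounts Rate-Limited" section.
    throw new Error("All accounts are rate-limited. Please try again later.");
  }

  // Called after an HTTP 429 or 503: the account is skipped until the cooldown ends.
  markRateLimited(index: number): void {
    this.limitedUntil.set(index, this.now() + COOLDOWN_MS);
  }
}
```

A test can pass a fake clock, for example `new AccountRotator(2, () => fakeTime)`, and advance `fakeTime` past 60 seconds to check that a rate-limited account becomes available again.
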
diff --git a/README.md b/README.md
@@ -19,6 +19,7 @@ Transform Google's Gemini models into OpenAI-compatible endpoints using Cloudfla
 - 🆓 **Free Tier Access** - Leverage Google's free tier through Code Assist API
 - 📡 **Real-time Streaming** - Server-sent events for live responses with token usage
 - 🎭 **Multiple Models** - Access to latest Gemini models including experimental ones
+- 🔀 **Multi-Account Support** - Automatic rotation between multiple accounts to avoid rate limiting
 
 ## 🤖 Supported Models
 
@@ -100,6 +101,55 @@ You need OAuth2 credentials from a Google account that has accessed Gemini. The
    }
    ```
 
+#### Multi-Account Setup (Optional - for Rate Limit Avoidance)
+
+To avoid rate limiting, you can configure multiple Google accounts. The system will automatically rotate between accounts when one hits a rate limit.
+
+1. **Authenticate your first account**:
+   - Run `gemini auth` and log in with your first Google account
+   - Navigate to the credentials file location:
+     - **Windows:** `C:\Users\USERNAME\.gemini\oauth_creds.json`
+     - **macOS/Linux:** `~/.gemini/oauth_creds.json`
+   - Copy the entire contents to a file named `account1.json`
+
+2. **Authenticate your second account** (repeat for more accounts):
+   - Delete the existing `~/.gemini/oauth_creds.json` file
+   - Run `gemini auth` again and log in with your second Google account
+   - Copy the new credentials to `account2.json`
+
+3. **Combine credentials into an array**:
+   Instead of a single credential object, use a JSON array:
+   ```json
+   [
+     {
+       "access_token": "ya29.a0AS3H6Nx...",
+       "refresh_token": "1//09FtpJYpxOd...",
+       "scope": "https://www.googleapis.com/auth/cloud-platform ...",
+       "token_type": "Bearer",
+       "id_token": "eyJhbGciOiJSUzI1NiIs...",
+       "expiry_date": 1750927763467
+     },
+     {
+       "access_token": "ya29.a0Bb2H8Mx...",
+       "refresh_token": "1//09GtqKZqyPe...",
+       "scope": "https://www.googleapis.com/auth/cloud-platform ...",
+       "token_type": "Bearer",
+       "id_token": "eyJhbGciOiJSUzI1NiIt...",
+       "expiry_date": 1750927763467
+     }
+   ]
+   ```
+
+4. **Enable multi-account mode**:
+   Set the `ENABLE_MULTI_ACCOUNT` environment variable to `"true"` in your `.dev.vars` file (see Step 3: Environment Setup below).
+
+**How it works:**
+- The system tracks account health and rate limit status in Cloudflare KV storage
+- When a request fails with a rate limit error (HTTP 429 or 503), the system automatically switches to the next available account
+- Rate-limited accounts are placed on cooldown (60 seconds by default) before being tried again
+- Accounts are rotated in a round-robin fashion for optimal distribution
+- Up to 3 retry attempts are made before giving up
+
 ### Step 2: Create KV Namespace
 
 ```bash
@@ -118,6 +168,8 @@ kv_namespaces = [
 ### Step 3: Environment Setup
 
 Create a `.dev.vars` file:
+
+**Single Account (Basic Setup):**
 ```bash
 # Required: OAuth2 credentials JSON from Gemini CLI authentication
 GCP_SERVICE_ACCOUNT={"access_token":"ya29...","refresh_token":"1//...","scope":"...","token_type":"Bearer","id_token":"eyJ...","expiry_date":1750927763467}
@@ -131,6 +183,18 @@ GCP_SERVICE_ACCOUNT={"access_token":"ya29...","refresh_token":"1//...","scope":"
 OPENAI_API_KEY=sk-your-secret-api-key-here
 ```
 
+**Multiple Accounts (Rate Limit Avoidance):**
+```bash
+# Required: OAuth2 credentials JSON array for multiple accounts
+GCP_SERVICE_ACCOUNT=[{"access_token":"ya29...","refresh_token":"1//...","scope":"...","token_type":"Bearer","id_token":"eyJ...","expiry_date":1750927763467},{"access_token":"ya29...","refresh_token":"1//...","scope":"...","token_type":"Bearer","id_token":"eyJ...","expiry_date":1750927763467}]
+
+# Enable multi-account rotation (optional in general, but required for automatic account switching)
+ENABLE_MULTI_ACCOUNT=true
+
+# Optional: API key for authentication
+OPENAI_API_KEY=sk-your-secret-api-key-here
+```
+
 For production, set the secrets:
 ```bash
 wrangler secret put GCP_SERVICE_ACCOUNT
@@ -158,9 +222,10 @@ npm run dev
 
 | Variable | Required | Description |
 |----------|----------|-------------|
-| `GCP_SERVICE_ACCOUNT` | ✅ | OAuth2 credentials JSON string. |
+| `GCP_SERVICE_ACCOUNT` | ✅ | OAuth2 credentials JSON string. Supports single account (object) or multiple accounts (array) for rate limit avoidance. |
 | `GEMINI_PROJECT_ID` | ❌ | Google Cloud Project ID (auto-discovered if not set). |
 | `OPENAI_API_KEY` | ❌ | API key for authentication. If not set, the API is public. |
+| `ENABLE_MULTI_ACCOUNT` | ❌ | Enable multi-account rotation for rate limit avoidance (set to `"true"`). Only works when `GCP_SERVICE_ACCOUNT` contains an array of accounts. |
 
 #### Thinking & Reasoning
 
@@ -217,11 +282,26 @@ npm run dev
 - Only applies to supported model pairs (currently: pro → flash).
 - Works for both streaming and non-streaming requests.
 
+**Multi-Account Support:**
+- When `ENABLE_MULTI_ACCOUNT` is set to `"true"` and `GCP_SERVICE_ACCOUNT` contains multiple accounts (JSON array), the system automatically rotates between accounts to avoid rate limiting.
+- **Intelligent Rotation**: Accounts are rotated in a round-robin fashion, with automatic fallback when one account hits a rate limit.
+- **Account Health Tracking**: The system tracks which accounts are rate-limited and automatically skips them until the cooldown period expires (60 seconds default).
+- **Seamless Failover**: When a request fails with HTTP 429 or 503, the system immediately switches to the next available account and retries the request (up to 3 attempts).
+- **Stateless & Distributed**: Account rotation state is stored in Cloudflare KV and shared across edge locations and worker instances. KV is eventually consistent, so updates can take up to about a minute to reach other locations.
+- **Token Caching**: Each account's OAuth token is cached independently in KV storage for optimal performance.
+- **Works with Auto Model Switching**: Multi-account rotation and auto model switching can be used together for maximum resilience.
+
+**Benefits of Multi-Account:**
+- **Increased throughput**: Effectively multiplies your rate limit capacity by the number of accounts.
+- **Uninterrupted service**: Automatic failover keeps requests succeeding as long as at least one account is not rate-limited.
+- **Simple setup**: Just authenticate multiple Google accounts and combine their credentials into an array.
+- **Production-ready**: Designed for serverless Cloudflare Workers with distributed state management.
+
 ### KV Namespaces
 
 | Binding | Purpose |
 |---------|---------|
-| `GEMINI_CLI_KV` | Token caching and session management |
+| `GEMINI_CLI_KV` | Token caching, session management, account rotation state, and account health tracking |
 
 ## 🚨 Troubleshooting
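
The "Seamless Failover" behavior documented in the README hunk above (retry on HTTP 429 or 503, up to 3 attempts) might look roughly like this sketch. The names `UpstreamResult`, `Send`, `isRateLimitStatus`, and `sendWithFailover` are hypothetical. Capping the attempts at the number of accounts, so the same account is never retried within one request, is an assumption rather than confirmed behavior.

```typescript
// Hypothetical response shape used only for this sketch.
interface UpstreamResult {
  status: number;
  body: string;
}

// Hypothetical callback that performs the upstream request with the given account.
type Send = (accountIndex: number) => Promise<UpstreamResult>;

const MAX_ATTEMPTS = 3; // "up to 3 attempts", as stated in the README

// The README treats both 429 and 503 as rate-limit signals.
function isRateLimitStatus(status: number): boolean {
  return status === 429 || status === 503;
}

async function sendWithFailover(accountCount: number, startIndex: number, send: Send): Promise<UpstreamResult> {
  if (accountCount < 1) {
    throw new Error("No accounts configured");
  }
  // Assumption: never retry the same account within a single request.
  const attempts = Math.min(MAX_ATTEMPTS, accountCount);
  for (let attempt = 0; attempt < attempts; attempt++) {
    const index = (startIndex + attempt) % accountCount;
    const result = await send(index);
    if (!isRateLimitStatus(result.status)) {
      return result; // success, or an error unrelated to rate limiting
    }
  }
  throw new Error("All accounts are rate-limited. Please try again later.");
}
```

Errors other than 429 and 503 are returned to the caller unchanged, so failover does not mask authentication or request problems.
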