Add OpenRouter multi-model and multi-language support (initial scaffolding with GPT-OSS-120B) #38
bestian wants to merge 108 commits into Jigsaw-Code:main
Conversation
json_schema: {
name: "response",
strict: true, // setting this to false allows a looser format
schema: schema
}
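For context, the fragment above is the `json_schema` member of a `response_format` object in an OpenAI-compatible chat completion request. A minimal sketch of how such an object might be assembled follows. The helper name `buildResponseFormat` and the example schema are hypothetical and are not taken from this PR.

```typescript
// Hypothetical helper (not from this PR): wraps a JSON Schema in the
// `response_format` shape used for structured output.
type JsonSchema = Record<string, unknown>;

interface JsonSchemaResponseFormat {
  type: "json_schema";
  json_schema: { name: string; strict: boolean; schema: JsonSchema };
}

function buildResponseFormat(schema: JsonSchema, strict = true): JsonSchemaResponseFormat {
  return {
    type: "json_schema",
    json_schema: {
      name: "response",
      strict, // false allows a looser format
      schema,
    },
  };
}

// Example schema, made up for illustration only.
const topicListSchema: JsonSchema = {
  type: "object",
  properties: {
    topics: {
      type: "array",
      items: {
        type: "object",
        properties: { name: { type: "string" } },
        required: ["name"],
        additionalProperties: false,
      },
    },
  },
  required: ["topics"],
  additionalProperties: false,
};

const responseFormat = buildResponseFormat(topicListSchema);
console.log(JSON.stringify(responseFormat, null, 2));
```

Passing `false` as the second argument yields the looser, non-strict variant mentioned in the comment.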
I'm also interested in this direction, @bestian, and am excited for the vTaiwan process using sensemaking-tools! I wonder if it could be chunked into smaller pieces for easier review and discussion of integration potential? It seems perhaps there's:
If the maintainer found it easier to weigh in on the components of this, I'd be happy to help submit smaller, more atomic pull requests in the interest of seeing these arrive upstream 🙏
@patcon I think your approach will be much better for maintenance. It’s beyond my experience and skill, but if you can divide this PR into chunks or components, I would be very grateful for your contribution.
I sent you an invitation to collaborate, please check it: https://github.com/bestian/sensemaking-tools Perhaps you can create new branches in this repo to chunk this PR into smaller pieces for easier review? Thank you a lot.
Thanks! Going to wait to hear back from one of the maintainers, as I don't want to presume these are directions they would like to take things in 🙏
library/README_intro.md
Outdated
@@ -0,0 +1,253 @@
# **Sensemaker by Jigsaw \- A Google AI Proof of Concept**
Is this file just a copy of the original README?
metasoarous left a comment
Thanks for your submission @bestian!
Overall, this is a helpful addition to the project. A few changes would be helpful before we merge this:
- Documentation & Artifacts: There are a lot of new markdown files (e.g. design/branch_todo.md, design/工程設計.md, and various INTEGRATION_SUMMARY.md and IMPLEMENTATION_GUIDE.md files), some of which seem to be AI planning artifacts. Can you please:
- Remove any planning documents (or scaffolding code) which aren't necessary to understand or run the codebase.
- For any technical documentation you think is important to keep (like architecture designs), please translate them into English so they are accessible to all maintainers. If it's easier, feel free to just remove them for now!
- Ensure code comments and logging statements are also in English for maintainability.
- (Optional) If you're interested, we can discuss how to best organize non-English usage documentation in a future PR, but that's not a requirement for this merge.
We appreciate the effort to bring OpenRouter and multi-language support to Sensemaker!
the old implementation here might be better. Since this is an abstract base class, we don't want to specify the default in the constructor function, since this may often get overridden. Jigsaw-Code#38 (comment)
It shouldn't be necessary to check this explicitly here, since this gets covered by the outputSchema Jigsaw-Code#38 (comment)
Thanks for the review @metasoarous! I’ve addressed the review comments as much as possible and tested the changes locally. I did attempt to use AI to translate all comments into English, but unfortunately it also modified some functional code, so I had to revert those changes. As a result, some comments are still in Chinese for now. Please feel free to edit or ignore them if needed. Could you please take another look when you have time?
This commit fixes compatibility issues when using Anthropic models (like Claude Opus 4.6) through the OpenRouter API.
Changes:
- Use json_object mode for Anthropic models instead of strict json_schema (Anthropic models don't support the strict parameter)
- Add 'topics' to wrapper key detection to handle Anthropic's response format
- Add convert_polis.py utility script to convert Polis CSV exports to the required sensemaker format
Testing:
- Successfully tested with anthropic/claude-opus-4.6 model
- Generated comprehensive summary of 31-statement Polis conversation
- All output formats (HTML, MD, JSON, CSV) working correctly
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
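The model-dependent choice described in this commit message could look roughly like the sketch below. It is not the PR's actual code: the function names are hypothetical, and detecting Anthropic models by an `anthropic/` prefix on the model ID is an assumption based on the `anthropic/claude-opus-4.6` ID mentioned above.

```typescript
type JsonSchema = Record<string, unknown>;

type ResponseFormat =
  | { type: "json_object" }
  | { type: "json_schema"; json_schema: { name: string; strict: boolean; schema: JsonSchema } };

// Assumption: Anthropic models are recognizable by the "anthropic/" ID prefix.
function isAnthropicModel(modelName: string): boolean {
  return modelName.toLowerCase().startsWith("anthropic/");
}

// Hypothetical helper: fall back to plain JSON mode where strict schemas are unsupported.
function chooseResponseFormat(modelName: string, schema: JsonSchema): ResponseFormat {
  if (isAnthropicModel(modelName)) {
    return { type: "json_object" };
  }
  return {
    type: "json_schema",
    json_schema: { name: "response", strict: true, schema },
  };
}

console.log(chooseResponseFormat("anthropic/claude-opus-4.6", {}).type); // json_object
console.log(chooseResponseFormat("openai/gpt-oss-120b", {}).type); // json_schema
```

With `json_object` mode the schema is not enforced by the API, so the caller would still need to validate the parsed response against the expected shape.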
log:
/Users/bestian/Documents/GitHub/sensemaking-tools/library/src/models/openrouter_model.ts:97
throw new Error('Response format error: expected array but got ' + typeof processedData);
^
Error: Response format error: expected array but got object
at OpenRouterModel. (/Users/bestian/Documents/GitHub/sensemaking-tools/library/src/models/openrouter_model.ts:97:17)
at Generator.next ()
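The error in the log above arises when the model returns an object where the code expects an array. A minimal sketch of tolerant unwrapping follows; the function name `unwrapToArray` is hypothetical, and the wrapper keys `items` and `topics` are the two named in this PR's commit messages.

```typescript
// Wrapper keys named in this PR's commit messages; the list is illustrative.
const WRAPPER_KEYS = ["items", "topics"];

// Hypothetical helper: accept a bare array or an object wrapping one.
function unwrapToArray(data: unknown): unknown[] {
  if (Array.isArray(data)) {
    return data;
  }
  if (data !== null && typeof data === "object") {
    const record = data as Record<string, unknown>;
    for (const key of WRAPPER_KEYS) {
      const candidate = record[key];
      if (Array.isArray(candidate)) {
        return candidate;
      }
    }
  }
  // Same message as in the log above when nothing array-like is found.
  throw new Error("Response format error: expected array but got " + typeof data);
}

console.log(unwrapToArray([{ name: "Transport" }]).length); // 1
console.log(unwrapToArray({ topics: [{ name: "Transport" }, { name: "Housing" }] }).length); // 2
```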
feat(openrouter): Add MiniMax M2.5 support; keep Anthropic and GPT-OSS-120b unchanged
…debug logs
- Fix getRequest() in vertex_model.ts: use proper GenerateContentRequest systemInstruction field instead of incorrectly placing language prefix as "system" role in contents array
- Add -l/--language CLI option to runner.ts for Vertex AI runner, bringing feature parity with runner_openrouter.ts
- Remove all commented-out debug console.log statements across 7 files
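The getRequest() fix described above can be illustrated with a small sketch. The interfaces below are local stand-ins that mirror the general shape of a generate-content request; they are assumptions for illustration and are not imported from the Vertex AI SDK. The function name `buildRequest` is also hypothetical.

```typescript
// Local stand-in types (assumption): not the SDK's own definitions.
interface Part {
  text: string;
}
interface Content {
  role: string;
  parts: Part[];
}
interface GenerateContentRequestLike {
  contents: Content[];
  systemInstruction?: Content;
}

// Hypothetical helper: the language prefix goes into systemInstruction,
// and `contents` carries only the user turn.
function buildRequest(prompt: string, languagePrefix: string): GenerateContentRequestLike {
  return {
    systemInstruction: { role: "system", parts: [{ text: languagePrefix }] },
    contents: [{ role: "user", parts: [{ text: prompt }] }],
  };
}

const req = buildRequest("Summarize these comments.", "Respond in Traditional Chinese.");
console.log(req.contents.map((c) => c.role)); // [ 'user' ]
```

Keeping the language instruction out of `contents` means the conversation turns contain only user and model roles, which is the distinction the commit message draws.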
…cleanup Fix Vertex AI systemInstruction, add language flag, clean up debug logs
- Update .env.example
- Update .gitignore
- Draft the engineering design and branch todo list to set a direction; it will be revised during implementation. work on bestian#3
- Set the basic engineering direction
- Update 工程設計.md
- Create scaffold folder and hello_World, work on bestian#2
- use .ts
- Update package-lock.json
- Use a small test program to first make sure the openai-sdk can be connected, close bestian#2
- Make simple_ai_prompt.ts support the different response format of google/gemini-2.5-pro, work on bestian#5
- Test the text output formats of several major models, work on bestian#5
- Create text conversion function, work on bestian#6
- Experiment with structured output, work on bestian#7
- Update branch_todo.md
- Copy the core type definitions and implement the core OpenRouter model component, work on bestian#8
- Build a scaffold version of the README example program that runs on OpenRouter, work on bestian#9
- Update test.md
- Use gpt-oss-120b by default, work on bestian#10
- Create example/tutorial.ts and related instructions in README.md
- Set structured output to strict mode
  json_schema: {
  name: "response",
  strict: true, // setting this to false allows a looser format
  schema: schema
  }
- Move the .env location to the root directory
- Replicate runner_utils and the first runner.ts, work on bestian#1
- work on #
- Find every file that reads .env and make it read system environment variables first, falling back to the .env file only when they are not set, work on bestian#12
- Update .gitignore
- set "stream: false"
- init multilang support, before test, work on bestian#14
- debug and work on bestian#14
- debug parameter passing issue and add logs and close bestian#15
- Initialize packaging
- debug passing of the language setting parameter, and restrict the model to non-streaming.
- remove dist-worker
- Add safeguards
- Make the program default to, and able to handle, stream
- Update openrouter_model.ts
- debug streamed response processing
- debug JSON fix logic
- Make response handling more flexible to fit OpenRouter's response format
- Modify the categorization and model handling logic to accommodate streaming responses
- Create fix_csv_columns_simple.py
- Create and revise the CSV fixer/converter for files from polis.tw and pol.is
- Prepare packaging via npm pack so the backend project can test it
- Do not use the TypeBox compiler in the environment, so the package can be installed on Cloudflare Workers
- Adjust how environment variables are read to suit CF_worker
- Add flexibility when computing statistics: check whether the value is a VoteTally instance (has a getTotalCount method); if not, handle it as a plain object
- Update openrouter_model.ts
- Sometimes the LLM returns the {"items": [...]} format, but the code expects a bare array. This change makes the handling more flexible.
- Update env_loader.ts
- add max_tokens to prevent truncation error
- raise maxRetries from 3 to 5
- close bestian#16
- close bestian#17
- reduce logs
- Fix the overly strict validation logic in subtopic learning
- Fix validation logic, before realdata test
- Add Spanish, Japanese, and Simplified Chinese support, and strengthen the wording of the language instruction.
- test use system prompt. work on bestian#15
- Separate the system instruction from the user instruction, work on bestian#15 bestian#24
- Make the error-prone block prompts themselves multilingual, work on bestian#15
- Fix executeConcurrently, work on bestian#30
- Modify summarization and overview so they can generate multilingual content, work on bestian#28
- Make the LearnTopic stage prompt multilingual
- Optimize JSON repair logic
- Remove the logic that triggers a retry when all topics are detected to have generic names ()
- Update categorization.ts
- Revert "Update categorization.ts" This reverts commit 76c1bd1.
- Revert "Remove the logic that triggers a retry when all topics are detected to have generic names ()" This reverts commit 0812246.
- Revert "Optimize JSON repair logic" This reverts commit 86e4846.
- Revert "Make the LearnTopic stage prompt multilingual" This reverts commit 999d92f.
- Make the prompts for the Learn Topic stage and the differences-of-opinion analysis multilingual
- Update README.md
- Handle the case where no new topics are learned
- Make the common-ground prompt multilingual as well, before realdata test
- Restore the missing markdown formatting hint in the prompt
- Fix incorrect multilingual prompts
- Extract the getSubtopicSummary prompt into multilingual form
- Add experimental logic to extract JSON from markdown, to improve LLM call accuracy and reduce the number of retries.
- Use a looser check when validating whether a topic exists, to handle possible format variations
- Change "XX statements" to a multilingual template
- Make the "moderately low alignment" part multilingual
- Align the Japanese prompts
- Add the missing functions and multilingual text
- Make "Other" in the Summary results multilingual, before test
- Make the static text "statements" at the end of the report multilingual
- Optimize the OpenRouter model's JSON repair logic
- debug: optimize JSON repair logic // if the text starts with a square bracket, it is an array and the array structure must be preserved
- Modify the Learn Subtopic prompts so the LLM must not return an empty array or blank content
- Revert "debug: optimize JSON repair logic" This reverts commit d6d00ce.
- Revert "Optimize the OpenRouter model's JSON repair logic" This reverts commit e214eba.
- modify test files and types.ts to make sure all tests in /library can pass
- rename csv_fixer_for_polis_tw
- Merge README, work on bestian#41
- remove design and scaffold
- remove design notes
- Update model.ts the old implementation here might be better. Since this is an abstract base class, we don't want to specify the default in the constructor function, since this may often get overridden. Jigsaw-Code#38 (comment)
- Update categorization.ts It shouldn't be necessary to check this explicitly here, since this gets covered by the outputSchema Jigsaw-Code#38 (comment)
- Update types.ts
- Add support for Anthropic models in OpenRouter integration
  This commit fixes compatibility issues when using Anthropic models (like Claude Opus 4.6) through the OpenRouter API.
  Changes:
  - Use json_object mode for Anthropic models instead of strict json_schema (Anthropic models don't support the strict parameter)
  - Add 'topics' to wrapper key detection to handle Anthropic's response format
  - Add convert_polis.py utility script to convert Polis CSV exports to the required sensemaker format
  Testing:
  - Successfully tested with anthropic/claude-opus-4.6 model
  - Generated comprehensive summary of 31-statement Polis conversation
  - All output formats (HTML, MD, JSON, CSV) working correctly
  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- debug open_router_model
- Experimentally add MiniMax M2.5 compatibility; the return value handling is currently wrong and needs fixing. work on bestian#47
  log:
  /Users/bestian/Documents/GitHub/sensemaking-tools/library/src/models/openrouter_model.ts:97
  throw new Error('Response format error: expected array but got ' + typeof processedData);
  ^
  Error: Response format error: expected array but got object
  at OpenRouterModel. (/Users/bestian/Documents/GitHub/sensemaking-tools/library/src/models/openrouter_model.ts:97:17)
  at Generator.next ()
- debug, MiniMax M2.5 compatibility now produces data, work on bestian#47
- Fix Vertex AI systemInstruction, add language flag to runner, remove debug logs
  - Fix getRequest() in vertex_model.ts: use proper GenerateContentRequest systemInstruction field instead of incorrectly placing language prefix as "system" role in contents array
  - Add -l/--language CLI option to runner.ts for Vertex AI runner, bringing feature parity with runner_openrouter.ts
  - Remove all commented-out debug console.log statements across 7 files
- Update runner.ts, add the output language to the example section
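Among the commits listed above is experimental logic for extracting JSON from markdown, intended to reduce retries when a model wraps its answer in a code fence. The sketch below shows one way that might work; it is not the PR's implementation, and the name `extractJsonFromMarkdown` is hypothetical.

```typescript
// Build the three-backtick fence marker programmatically so this example
// can itself sit inside a fenced block.
const FENCE = "`".repeat(3);

// Matches the first fenced block, with an optional "json" language tag.
const FENCE_PATTERN = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE, "i");

// Hypothetical helper: parse JSON from raw text or from the first fenced block.
function extractJsonFromMarkdown(text: string): unknown {
  const match = text.match(FENCE_PATTERN);
  const candidate = match?.[1] ?? text;
  // JSON.parse throws on invalid input; a caller could treat that as a retry signal.
  return JSON.parse(candidate.trim());
}

const reply = "Here are the topics:\n" + FENCE + "json\n[\"Transport\", \"Housing\"]\n" + FENCE;
console.log(extractJsonFromMarkdown(reply)); // [ 'Transport', 'Housing' ]
console.log(extractJsonFromMarkdown('{"ok": true}')); // { ok: true }
```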
Description:
This PR introduces an initial implementation that allows sensemaking-tools to connect to OpenRouter for multi-model support.
Key Updates
Added scaffolding for OpenRouter integration, enabling users to choose between multiple models.
Verified that GPT-OSS-120B (open-weights model) can successfully run the core Sensemaker pipeline.
Implemented multi-language support, currently covering:
Maintained scaffolding code intentionally, so that contributors and reviewers can more easily test, review, and iterate on this integration.
Notes
Next Steps
Gather feedback from maintainers on:
Testing
To use an OpenRouter model, set up the environment variables (.env) as below:
Then run the following command (with "./files/comments.csv" prepared):