PaperBot helps you search PubMed, review results, and send selected records into Zotero with a safer, researcher-friendly workflow.
It includes:
- a CLI importer: pubmed_to_zotero.py
- a Streamlit web app: app_streamlit.py
- an internal package: paperbot/
- Search PubMed from a query string
- Import into Zotero personal or group libraries
- Select or auto-create Zotero collection paths
- Separate PubMed sorting from local secondary reranking
- Optional OpenAlex-based citation and journal metrics
- Write metric snapshots into Zotero extra and filter-friendly tags
- Attach Open Access PDF link attachments when available
- Enrich existing Zotero items with citation/journal metrics in extra
- Preview before import
- Re-check duplicate status against the current Zotero state
- Import from preview cache or history using current form settings
- Avoid accidental duplicate records by DOI/PMID matching
- Link existing Zotero items into a target collection instead of recreating them
- Block ambiguous same-name collection paths
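The split between PubMed sorting and local secondary reranking, listed among the features above, can be illustrated with a small sketch. This is not PaperBot's implementation: the function name and the `citation_count` field are assumptions made for illustration, and only the `citation_count_desc` mode name comes from the CLI examples in this README.

```python
from typing import Optional


def secondary_rerank(records: list[dict], mode: Optional[str]) -> list[dict]:
    """Reorder records locally after PubMed has already sorted them.

    `records` is assumed to arrive in PubMed order. Each record is a
    hypothetical dict that may carry a `citation_count` value.
    """
    if mode is None:
        # No secondary sort requested: keep PubMed order untouched.
        return list(records)
    if mode == "citation_count_desc":
        # sorted() is stable, so records with equal counts keep their
        # PubMed order. Records without a count are placed last.
        return sorted(
            records,
            key=lambda r: (
                r.get("citation_count") is None,
                -(r.get("citation_count") or 0),
            ),
        )
    raise ValueError(f"unknown secondary sort: {mode}")
```

Because the rerank runs on the already-fetched result list, changing it does not require another PubMed request.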
python -m venv .venv
. .\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
Zotero:
- Create an API key: https://www.zotero.org/settings/keys
- Use your numeric Zotero user or group ID, not your email address
Optional APIs:
- NCBI E-utilities key/email for higher request limits
- OpenAlex email/key for citation and journal metrics
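To show where the optional NCBI key and email fit, here is a minimal sketch that builds a PubMed search URL for the E-utilities `esearch` endpoint. The helper name is hypothetical and PaperBot's own request code may differ. Only the URL is constructed, so nothing is sent over the network.

```python
from typing import Optional
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(
    query: str,
    max_results: int = 20,
    sort: str = "relevance",
    email: Optional[str] = None,
    api_key: Optional[str] = None,
) -> str:
    """Return an esearch URL for PubMed (hypothetical helper)."""
    params = {
        "db": "pubmed",
        "term": query,
        "retmax": max_results,
        "retmode": "json",
        "sort": sort,
    }
    # The email and key are optional; they are only added when provided.
    if email:
        params["email"] = email
    if api_key:
        params["api_key"] = api_key
    return f"{ESEARCH_URL}?{urlencode(params)}"
```

Calling `build_esearch_url("glioblastoma AND immunotherapy")` without credentials still yields a valid URL; supplying the key is what raises the request limit on the NCBI side.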
You can set environment variables, put the values in a local .env file, or enter them in the web UI.
Example values:
$env:ZOTERO_USER_ID="1234567"
$env:ZOTERO_API_KEY="your_zotero_api_key"
$env:ZOTERO_LIBRARY_TYPE="users"
$env:ZOTERO_LIBRARY_ID="1234567"
$env:ZOTERO_COLLECTION_PATH="ProjectA/Review"
$env:NCBI_EMAIL="you@example.com"
$env:NCBI_API_KEY="your_ncbi_api_key"
$env:OPENALEX_EMAIL="you@example.com"
$env:OPENALEX_API_KEY="your_openalex_api_key"
See .env.example for a full template.
Configuration priority in the web app, highest first:
- .paperbot_streamlit_settings.json
- system environment variables / local .env
- built-in defaults
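The layered lookup described above can be sketched as follows. This is an illustration under assumptions, not the app's code: it assumes the settings file is a flat JSON object, that any `.env` values have already been merged into the environment, and that the default shown is a placeholder. The function names are hypothetical.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional

# Placeholder default for illustration only.
DEFAULTS = {"ZOTERO_LIBRARY_TYPE": "users"}


def load_saved_settings(path: str = ".paperbot_streamlit_settings.json") -> dict:
    """Read the saved web-app settings, or return {} if the file is absent."""
    p = Path(path)
    if not p.exists():
        return {}
    return json.loads(p.read_text(encoding="utf-8"))


def resolve_setting(
    name: str,
    saved: Mapping[str, str],
    environ: Mapping[str, str] = os.environ,
    defaults: Mapping[str, str] = DEFAULTS,
) -> Optional[str]:
    """Return the first non-empty value, checking layers in priority order."""
    for layer in (saved, environ, defaults):
        value = layer.get(name)
        if value not in (None, ""):
            return value
    return None
```

Treating an empty string as "not set" lets a blank field in the saved settings fall through to the environment instead of masking it.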
Preview without writing to Zotero:
python .\pubmed_to_zotero.py --query "glioblastoma AND immunotherapy" --max-results 20 --dry-run
Import into a personal library:
python .\pubmed_to_zotero.py --query "glioblastoma AND immunotherapy" --max-results 20
Import into a group library:
python .\pubmed_to_zotero.py --query "glioblastoma AND immunotherapy" --library-type groups --library-id 123456
Import into a collection path:
python .\pubmed_to_zotero.py --query "glioblastoma AND immunotherapy" --collection-path "ProjectA/Review/2026Q2"
Use separate search and rerank controls:
python .\pubmed_to_zotero.py --query "glioblastoma AND immunotherapy" --pubmed-sort relevance --secondary-sort citation_count_desc
Run:
streamlit run .\app_streamlit.py
Then open the local URL shown in the terminal, usually http://localhost:8501.
If a local .env exists, the app now reads it automatically at startup.
The web app has two tabs:
- PubMed Import: search PubMed, preview results, and import selected records
- Zotero Enrich: scan existing Zotero items and write OpenAlex-based metrics into extra
- Enter a PubMed query
- Set result count and Zotero target
- Load existing collections when needed
- Run preview first
- Review Status and Action
- Re-check status if library, collection, or duplicate settings changed
- Import only the rows you want
- new + create new item: record is not in Zotero and can be created
- duplicate_existing + no change: record already exists in the relevant scope
- existing_add_to_collection + add existing to target collection: record exists in Zotero but is not yet in the target collection
- duplicate_incoming + no change: duplicate inside the current candidate batch
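One way these four status/action pairs could be derived is sketched below. It assumes matching on normalized DOI and PMID and uses hypothetical record dicts with `doi`, `pmid`, and `collections` fields. The order of checks is an assumption, and the real logic in PaperBot may differ.

```python
from typing import Optional


def match_keys(item: dict) -> set:
    """Build comparable identifiers from a record's DOI and PMID."""
    keys = set()
    doi = (item.get("doi") or "").strip().lower()
    pmid = str(item.get("pmid") or "").strip()
    if doi:
        keys.add(("doi", doi))
    if pmid:
        keys.add(("pmid", pmid))
    return keys


def classify_batch(
    candidates: list[dict],
    existing_items: list[dict],
    target_collection: Optional[str],
) -> list[tuple]:
    """Return a (status, action) pair for each candidate, in order.

    With target_collection=None, any library match counts as a duplicate.
    """
    seen = set()  # identifiers already met earlier in this batch
    results = []
    for cand in candidates:
        keys = match_keys(cand)
        if keys & seen:
            results.append(("duplicate_incoming", "no change"))
            continue
        seen |= keys
        matches = [it for it in existing_items if keys & match_keys(it)]
        if not matches:
            results.append(("new", "create new item"))
        elif target_collection is None or any(
            target_collection in it.get("collections", ()) for it in matches
        ):
            results.append(("duplicate_existing", "no change"))
        else:
            results.append(
                ("existing_add_to_collection", "add existing to target collection")
            )
    return results
```

Lowercasing the DOI before comparison avoids treating `10.1/A` and `10.1/a` as different records.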
- Duplicate matching uses DOI and PMID
- You can evaluate duplicates against the whole library or only the target collection
- Preview cache and history imports are re-evaluated against the current form settings before import
- Ambiguous collection paths are blocked if Zotero contains duplicate same-name collections under the same parent
- Collection path separator is /
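The path rules above (the `/` separator and the block on ambiguous same-name collections) can be sketched like this. It is a simplified illustration: collections are plain dicts with `key`, `name`, and `parentCollection` fields, and the function and exception names are hypothetical rather than taken from PaperBot.

```python
from typing import Optional


class AmbiguousCollectionPath(Exception):
    """Raised when two sibling collections share the same name."""


def split_collection_path(path: str) -> list[str]:
    """Split 'ProjectA/Review' into its non-empty, trimmed segments."""
    return [part.strip() for part in path.split("/") if part.strip()]


def resolve_collection_path(path: str, collections: list[dict]) -> Optional[str]:
    """Walk the path from the top level and return the final collection key.

    Returns None when a segment does not exist yet, so the caller can
    decide whether to create it.
    """
    parent = None
    for name in split_collection_path(path):
        hits = [
            c
            for c in collections
            # Top-level collections may carry False or None as parent.
            if c["name"] == name and (c.get("parentCollection") or None) == parent
        ]
        if len(hits) > 1:
            raise AmbiguousCollectionPath(
                f"more than one collection named {name!r} under the same parent"
            )
        if not hits:
            return None
        parent = hits[0]["key"]
    return parent
```

Refusing to guess between same-name siblings keeps an import from landing in the wrong collection.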
- paperbot/core.py: core PubMed/Zotero logic
- paperbot/web.py: Streamlit app implementation
- paperbot/cli.py: CLI entry module
- paperbot/__main__.py: package entrypoint
- pubmed_to_zotero.py: backward-compatible CLI wrapper
- app_streamlit.py: backward-compatible Streamlit wrapper
- test_pubmed_to_zotero.py: unit tests
- requirements.txt: dependencies
Backward-compatible commands still work:
python .\pubmed_to_zotero.py --query "glioblastoma AND immunotherapy" --dry-run
streamlit run .\app_streamlit.py
You can also use the package entrypoint:
python -m paperbot --query "glioblastoma AND immunotherapy" --dry-run
Local-only files are intentionally ignored:
- .paperbot_streamlit_settings.json
- .paperbot_history.json
- local virtual environments
- .env
This repository is suitable for open source, but your personal data should stay local.
Do not commit:
- API keys
- .env
- local Zotero settings/history
- personal research queries if you want them private
Run a quick syntax check:
python -m py_compile app_streamlit.py pubmed_to_zotero.py test_pubmed_to_zotero.py
Run tests:
python -m unittest -v
This project is released under the MIT License. See LICENSE.