A personal local CLI that turns a copied YouTube transcript txt file into a bilingual transcript:
[2:03] All right, this is CS50.
[2:03] 좋아요, 이것이 CS50입니다.

It is built on top of the official @openai/codex-sdk, so the program uses Codex from your local machine instead of calling the OpenAI API directly from custom code.
- Accepts a local txt file path as an argument
- Parses YouTube transcript lines such as 2:032분 3초All right, this is CS50
- Supports both m:ss and h:mm:ss
- Keeps every parsed line and translates it into natural Korean
- Can run one extra natural-Korean review pass after translation
- Can reopen an existing bilingual txt and polish only the Korean lines
- Writes a final *_번역본.txt file beside the original input
- Saves progress after each chunk so long transcripts can resume
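
The parsing and output format described in the list above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the names `ParsedLine`, `parseTranscriptLine`, and `formatBilingual` are hypothetical, and the sketch assumes the text between the timestamp and the spoken words (for example `2분 3초`) is a spoken-time label of the form hours (시간), minutes (분), seconds (초) that can be dropped.

```typescript
// Hypothetical shape for one parsed transcript line (not taken from the tool's source).
interface ParsedLine {
  timestamp: string; // as written in the input, e.g. "2:03" or "1:02:03"
  seconds: number;   // the same timestamp as a total number of seconds
  text: string;      // spoken text with the timestamp and spoken-time label removed
}

// m:ss or h:mm:ss, then an optional Korean spoken-time label ("2분 3초"), then the text.
// Assumption: the label always uses the 시간 / 분 / 초 units in that order.
const LINE_PATTERN =
  /^(\d+:\d{2}(?::\d{2})?)\s*((?:\d+시간\s*)?(?:\d+분\s*)?(?:\d+초)?)\s*(.*)$/;

function parseTranscriptLine(line: string): ParsedLine | null {
  const match = LINE_PATTERN.exec(line.trim());
  if (match === null) {
    return null; // not a timestamped line, e.g. a chapter heading
  }
  const [, timestamp = "", , text = ""] = match;
  // Fold "h:mm:ss" or "m:ss" into seconds: each part shifts the running total by 60.
  const seconds = timestamp
    .split(":")
    .map(Number)
    .reduce((total, part) => total * 60 + part, 0);
  return { timestamp, seconds, text: text.trim() };
}

// Render the English line followed by its Korean translation, both with the timestamp.
function formatBilingual(parsed: ParsedLine, korean: string): string {
  return `[${parsed.timestamp}] ${parsed.text}\n[${parsed.timestamp}] ${korean}`;
}
```

Returning `null` for lines without a leading timestamp lets the caller decide how to treat headings and other free text instead of silently discarding them.
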
Requirements:
- Node.js 18+
- A working Codex login on this machine
cd "C:\Users\USER\OneDrive\Documents\New project\codex-transcript-translator"
npm install
npx codex login

When the login prompt appears, sign in with ChatGPT.
npm run translate -- "C:\Users\USER\Downloads\챕터 1 Introduction.txt"

Optional flags:
--output "C:\path\custom_output.txt"
--model gpt-5.4
--reasoning low
--chunk-size 100
--max-chars 15000
--review-pass
--polish-existing
--overwrite
--fresh
Recommended for long lecture transcripts on a ChatGPT/Codex plan:
npm run translate -- "C:\Users\USER\Downloads\lecture1.txt" --reasoning low --chunk-size 100 --max-chars 15000

If you want one extra natural-Korean cleanup pass before saving:
npm run translate -- "C:\Users\USER\Downloads\lecture1.txt" --reasoning low --chunk-size 100 --max-chars 15000 --review-pass

If you already have a bilingual transcript and only want to polish the Korean lines:
npm run translate -- "C:\Users\USER\Downloads\lecture0_번역본.txt" --polish-existing --reasoning low --chunk-size 100 --max-chars 15000

The polish-only mode writes a new *_다듬기.txt file by default.
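
As a rough illustration of the default output naming, the sketch below appends a suffix before the `.txt` extension so the result lands beside the input. The function name `deriveOutputPath` is hypothetical, and it is an assumption that the tool simply appends the suffix; in particular, the real polish mode may handle an existing `_번역본` suffix differently.

```typescript
// Hypothetical helper: build "<name><suffix>.txt" next to the input file.
// Assumption: the suffix is appended to the base name; the real tool may differ.
function deriveOutputPath(inputPath: string, suffix: string): string {
  const txtExtension = /\.txt$/i;
  if (txtExtension.test(inputPath)) {
    // Insert the suffix just before the extension, keeping the directory unchanged.
    return inputPath.replace(txtExtension, suffix + ".txt");
  }
  // No .txt extension: append both the suffix and the extension.
  return inputPath + suffix + ".txt";
}

// Translation mode would use "_번역본", polish-only mode "_다듬기".
const translatedPath = deriveOutputPath("C:\\Users\\USER\\Downloads\\lecture1.txt", "_번역본");
```

Because only the end of the string is rewritten, the directory part of the path is untouched, which matches the README's statement that the output is written beside the original input.
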
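For especially short subtitle-style lines, --chunk-size matters more than --max-chars, because the line-count limit is reached long before the character limit. The original 30 / 7000 split (--chunk-size 30, --max-chars 7000) can create too many Codex turns and burn through your weekly usage limit faster than necessary.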
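
To see why the line limit dominates for short lines, here is a minimal chunking sketch. It assumes a chunk closes as soon as it reaches `chunkSize` lines or would exceed `maxChars` characters; `chunkLines` is a hypothetical name and the tool's real splitting logic may differ.

```typescript
// Hypothetical chunker: close the current chunk when either limit would be exceeded.
function chunkLines(lines: string[], chunkSize: number, maxChars: number): string[][] {
  const chunks: string[][] = [];
  let current: string[] = [];
  let currentChars = 0;
  for (const line of lines) {
    const lineLimitHit = current.length >= chunkSize;
    const charLimitHit = current.length > 0 && currentChars + line.length > maxChars;
    if (lineLimitHit || charLimitHit) {
      chunks.push(current);
      current = [];
      currentChars = 0;
    }
    // A single line longer than maxChars still gets a chunk of its own.
    current.push(line);
    currentChars += line.length;
  }
  if (current.length > 0) {
    chunks.push(current);
  }
  return chunks;
}

// 3000 subtitle-style lines of 40 characters each (made-up sample data).
const sampleLines: string[] = [];
for (let i = 0; i < 3000; i++) {
  sampleLines.push("0123456789012345678901234567890123456789");
}
const turnsWithOldSplit = chunkLines(sampleLines, 30, 7000).length;
const turnsWithNewSplit = chunkLines(sampleLines, 100, 15000).length;
```

With this sample data, 30 lines are only 1,200 characters and 100 lines only 4,000, so neither setting ever reaches its character limit. The chunk count is set entirely by the line limit: 100 chunks for 30 / 7000 versus 30 chunks for 100 / 15000, assuming one Codex turn per chunk.
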
If the input file is:
2:032분 3초All right, this is CS50.
2:122분 12초Harvard University's introduction to computer science.

The output will look like:
[2:03] All right, this is CS50.
[2:03] 좋아요, 이것이 CS50입니다.
[2:12] Harvard University's introduction to computer science.
[2:12] 하버드 대학교의 컴퓨터 과학 입문 강좌입니다.

- The translator keeps timestamps in the final file, but asks Codex to translate only the spoken text.
- Non-timestamp lines such as chapter headings are also preserved and translated.
- A progress file is written as *.progress.json while translation is running. If the process stops midway, rerun the same command to resume.
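
The resume behaviour can be pictured with the sketch below. The README does not document the contents of the progress file, so the `completedChunks` field and the function `remainingChunkIndexes` are hypothetical. The sketch only shows the general idea of skipping chunks that were already saved.

```typescript
// Hypothetical progress shape; the real *.progress.json layout is not documented here.
interface Progress {
  completedChunks: number; // how many chunks were fully translated and saved
}

// Given the saved progress text (or null when no file exists), list the chunks still to do.
function remainingChunkIndexes(savedJson: string | null, totalChunks: number): number[] {
  let completed = 0;
  if (savedJson !== null) {
    try {
      const parsed = JSON.parse(savedJson) as Partial<Progress>;
      if (typeof parsed.completedChunks === "number" && parsed.completedChunks >= 0) {
        // Clamp so stale progress from a longer run cannot skip past the end.
        completed = Math.min(Math.floor(parsed.completedChunks), totalChunks);
      }
    } catch {
      completed = 0; // unreadable progress: start again from the first chunk
    }
  }
  const remaining: number[] = [];
  for (let index = completed; index < totalChunks; index++) {
    remaining.push(index);
  }
  return remaining;
}
```

Treating a missing or unreadable progress file as "nothing done yet" keeps a rerun safe: the worst case is repeating work, never skipping untranslated chunks.
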