v1 #240

Open
mattpocock wants to merge 154 commits into main from v1
Conversation

@mattpocock
Owner

No description provided.

@changeset-bot

changeset-bot bot commented Oct 19, 2025

🦋 Changeset detected

Latest commit: 2c55228

The changes in this PR will be included in the next version bump.

@vercel

vercel bot commented Oct 19, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| evalite | Ignored | Ignored | Feb 20, 2026 1:09pm |

@pkg-pr-new

pkg-pr-new bot commented Oct 19, 2025

npm i https://pkg.pr.new/mattpocock/evalite@240

commit: 2c55228

@mattpocock
Owner Author

@claude Remove the BACKEND_ONLY_CONSTANT import that's causing the error in CI

@claude

claude bot commented Oct 19, 2025

Claude finished @mattpocock's task


Task completed

  • Check CI status to understand the error
  • Read modified files to locate the problematic import
  • Remove the BACKEND_ONLY_CONSTANT imports
  • Commit and push the fix

Fixed the CI error:

The error was caused by imports from evalite/backend-only-constants that referenced exports that no longer exist:

  1. test-utils.ts: Removed unused DB_LOCATION import
  2. files.test.ts: Replaced FILES_LOCATION import with a local constant "./node_modules/.evalite/files"

The changes have been committed and pushed to the v1 branch. CI should now pass!


Job Run | Branch: v1

mattpocock and others added 11 commits November 10, 2025 17:39
- Remove unused DB_LOCATION import from test-utils.ts
- Replace FILES_LOCATION import with local constant in files.test.ts

Co-authored-by: Matt Pocock <mattpocock@users.noreply.github.com>
- Add dotenv as a dependency
- Create env-setup-file module that imports dotenv/config
- Export env-setup-file as 'evalite/env-setup-file'
- Automatically prepend env-setup-file to setupFiles array
- Update documentation to reflect automatic .env loading
- Update example config to remove manual dotenv setup

Fixes #234

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Matt Pocock <mattpocock@users.noreply.github.com>
… precedence

- Add loadVitestSetupFiles() to load setupFiles from vitest.config.ts
- Merge setupFiles from both configs with evalite.config.ts taking precedence
- Add tests for vitest.config.ts setupFiles support and precedence
- setupFiles execution order: env-setup-file -> vitest -> evalite
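
The execution order listed above could be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `mergeSetupFiles` and its parameter names are hypothetical, and reading "precedence" as "runs last, so it can override earlier setup" is an assumption.

```typescript
// Export name taken from the commit message above.
const ENV_SETUP_FILE = "evalite/env-setup-file";

// Hypothetical helper: the name and signature are illustrative only.
function mergeSetupFiles(
  vitestSetupFiles: string[] = [],
  evaliteSetupFiles: string[] = [],
): string[] {
  // Order: env-setup-file -> vitest -> evalite.
  // Assumption: files from evalite.config.ts come last so that
  // whatever they configure wins over the vitest.config.ts files.
  return [ENV_SETUP_FILE, ...vitestSetupFiles, ...evaliteSetupFiles];
}

const merged = mergeSetupFiles(["./vitest.setup.ts"], ["./evalite.setup.ts"]);
console.log(merged);
```

Keeping the merge as a pure function of two arrays makes the ordering easy to cover with a unit test, independent of how each config file is loaded.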

Co-authored-by: Matt Pocock <mattpocock@users.noreply.github.com>
mattpocock and others added 16 commits November 15, 2025 14:36
* Add .editorconfig file

* Return vitest instance when returning with !shouldKeepRunning

This fixes the TS errors.

* Introduce ESLint and add typecheck npm script

- Include ESLint 9 as root dependency
- Set up ESLint to lint the whole repo
- Extend the root config and add a few package-specific plugins for Evalite UI
- Add a consistent `typecheck` npm script for type checking across the repo

You can now use `pnpm lint` in the root and in the UI app, and `pnpm typecheck` anywhere in the repo.
Use `pnpm lint --fix` to attempt to fix the issues.

* Add missing break in switch case

* Fix CI

---------

Co-authored-by: Matt Pocock <mattpocockvoice@gmail.com>
mattpocock and others added 13 commits November 28, 2025 20:14
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…er incorrectly reports success.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…eshold success from overriding failures.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
* feat: add watchFiles support for Evalite watch mode

* refactor: rename watchFiles option to forceRerunTriggers

* Fixed errors and made forceRerunTriggers work the way it does in vitest

* Changeset

---------

Co-authored-by: Matt Pocock <mattpocockvoice@gmail.com>
- Add streaming JSON output with jq in afk.sh scripts
- Add RALPH commit history context for prior work awareness
- Filter issues to only those with 'ralph' label
- Update prompts: tracer bullet prioritization, RALPH: commit prefix, pnpm ci feedback loop
- Remove progress.txt dependency in favor of structured commit messages

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix table crash with double-width characters (emoji) in narrow terminals
  by adding try-catch fallback in renderTable. The `table` library crashes
  when wrapWord can't handle emoji characters that take 2 columns.
  Also enforce minimum column width of 3 to prevent negative widths.
- Fix export hang test timeout: increase from 1000ms to 10000ms since
  exportCommand legitimately takes >1s when running evals from scratch.
- Add disableServer: true to exportCommand's runEvalite call since
  the server is unnecessary when auto-running evals for export.
- Remove unused DB_LOCATION import in test-utils.ts (fixes typecheck).
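
The table fix described in the first bullet (minimum column width plus a try-catch fallback) might look something like the sketch below. It is not the actual `renderTable` code: `computeColumnWidths`, `renderTableSafely`, and the `TableRenderer` type are hypothetical names, the even split of the terminal width is an assumption, and the real renderer is passed in as a parameter rather than imported.

```typescript
const MIN_COLUMN_WIDTH = 3; // minimum named in the commit message

// Split the available width evenly, never going below the minimum.
function computeColumnWidths(terminalWidth: number, columnCount: number): number[] {
  const perColumn = Math.floor(terminalWidth / columnCount);
  return Array.from({ length: columnCount }, () =>
    Math.max(MIN_COLUMN_WIDTH, perColumn),
  );
}

// Hypothetical stand-in for the real table renderer, which may throw.
type TableRenderer = (rows: string[][], widths: number[]) => string;

function renderTableSafely(
  rows: string[][],
  terminalWidth: number,
  render: TableRenderer,
): string {
  const columnCount = Math.max(1, ...rows.map((row) => row.length));
  const widths = computeColumnWidths(terminalWidth, columnCount);
  try {
    return render(rows, widths);
  } catch {
    // Fallback: plain pipe-separated lines with no word wrapping.
    return rows.map((row) => row.join(" | ")).join("\n");
  }
}
```

The fallback trades alignment for robustness: a narrow terminal produces unaligned output instead of a crash.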

Files changed:
- packages/evalite/src/reporter/rendering.ts
- packages/evalite/src/export-static.ts
- packages/evalite-tests/tests/export-static.test.ts
- packages/evalite-tests/tests/test-utils.ts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The server is needed in production for caching. Instead of hardcoding
disableServer: true, expose it as an opt-in parameter so only tests
disable it.
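
As a rough sketch of the opt-in described above (the `ExportOptions` type and `resolveExportOptions` function are hypothetical and only illustrate the default), the option could be resolved like this:

```typescript
// Hypothetical option shape, for illustration only.
type ExportOptions = { disableServer?: boolean };

function resolveExportOptions(opts: ExportOptions = {}): Required<ExportOptions> {
  // Default: keep the server running, because production relies on it for caching.
  // Only callers that pass disableServer: true (for example tests) turn it off.
  return { disableServer: opts.disableServer ?? false };
}

console.log(resolveExportOptions());                        // server stays enabled
console.log(resolveExportOptions({ disableServer: true })); // test-style opt-out
```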

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* RALPH: Bump AI SDK deps to v6 and migrate core types (#379)

Task: Foundational vertical slice for AI SDK v5→v6 migration.

Key decisions:
- Public types use version-agnostic aliases (LanguageModel, EmbeddingModel from "ai")
- Internal middleware types use version-specific V3 types from @ai-sdk/provider
- Usage shape adapted: inputTokens/outputTokens now objects with .total, totalTokens computed as sum
- Middleware specificationVersion: 'v3' added per v6 requirement
- Removed obsolete "media" content type check (replaced by "file" in v6)
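
To make the usage-shape decision concrete, here is a minimal sketch of computing `totalTokens` as a sum from the nested shape. The types are local, hypothetical stand-ins that mirror the description above (they are not imported from the AI SDK), and treating a missing total as 0 is an assumption of this sketch.

```typescript
// Local stand-in types; not the SDK's own definitions.
type TokenCount = { total: number | undefined };
type NestedUsage = { inputTokens: TokenCount; outputTokens: TokenCount };
type FlatUsage = { inputTokens: number; outputTokens: number; totalTokens: number };

function flattenUsage(usage: NestedUsage): FlatUsage {
  // Assumption: an undefined total counts as 0.
  const inputTokens = usage.inputTokens.total ?? 0;
  const outputTokens = usage.outputTokens.total ?? 0;
  // totalTokens is computed as the sum, as stated in the commit message.
  return { inputTokens, outputTokens, totalTokens: inputTokens + outputTokens };
}

const flat = flattenUsage({ inputTokens: { total: 10 }, outputTokens: { total: 5 } });
console.log(flat.totalTokens);
```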

Files changed:
- packages/evalite/package.json: ai ^5→^6, @ai-sdk/provider ^2→^3
- packages/evalite/src/ai-sdk.ts: V2→V3 types, LanguageModel public API, usage shape migration
- packages/evalite/src/types.ts: LanguageModelV2→LanguageModel, EmbeddingModelV2<string>→EmbeddingModel
- packages/evalite-tests/package.json: ai ^5→^6, @ai-sdk/openai ^2→^3
- packages/example/package.json: ai ^5→^6, @ai-sdk/openai ^2→^3, @ai-sdk/provider ^2→^3
- apps/evalite-ui/package.json: ai ^5→^6
- pnpm-lock.yaml: updated

Blockers: #380 needs MockLanguageModelV2→V3 migration in test fixtures + scorer generateObject→generateText migration.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* RALPH: Migrate scorers to generateText + Output.object() and test fixtures to MockLanguageModelV3 (#380)

Task: Migrate all v5 API usage to v6 patterns per PRD #378.

Key decisions:
- Scorers: generateObject() → generateText() + Output.object(), result.object → result.output
- Mocks: MockLanguageModelV2 → MockLanguageModelV3 with plain object doGenerate (not function)
- Usage shape: V3 nested objects { inputTokens: { total }, outputTokens: { total } }
- FinishReason: V3 object shape { unified: "stop", raw: undefined }
- Removed obsolete rawCall, providerMetadata, request, response from mock fixtures
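
The fixture changes listed above can be pictured as a small data transformation. The sketch below uses hypothetical local types (`OldFixture`, `NewFixture`) rather than the SDK's mock classes, and the flat shape assumed for the old fixture is an assumption made for illustration.

```typescript
// Hypothetical local types; real fixtures use the AI SDK's own mock types.
type OldFixture = {
  finishReason: string;
  usage: { inputTokens: number; outputTokens: number; totalTokens: number };
  rawCall?: unknown;
  providerMetadata?: unknown;
  request?: unknown;
  response?: unknown;
};

type NewFixture = {
  finishReason: { unified: string; raw: string | undefined };
  usage: { inputTokens: { total: number }; outputTokens: { total: number } };
};

function migrateFixture(old: OldFixture): NewFixture {
  // rawCall, providerMetadata, request and response are intentionally not copied.
  return {
    finishReason: { unified: old.finishReason, raw: undefined },
    usage: {
      inputTokens: { total: old.usage.inputTokens },
      outputTokens: { total: old.usage.outputTokens },
    },
  };
}

const migrated = migrateFixture({
  finishReason: "stop",
  usage: { inputTokens: 3, outputTokens: 4, totalTokens: 7 },
  rawCall: {},
});
console.log(migrated.finishReason.unified);
```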

Files changed:
- packages/evalite/src/scorers/utils/statement-evaluation.ts (3 call sites)
- packages/evalite/src/scorers/answer-correctness.ts (1 call site)
- packages/evalite/src/scorers/answer-relevancy.ts (1 call site)
- packages/evalite/src/scorers/context-recall.ts (1 call site)
- packages/evalite-tests/tests/fixtures/ai-sdk-traces/traces.eval.ts
- packages/evalite-tests/tests/fixtures/ai-sdk-caching/caching.eval.ts
- packages/evalite-tests/tests/fixtures/ai-sdk-caching-config-disabled/caching.eval.ts
- packages/evalite-tests/tests/fixtures/ai-sdk-caching-config-precedence/caching.eval.ts
- packages/example/src/fake-models.eval.ts

Blockers: #381 (docs + changeset) is now unblocked.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* RALPH: Update documentation for AI SDK v6 and add changeset (#381)

Task: Update docs to reflect v6 migration per PRD #378.

Key decisions:
- Type signature: LanguageModelV2 → LanguageModel (version-agnostic alias)
- Structured output: generateObject/streamObject → generateText/streamText + Output.object()
- tips/vercel-ai-sdk.mdx already v6-compatible, no changes needed
- Minor version changeset for evalite package

Files changed:
- apps/evalite-docs/src/content/docs/api/ai-sdk.mdx
- .changeset/0000-ai-sdk-v6.md

Blockers: None. All #378 PRD tasks are now complete.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
4 participants