Bug fix: running evaluation script didn't load environment by yangm2 · Pull Request #346 · codeforpdx/tenantfirstaid

yangm2 · 2026-05-11T18:32:25Z

…how deployments work) when .env file is not found

What type of PR is this? (check all applicable)

Description

The evaluation script imports constants.py which is where all the environment variables are supposed to be loaded. Due to quirks in invocation and how the .env path was resolved, it did not work consistently and looked like a user-specific issue when it was an invocation-specific issue.

This also adds a more descriptive warning when .env is not found.

And this refactors the logger for consistent formatting across files.

Related Tickets & Documents

Related Issue #
Closes #

QA Instructions, Screenshots, Recordings

Please replace this line with instructions on how to test your changes, a note on the devices and browsers this has been tested on, as well as any relevant images for UI changes.

Added/updated tests?

Yes
No, and this is why: please replace this line with details on why tests have not been included
I need help with writing tests

Documentation

If this PR changes the system architecture, Architecture.md has been updated

[optional] Are there any post deployment tasks we need to perform?

yangm2 · 2026-05-11T18:34:22Z

@claude pr-review

yangm2 · 2026-05-12T03:27:48Z

@claude use ultra-review skill

claude · 2026-05-12T03:28:07Z

Claude finished @yangm2's task in 3m 26s —— View job

Ultra-review

Note: there's no ultra-review skill defined in this repo (only review and security-review). This review applies both, plus the focus areas from .claude/CLAUDE.md.

Gather context — read PR diff, changed files, and related code
Code quality review (style, modularity, dead code, types)
Security review (input handling, secrets, env vars)
Performance review
Testing review (coverage, edge cases)
Documentation review (Architecture.md, comments)
Post final review feedback

Summary

Solid bug-fix PR. The root cause — load_dotenv(override=True) was called without a path, so it searched from CWD instead of path_to_env — is correctly fixed by passing dotenv_path=path_to_env. The logger refactor is a real improvement: a single import-side-effect-free module, idempotent configure_logging(), and a careful temporary_formatted_handler that lets constants.py emit a formatted warning before any entrypoint has had a chance to call configure_logging(). The AST-based test guarding against a logger→constants import cycle is a nice touch.

A few minor things worth tightening before merge.

Code quality

1. Forward reference to GEMINI_THINKING_BUDGET_DYNAMIC (pre-existing, but adjacent to your change). backend/tenantfirstaid/constants.py:158

self.THINKING_BUDGET: Final = GEMINI_THINKING_BUDGET_DYNAMIC  # line 158
...
GEMINI_THINKING_BUDGET_DYNAMIC: Final = -1                    # line 163

This works only because __init__ is invoked at line 170 (after line 163 has executed). It's brittle — if anyone moves SINGLETON above this assignment, or constructs _GoogEnvAndPolicy() directly during import time (which the tests do, but only at test time), they'd hit a NameError. Trivially fixable by moving GEMINI_THINKING_BUDGET_DYNAMIC above the class. Not introduced by this PR, but you're already in this file.

2. Stray import logging inside __main__ block. backend/evaluate/run_langsmith_evaluation.py:199

if __name__ == "__main__":
    import logging
    ...

logging is stdlib; nothing prevents it from sitting at the top of the module. Putting it under __main__ saves nothing and reads as if there's a load-order reason that doesn't exist. Move it up next to argparse.

3. Iterator import inconsistency.

backend/tenantfirstaid/logger.py:11 uses from collections.abc import Iterator ✓
backend/tests/test_logger.py:6 uses from typing import Iterator

Tiny but worth aligning since this PR introduces both.

4. Comment-period style. backend/tenantfirstaid/app.py:25,31,44 — pre-existing comments here are missing trailing periods (# Configure CORS with strict origin validation, etc.). Project convention from .claude/CLAUDE.md says full sentences ending with a period. Optional cleanup since you're touching the file.

Testing — gaps worth filling

5. configure_logging() itself has no tests. backend/tenantfirstaid/logger.py:55. All five new tests cover temporary_formatted_handler. The three behaviors of configure_logging() are uncovered:

Idempotency (no-op when root already has a handler)
Level = DEBUG when ENV=dev, INFO otherwise
Handler attached to root with _ColoredLevelFormatter

These are exactly the kind of thing that silently breaks during refactors. Suggest one or two small tests.

6. The color path of _ColoredLevelFormatter.format is untested. backend/tenantfirstaid/logger.py:35. test_emits_record_through_temporary_handler patches isatty → False and asserts no escapes; nothing asserts the positive — that ANSI escapes do appear when isatty() is True at formatter-construction time. The finally: block that restores record.levelname after coloring is also untested; if it ever broke, you'd silently mutate LogRecord instances passed to downstream handlers.

7. The silence_missing_env_warning fixture has a subtle effect worth noting in the docstring. backend/tests/test_constants.py:91

caplog.set_level(logging.CRITICAL, logger="tenantfirstaid.constants")

This raises the logger's effective level to CRITICAL, which prevents the warning from being emitted at all (not just from being captured). That's fine for "keep test output clean", but the docstring says "tests that don't want to assert on it can opt into this fixture to keep test output clean" — it actually suppresses emission, not just output. Tiny doc-precision nit.

8. Production path isn't fully exercised. The test test_missing_env_file_emits_warning_with_resolved_path calls _GoogEnvAndPolicy() directly, bypassing the temporary_formatted_handler wrap that the SINGLETON uses at module-load time. So the real production code path — warning emitted through the temporary handler with propagation disabled — isn't covered. Could be exercised by re-importing constants under importlib.reload with Path.exists patched, but that's a heavyweight test. Calling it out for awareness rather than asking you to add it.

Security

No issues. The fix actually improves posture slightly: loading .env from an explicit absolute path removes a class of "wrong .env loaded depending on CWD" surprises (which is mostly a correctness issue, but could be a security issue in deployment scenarios where a directory above the project contains a stale or attacker-controlled .env).

The warning message logs the resolved .env path, which reveals filesystem layout but no secrets — acceptable for an ops-facing warning.

Performance

No concerns. Logging setup is one-time at startup.

Documentation

Architecture.md is updated and the new "Configuration and Logging" section is genuinely useful — explains the load-order constraints, idempotency, TTY-aware coloring, and the narrow use of temporary_formatted_handler. The directory listing also gets logger.py added. Good.

One small thing: the section says "Environment variables are loaded once, at constants.py import time, from an absolute path (backend/.env)." The "once" framing is true at module-import granularity but slightly misleading given that test fixtures construct _GoogEnvAndPolicy() multiple times per test session. Optional rephrasing.

Things done well (worth keeping)

AST-based circular-import test (test_logger_module_does_not_import_constants) — good defensive invariant that's cheap to maintain
temporary_formatted_handler correctly suspends propagation to avoid double-emission via lastResort — the kind of thing that's easy to get wrong
_ColoredLevelFormatter caches isatty() once at construction and restores record.levelname after formatting (no record mutation leaks to downstream handlers) — clean
Switching self.logger = logging.getLogger("LangChainChatManager") → logging.getLogger(__name__) is the right call; lets filtering work by module path
The isolated_logger fixture explicitly loggerDict.pops, which is the right cleanup for logging.Logger.manager's process-lifetime caching — most people miss this

Verdict

Approve-ready after the small cleanups above. The two I'd actually do before merge: #2 (move the import logging up) and #5 (add a couple of configure_logging tests). Everything else is taste.
· branch fix-env-load

…how deployments work) when .env file is not found

yangm2 requested review from ValeriyaMetla and dan-moncada May 11, 2026 18:32

yangm2 self-assigned this May 11, 2026

yangm2 added bug Something isn't working backend Bot implementation and other backend concerns labels May 11, 2026

This comment was marked as outdated.

Sign in to view

yangm2 mentioned this pull request May 11, 2026

Pr/evaluator tooling #342

Closed

11 tasks

yangm2 mentioned this pull request May 12, 2026

Evaluator tooling: concurrency, delta display, and tone rubric #327

Merged

5 tasks

yangm2 added 11 commits May 16, 2026 14:14

explicitly pass path to env and add warning (not error since this is …

869bcdb

…how deployments work) when .env file is not found

colorize logger.warning and refactor logging across files

aef3c5c

refactor logging based on review

ffed64d

refactor

b1eeeab

don't (re)configure logging when importing as module

3f54727

add logger tests

7a72a1f

fix comments

ba83ae2

refactor logger again

dbe063c

refactor tests

71c3aaa

tighten up test mock

beed832

update doc

c7b1d93

yangm2 force-pushed the fix-env-load branch from 4714157 to c7b1d93 Compare May 16, 2026 21:16

yangm2 merged commit 1a4eb7a into codeforpdx:main May 16, 2026
7 checks passed

yangm2 deleted the fix-env-load branch May 16, 2026 21:18

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bug fix: running evaluation script didn't load environment#346

Bug fix: running evaluation script didn't load environment#346
yangm2 merged 11 commits into
codeforpdx:mainfrom
yangm2:fix-env-load

yangm2 commented May 11, 2026 •

edited

Loading

Uh oh!

yangm2 commented May 11, 2026

Uh oh!

This comment was marked as outdated.

yangm2 commented May 12, 2026

Uh oh!

claude Bot commented May 12, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

yangm2 commented May 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What type of PR is this? (check all applicable)

Description

Related Tickets & Documents

QA Instructions, Screenshots, Recordings

Added/updated tests?

Documentation

[optional] Are there any post deployment tasks we need to perform?

Uh oh!

yangm2 commented May 11, 2026

Uh oh!

This comment was marked as outdated.

yangm2 commented May 12, 2026

Uh oh!

claude Bot commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Ultra-review

Summary

Code quality

Testing — gaps worth filling

Security

Performance

Documentation

Things done well (worth keeping)

Verdict

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

yangm2 commented May 11, 2026 •

edited

Loading

claude Bot commented May 12, 2026 •

edited

Loading