Skip to content

Bug fix: running evaluation script didn't load environment#346

Merged
yangm2 merged 11 commits into
codeforpdx:mainfrom
yangm2:fix-env-load
May 16, 2026
Merged

Bug fix: running evaluation script didn't load environment#346
yangm2 merged 11 commits into
codeforpdx:mainfrom
yangm2:fix-env-load

Conversation

@yangm2
Copy link
Copy Markdown
Contributor

@yangm2 yangm2 commented May 11, 2026

…how deployments work) when .env file is not found

What type of PR is this? (check all applicable)

  • Refactor
  • Feature
  • Bug Fix
  • Optimization
  • Documentation Update
  • Infrastructure
  • Maintenance

Description

The evaluation script imports constants.py which is where all the environment variables are supposed to be loaded. Due to quirks in invocation and how the .env path was resolved, it did not work consistently and looked like a user-specific issue when it was an invocation-specific issue.

This also adds a more descriptive warning when .env is not found.

And this refactors the logger for consistent formatting across files.

Related Tickets & Documents

  • Related Issue #
  • Closes #

QA Instructions, Screenshots, Recordings

Please replace this line with instructions on how to test your changes, a note on the devices and browsers this has been tested on, as well as any relevant images for UI changes.

Added/updated tests?

  • Yes
  • No, and this is why: please replace this line with details on why tests have not been included
  • I need help with writing tests

Documentation

  • If this PR changes the system architecture, Architecture.md has been updated

[optional] Are there any post deployment tasks we need to perform?

@yangm2 yangm2 self-assigned this May 11, 2026
@yangm2 yangm2 added bug Something isn't working backend Bot implementation and other backend concerns labels May 11, 2026
@yangm2
Copy link
Copy Markdown
Contributor Author

yangm2 commented May 11, 2026

@claude pr-review

@claude

This comment was marked as outdated.

@yangm2 yangm2 mentioned this pull request May 11, 2026
11 tasks
@yangm2
Copy link
Copy Markdown
Contributor Author

yangm2 commented May 12, 2026

@claude use ultra-review skill

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 12, 2026

Claude finished @yangm2's task in 3m 26s —— View job


Ultra-review

Note: there's no ultra-review skill defined in this repo (only review and security-review). This review applies both, plus the focus areas from .claude/CLAUDE.md.

  • Gather context — read PR diff, changed files, and related code
  • Code quality review (style, modularity, dead code, types)
  • Security review (input handling, secrets, env vars)
  • Performance review
  • Testing review (coverage, edge cases)
  • Documentation review (Architecture.md, comments)
  • Post final review feedback

Summary

Solid bug-fix PR. The root cause — load_dotenv(override=True) was called without a path, so it searched from CWD instead of path_to_env — is correctly fixed by passing dotenv_path=path_to_env. The logger refactor is a real improvement: a single import-side-effect-free module, idempotent configure_logging(), and a careful temporary_formatted_handler that lets constants.py emit a formatted warning before any entrypoint has had a chance to call configure_logging(). The AST-based test guarding against a logger→constants import cycle is a nice touch.

A few minor things worth tightening before merge.


Code quality

1. Forward reference to GEMINI_THINKING_BUDGET_DYNAMIC (pre-existing, but adjacent to your change). backend/tenantfirstaid/constants.py:158

self.THINKING_BUDGET: Final = GEMINI_THINKING_BUDGET_DYNAMIC  # line 158
...
GEMINI_THINKING_BUDGET_DYNAMIC: Final = -1                    # line 163

This works only because __init__ is invoked at line 170 (after line 163 has executed). It's brittle — if anyone moves SINGLETON above this assignment, or constructs _GoogEnvAndPolicy() directly during import time (which the tests do, but only at test time), they'd hit a NameError. Trivially fixable by moving GEMINI_THINKING_BUDGET_DYNAMIC above the class. Not introduced by this PR, but you're already in this file.

2. Stray import logging inside __main__ block. backend/evaluate/run_langsmith_evaluation.py:199

if __name__ == "__main__":
    import logging
    ...

logging is stdlib; nothing prevents it from sitting at the top of the module. Putting it under __main__ saves nothing and reads as if there's a load-order reason that doesn't exist. Move it up next to argparse.

3. Iterator import inconsistency.

  • backend/tenantfirstaid/logger.py:11 uses from collections.abc import Iterator
  • backend/tests/test_logger.py:6 uses from typing import Iterator

Tiny but worth aligning since this PR introduces both.

4. Comment-period style. backend/tenantfirstaid/app.py:25,31,44 — pre-existing comments here are missing trailing periods (# Configure CORS with strict origin validation, etc.). Project convention from .claude/CLAUDE.md says full sentences ending with a period. Optional cleanup since you're touching the file.


Testing — gaps worth filling

5. configure_logging() itself has no tests. backend/tenantfirstaid/logger.py:55. All five new tests cover temporary_formatted_handler. The three behaviors of configure_logging() are uncovered:

  • Idempotency (no-op when root already has a handler)
  • Level = DEBUG when ENV=dev, INFO otherwise
  • Handler attached to root with _ColoredLevelFormatter

These are exactly the kind of thing that silently breaks during refactors. Suggest one or two small tests.

6. The color path of _ColoredLevelFormatter.format is untested. backend/tenantfirstaid/logger.py:35. test_emits_record_through_temporary_handler patches isattyFalse and asserts no escapes; nothing asserts the positive — that ANSI escapes do appear when isatty() is True at formatter-construction time. The finally: block that restores record.levelname after coloring is also untested; if it ever broke, you'd silently mutate LogRecord instances passed to downstream handlers.

7. The silence_missing_env_warning fixture has a subtle effect worth noting in the docstring. backend/tests/test_constants.py:91

caplog.set_level(logging.CRITICAL, logger="tenantfirstaid.constants")

This raises the logger's effective level to CRITICAL, which prevents the warning from being emitted at all (not just from being captured). That's fine for "keep test output clean", but the docstring says "tests that don't want to assert on it can opt into this fixture to keep test output clean" — it actually suppresses emission, not just output. Tiny doc-precision nit.

8. Production path isn't fully exercised. The test test_missing_env_file_emits_warning_with_resolved_path calls _GoogEnvAndPolicy() directly, bypassing the temporary_formatted_handler wrap that the SINGLETON uses at module-load time. So the real production code path — warning emitted through the temporary handler with propagation disabled — isn't covered. Could be exercised by re-importing constants under importlib.reload with Path.exists patched, but that's a heavyweight test. Calling it out for awareness rather than asking you to add it.


Security

No issues. The fix actually improves posture slightly: loading .env from an explicit absolute path removes a class of "wrong .env loaded depending on CWD" surprises (which is mostly a correctness issue, but could be a security issue in deployment scenarios where a directory above the project contains a stale or attacker-controlled .env).

The warning message logs the resolved .env path, which reveals filesystem layout but no secrets — acceptable for an ops-facing warning.


Performance

No concerns. Logging setup is one-time at startup.


Documentation

Architecture.md is updated and the new "Configuration and Logging" section is genuinely useful — explains the load-order constraints, idempotency, TTY-aware coloring, and the narrow use of temporary_formatted_handler. The directory listing also gets logger.py added. Good.

One small thing: the section says "Environment variables are loaded once, at constants.py import time, from an absolute path (backend/.env)." The "once" framing is true at module-import granularity but slightly misleading given that test fixtures construct _GoogEnvAndPolicy() multiple times per test session. Optional rephrasing.


Things done well (worth keeping)

  • AST-based circular-import test (test_logger_module_does_not_import_constants) — good defensive invariant that's cheap to maintain
  • temporary_formatted_handler correctly suspends propagation to avoid double-emission via lastResort — the kind of thing that's easy to get wrong
  • _ColoredLevelFormatter caches isatty() once at construction and restores record.levelname after formatting (no record mutation leaks to downstream handlers) — clean
  • Switching self.logger = logging.getLogger("LangChainChatManager")logging.getLogger(__name__) is the right call; lets filtering work by module path
  • The isolated_logger fixture explicitly loggerDict.pops, which is the right cleanup for logging.Logger.manager's process-lifetime caching — most people miss this

Verdict

Approve-ready after the small cleanups above. The two I'd actually do before merge: #2 (move the import logging up) and #5 (add a couple of configure_logging tests). Everything else is taste.
· branch fix-env-load

@yangm2 yangm2 merged commit 1a4eb7a into codeforpdx:main May 16, 2026
7 checks passed
@yangm2 yangm2 deleted the fix-env-load branch May 16, 2026 21:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend Bot implementation and other backend concerns bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant