Merged
26 changes: 24 additions & 2 deletions .github/pull_request_template.md
@@ -1,6 +1,6 @@
<!-- markdownlint-disable-file MD041 -->

# PULL REQUEST
# CODEXA PULL REQUEST

## Description

@@ -15,9 +15,31 @@

## Testing

<!-- Optional -->
<!-- This section explains to users how to test the changes on their local machine -->

## Review Checklist

The following checklist is used to ensure that the PR meets the project's quality standards.
Please check each item as you complete it.

- [ ] Functionality changes have been tested locally
- [ ] Code changes are covered by `pytest`
- [ ] Documentation updated
- [ ] README.md updated
- [ ] Configuration files updated
- [ ] Linted with `pre-commit`

The following items are case-by-case and may not apply to all PRs. Please check them if they
are relevant; this is primarily for the reviewers to understand the nature of the PR.

- [ ] This PR contains a **breaking change** and includes migration steps
- [ ] This PR is a **bug fix** and includes a test that reproduces the bug
- [ ] This PR is a **new feature** and includes a test that verifies the feature works
- [ ] This PR is a **refactor** and does not change functionality
- [ ] This PR is a **documentation update** and does not change functionality
- [ ] This PR is a **performance improvement** and includes benchmarks
- [ ] This PR is a **security fix** and includes tests that verify the fix

## Future Work

<!-- Optional -->
4 changes: 2 additions & 2 deletions .github/workflows/testing.yaml
@@ -33,7 +33,7 @@ jobs:
- name: Run tests with coverage
run: |
poetry run coverage run \
--source=testclerk \
--source=codexa \
--omit="*/__*.py,*/test_*.py,/tmp/*" \
-m pytest
poetry run coverage report > coverage.txt
@@ -43,7 +43,7 @@
if: always()
run: |
TIMESTAMP=$(date +"%Y-%m-%d %H:%M:%S")
echo "# TestClerk Unit Testing Report" >> $GITHUB_STEP_SUMMARY
echo "# Codexa Unit Testing Report" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "## Pytest coverage" >> $GITHUB_STEP_SUMMARY
echo '```text' >> $GITHUB_STEP_SUMMARY
3 changes: 3 additions & 0 deletions CODEOWNERS
@@ -0,0 +1,3 @@
# Codexa Developers

* @jgfranco17
4 changes: 2 additions & 2 deletions Dockerfile
@@ -21,11 +21,11 @@ RUN curl -sSL https://install.python-poetry.org | python3 -
WORKDIR /app

COPY README.md pyproject.toml poetry.lock* ./
COPY testclerk ./testclerk
COPY codexa ./codexa
RUN poetry config virtualenvs.create false \
&& poetry install --no-interaction --no-ansi --only main \
&& poetry build \
&& pip install dist/*.whl

ENTRYPOINT ["testclerk"]
ENTRYPOINT ["codexa"]
CMD ["--help"]
29 changes: 29 additions & 0 deletions LICENSE
@@ -0,0 +1,29 @@
# BSD 3-Clause License

Copyright (c) 2025, Joaquin Gabriel A. Franco.
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

* Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
72 changes: 70 additions & 2 deletions README.md
@@ -1,3 +1,71 @@
# TestClerk
# Codexa

Simple CLI to analyze code for testing.
**Codexa** is a CLI-powered test execution and analysis toolkit that runs your automated tests,
feeds the results into a Large Language Model, and returns actionable, human-friendly insights
for your next development steps.

Think of it as your **post-test intelligence officer** — it doesn't just tell you _what_ failed,
it explains _why_ and _what to do about it_.

---

## Features

- **Config-driven execution**: Define your tests, frameworks, and environments in a single
configuration file.
- **LLM-powered analysis**: Get context-aware interpretations of failures, flakiness, and
performance issues.
- **Actionable next steps**: Receive step-by-step recommendations, not just raw logs.
- **Framework-agnostic**: Works with popular testing frameworks (e.g., Pytest, Jest, Go test,
JUnit) or custom commands.
- **Clear reporting**: Generates concise summaries and detailed diagnostics directly in your
terminal.

---

## Quick Start

### Install

Install Codexa using `pip`:

```shell
pip install codexa
```

### Configure

Create a `.codexa.yaml` file in your project root:

```yaml filename=".codexa.yaml"
tests:
- id: python
command: pytest --maxfail=1 --disable-warnings
- id: javascript
command: npm test -- --maxWorkers=1
- id: go
command: go test -v -timeout 30s

llm:
provider: openai
model: gpt-4o
api_key: $OPENAI_API_KEY
```

### Run

Codexa will automatically detect your configuration file and execute the tests defined within
it. It will then analyze the results using the specified LLM provider and model.

```shell
codexa run
```

Codexa uses a YAML configuration file to define:

- Test commands to run
- LLM provider and model
- Output format and destination
- Filters for warnings, failures, and flaky tests
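
The `api_key: $OPENAI_API_KEY` entry in the sample config suggests environment-variable expansion at load time. A minimal sketch of that step (hypothetical helper, not Codexa's actual loader):

```python
import os

def resolve_api_key(raw_value: str) -> str:
    # Expand $VAR / ${VAR} references against the current environment.
    # (Hypothetical helper; Codexa's real config loader may differ.)
    expanded = os.path.expandvars(raw_value)
    # os.path.expandvars leaves unset variables unchanged, so a value
    # still starting with "$" means the variable was never exported.
    if not expanded or expanded.startswith("$"):
        raise ValueError(f"environment variable not set for: {raw_value}")
    return expanded

os.environ["OPENAI_API_KEY"] = "sk-demo"
print(resolve_api_key("$OPENAI_API_KEY"))  # prints "sk-demo"
```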

## License

BSD 3-Clause License — see LICENSE for details.
File renamed without changes.
File renamed without changes.
167 changes: 167 additions & 0 deletions codexa/client/accessor.py
@@ -0,0 +1,167 @@
from openai import OpenAI

from codexa.core.errors import CodexaAccessorError


class RemoteAIAccessor:
"""Class for interacting with remote LLM APIs.

Uses the DeepSeek R1 free model by default.
"""

def __init__(
self,
api_key: str,
prompt: str,
base_url: str = "https://openrouter.ai/api/v1",
model: str = "deepseek/deepseek-r1:free",
) -> None:
if not api_key:
raise CodexaAccessorError("API key is required")
if not prompt:
raise CodexaAccessorError("Accessor requires setup prompt")
self.__api_key = api_key
self.__base_url = base_url
self.__model = model
self.__client = OpenAI(api_key=self.__api_key, base_url=self.__base_url)
self.__setup_prompt = prompt

@property
def setup_prompt(self) -> str:
"""Return the setup prompt."""
return self.__setup_prompt

def make_request(self, message: str, timeout: float = 60.0) -> str:
"""Make a request to the LLM API.

Args:
message (str): Interaction message
timeout (float, optional): Request timeout (s), defaults to 60.0.

Raises:
CodexaAccessorError: If the LLM API call fails

Returns:
str: LLM response text
"""
response = self.__client.chat.completions.create(
model=self.__model,
messages=[
{"role": "system", "content": self.__setup_prompt},
{"role": "user", "content": message},
],
stream=False,
timeout=timeout,
)
content = response.choices[0].message.content
if not content:
reason = response.choices[0].message.refusal
raise CodexaAccessorError(
message=f"Failed to generate a response: {reason}",
help_text="Please try re-running the command",
)
return content


class ReportScanner(RemoteAIAccessor):
"""Class for generating test report summary."""

def __init__(self, api_key: str):
self.__setup_prompt = """Take on the role of a Senior QA Engineer.

I need to implement testing in Python. I am using Pytest as my test harness.
Your primary task is to read the Pytest execution output and write a summary
report.

Key requirements:
1. Report should be written in a way that is easy to understand.
2. Report must provide actionable steps for fixing the issues.
3. Report should be written in Markdown syntax, properly formatted.

Sections that the report must include:
1. Summary of the test run
2. Per error, a summary of the error and the steps to fix it
3. Any other relevant information

Please consider:
- Test writing best practices
- ISTQB Tester guidelines

From this step forward, I will provide you with the execution output and you
will write the report. Minimize chat response and focus on the report. Provide
ONLY the summary; do not write additional comments or greetings.
"""

super().__init__(api_key, prompt=self.__setup_prompt)

def analyze_tests(self, test_output: str, timeout: float = 60.0) -> str:
"""Read the test execution output and generate a report.

Args:
test_output (str): Pytest execution output
timeout (float, optional): Request timeout (s), defaults to 60.0.

Returns:
str: Generated test summary report
"""
message = f"Generate a report for the following test output:\n\n{test_output}"
return self.make_request(message, timeout)


class RepoAnalyzer(RemoteAIAccessor):
"""Class for analyzing the repository."""

def __init__(self, api_key: str):
self.__setup_prompt = """# Overview

Take on the role of a Senior Software Engineer,
specializing in automated testing, code review, and test strategy.

You will be given a `git diff` between the current working tree and a remote reference
(usually the main branch). Your task is to analyze the diff and identify what areas of
the codebase have changed, and what testing actions are necessary based on those changes.

Your objective is to **guide the author on what tests are required or should be updated**.

## Goals
- Review the changed files and modified code in the diff.
- Identify which functions, classes, or modules have been added, modified, or removed.
- Detect if new logic paths, branches, conditions, or data flows are introduced.
- Determine whether the existing tests need to be updated or if new tests should be created.
- Recommend specific **types of tests** required (e.g., unit, integration, regression, edge cases).
- If test files are included in the diff, comment on their adequacy and suggest improvements.
- Flag if changes to existing test cases might break or misrepresent the new behavior.
- If no changes in test files are detected, but logic changes exist, point out the test coverage gap.

## Output Format

Respond in **Markdown** with the following sections:

- Summary
- Areas Requiring Tests
- Suggested Tests
- Risks

Follow proper Markdownlint formatting and syntax. Ensure to add spaces after headers.
Don't put the triple-backticks (```) in the output, format the response as if it were
to be pasted into a Markdown file directly.

## Additional Notes

- Be precise and concise; prefer bullet points where helpful.
- Favor actionable suggestions over verbose explanations.
- You are not writing the tests — you are identifying and planning them.
"""

super().__init__(api_key, prompt=self.__setup_prompt)

def compare_diff(self, diff: str, timeout: float = 60.0) -> str:
"""Assess the repository changes and generate a report.

Args:
diff (str): Repository changes
timeout (float, optional): Request timeout (s), defaults to 60.0.

Returns:
str: Generated diff analysis report
"""
message = f"Prepare an analysis and report for this diff:\n\n{diff}"
return self.make_request(message, timeout)
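
Pytest output handed to `analyze_tests` can exceed an LLM context window, so a caller might trim it to the tail, where pytest prints its short test summary and failure details, before making the request. A minimal sketch under that assumption (hypothetical helper, not part of Codexa):

```python
def tail_output(test_output: str, max_chars: int = 12000) -> str:
    # Keep only the last max_chars characters; pytest writes its
    # failure summary at the end of the run output, so the tail is
    # the most useful slice for the LLM report. (Hypothetical helper.)
    if len(test_output) <= max_chars:
        return test_output
    return "[... earlier output truncated ...]\n" + test_output[-max_chars:]

# Usage (hypothetical): scanner.analyze_tests(tail_output(raw_pytest_output))
```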
File renamed without changes.
@@ -3,7 +3,7 @@

from git import Repo

from testclerk.core.errors import TestClerkRuntimeError
from codexa.core.errors import CodexaRuntimeError

logger = logging.getLogger(__name__)

@@ -32,4 +32,4 @@ def compare_git_diff(remote_ref: str, repo_path: str = os.getcwd()) -> str:
diff_index = head_commit.diff(remote_commit, create_patch=True)
# d.diff is bytes when create_patch=True; decode instead of str() to
# avoid emitting b'...' reprs in the joined patch text.
return "\n".join(
    d.diff.decode("utf-8", errors="replace") if isinstance(d.diff, bytes) else d.diff
    for d in diff_index
    if d.diff
)
except Exception as e:
raise TestClerkRuntimeError(f"Failed to get git diff: {e}")
raise CodexaRuntimeError(f"Failed to get git diff: {e}")
File renamed without changes.
10 changes: 5 additions & 5 deletions testclerk/commands/compare.py → codexa/commands/compare.py
@@ -5,10 +5,10 @@

import click

from testclerk.client.accessor import RepoAnalyzer
from testclerk.client.versioning import compare_git_diff
from testclerk.core.env import load_api_key
from testclerk.core.errors import TestClerkInputError, TestClerkOutputError
from codexa.client.accessor import RepoAnalyzer
from codexa.client.versioning import compare_git_diff
from codexa.core.env import load_api_key
from codexa.core.errors import CodexaInputError, CodexaOutputError

logger = logging.getLogger(__name__)

@@ -51,7 +51,7 @@ def compare_command(
) -> None:
"""Generate smart analysis from diff comparison."""
if output is not None and output.suffix != ".md":
raise TestClerkInputError(
raise CodexaInputError(
message=f"Output file must be a Markdown file, got {output.suffix}",
help_text=f"Rename the output file to {output.stem}.md",
)