A custom Apache JMeter listener plugin that captures and displays real-time LLM (Large Language Model) performance metrics in a tabular format. Designed for load testing LLM APIs and monitoring key inference metrics such as TTFT, TPOT, tokens per minute, and throughput.
| Metric | Description |
|---|---|
| Request Name | HTTP sampler label (one row per unique sampler) |
| Request Count | Total number of requests sent for this sampler |
| RPS (req/sec) | Requests per second (total requests / elapsed time) |
| Min (ms) | Minimum response time |
| Avg (ms) | Average response time |
| Max (ms) | Maximum response time |
| 90th Percentile (ms) | 90th percentile response time |
| TTFT (ms) | Time To First Token — average latency before the first byte of response |
| TPOT (ms) | Time Per Output Token — average (response_time - latency) / output_tokens |
| Input Tokens/Req | Average input tokens per request |
| Output Tokens/Req | Average output tokens per request |
| Input Tokens/Min | Input token throughput per minute (shows "NA" until 1 minute has elapsed) |
| Output Tokens/Min | Output token throughput per minute (shows "NA" until 1 minute has elapsed) |
| Total Input Tokens | Cumulative input tokens for this sampler |
| Total Output Tokens | Cumulative output tokens for this sampler |
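The derived columns above are simple running aggregates per sampler. As a rough sketch of how one row could be maintained — `SamplerStats` and its fields are hypothetical names, not the plugin's actual source:

```java
// Hypothetical sketch of the running counters behind one table row.
// Field and method names are illustrative, not from the plugin source.
class SamplerStats {
    long count;                       // Request Count
    long totalMs;                     // sum of response times, for Avg (ms)
    long minMs = Long.MAX_VALUE;      // Min (ms)
    long maxMs;                       // Max (ms)
    long totalInputTokens;            // Total Input Tokens
    long totalOutputTokens;           // Total Output Tokens
    long firstSampleMs = -1;          // start of the elapsed-time window

    void add(long responseMs, int inputTokens, int outputTokens, long nowMs) {
        if (firstSampleMs < 0) firstSampleMs = nowMs;
        count++;
        totalMs += responseMs;
        minMs = Math.min(minMs, responseMs);
        maxMs = Math.max(maxMs, responseMs);
        totalInputTokens += inputTokens;
        totalOutputTokens += outputTokens;
    }

    // Avg (ms) = total response time / request count
    double avgMs() { return count == 0 ? 0 : (double) totalMs / count; }

    // RPS = total requests / elapsed seconds since the first sample
    double rps(long nowMs) {
        long elapsed = nowMs - firstSampleMs;
        return elapsed <= 0 ? 0 : count * 1000.0 / elapsed;
    }
}
```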
- Apache JMeter 5.6.3 or later
- Java 8 (JDK 1.8) or later
- Download or build the `LLM_Metrics_Visualizer-0.1.jar` file.
- Copy the JAR into the JMeter `lib/ext/` directory: `<JMETER_HOME>/lib/ext/LLM_Metrics_Visualizer-0.1.jar`
- Restart JMeter.
- Open JMeter and load or create a test plan.
- Right-click on a Thread Group (or the Test Plan node).
- Navigate to Add → Listener → LLM Metrics Visualizer.
- The listener will appear with an empty metrics table.
The plugin parses the HTTP response body as JSON and reads the following fields:
```json
{
  "input_tokens": 10,
  "output_tokens": 20,
  "response": "Hello, world!"
}
```

| Field | Type | Description |
|---|---|---|
| `input_tokens` | int | Number of input/prompt tokens consumed |
| `output_tokens` | int | Number of output/completion tokens generated |
If the response is not valid JSON or the token fields are missing, the sample is skipped gracefully.
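The plugin itself uses the bundled org.json parser for this. Purely to illustrate the graceful-skip rule, here is a stdlib-only sketch (hypothetical `TokenFields` class) that extracts the two integer fields from a flat JSON body and returns `null` — i.e., "skip this sample" — when either field is missing or the body is not JSON-like:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stdlib-only illustration of the graceful-skip behavior. The plugin
// itself parses with the shaded org.json library, not this regex.
class TokenFields {
    final int inputTokens;
    final int outputTokens;
    TokenFields(int in, int out) { inputTokens = in; outputTokens = out; }

    // Find `"name": <integer>` in a flat JSON object, or null if absent.
    static Integer intField(String body, String name) {
        Matcher m = Pattern.compile("\"" + name + "\"\\s*:\\s*(\\d+)").matcher(body);
        return m.find() ? Integer.valueOf(m.group(1)) : null;
    }

    static TokenFields parse(String body) {
        if (body == null) return null;
        Integer in = intField(body, "input_tokens");
        Integer out = intField(body, "output_tokens");
        // Missing token fields => skip this sample gracefully.
        if (in == null || out == null) return null;
        return new TokenFields(in, out);
    }
}
```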
- TTFT is measured using JMeter's `SampleResult.getLatency()` — the time from sending the request to receiving the first byte of the response. For accurate TTFT, the target API should flush headers or the first byte before completing generation.
- TPOT is calculated as `(response_time - latency) / output_tokens`. This requires `output_tokens > 0` and meaningful latency separation (e.g., streaming or chunked responses).
- Input Tokens/Min and Output Tokens/Min columns display "NA" until at least 1 minute of test time has elapsed, to avoid misleading early extrapolations.
- Rows are grouped by HTTP sampler name. Each unique sampler label gets a single row that is updated in-place as new results arrive.
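The TPOT formula and the one-minute "NA" rule reduce to a few lines of arithmetic. A hedged sketch (hypothetical `DerivedMetrics` helper, not the plugin's internals):

```java
// Illustrative arithmetic for TPOT and the Tokens/Min "NA" rule.
// Names are hypothetical; the plugin's internals may differ.
class DerivedMetrics {
    // TPOT: generation time (total minus time-to-first-byte) spread
    // over the generated tokens; 0 when no tokens were produced.
    static double tpotMs(long responseTimeMs, long latencyMs, long outputTokens) {
        if (outputTokens <= 0) return 0;
        return (double) (responseTimeMs - latencyMs) / outputTokens;
    }

    // Tokens/Min: "NA" until one full minute has elapsed, to avoid
    // extrapolating a rate from a short early window.
    static String tokensPerMinute(long totalTokens, long elapsedMs) {
        if (elapsedMs < 60_000) return "NA";
        return String.format(java.util.Locale.ROOT, "%.1f",
                totalTokens * 60_000.0 / elapsedMs);
    }
}
```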
- JDK 8 or later (the project compiles to Java 8 bytecode)
- Apache Maven 3.6+
```shell
# Clone the repository
git clone https://github.com/<your-username>/LLM_Metrics_Visualizer.git
cd LLM_Metrics_Visualizer

# Build the plugin JAR
mvn clean package

# The JAR will be at:
# target/LLM_Metrics_Visualizer-0.1.jar
```

Copy the JAR into JMeter's `lib/ext/` directory:

```shell
cp target/LLM_Metrics_Visualizer-0.1.jar <JMETER_HOME>/lib/ext/
```

Restart JMeter after copying the JAR.
```
├── pom.xml                                   # Maven build configuration
├── src/
│   └── main/
│       ├── java/
│       │   └── com/example/jmeter/
│       │       └── LLMMetricsVisualizer.java # Plugin source
│       └── resources/
│           └── META-INF/services/
│               └── org.apache.jmeter.visualizers.Visualizer
└── README.md
```
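Following the standard Java service-provider convention, the `META-INF/services/org.apache.jmeter.visualizers.Visualizer` file would contain the fully qualified name of the implementing class (inferred from the source tree above):

```
com.example.jmeter.LLMMetricsVisualizer
```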
| Setting | Value |
|---|---|
| Artifact ID | LLM_Metrics_Visualizer |
| Version | 0.1 |
| Java Target | 8 (bytecode major version 52) |
| JMeter Compatibility | 5.6.3+ |
| Bundled Dependencies | org.json:json:20210307 (shaded into the JAR) |
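These settings correspond to a `pom.xml` along the following lines. This is a representative fragment only, not the project's actual file (plugin versions and packaging details omitted):

```xml
<!-- Representative fragment; the project's actual pom.xml may differ. -->
<properties>
  <!-- Compile to Java 8 bytecode (major version 52) -->
  <maven.compiler.source>1.8</maven.compiler.source>
  <maven.compiler.target>1.8</maven.compiler.target>
</properties>

<dependencies>
  <dependency>
    <groupId>org.json</groupId>
    <artifactId>json</artifactId>
    <version>20210307</version>
  </dependency>
</dependencies>

<build>
  <plugins>
    <!-- Shade org.json into the final JAR so lib/ext needs only one file -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <executions>
        <execution>
          <phase>package</phase>
          <goals><goal>shade</goal></goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```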
- Thread Group: Configure the number of threads, ramp-up, and loop count.
- HTTP Request Sampler:
  - Method: `POST`
  - URL: Your LLM API endpoint (e.g., `http://localhost:8080/generate`)
  - Body Data:
    ```json
    {
      "prompt": "Generate a greeting message",
      "max_tokens": 50,
      "temperature": 0.7,
      "stream": false
    }
    ```
  - Content-Type header: `application/json` (add via an HTTP Header Manager)
- LLM Metrics Visualizer: Add as a listener under the Thread Group.
- Run the test and observe the metrics table updating in real time.
| Issue | Solution |
|---|---|
| Plugin not visible in JMeter | Ensure the JAR is in <JMETER_HOME>/lib/ext/ and restart JMeter |
| All metrics show zero | Verify the API response contains valid JSON with input_tokens and output_tokens fields |
| TTFT shows zero | The target API must flush the first byte before completing the full response |
| TPOT shows zero | Requires output_tokens > 0 and latency < response time (use chunked/streaming responses) |
| Tokens/Min shows "NA" | This is expected — values appear after 1 minute of elapsed test time |
| ClassNotFoundException | Ensure you are using the shaded JAR (LLM_Metrics_Visualizer-0.1.jar), not the original artifact |
This project is open source. See LICENSE for details.