Commit 68c3110

feat: add stage descriptions for Produce API extension
1 parent 429d75c commit 68c3110

7 files changed: +250 −0 lines changed
Lines changed: 28 additions & 0 deletions
In this stage, you'll add an entry for the `Produce` API to the `APIVersions` response.

## APIVersions

Your Kafka implementation should include the Produce API (API key `0`) in the `APIVersions` response before implementing produce functionality. This lets clients know that the broker supports the Produce API.

## Tests

The tester will execute your program like this:

```bash
./your_program.sh /tmp/server.properties
```

It'll then connect to your server on port 9092 and send a valid `APIVersions` (v4) request.

The tester will validate that:

- The first 4 bytes of your response (the "message length") are valid.
- The correlation ID in the response header matches the correlation ID in the request header.
- The error code in the response body is `0` (No Error).
- The response body contains at least one entry for API key `0` (Produce).
- The `MaxVersion` for the Produce API is at least 11.

## Notes

- You don't have to implement support for the `Produce` request in this stage. We'll get to that in later stages.
- You'll still need to include the entry for `APIVersions` in your response to pass the previous stage.
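As a sketch of what one `api_keys` entry looks like on the wire, here is one way to encode it (the helper name `encode_api_key_entry` is ours, and this assumes the flexible encoding used by `APIVersions` v3+, where each entry ends with an empty tagged-fields byte):

```python
import struct

def encode_api_key_entry(api_key: int, min_version: int, max_version: int) -> bytes:
    # Each entry is three big-endian INT16 fields, followed by an empty
    # tagged-fields section (a single 0x00 byte) in flexible versions.
    return struct.pack(">hhh", api_key, min_version, max_version) + b"\x00"

# API key 0 = Produce, advertised up to version 11.
produce_entry = encode_api_key_entry(0, 0, 11)
```

You'd append this entry alongside the existing `APIVersions` entry when building the full response body.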
Lines changed: 34 additions & 0 deletions
In this stage, you'll add support for handling `Produce` requests to non-existent topics with proper error responses.

## Handling Non-Existent Topics

Your Kafka implementation should validate that a topic exists and return an appropriate error code when a client tries to produce to a non-existent topic. Kafka stores metadata about topics in the `__cluster_metadata` topic. To check whether a topic exists, you'll need to read the `__cluster_metadata` topic's log file, located at `/tmp/kraft-combined-logs/__cluster_metadata-0/00000000000000000000.log`. If the topic exists, it will have a directory with all the data required for it to operate at `<log-dir>/<topic-name>-<partition-index>`.

TODO: Do we need to explain the TOPIC_RECORD record in the `__cluster_metadata` log?

## Tests

The tester will execute your program like this:

```bash
./your_program.sh /tmp/server.properties
```

It'll then connect to your server on port 9092 and send a `Produce` (v11) request to a non-existent topic.

The tester will validate that:

- The first 4 bytes of your response (the "message length") are valid.
- The correlation ID in the response header matches the correlation ID in the request header.
- The error code in the response body is `3` (UNKNOWN_TOPIC_OR_PARTITION).
- The `throttle_time_ms` field in the response is `0`.
- The `name` field in the topic response matches the topic name in the request.
- The `partition` field in the partition response matches the partition in the request.
- The `offset` field in the partition response is `-1`.
- The `timestamp` field in the partition response is `-1`.
- The `log_start_offset` field in the partition response is `-1`.

## Notes

- You'll need to parse the `Produce` request in this stage to get the topic name and partition to send in the response.
- The official docs for the `Produce` request can be found [here](https://kafka.apache.org/protocol.html#The_Messages_Produce). Make sure to scroll down to the "Produce Response (Version: 11)" section.
- The official Kafka docs don't cover the structure of records inside the `__cluster_metadata` topic, but you can find the definitions in the Kafka source code [here](https://github.com/apache/kafka/tree/5b3027dfcbcb62d169d4b4421260226e620459af/metadata/src/main/resources/common/metadata).
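Since an existing topic-partition has a directory at `<log-dir>/<topic-name>-<partition-index>`, one simple existence check looks like this (a sketch: `topic_exists` and `LOG_DIR` are our names, and this sidesteps parsing TOPIC_RECORDs out of the metadata log itself):

```python
import os

LOG_DIR = "/tmp/kraft-combined-logs"

def topic_exists(topic: str, partition: int, log_dir: str = LOG_DIR) -> bool:
    # An existing topic-partition has a directory named <topic>-<partition>
    # under the log directory; a missing directory means UNKNOWN_TOPIC_OR_PARTITION.
    return os.path.isdir(os.path.join(log_dir, f"{topic}-{partition}"))
```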
Lines changed: 33 additions & 0 deletions
In this stage, you'll add support for handling `Produce` requests to invalid partitions of known topics with proper error responses.

## Handling Invalid Partitions

Your Kafka implementation should validate that the partition index exists for a known topic and return an appropriate error for invalid partitions. Kafka stores metadata about partitions in the `__cluster_metadata` topic. To check whether a partition exists, you'll need to read the `__cluster_metadata` topic's log file, located at `/tmp/kraft-combined-logs/__cluster_metadata-0/00000000000000000000.log`. If the partition exists, it will have a directory with all the data required for it to operate at `<log-dir>/<topic-name>-<partition-index>`.

TODO: Do we need to explain the PARTITION_RECORD record in the `__cluster_metadata` log?

## Tests

The tester will execute your program like this:

```bash
./your_program.sh /tmp/server.properties
```

It'll then connect to your server on port 9092 and send a `Produce` (v11) request to a known topic with an invalid partition.

The tester will validate that:

- The first 4 bytes of your response (the "message length") are valid.
- The correlation ID in the response header matches the correlation ID in the request header.
- The error code in the response body is `3` (UNKNOWN_TOPIC_OR_PARTITION).
- The `throttle_time_ms` field in the response is `0`.
- The `name` field in the topic response matches the topic name in the request.
- The `partition` field in the partition response matches the partition in the request.
- The `offset` field in the partition response is `-1`.
- The `timestamp` field in the partition response is `-1`.
- The `log_start_offset` field in the partition response is `-1`.

## Notes

- The official docs for the `Produce` request can be found [here](https://kafka.apache.org/protocol.html#The_Messages_Produce). Make sure to scroll down to the "Produce Response (Version: 11)" section.
- The official Kafka docs don't cover the structure of records inside the `__cluster_metadata` topic, but you can find the definitions in the Kafka source code [here](https://github.com/apache/kafka/tree/5b3027dfcbcb62d169d4b4421260226e620459af/metadata/src/main/resources/common/metadata).
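As a sketch of the fixed-size fields listed above, here is one way to encode the partition-level portion of the error response (the helper name is ours, and this omits the flexible-version extras such as `record_errors`, `error_message`, and tagged fields):

```python
import struct

UNKNOWN_TOPIC_OR_PARTITION = 3

def encode_error_partition_response(partition: int) -> bytes:
    # INT32 partition index, INT16 error_code, then INT64 base_offset,
    # INT64 log_append_time_ms, INT64 log_start_offset -- all big-endian,
    # with the three offsets/timestamps set to -1 on error.
    return struct.pack(">ihqqq", partition, UNKNOWN_TOPIC_OR_PARTITION, -1, -1, -1)
```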
Lines changed: 36 additions & 0 deletions
In this stage, you'll add support for successfully producing a single record to a valid topic and partition.

## Single Record Production

Your Kafka implementation should accept valid `Produce` requests, store the record in the appropriate log file, and return a successful response with the assigned offset. The record must be persisted to the topic's log file at `<log-dir>/<topic-name>-<partition-index>/00000000000000000000.log` using Kafka's on-disk log format.

## Tests

The tester will execute your program like this:

```bash
./your_program.sh /tmp/server.properties
```

It'll then connect to your server on port 9092 and send multiple successive `Produce` (v11) requests to a valid topic and partition, each with a single record as the payload.

The tester will validate that:

- The first 4 bytes of your response (the "message length") are valid.
- The correlation ID in the response header matches the correlation ID in the request header.
- The error code in the response body is `0` (NO_ERROR).
- The `throttle_time_ms` field in the response is `0`.
- The `name` field in the topic response matches the topic name in the request.
- The `partition` field in the partition response matches the partition in the request.
- The `offset` field in the partition response contains the offset assigned to the record. (The offset is the offset of the record in the partition, not the offset of the batch: `0` for the first record, `1` for the second, and so on.)
- The `timestamp` field in the partition response is `-1` (signifying that the timestamp is the latest).
- The `log_start_offset` field in the partition response is `0`.

The tester will also verify that the record is persisted to the appropriate log file on disk at `<log-dir>/<topic-name>-<partition-index>/00000000000000000000.log`.

## Notes

- You'll need to implement log file writing using Kafka's binary log format.
- Records must be stored in RecordBatch format with proper CRC validation.
- Offset assignment should start from `0` for new partitions.
- The official docs for the `Produce` request can be found [here](https://kafka.apache.org/protocol.html#The_Messages_Produce).
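The offset bookkeeping described above can be sketched with a small per-partition counter (illustrative names; RecordBatch serialization and CRC handling are out of scope here):

```python
class Partition:
    """Tracks offset assignment for one topic-partition (a sketch, not Kafka's API)."""

    def __init__(self) -> None:
        self.next_offset = 0        # new partitions start assigning at offset 0
        self.log_start_offset = 0   # reported as log_start_offset in the response

    def append(self, record_count: int) -> int:
        # Returns the base offset to put in the Produce response, then
        # advances past the records just appended.
        base = self.next_offset
        self.next_offset += record_count
        return base
```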
Lines changed: 36 additions & 0 deletions
In this stage, you'll add support for producing multiple records from a single request containing a RecordBatch with multiple records in the payload.

## Batch Processing

Your Kafka implementation should handle multiple records in a single batch, assign sequential offsets, and validate the `LastOffsetDelta` field correctly. A RecordBatch containing multiple records must be stored atomically in the log file.

## Tests

The tester will execute your program like this:

```bash
./your_program.sh /tmp/server.properties
```

It'll then connect to your server on port 9092 and send a `Produce` (v11) request containing a RecordBatch with multiple records.

The tester will validate that:

- The first 4 bytes of your response (the "message length") are valid.
- The correlation ID in the response header matches the correlation ID in the request header.
- The error code in the response body is `0` (NO_ERROR).
- The `throttle_time_ms` field in the response is `0`.
- The `name` field in the topic response matches the topic name in the request.
- The `partition` field in the partition response matches the partition in the request.
- The `offset` field in the partition response contains the base offset of the batch.
- The `timestamp` field in the partition response contains a valid timestamp.
- The `log_start_offset` field in the partition response is correct.

The tester will also verify that the records are persisted to the appropriate log file on disk at `<log-dir>/<topic-name>-<partition-index>/00000000000000000000.log` with sequential offsets.

## Notes

- Records within a batch must be assigned sequential offsets (e.g., if the base offset is 5, the records get offsets 5, 6, and 7).
- The entire batch is treated as a single atomic operation.
- The response should return the base offset of the batch, not individual record offsets.
- The official docs for the `Produce` request can be found [here](https://kafka.apache.org/protocol.html#The_Messages_Produce).
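The sequential-offset rule from the notes can be sketched as follows (helper names are ours):

```python
def assign_offsets(base_offset: int, record_count: int) -> list:
    # Records in a batch receive consecutive offsets starting at the base offset.
    return [base_offset + i for i in range(record_count)]

def expected_last_offset_delta(record_count: int) -> int:
    # LastOffsetDelta in the batch header is the offset delta of the final
    # record, i.e. record_count - 1.
    return record_count - 1
```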
Lines changed: 40 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,40 @@
1+
In this stage, you'll add support for producing to multiple partitions of the same topic in a single request.
2+
3+
## Partition Routing
4+
5+
Your Kafka implementation should handle writes to multiple partitions within the same topic, manage independent offset assignment per partition, and aggregate responses correctly. Each partition maintains its own offset sequence independently.
6+
7+
## Tests
8+
9+
The tester will execute your program like this:
10+
11+
```bash
12+
./your_program.sh /tmp/server.properties
13+
```
14+
15+
It'll then connect to your server on port 9092 and send a `Produce` (v11) request targeting multiple partitions of the same topic.
16+
17+
The tester will validate that:
18+
19+
- The first 4 bytes of your response (the "message length") are valid.
20+
- The correlation ID in the response header matches the correlation ID in the request header.
21+
- The error code in the response body is `0` (NO_ERROR).
22+
- The `throttle_time_ms` field in the response is `0`.
23+
- The `name` field in the topic response matches the topic name in the request.
24+
- Each partition in the request has a corresponding partition response.
25+
- Each partition response contains:
26+
- The correct `partition` field matching the request.
27+
- An error code of `0` (NO_ERROR).
28+
- A valid `offset` field with the assigned offset for that partition.
29+
- A valid `timestamp` field.
30+
- A correct `log_start_offset` field.
31+
- Records are persisted to the correct partition log files.
32+
- Offset assignment is independent per partition (partition 0 and partition 1 can both have offset 0).
33+
34+
## Notes
35+
36+
- Each partition maintains its own offset sequence starting from 0.
37+
- Multiple partitions can be written to simultaneously in a single request.
38+
- The response must include entries for all requested partitions.
39+
- Partition-level errors should be handled independently (one partition failure shouldn't affect others).
40+
- The official docs for the `Produce` request can be found [here](https://kafka.apache.org/protocol.html#The_Messages_Produce).
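The independent per-partition sequences can be sketched with a tracker keyed by topic and partition (illustrative names, not Kafka's API):

```python
from collections import defaultdict

class OffsetTracker:
    """Each (topic, partition) pair gets its own offset sequence starting at 0."""

    def __init__(self) -> None:
        self._next = defaultdict(int)   # (topic, partition) -> next offset

    def append(self, topic: str, partition: int, record_count: int) -> int:
        # Returns the base offset for this write; sequences never interfere,
        # so partition 0 and partition 1 can both start at offset 0.
        key = (topic, partition)
        base = self._next[key]
        self._next[key] += record_count
        return base
```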
Lines changed: 43 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,43 @@
1+
In this stage, you'll add support for producing to multiple topics in a single request.
2+
3+
## Cross-Topic Production
4+
5+
Your Kafka implementation should handle complex requests with multiple topics, manage independent offset tracking per topic and partition, and handle complex response structures. Each topic maintains completely independent offset sequences.
6+
7+
## Tests
8+
9+
The tester will execute your program like this:
10+
11+
```bash
12+
./your_program.sh /tmp/server.properties
13+
```
14+
15+
It'll then connect to your server on port 9092 and send a `Produce` (v11) request targeting multiple topics with their respective partitions.
16+
17+
The tester will validate that:
18+
19+
- The first 4 bytes of your response (the "message length") are valid.
20+
- The correlation ID in the response header matches the correlation ID in the request header.
21+
- The error code in the response body is `0` (NO_ERROR).
22+
- The `throttle_time_ms` field in the response is `0`.
23+
- Each topic in the request has a corresponding topic response.
24+
- Each topic response contains:
25+
- The correct `name` field matching the topic name in the request.
26+
- Each partition in the request has a corresponding partition response.
27+
- Each partition response contains:
28+
- The correct `partition` field matching the request.
29+
- An error code of `0` (NO_ERROR).
30+
- A valid `offset` field with the assigned offset for that topic-partition.
31+
- A valid `timestamp` field.
32+
- A correct `log_start_offset` field.
33+
- Records are persisted to the correct topic-partition log files.
34+
- Offset assignment is independent per topic-partition combination.
35+
36+
## Notes
37+
38+
- Each topic-partition combination maintains its own independent offset sequence.
39+
- Multiple topics can be written to simultaneously in a single request.
40+
- The response structure is nested: topics contain partition responses.
41+
- Topic-level and partition-level errors should be handled independently.
42+
- This is the most complex produce scenario, combining multi-topic and multi-partition handling.
43+
- The official docs for the `Produce` request can be found [here](https://kafka.apache.org/protocol.html#The_Messages_Produce).
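The nested response shape can be sketched as plain Python data before serialization (field names follow the protocol docs; the helper and dict layout are ours):

```python
def build_produce_response(results):
    # results: {topic_name: {partition_index: base_offset}}
    # The v11 response nests partition responses inside topic responses.
    return {
        "throttle_time_ms": 0,
        "responses": [
            {
                "name": topic,
                "partition_responses": [
                    {
                        "index": partition,
                        "error_code": 0,            # NO_ERROR
                        "base_offset": offset,      # assigned per topic-partition
                        "log_append_time_ms": -1,
                        "log_start_offset": 0,
                    }
                    for partition, offset in sorted(partitions.items())
                ],
            }
            for topic, partitions in sorted(results.items())
        ],
    }
```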
