46 commits
- `f4cfd49` feat: add stage descriptions for Produce API extension (ryan-gang, Jul 10, 2025)
- `d6db37a` feat: enhance stage description for the Produce API, adding interacti… (ryan-gang, Jul 10, 2025)
- `239ce70` feat: update stage description for handling Produce requests to non-e… (ryan-gang, Jul 10, 2025)
- `1e0dc22` feat: update stage description to clarify handling of Produce request… (ryan-gang, Jul 10, 2025)
- `62cb2bc` feat: enhance stage description for handling Produce requests to inva… (ryan-gang, Jul 10, 2025)
- `65f3f7f` feat: add TODO note for enhancing cluster metadata binspec details in… (ryan-gang, Jul 10, 2025)
- `03a3cb8` feat: enhance Produce stage description to clarify single record prod… (ryan-gang, Jul 10, 2025)
- `982ec2e` refactor: streamline Produce stage description for single record prod… (ryan-gang, Jul 10, 2025)
- `7043210` feat: update Produce stage description to clarify Kafka's on-disk log… (ryan-gang, Jul 10, 2025)
- `17ea069` feat: enhance Produce stage description to clarify handling of multip… (ryan-gang, Jul 10, 2025)
- `42396b8` feat: update Produce stage description to clarify handling of multipl… (ryan-gang, Jul 10, 2025)
- `01cf9dc` feat: refine Produce stage descriptions to clarify handling of multip… (ryan-gang, Jul 10, 2025)
- `2d085e6` feat: add TODO for testing multiple requests in Produce stage, focusi… (ryan-gang, Jul 10, 2025)
- `27111ee` feat: add new challenge extension for producing messages in Kafka, in… (ryan-gang, Jul 10, 2025)
- `1ce465d` chore: rename stage description files to match expected format (ryan-gang, Jul 17, 2025)
- `0f0f4b3` feat: update Produce stage description to include validation for both… (ryan-gang, Jul 17, 2025)
- `70fd9c5` feat: add support for producing messages to multiple topics and parti… (ryan-gang, Jul 17, 2025)
- `5a213be` feat: update Produce stage name and description to clarify handling o… (ryan-gang, Jul 17, 2025)
- `41a2c4d` fix: clarify language in Produce stage descriptions regarding request… (ryan-gang, Jul 17, 2025)
- `4ea0ea6` fix: update stage names in course-definition.yml for improved clarity… (ryan-gang, Jul 17, 2025)
- `23d9849` fix: enhance clarity in Produce stage descriptions by refining langua… (ryan-gang, Jul 17, 2025)
- `1f8ddfa` fix: improve clarity in Produce stage descriptions by refining langua… (ryan-gang, Jul 17, 2025)
- `3514f03` fix: refine language in Produce stage descriptions for consistency in… (ryan-gang, Jul 17, 2025)
- `fe38264` fix: enhance clarity in Produce stage descriptions by refining langua… (ryan-gang, Jul 17, 2025)
- `80be2e8` refactor: apply suggestions from code review (ryan-gang, Jul 21, 2025)
- `07abacc` fix: enhance clarity in Produce stage description by refining languag… (ryan-gang, Jul 21, 2025)
- `4335570` fix: update Produce stage name for improved clarity by removing refer… (ryan-gang, Jul 21, 2025)
- `539d2d6` fix: update links in Produce stage descriptions for improved accuracy… (ryan-gang, Jul 21, 2025)
- `c77abf3` fix: refine Produce stage descriptions for improved clarity by simpli… (ryan-gang, Jul 21, 2025)
- `eb78513` chore: add additional stage 3 in between (ryan-gang, Jul 21, 2025)
- `f5f5329` fix: simplify Produce stage description by hardcoding error response … (ryan-gang, Jul 21, 2025)
- `7b40866` feat: add detailed description for handling Produce requests with val… (ryan-gang, Jul 21, 2025)
- `170f1ee` docs: add links to interactive protocol inspector for Kafka's log fil… (ryan-gang, Jul 21, 2025)
- `30bf549` fix: clarify Produce request description by specifying that data span… (ryan-gang, Jul 21, 2025)
- `543120f` fix: improve clarity in stage descriptions by refining language and u… (ryan-gang, Jul 21, 2025)
- `fb20251` fix: update error code references in stage descriptions to improve co… (ryan-gang, Jul 21, 2025)
- `51e6793` fix: enhance clarity in stage descriptions by consistently formatting… (ryan-gang, Jul 25, 2025)
- `b20bc2c` fix: update stage descriptions to clarify Produce request handling wi… (ryan-gang, Jul 25, 2025)
- `1096865` refactor: apply suggestions from code review (ryan-gang, Jul 25, 2025)
- `848e932` docs: remove interactive protocol inspector links for Produce request… (ryan-gang, Jul 25, 2025)
- `7b3f26b` fix: update stage descriptions to reflect changes in Produce API vers… (ryan-gang, Jul 25, 2025)
- `da5378c` fix: update stage description to specify that a single `Produce` requ… (ryan-gang, Jul 25, 2025)
- `5f3ce1d` fix: enhance stage descriptions for producing messages by adding inte… (ryan-gang, Jul 25, 2025)
- `02caa7b` fix: clarify stage description for producing messages by specifying t… (ryan-gang, Jul 25, 2025)
- `9970aa6` refactor: apply suggestions from code review (ryan-gang, Jul 28, 2025)
- `0140d31` fix: update links in stage description for producing messages to refl… (ryan-gang, Jul 28, 2025)
60 changes: 60 additions & 0 deletions course-definition.yml
@@ -71,6 +71,15 @@ extensions:

      [fetch-api]: https://kafka.apache.org/protocol.html#The_Messages_Fetch

  - slug: "producing-messages"
    name: "Producing Messages"
    description_markdown: |
      In this challenge extension you'll add support for producing messages by implementing the [Produce][produce-api] API.

      Along the way you'll learn about how Kafka's Produce API works, how Kafka stores messages on disk and more.

      [produce-api]: https://kafka.apache.org/protocol.html#The_Messages_Produce

stages:
  - slug: "vi6"
    name: "Bind to a port"
@@ -199,3 +208,54 @@
    difficulty: hard
    marketing_md: |-
      In this stage, you'll implement the Fetch response for a topic with multiple messages, reading them from disk.

  # Producing Messages

  - slug: "um3"
    primary_extension_slug: "producing-messages"
    name: "Include Produce in APIVersions"
    difficulty: easy
    marketing_md: |-
      In this stage, you'll add the Produce API to the APIVersions response.

  - slug: "ck2"
    primary_extension_slug: "producing-messages"
    name: "Produce to an invalid topic"
    difficulty: medium
    marketing_md: |-
      In this stage, you'll implement the Produce response for an invalid topic.

  - slug: "dp1"
    primary_extension_slug: "producing-messages"
    name: "Respond to Produce requests"
    difficulty: medium
    marketing_md: |-
      In this stage, you'll implement the Produce response for a valid topic.

  - slug: "ps7"
    primary_extension_slug: "producing-messages"
    name: "Produce a single record"
    difficulty: hard
    marketing_md: |-
      In this stage, you'll implement producing a single record to disk.

  - slug: "sb8"
    primary_extension_slug: "producing-messages"
    name: "Produce multiple records"
    difficulty: hard
    marketing_md: |-
      In this stage, you'll implement producing multiple records.

  - slug: "mf2"
    primary_extension_slug: "producing-messages"
    name: "Produce to multiple partitions"
    difficulty: hard
    marketing_md: |-
      In this stage, you'll implement producing to multiple partitions.

  - slug: "ar4"
    primary_extension_slug: "producing-messages"
    name: "Produce to multiple topics"
    difficulty: hard
    marketing_md: |-
      In this stage, you'll implement producing to multiple topics.
36 changes: 36 additions & 0 deletions stage_descriptions/producing-messages-01-um3.md
@@ -0,0 +1,36 @@
In this stage, you'll add an entry for the `Produce` API to the APIVersions response.

## The Produce API

The [Produce API](https://kafka.apache.org/protocol#The_Messages_Produce) (API key `0`) is used to produce messages to a Kafka topic.

We've created an interactive protocol inspector for the request & response structures for `Produce`:

- 🔎 [Produce Request (v12)](https://binspec.org/kafka-produce-request-v12)
- 🔎 [Produce Response (v12)](https://binspec.org/kafka-produce-response-v12)

In this stage, you'll only need to add an entry for the `Produce` API to the APIVersions response you implemented in earlier stages. This lets clients know that your broker supports the `Produce` API. We'll get to responding to `Produce` requests in later stages.
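
If it helps, here's a minimal sketch (in Python, with an illustrative helper name) of how one entry in the response's `api_keys` array could be encoded. The full response framing (message length, header, compact array length) isn't shown, and the `MinVersion` of 0 for `Produce` is an assumption; the tester only checks `MaxVersion`.

```python
import struct

# A minimal sketch: each api_keys entry in an APIVersions v4 response is
# INT16 api_key, INT16 min_version, INT16 max_version, followed by an
# empty tagged-fields section (a single 0x00 byte).
def encode_api_version_entry(api_key: int, min_version: int, max_version: int) -> bytes:
    return struct.pack(">hhh", api_key, min_version, max_version) + b"\x00"

entries = [
    encode_api_version_entry(18, 0, 4),  # APIVersions itself (API key 18)
    encode_api_version_entry(0, 0, 12),  # Produce (API key 0), MaxVersion >= 12
]
```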

## Tests

The tester will execute your program like this:

```bash
./your_program.sh /tmp/server.properties
```

It'll then connect to your server on port 9092 and send a valid `APIVersions` (v4) request.

The tester will validate that:

- The first 4 bytes of your response (the "message length") are valid.
- The correlation ID in the response header matches the correlation ID in the request header.
- The error code in the response body is `0` (NO_ERROR).
- The response body contains at least one entry for the API key `0` (`Produce`).
- The `MaxVersion` for the `Produce` API is at least 12.

## Notes

- You don't have to implement support for handling `Produce` requests in this stage. We'll get to this in later stages.
- You'll still need to include the entry for `APIVersions` in your response to pass earlier stages.
- The `MaxVersion` values for `Produce` and `APIVersions` are different: for `APIVersions` it is 4, while for `Produce` it is 12.
42 changes: 42 additions & 0 deletions stage_descriptions/producing-messages-02-ck2.md
@@ -0,0 +1,42 @@
In this stage, you'll add support for handling `Produce` requests to invalid topics or partitions.

## Produce API Response for Invalid Topics or Partitions

When a Kafka broker receives a `Produce` request, it needs to validate that both the topic and partition exist. If either the topic or partition doesn't exist, it returns an appropriate error code and response.

For this stage, you can hardcode the error response: assume that all `Produce` requests are for invalid topics or partitions and return error code `3` (UNKNOWN_TOPIC_OR_PARTITION). In the next stage, you'll implement success responses.
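
Here's a rough Python sketch of the hardcoded values for this stage. The wire encoding isn't shown (`Produce` v12 uses the flexible format with compact strings, compact arrays, and varints), and the dict representation and helper names are illustrative only.

```python
UNKNOWN_TOPIC_OR_PARTITION = 3

def build_partition_error(partition_index: int) -> dict:
    return {
        "index": partition_index,  # echoed back from the request
        "error_code": UNKNOWN_TOPIC_OR_PARTITION,
        "base_offset": -1,         # -1 since nothing was written
        "log_append_time_ms": -1,
        "log_start_offset": -1,
    }

def build_produce_error_response(topic_name: str, partition_index: int) -> dict:
    # Hardcoded error response: every Produce request is treated as
    # targeting an unknown topic or partition in this stage.
    return {
        "throttle_time_ms": 0,
        "topics": [{
            "name": topic_name,  # echoed back from the request
            "partitions": [build_partition_error(partition_index)],
        }],
    }
```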

We've created an interactive protocol inspector for the request & response structures for `Produce`:

- 🔎 [Produce Request (v12)](https://binspec.org/kafka-produce-request-v12)
- 🔎 [Produce Response (v12) - Invalid Topic](https://binspec.org/kafka-produce-error-response-v12-invalid-topic)

## Tests

The tester will execute your program like this:

```bash
./your_program.sh /tmp/server.properties
```

It'll then connect to your server on port 9092 and send a `Produce` (v12) request with either an invalid topic name, or a valid topic name but an invalid partition.

The tester will validate that:

- The first 4 bytes of your response (the "message length") are valid.
- The correlation ID in the response header matches the correlation ID in the request header.
- The `throttle_time_ms` field in the response is `0`.
- The `topics` field has 1 element, and in that element:
- The `name` field matches the topic name in the request.
- The `partitions` field has 1 element, and in that element:
- The `error_code` is `3` (UNKNOWN_TOPIC_OR_PARTITION).
- The `index` field matches the partition in the request.
- The `base_offset` field is `-1`.
- The `log_append_time_ms` field is `-1`.
- The `log_start_offset` field is `-1`.

## Notes

- You'll need to parse the `Produce` request in this stage to get the topic name and partition to send in the response.
- You can hardcode the error response in this stage. We'll get to actually checking for valid topics and partitions in later stages.
- The official docs for the `Produce` API can be found [here](https://kafka.apache.org/protocol.html#The_Messages_Produce). Make sure to scroll down to the "(Version: 12)" section.
56 changes: 56 additions & 0 deletions stage_descriptions/producing-messages-03-dp1.md
@@ -0,0 +1,56 @@
In this stage, you'll add support for responding to `Produce` requests with a valid topic.

## Produce API Response for Valid Topics

When a Kafka broker receives a `Produce` request, it needs to validate that both the topic and partition exist. If either the topic or partition doesn't exist, it returns an appropriate error code and response.

The broker performs validation in this order:
1. **Topic validation**: Check if the topic exists by reading the `__cluster_metadata` topic's log file
2. **Partition validation**: If the topic exists, check if the partition exists within that topic

### Topic Validation

To validate that a topic exists, the broker reads the `__cluster_metadata` topic's log file, located at `/tmp/kraft-combined-logs/__cluster_metadata-0/00000000000000000000.log`. Inside the log file, the broker finds the topic's metadata: a `record` (inside a `RecordBatch`) with a payload of type `TOPIC_RECORD`. If there exists a `TOPIC_RECORD` with the given topic name, the topic exists; the record also contains the topic's ID (a UUID), which you'll need for partition validation.

### Partition Validation

To validate that a partition exists, the broker reads the same `__cluster_metadata` topic's log file and finds the partition's metadata: a `record` (inside a `RecordBatch`) with a payload of type `PARTITION_RECORD`. If there exists a `PARTITION_RECORD` with the given partition index, the UUID of the topic it belongs to, and the UUID of the directory it is associated with, the partition exists.
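
As a rough illustration, here's a Python sketch of both validation steps. It assumes a hypothetical `parse_metadata_records()` helper (not shown) that decodes the log file's `RecordBatch`es into a list of `(record_type, payload)` pairs; the record type values (2 for `TOPIC_RECORD`, 3 for `PARTITION_RECORD`) are the `apiKey` values from the metadata schema definitions linked in the notes below.

```python
TOPIC_RECORD = 2      # apiKey values from Kafka's metadata record schemas
PARTITION_RECORD = 3

def find_topic_id(records: list, topic_name: str):
    # Step 1: topic validation. Look for a TOPIC_RECORD with this name and
    # return the topic's UUID, which partition validation needs.
    for record_type, payload in records:
        if record_type == TOPIC_RECORD and payload["name"] == topic_name:
            return payload["topic_id"]
    return None

def partition_exists(records: list, topic_id, partition_index: int) -> bool:
    # Step 2: partition validation. Look for a PARTITION_RECORD matching
    # both the partition index and the topic's UUID.
    return any(
        record_type == PARTITION_RECORD
        and payload["topic_id"] == topic_id
        and payload["partition_id"] == partition_index
        for record_type, payload in records
    )
```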

We've created an interactive protocol inspector for the request & response structures for `Produce`:

- 🔎 [Produce Request (v12)](https://binspec.org/kafka-produce-request-v12)
- 🔎 [Produce Response (v12)](https://binspec.org/kafka-produce-response-v12)

We've also created an interactive protocol inspector for the `__cluster_metadata` topic's log file:
- 🔎 [Cluster Metadata Log File](https://binspec.org/kafka-cluster-metadata)

In this stage, you'll need to implement the response for a `Produce` request with a valid topic. In later stages, you'll handle successfully producing messages to valid topics and partitions and persisting messages to disk using Kafka's `RecordBatch` format.
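
For the response itself, here's a minimal sketch of the per-partition success values the tester expects, mirroring the error-response sketch from the previous stage (nothing is written to disk yet in this stage):

```python
def build_partition_success(partition_index: int) -> dict:
    return {
        "index": partition_index,  # echoed back from the request
        "error_code": 0,           # NO_ERROR
        "base_offset": 0,          # first record in the partition
        "log_append_time_ms": -1,  # -1 means the latest timestamp is used
        "log_start_offset": 0,
    }
```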

## Tests

The tester will execute your program like this:

```bash
./your_program.sh /tmp/server.properties
```

It'll then connect to your server on port 9092 and send a `Produce` (v12) request with a valid topic and partition.

The tester will validate that:

- The first 4 bytes of your response (the "message length") are valid.
- The correlation ID in the response header matches the correlation ID in the request header.
- The `throttle_time_ms` field in the response is `0`.
- The `topics` field has 1 element, and in that element:
- The `name` field matches the topic name in the request.
- The `partitions` field has 1 element, and in that element:
- The `error_code` is `0` (NO_ERROR).
- The `index` field matches the partition in the request.
- The `base_offset` field is `0` (signifying that this is the first record in the partition).
- The `log_append_time_ms` field is `-1` (signifying that the timestamp is the latest).
- The `log_start_offset` field is `0`.

## Notes

- The official docs for the `Produce` API can be found [here](https://kafka.apache.org/protocol.html#The_Messages_Produce). Make sure to scroll down to the "(Version: 12)" section.
- The official Kafka docs don't cover the structure of records inside the `__cluster_metadata` topic, but you can find the definitions in the Kafka source code [here](https://github.com/apache/kafka/tree/5b3027dfcbcb62d169d4b4421260226e620459af/metadata/src/main/resources/common/metadata).
43 changes: 43 additions & 0 deletions stage_descriptions/producing-messages-04-ps7.md
@@ -0,0 +1,43 @@
In this stage, you'll add support for successfully producing a single record.

## Producing records

When a Kafka broker receives a `Produce` request, it needs to validate that the topic and partition exist (using the `__cluster_metadata` topic's log file), store the record in the appropriate log file using Kafka's on-disk format, and return a successful response.

The record must be persisted to the topic's log file at `<log-dir>/<topic-name>-<partition-index>/00000000000000000000.log` using Kafka's [RecordBatch format](https://kafka.apache.org/documentation/#recordbatch).

TODO: edit link after topic log binspec is in
Kafka's on-disk log format uses the same [RecordBatch](https://binspec.org/kafka-topic-log) format that is used in `Produce` and `Fetch` requests. The `RecordBatch` you receive in the `Produce` request can be written to the log file as is.

> **Review comment (Bug: Remove Outdated Comment):** The "TODO: edit link after topic log binspec is in" comment was accidentally committed. This development note is outdated and confusing, as the binspec.org/kafka-topic-log link it refers to is already present and used in the immediate vicinity. It should be removed.


You can refer to the following interactive protocol inspector for Kafka's log file format:
- 🔎 [A sample topic's log file](https://binspec.org/kafka-topic-log)
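
Here's a minimal Python sketch of the persistence step, assuming `record_batch_bytes` holds the raw `RecordBatch` exactly as it appeared in the request. The hardcoded log directory is an assumption; a real broker reads it from `server.properties`.

```python
import os

LOG_DIR = "/tmp/kraft-combined-logs"  # assumed; normally read from server.properties

def append_record_batch(topic: str, partition: int, record_batch_bytes: bytes) -> None:
    log_path = os.path.join(LOG_DIR, f"{topic}-{partition}", "00000000000000000000.log")
    os.makedirs(os.path.dirname(log_path), exist_ok=True)
    # Append mode, so later batches land after earlier ones in the same file.
    with open(log_path, "ab") as f:
        f.write(record_batch_bytes)
```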

## Tests

The tester will execute your program like this:

```bash
./your_program.sh /tmp/server.properties
```

It'll then connect to your server on port 9092 and send a `Produce` (v12) request with a single `RecordBatch` containing a single record.

The tester will validate that:

- The first 4 bytes of your response (the "message length") are valid.
- The correlation ID in the response header matches the correlation ID in the request header.
- The `throttle_time_ms` field in the response is `0`.
- The `topics` field has 1 element, and in that element:
- The `name` field matches the topic name in the request.
- The `partitions` field has 1 element, and in that element:
- The `error_code` is `0` (NO_ERROR).
- The `index` field matches the partition in the request.
- The `base_offset` field contains `0` (signifying that this is the first record in the partition).
- The `log_append_time_ms` field is `-1` (signifying that the timestamp is the latest).
- The `log_start_offset` field is `0`.
- The record is persisted to the appropriate log file on disk at `<log-dir>/<topic-name>-<partition-index>/00000000000000000000.log`.

## Notes

- The on-disk log files must be stored in [RecordBatch](https://kafka.apache.org/documentation/#recordbatch) format. You should write the `RecordBatch` from the request directly to the file.
- The official docs for the `Produce` API can be found [here](https://kafka.apache.org/protocol.html#The_Messages_Produce). Make sure to scroll down to the "(Version: 12)" section.
43 changes: 43 additions & 0 deletions stage_descriptions/producing-messages-05-sb8.md
@@ -0,0 +1,43 @@
In this stage, you'll add support for producing multiple records in a single `Produce` request.

## Producing multiple records

> **Review comment (Member):** @ryan-gang at what point do we test "appending" to a log file? That seems like something that'd need extra implementation and something we should explicitly call out and test.

When a Kafka broker receives a `Produce` request containing a `RecordBatch` with multiple records, it needs to validate that the topic and partition exist, assign sequential offsets to each record within the batch, and store the entire batch to the log file. The `RecordBatch` containing multiple records must be stored as a single unit to the topic's log file at `<log-dir>/<topic-name>-<partition-index>/00000000000000000000.log`.
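
One detail worth noting: in the on-disk format, each record's offset is implied by the batch's `base_offset` plus the record's offset delta, so stamping the right base offset on the batch assigns sequential offsets to every record in it. A small Python sketch of that idea (the helper name is illustrative):

```python
import struct

# The first 8 bytes of a RecordBatch hold its base_offset (a big-endian
# INT64). The batch CRC only covers the data from the attributes field
# onward, so overwriting the base offset does not invalidate it.
def stamp_base_offset(batch: bytes, next_offset: int) -> bytes:
    return struct.pack(">q", next_offset) + batch[8:]
```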

We've created an interactive protocol inspector for the request & response structures for `Produce`:

- 🔎 [Produce Request (v12)](https://binspec.org/kafka-produce-request-v12)
- 🔎 [Produce Response (v12)](https://binspec.org/kafka-produce-response-v12)

You can refer to the following interactive protocol inspector for Kafka's log file format:
- 🔎 [A sample topic's log file](https://binspec.org/kafka-topic-log)

## Tests

The tester will execute your program like this:

```bash
./your_program.sh /tmp/server.properties
```

It'll then connect to your server on port 9092 and send a `Produce` (v12) request containing a `RecordBatch` with multiple records.

The tester will validate that:

- The first 4 bytes of your response (the "message length") are valid.
- The correlation ID in the response header matches the correlation ID in the request header.
- The `throttle_time_ms` field in the response is `0`.
- The `topics` field has 1 element, and in that element:
- The `name` field matches the topic name in the request.
- The `partitions` field has 1 element, and in that element:
- The `error_code` is `0` (NO_ERROR).
- The `index` field matches the partition in the request.
- The `base_offset` field is `0` (the base offset for the batch).
- The `log_append_time_ms` field is `-1` (signifying that the timestamp is the latest).
- The `log_start_offset` field is `0`.
- The records are persisted to the appropriate log file on disk at `<log-dir>/<topic-name>-<partition-index>/00000000000000000000.log` with sequential offsets.

## Notes

- Records within a record batch must be assigned sequential offsets (e.g., if the base offset is 5, records get offsets 5, 6, 7).
- The official docs for the `Produce` API can be found [here](https://kafka.apache.org/protocol.html#The_Messages_Produce). Make sure to scroll down to the "(Version: 12)" section.
44 changes: 44 additions & 0 deletions stage_descriptions/producing-messages-06-mf2.md
@@ -0,0 +1,44 @@
In this stage, you'll add support for producing to multiple partitions of the same topic in a single request.

## Producing to multiple partitions

When a Kafka broker receives a `Produce` request targeting multiple partitions of the same topic, it needs to validate that the topic and all partitions exist, write records to each partition's log file independently, and return a response containing results for all partitions.
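
Conceptually this is a per-partition fan-out. Here's a hedged Python sketch, reusing the hypothetical `append_record_batch()` helper sketched in an earlier stage; validation is omitted for brevity, and while the field names follow the `Produce` v12 request schema, the dict representation is illustrative.

```python
def handle_topic_data(topic: dict) -> dict:
    # Write each partition's batch to its own log file and collect a
    # per-partition result for the response.
    partition_results = []
    for pdata in topic["partition_data"]:
        index = pdata["index"]
        append_record_batch(topic["name"], index, pdata["records"])
        partition_results.append({
            "index": index,
            "error_code": 0,           # NO_ERROR
            "base_offset": 0,          # first batch in this partition's log
            "log_append_time_ms": -1,
            "log_start_offset": 0,
        })
    return {"name": topic["name"], "partitions": partition_results}
```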

We've created an interactive protocol inspector for the request & response structures for `Produce`:

- 🔎 [Produce Request (v12)](https://binspec.org/kafka-produce-request-v12)
- 🔎 [Produce Response (v12)](https://binspec.org/kafka-produce-response-v12)

You can refer to the following interactive protocol inspector for Kafka's log file format:
- 🔎 [A sample topic's log file](https://binspec.org/kafka-topic-log)

## Tests

The tester will execute your program like this:

```bash
./your_program.sh /tmp/server.properties
```

It'll then connect to your server on port 9092 and send a `Produce` (v12) request targeting multiple partitions of the same topic. The request will contain multiple `RecordBatch`es, one for each partition. Each `RecordBatch` will contain a single record.

The tester will validate that:

- The first 4 bytes of your response (the "message length") are valid.
- The correlation ID in the response header matches the correlation ID in the request header.
- The `throttle_time_ms` field in the response is `0`.
- The `topics` field has 1 element, and in that element:
- The `name` field matches the topic name in the request.
- The `partitions` field has `N` elements, one for each of the `N` partitions in the request:
- The `error_code` is `0` (NO_ERROR).
- The `index` field matches the partition in the request.
- The `base_offset` field contains the assigned offset for that partition.
- The `log_append_time_ms` field is `-1` (signifying that the timestamp is the latest).
- The `log_start_offset` field is `0`.
- Records are persisted to the correct partition log files on disk.

## Notes

- The response must include entries for all requested partitions.
- On-disk log files must be stored in `RecordBatch` format.
- The official docs for the `Produce` API can be found [here](https://kafka.apache.org/protocol.html#The_Messages_Produce). Make sure to scroll down to the "(Version: 12)" section.
42 changes: 42 additions & 0 deletions stage_descriptions/producing-messages-07-ar4.md
@@ -0,0 +1,42 @@
In this stage, you'll add support for producing to multiple topics in a single request.

## Producing to multiple topics

When a Kafka broker receives a `Produce` request targeting multiple topics with their respective partitions, it needs to validate that all topics and partitions exist, write records to each partition's log file independently, and return a nested response structure containing results for all topics and their partitions.
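
At the topic level this is just one more loop around the per-topic handling; here's a short sketch, reusing the hypothetical `handle_topic_data()` helper from the previous stage's sketch:

```python
def handle_produce_request(request: dict) -> dict:
    # Fan out over every topic in the request; each topic fans out over
    # its own partitions, producing the nested response structure.
    return {
        "throttle_time_ms": 0,
        "topics": [handle_topic_data(t) for t in request["topic_data"]],
    }
```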

We've created an interactive protocol inspector for the request & response structures for `Produce`:

- 🔎 [Produce Request (v12)](https://binspec.org/kafka-produce-request-v12)
- 🔎 [Produce Response (v12)](https://binspec.org/kafka-produce-response-v12)

You can refer to the following interactive protocol inspector for Kafka's log file format:
- 🔎 [A sample topic's log file](https://binspec.org/kafka-topic-log)

## Tests

The tester will execute your program like this:

```bash
./your_program.sh /tmp/server.properties
```

It'll then connect to your server on port 9092 and send a `Produce` (v12) request targeting multiple topics with their respective partitions. The request will contain data for multiple topics, spanning multiple partitions for each topic.

The tester will validate that:

- The first 4 bytes of your response (the "message length") are valid.
- The correlation ID in the response header matches the correlation ID in the request header.
- The `throttle_time_ms` field in the response is `0`.
- The `topics` field has one element for each of the topics in the request, and in each element:
- The `name` field matches the topic name in the request.
- The `partitions` field has `M` elements, one for each of the `M` partitions for that topic in the request:
- The `error_code` is `0` (NO_ERROR).
- The `index` field matches the partition in the request.
- The `base_offset` field contains the assigned offset for that topic-partition.
- The `log_append_time_ms` field is `-1` (signifying that the timestamp is the latest).
- The `log_start_offset` field is `0`.
- Records are persisted to the correct topic-partition log files on disk.

## Notes

- The official docs for the `Produce` API can be found [here](https://kafka.apache.org/protocol.html#The_Messages_Produce). Make sure to scroll down to the "(Version: 12)" section.