
Conversation


@ryan-gang ryan-gang commented Jul 10, 2025

Add Produce API extension with 7 stages and updated stage descriptions.

Summary by CodeRabbit

  • New Features

    • Introduced a new "Producing Messages" challenge extension, focusing on Kafka's Produce API and message storage.
    • Added seven progressive stages covering API versioning, error handling for invalid topics, producing records to topics, multiple records, multiple partitions, and multiple topics.
  • Documentation

    • Added detailed instructional guides for each new stage, outlining learning objectives, expected behaviors, and testing criteria for implementing Kafka's Produce API functionality.

@ryan-gang ryan-gang self-assigned this Jul 10, 2025

coderabbitai bot commented Jul 10, 2025

Warning

Rate limit exceeded

@ryan-gang has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 18 minutes and 0 seconds before requesting another review.


📥 Commits

Reviewing files that changed from the base of the PR and between 7b3f26b and 0140d31.

📒 Files selected for processing (5)
  • stage_descriptions/producing-messages-03-dp1.md (1 hunks)
  • stage_descriptions/producing-messages-04-ps7.md (1 hunks)
  • stage_descriptions/producing-messages-05-sb8.md (1 hunks)
  • stage_descriptions/producing-messages-06-mf2.md (1 hunks)
  • stage_descriptions/producing-messages-07-ar4.md (1 hunks)

Walkthrough

A new "Producing Messages" extension has been added to the course, introducing seven progressive stages that guide learners through implementing Kafka's Produce API. Each stage is accompanied by detailed instructional documentation, covering topics from API version advertisement to producing messages across multiple topics and partitions, with increasing complexity.

Changes

| File(s) | Change Summary |
| --- | --- |
| course-definition.yml | Added "Producing Messages" extension and seven new stages (um3, ck2, dp1, ps7, sb8, mf2, ar4) with descriptions and difficulty levels. |
| stage_descriptions/producing-messages-01-um3.md | New doc: Guide for including Produce API in APIVersions response. |
| stage_descriptions/producing-messages-02-ck2.md | New doc: Guide for handling Produce requests to invalid topics/partitions. |
| stage_descriptions/producing-messages-03-dp1.md | New doc: Guide for responding to Produce requests with topic/partition validation using cluster metadata. |
| stage_descriptions/producing-messages-04-ps7.md | New doc: Guide for producing a single record and persisting it to disk. |
| stage_descriptions/producing-messages-05-sb8.md | New doc: Guide for producing multiple records in a single Produce request. |
| stage_descriptions/producing-messages-06-mf2.md | New doc: Guide for producing messages to multiple partitions in one request. |
| stage_descriptions/producing-messages-07-ar4.md | New doc: Guide for producing messages to multiple topics in one request. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Broker

    Client->>Broker: APIVersions Request
    Broker-->>Client: APIVersions Response (includes Produce API)

    Client->>Broker: Produce Request (topic/partition)
    alt Invalid topic/partition
        Broker-->>Client: Produce Response (error code 3)
    else Valid topic/partition
        Broker->>Broker: Validate via cluster metadata
        Broker->>Broker: Persist record(s) to log file(s)
        Broker-->>Client: Produce Response (success, offsets, etc.)
    end
```
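To make the flow above concrete, here is a minimal sketch of the branch a broker takes for each Produce request. It is illustrative only: the in-memory metadata and log stand-ins and all names are placeholders, not the course's tester interfaces.

```python
# Illustrative sketch only: in-memory stand-ins for cluster metadata and the log.
UNKNOWN_TOPIC_OR_PARTITION = 3  # Kafka error code for an unknown topic/partition
NO_ERROR = 0

# Hypothetical metadata: topic name -> set of known partition indexes.
CLUSTER_METADATA = {"greetings": {0, 1}}

# Hypothetical "log": (topic, partition) -> list of appended record batches.
LOGS: dict[tuple[str, int], list[bytes]] = {}

def handle_produce(topic: str, partition: int, record_batch: bytes) -> dict:
    """Return a simplified Produce response for a single topic-partition."""
    if partition not in CLUSTER_METADATA.get(topic, set()):
        return {"topic": topic, "partition": partition,
                "error_code": UNKNOWN_TOPIC_OR_PARTITION, "base_offset": -1}
    log = LOGS.setdefault((topic, partition), [])
    base_offset = len(log)      # simplified: assumes one record per batch
    log.append(record_batch)    # a real broker persists this to the partition's log file
    return {"topic": topic, "partition": partition,
            "error_code": NO_ERROR, "base_offset": base_offset}

print(handle_produce("greetings", 0, b"<record batch bytes>"))  # success, base_offset 0
print(handle_produce("unknown-topic", 0, b""))                  # error_code 3
```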

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

Poem

🐇
Hopping through the Kafka fields,
New stages sprout where learning yields.
Messages produced, from one to many,
Topics and partitions, oh there are plenty!
With docs in tow and guidance bright,
The broker learns to do it right.
Produce, persist—what a delight!




ryan-gang commented Jul 10, 2025

@rohitpaulk some more work is required.

  1. We want to send multiple consecutive requests from stage 5 onwards, but I've yet to confirm how the response changes on subsequent requests.
  2. The last couple of stages are turning out to be very dry.
  3. We don't really have a binspec for the cluster metadata payload records (TOPIC_RECORD, PARTITION_RECORD). Need to think about how to convey this easily.
  4. Details are pending on how we would assert their on-disk writes. Need to implement this.

But I think we can finalize the breakdown from this. In the next review cycle, we can finalize the language.

@ryan-gang ryan-gang requested a review from rohitpaulk July 10, 2025 18:37

@rohitpaulk rohitpaulk left a comment


Lots of work left on wording etc., but when it comes to the stage breakdown, stages 2 & 3 are the ones that definitely need rework. A stage 2 implementation is likely to pass stage 3 automatically.

## Produce API response for invalid partitions

When a Kafka broker receives a Produce request, it first needs to validate that the partition exists. If a partition doesn't exist, it returns an appropriate error code and response.
To validate that a partition exists, the broker reads the `__cluster_metadata` topic's log file, located at `/tmp/kraft-combined-logs/__cluster_metadata-0/00000000000000000000.log`. Inside the log file, the broker finds the topic's metadata, which is a `record` (inside a RecordBatch) with a payload of type `PARTITION_RECORD`. If there is a `PARTITION_RECORD` with the given partition index, the UUID of the topic it is associated with, and the UUID of the directory it is associated with, the partition exists. Otherwise, the partition doesn't exist.
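As a rough sketch of the check this describes (with the RecordBatch and record decoding abstracted away behind a simple dataclass, so every name here is illustrative), the partition-existence test boils down to scanning the decoded PARTITION_RECORDs:

```python
# Sketch only: PartitionRecord stands in for a decoded PARTITION_RECORD payload;
# parsing the actual on-disk wire format is assumed to happen elsewhere.
from dataclasses import dataclass
from typing import Iterable
from uuid import UUID

UNKNOWN_TOPIC_OR_PARTITION = 3  # error code returned when the check fails

@dataclass
class PartitionRecord:
    partition_index: int
    topic_uuid: UUID       # UUID of the topic this partition belongs to
    directory_uuid: UUID   # UUID of the log directory it is assigned to

def partition_exists(records: Iterable[PartitionRecord],
                     topic_uuid: UUID, partition_index: int) -> bool:
    """True if __cluster_metadata contains a matching PARTITION_RECORD."""
    return any(r.topic_uuid == topic_uuid and r.partition_index == partition_index
               for r in records)

def produce_error_code(records, topic_uuid, partition_index) -> int:
    if partition_exists(records, topic_uuid, partition_index):
        return 0  # NO_ERROR
    return UNKNOWN_TOPIC_OR_PARTITION
```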
Member

From an implementer's standpoint, the code for the previous stage is automatically going to pass this one too - they don't really have to parse the cluster metadata file at all and can hardcode the error case. We should rework this; it doesn't make sense to break this down unless we can actually test whether they're implementing the steps we expect.

In the Fetch extension, we tell people that they can hardcode the error response for now, and that they'll get to actually parsing / checking etc. in later stages.

Contributor Author

Yep, there's no way to test that they are parsing PARTITION_RECORDs, so we're going to merge stages 2 and 3.


## Single Record Production

When a Kafka broker receives a Produce request for a valid topic and partition, it needs to validate that the topic and partition exist (using the `__cluster_metadata` topic's log file), store the record in the appropriate log file using Kafka's on-disk format, and return a successful response with the assigned offset.
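For the simplest case this paragraph describes (one record into an empty partition), a hedged sketch of the persistence step could look like the following; the path pattern is taken from the stage text, everything else is illustrative:

```python
from pathlib import Path

def persist_single_batch(log_dir: str, topic: str, partition: int,
                         record_batch: bytes) -> int:
    """Write one RecordBatch into an (assumed empty) partition log; return its base offset."""
    partition_dir = Path(log_dir) / f"{topic}-{partition}"
    partition_dir.mkdir(parents=True, exist_ok=True)
    log_file = partition_dir / "00000000000000000000.log"
    log_file.write_bytes(record_batch)  # the batch from the Produce request, stored as-is
    return 0                            # first record in an empty partition -> offset 0

# e.g. persist_single_batch("/tmp/kraft-combined-logs", "greetings", 0, batch_bytes)
```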
Member

Given how this breakdown is currently set up, I think users are going to end up hardcoding the error response for stages 2-3, and then in stage 4 they'll be stuck having to implement both parsing cluster_metadata and storing records in the log file.

Is there a way around this?


@ryan-gang ryan-gang Jul 17, 2025


Don't think so.
One unconventional way might be:
Stage 2: Introduce creating successful responses (hard-coded).
Stage 3: Introduce the error case. At this point, since they can receive both successful and invalid requests, they would be forced to parse the cluster metadata log and check for the existence of both topics and partitions.


@ryan-gang ryan-gang requested a review from rohitpaulk July 17, 2025 19:30

@rohitpaulk rohitpaulk left a comment


Added one round of comments


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🔭 Outside diff range comments (1)
course-definition.yml (1)

214-255: Duplicate slug values detected – please rename for uniqueness
The following slug values appear more than once across your YAML files, which will silently break course progression:

  • concurrent-clients
  • consuming-messages
  • listing-partitions
  • producing-messages

Action items:

  • Identify each occurrence of the above slugs (e.g. in extension-definition.yml or course-definition.yml).
  • Rename the offending slug: entries to unique identifiers.
  • Re-run:
grep -RhoP 'slug:\s*"\K[^"]+' . | sort | uniq -d

to verify that no duplicates remain.

♻️ Duplicate comments (9)
course-definition.yml (1)

236-247: Marketing copy contradicts earlier feedback about multi-partition tests

Line 20 of producing-messages-06-ar4.md still says “a single partition for each topic”, even after feedback requesting multi-partition coverage. Please align the marketing text here with the actual test plan (multi-partition should be supported).

No code change needed in this file, but update the description for clarity.

stage_descriptions/producing-messages-06-ar4.md (2)

38-40: Missing binspec link for on-disk file format

Previous review asked to link the RecordBatch on-disk binspec wherever persistence is tested. Add a bullet under “Tests”:

- 🔎 [RecordBatch file format](https://binspec.org/kafka-record-batch-v2)

20-22: Still restricts to one partition/topic – contradicts prior review

Line 20 explicitly says “a single partition for each topic”; prior feedback requested multi-partition coverage. Update wording & tests or mark reason for deviation.

stage_descriptions/producing-messages-03-ps7.md (3)

14-14: Provide the actual RecordBatch spec URL instead of the placeholder.

The [RecordBatch](add-link) placeholder is still unresolved. Replace it with the real binspec (or official docs) so learners can jump straight to the format definition.

- Kafka's on-disk log format uses the same [RecordBatch](add-link) format that is used in `Produce` and `Fetch` requests.
+Kafka's on-disk log format uses the same [RecordBatch](https://binspec.org/kafka-record-batch) format that is used in `Produce` and `Fetch` requests.

35-38: Explicitly name the topics array and require exactly one element.

Previous feedback asked for consistency with the Fetch stages. Mention the exact field (topics) and the expected cardinality to avoid ambiguity for implementers.

- Inside the topic response:
+The response’s `topics` array must contain exactly **one** entry.
+Inside that single topic response:

46-46: Clarify what “CRC validation” means for implementers.

They only need to write the CRC for each RecordBatch; validating it on write-path isn’t required in this stage and may confuse readers. Re-phrase accordingly.

- On-disk log files must be stored in `RecordBatch` format with proper CRC validation.
+On-disk log files must be stored in `RecordBatch` format, including the correct CRC for each batch.
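Since "the correct CRC for each batch" is the part implementers most often get wrong, here is a small sketch of CRC-32C, the checksum the RecordBatch `crc` field uses. It is my own illustration, assuming the RecordBatch v2 layout in which the CRC covers everything from the attributes field to the end of the batch (bytes 21 onward, with the CRC stored at bytes 17..20).

```python
import struct

def crc32c(data: bytes) -> int:
    """CRC-32C (Castagnoli), bit-by-bit for clarity; real brokers use table/hardware versions."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 & -(crc & 1))
    return crc ^ 0xFFFFFFFF

def fill_batch_crc(batch: bytearray) -> None:
    """Compute the CRC over batch[21:] and store it big-endian at bytes 17..20.

    Offsets assume the v2 header: baseOffset(8) + batchLength(4) +
    partitionLeaderEpoch(4) + magic(1) = 17, then crc(4), then attributes.
    """
    struct.pack_into(">I", batch, 17, crc32c(bytes(batch[21:])))

assert crc32c(b"123456789") == 0xE3069283  # standard CRC-32C check value
```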
stage_descriptions/producing-messages-04-sb8.md (1)

5-5: Add a binspec link for a multi-record topic log file.

Earlier feedback requested a binspec showing a regular topic’s log file with multiple messages; that context is still missing and would greatly help users verify their on-disk output.

When a Kafka broker receives a `Produce` request containing a `RecordBatch` with multiple records, it needs to validate that the topic and partition exist, assign sequential offsets to each record within the batch, and store the entire batch atomically to the log file. 
+
+You can inspect an example topic log file containing a multi-record batch here:
+- 🔎 [Sample topic log file](https://binspec.org/kafka-topic-log-multi-record)
+
The `RecordBatch` containing multiple records must be stored as a single unit to the topic's log file at `<log-dir>/<topic-name>-<partition-index>/00000000000000000000.log`.
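To illustrate the "sequential offsets" requirement, a small sketch (mine, not the stage's) of how per-record offsets, last_offset_delta, and the next batch's base offset relate:

```python
def record_offsets(base_offset: int, record_count: int) -> list[int]:
    """Absolute offsets for records in one batch: base_offset + offset_delta (0..n-1)."""
    return [base_offset + delta for delta in range(record_count)]

# A 3-record batch appended to an empty partition:
base_offset = 0
offsets = record_offsets(base_offset, 3)        # [0, 1, 2]
last_offset_delta = len(offsets) - 1            # 2, stored in the batch header
next_base_offset = base_offset + len(offsets)   # 3: where the next batch would start
print(offsets, last_offset_delta, next_base_offset)
```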
stage_descriptions/producing-messages-02-ck2.md (2)

21-25: Link to an error-response binspec rather than the generic success response.

Learners need to see the exact on-wire layout for an error scenario.

-🔎 [Produce Response (v11)](https://binspec.org/kafka-produce-response-v11)
+🔎 [Produce Error Response (v11)](https://binspec.org/kafka-produce-error-response-v11)

8-18: Trim the deep metadata-parsing explanation; this stage only requires a hard-coded error response.

Earlier review feedback pointed out that learners can’t be tested on UUID look-ups yet. A concise “hard-code UNKNOWN_TOPIC_OR_PARTITION for any unknown topic/partition” keeps the stage focused.

-### Topic Validation
-...
-If there exists a `TOPIC_RECORD` with the given topic name and the topic ID, the topic exists.
-
-### Partition Validation
-...
-If there exists a `PARTITION_RECORD` with the given partition index, the UUID of the topic it is associated with, and the UUID of the directory it is associated with, the partition exists.
-
-If either the topic or partition doesn't exist, the broker returns an error code of `3` (UNKNOWN_TOPIC_OR_PARTITION).
+For this stage you **don’t** have to read `__cluster_metadata`.  
+Simply parse the `Produce` request to extract the topic & partition, and respond with error code `3` (UNKNOWN_TOPIC_OR_PARTITION) whenever they don’t match a hard-coded “known” topic/partition pair.  
+Full metadata validation will be introduced in a later stage.
🧹 Nitpick comments (8)
course-definition.yml (1)

221-227: Stage name is wordy & deviates from earlier naming guidelines

Previous review suggested shorter, action-focused titles (“Produce to an invalid topic”) to keep the stage list readable.
Current title "Produce to an invalid topic or partition" is still long and mixes two error cases. Consider splitting or trimming:

-    name: "Produce to an invalid topic or partition"
+    name: "Produce to an invalid topic/partition"
stage_descriptions/producing-messages-01-um3.md (1)

12-13: Tiny wording tweak for flow

Consider plural “clients” for better readability:

-This will let the client know that the broker supports the `Produce` API.
+This lets clients know that your broker supports the `Produce` API.
stage_descriptions/consuming-messages-01-gs0.md (2)

29-31: Spelling: “atleast” ➜ “at least”

-- The `MaxVersion` for the Fetch API is atleast 16.
+- The `MaxVersion` for the Fetch API is at least 16.

26-29: Consistent constant casing

Other docs use NO_ERROR; this file uses “No Error”. Align to avoid learner confusion.

-- The error code in the response body is `0` (No Error).
+- The error code in the response body is `0` (NO_ERROR).
stage_descriptions/consuming-messages-02-dh6.md (1)

42-45: Remove stray double-space & line-wrap

The markdown has two spaces before “to scroll”, producing a non-breaking space in rendered HTML.

- Make sure  to scroll down to the "Fetch Response (Version: 16)" section.
+ Make sure to scroll down to the "Fetch Response (Version: 16)" section.
stage_descriptions/producing-messages-04-sb8.md (1)

32-33: Keep the wording minimal; the 0 base-offset rule is already declared.

The parenthetical explanation is redundant given the bullet explicitly says the value must be 0. Dropping it streamlines the spec.

-    - The `base_offset` field is 0 (the base offset for the batch).
+    - The `base_offset` field is `0`.
stage_descriptions/producing-messages-05-mf2.md (2)

24-31: State that the topics array must contain exactly one element for consistency.

Other stages make this explicit; repeating it here prevents off-by-one misunderstandings when multiple topics are introduced in the next stage.

- There is a single topic present in the response.
+The response’s `topics` array must contain exactly **one** element.

44-45: Mention the on-disk path to keep instructions uniform.

Previous stages specify the exact log-file path pattern; adding it here maintains consistency and removes guess-work.

- On-disk log files must be stored in `RecordBatch` format with proper CRC validation.
+On-disk log files for each partition must be stored in `RecordBatch` format (with correct CRC) at `<log-dir>/<topic-name>-<partition-index>/00000000000000000000.log`.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7572554 and 9e1ef42.

📒 Files selected for processing (9)
  • course-definition.yml (2 hunks)
  • stage_descriptions/consuming-messages-01-gs0.md (2 hunks)
  • stage_descriptions/consuming-messages-02-dh6.md (2 hunks)
  • stage_descriptions/producing-messages-01-um3.md (1 hunks)
  • stage_descriptions/producing-messages-02-ck2.md (1 hunks)
  • stage_descriptions/producing-messages-03-ps7.md (1 hunks)
  • stage_descriptions/producing-messages-04-sb8.md (1 hunks)
  • stage_descriptions/producing-messages-05-mf2.md (1 hunks)
  • stage_descriptions/producing-messages-06-ar4.md (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: test_course_definition / test (go)
  • GitHub Check: test_course_definition / test (typescript)
  • GitHub Check: test_course_definition / test (zig)
  • GitHub Check: test_course_definition / test (java)
  • GitHub Check: test_course_definition / test (cpp)
🔇 Additional comments (1)
course-definition.yml (1)

74-82: Ensure list item indentation matches YAML spec

The new producing-messages extension sits at the same indentation level as its siblings, which is correct, but the nested block under description_markdown: must be indented by exactly two spaces relative to the key to be parsed as a multi-line scalar. A quick scan shows 4-space indentation – that's fine, since earlier extensions follow the same convention; just make sure you keep it consistent to avoid silent YAML parse failures.

ryan-gang added 18 commits July 21, 2025 23:19
…ve protocol inspector links and clarifying APIVersions requirements.
…xistent topics, including error response details and interactive protocol inspector links
…s for invalid topics, including error response details
…lid partitions, detailing error responses and adding interactive protocol inspector links
…uction process, including validation steps, log file format, and updated response fields
…uction by consolidating validation, storage, and response steps into a single sentence
… format and add interactive protocol inspector link for cluster metadata topic's log file
…le records in a single RecordBatch, including validation, storage, and updated response fields
…e partitions, including validation, response structure, and log file requirements
…le topics and partitions, including validation, response structure, and offset management
…ng on baseOffset and log_start_offset handling in Kafka
…cluding multiple stages for handling various production scenarios such as invalid topics, partitions, and batch production
… invalid topics and partitions, enhancing clarity on error handling and response structure
…tions in a single request, including validation, independent offset management, and nested response structure
…f invalid topics or partitions, enhancing consistency in error response messaging
…for invalid topics and partitions, removing detailed validation steps for clarity
…id topics, including validation steps and interactive protocol inspector links

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (5)
stage_descriptions/producing-messages-07-ar4.md (3)

29-37: Bullet-nesting is off; Markdown renders this list flat

Lines 29-37 should be indented by two spaces so that “Inside each topic response” and its sub-items render as a nested list. Otherwise the hierarchy is lost and the doc looks jumbled.

-29 - Inside each topic response:
+29   - Inside each topic response:
-30   - The `name` field matches the topic name in the request.
+30     - The `name` field matches the topic name in the request.
  … (indent the remaining sub-bullets similarly)

5-5: Clarify independent offsets example

The example “topic foo partition 0 and topic bar partition 0 can both have records starting at offset 0” is good, but explicitly stating that offsets are scoped per-topic-partition removes any ambiguity for newcomers.

-Each topic-partition combination maintains its own independent offset sequence,
+Each <topic, partition> pair maintains its own independent, zero-based offset sequence,
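A tiny, purely illustrative sketch of that independence: each pair keeps its own counter, nothing is shared.

```python
# Offsets are scoped per <topic, partition>; two different pairs can both start at 0.
next_offset: dict[tuple[str, int], int] = {}

def assign_offsets(topic: str, partition: int, record_count: int) -> int:
    """Return the base offset for a batch and advance this pair's counter."""
    key = (topic, partition)
    base = next_offset.get(key, 0)
    next_offset[key] = base + record_count
    return base

assert assign_offsets("foo", 0, 1) == 0
assert assign_offsets("bar", 0, 1) == 0   # independent sequence, also starts at 0
assert assign_offsets("foo", 0, 2) == 1   # continues foo-0's own sequence
```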

38-39: Persist-to-disk criterion could mention RecordBatch format

Earlier stages emphasise persisting in RecordBatch with CRC validation; restating it here keeps consistency and prevents accidental regression.

- - Records are persisted to the correct topic-partition log files on disk.
+ - Records are persisted to the correct topic-partition log files on disk in valid Kafka `RecordBatch` format (CRC verified).
stage_descriptions/producing-messages-06-mf2.md (2)

29-37: Fix list indentation for proper nesting

Indent “Inside the topic response” and its child bullets to render correctly.

-29 - Inside the topic response:
+29   - Inside the topic response:
-30   - The `name` field matches the topic name in the request.
+30     - The `name` field matches the topic name in the request.
  … (indent remaining sub-bullets)

34-34: Highlight meaning of base_offset

Explicitly say it’s the first offset of the RecordBatch; this prevents confusion when batches hold multiple records.

-     - The `base_offset` field contains the assigned offset for that partition.
+     - `base_offset` equals the first offset of the RecordBatch written to that partition.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c77abf3 and 7b40866.

📒 Files selected for processing (6)
  • stage_descriptions/producing-messages-02-ck2.md (1 hunks)
  • stage_descriptions/producing-messages-03-dp1.md (1 hunks)
  • stage_descriptions/producing-messages-04-ps7.md (1 hunks)
  • stage_descriptions/producing-messages-05-sb8.md (1 hunks)
  • stage_descriptions/producing-messages-06-mf2.md (1 hunks)
  • stage_descriptions/producing-messages-07-ar4.md (1 hunks)
✅ Files skipped from review due to trivial changes (3)
  • stage_descriptions/producing-messages-05-sb8.md
  • stage_descriptions/producing-messages-03-dp1.md
  • stage_descriptions/producing-messages-04-ps7.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • stage_descriptions/producing-messages-02-ck2.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: test_course_definition / test (typescript)
  • GitHub Check: test_course_definition / test (go)
  • GitHub Check: test_course_definition / test (zig)
  • GitHub Check: test_course_definition / test (java)
  • GitHub Check: test_course_definition / test (cpp)
🔇 Additional comments (1)
stage_descriptions/producing-messages-06-mf2.md (1)

44-44: Good call-out on RecordBatch & CRC

Re-emphasising CRC validation reinforces an important correctness requirement.

…s multiple partitions for each topic, enhancing understanding of request structure
…pdating link titles for consistency and accuracy
…nsistency and clarity across Produce request validations
@ryan-gang ryan-gang requested a review from rohitpaulk July 21, 2025 20:18
@rohitpaulk

Will review this tomorrow; I want to see if I can use it to test automated reviews via core.

@ryan-gang

@rohitpaulk this is pending.

@rohitpaulk

Yep, getting to it now


@rohitpaulk rohitpaulk left a comment


Reviewed half; let's revise based on these and request a review again, please.


We've created an interactive protocol inspector for the request & response structures for `Produce`:

- 🔎 [Produce Request (v11)](https://binspec.org/kafka-produce-request-v11)
Member

@ryan-gang did we settle on v11 vs v12? What was the conclusion?

Contributor Author

Yes, we are going to go with v12.
The Request and Response structures are identical.
Changes in stage descriptions and binspec would be very minimal.
For the tester, changes would be limited, but we would have to change the version of Kafka we use for testing on CI and locally, and the fixtures are going to change a lot.

So I thought I would ship the tester with v11 and then, in a separate PR, upgrade the version and update CI + fixtures.

ryan-gang and others added 3 commits July 25, 2025 23:27
… 'Produce' references and improving overall readability
…th valid topics and partitions, and improve link references for RecordBatch format
… in stage descriptions to streamline content

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
stage_descriptions/producing-messages-06-mf2.md (2)

30-38: Fine-tune nested bullet indentation for reliable Markdown rendering

GitHub CommonMark sometimes treats two-space indentation ambiguously; four spaces (or a tab) is safest for guaranteeing a properly nested list.
Consider shifting lines 31-37 four spaces right to avoid the chance of the sub-list collapsing into plain text on some renderers.


35-36: Clarify expected offset semantics

Stating that base_offset “contains the assigned offset” leaves room for interpretation.
Appending a brief note such as “(e.g., 0 for the first record in an empty partition)” reinforces that offsets are scoped per partition and start from 0 when no prior records exist.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 543120f and 1096865.

📒 Files selected for processing (7)
  • stage_descriptions/producing-messages-01-um3.md (1 hunks)
  • stage_descriptions/producing-messages-02-ck2.md (1 hunks)
  • stage_descriptions/producing-messages-03-dp1.md (1 hunks)
  • stage_descriptions/producing-messages-04-ps7.md (1 hunks)
  • stage_descriptions/producing-messages-05-sb8.md (1 hunks)
  • stage_descriptions/producing-messages-06-mf2.md (1 hunks)
  • stage_descriptions/producing-messages-07-ar4.md (1 hunks)
✅ Files skipped from review due to trivial changes (2)
  • stage_descriptions/producing-messages-05-sb8.md
  • stage_descriptions/producing-messages-04-ps7.md
🚧 Files skipped from review as they are similar to previous changes (4)
  • stage_descriptions/producing-messages-03-dp1.md
  • stage_descriptions/producing-messages-07-ar4.md
  • stage_descriptions/producing-messages-01-um3.md
  • stage_descriptions/producing-messages-02-ck2.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: test_course_definition / test (typescript)
🔇 Additional comments (1)
stage_descriptions/producing-messages-06-mf2.md (1)

1-46: Overall description reads clearly — no blocking issues

The stage requirements, validation steps, and external references are precise and technically accurate. Nice work.

…ion from v11 to v12, ensuring consistency in request and response links and validation criteria
coderabbitai[bot]

This comment was marked as outdated.

…est now includes a recordBatch containing a single record, enhancing clarity on request structure
…ractive protocol inspector links and clarifying record handling for multiple records and partitions
…hat records are written to each partition's log file independently
@ryan-gang ryan-gang requested a review from rohitpaulk July 25, 2025 18:30
The record must be persisted to the topic's log file at `<log-dir>/<topic-name>-<partition-index>/00000000000000000000.log` using Kafka's RecordBatch format.
You can refer to the official Kafka docs for the [RecordBatch format](https://kafka.apache.org/documentation/#recordbatch).

Kafka's on-disk log format uses the same [RecordBatch](https://binspec.org/kafka-record-batches) format that is used in `Produce` and `Fetch` requests. The `RecordBatch` you receive in the `Produce` request can be written to the log file as is.
Member

Is this actually a binspec file we're going to make? Could we just link to a recordBatch within a topic's binspec?

Contributor Author

Yes, we could link to a topic's log file. It would be just a couple of record batches.

@@ -0,0 +1,43 @@
In this stage, you'll add support for producing multiple records in a single `Produce` request.

## Producing multiple records
Member

@ryan-gang at what point do we test "appending" to a log file? That seems like something that'd need extra implementation and something we should explicitly call out and test.
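For context on why appending is extra work, here is a rough sketch (my own, hedged) of what an append path would add on top of the single-batch case: derive the next offset from the batches already on disk and rewrite the incoming batch's baseOffset before writing. The header offsets assume the RecordBatch v2 on-disk layout; since the CRC is computed over the bytes after it, rewriting baseOffset should not invalidate it.

```python
import struct
from pathlib import Path

def next_base_offset(log_bytes: bytes) -> int:
    """Scan existing batches and return the offset the next batch should start at."""
    pos, next_offset = 0, 0
    while pos < len(log_bytes):
        base_offset, batch_length = struct.unpack_from(">qi", log_bytes, pos)
        last_offset_delta = struct.unpack_from(">i", log_bytes, pos + 23)[0]
        next_offset = base_offset + last_offset_delta + 1
        pos += 12 + batch_length  # 8 (baseOffset) + 4 (batchLength) + payload
    return next_offset

def append_batch(log_file: Path, record_batch: bytes) -> int:
    """Rewrite the batch's baseOffset to the next offset, append it, return that offset."""
    existing = log_file.read_bytes() if log_file.exists() else b""
    base_offset = next_base_offset(existing)
    patched = struct.pack(">q", base_offset) + record_batch[8:]  # overwrite baseOffset only
    with log_file.open("ab") as f:
        f.write(patched)
    return base_offset
```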

ryan-gang and others added 2 commits July 28, 2025 21:21
…ect accurate RecordBatch format references and clarify log format details
The record must be persisted to the topic's log file at `<log-dir>/<topic-name>-<partition-index>/00000000000000000000.log` using Kafka's [RecordBatch format](https://kafka.apache.org/documentation/#recordbatch).

TODO: edit link after topic log binspec is in
Kafka's on-disk log format uses the same [RecordBatch](https://binspec.org/kafka-topic-log) format that is used in `Produce` and `Fetch` requests. The `RecordBatch` you receive in the `Produce` request can be written to the log file as is.

Bug: Remove Outdated Comment

The "TODO: edit link after topic log binspec is in" comment was accidentally committed. This development note is outdated and confusing, as the binspec.org/kafka-topic-log link it refers to is already present and used in the immediate vicinity. It should be removed.

