3 changes: 3 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,2 +1,5 @@
course-definition-tester
.history/

## MacOS
.DS_Store
46 changes: 46 additions & 0 deletions course-definition.yml
@@ -76,6 +76,16 @@ extensions:
Along the way, you'll learn about how to implement the `*` quantifier (zero or more), and bounded quantifiers.
[1]: https://learn.microsoft.com/en-us/dotnet/standard/base-types/quantifiers-in-regular-expressions

- slug: "highlighting"
name: "Highlighting"
description_markdown: |
In this challenge extension, you'll add support for [highlighting][1] to your Grep implementation.

Along the way, you'll learn about [ANSI escape codes][2], and more.

[1]: https://linuxcommando.blogspot.com/2007/10/grep-with-color-output.html
[2]: https://en.wikipedia.org/wiki/ANSI_escape_code

stages:
- slug: "cq2"
name: "Match a literal character"
@@ -304,3 +314,39 @@ stages:
difficulty: hard
marketing_md: |-
In this stage, we'll add support for `{n,m}`, the [between n and m times](https://docs.microsoft.com/en-us/dotnet/standard/base-types/quantifiers-in-regular-expressions#match-between-n-and-m-times-nm) quantifier.

# Highlighting
- slug: "bm2"
primary_extension_slug: "highlighting"
name: "Highlighting a single match"
difficulty: hard
marketing_md: |-
In this stage, you'll add support for highlighting a single match in your grep implementation.

- slug: "eq0"
primary_extension_slug: "highlighting"
name: "Highlighting multiple matches"
difficulty: hard
marketing_md: |-
In this stage, you'll add support for highlighting multiple matches in your grep implementation.

- slug: "jk4"
primary_extension_slug: "highlighting"
name: "Disabling highlighting"
difficulty: easy
marketing_md: |-
In this stage, you'll add support for disabling the highlighting in your grep implementation using the `never` coloring option.

- slug: "na5"
primary_extension_slug: "highlighting"
name: "Auto highlighting option"
difficulty: medium
marketing_md: |-
In this stage, you'll add support for the `auto` coloring option of the `--color` flag in your grep implementation.

- slug: "nd0"
primary_extension_slug: "highlighting"
name: "Default highlighting behavior"
difficulty: easy
marketing_md: |-
In this stage, you'll implement the `auto` coloring option as the default coloring option when the `--color` flag is not present.
92 changes: 92 additions & 0 deletions stage_descriptions/highlighting-01-bm2.md
@@ -0,0 +1,92 @@
In this stage, you'll add support for highlighting a single match in your grep implementation.

### Highlighting the matched text

Now that we have the prerequisites, let's change the ordering here:

  • Highlighting a single match
  • Highlighting multiple matches
  • Disabling highlighting
  • Auto highlighting behaviour (not sure about this name, could be improved)
  • Default highlighting behaviour (not sure about this name, could be improved)


When the `--color=always` option is used, grep always highlights the matched text in its output.

Example usage:

<html>
<pre>
<code>$ echo -n "I have 1 apple" | grep --color=always -E '\d'
I have <span style="color: red; font-weight: bold;">1</span> apple</code>
</pre>
</html>

Grep uses [ANSI escape sequences](https://en.wikipedia.org/wiki/ANSI_escape_code) to add color to terminal output. These are special character sequences that terminals interpret as formatting commands rather than regular text.

The default color used by grep for the matched text is bold red. For example, grep uses the following ANSI escape sequences for wrapping the matched text:

```
\033[01;31m\033[K
...
\033[m\033[K
```

**Example Opening Sequence: `\033[01;31m\033[K`**

| Component | Meaning |
|-----------|---------|
| `\033` | Escape character (ESC, octal 033) that introduces the ANSI control sequence |
| `[` | Together with the preceding ESC, forms the Control Sequence Introducer (CSI). The [Select Graphic Rendition (SGR)](https://vt100.net/docs/vt510-rm/SGR.html) parameters follow it |
| `01;31` | SGR codes: `01` = bold/bright text, `31` = red foreground color (separated by `;`) |
| `m` | Terminates the SGR sequence |
| | |
| `\033` | Escape character that introduces the ANSI control sequence |
| `[` | Together with the preceding ESC, forms the CSI |
| `K` | Erase in Line (EL): erases all characters from the cursor to the end of the line |

**Example Closing Sequence: `\033[m\033[K`**

| Component | Meaning |
|-----------|---------|
| `\033` | Escape character that introduces the ANSI control sequence |
| `[` | Together with the preceding ESC, forms the Control Sequence Introducer (CSI) |
| *(empty)* | No parameters = reset all attributes to default |
| `m` | Terminates the SGR sequence |
| | |
| `\033` | Escape character that introduces the ANSI control sequence |
| `[` | Together with the preceding ESC, forms the CSI |
| `K` | Erase in Line (EL): erases all characters from the cursor to the end of the line |

When the SGR parameter is `0` or is not present (empty), it resets all attributes so that the rest of the text will be printed without any highlights.
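
The breakdown above can be sketched in code. This is a minimal Python illustration, not part of the challenge itself. The constant names and the `wrap_match` helper are hypothetical and chosen only for this example.

```python
# Building blocks from the tables above (names are illustrative assumptions).
ESC = "\033"                     # escape character, octal 033
CSI = ESC + "["                  # Control Sequence Introducer
SGR_BOLD_RED = CSI + "01;31m"    # bold (01) and red foreground (31)
SGR_RESET = CSI + "m"            # empty parameter list resets all attributes
ERASE_TO_EOL = CSI + "K"         # erase from the cursor to the end of the line


def wrap_match(text: str) -> str:
    """Wrap text in the opening and closing sequences described above."""
    return SGR_BOLD_RED + ERASE_TO_EOL + text + SGR_RESET + ERASE_TO_EOL


print(repr(wrap_match("1")))  # '\x1b[01;31m\x1b[K1\x1b[m\x1b[K'
```

Printing the `repr` makes the otherwise invisible escape characters visible, which is handy when debugging your output.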

### Tests

The tester will execute your program like this:

<html>
<pre>
<code>$ echo -n "I have 3 apples" | grep --color=always -E '\d'
I have <span style="font-weight:bold; color:red;">3</span> apples</code>
</pre>
</html>

If the input does not match the pattern, your program must:
- Exit with the code 1
- Exit with no printed output

If the input text matches the pattern, your program must:
- Exit with the code 0
- Print the input text to the standard output
- Highlight the matched text in the input using grep's default color.
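
The requirements above can be sketched as follows. This is only an illustration: Python's built-in `re` module stands in for the matcher you wrote in earlier stages, and `grep_first_match` is a hypothetical helper name.

```python
import re

OPEN = "\033[01;31m\033[K"   # bold red, as described earlier in this stage
CLOSE = "\033[m\033[K"       # reset


def grep_first_match(pattern: str, line: str):
    """Return (exit_code, output) for one input line.

    Only the first match is highlighted. `re.search` is a stand-in for
    your own regex engine.
    """
    match = re.search(pattern, line)
    if match is None:
        return 1, ""                      # no match: exit code 1, no output
    start, end = match.span()
    highlighted = line[:start] + OPEN + line[start:end] + CLOSE + line[end:]
    return 0, highlighted                 # match: exit code 0, highlighted line


code, output = grep_first_match(r"\d", "I have 3 apples")
print(code, repr(output))
```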

### Notes

1. You only need to handle the case of a single match. We'll get to highlighting multiple matches in the next stage.

2. The matched text should be highlighted using the bold (`01`) and red (`31`) attributes. You may use any combination of ANSI codes that achieves this highlighting effect. For example, to produce the following output:

<html>
<pre>
<code>hello<span style="color:red; font-weight:bold" >matched</span>world</code>
</pre>
</html>

Any of the following sequences can be used:

```
hello\033[31;01m\033[Kmatched\033[m\033[Kworld
hello\033[31;01m\033[K\033[Kmatched\033[m\033[K\033[Kworld
hello\033[01;31m\033[Kmatched\033[m\033[Kworld
```
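
To see why these three sequences are interchangeable, it can help to interpret them the way a terminal would. The sketch below is an assumption-laden illustration, not the tester's actual logic: the `render` helper is hypothetical, it understands only the SGR codes `0`, `1` and `31`, and it ignores every other control sequence, including `\033[K`.

```python
import re

# ESC [ <numeric parameters> <final letter>. A simplification of real CSI syntax.
CSI_SEQUENCE = re.compile(r"\x1b\[([0-9;]*)([A-Za-z])")


def render(text):
    """Return a list of (char, bold, color) cells for the visible characters."""
    cells = []
    bold, color = False, None
    pos = 0
    while pos < len(text):
        match = CSI_SEQUENCE.match(text, pos)
        if match is None:
            cells.append((text[pos], bold, color))
            pos += 1
            continue
        params, final = match.groups()
        if final == "m":  # SGR: apply each code in order
            for param in (params.split(";") if params else ["0"]):
                code = int(param or 0)
                if code == 0:
                    bold, color = False, None
                elif code == 1:
                    bold = True
                elif code == 31:
                    color = "red"
        # Other sequences (such as K) do not change the style, so skip them.
        pos = match.end()
    return cells


a = "hello\033[31;01m\033[Kmatched\033[m\033[Kworld"
b = "hello\033[31;01m\033[K\033[Kmatched\033[m\033[K\033[Kworld"
c = "hello\033[01;31m\033[Kmatched\033[m\033[Kworld"
print(render(a) == render(b) == render(c))  # True
```

All three inputs produce the same styled cells: `hello` and `world` are plain, and `matched` is bold red.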
42 changes: 42 additions & 0 deletions stage_descriptions/highlighting-02-eq0.md
@@ -0,0 +1,42 @@
In this stage, you'll add support for highlighting multiple matches in your grep implementation.

### Highlighting multiple matches

Grep highlights every match it finds in each line.

Example usage:

<html>
<pre>
<code>$ echo -n "dogs and cats are pets" | grep --color=always -E '(dogs|cats)'
<span style="color: red; font-weight: bold;">dogs</span> and <span style="color: red; font-weight: bold;">cats</span> are pets</code>
</pre>
</html>

### Tests

The tester will execute your program like this:

<html>
<pre>
<code>$ echo -n "jekyll and hyde" | grep --color=always -E '(jekyll|hyde)'
<span style="color: red; font-weight: bold;">jekyll</span> and <span style="color: red; font-weight: bold;">hyde</span>
<br />
$ echo -n "2025" | grep --color=always -E '\d'
<span style="color: red; font-weight: bold;">2025</span></code></pre>
</html>

If the input does not match the pattern, your program must:
- Exit with the code 1
- Exit with no printed output

If the input text matches the pattern, your program must:
- Exit with the code 0
- Print the input text to the standard output
- Highlight all the matched text in the input using grep's default color.

### Notes

- The tester accepts multiple valid ANSI-encoded representations of the same highlighted text. To display the bold red text: <span style="color: red; font-weight: bold;">2025</span>, any equivalent combination is acceptable. Examples of valid ANSI sequences:
  - `\u001b[1;31m2025\u001b[0m`
  - `\u001b[1;31m20\u001b[0m\u001b[1;31m25\u001b[0m`
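
One possible way to highlight every match is to walk the matches left to right and copy the unmatched text between them. In this sketch, Python's `re.finditer` stands in for your own matcher, and `highlight_all` is a hypothetical helper name.

```python
import re

OPEN = "\033[01;31m\033[K"
CLOSE = "\033[m\033[K"


def highlight_all(pattern: str, line: str):
    """Return (exit_code, output), wrapping every non-empty match in the line."""
    pieces = []
    last_end = 0
    found_any = False
    for match in re.finditer(pattern, line):
        found_any = True
        if match.start() == match.end():
            continue  # an empty match selects the line but has nothing to color
        pieces.append(line[last_end:match.start()])   # text before this match
        pieces.append(OPEN + match.group(0) + CLOSE)  # the highlighted match
        last_end = match.end()
    if not found_any:
        return 1, ""
    pieces.append(line[last_end:])                    # text after the last match
    return 0, "".join(pieces)


print(repr(highlight_all(r"(jekyll|hyde)", "jekyll and hyde")[1]))
```

With the pattern `\d` and the input `2025`, this sketch wraps each digit separately. As the note above explains, that is an equivalent representation of the same highlighted text.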
30 changes: 30 additions & 0 deletions stage_descriptions/highlighting-03-jk4.md
@@ -0,0 +1,30 @@
In this stage, you'll add support for disabling the highlighting in your grep implementation using the `never` coloring option.

### The `--color=never` option

When the `--color=never` option is used, grep prints each matching line to the standard output as-is. It does not highlight the matched text.

Example usage:

```bash
$ echo -n "Sally has 3 parrots" | grep --color=never -E "par+ots?"
Sally has 3 parrots
```

### Tests

The tester will execute your program like this:

```bash
$ echo -n "I have 5 vegetables" | grep --color=never -E '\d'
I have 5 vegetables
```

If the input matches the pattern, your program must:
- Exit with the code 0
- Print the input line to the standard output
- Not add any highlighting to the output text, because the `--color=never` option is being used

If the input does not match the pattern, your program must:
- Exit with the code 1
- Exit with no printed output
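
A minimal sketch of how the option might be read and applied is shown below. The helper names `parse_color_mode` and `choose_output` are hypothetical, and the sketch handles only the `--color=VALUE` form used in this challenge.

```python
def parse_color_mode(args):
    """Return the value given to --color=..., or None when the flag is absent."""
    prefix = "--color="
    for arg in args:
        if arg.startswith(prefix):
            return arg[len(prefix):]
    return None


def choose_output(plain_line: str, highlighted_line: str, color_mode) -> str:
    """Pick the plain line when highlighting is disabled with `never`."""
    if color_mode == "never":
        return plain_line
    return highlighted_line


mode = parse_color_mode(["--color=never", "-E", r"\d"])
print(choose_output("I have 5 vegetables", "I have \033[01;31m\033[K5\033[m\033[K vegetables", mode))
```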
82 changes: 82 additions & 0 deletions stage_descriptions/highlighting-04-na5.md
@@ -0,0 +1,82 @@
In this stage, you'll add support for the `auto` coloring option of the `--color` flag in your grep implementation.

### The `auto` color option

When the `--color=auto` option is used, grep behaves in the following manner:

- If the output stream is a [TTY device](https://www.ibm.com/docs/en/aix/7.1.0?topic=communications-tty-terminal-device), like the terminal, highlighting is enabled.

- If the output stream is not a TTY device, for example, the output is piped to another command, or being redirected to a non-TTY device, highlighting is disabled.

Example usage:

<html>
<pre>
<code>$ echo -n "I have 3 cows" | grep --color=auto -E 'cows'
I have 3 <span style="font-weight:bold;color:red">cows</span>
$ echo -n "I have 4 cows" | grep --color=auto -E 'cows' >> /dev/tty
I have 4 <span style="font-weight:bold;color:red">cows</span></code>
</pre>
</html>

The matched text is highlighted in both cases because the output stream is a TTY device.

When the output stream is piped to another command, or redirected to a non-TTY device, the ANSI highlighting sequences are not placed in the output text.

```bash
# Output stream is piped to another command
$ echo -n "I have 3 apples" | grep --color=auto -E '\d' | hexdump -C
00000000 49 20 68 61 76 65 20 33 20 61 70 70 6c 65 73 0a |I have 3 apples.|
00000010

# Output stream is redirected to a non-TTY device
$ echo -n "I have 4 apples" | grep --color=auto -E '\d' >> output.txt

$ hexdump -C output.txt
00000000 49 20 68 61 76 65 20 34 20 61 70 70 6c 65 73 0a |I have 4 apples.|
00000010
```

### Tests

The tester will execute your program like this:

<html>
<pre>
<code>$ echo -n "I have 4 cats" | grep --color=auto -E 'cats'
I have 4 <span style="color:red; font-weight:bold;">cats</span>
$ echo -n "I have 5 cats" | grep --color=auto -E 'cats' >> /dev/tty
I have 5 <span style="color:red; font-weight:bold;">cats</span></code>
</pre>
</html>

If the input does not match the pattern, your program must:
- Exit with the code 1

If the input text matches the pattern, your program must:
- Exit with the code 0
- Print the input text to the standard output
- Highlight the matched text in the output

The tester will also execute your program like this:

```bash
# Redirection to a non-tty device
$ echo -n "I have 3 horses" | grep --color=auto -E '\d' >> file.txt

# Piping to another command
$ echo -n "He has 9 rabbits" | grep --color=auto -E '\d' | another_command
```

For both of these cases:

If the input does not match the pattern, your program must:
- Exit with the code 1
- Exit with no printed output

If the input text matches the pattern, your program must:
- Exit with the code 0
- Write the input text to the file `file.txt`, or pass it to the other command, depending on the case
- Not include the ANSI escape sequences for highlighting in the file contents, or in the text passed to the other command

### Notes

- You might find it helpful to use the equivalent of the [`isatty()`](https://man7.org/linux/man-pages/man3/isatty.3.html) function in your programming language to check whether the output stream is a TTY device.
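
In Python, for example, file objects expose an `isatty()` method. The sketch below shows one way the three coloring options could be resolved. The `should_highlight` helper is a hypothetical name used only for illustration.

```python
import io
import sys


def should_highlight(color_mode: str, stream) -> bool:
    """Decide whether to emit ANSI sequences for the given output stream."""
    if color_mode == "always":
        return True
    if color_mode == "never":
        return False
    # "auto": highlight only when the output stream is a TTY device
    return stream.isatty()


# The result for sys.stdout depends on where the output is going.
print(should_highlight("auto", sys.stdout))
# An in-memory buffer is never a TTY, so highlighting is disabled.
print(should_highlight("auto", io.StringIO()))  # False
```

Running this script directly in a terminal and then piping it through another command (for example `| cat`) is a quick way to see the first result change.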
69 changes: 69 additions & 0 deletions stage_descriptions/highlighting-05-nd0.md
@@ -0,0 +1,69 @@
In this stage, you'll implement the `auto` coloring option as the default coloring option when the `--color` flag is not present.

### The default behavior

When the `--color` flag is not present, grep behaves as if the `--color=auto` option had been used.
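
One way to express this default is to fall back to `auto` whenever no `--color=...` argument is found. The sketch below is an illustration under that assumption. The helper names are hypothetical, and an in-memory buffer stands in for a non-TTY output stream.

```python
import io


def resolve_color_mode(args) -> str:
    """Return the requested color mode, defaulting to "auto" when absent."""
    for arg in args:
        if arg.startswith("--color="):
            return arg.split("=", 1)[1]
    return "auto"


def highlighting_enabled(args, stream) -> bool:
    """Combine the resolved mode with a TTY check on the output stream."""
    mode = resolve_color_mode(args)
    if mode == "always":
        return True
    if mode == "never":
        return False
    return stream.isatty()


print(resolve_color_mode(["-E", r"\d"]))                   # auto
print(highlighting_enabled(["-E", r"\d"], io.StringIO()))  # False
```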

Example usage:

<html>
<pre>
<code>$ echo -n "I have 3 cows" | grep -E 'cows'
I have 3 <span style="font-weight:bold;color:red">cows</span>
$ echo -n "I have 4 cows" | grep -E 'cows' >> /dev/tty
I have 4 <span style="font-weight:bold;color:red">cows</span></code>
</pre>
</html>

The matched text is highlighted in both cases because the output stream is a TTY device.

When the output stream is piped to another command, or redirected to a non-TTY device, the ANSI highlighting sequences are not placed in the output text.

```bash
# Output stream is piped to another command
$ echo -n "I have 3 apples" | grep -E '\d' | hexdump -C
00000000 49 20 68 61 76 65 20 33 20 61 70 70 6c 65 73 0a |I have 3 apples.|
00000010

# Output stream is redirected to a non-TTY device
$ echo -n "I have 4 apples" | grep -E '\d' >> output.txt

$ hexdump -C output.txt
00000000 49 20 68 61 76 65 20 34 20 61 70 70 6c 65 73 0a |I have 4 apples.|
00000010
```

### Tests

The tester will execute your program like this:

```bash
$ echo -n "I have 3 horses" | grep -E '\d'
```

If the input does not match the pattern, your program must:
- Exit with the code 1
- Exit with no printed output

If the input text matches the pattern, your program must:
- Exit with the code 0
- Print the input text to the standard output
- Highlight the matched text in the output

The tester will also execute your program like this:

```bash
# Redirection to a non-tty device
$ echo -n "I have 3 horses" | grep -E '\d' >> file.txt

# Piping to another command
$ echo -n "He has 9 rabbits" | grep -E '\d' | another_command
```

For both of these cases:

If the input does not match the pattern, your program must:
- Exit with the code 1

If the input text matches the pattern, your program must:
- Exit with the code 0
- Write the input text to the file `file.txt`, or pass it to the other command, depending on the case
- Not include the ANSI escape sequences for highlighting in the file contents, or in the text passed to the other command