Copilot AI commented Oct 29, 2025

UTF-8 characters in #!vim perl code blocks were double encoded, so « was displayed as Â«. The same characters rendered correctly in =begin perl blocks.

Root Cause

In inc/WWW-AdventCalendar/lib/WWW/AdventCalendar/Article.pm:

$body = Encode::encode('utf-8', $body);              # Line 121: encode for Pod::Elemental
my $document = Pod::Elemental->read_string($body);
# ... transformers process content ...
$body = $document->as_pod_string;                    # Line 140: returns UTF-8 bytes
$parser->parse_string_document( Encode::encode('utf-8', $body) );  # Line 149: re-encodes!

as_pod_string returns UTF-8 bytes. Encoding those bytes a second time treats each byte as a separate character, turning c2 ab (UTF-8 for «) into c3 82 c2 ab, which displays as Â«.
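
The byte sequences above can be reproduced with Encode alone. This is a minimal sketch rather than code from Article.pm; the helper hex_bytes and the variable names are hypothetical and exist only for illustration.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: render a byte string as space-separated hex pairs.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02x', ord($_) } split //, $bytes;
}

my $chars = "\x{AB}";                          # one character: «
my $once  = Encode::encode('UTF-8', $chars);   # correct: encoded once
my $twice = Encode::encode('UTF-8', $once);    # the bug: bytes treated as characters

print hex_bytes($once),  "\n";   # c2 ab
print hex_bytes($twice), "\n";   # c3 82 c2 ab
```

Decoding the second result as UTF-8 yields the two characters U+00C2 and U+00AB, which is the Â« seen in the rendered page.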

Changes

Decode UTF-8 bytes back to characters after transformation, before passing to Pod::Simple::XHTML:

$body = $document->as_pod_string;
$body = Encode::decode('utf-8', $body);  # Decode bytes → characters
$parser->parse_string_document( $body ); # Pass characters (expects strings, not bytes)

This ensures content is encoded exactly once.
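
The effect of the decode step can be checked in isolation. The sketch below assumes, as described above, that as_pod_string hands back UTF-8 bytes; $pod_bytes is a hypothetical stand-in for that return value, and no Pod::Elemental or Pod::Simple::XHTML calls are involved.

```perl
use strict;
use warnings;
use Encode ();

# Character string containing « and », as in the article source.
my $source = "\x{AB}\$str\x{BB} is OK";

# Hypothetical stand-in for $document->as_pod_string (assumed to be UTF-8 bytes).
my $pod_bytes = Encode::encode('UTF-8', $source);

# Old path: the bytes are encoded again before reaching the parser.
my $buggy = Encode::encode('UTF-8', $pod_bytes);

# New path: decode back to characters, so the output layer encodes only once.
my $body  = Encode::decode('UTF-8', $pod_bytes);
my $fixed = Encode::encode('UTF-8', $body);

print "decode restores the original characters: ", ($body  eq $source    ? "yes" : "no"), "\n";
print "fixed output is single-encoded:          ", ($fixed eq $pod_bytes ? "yes" : "no"), "\n";
print "old output is longer by ", length($buggy) - length($pod_bytes), " bytes\n";
```

Each of the two non-ASCII characters occupies two bytes after one encode and four after two, which accounts for the extra length on the old path.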

Modified Files

  • inc/WWW-AdventCalendar submodule commit updated
    • lib/WWW/AdventCalendar/Article.pm: Added decode step, removed redundant encode

Fixes #541 (related to #505)

Warning

Firewall rules blocked me from connecting to one or more addresses.

I tried to connect to the following addresses, but was blocked by firewall rules:

  • cpanmin.us
    • Triggering command: curl -L REDACTED (dns block)
  • esm.ubuntu.com
    • Triggering command: /usr/lib/apt/methods/https (dns block)

Original prompt

This section details the original issue you should resolve.

<issue_title>UTF-8 is double encoded in #!vim perl</issue_title>
<issue_description>As can be seen in 2024-12-15, UTF-8 encoded characters are displayed correctly in =begin perl, but they're double encoded in #!vim perl:

=begin perl

      ok( headerOK( $str ), "«$str» is OK" );

renders «$str» correctly, but

    #!vim perl

          debug "«$firstLine» is proper markdown ho, ho, ho";

shows as

debug "Â«$firstLineÂ» is proper markdown ho, ho, ho";

</issue_description>

Comments on the Issue (you are @copilot in this section)

@oalders: Related to #505

@oalders: Maybe it shouldn't have been closed. 😄 I guess there's an ongoing issue here. The `vim` syntax highlight is nice because you can choose an arbitrary language, but it sometimes does wacky things with Perl. This would all be easier if we just used Markdown, but I guess Pod is part of the charm.

If you feel inclined to send a PR with more fixes, I'm happy to merge!

Copilot AI changed the title from "[WIP] Fix double encoding of UTF-8 in #!vim perl" to "Fix UTF-8 double encoding in #!vim perl blocks" Oct 29, 2025
Copilot finished work on behalf of oalders October 29, 2025 14:28
Copilot AI requested a review from oalders October 29, 2025 14:28