HTML parser: tg://user?id=<username> mentions are not normalized for _parseMessageText #831

@zeynalnia

Problem

HTMLParser.parse does not normalize the username form of tg://user?id= mention hrefs.

_parseMessageText in gramjs/client/messageParse.ts already resolves MessageEntityTextUrl entities whose url field is either:

  • tg://user?id=<digits> — a positive numeric user id, or
  • @<username> (or +phone) — a username/phone reference.

So the numeric form is already handled end-to-end. What is not handled is the username form: when the HTML contains <a href="tg://user?id=<username>">…</a> (or the @-prefixed variant, tg://user?id=@<username>), the regex in _parseMessageText (/^@|\+|tg:\/\/user\?id=(\d+)/) does not match the resulting url, so the mention is silently dropped.
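
To illustrate the mismatch, the snippet below runs the regex quoted above against the four url shapes discussed in this issue. It is a standalone sketch: `matchesMentionRegex` is a hypothetical helper name used only for this illustration and is not part of gramjs.

```typescript
// Regex copied from the issue text above; the helper name is hypothetical.
const MENTION_RE = /^@|\+|tg:\/\/user\?id=(\d+)/;

function matchesMentionRegex(url: string): boolean {
    return MENTION_RE.test(url);
}

const samples: string[] = [
    "tg://user?id=123456", // numeric id form
    "@alice",              // username reference
    "tg://user?id=alice",  // username form
    "tg://user?id=@alice", // @-prefixed username form
];

// Log whether each sample url is recognized by the regex.
for (const url of samples) {
    console.log(url, "->", matchesMentionRegex(url));
}
```

Only the first two samples match; both username forms of the tg:// link fall through.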

Expected behavior

When the HTML parser encounters <a href="tg://user?id=<username>">name</a> (where <username> is a valid Telegram username, optionally preceded by @), it should produce a MessageEntityTextUrl whose url is the @-prefixed username (@<username>). _parseMessageText will then resolve the user via the existing _replaceWithMention flow.

Reproduction

import { HTMLParser } from "telegram/extensions/html";

const html = '<a href="tg://user?id=alice">name</a>';
const [text, entities] = HTMLParser.parse(html);
console.log(entities[0].url); // currently: "tg://user?id=alice" (dropped by _parseMessageText)
                              // expected:  "@alice"

Notes

  • Numeric ids must continue to round-trip with unparse, so the parser should leave them untouched (the tg://user?id=<digits> form is what _parseMessageText already understands).
  • Negative ids and bot-API style chat/channel ids are not supported by Telegram's mention mechanism and should also be left untouched.
  • Username validation should follow Telegram's documented rules: 5 to 32 characters drawn from [A-Za-z0-9_], with at least one letter or digit; an optional leading @ is accepted.
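
The notes above could be combined into a normalization step along the following lines. This is only a sketch under the rules stated in this issue, not the actual fix: `normalizeMentionHref`, `USERNAME_RE`, and `TG_USER_PREFIX` are hypothetical names, and where such a step would hook into HTMLParser is left open.

```typescript
// Hypothetical helper; none of these names exist in gramjs.
const TG_USER_PREFIX = "tg://user?id=";
// Optional leading "@", then 5 to 32 characters from [A-Za-z0-9_].
const USERNAME_RE = /^@?([A-Za-z0-9_]{5,32})$/;

function normalizeMentionHref(href: string): string {
    // Anything that is not a tg://user?id= link is returned unchanged.
    if (href.slice(0, TG_USER_PREFIX.length) !== TG_USER_PREFIX) {
        return href;
    }
    const value = href.slice(TG_USER_PREFIX.length);

    // Numeric ids (positive or negative) are left untouched so that
    // they keep round-tripping with unparse.
    if (/^-?\d+$/.test(value)) {
        return href;
    }

    const match = USERNAME_RE.exec(value);
    if (match === null) {
        return href; // not a valid username: leave as-is
    }
    const username = match[1];

    // Require at least one letter or digit (rejects e.g. "_____").
    if (!/[A-Za-z0-9]/.test(username)) {
        return href;
    }
    return "@" + username;
}

console.log(normalizeMentionHref("tg://user?id=alice"));
console.log(normalizeMentionHref("tg://user?id=@alice"));
console.log(normalizeMentionHref("tg://user?id=123456"));
```

Checking the numeric case before the username pattern matters here, because a digits-only value of five or more characters would otherwise also pass the character-class test.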
