Skip to content

Adding support for operators #9

@Rua

Description

@Rua

I am thinking of making a PR to add support for operators to shlex. This would follow the same model as Python-shlex, where the caller specifies which characters they want to treat as operators. Then, the lexer ensures that unquoted strings contain either runs of only operator or non-operator characters (in a quoted string they can still be mixed). No actual interpretation of the operators is done, they just represent a separate set of words.

The current optimisation in shlex, where the input string is iterated by byte instead of codepoint, gets in the way of this however. In order to support any Unicode codepoint as an operator, the lexer has to receive potentially multibyte characters in one go, not individual UTF-8 high bytes. Alternatively, the caller could be asked to specify operator characters as bytes, but this brings its own safety problems; what if the user specifies a high byte as an operator character?

Metadata

Metadata

Assignees

No one assigned

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions