Adding support for operators

I am thinking of making a PR to add support for operators to shlex. This would follow the same model as Python's shlex, where the caller specifies which characters they want to treat as operators. The lexer then ensures that each unquoted word consists entirely of operator characters or entirely of non-operator characters (inside a quoted string the two can still be mixed). The operators are not interpreted in any way; they simply form a separate set of words.
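For reference, here is a minimal sketch of the model being described, using the `punctuation_chars` parameter of Python's standard `shlex` module. The helper name `split_with_operators` and the choice of `&|;` as operator characters are just for illustration, not part of any proposed API.

```python
import shlex


def split_with_operators(text, operators):
    """Split `text` shell-style, treating `operators` as operator characters.

    `split_with_operators` is a hypothetical helper; it only wraps the
    standard library's shlex.shlex with punctuation_chars set.
    """
    lexer = shlex.shlex(text, posix=True, punctuation_chars=operators)
    return list(lexer)


# Unquoted: runs of operator characters become their own words.
print(split_with_operators("a&&b;c", "&|;"))

# Quoted: operator and non-operator characters may share one word.
print(split_with_operators("echo 'a&&b'", "&|;"))
```

Note that the lexer only separates the runs; it attaches no meaning to `&&` or `;`.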

However, the current optimisation in shlex, where the input string is iterated by byte instead of by codepoint, gets in the way of this. To support any Unicode codepoint as an operator, the lexer has to receive potentially multibyte characters whole, not as individual UTF-8 high bytes. Alternatively, the caller could be asked to specify operator characters as bytes, but this brings its own safety problems: what if the caller specifies a high byte (0x80 or above) as an operator character?
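To make the high-byte concern concrete, here is a rough sketch in Python (chosen only because it is easy to run; it is not the lexer's actual code). `split_at_byte` and `is_valid_utf8` are hypothetical helpers that naively split a byte string into runs of operator and non-operator bytes, which is roughly what a byte-oriented lexer would do if it accepted an arbitrary byte as an operator.

```python
def split_at_byte(data, operator):
    """Naively split `data` into runs of operator / non-operator bytes."""
    words = []
    current = bytearray()
    current_is_op = None
    for byte in data:
        is_op = byte == operator
        # Start a new word whenever we cross an operator boundary.
        if current and is_op != current_is_op:
            words.append(bytes(current))
            current = bytearray()
        current.append(byte)
        current_is_op = is_op
    if current:
        words.append(bytes(current))
    return words


def is_valid_utf8(chunk):
    """Return True if `chunk` decodes cleanly as UTF-8."""
    try:
        chunk.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# An ASCII operator byte never falls inside a multibyte sequence.
ascii_words = split_at_byte("a;→".encode("utf-8"), ord(";"))
print(ascii_words, [is_valid_utf8(w) for w in ascii_words])

# "→" (U+2192) is encoded as E2 86 92; treating 0x86 as an operator
# cuts that sequence apart and leaves words that are not valid UTF-8.
high_words = split_at_byte("a→b".encode("utf-8"), 0x86)
print(high_words, [is_valid_utf8(w) for w in high_words])
```

In a language where strings must be valid UTF-8, the second case is exactly the situation a byte-based operator API would have to reject or guard against.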

Adding support for operators #9
