rkelly (and rkelly-remix) unable to parse non-ASCII identifiers #35

@MatmaRex

The following is valid JavaScript syntax:

ą.ś = {ą: 'ś'}

rkelly-remix fails badly when trying to parse it:

require 'rkelly'
puts RKelly::Parser.new.parse("ą.ś = {ą: 'ś'}").to_ecma
C:/Ruby200-x64/lib/ruby/gems/2.0.0/gems/rkelly-remix-0.0.6/lib/rkelly/tokenizer.rb:201:in `raw_tokens': undefined method `name' for nil:NilClass (NoMethodError)
        from C:/Ruby200-x64/lib/ruby/gems/2.0.0/gems/rkelly-remix-0.0.6/lib/rkelly/parser.rb:40:in `parse'
        from rkelly.rb:2:in `<main>'

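According to the ECMAScript Language Specification (ECMA-262, Edition 5.1, section 7.6), identifiers can contain any Unicode letters and digits (and some other things too, such as $, _, combining marks, connector punctuation and \uXXXX escapes), although they cannot start with a digit. According to rkelly and rkelly-remix, they may only contain [_\$0-9A-Za-z].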
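
To make the difference concrete, here is a rough sketch of what a Unicode-aware identifier pattern could look like in Ruby. This is not rkelly's actual tokenizer code: the names IDENTIFIER and leading_identifier are made up for illustration, the character classes are my reading of section 7.6, and \uXXXX escapes inside identifiers are left out to keep it short.

```ruby
# encoding: utf-8
# Sketch only: IDENTIFIER and leading_identifier are hypothetical names,
# not part of rkelly or rkelly-remix.
#
# First character: $, _, or a Unicode letter (categories L and Nl).
# Later characters: the same, plus combining marks (Mn, Mc), decimal
# digits (Nd), connector punctuation (Pc), and ZWNJ/ZWJ (U+200C, U+200D).
IDENTIFIER = /\A[\$_\p{L}\p{Nl}][\$_\p{L}\p{Nl}\p{Mn}\p{Mc}\p{Nd}\p{Pc}\u200C\u200D]*/

# Returns the identifier at the start of +source+, or nil if there is none.
def leading_identifier(source)
  match = IDENTIFIER.match(source)
  match && match[0]
end

leading_identifier("ą.ś = {ą: 'ś'}")  # => "ą"
leading_identifier("100abc")          # => nil
```

The input has to be a UTF-8 encoded string for the \p{...} properties to match non-ASCII characters. A real patch would also need to handle reserved words and escape sequences, which this sketch ignores.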

@nene, it shouldn't be very hard to extend the tokenizer to support this, should it? I could probably do the boring work of writing the patch. :)

(This was encountered in the wild with JSDuck trying to parse this file: https://gerrit.wikimedia.org/r/#/c/147828/1/modules/ve/ui/inspectors/ve.ui.SpecialCharacterInspector.js – this diff removes the quotes around accented characters to satisfy jscs's disallowQuotedKeysInObjects rule. Original error log can be seen here: https://integration.wikimedia.org/ci/job/VisualEditor-jsduck/1705/console.)

(Let me also note that the error messages returned when encountering unknown syntax are unpleasant, e.g. the above "undefined method `name' for nil:NilClass (NoMethodError)", or "something bad happened, please report a bug with sample JavaScript (RuntimeError)" for var a = 100abc;.)
