java.lang.ArrayIndexOutOfBoundsException in JSONReader #34

@ferdinand-beyer

We sometimes encounter this error when reading relatively large JSON files:

java.lang.ArrayIndexOutOfBoundsException - Index 1024 out of bounds for length 1024
at  charred.JSONReader/readString at JSONReader.java:144

We are using read-json with an InputStream and default options, running on AWS ECS.

The error appears to be non-deterministic: reading the same file sometimes works and sometimes throws. (Maybe this is because :async? defaults to true and we have multiple cores?)

I think I have tracked the cause down to this code in JSONReader.java:

      int startpos = reader.position();
      int len = buffer.length;
      for(int pos = startpos; pos < len; ++pos) {

	final char curChar = buffer[pos];  // *** This is JSONReader.java:144 ***

	if (curChar == '"') {
	  final Object rv = cb.toString(buffer,startpos,pos,cv);
	  reader.position(pos + 1);
	  return rv;
	} else if (curChar == '\\') {
	  cb.append(buffer,startpos,pos);
	  final int idata = reader.readFrom(pos+1);
	  if (idata == -1)
	    throw new EOFException();
	  final char data = (char)idata;
	  switch(data) {
	  case '"':
	  case '\\':
	  case '/': cb.append(data); break;
	  case 'b': cb.append('\b'); break;
	  case 'f': cb.append('\f'); break;
	  case 'r': cb.append('\r'); break;
	  case 'n': cb.append('\n'); break;
	  case 't': cb.append('\t'); break;
	  case 'u':
	    final char[] temp = tempRead(4);
	    cb.append((char)(Integer.parseInt(new String(temp, 0, 4), 16)));
	    break;
	  default: throw new CharredException("JSON parse error - Unrecognized escape character: " + data);
	  }

	  buffer = reader.buffer(); // *** Modifies `buffer` but not `len`, and does not check for `null` ***

	  startpos = reader.position();
	  //pos will be incremented in loop update
	  pos = startpos - 1;
	}

We eventually refresh the buffer with:

	  buffer = reader.buffer();

Yet we never update len. So if the CharReader returns a next buffer that is shorter than the previous one, pos can run past the end of the new buffer while still being less than the stale len, and we go out of bounds here:

	final char curChar = buffer[pos];

Besides, couldn't buffer also be null at this point, since the result of reader.buffer() is never checked?
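
To illustrate the suspected failure mode outside of charred, here is a minimal, self-contained sketch. Everything in it is an assumption made for illustration: `ChunkReader` is a toy stand-in and not charred's actual reader, `nextBuffer` and `demoReader` are made-up helpers, and the escape handling only covers single-character escapes such as `\"`. The `refreshLen` flag switches between the stale-`len` pattern quoted above and a variant that re-reads `buffer.length` after swapping buffers.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class StaleLenDemo {

    /** Toy stand-in for the reader (hypothetical; not charred's CharReader). */
    static final class ChunkReader {
        private final Deque<char[]> pending = new ArrayDeque<>();
        private char[] buffer;
        private int position;

        ChunkReader(String... chunks) {
            for (String c : chunks) pending.add(c.toCharArray());
            buffer = pending.poll();
        }

        char[] buffer() { return buffer; }
        int position() { return position; }
        void position(int p) { position = p; }

        /** Switches to the next chunk; returns null once the input is exhausted. */
        char[] nextBuffer() {
            buffer = pending.poll();
            position = 0;
            return buffer;
        }

        /** Reads the char at pos, crossing into the next chunk if pos is past the end. */
        int readFrom(int pos) {
            if (buffer != null && pos >= buffer.length) {
                nextBuffer();
                pos = 0;
            }
            if (buffer == null) return -1;
            position = pos + 1;
            return buffer[pos];
        }
    }

    /** Reads up to the closing quote. With refreshLen == false, len goes stale after an escape. */
    static String readString(ChunkReader reader, boolean refreshLen) {
        StringBuilder sb = new StringBuilder();
        char[] buffer = reader.buffer();
        while (buffer != null) {
            int startpos = reader.position();
            int len = buffer.length;
            for (int pos = startpos; pos < len; ++pos) {
                final char curChar = buffer[pos]; // throws here when len is stale
                if (curChar == '"') {
                    sb.append(buffer, startpos, pos - startpos);
                    reader.position(pos + 1);
                    return sb.toString();
                } else if (curChar == '\\') {
                    sb.append(buffer, startpos, pos - startpos);
                    final int idata = reader.readFrom(pos + 1);
                    if (idata == -1) throw new IllegalStateException("EOF inside escape");
                    sb.append((char) idata); // simplified: single-char escapes only
                    buffer = reader.buffer();
                    // Defensive check; this toy reader never returns null after a successful read.
                    if (buffer == null) throw new IllegalStateException("EOF inside string");
                    if (refreshLen) len = buffer.length; // the candidate fix
                    startpos = reader.position();
                    pos = startpos - 1; // pos is incremented by the loop update
                }
            }
            // Chunk exhausted without a closing quote: keep the tail and move on.
            sb.append(buffer, startpos, buffer.length - startpos);
            buffer = reader.nextBuffer();
        }
        throw new IllegalStateException("EOF inside string");
    }

    /** Three chunks of lengths 8, 4 and 2; the escape straddles the first boundary. */
    static ChunkReader demoReader() {
        return new ChunkReader("abcdefg\\", "\"xyz", "!\"");
    }

    public static void main(String[] args) {
        System.out.println(readString(demoReader(), true));
        try {
            readString(demoReader(), false);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("stale len: " + e.getMessage());
        }
    }
}
```

With `refreshLen` set to true the sketch returns `abcdefg"xyz!`. With it set to false, the loop keeps using the first chunk's length of 8 against the second chunk of length 4 and throws an ArrayIndexOutOfBoundsException, which is the same shape as the error above. Whether charred's reader can actually hand out buffers of differing lengths (or null) at that point is something I have not verified.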
