Question for the implementation of ParseFileParallel

Hello.

I also have a question about the implementation of `ParseFileParallel`.
As I understand it, in the multithreaded configuration each thread calls `ProcessBlocksImpl` with its own `block_begin` and `block_end`.

My concern is how your code handles the case where the buffer has an incomplete line at the end of a block.
For example, let's assume thread 2 is given `block_begin` 4 and `block_end` 8 in `ProcessBlocksImpl`. Here is a hypothetical OBJ file for this example:
```
# BLOCK 4 Start
v 0.0 0.0 0.0
...
# BLOCK 4 End

# BLOCK 5 Start
v 0.0 0.0 0.0
...
# BLOCK 5 End

# BLOCK 6 Start
v 0.0 0.0 0.0
...
# BLOCK 6 End

# BLOCK 7 Start
v 0.0 0.0 0.0
v 0.0 0.0 0.0
...
v 0.0 0.0
# BLOCK 7 End

# BLOCK 8 Start
0.0
v 0.0 0.0 0.0
...
# BLOCK 8 End
```
In this case, processing BLOCK 7 ends on an incomplete line, `v 0.0 0.0`, which is missing one component of the vertex. I think your code does not handle this in the multithreaded case. In the single-threaded case it does: with `stop_parsing_after_eol` set to false, the rest of the line is copied into `back_buffer` using the `remainder` variable.
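To make the single-threaded behaviour I am describing concrete, here is a minimal sketch of carrying an unfinished line over from one block to the next. It is only an illustration of the idea, not rapidobj's code: `SplitLinesWithCarry` and `carry` are hypothetical names, and the real implementation works with `back_buffer` and `remainder` rather than a `std::string`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical helper, not part of rapidobj: walks `file` in fixed-size
// blocks and returns the complete lines, carrying the unfinished tail of
// each block over to the next one.
std::vector<std::string> SplitLinesWithCarry(std::string_view file, std::size_t block_size)
{
    assert(block_size > 0);

    std::vector<std::string> lines;
    std::string carry;  // unfinished tail of the previous block

    for (std::size_t offset = 0; offset < file.size(); offset += block_size) {
        std::string_view block = file.substr(offset, block_size);

        std::size_t start = 0;
        for (;;) {
            std::size_t eol = block.find('\n', start);
            if (eol == std::string_view::npos) {
                break;
            }
            // Join the carried-over text with the rest of the line.
            carry.append(block.substr(start, eol - start));
            lines.push_back(std::move(carry));
            carry.clear();
            start = eol + 1;
        }
        // No newline left in this block: remember the partial line.
        carry.append(block.substr(start));
    }

    if (!carry.empty()) {
        lines.push_back(std::move(carry));  // final line without a trailing '\n'
    }
    return lines;
}
```

With a block size of 10, the line `v 4 5 6` in `"v 1 2 3\nv 4 5 6\nv 7 8 9\n"` is split across two blocks and is still returned whole.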

My guess is that the problem comes from `stop_parsing_after_eol` being set to true in the multithreaded case.

https://github.com/guybrush77/rapidobj/blob/744374a5d21fe01704eab0f36e633e8d620265e5/include/rapidobj/rapidobj.hpp#L7124-L7133
In the code above, `stop_parsing_after_eol` is set to true for every thread except the last one. As a result, in the following code,
https://github.com/guybrush77/rapidobj/blob/744374a5d21fe01704eab0f36e633e8d620265e5/include/rapidobj/rapidobj.hpp#L6932-L6962
when `i` reaches `block_end - 1` (its last value), the thread processes at most one line and then returns from `ProcessBlocksImpl` in the `else if (stop_parsing_after_eol)` branch, without handling the rest of the text. Even if `stop_parsing_after_eol` were set to false for the other threads, more code would be needed to handle the last line of BLOCK 7, which is missing an element. I think you have to read the next block (BLOCK 8 in my example) and then process one line from it to get the missing element.
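The scheme I have in mind can be sketched as follows. This is one common way to split a text file between workers, written under my own assumptions, and I am not claiming it is what rapidobj does: `LinesOwnedByRange` is a hypothetical function, and it works on byte ranges of an in-memory string rather than on blocks read from disk. A worker skips a line that started before its range and reads past the end of its range to finish the last line it started.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper, not rapidobj code. Returns the lines that START
// inside the byte range [begin, end) of `file`. The last such line may
// extend beyond `end`, in which case it is read to completion.
std::vector<std::string> LinesOwnedByRange(std::string_view file, std::size_t begin, std::size_t end)
{
    assert(begin <= end && end <= file.size());

    std::size_t pos = begin;

    // If the range starts in the middle of a line, that line belongs to
    // the previous range, so skip to the start of the next line.
    if (begin > 0 && file[begin - 1] != '\n') {
        std::size_t eol = file.find('\n', begin);
        if (eol == std::string_view::npos) {
            return {};
        }
        pos = eol + 1;
    }

    std::vector<std::string> lines;
    while (pos < end) {
        // The newline may lie beyond `end`: this is the "read into the
        // next block to finish the line" step.
        std::size_t eol = file.find('\n', pos);
        if (eol == std::string_view::npos) {
            eol = file.size();
        }
        lines.emplace_back(file.substr(pos, eol - pos));
        pos = eol + 1;
    }
    return lines;
}
```

For `"v 1 2 3\nv 4 5 6\nv 7 8 9\n"` split at byte 12, the first range yields `v 1 2 3` and `v 4 5 6` (the second line is finished by reading past byte 12), and the second range yields only `v 7 8 9`, so every line is parsed exactly once.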

I may be misunderstanding your code, but after looking through it for two days, this is still how it appears to work to me.
If you have any thoughts on this, please let me know.

	for (size_t i = 0; i != tasks.size(); ++i) {
	    bool is_last                = i + 1 == tasks.size();
	    auto begin                  = tasks[i];
	    auto end                    = is_last ? num_blocks : (tasks[i + 1] + 1);
	    bool stop_parsing_after_eol = !is_last;
	    auto chunk                  = &(*chunks)[i];

	    threads.emplace_back(ProcessBlocks, source, i, begin, end, stop_parsing_after_eol, chunk, context);
	    threads.back().detach();
	}
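To see which block ranges the loop above hands out, the arithmetic can be reproduced in isolation. This is only a sketch: `BlockRanges` is a hypothetical helper, and I am assuming that `tasks` holds the index of the first block of each thread. The `begin` and `end` expressions are copied from the quoted loop.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper mirroring the begin/end arithmetic of the quoted loop.
// Assumption: `tasks` holds the first block index of each thread.
std::vector<std::pair<std::size_t, std::size_t>>
BlockRanges(const std::vector<std::size_t>& tasks, std::size_t num_blocks)
{
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    for (std::size_t i = 0; i != tasks.size(); ++i) {
        bool        is_last = i + 1 == tasks.size();
        std::size_t begin   = tasks[i];
        std::size_t end     = is_last ? num_blocks : tasks[i + 1] + 1;
        assert(begin < end);
        ranges.emplace_back(begin, end);  // half-open range [begin, end)
    }
    return ranges;
}
```

With `tasks = {0, 4, 8}` and `num_blocks = 12`, this gives `[0, 5)`, `[4, 9)` and `[8, 12)`. Under the assumption above, the `+ 1` means the ranges of neighbouring threads overlap by one block, which may be relevant to the question of who parses the line that straddles the boundary.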

	for (size_t i = block_begin; i != block_end; ++i) {
	    auto remainder = size_t{};

	    bool last_block = (i + 1 == block_end) || reached_eof;

	    if (!last_block) {
	        file_offset = (i + 1) * kBlockSize;

	        if (auto ec = reader->ReadBlock(file_offset, kBlockSize, back_buffer + kMaxLineLength)) {
	            chunk->error = Error{ ec };
	            return;
	        }

	    } else if (stop_parsing_after_eol) {
	        if (auto ptr = static_cast<const char*>(memchr(text.data(), '\n', kMaxLineLength))) {
	            auto pos = static_cast<size_t>(ptr - text.data());
	            line     = text.substr(0, pos);
	            if (EndsWith(line, '\r')) {
	                line.remove_suffix(1);
	            }
	            ++chunk->text.line_count;
	            if (auto rc = ProcessLine(line, chunk, context); rc != rapidobj_errc::Success) {
	                chunk->error = Error{ make_error_code(rc), std::string(line), chunk->text.line_count };
	            }
	        } else {
	            ++chunk->text.line_count;
	            auto ec      = make_error_code(rapidobj_errc::LineTooLongError);
	            chunk->error = Error{ ec, std::string(text, 0, kMaxLineLength), chunk->text.line_count };
	        }
	        return;
	    }

Question for the implementation of ParseFileParallel #29
