Gh 3902 new in-memory graph GraphMemIndexedSet #3903
Draft
arne-bdt wants to merge 3 commits into apache:main
Minor bugs fixed:
- FastArrayBunch no longer holds references to orphans after #tryRemove and #removeUnchecked
- FastHashMap#computeIfAbsent no longer creates an invalid state if the call to absentValueSupplier fails
- FastHashMap#compute now grows on insert like all other insert operations of this map

Improvements:
- Iterators and spliterators no longer need a Runnable for the concurrency check. They check the size directly, which has been shown to be faster.
- FastHashBase#fillPositionsArray now iterates over the dense keys array, which should be faster in most cases.
- Removed FastHashSet#IndexedKey and the corresponding iterator and spliterator implementations --> introduced FastHashBase#forEachKey and #forEachKeyParallel instead
- Added JavaDoc
- Added tests
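
Two of the points above (clearing orphaned references on removal, and checking the size directly instead of calling a Runnable) can be illustrated with a minimal sketch. `DenseBag` and its `hasOrphans` helper are hypothetical names invented for this illustration; this is not the Jena `FastArrayBunch` code, only one plausible shape of the idea. Note that a size-only check, as written here, cannot detect a modification that leaves the size unchanged (for example one add followed by one remove).

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Objects;

/** Hypothetical dense bag for illustration; not the Jena FastArrayBunch implementation. */
final class DenseBag<E> implements Iterable<E> {
    private Object[] items = new Object[4];
    private int size = 0;

    void add(E e) {
        Objects.requireNonNull(e);
        if (size == items.length) {
            items = Arrays.copyOf(items, size * 2); // grow the backing array
        }
        items[size++] = e;
    }

    /** Removes the first element equal to e by moving the last element into its slot. */
    boolean tryRemove(E e) {
        for (int i = 0; i < size; i++) {
            if (items[i].equals(e)) {
                items[i] = items[--size];
                items[size] = null; // clear the vacated slot so no orphan reference stays reachable
                return true;
            }
        }
        return false;
    }

    int size() {
        return size;
    }

    /** Demo helper: true if any slot beyond size still references an object. */
    boolean hasOrphans() {
        for (int i = size; i < items.length; i++) {
            if (items[i] != null) return true;
        }
        return false;
    }

    @Override
    public Iterator<E> iterator() {
        final int expectedSize = size; // snapshot taken when the iterator is created
        return new Iterator<E>() {
            private int pos = 0;

            @Override
            public boolean hasNext() {
                return pos < expectedSize;
            }

            @Override
            @SuppressWarnings("unchecked")
            public E next() {
                // Direct size comparison instead of invoking a Runnable check.
                if (size != expectedSize) throw new ConcurrentModificationException();
                if (pos >= expectedSize) throw new NoSuchElementException();
                return (E) items[pos++];
            }
        };
    }
}
```

In this sketch, an iterator created before a call to `add` or `tryRemove` throws `ConcurrentModificationException` on its next call to `next()`, because the bag's size no longer matches the snapshot.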
Fixing jena-benchmarks:
- updated pom.xml files to version "6.2.0-SNAPSHOT"
- fixed spliterator benchmarks to support the new Sized parameter
GraphMemIndexedSet is based on the architecture of GraphMemRoaring, but replaces the RoaringBitmaps with simple index lists and a reverse index.

Benefits:
- Memory footprint is comparable to GraphMemFast for smaller graphs of up to 1M triples. For larger graphs (BSBM 25M and 50M), the footprint is even smaller.
- Graph#add speed is comparable to GraphMemFast (depending on the graph)
- Graph#delete speed is faster than GraphMemFast, especially for large graphs
- Graph#find / #stream:
  - S__, _P_, __O --> slightly slower than GraphMemFast
  - SPO --> faster than GraphMemFast
  - ___ --> faster than GraphMemFast
  - SP_, S_O, _PO --> faster than GraphMemFast in most cases
- Graph#contains:
  - SP_, S_O, _PO --> dependent on insert order --> the only non-optimal thing I discovered
  - other match patterns behave like #find
- GraphMem#copy --> faster than GraphMemFast
- Supports the same indexing strategies as GraphMemRoaring
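
To make "index lists and a reverse index" more concrete, here is a heavily simplified sketch under stated assumptions. It indexes by subject only, uses plain `String` nodes, and never reuses triple ids. The class `SubjectIndexSketch` and all of its members are hypothetical; the actual GraphMemIndexedSet data layout may differ. The point it illustrates is that a reverse index (triple id to position in its index list) lets deletion use an O(1) swap-with-last instead of scanning the list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of index lists plus a reverse index; not the Jena implementation. */
final class SubjectIndexSketch {
    record Triple(String s, String p, String o) {}

    // triple id -> triple (null once deleted; ids are not reused in this sketch)
    private final List<Triple> triples = new ArrayList<>();
    // triple -> triple id
    private final Map<Triple, Integer> idOf = new HashMap<>();
    // index list: subject -> ids of all triples with that subject
    private final Map<String, List<Integer>> bySubject = new HashMap<>();
    // reverse index: triple id -> position of that id inside its subject's index list
    private final List<Integer> posInSubjectList = new ArrayList<>();

    boolean add(Triple t) {
        if (idOf.containsKey(t)) return false;
        int id = triples.size();
        triples.add(t);
        idOf.put(t, id);
        List<Integer> ids = bySubject.computeIfAbsent(t.s(), k -> new ArrayList<>());
        posInSubjectList.add(ids.size()); // remember where the id will sit in the list
        ids.add(id);
        return true;
    }

    boolean delete(Triple t) {
        Integer boxedId = idOf.remove(t);
        if (boxedId == null) return false;
        int id = boxedId;
        List<Integer> ids = bySubject.get(t.s());
        int pos = posInSubjectList.get(id);
        int lastId = ids.remove(ids.size() - 1); // pop the last id of the index list
        if (lastId != id) {
            ids.set(pos, lastId);              // move it into the freed position
            posInSubjectList.set(lastId, pos); // keep the reverse index consistent
        }
        if (ids.isEmpty()) bySubject.remove(t.s());
        triples.set(id, null); // drop the reference to the deleted triple
        return true;
    }

    /** Pattern S__ : all triples with the given subject. */
    List<Triple> findBySubject(String s) {
        List<Triple> out = new ArrayList<>();
        for (int id : bySubject.getOrDefault(s, List.of())) {
            out.add(triples.get(id));
        }
        return out;
    }

    int size() {
        return idOf.size();
    }
}
```

Because deletion moves the last id into the freed slot, the order of ids within an index list depends on the history of inserts and deletes in this sketch.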
GitHub issue resolved #3902
Pull request Description:
GraphMemIndexedSet is based on the architecture of GraphMemRoaring, but replaces the RoaringBitmaps with simple index lists and a reverse index.

Benefits:
- Graph#add speed is comparable to GraphMemFast (depending on the graph)
- Graph#delete speed is faster than GraphMemFast, especially for large graphs
- Graph#find / #stream:
  - S__, _P_, __O --> slightly slower than GraphMemFast
  - SPO --> faster than GraphMemFast
  - ___ --> faster than GraphMemFast
  - SP_, S_O, _PO --> faster than GraphMemFast in most cases
- Graph#contains:
  - SP_, S_O, _PO --> dependent on insert order --> the only non-optimal thing I discovered
  - other match patterns behave like #find
- GraphMem#copy --> faster than GraphMemFast

GraphMemRoaring could be deprecated, due to worse performance in all disciplines.

By submitting this pull request, I acknowledge that I am making a contribution to the Apache Software Foundation under the terms and conditions of the Contributor's Agreement.
See the Apache Jena "Contributing" guide.