-
Notifications
You must be signed in to change notification settings - Fork 3.4k
[Python] Optimize Prepend: handle alignment and byte writes in one pass #8809
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| encode.Write(packer.voffset, self.Bytes, self.Head(), x) | ||
| new_head = self.head - N.VOffsetTFlags.bytewidth | ||
| self.head = UOffsetTFlags.py_type(new_head) | ||
| encode.Write(packer.voffset, self.Bytes, new_head, x) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is self.head = UOffsetTFlags.py_type(new_head) coming from in these changes? because i would expect self.Head() == self.head and Write itself did not change 😕
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite sure about you question, new_head is a local variable being used which should be faster than accessing fields self.head or function calls selfHead().
Or maybe you are asking why I added the call to UOffsetTFlags.py_type, to be honest I was shuffling a lot of code around and I saw that in existing code pretty much every time head is assigned it is having this function call on it, like in WriteVtable:
self.head = UOffsetTFlags.py_type(objectStart)
I should have looked closer and I see now that py_type is just int. So really all the XYZ.py_type calls in this code are basically converting to int but everything is already an integer so we don't need this call. I updated #8808 to avoid more casting and the improvements are even better and the code changes are cleaner too.
| def Prepend(self, flags, off): | ||
| self.Prep(flags.bytewidth, 0) | ||
| self.Place(off, flags) | ||
| size = flags.bytewidth |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this primary faster because it skips self.Pad(alignSize) ? I guess this is a value judgment if we want more performance or simpler code here - im fine with both
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes exactly, calling Pad is what is making this original implementation expensive, inlining the implementations and avoiding Pad helps
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After I improved #8808 I see this change really gives a very small bump for all that inlining, feel free to take it or drop it, the existing code is definitely more readable and most of the gains come from part 1 and 2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
given this is such a small improvement I'm tempted to drop it -- especially with no commentary :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, most improvements come from the previous commits!
|
Neat i left a few comments but overall i think this can be merged 👍 |
631232b to
86b8a93
Compare
86b8a93 to
9042940
Compare
Part 3 of 3
Optimize Prepend: handle alignment and byte writes in one pass
This is a break-up of #8766
Improves performance of the PythonTest over part 2 by 5.4%
python-perf-prep.txt
python-perf-prepend.txt