Add UML bit field extract instructions #14467
Open
This adds `BFXU`/`BFXS`/`DBFXU`/`DBFXS` instructions to the recompiler framework for extracting unsigned or signed bit fields (something @galibert wanted). They work similarly to the three-parameter form of MAME’s `BIT` helper, or the ARM `ubfx`/`sbfx` instructions.

This does add one piece of undefined behaviour: if the `width` operand modulo the operand size in bits is zero, the result is undefined. I don’t particularly like undefined behaviour, but this makes the back-end implementations substantially simpler.

These generate the best code when the `shift` and `width` operands are both immediate values and the field doesn’t wrap around from MSB to LSB of the source, but the code for other cases is pretty decent for x86-64 and AArch64. I didn’t put a great deal of effort into optimising for i686, particularly for the 64-bit forms – I’m really done caring, and i686 just doesn’t have enough registers to have much fun.

I updated the Hyperstone E1 recompiler to use the new instructions. The majority of these cases get converted to `SHR` anyway (extracting the FP field from SR), so it doesn’t really affect the generated code much, but it’s clearer seeing it expressed this way.

In addition to the main purpose, this also includes a few bits and pieces:
- … `LZCNT` instruction if available to implement the UML `LZCNT` instruction, and also optimises the `BSR`-based implementation. This isn’t a frequently-used instruction, but it’s still an easy optimisation.
- … `DROLAND` and `DROLINS` when flags need to be calculated. This doesn’t happen frequently, but I had to mess with related code for other reasons, so it’s basically free.
- … `SET` instruction to produce tighter native code on i686 and x86-64.
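
For readers unfamiliar with bit field extraction, here is a minimal reference sketch of the 32-bit semantics as I understand them from the description: rotate the source right by `shift`, keep the low `width` bits, and sign-extend for the signed form. The function names `rotr32`, `bfxu32` and `bfxs32` are hypothetical and are not part of the recompiler framework. A width of zero modulo 32 is undefined per the description, so returning 0 in that case is purely a choice made for this sketch.

```cpp
#include <cstdint>

// Rotate a 32-bit value right; the shift is taken modulo 32.
constexpr std::uint32_t rotr32(std::uint32_t value, unsigned shift)
{
    shift &= 31;
    return shift ? ((value >> shift) | (value << (32 - shift))) : value;
}

// Unsigned extract (hypothetical model of BFXU): the field may wrap from
// MSB to LSB because the source is rotated rather than shifted.
constexpr std::uint32_t bfxu32(std::uint32_t src, unsigned shift, unsigned width)
{
    width &= 31; // width % 32 == 0 is undefined; this sketch yields 0
    const std::uint32_t mask = (std::uint32_t(1) << width) - 1;
    return rotr32(src, shift) & mask;
}

// Signed extract (hypothetical model of BFXS): same field, sign-extended
// from its most significant bit.
constexpr std::int32_t bfxs32(std::uint32_t src, unsigned shift, unsigned width)
{
    width &= 31;
    if (width == 0)
        return 0; // undefined case; arbitrary choice for this sketch
    const std::uint32_t field = bfxu32(src, shift, width);
    const std::uint32_t sign = std::uint32_t(1) << (width - 1);
    return std::int32_t((field ^ sign) - sign);
}
```

In this model, when `shift + width` equals the operand size the unsigned extract is just a logical right shift (`bfxu32(x, 25, 7) == x >> 25`), which is consistent with most of the Hyperstone cases ending up as `SHR`.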