@cuavas cuavas commented Nov 3, 2025

This adds BFXU/BFXS/DBFXU/DBFXS instructions to the recompiler framework for extracting unsigned or signed bit fields (something @galibert wanted). They work similarly to the three-parameter form of MAME’s BIT helper, or the ARM ubfx/sbfx instructions.

This does add one piece of undefined behaviour: if the width operand modulo the operand size in bits is zero, the result is undefined. I don’t particularly like undefined behaviour, but this makes the back-end implementations substantially simpler.

These generate the best code when the shift and width operands are both immediate values and the field doesn’t wrap around from MSB to LSB of the source, but the code for other cases is pretty decent for x86-64 and AArch64. I didn’t put a great deal of effort into optimising for i686, particularly for the 64-bit forms – I’m really done caring, and i686 just doesn’t have enough registers to have much fun.
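
For illustration, here is a rough C++ model of what the 32-bit forms compute as I read the description above. It is only a sketch and not the back-end code: the names `rotr32`, `bfxu32` and `bfxs32` are made up, and it assumes the shift is taken modulo 32 and that a field running past the MSB continues at the LSB (in other words, a rotate followed by a mask). The case where the width modulo 32 is zero is undefined for the real instructions; this model arbitrarily returns zero there.

```cpp
#include <cstdint>

// Hypothetical helper: rotate right, with the count taken modulo 32 (assumption).
constexpr std::uint32_t rotr32(std::uint32_t value, unsigned shift)
{
    shift &= 31;
    return shift ? ((value >> shift) | (value << (32 - shift))) : value;
}

// Hypothetical model of the unsigned extract: rotate the field down to bit 0,
// then mask off everything above it.
constexpr std::uint32_t bfxu32(std::uint32_t src, unsigned shift, unsigned width)
{
    width &= 31; // width % 32 == 0 is undefined in the PR; this model yields 0
    std::uint32_t const mask = (std::uint32_t(1) << width) - 1;
    return rotr32(src, shift) & mask;
}

// Hypothetical model of the signed extract: same field, sign-extended from
// its most significant bit.
constexpr std::int32_t bfxs32(std::uint32_t src, unsigned shift, unsigned width)
{
    width &= 31;
    if (!width)
        return 0; // undefined case in the PR; arbitrary choice for this model
    std::uint32_t const field = bfxu32(src, shift, width);
    std::uint32_t const sign = std::uint32_t(1) << (width - 1);
    return std::int32_t((field ^ sign) - sign);
}
```

In this model, `bfxu32(0x12345678, 8, 8)` gives `0x56`, and a field that ends exactly at the MSB (shift plus width equal to 32) is just a logical right shift.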

I updated the Hyperstone E1 recompiler to use the new instructions. The majority of these cases get converted to SHR anyway (extracting the FP field from SR), so it doesn’t really affect the generated code much, but it’s clearer seeing it expressed this way.

In addition to the main purpose, this also includes a few bits and pieces:

  • Uses the x86-64 LZCNT instruction if available to implement the UML LZCNT instruction, and also optimises the BSR-based implementation. This isn’t a frequently-used instruction, but it’s still an easy optimisation.
  • Avoids unnecessary x86-64 REX prefixes in a few situations.
  • Standardises on abbreviated integer type names in the x86-64 back-end as there was a mixture of standard and abbreviated names.
  • Slightly optimises i686 code generation for DROLAND and DROLINS when flags need to be calculated. This doesn’t happen frequently, but I had to mess with related code for other reasons, so it’s basically free.
  • Slightly adjusts generated code for one case of the Hyperstone E1 SET instruction to produce tighter native code on i686 and x86-64.
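
As a rough illustration of the LZCNT item above, this is how a leading-zero count can be built on a BSR-style "index of highest set bit" operation. It is a C++ sketch rather than the emitted x86-64 code: `bsr32` and `lzcnt32_via_bsr` are made-up names, `bsr32` is a portable stand-in for the BSR instruction, and returning 32 for a zero input is an assumption matching what the x86 LZCNT instruction does for a 32-bit operand.

```cpp
#include <cstdint>

// Hypothetical stand-in for x86 BSR: index of the highest set bit.
// Like the real instruction, the result is only meaningful for non-zero input.
constexpr unsigned bsr32(std::uint32_t value)
{
    unsigned index = 0;
    while (value >>= 1)
        ++index;
    return index;
}

// Leading-zero count built on BSR. Zero needs special handling because BSR
// has no useful result for it; a native LZCNT handles that case directly.
constexpr unsigned lzcnt32_via_bsr(std::uint32_t value)
{
    return value ? (31u - bsr32(value)) : 32u;
}
```

The extra zero check is what a native LZCNT instruction makes unnecessary.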

cpu/drcbex64.cpp: Also added LZCNT implementation using x86 LZCNT
instruction and optimised the BSR-based implementation.