
improved shifting by 48-63 in llshr #784

Open
ZERICO2005 wants to merge 1 commit into master from opt_llshr


Contributor

ZERICO2005 commented Apr 15, 2026

The compiler often emits __llshru(63) to get the sign bit of a (u)int64_t, which takes roughly 1550F clock cycles, whereas a single bit 7, b instruction takes only 2F. This contributes to the very slow performance of long double operations, which often need to shift by 52 or 63 bits. Similarly, __llshrs(63) appears in a "branchless" long long llabs(long long).
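For context, here is a minimal C sketch of the two source patterns mentioned above that tend to lower to a 64-bit shift by 63. The function names `sign_bit_u64` and `branchless_llabs` are hypothetical and used only for illustration. The sketch assumes a 64-bit `long long` and that `>>` on a negative signed value is an arithmetic shift, which is implementation-defined in C.

```c
#include <stdint.h>

/* Hypothetical helper: extracts the sign bit with a logical shift by 63.
 * This is the kind of expression that may be lowered to __llshru(63). */
static inline unsigned sign_bit_u64(uint64_t x) {
    return (unsigned)(x >> 63);
}

/* Hypothetical helper: "branchless" absolute value.
 * mask is 0 for non-negative x and -1 (all ones) for negative x,
 * assuming an arithmetic right shift; this shift is the kind that
 * may be lowered to __llshrs(63). */
static inline long long branchless_llabs(long long x) {
    long long mask = x >> 63;
    /* Done in unsigned arithmetic to avoid signed-overflow UB. */
    unsigned long long m = (unsigned long long)mask;
    return (long long)(((unsigned long long)x ^ m) - m);
}
```

Both helpers only need the top bit of the operand, which is why a general 64-bit shift loop is so wasteful here.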

I have added an optimized path to __llshr(u/s) that handles shift amounts of 48-63 and takes no more than 100F to complete. The new worst case is a shift by 47 (1165F + 110R + 108W + 141).
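The idea behind a 48-63 fast path can be modeled in C. This is not the PR's code (the actual routine is assembly); it is a sketch with hypothetical function names `llshru_48_63` and `llshrs_48_63`. For these shift amounts only the top 16 bits of the operand can reach the result, so the work reduces to a 16-bit shift by 0-15 plus zero or sign extension. The signed variant assumes an arithmetic `>>` on negative values and the usual wrap-around conversion to `int16_t`, both implementation-defined in C.

```c
#include <stdint.h>

/* Hypothetical model of the unsigned fast path.
 * Precondition: 48 <= n <= 63. */
static uint64_t llshru_48_63(uint64_t x, unsigned n) {
    /* Only the upper two bytes contribute to the result. */
    uint16_t top = (uint16_t)(x >> 48);
    /* Remaining shift is 0..15 on a 16-bit value; upper bits are zero. */
    return (uint64_t)(top >> (n - 48));
}

/* Hypothetical model of the signed fast path.
 * Precondition: 48 <= n <= 63. */
static int64_t llshrs_48_63(int64_t x, unsigned n) {
    /* Reinterpret the upper two bytes as a signed 16-bit value. */
    int16_t top = (int16_t)((uint64_t)x >> 48);
    /* Arithmetic shift (assumed), then sign-extend to 64 bits. */
    return (int64_t)(top >> (n - 48));
}
```

For example, under these assumptions `llshru_48_63(x, 63)` yields the sign bit and `llshrs_48_63(x, 63)` yields 0 or -1, matching the two cases described in the PR text.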

ZERICO2005 added the crt label Apr 15, 2026