Faster gamma calculation #524

Draft
tompng wants to merge 4 commits into ruby:master from tompng:gamma_lagrange

tompng (Member) commented Apr 10, 2026

This is a proof-of-concept implementation of a faster gamma calculation that uses localized Lagrange interpolation of b**x / x!.

The Problem with Direct Interpolation

Directly interpolating gamma(x) with equidistant nodes fails due to its singularities and Runge's phenomenon.

Shared Philosophy: Spouge as Lagrange Interpolation

Although Spouge's approximation is analytically derived from contour integration, from a different perspective, it can be interpreted as a Lagrange interpolation of the scaled function:
$$f(x) = \frac{\left(x+a-1\right)!}{\left(2(x+a)\right)^{x+\frac{1}{2}}e^{-x-a}}$$
at the nodes x = -a+1, -a+2, ..., -1, with a constant adjusted to converge as $x \to \infty$.

The proposed approach shares this foundational philosophy: scale the function to tame its growth, then interpolate. However, although Spouge's formula converges for any x > 0, it is computationally heavy, requiring many expensive full-precision divisions and square root calculations. The method proposed in this PR is essentially a computation-optimized specialization of the same interpolation idea.

Approach in this PR

Interpolate the following scaled reciprocal function instead:
$$f(x) = \frac{b^x}{x!}$$
Since this is an entire function (no poles), it avoids Runge's phenomenon. Scaling by b**x shapes the function into a nearly symmetric bell curve centered at b.
Lagrange interpolation at the integer nodes x = b-l, b-l+1, ..., b+l then gives a highly accurate localized interpolant. The center b is determined dynamically from the input x, and since x! = gamma(x+1), the result is recovered as gamma(x+1) = b**x / f(x).
The formula is simple, the node values f(n) at integer n are easy to calculate, and it leaves room for various optimizations.
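
As a rough illustration of this approach (a minimal sketch, not the PR's implementation), the interpolation can be tried at double precision. The method names `interpolate_scaled_reciprocal` and `gamma_sketch` and the choice b = l = 20 are assumptions made for this example. The interpolant is evaluated in exact Rational arithmetic, and only the final b**x and division use Float.

```ruby
# Evaluate the Lagrange interpolant of f(t) = b**t / t! at t = x,
# using the integer nodes b-l, ..., b+l. x should be a Rational.
def interpolate_scaled_reciprocal(x, b, l)
  raise ArgumentError, "need b - l >= 0" if b - l < 0

  nodes = ((b - l)..(b + l)).to_a

  # Node values f(n) = b**n / n!, built incrementally from f(b - l).
  value = Rational(b**nodes.first, (1..nodes.first).reduce(1, :*))
  values = nodes.map do |n|
    current = value
    value = value * b / (n + 1) # f(n + 1) = f(n) * b / (n + 1)
    current
  end

  # Plain Lagrange form: sum of f(n_j) * prod_{k != j} (x - n_k) / (n_j - n_k).
  nodes.zip(values).sum(Rational(0)) do |node, fv|
    weight = nodes.reduce(Rational(1)) do |acc, other|
      other == node ? acc : acc * (x - other) / (node - other)
    end
    fv * weight
  end
end

# gamma(x) = b**(x - 1) / f(x - 1), since f(t) = b**t / gamma(t + 1).
def gamma_sketch(x, b, l)
  t = x - 1
  b**t.to_f / interpolate_scaled_reciprocal(t, b, l).to_f
end

approx = gamma_sketch(Rational(85, 4), 20, 20) # gamma(21.25)
reference = Math.gamma(21.25)
puts (approx - reference).abs / reference      # relative difference
```

The plain Lagrange form costs O(l^2) operations per evaluation and is used here only for clarity.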

Small-digit case (x has few significant digits)

e.g., BigMath.gamma(1.25, prec)
Uses the binary splitting method (BSM), which gives quasilinear time complexity.
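
How the PR applies BSM is not spelled out here, so the following is only a sketch of the core pattern on a simpler, related task: the rising factorial x(x+1)...(x+n-1) for a short rational x = p/q, which links gamma(x+n) to gamma(x). The names `split_product` and `rising_factorial` are hypothetical. Splitting the range in half keeps the two operands of each large integer multiplication at a similar size, which is what lets fast multiplication pay off.

```ruby
# Product of (p + k*q) for k in lo...hi, computed by recursive halving.
def split_product(p, q, lo, hi)
  return 1 if lo >= hi
  return p + lo * q if hi - lo == 1

  mid = (lo + hi) / 2
  split_product(p, q, lo, mid) * split_product(p, q, mid, hi)
end

# Rising factorial x(x+1)...(x+n-1) for x = p/q, as an exact Rational.
def rising_factorial(p, q, n)
  Rational(split_product(p, q, 0, n), q**n)
end

puts rising_factorial(5, 4, 3) # (5/4) * (9/4) * (13/4)
```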

Full-digit case (x has as many digits as the requested precision)

e.g., BigMath.gamma(BigDecimal(1).div(3, prec), prec)
Uses Baby-Step Giant-Step (BSGS) combined with Montgomery batch inversion to eliminate heavy full-precision divisions, reducing them to scalar multiplications. Time complexity is roughly O(N^2). This path only applies when x is not too large; larger x is handled by the fallback case.
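
The batch-inversion trick mentioned above can be sketched in isolation. This is the generic textbook form, not code from the PR, and the name `batch_inverse` is hypothetical. It inverts n values with a single division plus about 3n multiplications, using Rational here so the result can be checked exactly.

```ruby
# Invert every element of values using exactly one division.
# All values must be nonzero.
def batch_inverse(values)
  # prefix[i] = values[0] * ... * values[i - 1]
  prefix = []
  acc = Rational(1)
  values.each do |v|
    prefix << acc
    acc *= v
  end

  inv = 1 / acc # the only division: inverse of the full product
  result = Array.new(values.size)
  (values.size - 1).downto(0) do |i|
    result[i] = inv * prefix[i] # 1 / values[i]
    inv *= values[i]            # now the inverse of values[0...i]'s product
  end
  result
end

p batch_inverse([Rational(2), Rational(3), Rational(5, 4)])
```

In an interpolation setting, the values would be the differences between x and each node, so that all reciprocals needed by the interpolation weights share one expensive division.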

Large-x case (x > a threshold that depends on prec)

Falls back to Spouge's approximation. Time complexity is O(N*M(N)) where M(N) is the time complexity of full-precision multiplication/division.
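
For reference, Spouge's approximation in its standard published form can be sketched in Float arithmetic. This is not the BigDecimal code used by the fallback; the name `spouge_gamma` and the default a = 12 are assumptions for the example. It shows where the cost comes from: every term needs a fractional power and a division.

```ruby
# Spouge's approximation:
#   gamma(z + 1) ~ (z + a)**(z + 1/2) * exp(-(z + a)) * (c_0 + sum_{k=1}^{a-1} c_k / (z + k))
# with c_0 = sqrt(2*pi) and
#   c_k = (-1)**(k - 1) / (k - 1)! * (a - k)**(k - 1/2) * exp(a - k).
def spouge_gamma(x, a = 12)
  z = x - 1.0
  sum = Math.sqrt(2 * Math::PI) # c_0
  factorial = 1.0               # holds (k - 1)!
  (1...a).each do |k|
    factorial *= (k - 1) if k > 1
    sign = k.odd? ? 1.0 : -1.0
    c = sign * (a - k)**(k - 0.5) * Math.exp(a - k) / factorial
    sum += c / (z + k) # one division per term
  end
  (z + a)**(z + 0.5) * Math.exp(-(z + a)) * sum
end

puts spouge_gamma(1.25)
puts Math.gamma(1.25)
```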

Benchmark

| Calculation | master branch | This PR |
| --- | --- | --- |
| `BigMath.gamma(1.25, 10000)` | 49s | 0.24s |
| `BigMath.gamma(1.25, 100000)` | 6000s (estimated) | 4.2s |
| `BigMath.gamma(BigDecimal(1).div(3, 10000), 10000)` | 56s | 8.2s |
| `BigMath.gamma(BigDecimal(1).div(3, 100000), 100000)` | 7000s (estimated) | 803s |

Comparison with mpmath (gmpy backend)

| Calculation | Digits | mpmath first run | mpmath second run (cached) | This PR |
| --- | --- | --- | --- | --- |
| gamma(1.25) | 5000 | 3.2s | 0.12s | 0.09s |
| gamma(1.25) | 10000 | 26.9s | 0.69s | 0.24s |
| gamma(1.25) | 20000 | 226s | 3.7s | 0.57s |
| gamma(1/3) | 5000 | 3.2s | 0.12s | 2.2s |
| gamma(1/3) | 10000 | 24s | 0.68s | 8.2s |
| gamma(1/3) | 20000 | 226s | 3.8s | 33s |
