Refactor BLAS code #2049

Open
jessegrabowski wants to merge 13 commits into pymc-devs:main from jessegrabowski:blas-refactor

@jessegrabowski
Member

jessegrabowski commented Apr 13, 2026

I was inspired by the linalg refactor, so I wanted to make a pass at BLAS. My objective here is to make the BLAS code more maintainable. I am doing this by:

  • Converting blas.py into a module
  • Splitting out one file per BLAS Op (GEMM, GEMV, GER)
  • Moving string codegen into dedicated .h or .c files

Point 3 has two levels we could pursue. Level one is to extract all static code into headers that can be pulled into the string codegen. This is done for all three BLAS functions. The second level is to move all string codegen into a helper function, then only codegen the call to that function. I did this only for GER, in the last commit. It's significantly more readable, but it also has the overhead of a function call versus fully inlined code. We can discuss.
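
A minimal sketch of the difference between the two levels, written as plain Python string generation. Every name here (ger_helpers.h, ger_check_inputs, ger_needs_copy, ger_copy, ger_call_blas, ger_apply) is hypothetical and chosen only for illustration; none of it is taken from the PR's actual code.

```python
# Hypothetical sketch: the header and C function names below are
# illustrative assumptions, not the names used in the PR.


def codegen_level_one(a: str, x: str, y: str, alpha: str, fail: str) -> str:
    """Level one: static code sits in a header, but the per-node logic
    is still assembled as inline C inside the Python string."""
    return "\n".join(
        [
            '#include "ger_helpers.h"',
            f"if (ger_check_inputs({a}, {x}, {y}) != 0) {{ {fail} }}",
            f"if (ger_needs_copy({a})) {{ {a} = ger_copy({a}); }}",
            f"if ({a} == NULL) {{ {fail} }}",
            f"ger_call_blas({a}, {x}, {y}, {alpha});",
        ]
    )


def codegen_level_two(a: str, x: str, y: str, alpha: str, fail: str) -> str:
    """Level two: all logic lives in one C helper; Python only generates
    the call and the error branch."""
    return "\n".join(
        [
            '#include "ger_helpers.h"',
            f"if (ger_apply(&{a}, {x}, {y}, {alpha}) != 0) {{ {fail} }}",
        ]
    )


level_one_src = codegen_level_one("A", "x", "y", "alpha", "goto fail;")
level_two_src = codegen_level_two("A", "x", "y", "alpha", "goto fail;")
print(level_two_src)
```

In this sketch the level-two generator shrinks to a single generated call, which is where the readability gain comes from; the cost is that the body now runs behind a function call unless the C compiler inlines it.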

I think one more step I want to explore in this PR is moving all of the C code, and potentially the COps, to link/c/blas instead of tensor/blas, as part of the idea raised in #2006 that "C should just be another backend".

@jessegrabowski
Member Author

Some fun facts I didn't know:

  • We link against blas_fortran, not blas_c. Wow!
  • There is a bug reported in 2006 (!!) about Accelerate's fortran blas bindings. Their sdot function is wrong. That has never been patched. We have to run hack code to check if the bug is there and monkeypatch the fortran_blas sdot to the c_blas sdot on mac every time we import pytensor. Who knew?
  • I also learned that PyTorch and MLX have specialized GEMM (addmm) and BatchedDot (bmm) functions. We should almost certainly kill the BlasOpt rewrite database and just make them normal rewrites. We can write pullbacks for GEMM, GEMV, and GER trivially, and we gain the ability to do nice dispatches.
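
To make the "trivial pullbacks" claim concrete, here is a minimal pure-Python sketch for GER, taking GER to be the rank-one update out = A + alpha * outer(x, y). The function names ger and ger_pullback are hypothetical, and nested lists stand in for tensors; this is not how PyTensor would implement it.

```python
# Hypothetical sketch: ger / ger_pullback are illustrative names, and
# nested Python lists stand in for real tensors.


def ger(A, alpha, x, y):
    """Forward pass: out[i][j] = A[i][j] + alpha * x[i] * y[j]."""
    return [
        [A[i][j] + alpha * x[i] * y[j] for j in range(len(y))]
        for i in range(len(x))
    ]


def ger_pullback(alpha, x, y, g):
    """Map the output cotangent g to cotangents of (A, alpha, x, y)."""
    m, n = len(x), len(y)
    g_A = [row[:] for row in g]  # out depends on A through the identity
    g_alpha = sum(g[i][j] * x[i] * y[j] for i in range(m) for j in range(n))
    # alpha * (g @ y): a matrix-vector product
    g_x = [alpha * sum(g[i][j] * y[j] for j in range(n)) for i in range(m)]
    # alpha * (g.T @ x): a matrix-vector product with the transpose
    g_y = [alpha * sum(g[i][j] * x[i] for i in range(m)) for j in range(n)]
    return g_A, g_alpha, g_x, g_y


A = [[1, 2], [3, 4]]
x = [1, 3]
y = [2, 5]
g = [[1, 0], [2, 1]]

out = ger(A, 2, x, y)
g_A, g_alpha, g_x, g_y = ger_pullback(2, x, y, g)
```

The cotangents for x and y are themselves matrix-vector products scaled by alpha, so in this sketch the backward pass of GER stays within the same family of BLAS-style operations.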

@ricardoV94
Member

ricardoV94 commented Apr 13, 2026

We can write pullbacks for GEMM, GEMV, and GER trivially, and we gain the ability to do nice dispatches.

We can, but we don't usually bother with grads for specialized Ops. I wouldn't expose user-facing GEMM but always start with the canonical Dot + alpha forms for which we have other rewrites/batch-rules/etc...
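
A minimal sketch of what the canonical form looks like next to the fused operation, assuming the standard BLAS definition of GEMM as alpha * A @ B + beta * C. The helpers dot, scale, add, and gemm are hypothetical pure-Python stand-ins and not PyTensor Ops.

```python
# Hypothetical sketch: pure-Python stand-ins for Dot, scalar
# multiplication, addition, and a fused GEMM on nested lists.


def dot(A, B):
    """Plain matrix product."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def scale(s, M):
    """Multiply every entry of M by the scalar s."""
    return [[s * v for v in row] for row in M]


def add(M, N):
    """Elementwise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(r, q)] for r, q in zip(M, N)]


def gemm(C, alpha, A, B, beta):
    """Fused form: alpha * A @ B + beta * C in a single pass."""
    return [
        [
            beta * C[i][j] + alpha * sum(A[i][k] * B[k][j] for k in range(len(B)))
            for j in range(len(B[0]))
        ]
        for i in range(len(A))
    ]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = [[1, 0], [0, 1]]

# Canonical graph: separate dot, scalar multiplications, and an add.
canonical = add(scale(3, C), scale(2, dot(A, B)))
# Specialized form that a rewrite would substitute for the graph above.
fused = gemm(C, 2, A, B, 3)
```

Both expressions compute the same values; the rewrite question in this thread is only about recognizing the canonical pattern and swapping in the fused call.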

@jessegrabowski
Member Author

I'm not thinking about something user-facing for sure. The pullbacks are so simple that it might result in better graphs than trying to start from general forms and rewrite both the forward and backward graph. Not sure. I want to modernize the rewrites next.

@ricardoV94
Member

The pullbacks are so simple that it might result in better graphs than trying to start from general forms and rewrite both the forward and backward graph. Not sure. I want to modernize the rewrites next.

Simple or not, it is also more code we need to test and maintain. How hard are the BLAS patterns really? Two dots and some scalar multiplications? If we can't handle those we have bigger problems

@jessegrabowski
Member Author

jessegrabowski commented Apr 13, 2026

If we can't handle those we have bigger problems

Well, empirically...
