Installing flash-attn without compiling it

If you ever run into instructions that tell you to do this:

pip install flash-attn --no-build-isolation

Do not try to do this. It is a trap. For some reason, attempting to install this runs a compilation process that can take multiple hours. I tried to run this in Google Colab on an A100 machine that I was paying for and burned through $2 worth of "compute units" and an hour and a half of waiting before I gave up.

Update: I may be wrong about this. The setup.py for the project includes code that attempts to install wheels directly from the GitHub releases. That didn't work for me and I don't understand why.

Thankfully I learned that there's an alternative: the Flash Attention team provide pre-built wheels for their project exclusively through GitHub releases. You can find them attached to the most recent release on https://github.com/Dao-AILab/flash-attention/releases

But which one should you use out of the 83 files listed there?

Google Colab has an "ask Gemini" feature so I tried "Give me as many clues as possible as to what flash attention wheel filename would work on this system" and it suggested I look for a cp310 one (for Python 3.10) on linux_x86_64 (Colab runs on Linux).
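The same clues can also be gathered from a notebook cell. This is a minimal sketch, not anything from the flash-attn project: local_wheel_clues is a made-up helper name, the "cp" prefix assumes CPython, and the platform tag is a rough approximation that only lines up with real wheel tags on Linux. The torch lookups are skipped if torch is not installed.

```python
import platform
import sys


def local_wheel_clues():
    """Collect values that tend to appear in a wheel filename for this runtime."""
    clues = {
        # Assumes CPython, e.g. "cp310" for Python 3.10
        "python_tag": f"cp{sys.version_info.major}{sys.version_info.minor}",
        # Rough approximation, e.g. "linux_x86_64" on a typical Linux machine
        "platform_tag": f"{platform.system().lower()}_{platform.machine()}",
    }
    try:
        import torch
    except ImportError:
        # No torch available, so only the Python and platform clues are reported
        clues["torch"] = None
        return clues
    clues["torch"] = torch.__version__          # PyTorch version string
    clues["cuda"] = torch.version.cuda          # CUDA version torch was built with, or None
    clues["cxx11abi"] = torch.compiled_with_cxx11_abi()  # True/False
    return clues


print(local_wheel_clues())
```

The idea is to compare each value against the matching piece of a wheel filename on the releases page.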

In browsing through the list of 83 options I thought flash_attn-2.6.3+cu123torch2.4cxx11abiFALSE-cp310-cp310-linux_x86_64.whl might be the right one (shrug?). So I tried this:

!wget https://github.com/Dao-AILab/flash-attention/releases/download/v2.6.3/flash_attn-2.6.3+cu123torch2.4cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
!pip install --no-dependencies --upgrade flash_attn-2.6.3+cu123torch2.4cxx11abiFALSE-cp310-cp310-linux_x86_64.whl

This seemed to work (and installed in just a couple of seconds):

import flash_attn
flash_attn.__version__
2.6.3

But the thing I was trying to run (deepseek-ai/Janus) failed with an error:

NameError: name '_flash_supports_window_size' is not defined
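Going only by its name, the missing variable looks like a flag for whether flash-attn's attention function accepts a window_size argument. That reading is an assumption, not something confirmed here. A sketch like the following would at least rule it in or out for the installed package; accepts_parameter is a hypothetical helper, and the check is skipped if flash_attn fails to import.

```python
import inspect


def accepts_parameter(func, name):
    """Return True if the callable's signature lists a parameter called `name`."""
    return name in inspect.signature(func).parameters


try:
    from flash_attn import flash_attn_func
except ImportError as exc:
    # A failure here would point at the wheel itself rather than the calling code
    print("flash_attn did not import cleanly:", exc)
else:
    print("window_size supported:", accepts_parameter(flash_attn_func, "window_size"))
```

If this reports that window_size is supported, the NameError probably comes from how the calling code detects flash-attn rather than from the wheel.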

At this point I gave up. It's possible I picked the wrong wheel, but there may have been something else wrong.
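One way to check the "wrong wheel" theory is to split the filename into its parts and compare them with the runtime. The sketch below assumes the naming pattern seen in the one filename quoted above; parse_flash_attn_wheel and the regular expression are illustrative, and other releases may name their files differently.

```python
import re

# Pattern inferred from a single filename in this post; treat it as a guess.
WHEEL_RE = re.compile(
    r"flash_attn-(?P<version>[^+]+)"
    r"\+cu(?P<cuda>\d+)"
    r"torch(?P<torch>[\d.]+)"
    r"cxx11abi(?P<cxx11abi>TRUE|FALSE)"
    r"-(?P<python_tag>cp\d+)-(?P<abi_tag>cp\d+)-(?P<platform_tag>.+)\.whl"
)


def parse_flash_attn_wheel(filename):
    """Split a flash-attn wheel filename into its build components."""
    match = WHEEL_RE.fullmatch(filename)
    if match is None:
        raise ValueError(f"Unrecognised wheel filename: {filename}")
    fields = match.groupdict()
    # Turn the TRUE/FALSE marker into a real boolean
    fields["cxx11abi"] = fields["cxx11abi"] == "TRUE"
    return fields


name = "flash_attn-2.6.3+cu123torch2.4cxx11abiFALSE-cp310-cp310-linux_x86_64.whl"
print(parse_flash_attn_wheel(name))
```

The torch and cuda fields (here 2.4 and 123, presumably CUDA 12.3) can then be compared with torch.__version__ and torch.version.cuda in the notebook. A mismatch in either would be a reasonable first suspect.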

Created 2024-10-24T21:47:17-07:00, updated 2024-10-24T22:20:43-07:00