If you ever run into instructions that tell you to do this:
pip install flash-attn --no-build-isolation
Do not do this. It is a trap: for some reason, attempting to install the package kicks off a compilation process that can take multiple hours. I tried it in Google Colab on an A100 machine I was paying for and burned through $2 worth of "compute units" and an hour and a half of waiting before giving up.
Update: I may be wrong about this: the project's setup.py includes code that attempts to install pre-built wheels directly from its GitHub releases. That didn't work for me and I don't understand why.
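For what it's worth, here's a heavily simplified sketch of what I understand that fallback to be doing: build a wheel filename from the local CUDA, torch, ABI, Python and platform details, then try to download it from the matching GitHub release before falling back to compiling from source. This is not the project's actual code; the helper name and exact filename format are my guesses.
import urllib.request

def try_download_prebuilt_wheel(version, cuda, torch_ver, py_tag, platform_tag):
    # Assemble a filename matching the assets attached to the GitHub release
    filename = (
        f"flash_attn-{version}+cu{cuda}torch{torch_ver}cxx11abiFALSE-"
        f"{py_tag}-{py_tag}-{platform_tag}.whl"
    )
    url = (
        "https://github.com/Dao-AILab/flash-attention/releases/download/"
        f"v{version}/{filename}"
    )
    # If this download fails, setup.py falls back to compiling from source
    urllib.request.urlretrieve(url, filename)
    return filename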
Thankfully I learned that there's an alternative: the Flash Attention team provide pre-built wheels for their project exclusively through GitHub releases. You can find them attached to the most recent release on https://github.com/Dao-AILab/flash-attention/releases
But which one should you use out of the 83 files listed there?
Google Colab has an "ask Gemini" feature, so I tried "Give me as many clues as possible as to what flash attention wheel filename would work on this system" and it suggested I look for a cp310 one (for Python 3.10) on linux_x86_64 (Colab runs on Linux).
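You can also gather the same clues yourself. This is a minimal sketch, assuming torch is already installed (it is on Colab); the wheel filenames encode the Python tag, platform, torch version, CUDA version and C++ ABI flag:
import platform
import sys

import torch

print("python tag:", f"cp{sys.version_info.major}{sys.version_info.minor}")  # e.g. cp310
print("platform:", f"{platform.system().lower()}_{platform.machine()}")      # e.g. linux_x86_64
print("torch version:", torch.__version__)                                   # e.g. 2.4.0+cu121
print("cuda (torch build):", torch.version.cuda)                             # e.g. 12.1
print("cxx11 abi:", torch._C._GLIBCXX_USE_CXX11_ABI)                         # maps to cxx11abiTRUE/FALSE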
Browsing through the list of 83 options I thought flash_attn-2.6.3+cu123torch2.4cxx11abiFALSE-cp310-cp310-linux_x86_64.whl might be the right one (shrug?). So I tried this:
!wget https://github.com/Dao-AILab/flash-attention/releases/download/v2.6.3/flash_attn-2.6.3+cu123torch2.4cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
!pip install --no-dependencies --upgrade flash_attn-2.6.3+cu123torch2.4cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
This seemed to work (and installed in just a couple of seconds):
import flash_attn
flash_attn.__version__
2.6.3
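As far as I know the wget step is unnecessary, since pip can install straight from a URL, so this should be equivalent:
!pip install --no-dependencies --upgrade "https://github.com/Dao-AILab/flash-attention/releases/download/v2.6.3/flash_attn-2.6.3+cu123torch2.4cxx11abiFALSE-cp310-cp310-linux_x86_64.whl"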
But the thing I was trying to run (deepseek-ai/Janus) failed with an error:
NameError: name '_flash_supports_window_size' is not defined
At this point I gave up. It's possible I picked the wrong wheel, but there may have been something else wrong.
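If I revisit this, one thing I'd try first is confirming the wheel itself is healthy, separately from the Janus code, by importing the public API and the underlying compiled module (assuming flash-attn 2.x still ships its CUDA extension under this name, which is where a CUDA/torch mismatch tends to fail loudly):
import flash_attn
from flash_attn import flash_attn_func  # main public attention function
import flash_attn_2_cuda                # compiled CUDA extension behind it

print(flash_attn.__version__)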
Created 2024-10-24T21:47:17-07:00, updated 2024-10-24T22:20:43-07:00