Prepend install lib dir to PATH on Windows so cuDNN sub-DLLs resolve #1465
Open
TroyHernandez wants to merge 1 commit into
cuDNN 9 is split across several DLLs. cuDNN's lazy load of its sub-DLLs (e.g. cudnn_graph64_9.dll) does not find the install lib dir unless it is on PATH, so cuDNN-backed CUDA ops fail with 'Could not locate cudnn_graph64_9.dll' even though the DLL sits in torch/lib. Prepend the dir (once) at load, mirroring the existing load_cudatoolkit_libs() PATH handling.
Addresses the Windows cuDNN DLL load failure tracked in #1287 (the 'Could not locate cudnn_graph64_9.dll' reports there).

Problem
On Windows CUDA installs, cuDNN-backed ops fail at runtime with 'Could not locate cudnn_graph64_9.dll' even though every cuDNN 9 DLL is present in <install>/torch/lib. cuda_is_available() is TRUE and non-cuDNN CUDA ops (e.g. torch_mm) work. This lines up with the diagnosis in #1287 that the loader isn't finding torch/lib.

Cause
cuDNN 9 is split across several DLLs. lanternLoadLibrary loads lantern.dll with LoadLibraryEx(..., LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR | LOAD_LIBRARY_SEARCH_DEFAULT_DIRS), which resolves the static dependency chain (libtorch, cudnn64_9.dll) from the install lib dir. But cudnn64_9.dll then lazily loads cudnn_graph64_9.dll by name, and that load does not find the install lib dir unless it is on PATH.

Fix
Prepend the install lib dir to PATH at load on Windows, so cuDNN's by-name sub-DLL loads resolve. This mirrors the existing load_cudatoolkit_libs() handling, which already does the same Sys.setenv(PATH = ...) for the separate cudatoolkit-package case. The prepend is idempotent, so repeated library()/reload calls don't grow PATH.

Verification
Prepending torch/lib to PATH resolves the failure on torch 0.17.0 / libtorch 2.8.0+cu126, Windows 10, R 4.6.0 and R-devel: nnf_conv2d and a full torch_scaled_dot_product_attention forward then run and return correct results on GPU.

Refs #1287
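
For reviewers, the idempotent prepend described under Fix can be sketched roughly as below. This is a minimal illustration, not the code in this PR: prepend_to_path() is a hypothetical helper name, and the exact-string membership check is a simplifying assumption (a real Windows check may also need to handle case and slash differences between equivalent paths).

```r
# Hypothetical helper (not part of torch): put `dir` at the front of PATH,
# but only if it is not already one of the PATH entries.
prepend_to_path <- function(dir, sep = .Platform$path.sep) {
  current <- Sys.getenv("PATH")
  entries <- strsplit(current, sep, fixed = TRUE)[[1]]

  # Assumption: exact string comparison is enough to detect a previous prepend.
  if (dir %in% entries) {
    return(invisible(FALSE))
  }

  # Leave the existing PATH value untouched and only add the new entry in front.
  new_path <- if (nzchar(current)) paste(dir, current, sep = sep) else dir
  Sys.setenv(PATH = new_path)
  invisible(TRUE)
}
```

Calling prepend_to_path() twice with the same directory changes PATH only on the first call, which is the property that keeps repeated library()/reload calls from growing PATH.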