NVIDIA / TensorRT-LLM Public

Pull requests: NVIDIA/TensorRT-LLM

828 Open 10,669 Closed

Pull requests list

[None][fix] Pass dtype to AllReduce ctor to enable MNNVL all-reduce fo…

#15547 opened Jun 23, 2026 by nv-guomingz Collaborator

[https://nvbugs/6293536][fix] Stage KV block offsets through a fresh host buffer

#15546 opened Jun 23, 2026 by thorjohnsen Collaborator

1 task done

[None][Fix] Fix passing scaled timestep to time_embedder in Cosmos3

#15545 opened Jun 23, 2026 by bastefaniak

1 task

[None][test] Refine Qwen3.5 397B test cases

#15544 opened Jun 23, 2026 by nv-guomingz Collaborator

1 task done

[TRTLLM-13575][feat] Add eplb support for qwen3.5

#15543 opened Jun 23, 2026 by nv-guomingz Collaborator

1 task done

[TRTLLM-13212][refactor] Unify sampling stacks into single facade with leaf backend modules

#15542 opened Jun 23, 2026 by zhaoyangwang-nvidia Collaborator • Draft

1 task done

[None][test] Add modularized perf tests (attention + MoE discrete/continuous)

#15541 opened Jun 23, 2026 by ruodil Collaborator

1 task done

[None][fix] Allow fail-early when reuse block and legacy mamba cache

#15540 opened Jun 23, 2026 by Wanli-Jiang Collaborator

1 task done

[https://nvbugs/6344108][fix] skip TestNemotron3Super120B on pre-blackwell

#15539 opened Jun 23, 2026 by bo-nv Collaborator

1 task

[https://nvbugs/6166097][fix] Fix CuteDSL NVFP4 EPLB weight layout

#15538 opened Jun 23, 2026 by nv-xtf

1 task done

[TRTLLM-13580][test] Add model-derived PyTorch attention backend test suite

#15536 opened Jun 23, 2026 by yuxianq Collaborator

[https://nvbugs/6150288][fix] Use persistent per-stream workspace in cublas_mm for CUDA-graph safety

#15534 opened Jun 23, 2026 by pamelap-nvidia Collaborator

2 of 4 tasks

[None][chore] Clean deprecated CppMambaCacheManager

#15533 opened Jun 23, 2026 by bo-nv Collaborator

1 task done

[None][feat] Qwen-Image: NVFP4 SVDQuant (NVFP4 residual + rank-r BF16 LoRA)

#15532 opened Jun 23, 2026 by jingyu-ml

[#14874][feat] AutoDeploy : Perf optimization for gpt-oss-120b for low conc AutoDeploy

<NV> AutoDeploy Backend

#15531 opened Jun 23, 2026 by taylor-yb-lee Collaborator

1 task done

[None][chore] Autodeploy disable the pipeline cache by default

#15530 opened Jun 22, 2026 by nvchenghaoz Collaborator

1 task

[None][CI] Waive flaky test_vbench_dimension_score_wan (nvbugs/6357628)

#15529 opened Jun 22, 2026 by chang-l Collaborator

[None][feat] Support FP8 base weights for MoE LoRA

#15528 opened Jun 22, 2026 by brb-nv Collaborator • Draft

1 task

[https://nvbugs/6276842][test] Loosen rtol/atol on encoder CUDA graph logits parity check

#15527 opened Jun 22, 2026 by tingyangk Collaborator

1 task done

[None][feat] Add prefix-aware scheduling config flag to support opt-out

#15526 opened Jun 22, 2026 by SimengLiu-nv Collaborator

1 task done

[TRTLLM-13543][feat] WideEP FT: add EPLB mask-only reconfigure (1b.1)

#15525 opened Jun 22, 2026 by chienchunhung Collaborator

[TRTLLM-12557][feat] WideEP FT: add AlltoAll watchdog (1a.4)

#15524 opened Jun 22, 2026 by chienchunhung Collaborator

[None][fix] Preserve Kimi 2.5 tool call IDs

#15523 opened Jun 22, 2026 by hvagadia Contributor

[#14882][fix] Make kv_cache_aware router robust to a missing KV-event stream

#15522 opened Jun 22, 2026 by GodlyDonuts

[doc] Clarify dtype='auto' resolution for LLM and KvCacheConfig

#15520 opened Jun 22, 2026 by ojas4414
