FunASR integration for LangChain: transcribe audio to LangChain Documents with self-hosted speech-to-text.
Powered by SenseVoice / Paraformer / Fun-ASR-Nano: runs locally, no cloud API, strong on Chinese and 50+ languages.

```bash
pip install langchain-funasr
```

```python
from langchain_funasr import FunASRLoader

loader = FunASRLoader("meeting.wav", model="iic/SenseVoiceSmall", device="cuda")
docs = loader.load()
print(docs[0].page_content)
```

Use the parser directly with blob pipelines:

```python
from langchain_core.document_loaders import Blob
from langchain_funasr import FunASRParser

parser = FunASRParser(model="FunAudioLLM/SenseVoiceSmall", hub="hf", device="cuda")
docs = list(parser.lazy_parse(Blob.from_path("audio.wav")))
```

| Arg | Default | Notes |
|---|---|---|
| `model` | `iic/SenseVoiceSmall` | Any FunASR model (SenseVoice / Paraformer / Fun-ASR-Nano) |
| `hub` | `ms` | `ms` (ModelScope) or `hf` (HuggingFace) |
| `device` | `cpu` | e.g. `cuda`, `cuda:0` |
| `language` | `auto` | SenseVoice: `auto`/`zh`/`en`/`yue`/`ja`/`ko` |
| `vad_model` | `fsmn-vad` | Built-in VAD handles long audio of any length |

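
The arguments in the table can be collected in one place before building a loader or parser. The sketch below is only an illustration: `FunASRArgs` is a hypothetical helper, not part of `langchain-funasr`. It copies the defaults from the table and checks `hub` against the two values listed there. Passing every argument as a keyword to `FunASRParser` is an assumption based on the table, so that call is left as a comment.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class FunASRArgs:
    """Hypothetical container mirroring the argument table (not a library class)."""

    model: str = "iic/SenseVoiceSmall"
    hub: str = "ms"            # "ms" (ModelScope) or "hf" (HuggingFace)
    device: str = "cpu"        # e.g. "cuda", "cuda:0"
    language: str = "auto"     # SenseVoice: auto/zh/en/yue/ja/ko
    vad_model: str = "fsmn-vad"

    def __post_init__(self) -> None:
        # Only the two hubs named in the table are accepted here.
        if self.hub not in ("ms", "hf"):
            raise ValueError(f"hub must be 'ms' or 'hf', got {self.hub!r}")

    def as_kwargs(self) -> dict:
        """Return the settings as a plain dict of keyword arguments."""
        return asdict(self)


# Override only what differs from the defaults.
gpu_hf = FunASRArgs(model="FunAudioLLM/SenseVoiceSmall", hub="hf", device="cuda:0")
kwargs = gpu_hf.as_kwargs()

# Assumption: the parser accepts the table's names as keyword arguments.
# parser = FunASRParser(**kwargs)
```

Keeping the defaults in a frozen dataclass makes a typo in `hub` fail early, before any model download starts.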
- Self-hosted: no API keys, no data leaving your machine.
- Fast: SenseVoice is non-autoregressive, far faster than Whisper.
- Strong on Chinese + 50+ languages.
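
One way to check the speed claim on your own hardware is to measure the real-time factor (processing time divided by audio duration). This is a minimal sketch, not a benchmark from the project: `real_time_factor` is a hypothetical helper, and the stand-in `fake_transcribe` only sleeps so the block runs without a model. In practice you might pass `loader.load` and the real clip length instead.

```python
import time
from typing import Callable


def real_time_factor(transcribe: Callable[[], object], audio_seconds: float) -> float:
    """Return elapsed wall-clock time divided by audio duration (lower is faster)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    start = time.perf_counter()
    transcribe()  # run the transcription once
    elapsed = time.perf_counter() - start
    return elapsed / audio_seconds


def fake_transcribe() -> None:
    """Stand-in for a real call such as loader.load (assumption for this demo)."""
    time.sleep(0.01)


rtf = real_time_factor(fake_transcribe, audio_seconds=1.0)
print(f"real-time factor: {rtf:.3f}")
```

A value below 1.0 means the audio is transcribed faster than it plays. The first call usually includes model loading, so timing a second run gives a fairer number.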

If this helps, star FunASR and SenseVoice.

Apache-2.0