🦜🔗 langchain-funasr

FunASR integration for LangChain — transcribe audio to LangChain Documents with self-hosted speech-to-text.

Powered by SenseVoice / Paraformer / Fun-ASR-Nano: runs locally, no cloud API, strong on Chinese and 50+ languages.

Install

pip install langchain-funasr

Usage

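from langchain_funasr import FunASRLoader

# Transcribe a single audio file; use device="cpu" if no GPU is available.
loader = FunASRLoader("meeting.wav", model="iic/SenseVoiceSmall", device="cuda")
docs = loader.load()  # list of LangChain Documents
print(docs[0].page_content)  # transcript text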
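To transcribe several files, the loader call above can be wrapped in a small helper. The following is a minimal sketch, not part of the package: `transcribe_files` and `default_loader_factory` are hypothetical names, and it assumes only that a loader exposes `load()` returning Documents with a `page_content` attribute, as in the usage example. Whether the model is reloaded for each new loader instance is not stated here, so treat this as a convenience pattern rather than a tuned one.

```python
from typing import Callable, Dict, Iterable


def default_loader_factory(path: str):
    """Build a FunASRLoader for one file.

    The import is done lazily so the helper below can be exercised
    with a stand-in loader when langchain-funasr is not installed.
    """
    from langchain_funasr import FunASRLoader

    return FunASRLoader(path, model="iic/SenseVoiceSmall", device="cpu")


def transcribe_files(
    paths: Iterable[str],
    loader_factory: Callable[[str], object] = default_loader_factory,
) -> Dict[str, str]:
    """Return a {path: transcript} mapping (hypothetical helper).

    A loader may return more than one Document per file, so the
    page_content of every Document is joined with newlines.
    """
    transcripts: Dict[str, str] = {}
    for path in paths:
        docs = loader_factory(path).load()
        transcripts[path] = "\n".join(doc.page_content for doc in docs)
    return transcripts
```

Passing a different `loader_factory` is one way to switch the model or device for the whole batch without touching the loop.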

Use the parser directly with blob pipelines:

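from langchain_core.document_loaders import Blob
from langchain_funasr import FunASRParser

# hub="hf" pulls the model from HuggingFace instead of ModelScope.
parser = FunASRParser(model="FunAudioLLM/SenseVoiceSmall", hub="hf", device="cuda")
# lazy_parse yields Documents one at a time; list() collects them all.
docs = list(parser.lazy_parse(Blob.from_path("audio.wav")))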
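Because `lazy_parse` is a generator, one parser can be reused across many blobs without holding every transcript in memory. The sketch below illustrates that pattern under stated assumptions: `iter_transcripts` and `_blob_from_path` are hypothetical helpers, and the only things relied on are `parser.lazy_parse(blob)` and `Blob.from_path(...)` as shown above.

```python
from typing import Callable, Iterable, Iterator, Tuple


def _blob_from_path(path: str):
    """Wrap a file path in a LangChain Blob (imported lazily)."""
    from langchain_core.document_loaders import Blob

    return Blob.from_path(path)


def iter_transcripts(
    parser,
    paths: Iterable[str],
    to_blob: Callable[[str], object] = _blob_from_path,
) -> Iterator[Tuple[str, str]]:
    """Yield (path, text) pairs one Document at a time (hypothetical helper).

    Nothing is transcribed until the caller starts iterating, and each
    result can be handled or written out before the next file is parsed.
    """
    for path in paths:
        for doc in parser.lazy_parse(to_blob(path)):
            yield str(path), doc.page_content


# Example use, assuming a FunASRParser built as in the snippet above:
#   for path, text in iter_transcripts(parser, ["a.wav", "b.wav"]):
#       print(path, text[:80])
```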

Options

| Arg | Default | Notes |
| --- | --- | --- |
| `model` | `iic/SenseVoiceSmall` | Any FunASR model (SenseVoice / Paraformer / Fun-ASR-Nano) |
| `hub` | `ms` | `ms` (ModelScope) or `hf` (HuggingFace) |
| `device` | `cpu` | e.g. `cuda`, `cuda:0` |
| `language` | `auto` | SenseVoice: `auto`/`zh`/`en`/`yue`/`ja`/`ko` |
| `vad_model` | `fsmn-vad` | Built-in VAD splits long audio into segments, so files of any length can be transcribed |
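One way to keep these options in a single place is a small settings object whose defaults mirror the table. This is a sketch only: `FunASROptions` is a hypothetical class, not part of langchain-funasr, and it assumes the loader and parser accept all five options as keyword arguments, which the table suggests but does not spell out.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class FunASROptions:
    """Hypothetical container whose defaults copy the Options table."""

    model: str = "iic/SenseVoiceSmall"
    hub: str = "ms"
    device: str = "cpu"
    language: str = "auto"
    vad_model: str = "fsmn-vad"

    def __post_init__(self) -> None:
        # The table lists exactly two hubs: ModelScope and HuggingFace.
        if self.hub not in ("ms", "hf"):
            raise ValueError(f"hub must be 'ms' or 'hf', got {self.hub!r}")

    def as_kwargs(self) -> dict:
        """Return the options as a plain dict for ** unpacking."""
        return asdict(self)


# Example use (assumes FunASRLoader takes these keyword arguments):
#   from langchain_funasr import FunASRLoader
#   opts = FunASROptions(device="cuda:0", language="zh")
#   loader = FunASRLoader("meeting.wav", **opts.as_kwargs())
```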

Why FunASR

Self-hosted: no API keys, and no data leaves your machine.
Fast: SenseVoice is non-autoregressive, which makes inference much faster than autoregressive models such as Whisper.
Multilingual: strong on Chinese, with support for 50+ languages.

⭐ If this helps, star FunASR and SenseVoice.

License

Apache-2.0
