Qwen3 ASR 0.6B Demo
This demo uses the Qwen3-ASR-0.6B model for automatic speech recognition on CPU.
Note: Because the model runs on CPU, inference may take 10-30 seconds, depending on audio length.
Input
Language
Convert numbers to digits, etc. (Chinese & Japanese only)
Output
Keyboard Shortcuts
- Ctrl/Cmd + Enter: Start transcription
- Esc: Clear audio input
Tips for Best Results
- Use clear audio recordings
- Supported audio formats: WAV, MP3, FLAC, OGG
- Mono, 16 kHz sample rate recommended
- For long audio, consider splitting into shorter segments
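The last two tips can be sketched in a few lines of Python. This is only an illustration, not the demo's own preprocessing: the helper names `downmix_to_mono` and `segment_bounds` are hypothetical, and the 30-second segment length and 1-second overlap are assumed values chosen for the example, not settings taken from this demo or the model card.

```python
from typing import List, Sequence, Tuple


def downmix_to_mono(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average the channels of each frame to get a single mono channel."""
    return [sum(frame) / len(frame) for frame in frames]


def segment_bounds(
    num_samples: int,
    sample_rate: int = 16_000,      # matches the recommended 16 kHz
    segment_seconds: float = 30.0,  # assumed segment length
    overlap_seconds: float = 1.0,   # assumed overlap between neighbours
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices for overlapping segments."""
    segment_len = int(segment_seconds * sample_rate)
    overlap_len = int(overlap_seconds * sample_rate)
    if segment_len <= 0 or not 0 <= overlap_len < segment_len:
        raise ValueError("need segment_seconds > 0 and 0 <= overlap < segment")

    step = segment_len - overlap_len
    bounds: List[Tuple[int, int]] = []
    start = 0
    while start < num_samples:
        end = min(start + segment_len, num_samples)
        bounds.append((start, end))
        if end == num_samples:
            break  # the last segment reached the end of the audio
        start += step
    return bounds


if __name__ == "__main__":
    # A 70-second recording at 16 kHz is split into three segments.
    for start, end in segment_bounds(70 * 16_000):
        print(f"{start / 16_000:.1f}s to {end / 16_000:.1f}s")
```

Each `(start, end)` pair can be used to slice the mono sample list, and each slice can then be transcribed separately. A small overlap reduces the chance of cutting a word in half at a boundary, at the cost of some repeated text that may need to be merged afterwards.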
Supported Languages
This model supports 90+ languages including:
- English, Chinese, Spanish, French, German
- Japanese, Korean, Arabic, Hindi, Russian
- And many more...
Model Details
- Architecture: Qwen3 Audio-Speech-Recognition
- Parameters: 0.6 billion
- License: Apache 2.0
- Model Card: Qwen/Qwen3-ASR-0.6B