Kaldi (software)
Open-source speech recognition software toolkit
Kaldi is an open-source toolkit for speech recognition and signal processing, written in C++ and freely available under the Apache License v2.0.
| Kaldi | |
|---|---|
| Developers | Daniel Povey and others |
| Stable release | 5.5.636 / February 2020 |
| Written in | C++ |
| Operating system | Unix systems (Linux, BSD, OSX 10.{8,9} etc.), Windows (via Cygwin) |
| Type | Speech recognition |
| License | Apache License v.2.0[1] |
| Website | kaldi-asr |
| Repository | https://github.com/kaldi-asr/kaldi |
Kaldi aims to provide software that is flexible and extensible,[2] and is intended for automatic speech recognition (ASR) researchers building recognition systems.
It supports linear transforms, MMI, boosted MMI and MCE discriminative training, feature-space discriminative training, and deep neural networks.[3]
Kaldi can generate acoustic features such as MFCC, filterbank (fbank), and fMLLR features. In recent deep neural network research, it is therefore often used to pre-process raw waveforms into acoustic features for end-to-end neural models.
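As a rough illustration of what filterbank-style feature extraction involves, the following Python sketch turns a waveform into log mel filterbank features. It is not Kaldi's own implementation: the function names, the 25 ms frame length, 10 ms frame shift, 23 mel bins, and the Hamming window are assumptions chosen for illustration, and steps such as dithering and pre-emphasis are omitted. The naive DFT keeps the example dependency-free at the cost of speed.

```python
import cmath
import math


def hz_to_mel(hz):
    # Common mel-scale formula; assumed here, not taken from the article.
    return 1127.0 * math.log(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (math.exp(mel / 1127.0) - 1.0)


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (slow, but needs no libraries)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def mel_filterbank(num_bins, fft_size, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    low, high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(low + (high - low) * i / (num_bins + 1))
             for i in range(num_bins + 2)]
    filters = []
    for b in range(num_bins):
        left, center, right = edges[b], edges[b + 1], edges[b + 2]
        weights = []
        for k in range(fft_size // 2 + 1):
            freq = k * sample_rate / fft_size
            if left < freq <= center:
                weights.append((freq - left) / (center - left))
            elif center < freq < right:
                weights.append((right - freq) / (right - center))
            else:
                weights.append(0.0)
        filters.append(weights)
    return filters


def log_mel_features(samples, sample_rate=16000, frame_len=400,
                     frame_shift=160, num_bins=23):
    """Return one list of num_bins log mel energies per frame."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]
    filters = mel_filterbank(num_bins, frame_len, sample_rate)
    features = []
    for start in range(0, len(samples) - frame_len + 1, frame_shift):
        frame = [s * w for s, w in zip(samples[start:start + frame_len], window)]
        power = power_spectrum(frame)
        # Floor the energy so that silent frames do not produce log(0).
        features.append([
            math.log(max(sum(w * p for w, p in zip(f, power)), 1e-10))
            for f in filters
        ])
    return features
```

Calling `log_mel_features(samples)` on a list of floating-point samples returns a list of frames, each holding one log energy per mel bin. A feature matrix of this general shape is what an end-to-end neural model would typically consume in place of the raw waveform.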
Kaldi has been incorporated as part of the CHiME Speech Separation and Recognition Challenge over several successive events.[4][5][6] The software was initially developed as part of a 2009 workshop at Johns Hopkins University.[7]
Kaldi is named after Kaldi, the legendary Ethiopian goat herder who is said to have discovered the coffee plant.[8]