2020年5月27日 星期三

DeepSpeech: Speech to text AI model.

  • Overview

"DeepSpeech is an open source Speech-To-Text engine, using a model trained by machine learning techniques based on Baidu’s Deep Speech research paper."
DeepSpeech provide lots of lauange api support, Python Javascript, c, and it's easily use to involve in application

  • Install DeepSpeech

Follow user guide instruction.

  • Demo

Using command line tool to inference sound data.

$> deepspeech --model deepspeech-0.7.0-models.pbmm --audio audio/2830-3980-0043.wav

Output:
Loading model from file deepspeech-0.7.0-models.pbmm
TensorFlow: v1.15.0-24-gceb46aa
DeepSpeech: v0.7.1-0-g2e9c281
Loaded model in 0.0093s.
Loading scorer from files deepspeech-0.7.0-models.scorer
Loaded scorer in 0.00023s.
Running inference.
experience proves this
Inference took 1.480s for 1.975s audio file.
 
The red color string is inference text data of input sound data.

Using Python API to inference sound data.

Output:
your power is sufficient i said sound data


Reference:

沒有留言:

張貼留言

Linux driver: How to enable dynamic debug at booting time for built-in driver.

 Dynamic debug is useful for debug driver, and can be enable by: 1. Mount debug fs #>mount -t debugfs none /sys/kernel/debug 2. Enable dy...