Whisper ASR Box is a general-purpose speech recognition toolkit. Whisper models are trained on a large dataset of diverse audio and are multitask models that can perform multilingual speech recognition, speech translation, and language identification.
## rbb Features (for GPU acceleration and persistent cache)
To support voice_activity_detection the faster_whisper model has to be used:
...
```
docker run -d -p 9000:9000 \
  --env-file ./.env \
  --gpus all \
  -v $PWD/cache:/data/whisper \
  -v ISILON_transcript_files:/files \
  image_name
```
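
The `docker run` command above expects an env file and a host cache directory to already be in place. Docker refuses to start the container when the file passed to `--env-file` is missing, and a missing bind-mount source given with `-v` is created by the daemon, typically owned by root. A minimal pre-flight sketch follows; the `preflight` function name and its default arguments are assumptions for illustration and are not part of this project.

```shell
# Hypothetical helper (not part of this repository): checks the inputs
# that the docker run command relies on before the container is started.
preflight() {
  env_file="${1:-./.env}"        # file passed to --env-file
  cache_dir="${2:-$PWD/cache}"   # host side of the model cache mount

  # docker run aborts if the env file does not exist, so fail early.
  if [ ! -f "$env_file" ]; then
    echo "missing env file: $env_file" >&2
    return 1
  fi

  # Create the cache directory as the current user so that Docker
  # does not create it with root ownership.
  mkdir -p "$cache_dir"
}
```

Calling `preflight && docker run ...` would then start the container only when both prerequisites are met.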
## Environment Variables
Key configuration options (see .env.example for default values):