From c64686c4da9ab232f90f61973f6831068cc9a692 Mon Sep 17 00:00:00 2001
From: Subliminal Guy <subliminal_kid@posteo.de>
Date: Sun, 1 Jun 2025 13:12:38 +0200
Subject: [PATCH] Change README

---
 README.md | 66 +++++++------------------------------------------------
 1 file changed, 8 insertions(+), 58 deletions(-)

diff --git a/README.md b/README.md
index e8cd150..b657d98 100644
--- a/README.md
+++ b/README.md
@@ -1,62 +1,26 @@
-
-
-
-
-
 # Whisper ASR Box
 
 Whisper ASR Box is a general-purpose speech recognition toolkit. Whisper Models are trained on a large dataset of diverse audio and is also a multitask model that can perform multilingual speech recognition as well as speech translation and language identification.
 
-## Features
+## rbb Features (for GPU acceleration and persistent cache)
 
-Current release (v1.8.2) supports following whisper models:
+To support voice activity detection, the faster_whisper model has to be used:
 
-- [openai/whisper](https://github.com/openai/whisper)@[v20240930](https://github.com/openai/whisper/releases/tag/v20240930)
 - [SYSTRAN/faster-whisper](https://github.com/SYSTRAN/faster-whisper)@[v1.1.0](https://github.com/SYSTRAN/faster-whisper/releases/tag/v1.1.0)
-- [whisperX](https://github.com/m-bain/whisperX)@[v3.1.1](https://github.com/m-bain/whisperX/releases/tag/v3.1.1)
-
-## Quick Usage
-
-### CPU
-
-```shell
-docker run -d -p 9000:9000 \
-  -e ASR_MODEL=base \
-  -e ASR_ENGINE=openai_whisper \
-  onerahmet/openai-whisper-asr-webservice:latest
-```
-
-### GPU
-
-```shell
-docker run -d --gpus all -p 9000:9000 \
-  -e ASR_MODEL=base \
-  -e ASR_ENGINE=openai_whisper \
-  onerahmet/openai-whisper-asr-webservice:latest-gpu
-```
-
-#### Cache
+Before starting the container, create a .env file with the content from the .env.example file.
 
-To reduce container startup time by avoiding repeated downloads, you can persist the cache directory:
+The container then has to be started with the following command:
 
 ```shell
 docker run -d -p 9000:9000 \
-  -v $PWD/cache:/root/.cache/ \
-  onerahmet/openai-whisper-asr-webservice:latest
+  --env-file ./.env \
+  --gpus all \
+  -v $PWD/cache:/data/whisper \
+  image_name
 ```
 
-## Key Features
-
-- Multiple ASR engines support (OpenAI Whisper, Faster Whisper, WhisperX)
-- Multiple output formats (text, JSON, VTT, SRT, TSV)
-- Word-level timestamps support
-- Voice activity detection (VAD) filtering
-- Speaker diarization (with WhisperX)
-- FFmpeg integration for broad audio/video format support
-- GPU acceleration support
-- Configurable model loading/unloading
-- REST API with Swagger documentation
-
 ## Environment Variables
 
 Key configuration options:
@@ -72,20 +36,6 @@ Key configuration options:
 
 For complete documentation, visit: [https://ahmetoner.github.io/whisper-asr-webservice](https://ahmetoner.github.io/whisper-asr-webservice)
 
-## Development
-
-```shell
-# Install poetry
-pip3 install poetry
-
-# Install dependencies
-poetry install
-
-# Run service
-poetry run whisper-asr-webservice --host 0.0.0.0 --port 9000
-```
-
-After starting the service, visit `http://localhost:9000` or `http://0.0.0.0:9000` in your browser to access the Swagger UI documentation and try out the API endpoints.
 
 ## Credits
 
--
GitLab
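
The new instructions reference a `.env.example` file that is not included in this patch, so the exact variable set cannot be confirmed from the diff alone. As a rough sketch only, based on the `ASR_MODEL`/`ASR_ENGINE` variables used in the removed docker commands and the note that `faster_whisper` is required for voice activity detection, the resulting `.env` might look like:

```shell
# Hypothetical .env sketch -- copy .env.example from the repository and adjust.
# Only ASR_MODEL and ASR_ENGINE are evidenced by this patch; any further
# variables defined in .env.example are not shown here.
ASR_MODEL=base
ASR_ENGINE=faster_whisper
```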
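
Once started with the command above, the service listens on port 9000. Assuming this image keeps the upstream whisper-asr-webservice REST API (a POST `/asr` endpoint taking a multipart `audio_file` field, plus `output` and `vad_filter` query parameters; none of this is confirmed by the patch itself), a quick smoke test might look like:

```shell
# Hypothetical smoke test -- endpoint name, form field and query parameters
# are assumed from the upstream whisper-asr-webservice API, not from this patch.
curl -X POST "http://localhost:9000/asr?output=json&vad_filter=true" \
  -F "audio_file=@sample.wav"
```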