Whisper Web UI
Whisper Web UI is a Streamlit web application (whisper_webui.py) and a command-line interface (whisper_cli.py) for transcribing audio files using the Whisper Large v3 model via either the OpenAI or Groq API. It offers a user-friendly interface for uploading audio, processing it, and obtaining transcriptions quickly and efficiently.
Features
- Automatic compression for files larger than 25 MB
- Support for multiple audio formats (mp3, mp4, mpeg, mpga, m4a, wav, webm)
- Transcription using the Whisper Large v3 model through the OpenAI, Groq, or Fal API
- Display of transcription time and results
- Option to copy transcript to clipboard
- Ability to save transcript to a file
- Both web-based and command-line interfaces
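The format and size checks implied by the feature list can be sketched as follows. This is a minimal illustration, not the project's actual code: the names `MAX_UPLOAD_BYTES`, `SUPPORTED_EXTENSIONS`, `is_supported`, and `needs_compression` are hypothetical, and the sketch assumes "25 MB" means 25 × 1024 × 1024 bytes.

```python
from pathlib import Path

# Values taken from the feature list; all names here are hypothetical.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024  # assumption: "25 MB" means 25 MiB
SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}


def is_supported(path: str) -> bool:
    """Return True if the file extension is one of the listed audio formats."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def needs_compression(size_bytes: int, limit: int = MAX_UPLOAD_BYTES) -> bool:
    """Return True if a file of this size exceeds the upload limit."""
    return size_bytes > limit
```

For a file on disk, the size could be obtained with `Path("input.mp3").stat().st_size` and passed to `needs_compression`.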
Installation
- Clone this repository:
    git clone https://github.com/yourusername/audio-transcription-app.git
    cd audio-transcription-app
- Install the required dependencies:
    pip install streamlit groq openai pydub pyperclip
- Set up your API keys as environment variables:
    export OPENAI_API_KEY='your_openai_api_key_here'
    export GROQ_API_KEY='your_groq_api_key_here'
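Before running either interface, it can help to confirm that the key for the chosen API is actually present in the environment. The sketch below is one possible check, not the project's own logic: the variable names come from the setup step, while `API_KEY_VARS` and `get_api_key` are hypothetical.

```python
import os

# Environment variable names from the setup step; the mapping and the
# function name are hypothetical.
API_KEY_VARS = {"openai": "OPENAI_API_KEY", "groq": "GROQ_API_KEY"}


def get_api_key(api: str, env=None) -> str:
    """Return the API key for the chosen backend, or raise a clear error."""
    env = os.environ if env is None else env
    var = API_KEY_VARS[api]
    key = env.get(var, "").strip()
    if not key:
        raise RuntimeError(f"{var} is not set; export it before running.")
    return key
```

Passing `env` explicitly makes the function easy to test; in normal use, `get_api_key("groq")` reads from `os.environ`.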
Usage
Streamlit Web Application (whisper_webui.py)
- Run the Streamlit app:
    streamlit run whisper_webui.py
- Open your web browser and navigate to the provided local URL (typically http://localhost:8501).
- Use the interface to upload an audio file, process it, and view the transcription results.
Command-Line Interface (whisper_cli.py)
The CLI version offers more flexibility and options for transcription. Here's how to use it:
python whisper_cli.py [-h] -i INPUT [-o OUTPUT] [--compress-only] [-c] [--api {openai,groq}]
Options:
- -i INPUT, --input INPUT: Input audio file (required)
- -o OUTPUT, --output OUTPUT: Output filename for the transcript
- --compress-only: Compress the audio file only (no transcription)
- -c, --clipboard: Copy the transcription text to the system clipboard
- --api {openai,groq}: Choose API for transcription (default: openai)
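For readers who want to see how this option set maps onto Python's standard `argparse` module, here is a minimal sketch that reproduces the usage line and options listed above. It is an illustration only; `build_parser` is a hypothetical name, and the real whisper_cli.py may define its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the documented usage line (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="whisper_cli.py",
        description="Transcribe or compress an audio file.",
    )
    parser.add_argument("-i", "--input", required=True,
                        help="Input audio file (required)")
    parser.add_argument("-o", "--output",
                        help="Output filename for the transcript")
    parser.add_argument("--compress-only", action="store_true",
                        help="Compress the audio file only (no transcription)")
    parser.add_argument("-c", "--clipboard", action="store_true",
                        help="Copy the transcription text to the system clipboard")
    parser.add_argument("--api", choices=["openai", "groq"], default="openai",
                        help="Choose API for transcription (default: openai)")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

With `choices`, argparse rejects any `--api` value other than `openai` or `groq`, and `--compress-only` becomes available as `args.compress_only`.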
Examples:
- Transcribe an audio file using the OpenAI API:
    python whisper_cli.py -i input.mp3 -o transcript.txt
- Transcribe using the Groq API and copy to clipboard:
    python whisper_cli.py -i input.wav --api groq -c
- Compress an audio file without transcribing:
    python whisper_cli.py -i large_file.mp3 --compress-only
Note on Fal API Usage
If using the Fal.ai API for transcription, the application uploads your audio file to tmpfiles.org. This step is necessary because the Fal API requires input files to be accessible via a public URL.
Please ensure that you have the necessary rights to upload your audio and make it publicly accessible. Do not use this method for sensitive recordings.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is released under the MIT License.