A small week end project: Oreille is a wrapper on OpenAPI Whisper API. It provides support for long audio files.

OpenAPI Whisper support only files that are less than 25 MB. Oreille will break the audio file into chunks of 25 MB’s or less. https://platform.openai.com/docs/guides/speech-to-text/longer-inputs

Oreille will also compute the correct timing of the subtitle when merging the output of Whisper. So once you export the subtitle the timestamp of the subtitle will be right.

You can open and save WAV files with pure python. For opening and saving non-wav files – like mp3 – you’ll need ffmpeg or libav.

View project on Github