Earlier this January, I deployed an old NVIDIA GTX 1080 Ti, a CUDA-capable GPU, with an old Intel NUC and used OpenAI Whisper to transcribe 557 video recordings that take up 865 GB of storage on my NAS. The transcription process generated a TXT file as well as synchronized SRT/VTT subtitles for each video.
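For anyone who wants to try a similar batch run, here is a rough sketch of how the Whisper command line could be driven from Python. It is not the exact setup used here: the model size ("medium"), the folder names, and the helper name build_whisper_command are all assumptions for illustration. It relies only on the openai-whisper CLI options --model, --device, --output_format and --output_dir, where the "all" format writes TXT, SRT and VTT (plus TSV and JSON) in one pass.

```python
import shlex
from pathlib import Path


def build_whisper_command(video, output_dir, model="medium"):
    """Build the argument list for one openai-whisper CLI run.

    The model size and output directory are illustrative assumptions,
    not the settings used for the recordings described in this post.
    """
    return [
        "whisper", str(video),
        "--model", model,            # assumed model size
        "--device", "cuda",          # run on the CUDA-capable GPU
        "--output_format", "all",    # txt, srt, vtt, tsv and json
        "--output_dir", str(output_dir),
    ]


if __name__ == "__main__":
    videos = Path("recordings")      # hypothetical folder on the NAS
    out = Path("transcripts")        # hypothetical output folder
    for video in sorted(videos.glob("*.mp4")):
        # Skip videos that already have a transcript, so an interrupted
        # batch can be resumed without redoing finished files.
        if (out / f"{video.stem}.txt").exists():
            continue
        # Print the command; pass the list to subprocess.run(..., check=True)
        # to actually execute it.
        print(shlex.join(build_whisper_command(video, out)))
```

The script only prints the commands, which makes it easy to review the batch before committing many GPU hours to it.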
A simple one-liner,
cat *.txt | wc -w
counted that 10,429,652 words were spoken, and VLC calculated a playback duration of 1356 hours across these 557 recording sessions.
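If you would rather do the word count in Python than in the shell, a minimal sketch could look like the following. The function name count_words is my own, and I am assuming the transcripts are UTF-8 text files sitting in a single directory. Splitting on whitespace mirrors what wc -w does, although the two can disagree slightly on unusual Unicode whitespace.

```python
from pathlib import Path


def count_words(directory="."):
    """Sum whitespace-separated words across all .txt files in a directory.

    Rough Python counterpart of `cat *.txt | wc -w`.
    """
    total = 0
    for path in sorted(Path(directory).glob("*.txt")):
        # errors="replace" keeps the count going if a file has stray bytes
        text = path.read_text(encoding="utf-8", errors="replace")
        total += len(text.split())
    return total


if __name__ == "__main__":
    print(count_words("."))
```

Run from the folder that holds the transcripts, this should print a total close to the shell result.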
All of this is available right here for you to watch.