Thanks for your interest in Cloud Genius®

On January 1st, 2009, I founded Cloud Genius with the goal of helping people like you realize their dream of gaining hands-on skills.

Over the past 14+ years, 1,760 people of 15 nationalities from 4 continents have successfully completed our programs to up-skill themselves. People like you have designed, built and deployed production-grade cloud services and accomplished their career goals. This is what makes me super-proud.

💡

Transcribing 14+ years of my presentations

This year, I plan to use OpenAI Whisper to transcribe all the presentations I have recorded over the 14+ years since starting Cloud Genius.

I plan to post my transcription results from OpenAI Whisper on this website. You will receive my updates as I run the machine learning models.

Whisper is an open source tool from OpenAI. As a first test, I am going to feed Whisper a video of a native speaker who teaches how to enunciate. This should be an easy one for Whisper to transcribe.

Whisper from Open AI successfully transcribed the first one. Here it is. Subscribe now to receive updates. It's FREE! https://t.co/L7VPi7MqnA My NVIDIA GPU is just getting a bit warm.
— The Last Cloud₿ender⚡ (@K9LVN) January 7, 2023

First, I install yt-dlp to make it easy to fetch videos locally as needed, and FFmpeg for A/V processing.

sudo apt update && sudo apt install -y yt-dlp ffmpeg

Then, I use my handy script to set up conda.

echo Intel CPU assumed
echo Using my preferred $HOME/miniconda install location

rm -rf $HOME/miniconda ~/miniconda.sh
# Linux installer, since this machine uses apt (the MacOSX installer will not run here)
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh
bash ~/miniconda.sh -b -p $HOME/miniconda
rm -rf ~/miniconda.sh

source $HOME/miniconda/bin/activate
conda init bash
conda config --set auto_activate_base false
conda update -n base -c defaults conda -y
conda update --all -y

To get a clean Python environment for Whisper-related work, I set up a new conda env named w with Python 3.9 and activate it. Finally, I install Whisper using pip in that clean conda environment.

conda activate base
conda create --name w python=3.9 -y
conda activate w

pip install git+https://github.com/openai/whisper.git
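Before moving on, a quick sanity check can confirm that the command-line tools are on the PATH. This is only a minimal sketch: the check_tool helper is a hypothetical name introduced here, and the nvidia-smi entry only matters when an NVIDIA GPU is in use.

```shell
#!/usr/bin/env bash
# Report whether a given command is available on the PATH.
# check_tool is a hypothetical helper, not part of any of these tools.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found:   $1"
  else
    echo "missing: $1"
  fi
}

# Tools used in this walkthrough; nvidia-smi is optional (GPU only).
for tool in yt-dlp ffmpeg whisper nvidia-smi; do
  check_tool "$tool"
done
```

Any line that starts with "missing:" points at an install step that needs another look before transcribing.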

Now my machine is ready to run Whisper and transcribe whatever I need. I downloaded the example video and asked Whisper to transcribe it. I chose the medium model and told Whisper to assume English as the spoken language; otherwise, Whisper spends a few seconds detecting the language being spoken. Whisper supports many languages; see its GitHub page for its complete capabilities.
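The exact commands for this step are not shown, so here is a minimal sketch of what the download and transcription might look like. The video URL, the file name enunciate.wav, and the run helper are placeholders made up for illustration. The --model and --language options come from Whisper's command line, and -x with --audio-format asks yt-dlp to extract the audio track only. By default the script just prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/usr/bin/env bash
# Sketch only: VIDEO_URL and MEDIA_FILE are placeholders, not the real video.
VIDEO_URL="${VIDEO_URL:-https://example.com/watch?v=PLACEHOLDER}"
MEDIA_FILE="${MEDIA_FILE:-enunciate.wav}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = actually run them

# Print the command, and execute it only when DRY_RUN=0.
# run is a hypothetical helper used to keep this sketch safe to try.
run() {
  printf '+ %s\n' "$*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# Fetch the video and keep only its audio track as a WAV file.
run yt-dlp -x --audio-format wav -o "${MEDIA_FILE%.*}.%(ext)s" "$VIDEO_URL"

# Transcribe with the medium model and skip language detection.
run whisper "$MEDIA_FILE" --model medium --language English
```

Extracting audio only keeps the local file small, and Whisper needs nothing more than the audio track. Passing --language up front avoids the short detection pass mentioned above.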

[00:00.000 --> 00:06.320]  How to Enunciate.
[00:06.320 --> 00:10.000]  Want to get the attention, the respect, and even the dates you've been missing out on?
[00:10.000 --> 00:11.760]  You can start by speaking clearly.
[00:11.760 --> 00:20.160]  You will need Mirror Voice recorder Cork or pencil and sense of humor.
[00:20.160 --> 00:21.160]  Step 1.
[00:21.160 --> 00:24.960]  Stand in front of the mirror and pretend you're having a conversation with a friend.
[00:24.960 --> 00:29.280]  It's much easier to identify the places where you slur if you watch yourself speak.
[00:29.280 --> 00:30.280]  Step 2.
[00:30.280 --> 00:35.080]  Stretch your face as wide as it will go, and then scrunch it up as small as you can.
[00:35.080 --> 00:38.000]  Move your jaw from side to side and back and forth.
[00:38.000 --> 00:41.440]  Stick your tongue out as far as it will go, and retract it.
[00:41.440 --> 00:43.480]  Repeat these steps several times.
[00:43.480 --> 00:47.600]  Stretching your face, jaw, and tongue makes it easier to form words clearly.
[00:47.600 --> 00:48.760]  Step 3.
[00:48.760 --> 00:53.200]  Stand in front of the mirror and repeat vocal exercises that'll help you loosen your tongue,
[00:53.200 --> 00:54.200]  lips, and jaw.
[00:54.200 --> 00:59.520]  Try to make every sound distinct, emphasizing both consonants and vowels.
[00:59.520 --> 01:01.680]  Say and repeat these clearly.
[01:01.680 --> 01:12.280]  B-b-b, w-w-w, b-b-b, w-w-w, p-p-p, f-f-f, p-p-p, f-f-f, gutta-butta, gutta-butta.
[01:12.280 --> 01:14.560]  Red leather, yellow leather.
[01:14.560 --> 01:15.840]  Step 4.
[01:15.840 --> 01:20.180]  Repeat tongue twisters slowly and deliberately to yourself in the mirror, and make sure that
[01:20.180 --> 01:22.960]  you can hear each separate consonant and syllable.
[01:22.960 --> 01:27.480]  Over time, say the phrases faster and faster, making sure you can still hear each part of
[01:27.480 --> 01:29.320]  the word clearly.
[01:29.320 --> 01:31.080]  Say and repeat these clearly.
[01:31.080 --> 01:34.160]  A noisy noise annoys an oyster.
[01:34.160 --> 01:36.080]  Lovely lemon liniment.
[01:36.080 --> 01:39.080]  Twelve twins twirled twelve twigs.
[01:39.080 --> 01:40.080]  Step 5.
[01:40.080 --> 01:43.000]  Record yourself reading a paragraph from your favorite book.
[01:43.000 --> 01:48.380]  Now gently hold a pencil or the small end of a cork just behind your front teeth.
[01:48.380 --> 01:53.680]  Carefully read the paragraph aloud several times, making every letter as clear as possible.
[01:53.680 --> 01:57.680]  Remove the cork or pencil and record yourself reading the paragraph again.
[01:57.680 --> 02:00.720]  The second recording will be much clearer.
[02:00.720 --> 02:01.840]  Step 6.
[02:01.840 --> 02:06.120]  Focus on making consonants and syllables as clear as possible, since they provide the
[02:06.120 --> 02:09.080]  most structure in words and sentences.
[02:09.080 --> 02:10.400]  Step 7.
[02:10.400 --> 02:13.880]  Take ten minutes each day to repeat your speech exercises.
[02:13.880 --> 02:17.260]  You may look silly as you talk to the mirror, but you'll sound great when you're speaking
[02:17.260 --> 02:18.260]  in public.
[02:18.260 --> 02:23.640]  Did you know As an aspiring actress, Kathleen Turner perfected her diction by biting down
[02:23.640 --> 02:51.280]  on pencil erasers while practicing speech exercises.

💡

Whisper transcription is output in TXT, SRT and VTT formats.

SRT and VTT formats are suitable for adding time-synchronized closed captioning.
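To show how the SRT layout differs from the console output above, here is a small sketch that converts lines of the form [MM:SS.mmm --> MM:SS.mmm] into numbered SRT cues. Whisper already writes its own SRT files, so this is purely illustrative; the to_srt function is a hypothetical name, and the sketch assumes each input line carries exactly one timestamp pair followed by the text.

```shell
#!/usr/bin/env bash
# Convert Whisper console lines to SRT cues (illustration only).
# to_srt is a hypothetical helper; it reads stdin and writes stdout.
to_srt() {
  awk '
    # Turn MM:SS.mmm (or HH:MM:SS.mmm) into the SRT form HH:MM:SS,mmm.
    function srt_time(t,   parts, n) {
      n = split(t, parts, ":")
      if (n == 2) t = "00:" t
      sub(/\./, ",", t)
      return t
    }
    # Match lines such as: [00:06.320 --> 00:10.000]  Some text
    /^\[[0-9:.]+ --> [0-9:.]+\]/ {
      start = substr($1, 2)              # drop the leading "["
      end = $3
      sub(/\]$/, "", end)                # drop the trailing "]"
      text = $0
      sub(/^\[[0-9:.]+ --> [0-9:.]+\][ \t]*/, "", text)  # keep the spoken text
      printf "%d\n%s --> %s\n%s\n\n", ++cue, srt_time(start), srt_time(end), text
    }
  '
}

# Example using one line from the transcript above.
printf '%s\n' '[00:10.000 --> 00:11.760]  You can start by speaking clearly.' | to_srt
```

Each SRT cue is a sequence number, a time range with a comma before the milliseconds, the caption text, and a blank line. VTT is similar but starts with a WEBVTT header line and uses a period before the milliseconds.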

You are reading The Cloud Seminar, a project by Cloud Genius founder @K9LVN.
