Thanks for your interest in Cloud Genius®

Me speaking to a live audience at our conference in Seattle while simulcasting it over the web.

On January 1st, 2009, I founded Cloud Genius with the goal of helping people like you realize their dream of gaining hands-on skills.

Over the past 14+ years, 1,760 people of 15 nationalities from 4 continents have successfully completed our programs to up-skill themselves. People like you have designed, built, and deployed production-grade cloud services and accomplished their career goals. This is what makes me super proud.

Transcribing 14+ years of my presentations

This year, I plan to use OpenAI Whisper to transcribe all the presentations I have recorded over the 14+ years since starting Cloud Genius.

I plan to post the transcription results from Whisper on this website. You will receive my updates as I run the machine learning models.

Whisper is an open source tool from OpenAI. As a first test, I am going to feed Whisper a video of a native speaker who teaches how to enunciate. This should be an easy one for Whisper to transcribe.

First, I install yt-dlp, which makes it easy to fetch videos locally as needed, and FFmpeg for A/V processing.

sudo apt update && sudo apt install -y yt-dlp ffmpeg

Then, I use my handy script to set up conda.

echo Intel CPU assumed
echo Using my preferred $HOME/miniconda install location

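# NOTE: the installer file name and download URL are missing from this listing;
# each bare "~/" below is an incomplete path. Restore them before running.
# As written, "rm -rf ~/" would delete your entire home directory.
rm -rf $HOME/miniconda ~/
wget -O ~/
bash ~/ -b -p $HOME/miniconda
rm -rf ~/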

source $HOME/miniconda/bin/activate
conda init bash
conda config --set auto_activate_base false
conda update -n base -c defaults conda -y
conda update --all -y

To get a clean Python environment for Whisper-related work, I set up a new conda env named w with Python 3.9 and activate it. Finally, I install Whisper using pip in that clean conda environment.

conda activate base
conda create --name w python=3.9 -y
conda activate w

pip install git+
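Before moving on, it can help to confirm that the command-line tools installed so far are actually on the PATH. The following is only a minimal sketch: check_tools is a hypothetical helper name used for illustration, not part of yt-dlp, FFmpeg, or Whisper.

```shell
# check_tools is a hypothetical helper: it reports every named tool that
# cannot be found on the PATH and returns non-zero if any is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

With the w environment active, running check_tools yt-dlp ffmpeg whisper should print nothing if the setup above succeeded.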

Now my machine is ready to run Whisper and transcribe whatever I need. I download the example video and ask Whisper to transcribe it. I run Whisper with the medium model and tell it to assume English as the spoken language; otherwise, Whisper spends a few seconds detecting the language being spoken. Yes, Whisper supports many languages. See the project's GitHub page for its complete capabilities.
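The download-and-transcribe step can be sketched roughly as follows. The function names run and transcribe_url, the DRY_RUN switch, and the audio.mp3 file name are assumptions made for illustration. The yt-dlp and whisper options used are standard flags of those tools, and extracting only the audio track, rather than keeping the full video, is a choice made here to keep the example small.

```shell
# Sketch only: run, transcribe_url, DRY_RUN and "audio.mp3" are hypothetical.

# Print a command, then execute it unless DRY_RUN=1 is set.
run() {
  printf '+ %s\n' "$*"
  if [ "${DRY_RUN:-0}" != "1" ]; then "$@"; fi
}

# Fetch the audio track of the video at URL $1 and transcribe it.
transcribe_url() {
  # -x extracts audio via FFmpeg; the output template yields audio.mp3.
  run yt-dlp -x --audio-format mp3 -o 'audio.%(ext)s' "$1" &&
  # Medium model; fixing the language skips automatic language detection.
  run whisper audio.mp3 --model medium --language en
}
```

Calling DRY_RUN=1 transcribe_url followed by a video URL only prints the two commands, which is a cheap way to inspect them before committing to a long transcription run.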

[00:00.000 --> 00:06.320]  How to Enunciate.
[00:06.320 --> 00:10.000]  Want to get the attention, the respect, and even the dates you've been missing out on?
[00:10.000 --> 00:11.760]  You can start by speaking clearly.
[00:11.760 --> 00:20.160]  You will need Mirror Voice recorder Cork or pencil and sense of humor.
[00:20.160 --> 00:21.160]  Step 1.
[00:21.160 --> 00:24.960]  Stand in front of the mirror and pretend you're having a conversation with a friend.
[00:24.960 --> 00:29.280]  It's much easier to identify the places where you slur if you watch yourself speak.
[00:29.280 --> 00:30.280]  Step 2.
[00:30.280 --> 00:35.080]  Stretch your face as wide as it will go, and then scrunch it up as small as you can.
[00:35.080 --> 00:38.000]  Move your jaw from side to side and back and forth.
[00:38.000 --> 00:41.440]  Stick your tongue out as far as it will go, and retract it.
[00:41.440 --> 00:43.480]  Repeat these steps several times.
[00:43.480 --> 00:47.600]  Stretching your face, jaw, and tongue makes it easier to form words clearly.
[00:47.600 --> 00:48.760]  Step 3.
[00:48.760 --> 00:53.200]  Stand in front of the mirror and repeat vocal exercises that'll help you loosen your tongue,
[00:53.200 --> 00:54.200]  lips, and jaw.
[00:54.200 --> 00:59.520]  Try to make every sound distinct, emphasizing both consonants and vowels.
[00:59.520 --> 01:01.680]  Say and repeat these clearly.
[01:01.680 --> 01:12.280]  B-b-b, w-w-w, b-b-b, w-w-w, p-p-p, f-f-f, p-p-p, f-f-f, gutta-butta, gutta-butta.
[01:12.280 --> 01:14.560]  Red leather, yellow leather.
[01:14.560 --> 01:15.840]  Step 4.
[01:15.840 --> 01:20.180]  Repeat tongue twisters slowly and deliberately to yourself in the mirror, and make sure that
[01:20.180 --> 01:22.960]  you can hear each separate consonant and syllable.
[01:22.960 --> 01:27.480]  Over time, say the phrases faster and faster, making sure you can still hear each part of
[01:27.480 --> 01:29.320]  the word clearly.
[01:29.320 --> 01:31.080]  Say and repeat these clearly.
[01:31.080 --> 01:34.160]  A noisy noise annoys an oyster.
[01:34.160 --> 01:36.080]  Lovely lemon liniment.
[01:36.080 --> 01:39.080]  Twelve twins twirled twelve twigs.
[01:39.080 --> 01:40.080]  Step 5.
[01:40.080 --> 01:43.000]  Record yourself reading a paragraph from your favorite book.
[01:43.000 --> 01:48.380]  Now gently hold a pencil or the small end of a cork just behind your front teeth.
[01:48.380 --> 01:53.680]  Carefully read the paragraph aloud several times, making every letter as clear as possible.
[01:53.680 --> 01:57.680]  Remove the cork or pencil and record yourself reading the paragraph again.
[01:57.680 --> 02:00.720]  The second recording will be much clearer.
[02:00.720 --> 02:01.840]  Step 6.
[02:01.840 --> 02:06.120]  Focus on making consonants and syllables as clear as possible, since they provide the
[02:06.120 --> 02:09.080]  most structure in words and sentences.
[02:09.080 --> 02:10.400]  Step 7.
[02:10.400 --> 02:13.880]  Take ten minutes each day to repeat your speech exercises.
[02:13.880 --> 02:17.260]  You may look silly as you talk to the mirror, but you'll sound great when you're speaking
[02:17.260 --> 02:18.260]  in public.
[02:18.260 --> 02:23.640]  Did you know As an aspiring actress, Kathleen Turner perfected her diction by biting down
[02:23.640 --> 02:51.280]  on pencil erasers while practicing speech exercises.
Whisper writes the transcription out in TXT, SRT, and VTT formats.

SRT and VTT formats are suitable for adding time-synchronized closed captioning.
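Whisper writes the SRT file itself, so no conversion is needed in practice. Still, to show how the bracketed console lines in the transcript above relate to SRT cues, here is a minimal sketch. The helper name to_srt is hypothetical, and it assumes every cue line starts with a stamp of the form [MM:SS.mmm --> MM:SS.mmm], as in the output shown earlier.

```shell
# to_srt is a hypothetical helper: it reads Whisper console lines on stdin
# and prints numbered SRT cues on stdout.
to_srt() {
  awk '
    # Convert "MM:SS.mmm" or "HH:MM:SS.mmm" to the SRT form "HH:MM:SS,mmm".
    function srt_time(t,    parts) {
      if (split(t, parts, ":") == 2) t = "00:" t
      sub(/\./, ",", t)
      return t
    }
    # Only lines that start with a "[start --> end]" stamp become cues.
    /^\[[0-9:.]+ --> [0-9:.]+\]/ {
      close_at = index($0, "]")
      stamp = substr($0, 2, close_at - 2)
      text = substr($0, close_at + 1)
      sub(/^[ \t]+/, "", text)
      split(stamp, ends, " --> ")
      printf "%d\n%s --> %s\n%s\n\n", ++cue, srt_time(ends[1]), srt_time(ends[2]), text
    }
  '
}
```

Each cue is a running number, a time range with a comma before the milliseconds, the caption text, and a blank line. VTT is similar but keeps the period in the timestamps and begins with a WEBVTT header line.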

You are reading The Cloud Seminar, a project by Cloud Genius founder @K9LVN.