On January 1st, 2009, I founded Cloud Genius with the goal of helping people like you realize their dream of gaining hands-on skills.
Over the past 14+ years, 1,760 people of 15 nationalities from 4 continents have successfully completed our programs to up-skill themselves. People like you have designed, built and deployed production-grade cloud services and accomplished their career goals. This is what makes me super proud.
This year, I plan to use OpenAI Whisper to transcribe all the presentations I have recorded over the 14+ years since starting Cloud Genius.
I plan to post the Whisper transcription results on this website. You will receive my updates as I run the machine learning models.
Whisper is an open source tool from OpenAI. As an easy test, I am going to feed Whisper a video of a native speaker teaching how to enunciate. This should be an easy one for Whisper to transcribe.
So I install yt-dlp to make it easy to fetch videos locally as needed, and install FFmpeg for A/V processing.
sudo apt update && sudo apt install -y yt-dlp ffmpeg
Then, I use my handy script to set up conda.
echo Intel CPU assumed
echo Using my preferred $HOME/miniconda install location
rm -rf $HOME/miniconda ~/miniconda.sh
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh
bash ~/miniconda.sh -b -p $HOME/miniconda
rm -rf ~/miniconda.sh
source $HOME/miniconda/bin/activate
conda init bash
conda config --set auto_activate_base false
conda update -n base -c defaults conda -y
conda update --all -y
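The script above hard-codes a single Miniconda installer. As a rough sketch, the installer could instead be chosen from the output of uname, assuming the standard installer file names published on repo.anaconda.com. The helper name miniconda_url is hypothetical and is not part of my original script.

```shell
# Sketch only: map `uname -s` / `uname -m` values to a Miniconda installer URL.
# miniconda_url is an illustrative helper name, not part of conda.
miniconda_url() {
  # $1 = kernel name (uname -s), $2 = machine type (uname -m)
  local os arch
  case "$1" in
    Linux)  os="Linux" ;;
    Darwin) os="MacOSX" ;;
    *) echo "unsupported OS: $1" >&2; return 1 ;;
  esac
  case "$2" in
    x86_64) arch="x86_64" ;;
    arm64|aarch64)
      # Apple Silicon installers are named arm64; Linux ARM ones aarch64.
      if [ "$os" = "MacOSX" ]; then arch="arm64"; else arch="aarch64"; fi ;;
    *) echo "unsupported CPU: $2" >&2; return 1 ;;
  esac
  echo "https://repo.anaconda.com/miniconda/Miniconda3-latest-${os}-${arch}.sh"
}

# Example:
# wget "$(miniconda_url "$(uname -s)" "$(uname -m)")" -O ~/miniconda.sh
```

With this in place, the same setup script could run on an Intel Linux box or a Mac without editing the download line.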
To get a clean Python environment for Whisper-related work, I set up a new conda env named w with Python 3.9 and activate it. Finally, I install Whisper using pip in that clean conda environment.
conda activate base
conda create --name w python=3.9 -y
conda activate w
pip install git+https://github.com/openai/whisper.git
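Before moving on, it can help to confirm that the three command-line tools installed so far are actually on the PATH. The following is a minimal sketch; check_tools is a hypothetical helper name that I am introducing here for illustration.

```shell
# Sketch only: report any command that cannot be found on the PATH.
# check_tools is an illustrative helper name.
check_tools() {
  local tool missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  # Return 0 when every tool was found, 1 otherwise.
  return "$missing"
}

# Example:
# check_tools yt-dlp ffmpeg whisper && echo "ready"
```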
Now my machine is ready to run Whisper and transcribe whatever I need. I downloaded that example video and asked Whisper to transcribe it. I chose the medium model and told Whisper to assume English as the spoken language. Otherwise, Whisper spends a few seconds detecting the language being spoken. Yes, Whisper supports many languages. See its GitHub page for its complete capabilities.
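The two steps described above might look roughly like the sketch below. The function names fetch_audio and transcribe, the VIDEO_URL variable, and the file name enunciate are placeholders of mine, not the exact values I used. The yt-dlp options (-x, --audio-format, -o) and the Whisper options (--model, --language) are the tools' documented flags.

```shell
# Sketch only: fetch a video's audio track, then transcribe it with Whisper.
# fetch_audio, transcribe, VIDEO_URL and "enunciate" are illustrative names.

fetch_audio() {
  # $1 = video URL, $2 = output base name.
  # -x keeps audio only; --audio-format mp3 converts it using ffmpeg.
  local url="$1" name="$2"
  yt-dlp -x --audio-format mp3 -o "${name}.%(ext)s" "$url"
}

transcribe() {
  # $1 = audio file. Uses the medium model and skips language detection
  # by stating the spoken language up front.
  local audio="$1"
  whisper "$audio" --model medium --language English
}

# Example (supply a real URL first):
# fetch_audio "$VIDEO_URL" enunciate && transcribe enunciate.mp3
```

Whisper prints timestamped segments to the terminal as it works, in the form shown next.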
[00:00.000 --> 00:06.320] How to Enunciate.
[00:06.320 --> 00:10.000] Want to get the attention, the respect, and even the dates you've been missing out on?
[00:10.000 --> 00:11.760] You can start by speaking clearly.
[00:11.760 --> 00:20.160] You will need Mirror Voice recorder Cork or pencil and sense of humor.
[00:20.160 --> 00:21.160] Step 1.
[00:21.160 --> 00:24.960] Stand in front of the mirror and pretend you're having a conversation with a friend.
[00:24.960 --> 00:29.280] It's much easier to identify the places where you slur if you watch yourself speak.
[00:29.280 --> 00:30.280] Step 2.
[00:30.280 --> 00:35.080] Stretch your face as wide as it will go, and then scrunch it up as small as you can.
[00:35.080 --> 00:38.000] Move your jaw from side to side and back and forth.
[00:38.000 --> 00:41.440] Stick your tongue out as far as it will go, and retract it.
[00:41.440 --> 00:43.480] Repeat these steps several times.
[00:43.480 --> 00:47.600] Stretching your face, jaw, and tongue makes it easier to form words clearly.
[00:47.600 --> 00:48.760] Step 3.
[00:48.760 --> 00:53.200] Stand in front of the mirror and repeat vocal exercises that'll help you loosen your tongue,
[00:53.200 --> 00:54.200] lips, and jaw.
[00:54.200 --> 00:59.520] Try to make every sound distinct, emphasizing both consonants and vowels.
[00:59.520 --> 01:01.680] Say and repeat these clearly.
[01:01.680 --> 01:12.280] B-b-b, w-w-w, b-b-b, w-w-w, p-p-p, f-f-f, p-p-p, f-f-f, gutta-butta, gutta-butta.
[01:12.280 --> 01:14.560] Red leather, yellow leather.
[01:14.560 --> 01:15.840] Step 4.
[01:15.840 --> 01:20.180] Repeat tongue twisters slowly and deliberately to yourself in the mirror, and make sure that
[01:20.180 --> 01:22.960] you can hear each separate consonant and syllable.
[01:22.960 --> 01:27.480] Over time, say the phrases faster and faster, making sure you can still hear each part of
[01:27.480 --> 01:29.320] the word clearly.
[01:29.320 --> 01:31.080] Say and repeat these clearly.
[01:31.080 --> 01:34.160] A noisy noise annoys an oyster.
[01:34.160 --> 01:36.080] Lovely lemon liniment.
[01:36.080 --> 01:39.080] Twelve twins twirled twelve twigs.
[01:39.080 --> 01:40.080] Step 5.
[01:40.080 --> 01:43.000] Record yourself reading a paragraph from your favorite book.
[01:43.000 --> 01:48.380] Now gently hold a pencil or the small end of a cork just behind your front teeth.
[01:48.380 --> 01:53.680] Carefully read the paragraph aloud several times, making every letter as clear as possible.
[01:53.680 --> 01:57.680] Remove the cork or pencil and record yourself reading the paragraph again.
[01:57.680 --> 02:00.720] The second recording will be much clearer.
[02:00.720 --> 02:01.840] Step 6.
[02:01.840 --> 02:06.120] Focus on making consonants and syllables as clear as possible, since they provide the
[02:06.120 --> 02:09.080] most structure in words and sentences.
[02:09.080 --> 02:10.400] Step 7.
[02:10.400 --> 02:13.880] Take ten minutes each day to repeat your speech exercises.
[02:13.880 --> 02:17.260] You may look silly as you talk to the mirror, but you'll sound great when you're speaking
[02:17.260 --> 02:18.260] in public.
[02:18.260 --> 02:23.640] Did you know As an aspiring actress, Kathleen Turner perfected her diction by biting down
[02:23.640 --> 02:51.280] on pencil erasers while practicing speech exercises.
SRT and VTT formats are suitable for adding time-synchronized closed captioning.
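To show what an SRT file looks like, here is a small sketch that turns the console-style lines above into numbered SRT cues. The Whisper command-line tool can write .srt and .vtt files itself through its --output_format option, so this is for illustration only. The function name to_srt is hypothetical, and the sketch assumes recordings under an hour, where timestamps have the MM:SS.mmm shape.

```shell
# Sketch only: convert "[MM:SS.mmm --> MM:SS.mmm] text" lines to SRT cues.
# to_srt is an illustrative helper name; lines in other shapes are skipped.
to_srt() {
  awk '
    # Rewrite MM:SS.mmm as 00:MM:SS,mmm (SRT uses a comma before millis).
    function fix(ts,    p) {
      split(ts, p, ":")
      sub(/\./, ",", p[2])
      return sprintf("00:%02d:%s", p[1], p[2])
    }
    match($0, /^\[[0-9]+:[0-9][0-9]\.[0-9][0-9][0-9] --> [0-9]+:[0-9][0-9]\.[0-9][0-9][0-9]\] /) {
      stamp = substr($0, 2, RLENGTH - 3)   # text between "[" and "] "
      text  = substr($0, RLENGTH + 1)      # caption text after the bracket
      split(stamp, t, " --> ")
      printf "%d\n%s --> %s\n%s\n\n", ++n, fix(t[1]), fix(t[2]), text
    }
  '
}

# Example:
# to_srt < transcript.txt > transcript.srt
```

VTT cues are similar, but the file starts with a WEBVTT header line and the timestamps keep a dot before the milliseconds.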
You are reading The Cloud Seminar, a project by Cloud Genius founder @K9LVN.