Otter.ai as a Tool to Practise Conversational Ability and Note Taking
On using voice-to-text as a way to practise conversation and take notes whilst controlling for distraction.

Whilst I love writing and note taking, I’ve never been as smitten with speech as a way to communicate. But because of my difficulties with conversational ability, I’m missing out on an essential way to convey information. So what better approach than killing two birds with one stone: practising conversational ability and speeding up note taking?
I’m always interested in trying out new ways of taking notes and writing. And I’m a sucker for shiny new apps as a way to do that. So today I tried something really interesting, exciting and scary at the same time!
For the first time, I recorded what I wanted to say using the app Otter.ai. Then, with a lot of tweaking for clarity, I published what I had to say here on Medium.
And when I say a lot of tweaking, I mean I pretty much re-wrote everything… but more on that towards the end.
Here is my first foray into voice-to-text transcription and how it came about as a way to become more confident talking whilst building on my notes collection.
Video is a sensory explosion if you are trying to identify ways to improve yourself
This idea of recording myself came about after my first meeting with my coach.
He encouraged me to video myself so I could watch it back and identify areas of speech to improve upon: “I just set up a Teams meeting with myself,” he said. He’s a musician and he does this regularly to listen to himself and improve his technique. I get this. It’s a great idea, but I have this nagging feeling video may be one step too far.
Video is visual. It’s auditory. It’s emotional. It’s a whole plethora of senses in one production type. I have to think about the sound, the lighting, the location and what I’m wearing. Well, I don’t *have* to, but anyone who is half a perfectionist like me needs to, to feel like we’re doing our best.
And the idea of sitting at my desk in “meeting mode” doesn’t reflect how many conversations arise: in the corridor at work, at the supermarket, on the train and at social gatherings. There’s other “stuff” going on in the background. Distractions, if you like.
I therefore needed something else: a much smaller, less overwhelming first step to ease into the trauma of needing to listen to oneself speak and be critical about it, but one that still tries to mimic background distractions.
Voice-to-text to improve my conversational ability whilst taking notes
I’ve also always considered recording voice notes as a way to increase my writing speed, but I never considered I could both record my voice and transcribe what I was saying at the same time.
On discovering the app Otter.ai, which transcribes speech, I saw an opportunity to do this. It was a great “meet half way” option. Don’t get me wrong, it was still a scary idea, recording myself and then listening to it, but at least I could do it in a way that seemed to reduce the mountain of sensory issues associated with video.
And what better time to do it than whilst doing a mundane but gently distracting task: folding a massive pile of washing? I was onto not two birds with one stone, but three!
The experience was weird at first but I got used to it
To be honest, the whole affair seemed a bit stupid: sitting on my bedroom floor talking to myself in a trying-to-be-coherent way whilst folding washing. I did settle into it after about 5 minutes, but the bizarreness didn’t really fade.
I found I spoke slowly, partly to make sure Otter.ai captured all my words, but also because I felt I needed to, to get into the flow and be more structured in what I was saying.
This is the opposite of what happens in a conversation. In a conversation, things are much faster paced: both the speed at which the other person talks and the speed at which I am required to respond. In a group I therefore find it quite hard, because the natural conversational rhythm is very fast. And I just can’t keep up.
I’m hoping that with time, if I talk to myself a lot more, I will be able to begin to talk faster, or be a bit more coherent in the way I talk.
The advantages of using voice-to-text
I identified many advantages to using this method to improve my speech and writing abilities:
- I “wrote” a lot of words very quickly: approximately 1800 in 13 minutes.
- Otter.ai is pretty effective at picking up words (in English) and sending them to different apps (Obsidian, in my case).
- It made the mundane task (of folding washing) seem less mundane
- It helped slow down my rate of speech so I could focus on what I wanted to say
- I could replay the note to hear my speech patterns and behaviours; it appears that I end a lot of my sentences on a higher note. It makes it sound as if I am unsure of what I am saying, as it sounds like I’m asking a question. Now I have to make the conscious choice of lowering my voice at the end of saying something; it feels weird but I clearly need to do it.
- It allowed me to practise my conversational ability
- It would be an effective way to transcribe written notes
- I could listen to myself to get used to the sound of my own voice (I hate the sound of my own voice BTW).
The disadvantages of using voice-to-text
The disadvantages were fewer, but important nonetheless.
- It wasn’t quicker to write a blog post due to the vast amount of editing. Like, I didn’t even edit, I just wrote a different article. In that sense it worked as notes for maybe raising key points, but it was not even close to a finished article.
- It needs a lot of practice before I can use it as a bypass to typing on the keyboard as my main form of writing
- The times at which I will use it are infrequent, i.e. when I’m alone in the house
Trying to get three birds with one stone
I think I started out too ambitious (as I always do). I had a grand idea that as well as improving my conversational and talking abilities, I would end up with a blog post, folded washing and a slightly better ability to structure a conversation.
However, I ended up with folded washing not quite sorted properly, not particularly useful notes and a sense that I maybe wasn’t truly focused on improving my conversational abilities.
Still, I learnt what birds were in the bush, and for a first scoping, what more could I have wished for?
OK, I lied, I tried to get four birds
In the light of my speech not being coherent enough to convert fairly easily into a blog post, out of curiosity (and hope) I fed my “chat note” into ChatGPT.
The aim was to try and get it to re-structure it into something coherent. Whilst it was an interesting exercise, it was slow and didn’t give me what I was hoping for, despite re-wording the prompt about 10 times.
Notion AI did a better job, but I’m not a fan of third person:
“This transcript delves into the speaker’s experience of attempting to hone their conversational aptitude by talking to themselves while folding laundry. By doing so, they emphasize the significance of regular practice, as it can assist them in accessing their thoughts with greater ease, diminishing any fear they may have while engaging in conversations, and providing them with a space to take care of themselves. Ultimately, the speaker draws attention to the importance of practice, even if it may be an arduous or uncomfortable task at times, as it can help them to evolve in their conversational abilities and be more confident in their interactions with others. In addition, they suggest that this practice can help them to become more understanding of their own feelings and emotions, enabling them to develop better relationships with those around them.”
Now I know I write long sentences, but gee whizz, that alone needs some work!
Despite its shortcomings, voice transcription was a great first step to improving my conversational ability. Being able to do it alongside gently distracting tasks helped mimic the distractions faced in typical scenarios when I need to talk to other people.
However, I need to work on structure if I am to use it for note-taking.
As such, this may be the first time I tried voice transcription, but it won’t be the last!