Driven by OpenAI, the so-called artificial “intelligence” revolution™ is changing many fields. Image creation, text generation, photo search and even audio transcription: neural models both open up new avenues in computing and bring spectacular improvements to existing ones.
For this article, I focused on a well-established area: audio transcription. I had already seen what the AI can do by trying MacWhisper, a new Mac utility that relies on OpenAI's Whisper engine for this task. The results are stunning, especially on difficult examples and especially with the most advanced version of the model that OpenAI provides.
MacWhisper uses OpenAI to transcribe audio locally on your Mac
Whisper scores points, but is it really better than the established players? To answer this question, we devised an AI dictation: I recorded myself reading a text, then played the recording to the day's three candidates so that they could transcribe it. Apple was there of course, Google too, and the newcomer, OpenAI. Each time, I switched on transcription mode and let the app do its job, then retrieved the result without touching it, and finally corrected it by comparing it with the original text.
As with any dictation, a text was needed. We could have chosen a classic of French literature, but that would not have been much fun. Since we are testing these famous artificial intelligences, we might as well go all the way: we asked ChatGPT a question, and its answer serves as the basis. This (perfectly objective) question has consumed a lot of bytes on MacGeneration's servers over the past 24 years: “Why is the Mac better than the PC?”. The question, followed by the answer proposed by ChatGPT, forms the 126-word paragraph that I recorded and that was used for this AI dictation:
Why is the Mac better than the PC? There is no objective answer to the question of whether a Mac is better than a PC, because it depends on personal preferences and what you want to use it for. However, some users prefer Macs for their ease of use, sleek design, and integration with other Apple products such as iPhone and iPad. Macs are also known for their reliability and security, and generally have fewer software compatibility issues. Others prefer PCs for their flexibility, lower price, and better compatibility with professional software. It is therefore important to weigh the advantages and disadvantages of each platform before making a decision.
If you want to listen to my sweet voice, I have uploaded the audio recording to YouTube. You can judge my not always perfect pronunciation for yourself, and in particular spot the small slip I made on “what you want”. It caused a lot of mistakes, as you will see, but the blame lies mainly with the teacher who misread his text. In another departure from the classic format, I did not dictate the punctuation marks, leaving it to each candidate to work them out.
With the framework set, let's mark the papers without further delay, starting with Apple's.
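The correction step described above, comparing each untouched transcript with the original text, was done by hand for this article. As a rough illustration only, it could be approximated with a word-level diff. The sketch below uses Python's standard `difflib`; the function names `words` and `count_word_errors` are hypothetical, and the code assumes that case and punctuation are ignored, whereas the dictation graded here does take punctuation into account.

```python
import difflib
import re


def words(text: str) -> list[str]:
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[\w']+", text.lower())


def count_word_errors(reference: str, transcript: str) -> int:
    """Count the words that differ between the reference and a transcript.

    Assumption: each substituted, missing or extra word counts as one error.
    This is an illustrative metric, not the grading scale used in the article.
    """
    ref, hyp = words(reference), words(transcript)
    matcher = difflib.SequenceMatcher(a=ref, b=hyp, autojunk=False)
    errors = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # A changed span costs as many errors as its longer side.
            errors += max(i2 - i1, j2 - j1)
    return errors


if __name__ == "__main__":
    original = "Why is the Mac better than the PC?"
    heard = "Why is the Mac better then the PC"
    print(count_word_errors(original, heard))
```

A count like this only flags which words differ; deciding how severely to penalise each mistake remains a human judgment.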
Apple • 2/10 • way too distracted
For Apple, the transcription was done with iOS dictation in the Notes app, a feature that became more capable with iOS 16. Since the latest major update to the iPhone and iPad operating system, dictation can handle punctuation automatically, which is of particular interest here, since punctuation is one of the criteria evaluated in the final result. Unfortunately, my first tests turned out to be very disappointing.