PhoTranscriptor - A Free Vietnamese Speech-to-Text Tool for Researchers

Lately, I've been buried under hours of audio recordings, with each file running close to an hour or longer. Naturally, I went hunting for a reliable Vietnamese speech-to-text tool to make my life easier.

That's when I stumbled upon PhoWhisper (yes, it's pronounced "Phở" 😆). It's OpenAI's original Whisper model, fine-tuned on 844 hours of Vietnamese audio covering a wide range of accents. The result? Excellent transcription quality for Vietnamese speech.

But here's the catch:
For non-coders, setting it up and running the model locally can be quite a hassle. Plus, if your computer doesn't have a GPU, the transcription process can be painfully slow.
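To give a sense of what "running the model locally" involves, here's a minimal sketch using the Hugging Face transformers pipeline. Treat it as an illustration under assumptions, not as the code inside the PhoTranscriptor notebook: the model ID `vinai/PhoWhisper-small`, the 30-second chunk length, and the helper names `pick_device` and `transcribe` are my own picks. It also assumes `transformers`, `torch`, and `ffmpeg` are installed.

```python
def pick_device():
    """Return "cuda:0" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # No PyTorch installed: fall back to CPU so the helper still answers.
        return "cpu"
    return "cuda:0" if torch.cuda.is_available() else "cpu"


def transcribe(audio_path, model_id="vinai/PhoWhisper-small", chunk_length_s=30):
    """Transcribe one audio file and return the text.

    model_id and chunk_length_s are illustrative defaults (assumptions),
    not settings taken from the PhoTranscriptor notebook.
    """
    # Imported here so the module loads even without transformers installed.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        device=pick_device(),
        # Long recordings are split into windows of this many seconds.
        chunk_length_s=chunk_length_s,
    )
    return asr(audio_path)["text"]
```

Calling `transcribe("interview.mp3")` (a hypothetical file name) downloads the model on first use and returns the transcript as a string. Without a GPU, `pick_device()` returns `"cpu"`, which is exactly the slow path described above.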

For context, transcribing a 40-minute audio clip took 4 hours on my GPU-less laptop. On Colab, it's down to just 10 minutes!
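If you want to turn those numbers into a rough estimate for your own recordings, the arithmetic is simple. The sketch below assumes processing time scales linearly with audio length, which is my simplification; real timings depend on the model size and the hardware Colab gives you. The function names are made up for illustration.

```python
def real_time_factor(processing_min, audio_min):
    """Minutes of processing needed per minute of audio."""
    return processing_min / audio_min


def estimate_minutes(audio_min, rtf):
    """Estimated processing time, assuming time scales linearly with length."""
    return audio_min * rtf


audio_min = 40        # length of the clip from the example above
laptop_min = 4 * 60   # 4 hours on the GPU-less laptop
colab_min = 10        # 10 minutes on Colab

laptop_rtf = real_time_factor(laptop_min, audio_min)  # 6.0
colab_rtf = real_time_factor(colab_min, audio_min)    # 0.25
speedup = laptop_min / colab_min                      # 24.0

print(f"Laptop: {laptop_rtf}x real time, Colab: {colab_rtf}x real time")
print(f"Speedup on Colab: {speedup}x")
print(f"Estimated time for a 60-minute file on Colab: "
      f"{estimate_minutes(60, colab_rtf)} minutes")
```

So under that linear assumption, an hour-long recording would take roughly 6 hours on the laptop versus about 15 minutes on Colab.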

[Screenshot: PhoWhisper interface]

Why you'll love this tool:

🛠️ How to Use PhoTranscriptor (Google Colab Version)

Step 1: Open the notebook

Step 2: Make a copy to your Google Drive

Step 3: Run the setup cells

Step 4: Click the public URL to open the interface

Step 5: Upload your audio file

Step 6: Start the transcription

Step 7 (Optional): Copy and edit the transcript
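For anyone wondering how a notebook can hand you a public URL with an upload box, here's a rough sketch of one common way to wire it up. I'm assuming Gradio here purely for illustration; the steps above don't say which library the notebook uses. The names `transcribe_file` and `build_app` are hypothetical, and `asr` stands for any callable that takes a file path and returns a dict with a `"text"` key (which is what a transformers speech recognition pipeline returns).

```python
def transcribe_file(audio_path, asr):
    """Callback for the upload box: file path in, transcript text out."""
    if audio_path is None:
        # Nothing was uploaded before the button was pressed.
        return "Please upload an audio file first."
    return asr(audio_path)["text"]


def build_app(asr):
    """Build a simple upload-and-transcribe web interface (assumes Gradio)."""
    # Imported here so the callback above can be used without Gradio.
    import gradio as gr

    return gr.Interface(
        fn=lambda path: transcribe_file(path, asr),
        inputs=gr.Audio(type="filepath", label="Audio file"),
        outputs=gr.Textbox(label="Transcript", lines=12),
        title="Vietnamese speech-to-text",
    )


# In a Colab cell, something like this prints a public link to click:
# build_app(asr).launch(share=True)
```

The `share=True` option is what makes Gradio print a temporary public link, which matches the "click the public URL" step. The output is a plain text box so the transcript can be copied and edited afterwards.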

Try it out:

📺 Step-by-step video tutorial:
Watch Video Tutorial

Feel free to make a copy of the Colab notebook to your own Google Drive to get started. I've included all the instructions in the video as well.

Happy transcribing and good luck with your research!