PhoTranscriptor - A Free Vietnamese Speech-to-Text Tool for Researchers
Lately, I've been buried under hours of audio recordings—each file stretching close to or beyond an hour. Naturally, I went on a hunt for a reliable Vietnamese speech-to-text tool to make my life easier.
That's when I stumbled upon PhoWhisper (yes, it's pronounced "Phở" 😆). It's a fine-tuned version of the original Whisper model by OpenAI, trained on 844 hours of Vietnamese audio covering a wide range of accents. The result? Excellent transcription quality for Vietnamese speech.
But here's the catch:
For non-coders, setting it up and running the model locally can be quite a hassle. Plus, if your computer doesn't have a GPU, the transcription process can be painfully slow.
For context, transcribing a 40-minute audio clip on my GPU-less laptop used to take 4 hours. On Colab, it's down to just 10 minutes!
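If you like numbers, here is a tiny sketch of what that difference means as a "real-time factor" (processing time divided by audio length). The figures are the ones from my example above; the function and variable names are just for illustration.

```python
def real_time_factor(processing_minutes: float, audio_minutes: float) -> float:
    """Processing time divided by audio duration (lower means faster)."""
    if audio_minutes <= 0:
        raise ValueError("audio_minutes must be positive")
    return processing_minutes / audio_minutes


audio = 40           # minutes of audio in the example
cpu_time = 4 * 60    # 4 hours on the GPU-less laptop, in minutes
colab_time = 10      # minutes on Colab

cpu_rtf = real_time_factor(cpu_time, audio)      # 6.0
colab_rtf = real_time_factor(colab_time, audio)  # 0.25
speedup = cpu_time / colab_time                  # 24.0

print(f"Laptop: {cpu_rtf:.2f}x real time")
print(f"Colab:  {colab_rtf:.2f}x real time")
print(f"Speedup: {speedup:.0f}x")
```

In other words, the laptop needed six minutes of computing per minute of audio, while Colab needed about fifteen seconds.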

Why you'll love this tool:
- Free to use
- High-quality Vietnamese speech recognition (approx. 10% error rate)
- Handles long and large audio files
- Secure and private: your audio files are automatically deleted after the session ends
- Fast processing with Google Colab (limited by free-tier quotas)
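A quick note on the "approx. 10% error rate" above. I'm assuming here that it means word error rate (WER), the usual metric for speech recognition: the number of word substitutions, insertions, and deletions divided by the number of words in the correct transcript. The sketch below is a minimal way to compute it yourself on a short sample; the two sentences are made-up examples, not output from the tool.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # a reference word was dropped
                curr[j - 1] + 1,     # an extra word was inserted
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: 10 whitespace-separated words, one of them misrecognized.
reference = "tôi đang nghe bản ghi âm này rất rõ ràng"
hypothesis = "tôi đang nghe bản ghi âm này rất rõ rằng"
print(word_error_rate(reference, hypothesis))  # 0.1
```

One wrong word out of ten gives 0.1, which is what a 10% rate looks like in practice: roughly one word in ten will need fixing.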
🛠️ How to Use PhoTranscriptor (Google Colab Version)
Step 1: Open the notebook
- Go to this link:
👉 PhoWhisper on Google Colab
Step 2: Make a copy to your Google Drive
- Click File → Save a copy in Drive
This allows you to run the notebook on your own Google account.
Step 3: Run the setup cells
- Press the Play button (▶️) next to each code cell, starting from the top.
- Wait for the environment to finish setting up (it will install necessary libraries and load the model).
- When complete, a public URL (usually starting with https://) will appear.
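For the curious, here is a rough sketch of what setup cells like these could be doing. This is not the notebook's actual code. I'm assuming the Hugging Face transformers pipeline for the model, Gradio for the web interface and its public link, and the "small" PhoWhisper checkpoint; the real notebook may use a different model size or a different UI library.

```python
MODEL_ID = "vinai/PhoWhisper-small"  # assumption: the notebook may use another size


def pipeline_kwargs(has_gpu: bool) -> dict:
    """Arguments for a Hugging Face speech-recognition pipeline."""
    return {
        "task": "automatic-speech-recognition",
        "model": MODEL_ID,
        "chunk_length_s": 30,             # split long audio into 30-second windows
        "device": 0 if has_gpu else -1,   # first GPU if available, otherwise CPU
    }


def launch_app():
    """Load the model and start a web interface with a public link."""
    # Heavy imports live inside the function, so the helper above can be
    # read and tried without installing anything.
    import torch
    import gradio as gr
    from transformers import pipeline

    asr = pipeline(**pipeline_kwargs(torch.cuda.is_available()))

    def transcribe(audio_path: str) -> str:
        # The pipeline returns a dict; the transcript is under "text".
        return asr(audio_path)["text"]

    demo = gr.Interface(
        fn=transcribe,
        inputs=gr.Audio(type="filepath"),
        outputs="text",
    )
    demo.launch(share=True)  # share=True prints a temporary public https:// URL


# In a Colab cell, after installing torch, transformers, and gradio:
# launch_app()
```

With a setup like this, the public URL appears only after the model has finished downloading and loading, which is why the first run takes a while.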
Step 4: Click the public URL to open the interface
- This opens the drag-and-drop web interface in a new tab.
- You can now interact with the transcription tool visually.
Step 5: Upload your audio file
- In the interface, drag and drop your audio file (.mp3, .wav, or another audio format).
- The file is stored temporarily and will be deleted once the browser session ends.
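If you're wondering what "stored temporarily" can look like in code, here is a general pattern, not the notebook's actual implementation: copy the upload into a private temporary folder, work on the copy, and remove the folder afterwards no matter what happens. The helper name `temporary_copy` is hypothetical.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_copy(upload_path: str):
    """Copy an upload into a private temp folder and delete it afterwards."""
    workdir = tempfile.mkdtemp(prefix="transcribe_")
    try:
        local_path = os.path.join(workdir, os.path.basename(upload_path))
        shutil.copy(upload_path, local_path)
        yield local_path
    finally:
        # Runs even if transcription fails, so no copy is left behind.
        shutil.rmtree(workdir, ignore_errors=True)
```

Usage would look like `with temporary_copy("interview.mp3") as path: ...`, with the transcription happening inside the `with` block.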
Step 6: Start the transcription
- Click the Transcribe button in the interface.
- You'll see real-time progress of the transcription.
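Whisper-style models work on 30-second windows of audio, so a long recording is typically processed piece by piece, and that is also what makes a progress display possible. The sketch below only plans the pieces and prints progress; it is a simplified illustration (real pipelines usually overlap neighboring windows slightly), and `plan_chunks` is a hypothetical helper, not part of the tool.

```python
def plan_chunks(duration_s: float, chunk_s: float = 30.0):
    """Return (start, end) times in seconds that cover the whole recording."""
    if duration_s <= 0 or chunk_s <= 0:
        raise ValueError("duration_s and chunk_s must be positive")
    chunks = []
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)  # the last chunk may be shorter
        chunks.append((start, end))
        start = end
    return chunks


chunks = plan_chunks(40 * 60)  # a 40-minute recording
for index, (start, end) in enumerate(chunks, start=1):
    # A real tool would transcribe the audio between start and end here.
    if index % 20 == 0:
        percent = 100 * index // len(chunks)
        print(f"{index}/{len(chunks)} chunks ({percent}%)")
```

For a 40-minute file this gives 80 windows, so each finished window moves the progress bar by a little over 1%.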
Step 7 (Optional): Copy and edit the transcript
- Once finished, the transcript will appear on the screen.
- Copy it into your editor, then listen back to the audio if you'd like to correct any errors.
Try it out:
📺 Step-by-step video tutorial:
Watch Video Tutorial
Feel free to save a copy of the Colab notebook to your own Google Drive to get started. The video walks through all of these steps as well.
Happy transcribing and good luck with your research!