AI · March 28, 2026 · 6 min read
Real-time Spanish-to-Chinese Subtitle Translator with Whisper
A desktop application that captures system audio, transcribes Spanish speech in real time using Whisper, and translates it to Chinese using GPT-4.
Tags: Whisper, PyQt5, Python, Translation, Spanish
The Problem
Working in Argentina, I found myself in meetings where rapid Spanish was spoken. While my Spanish is improving, technical discussions require precise understanding. I built a real-time subtitle translator to bridge this gap.
System Design
The application captures system audio (via WASAPI on Windows), sends it in short chunks to OpenAI's Whisper API for transcription, and then passes the Spanish text to GPT-4 for translation into Chinese.
Technical Implementation
PyAudio handles audio capture and PyQt5 draws the UI overlay. Transcription and translation go through the OpenAI Python client:
import pyaudio
import numpy as np
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def transcribe_audio(audio_chunk):
    # audio_chunk should be a file object (or a (filename, bytes) tuple)
    # in a supported format such as WAV, not raw PCM samples.
    result = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_chunk,
        language="es",  # hint that the speech is Spanish
    )
    return result.text


def translate_to_chinese(text):
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "system",
            "content": "Translate Spanish to Chinese accurately."
        }, {
            "role": "user",
            "content": text
        }]
    )
    return response.choices[0].message.content
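Audio capture libraries typically deliver raw PCM samples, while the transcription endpoint expects a file in a recognized format. The following is a minimal sketch of one way to bridge that gap with the standard library. The helper name `pcm_to_wav_file`, the file name `chunk.wav`, and the defaults (16 kHz, mono, 16-bit) are assumptions for illustration, not values taken from the application.

```python
import io
import wave


def pcm_to_wav_file(pcm_bytes, sample_rate=16000, channels=1, sample_width=2):
    """Wrap raw PCM samples in a WAV container held in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        wav.writeframes(pcm_bytes)
    buffer.seek(0)  # rewind so the next reader starts at the WAV header
    # Assumption: the upload uses this name to identify the audio format.
    buffer.name = "chunk.wav"
    return buffer
```

The returned object could then be passed as `audio_chunk` to `transcribe_audio`. The sample rate and channel count must match what the capture stream actually produces, otherwise the audio will play back at the wrong speed and transcription quality will suffer.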
UI Design
The PyQt5 overlay sits on top of other windows, showing subtitles in a sleek, semi-transparent bar at the bottom of the screen, similar to YouTube's subtitle display.
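An overlay like this can be approximated with a frameless, always-on-top `QLabel`. The sketch below is not the application's actual UI code: `overlay_geometry` and `create_overlay` are hypothetical helpers, and the bar height, margin, font size, and colors are arbitrary choices.

```python
def overlay_geometry(screen_width, screen_height, height_ratio=0.12, margin=40):
    """Return (x, y, width, height) for a bar along the bottom of the screen."""
    height = int(screen_height * height_ratio)
    width = screen_width - 2 * margin
    return margin, screen_height - height - margin, width, height


def create_overlay(screen_width, screen_height):
    """Build a frameless, always-on-top, semi-transparent subtitle label."""
    # Imported here so the geometry helper works without PyQt5 installed.
    from PyQt5.QtCore import Qt
    from PyQt5.QtWidgets import QLabel

    label = QLabel("")
    # No title bar, stay above other windows, and stay out of the taskbar.
    label.setWindowFlags(Qt.FramelessWindowHint | Qt.WindowStaysOnTopHint | Qt.Tool)
    # Let the desktop show through wherever the stylesheet does not paint.
    label.setAttribute(Qt.WA_TranslucentBackground)
    label.setAlignment(Qt.AlignCenter)
    label.setWordWrap(True)
    label.setStyleSheet(
        "color: white; font-size: 28px; padding: 8px;"
        "background-color: rgba(0, 0, 0, 160); border-radius: 8px;"
    )
    label.setGeometry(*overlay_geometry(screen_width, screen_height))
    return label
```

A `QApplication` must exist before `create_overlay` is called, and the label only appears after `label.show()`. Subtitle updates coming from worker threads should reach the label through a Qt signal rather than a direct `setText` call, since Qt widgets are meant to be touched only from the GUI thread.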
Results
- ~1.5 second latency from speech to Chinese subtitle
- Accurate technical vocabulary translation
- Works with multiple Spanish accents (Argentine, Colombian, Spanish)
Future Improvements
- Offline mode using local Whisper models
- Multi-language support
- Meeting transcription summaries