Editor Flow for AI Translation Dubbing

When I joined Supernative, the team had built a powerful backend for AI dubbing, but the user experience was still in its infancy. My role was to design an editor flow from scratch, balancing advanced functionality with user-centric simplicity.

The problem

Supernative enables ultra-realistic AI dubbing that retains a speaker's voice across languages. However, we faced two significant challenges.

Editing was costly: Every AI translation incurred a fee, so we needed users to finalize edits before dubbing. This ensured an economical business model for Supernative and fair pricing for users.
Waiting times: Larger files took time to upload and begin making edits.

Users were accustomed to seeing the video they were editing and watching changes update in real time. However, due to translation costs, this wasn’t feasible for Supernative. As a result, we focused on creating an intuitive, user-friendly editing flow and ensuring users clearly understood their progress in the process.

Understanding the users

Our primary users were content creators who were 18–34 years old, that wanted to share their content with a global audience. Through user interviews, we learned their priorities.

Control over translations: Creators needed the ability to stop certain words from translating (names, brands, and industry terms).
Efficiency in workflows: Reducing waiting times and eliminating redundant tasks was crucial.

My role

As the sole designer, I led the end-to-end design process, from research to prototyping, working closely with developers and the co-founder. Beyond the product, I also crafted the brand identity and logo to align with the product’s innovative spirit.

Research and ideation

I began by meeting with the co-founder and developer to understand the technical goals and constraints, such as upload times and third-party fees. Competitor analysis revealed gaps we could address, and user interviews with content creators helped uncover pain points and workflows.

Our goal

Minimize costs: Finalize edits before dubbing to avoid unnecessary translation fees.
Reduce waiting times: Make upload and editing processes feel fast and seamless.
Empower creators: Supernative’s transcription and speaker detection is incredibly accurate, but users needed the ability to correct occasional errors.

These findings informed user personas that reflected creators' needs, goals, and behaviors.

Design process

Uploading content

Uploading videos could take a while, which delayed certain functionality like viewing and editing the transcript or video. However, we found there were actions the user could take while the video was still uploading. We saw this as an opportunity to jump users straight into the product and improve the perceived wait time.

Selecting output language

As soon as a user uploads a file, the user is taken to the first step in the upload flow where they can select which language they would like their video dubbed in.

Naming speakers

For the next step, we asked users to name all of the speakers featured in the video. Not only did this remove waiting times, but it actually helped the AI work more accurately by knowing the exact number of different speakers.

Users were able to complete these steps while their video was uploading in the background, reducing perceived wait times.

‍Editor dashboard

With the video uploaded, now users could access the transcript and make final edits before dubbing. I designed a dashboard where users could quickly access all of our editor tools.

Transcript editing

We gave users the ability to add, delete, or change words. Audio automatically adjusts to match the timing with the speaker's voice seamlessly.

Word preservation

To ensure users had full control of preventing certain words and brand names from translating, we added the ability to lock words directly in the transcript. Highlighting a word would trigger a context menu where users could lock words.

Labeling speakers

Since we already had users input their speakers names, now they could label the speakers in the transcript. The AI already splits the transcript up every time the speaker changes. So users could simply label a section of the transcript once, and that label would carry across the rest of the transcript for matching voices.

Managing speakers

Supernative AI auto-detected speakers, but users could manually split speakers if needed.

Lipsync toggle

Users could enable or disable lipsync for diverse content needs.

User feedback

I tested prototypes with real users to identify friction points and gather feedback.

Users enjoyed having a dashboard for all of the editor tools, but were struggling to navigate certain sections of the transcript. To improve this experience we added real time transcription that highlighted each word in real time. Allowing users to select any word and have the video jump to the corresponding section (or visa versa).

Results

Faster workflows: By enabling pre-dub edits during upload, users perceived uploads as faster and could immediately start editing.
Cost efficiency: By finalizing edits before dubbing, we were able to avoid unnecessary fees.
User satisfaction: Creators loved the editor’s real-time transcription, especially for long transcripts and non-native languages.
‍

Takeaways

This project taught me the importance of designing with empathy for both user needs and business constraints. By involving users throughout the process, we turned complex AI technology into a user interface that was intuitive and user friendly. This project taught me to trust the process and user input. While revisions can feel like a step backwards at times, that process has led to some of my proudest designs. By trusting user feedback and testing assumptions, you can discover new opportunities and take the project to levels that felt unreachable.