CaptionSync allows you to provide your own translated transcript on Translation requests. Please read the notes below before proceeding.
- For the best translation results, we recommend having our professional translators create the translated transcript rather than uploading your own.
- You can submit any existing scripts or transcripts as Guidance for the Translator.
- If you prefer to submit your own transcript, the cost is very low, but the format must be exactly right; deviating from the format guidelines will produce poor results or cause the request to fail.
- If you choose to have us generate the translated transcript, we will repair a failed submission or poor results.
- If you choose to provide the translated transcript, we are not able to provide transcript review or debugging.
Important Notes about the format of the transcript:
- The transcript can only contain characters that are representable in ANSI. This means the transcript file needs to be saved as ANSI (also called Western European Windows in some editors).
- The translated transcript must match the original language transcript: the number of segments must be the same. A segment is a section of the transcript that starts with either a speaker ID (>>) or a standalone parenthetical (e.g., [ Comment ]). If you break your transcripts into more segments, you'll get better results.
If the segment counts do not match, your translated transcript will be rejected.
- The translated transcript cannot contain markers (i.e., caption break markers, sync markers and style markers, like ^M00:05:43, ^B00:00:35, or ^E00:34:46) -- this is what we call a Clean Text Transcript (.clean.txt).
- Ensure you don't use syntactical structures that can cause a Translation submission to fail.
- The feature of including inline parenthetical comments in translated outputs has not yet been implemented. One possible workaround is to format the inline comments in the translated transcript using parentheses, e.g., instead of [Música], use (Música).
- The timing for the translated results is based upon the original captions. So ensure you're happy with the native language results before requesting translated captions.
- The success of the synchronization process, especially for long files, will largely depend on the correct formatting of the transcript, and correct use of segments.
- Start from the original language Clean Text Transcript (.clean.txt) - this is a transcript without markers and with paragraph breaks only at the start of each new speaker. If you did not originally request the .clean.txt output for the native language results, use the Redo feature to do so.
- Your translated transcript should be saved as a .txt file, and with ANSI character encoding (also called Western European Windows in some editors).
- For translation requests, speaker IDs need to be in the >> format (e.g., >> Joe: Hi). If you need to change the original transcript, use our Redo feature. Each speaker ID in the original transcript must be matched by an equivalent one in the translated transcript.
- Ensure there is at least one space between the end of a sentence and the next speaker ID (>>).
- Ensure there is at least one space after every sentence terminator (., !, ?, and the ellipsis ...).
- If there is a standalone parenthetical comment (e.g. [ Music ]) in the original transcript, then it must be matched in the translated transcript (e.g. [ Música ]).
- Translated parenthetical comments cannot be longer than your caption length. Choose a line/caption length that accommodates unusually long translated parenthetical comments. A parenthetical comment cannot span captions.
- Translated speaker IDs cannot be longer than 58 characters. Choose a line length that accommodates unusually long translated speaker IDs plus the first word of their speech: a line that contains a speaker ID must fit the speaker ID (up to 58 characters) plus the first word the speaker says. A speaker ID cannot span lines.
- Do not add extra standalone parenthetical descriptions after the speaker ID in the translated transcript. A speaker ID cannot span lines/captions.
- Don't include excessive punctuation or special characters in the speaker IDs. For example:
Correct: >> Dr. Patrick Smith, Geology Lecturer: Hi.
Incorrect: >> Dr. *Patrick Smith*, our Geology Lecturer!: Hi.
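Several of the format rules above can be checked on your own computer before you submit. The following is a minimal Python sketch, not a CaptionSync tool: the function names are hypothetical, and it assumes that "ANSI" means the Windows-1252 code page, that markers always look like a caret, one letter, and a HH:MM:SS time, and that standalone parentheticals are written with spaces inside the brackets, as in [ Music ].

```python
import re

# Assumption: "ANSI" means the Windows-1252 code page (Python codec "cp1252").
ANSI_CODEC = "cp1252"
# Assumption: markers look like a caret, one letter, then HH:MM:SS (e.g. ^M00:05:43).
MARKER_RE = re.compile(r"\^[A-Za-z]\d{2}:\d{2}:\d{2}")
# Assumption: a standalone parenthetical has spaces inside the brackets, "[ Music ]".
STANDALONE_RE = re.compile(r"\[ [^\]]* \]")


def is_ansi(text):
    """Return True if every character can be saved in the assumed ANSI code page."""
    try:
        text.encode(ANSI_CODEC)
    except UnicodeEncodeError:
        return False
    return True


def find_markers(text):
    """Return any caption break, sync, or style markers left in the text."""
    return MARKER_RE.findall(text)


def count_segments(text):
    """Count speaker IDs (>>) plus standalone parentheticals."""
    return text.count(">>") + len(STANDALONE_RE.findall(text))


def speaker_ids_missing_space(text):
    """Count speaker IDs (>>) that directly follow a non-space character."""
    return len(re.findall(r"(?<=\S)>>", text))
```

A typical use is to read the original .clean.txt and your translated .txt, then confirm that count_segments returns the same number for both, that find_markers returns an empty list for the translation, and that is_ansi returns True. Passing these checks does not guarantee that a submission will succeed; it only catches the most common format mistakes.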
For example, suppose the settings for your original submission are a Line Length of 32 characters and 2 Lines per Caption.
Original speaker ID:
>> Patrick West Smith, Geology Professor: Starting ...
(41 + 9 characters)
>> Patrick West Smith: [Geology Professor] Starting ...
(22 + 20 + 9 characters)
Translated Transcript for automated captioning:
Correct: >> Patrick Smith: Começando ...
(17 + 10 characters)
Correct: >> Patrick West Smith: [Professor de Geologia] Começando ...
(22 + 24 + 10 characters)
Correct: >> Patrick West Smith: [Professor Adjunto de Geologia] Começando ...
(22 + 32 + 10 characters)
Incorrect: >> Patrick George West Smith, Professor de Geologia: Começando ...
(52 + 10 characters)
Incorrect: >> Patrick George West Smith, Professor Adjunto de Geologia nesta Universidade: Começando ...
(79 + 10 characters)
Incorrect: >> Patrick West Smith [Professor Adjunto de Geologia]: Começando ...
(54 + 10 characters)
Incorrect: >> Patrick George West Allen Smith: [Professor Adjunto de Geologia nesta Universidade] Começando ...
(35 + 51 + 10 characters)
Incorrect: >> Patrick West Smith: [ Professor Adjunto de Geologia ] Começando ...
(22 characters) + additional caption
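The character counts in the example above can be reproduced with a short script. This is a sketch under stated assumptions, not our actual validation: check_speaker_line is a hypothetical name, a speaker ID is taken to run from >> through the first colon, an inline comment such as [Professor de Geologia] is skipped when looking for the first spoken word, and a bracket written with inner spaces is treated as a standalone parenthetical.

```python
import re

MAX_SPEAKER_ID = 58  # limit stated in the notes above


def check_speaker_line(segment, line_length):
    """Return a list of problems found in one '>> Name: text' segment."""
    # Assumption: the speaker ID runs from '>>' through the first colon.
    match = re.match(r">>[^:]*:", segment)
    if match is None:
        return ["no speaker ID in the >> format"]
    speaker_id = match.group(0)
    rest = segment[match.end():].strip()
    problems = []

    if len(speaker_id) > MAX_SPEAKER_ID:
        problems.append("speaker ID longer than 58 characters")

    # Assumption: '[ ... ]' with inner spaces is a standalone parenthetical.
    if re.match(r"\[ [^\]]* \]", rest):
        problems.append("standalone parenthetical after speaker ID")

    # Assumption: a bracketed comment is skipped to find the first spoken word.
    spoken = re.sub(r"^\[[^\]]*\]\s*", "", rest)
    words = spoken.split()
    first_word = words[0] if words else ""
    # Speaker ID, one space, then the first word must fit on a single line.
    needed = len(speaker_id) + (1 + len(first_word) if first_word else 0)
    if needed > line_length:
        problems.append("speaker ID plus first word exceeds line length")

    return problems
```

With a Line Length of 32, check_speaker_line(">> Patrick Smith: Começando ...", 32) returns an empty list, while the variant with the speaker ID ">> Patrick George West Smith, Professor de Geologia:" is flagged because the ID and the first word do not fit on one line.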
- Attached are 3 samples: one original transcript in English, one translated transcript in Spanish, and one translated transcript in Portuguese.
- Since the text in the translated transcript doesn't appear in the media file, we use the original language timing to synchronize the new results. The translated results will not have the same number of words, or even sentences -- so the timing cannot be mapped word for word and is scaled over each "segment".
- This feature generates approximate results: the timing for translation results is generally good, but it is an approximation because the translated text doesn't occur in the media file. If you wish to improve the timing of the translated captions, learn how in our Editing and Repairing your Translated Results article.
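To illustrate why the timing is approximate, here is one simple way a segment's timing could be scaled over translated text. This is only an illustrative sketch and not a description of how CaptionSync computes timing: scale_timing is a hypothetical function, and spreading the duration in proportion to character counts is an assumption made for the example.

```python
def scale_timing(segment_start, segment_end, translated_words):
    """Spread a segment's duration over translated words by character share.

    Returns a list of (word, start_time) pairs. Times are in seconds.
    """
    total_chars = sum(len(word) for word in translated_words)
    if total_chars == 0:
        return []
    duration = segment_end - segment_start
    timings = []
    elapsed_chars = 0
    for word in translated_words:
        # Each word starts after the share of time used by the words before it.
        start = segment_start + duration * elapsed_chars / total_chars
        timings.append((word, start))
        elapsed_chars += len(word)
    return timings
```

Because only the segment boundaries come from the original captions, every time inside a segment is an estimate. Shorter segments leave less room for drift, which is consistent with the advice to break transcripts into more segments.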