Closed Captioning your files with CaptionSync allows you to submit your own verbatim transcript. This article shows how to format it correctly.
How do I format and upload my own transcript?
CaptionSync allows you to upload your own formatted transcript. An accurate transcript is the basis from which our automated captioning process generates captions. So please follow these guidelines:
- Check the table of contents below and format your transcript according to the guidelines. At the very least, use properly formatted speaker IDs and parenthetical comments, and save your transcript as a UTF-8 .txt file.
- Note that you need to submit a verbatim transcript of the audio content, not a script or a screenplay. Ensure you also remove all ancillary text such as the title, date, author, header, footer, etc.
- If your content is "sweetened", i.e., it's mixed with music, noise, sound-effects, talk-over, pauses or unclear audio, ensure you use sync markers to improve the results. A complete description of how to use sync markers in the transcript is available in our Sync Marker Summary article.
- We recommend using Notepad, Word or TextEdit to create the transcript file.
- Save your file in plain .txt format. You can also save a .doc file as plain text.
- For English transcripts, we recommend that they contain only standard US ASCII characters.
- If you wish to submit in other languages (Spanish, French or German), choose UTF-8 as the encoding.
- If you wish to submit a transcript containing lyrics of songs, please read our article on including lyrics in the transcript.
- When you have your transcript ready, you can upload it in a Captioning-Only submission.
- If you prefer to not have to format your own transcript, you can request Transcription too and send your existing text as guidance for the transcriber. Our transcribers are trained to generate well-formatted transcripts that yield optimal results with our automated system.
Learn more about accuracy and quality results for various forms of transcription.
A sample of a formatted transcript is attached at the bottom of this article.
TABLE OF CONTENTS:
1.1. Transcribe Verbatim
1.2. Spell Out Words Instead of Using Symbols
1.3. Omit Background Noise/Sounds from the Transcription
1.4. Speaker IDs
1.5. Use Square Brackets for non-spoken Content
1.6. Ensure Square Brackets are Matched and Symmetrical
1.7. Avoid Abbreviations
1.8. Avoid High ASCII Characters
1.9. Extra Spaces and Carriage Returns/Line Feeds Are Unnecessary
1.10. Including URLs in the transcript
2.1. Proper Punctuation Terminates a Sentence
2.2. Double Dash Does Not Terminate a Sentence
2.3. Multiple Chevrons Start a New Sentence
2.4. Ellipses Terminate a Sentence if There is a Space After Them
2.5. Caption Breaks Can be Forced with Markup
3.1. Caption Breaks
3.2. Sync Markers
3.3. Style
3.4. Position
3.5. Escape Sequences
1. General Guidelines
1.1. Transcribe Verbatim:
The words in the audio should be transcribed exactly as the speaker says them, in the same order, with no additions or deletions. Ensure you remove all ancillary text such as the title, date, author, pagination, etc. Transcribers sometimes extract the meaning of the language in the audio by summarizing slightly or by leaving out segments of speech that may not interfere with the meaning. For goals other than automated captioning, this can serve a useful purpose, but for automated captioning, the transcript must match the audio, even for sentence fragments. Furthermore, if the audio is not readily audible, simply transcribe [inaudible].
Example:
Speaker: Alright ladies and gentlemen. If we could get started, please.
Transcription for automated captioning:
Correct: Alright ladies and gentlemen. If we could get started, please.
Incorrect: If we could get started ladies and gentlemen.
Exceptions - What does not need to be transcribed:
- Speaker hesitations and disfluencies, such as “um,” “uh,” and “mmm,” do not need to be transcribed. It is OK to include these in the transcription, but not absolutely necessary.
- If a speaker backs up a bit and repeats a short phrase, it is not absolutely necessary to transcribe this. Again, it is fine to include these in the transcription, but not required.
Example:
Speaker: I have…I have some …um…administrative announcements.
Transcription for automated captioning:
OK: I have some administrative announcements.
1.2. Spell Out Words Instead of Using Symbols:
Special symbols in the text can lead to uncertainty about exactly what was said, making automated captioning more difficult. Also, many special symbols are not included in the standard character set for captioning. Instead, transcribe the exact words of the speaker. For example, if the speaker says something like “backslash”, don’t try to use a backslash symbol in the transcription. Instead, spell it out. This is true for all mathematical and other representational expressions, such as N² (use “N squared”), division signs (use “divided by”), or multiplication signs.
A complete description of how to transcribe Math content is available in our Transcription Guidelines for Math content article.
Exceptions - Digits, such as “6” instead of “six,” or “25” instead of “twenty-five,” are OK.
Example:
Speaker: It is written like this: eight backslash twenty-five.
Transcription for automated captioning:
Correct: It is written like this: Eight backslash twenty-five.
Correct: It is written like this: 8 backslash 25.
Incorrect: It is written like this: 8 \ 25.
1.3. Omit Background Noise/Sounds from the Transcription:
Background sounds should not be transcribed. If the transcriber feels that a sound is necessary for the caption reader's understanding, it should be transcribed in square brackets to set it off.
Example:
Speaker: Let’s pause so you can discuss among yourselves.
Background noise while students discuss among themselves for a few minutes.
Speaker: Ok, let’s compare notes.
Transcription for automated captioning:
Correct: Let's pause so you can discuss among yourselves. Ok, let's compare notes.
Correct: Let's pause so you can discuss among yourselves. [Background discussions] Ok, let's compare notes.
Incorrect: Let's pause so you can discuss among yourselves. Now we hear some background noise. Ok, let's compare notes.
1.4. Speaker IDs:
Speaker intros can be formatted in a number of acceptable ways:
- Standalone multiple chevrons. e.g. >> Hi!
- Multiple chevrons, name colon. e.g. >> Brent: Hi!
- Open square brace, name colon, close square brace. No space before closing brace. e.g. [Brent:] Hi!
Example:
Speaker named Paul: Let’s take a look at this function.
Transcription for automated captioning:
Correct: [Paul:] Let's take a look at this function.
Correct: >> Paul: Let's take a look at this function.
Correct: Let's take a look at this function.
Correct: >> Let's take a look at this function.
Incorrect: Paul: Let's take a look at this function.
Make sure speaker IDs are no longer than 58 characters. A speaker ID cannot span captions, so choose a line length that accommodates unusually long speaker IDs and the first word of the speech: a caption that includes a speaker ID must accommodate the maximum size of the speaker ID (58) plus the first word the speaker says. Don't include an abundance of punctuation or special characters in speaker IDs. E.g., it is correct to write >> Dr. Patrick Smith, Geology Lecturer: , but not >> Dr. *Patrick Smith*, our Geology Lecturer!: .
Example:
The settings on your submission are a Line Length of 32 characters and 2 Lines per Caption.
Speaker ID:
Correct: >> Pat Smith, lecturer: Starting ...
(23 + 9 characters)
Correct: >> Patrick George West Smith, Geology Professor: Starting ...
(48 + 9 characters)
Correct: >> Patrick West Smith: [Volcanic Geology Professor] Starting ...
(22 + 29 + 9 characters)
Incorrect: >> Patrick George West Smith, Volcanic Geology Professor in this College: Starting ...
(73 + 9 characters)
Incorrect: >> *Patrick George West Smith* [our Geology Professor!]: Starting ...
(56 + 9 characters) and special characters
Incorrect: >> Patrick George Allen West Smith: [Volcanic Geology Professor in this College] Starting ...
(35 + 45 + 9 characters)
Incorrect: >> Patrick George Allen West Smith: [ Volcanic Geology Professor in this College ] Starting ...
(35 characters) + additional caption
Please note that our transcribers identify speaker changes just with a double chevron, e.g., >> Speech . If you're making a Captioning and/or Transcription request, and wish to have our transcribers identify speakers by name (e.g. >> Pat: Speech ), you need to make that request in the Guidance for Transcriber field, on the New Submission page.
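The 58-character limit above can be checked before uploading. The following is a minimal Python sketch, not part of CaptionSync: the function names are hypothetical, and it assumes a chevron-style speaker ID is counted from the chevrons through the colon, which is how the character counts in the examples above work out. Any line whose first colon is followed by a space is treated as having a speaker ID, so treat the result as a hint rather than a verdict.

```python
import re

MAX_SPEAKER_ID = 58  # limit stated in the guideline above

# Assumption: the ID runs from the chevrons through the first colon,
# and that colon is followed by whitespace or the end of the line.
SPEAKER_ID = re.compile(r"^(>{2,}\s*[^:\[\]]*:)(?=\s|$)")


def speaker_id_length(line):
    """Return the length of the chevron speaker ID on this line, or None."""
    match = SPEAKER_ID.match(line)
    return len(match.group(1)) if match else None


def check_speaker_id(line):
    """Describe whether the speaker ID on this line fits the limit."""
    length = speaker_id_length(line)
    if length is None:
        return "no named speaker ID"
    if length > MAX_SPEAKER_ID:
        return f"too long ({length} > {MAX_SPEAKER_ID})"
    return f"ok ({length})"


print(check_speaker_id(">> Pat Smith, lecturer: Starting ..."))
```

The sketch only measures the ID itself. Whether the ID plus the first spoken word fits in a caption still depends on the Line Length and Lines per Caption chosen for the submission.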
1.5. Use Square Brackets for non-spoken Content:
Non-spoken content (e.g., music, applause, noise, etc), or any content that is not present in the audio (e.g., credits or Speaker IDs), must be enclosed in square brackets (parenthetical comments). The transcript should contain only what the speaker said, and nothing more. Any other content must be in square brackets.
A complete description of how to use sync markers around parenthetical comments is available in our Sync Marker Summary article.
Note that CaptionSync differentiates between parentheticals with spaces and those without, i.e.:
[ Laughter ] is different than [Laughter]. The former is a standalone descriptive caption, whereas the latter is an inline comment within a caption. So speaker introductions should not have a space before the closing brackets.
Example:
Speaker named Paul throws his chalk then says: Let’s take a look at this function.
Transcription for automated captioning:
Correct: [ Throws chalk ] [Paul:] Let's take a look at this function.
Correct: [ Throws chalk ] Let's take a look at this function.
Correct: Let's take a look at this function.
Incorrect: (Throws chalk) Paul: Let's take a look at this function.
Example:
Children playing, music and multiple speakers: Where's the yellow bike? I left it (inaudible). (Shouts) Did you see it?
Transcription for automated captioning:
^M00:03:36
[ Children playing ]
^M00:03:45
[ Music ]
^M00:04:28
>> Where's the yellow bike?
>> I left it [inaudible].
>> [Shouts] Did you see it?
Example:
Multiple speakers saying the same thing.
Transcription for automated captioning:
>> [All Together] We'll be back! We'll be back!
>> [Al Unísono] Feliz Cumpleaños!
1.6. Ensure Square Brackets are Matched and are Symmetrical:
Ensure that every opening square bracket has a matching closing one and the spacing matches.
Example:
Student question: (inaudible)
Transcription for automated captioning:
Correct: [ Student question: inaudible. ]
Correct: [ Student question: [inaudible.] ]
Incorrect: [ Student question: [inaudible]
Incorrect: [ Laughter]
Incorrect: This will be the last time [inaudible ] happens.
Ensure square brackets are symmetrical.
Example:
^M00:05:35
[ Applause ]
^M00:05:43
>> Paul: Guess I don't need a mic. [Laughter] Let's start with...
Transcription for automated captioning:
Incorrect: ^M00:05:35
[ Applause ]
^M00:05:43
>> Paul: Guess I don't need a mic. [Laughter ] Let's start with...
Incorrect: ^M00:05:35
[Applause ]
^M00:05:43
>> Paul: Guess I don't need a mic. [Laughter] Let's start with...
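The two bracket rules in this section (every bracket matched, spacing symmetrical) lend themselves to an automated pre-check. Below is a minimal Python sketch under stated assumptions: `check_brackets` is a hypothetical helper, not a CaptionSync tool, and it simply skips any bracket preceded by a backslash, on the assumption that it is one of the escape sequences described in section 3.5.

```python
def check_brackets(text):
    """Return a list of problems with square brackets in a transcript."""
    problems = []
    stack = []  # one (position, followed_by_space) entry per open bracket
    for i, ch in enumerate(text):
        if ch not in "[]":
            continue
        if i > 0 and text[i - 1] == "\\":
            continue  # assumed to be an escaped bracket; not checked
        if ch == "[":
            stack.append((i, text[i + 1:i + 2] == " "))
            continue
        if not stack:
            problems.append(f"unmatched ] at {i}")
            continue
        start, spaced_open = stack.pop()
        spaced_close = text[i - 1] == " "
        if spaced_open != spaced_close:
            problems.append(f"asymmetrical spacing in {text[start:i + 1]!r}")
    problems.extend(f"unmatched [ at {start}" for start, _ in stack)
    return problems


print(check_brackets("[ Applause ] >> Paul: Guess I don't need a mic. [Laughter ]"))
```

A stack is used so that nested parentheticals such as [ Student question: [inaudible.] ] are paired correctly.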
1.7. Avoid Abbreviations:
Avoid using abbreviations in the text whenever possible, as they are not always clear to an automated parser. “St.” for example, could mean “saint” or “street”. “No.” could be a statement, or an abbreviation for “number”.
Example:
Speaker: Use a number one pencil.
Transcription for automated captioning:
Correct: Use a number one pencil.
Correct: Use a #1 pencil.
Incorrect: Use a No. one pencil.
1.8. Avoid High ASCII Characters:
Depending on the media type, captions usually use a very restricted character set. Most characters in the so-called "high ASCII" set are not permitted. Characters such as special symbols (e.g. the degree symbol: °) or curly single quotes (e.g. ’) should not be used. Because they are not permitted in the captioning output, the automated system replaces them with a space, which can result in some odd-looking captions. For special symbols (like degrees), type out the name; for single quotes, use the apostrophe symbol. Formatting such as bold, different font types, and bullet points is also not required and can interfere with the automation process. Keep the formatting as simple as possible.
If you are using Microsoft Word, you need to turn off "Smart quotes" to prevent it from automatically substituting curly quotes for apostrophes. To do this, go to Tools -> AutoCorrect Options -> AutoFormat As You Type, and turn off both smart quotes and symbol characters. This keeps those high ASCII characters out of all subsequent typing, but it does not correct what has already been typed!
Example:
Speaker: Mom’s favorite is Nestlé chocolate cooked at 200 degrees.
Transcription for automated captioning:
Correct: Mom's favorite is Nestle chocolate cooked at 200 degrees.
Incorrect: Mom’s favorite is Nestlé chocolate cooked at 200°.
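Characters outside US ASCII that are already in a transcript can be located with a short script. This is a minimal Python sketch; `find_non_ascii` is a hypothetical helper name, and it flags every character above code point 127, which is stricter than necessary if you are deliberately submitting a UTF-8 transcript in another language.

```python
import unicodedata


def find_non_ascii(text):
    """Return (line, column, character, Unicode name) for each non-ASCII character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((line_no, col, ch, unicodedata.name(ch, "UNKNOWN")))
    return hits


# The "Incorrect" example above, written with escapes for the three offending characters.
sample = "Mom\u2019s favorite is Nestl\u00e9 chocolate cooked at 200\u00b0."
for hit in find_non_ascii(sample):
    print(hit)
```

The Unicode name in each result makes it easier to tell a curly quote from a straight apostrophe, which look nearly identical in many fonts.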
1.9. Extra Spaces and Carriage Returns/Line Feeds Are Unnecessary:
The system disregards carriage returns, line feeds, tab characters, and extra spaces. So don’t waste time making it pretty.
Example:
Hello
there
Bob     Smith. Know the
time... or are you
free?
From our system’s perspective this is the same as:
Hello there Bob Smith. Know the time... or are you free?
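To see what a transcript looks like once layout whitespace is disregarded, the text can be collapsed the same way locally. This is a minimal Python sketch of the idea, not the system's actual code; `collapse_whitespace` is a hypothetical name.

```python
def collapse_whitespace(text):
    """Collapse runs of spaces, tabs, and line breaks into single spaces."""
    return " ".join(text.split())


raw = "Hello\n   there\nBob     Smith. Know the\n\ttime... or are you\n\n free?"
print(collapse_whitespace(raw))
# prints: Hello there Bob Smith. Know the time... or are you free?
```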
1.10. Including URLs in the transcript:
You can include URLs in the transcript, even in a shortened format, as long as they're not longer than the line length selected under the Advanced Settings.
Example:
The line length is set to 32 characters.
Transcription for automated captioning:
Correct:
https://
support.automaticsync.com
/hc/en-us/articles/202355665-
Transcription-Guidelines-
for-Captioning#1.10
Incorrect: https://support.automaticsync.com/hc/en-us/articles/202355665-Transcription-Guidelines-for-Captioning#1.10
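A quick way to spot URLs (or any other unbroken strings) that exceed the line length is to measure each whitespace-separated piece of the transcript. The Python sketch below is an illustration only; `overlong_tokens` is a hypothetical helper, and the default of 32 simply mirrors the example above.

```python
def overlong_tokens(text, line_length=32):
    """Return every whitespace-separated token longer than the line length."""
    return [token for token in text.split() if len(token) > line_length]


split_url = """https://
support.automaticsync.com
/hc/en-us/articles/202355665-
Transcription-Guidelines-
for-Captioning#1.10"""
print(overlong_tokens(split_url))  # prints: []
```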
2. Caption Breaks
2.1. Proper Punctuation Terminates a Sentence:
The text is first broken into sentences, and the sentences are then subdivided into captions as needed. Punctuation that terminates a sentence is therefore very important.
Example:
Can you correct your spelling please! Yes, that’s better.
This would be broken into two sentences:
Can you correct your spelling please!
Yes, that’s better.
2.2. Double Dash Does Not Terminate a Sentence:
You can certainly use the double dash. Although it is considered the most favorable place to break when wrapping pop-on captions, it is not interpreted as the end of a sentence.
Example:
What time is the -- Whoa!
This is considered one sentence and would look like:
What time is the -- Whoa!
2.3. Multiple Chevrons Start a New Sentence:
Double chevrons (or any number greater than one) force the start of a new sentence. No terminating punctuation is needed before them.
Example:
Hello Joe, can you tell me where >> Stop right there
This would be broken into two sentences (captions) as follows:
Hello Joe, can you tell me where
>> Stop right there
2.4. Ellipses Terminate a Sentence if There is a Space After Them:
An ellipsis (multiple periods) forces the start of a new sentence only if it is followed by a space.
Example:
Then the sky darkened... ...and it rained
This is broken into two sentences as follows:
Then the sky darkened...
...and it rained
Example:
But...he knew the code.
This will not break the sentence:
But...he knew the code.
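Rules 2.1 through 2.4 can be approximated with a single regular expression, which is handy for previewing where sentence breaks are likely to fall. This is a rough Python sketch and not the CaptionSync implementation: `split_sentences` is a hypothetical name, and the sketch ignores markup, escape sequences, standalone parentheticals, and the later sub-division of sentences into captions.

```python
import re

# Break after terminal punctuation (including an ellipsis) that is followed by
# whitespace, or right before a run of two or more chevrons. A double dash is
# not listed, so it never causes a break.
BREAK = re.compile(r"(?<=[.!?])\s+|(?<!>)\s*(?=>{2,})")


def split_sentences(text):
    """Split transcript text into sentences using rules 2.1 to 2.4."""
    return [part.strip() for part in BREAK.split(text) if part.strip()]


for sentence in split_sentences("Then the sky darkened... ...and it rained"):
    print(sentence)
```

Because the first alternative requires whitespace after the punctuation, an ellipsis written without a following space (as in "But...he") does not trigger a break.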
2.5. Caption Breaks Can be Forced with Markup:
This is detailed next...
3. Markup
The EIA-608 captioning character set is essentially ASCII with a couple of exceptions. For broadcast outputs, like .SCC, .CAP, .ASC, or .XMS, the following characters are not printable:
* \ ^ _ ` { | } ~
All control codes are case insensitive (e.g. ^it or ^IT are interpreted identically).
3.1. Caption Breaks:
AST uses the following markup to force caption breaks:
Markup | Description |
^* | Force end of caption immediately preceding this point |
Example:
This caption needs to^*break back there. Then continues -- with previous rules... As before >>> right? [ cough ]
[Paula:] But, this doesn't...break.
This gets interpreted as:
This caption needs to
break back there.
Then continues -- with previous rules...
As before
>>> right?
[ cough ]
[Paula:] But, this doesn't...break.
3.2. Sync Markers:
AST uses the following markups to communicate timing information. The frame (:ff) is optional. This is particularly useful to isolate intro music or heavy “sweetening”.
Markup | Description |
^Bhh:mm:ss:ff | Begin synchronization at this timestamp. |
^Ehh:mm:ss:ff | End synchronization at this timestamp. |
^Mhh:mm:ss:ff | Arbitrary midstream marker at this timestamp. |
^Fhh:mm:ss:ff | Hard end caption at this timestamp. Example: a ^E00:00:30 marker puts an end marker after the current text at 30 seconds, but that caption is allowed to end normally, i.e., it is subject to all of the caption timing rules about minimum hang, distance from subsequent caption, and caption gapping. A ^F00:00:30 marker, by contrast, puts an end marker after the current text at 30 seconds and forces that caption to end at 00:00:30. |
A complete description of how to use sync markers in the transcript is available in our Sync Marker Summary article.
Notes:
- The frames are optional in timestamps.
- All markup codes are case insensitive.
- You can use the ^B and ^E as many times as you like, but they must be logical.
Example:
^B00:00:01 Hello Walter.
>> What time does this end?
^B00:00:09:26 Never!!
This is invalid since you cannot have two begins in a row.
- ^B and ^M tags refer to time at the beginning of the caption -- they will start a new caption if placed in the middle.
- ^E tags refer to the time at the end of the caption -- it will end the caption where it is placed.
Example:
This text is ignored
^B00:01:02
Robert Smith, correct?
>> What's the number called
^E00:01:04:20
This text will be ignored too
^B00:02:01:20
Great tune! We're working again...
This is valid, but keep in mind that text before the ^B or after the ^E is ignored.
Robert Smith, correct? | This caption starts after 00:01:02:00 |
>> What's the number called | This caption ends after 00:01:04:20 |
Great tune! | This caption starts after 00:02:01:20 |
We're working again... | |
- The timestamps must be increasing!
Example:
^M00:04:01:20 Slow down!
^M00:04:01 ...or else!!
This is invalid because the second timestamp is smaller than the first (00:04:01:00 < 00:04:01:20).
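The two invalid cases shown in these notes (two begins in a row, and a timestamp that goes backwards) can be caught before submission. The Python sketch below is an illustration under stated assumptions, not the validator CaptionSync runs: `check_sync_markers` is a hypothetical name, a missing frame field is read as frame 00, "two begins in a row" is interpreted as a ^B with no ^E since the previous ^B, and equal timestamps are not flagged because the article does not say how they are treated.

```python
import re

# ^B, ^E, ^M or ^F followed by hh:mm:ss and an optional :ff (case insensitive).
MARKER = re.compile(r"\^([BEMF])(\d{2}):(\d{2}):(\d{2})(?::(\d{2}))?", re.IGNORECASE)


def check_sync_markers(text):
    """Return a list of problems found in the sync markers of a transcript."""
    problems = []
    previous = None     # last timestamp seen, as (hh, mm, ss, ff)
    open_begin = False  # True after a ^B until the next ^E
    for m in MARKER.finditer(text):
        kind = m.group(1).upper()
        stamp = tuple(int(g or 0) for g in m.groups()[1:])
        if previous is not None and stamp < previous:
            problems.append(f"^{kind}{m.group(0)[2:]} is earlier than the previous timestamp")
        previous = stamp
        if kind == "B":
            if open_begin:
                problems.append(f"^B{m.group(0)[2:]} follows another ^B with no ^E in between")
            open_begin = True
        elif kind == "E":
            open_begin = False
    return problems


print(check_sync_markers("^M00:04:01:20 Slow down!\n^M00:04:01 ...or else!!"))
```

Comparing the timestamps as (hh, mm, ss, ff) tuples avoids having to know the frame rate of the media.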
3.3. Style:
AST uses the following markup to apply style or print special characters:
Markup | Description |
^IT | Adds Italics to the current style. In effect until reset to Normal |
^UL | Adds Underline to current style. In effect until reset to Normal |
^ST | Adds Bold to current style. In effect until reset to Normal |
^NO | Resets all formatting to Normal |
^MU | Prints the music symbol character |
^P | Forces a paragraph break in the clean transcript, i.e., creates a new paragraph |
Example:
I need the following words in italics. ^ITUsing these markers, this text will be in italics.^NO ^M00:00:21:20 Let's add a marker, then a music symbol ^MU. Don't worry about spaces!
This gets presented as follows:
I need the following words in italics.
Using these markers, this text will be in italics.
Let's add a marker, then a music symbol ♪.
Don't worry about spaces!
Note that only the short forms described in the table above are supported: ^IT, ^NO and ^MU.
Note that for some outputs a suitable replacement for the ♪ symbol will be presented, as not all of them support this symbol.
3.4. Position:
AST uses the following markup to apply positioning to individual captions. Note that results will only be visible in formats that store positioning data:
Markup | Description |
^TO | Caption at top of CEA-608 area (top of the screen). |
^BO | Caption at bottom of CEA-608 area (bottom of the screen). |
^RI | Right justification of the caption. |
^LE | Left justification of the caption. |
^CE | Center justification of the caption. |
3.5. Escape Sequences:
If the captions are not for the EIA-608 character set (e.g. broadcast constrained), the following escape sequences can be used:
Markup | Description |
\\ | Prints the \ |
\^ | Prints the ^, and interprets following characters as spoken words |
\* | Prints the * |
\[ | Prints the [, and does not apply descriptive text processing rules |
\] | Prints the ], and does not apply descriptive text processing rules |
Notes:
- If ^ or * are seen without the backslash they are passed through for webcasts. If they are seen for broadcast, the transcript is rejected.
Example:
This webcast shows the \^MUSIC symbol syntax plus \^\*. \[ and that this text is in the audio! \]
This gets presented as follows:
This webcast shows the ^MUSIC symbol syntax plus ^*.
[ and that this text is in the audio! ]
The following escape sequences can be used in both broadcast and web captions:
Markup | Description |
\. | Do not treat this period as end of sentence |
\? | Do not treat this question mark as end of sentence |
\! | Do not treat this exclamation point as end of sentence |
Example:
This punctuation should be banned\. and\? or limited. Right?
This gets presented as follows:
This punctuation should be banned. and? or limited.
Right?