Hey folks,
I need to create an Open Captioned DCP from an .mov file along with an .srt subtitle file. Opening the two in VLC player works as it should, but the problem comes when trying to create the DCP. All of the dialogue in the film is written as ">> dialogue" and plays as such in VLC, but in the DOM preview window the >> signs are displaying as "&gt;&gt;". I'm going to spend the next few hours converting and see if the problem in the preview window also exists on the actual DCP captions. If so, does anyone on here know if there is a way to convince DOM to display the captions as intended and not attempt to convert them to HTML?
captions displaying > correctly
-
- Site Admin
- Posts: 2548
- Joined: Thu Nov 14, 2013 2:53 pm
Re: captions displaying > correctly
Hi
If there's a problem in the preview I'm afraid it'll almost certainly also be there in the DCP.
Can you send the .srt file over to carl@dcpomatic.com so I can take a look?
-
- Posts: 2804
- Joined: Tue Apr 15, 2014 9:11 pm
- Location: Germany
Re: captions displaying > correctly
Must be a charset/encoding issue. Can probably be solved by converting the SRT file with 'some' editor and choosing a different encoding. Which version of DCP-o-matic are you using?
Hmm... seems to be a bug in DCP-o-matic - no matter what encoding I use, > is shown as &gt;
It also arrives in the captions XML file that way: <Text VAlign="top" VPosition="86.9545">&gt;&gt; dialogue</Text>
(using DCP-o-matic 2.15.152) So probably Carl needs to fix it.
One reason probably is that, as you see, in the resulting DCP captions XML file, > and < are also used to delimit the 'Text' tags. The code that writes the file has to handle both these formal delimiters and any > that appears in the content.
So the reason is the XML for the captions in DCPs. VLC will probably use the SRT directly and thus has no problems.
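To illustrate the point about > inside XML content, here is a minimal Python sketch. It is not DCP-o-matic's code, only an assumption of how any XML writer and reader would treat the caption: the writer escapes the text once, and the reader turns the entity back into the plain character.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The caption text exactly as it appears in the .srt file.
caption = ">> dialogue"

# Writer side: escape the characters that XML reserves for markup.
escaped = escape(caption)

# The Text element as it would be stored in the captions XML file
# (attribute values copied from the example in this post).
xml_line = f'<Text VAlign="top" VPosition="86.9545">{escaped}</Text>'

# Reader side: a conforming XML parser undoes the escaping.
parsed = ET.fromstring(xml_line).text

print(xml_line)
print(parsed)
```

So seeing &gt; in the raw XML is expected; seeing it on screen means something between the file and the renderer did not decode it, or it was escaped more than once.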
- Carsten
Last edited by Carsten on Sat Jul 31, 2021 10:19 am, edited 1 time in total.
-
- Posts: 2804
- Joined: Tue Apr 15, 2014 9:11 pm
- Location: Germany
Re: captions displaying > correctly
Okay, I seem to be wrong, again. Maybe it IS a bug in DCP-o-matic, but only in the display/preview rendering part. Seems that the DCP you created is formally correct. That's a 'lighter' bug then:
From the Cinecanvas spec (and I am sure it is the same in SMPTE subtitles):
---
1.4 Predefined Entities
Since XML uses the ‘<’, ‘>’ and ‘&’ characters for special purposes, their use as content must be escaped.
1.5 Elements
Similarly, any Unicode character can be specified by using its decimal code-point preceded by “&#” and terminated with “;”. For example, “&#65;” represents the character ‘A’.
Unicode characters can also be specified using hexadecimal notation by preceding its code-point value with “&#x”. For example, “&#x41;” represents the character ‘A’.
---
The escape character sequence &gt; is what you see in preview.
So, I assume that DCP-o-matic and DCP-o-matic player just don't deal correctly with these escaped characters when displaying captions. The DCPs created, though, appear to be 100% correct. I can try one of these on our cinema projector later. The question remains whether it is safe to use these special chars in DCP captions. Nowadays, different software is used in cinema projection equipment to render captions, and it is possible that some equipment fails on them. I can test a few, but not all.
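The entity rules quoted from the spec are plain XML behaviour, so they can be checked with any XML parser. A small sketch using Python's standard library (the bare <Text> elements are made up for the test, not taken from a real DCP):

```python
import xml.etree.ElementTree as ET

# Decimal character reference: code point 65 is 'A'.
decimal_ref = ET.fromstring("<Text>&#65;</Text>").text

# Hexadecimal character reference: 0x41 is the same code point.
hex_ref = ET.fromstring("<Text>&#x41;</Text>").text

# Predefined entity for the greater-than sign.
named_ref = ET.fromstring("<Text>&gt;</Text>").text

print(decimal_ref, hex_ref, named_ref)
```

This only shows what a conforming parser does; it says nothing about how a given cinema server's caption renderer behaves.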
- Carsten
-
- Posts: 15
- Joined: Fri Jun 04, 2021 12:48 am
Re: captions displaying > correctly
I've been offsite since creating the feature DCP. I'll be back onsite to ingest and test later tonight. I'll update with the results.
-
- Posts: 15
- Joined: Fri Jun 04, 2021 12:48 am
Re: captions displaying > correctly
Unfortunately it looks like the >> signs are displaying incorrectly on the finished DCP as well.
In all reality, I think we're just going to screen this one without captions because after watching a portion of the program, it looks like the studio didn't hire anyone to actually transcribe the dialogue and just had some kind of voice recognition software handle it. They're pretty bad.
-
- Site Admin
- Posts: 2548
- Joined: Thu Nov 14, 2013 2:53 pm
Re: captions displaying > correctly
Thanks for the update. I have a fix for this bug and it should be gone in the next test release.
-
- Posts: 2804
- Joined: Tue Apr 15, 2014 9:11 pm
- Location: Germany
Re: captions displaying > correctly
So, Carl, what is it? Shouldn't &gt; be the correct escape sequence for > ?
- Carsten
-
- Site Admin
- Posts: 2548
- Joined: Thu Nov 14, 2013 2:53 pm
Re: captions displaying > correctly
It was just a bug: there was some code to convert > to &gt; but then some more code to convert & to &amp; so > got changed to &gt; and then to &amp;gt;
It shouldn't happen from 2.15.157 onwards.
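For anyone curious, the double escaping is easy to reproduce outside DCP-o-matic. This is a sketch with Python's standard library, not the actual code that was fixed:

```python
from xml.sax.saxutils import escape, unescape

caption = ">> dialogue"

# First pass: > becomes &gt; (this alone would be correct).
once = escape(caption)

# Second pass: the & of each &gt; is escaped again, giving &amp;gt;
twice = escape(once)

# A reader decodes one level only, so the viewer ends up seeing
# the literal text &gt;&gt; instead of >>.
shown = unescape(twice)

print(once)
print(twice)
print(shown)
```

Escaping exactly once, at the point where the text is written into the XML, avoids the problem.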
-
- Posts: 15
- Joined: Fri Jun 04, 2021 12:48 am
Re: captions displaying > correctly
Thanks Carl. You're a champ.