RTL subtitles

Anything and everything to do with DCP-o-matic.
joelh
Posts: 2
Joined: Tue May 22, 2018 9:00 pm

Re: RTL subtitles

Post by joelh »

Alex,

I would like to ask your advice, if possible. My RTL problem is as follows.
I created a Hebrew subtitle file for a Czech movie for the NFA (the Czech National Film Archive). I used Subtitle Edit and worked in UTF-8. In KMPlayer or PotPlayer all the subs look correct on a PC. VLC, however, has a problem when words with Hebrew diacritics (nikud) appear: in all such words the diacritics are treated as separate signs and are not placed properly (they appear sequentially, as if they were separate characters).
Now the DCP comes in: the NFA needs a DCP version, and I suggested they burn the subtitles in. I sent the NFA my SRT file, and they sent me a screener to check their result. Everything looks correct except the words that have Hebrew diacritics, just as in my problem with VLC. I have now sent them a version made with Subtitle Edit's DCP/Interop export. I hope that solves the problem, but I am not sure.
What is the proper, safe solution to this problem?
Many thanks for some advice.
Alex Asp
Posts: 92
Joined: Mon Apr 11, 2016 3:59 am

Re: RTL subtitles

Post by Alex Asp »

Hi Joelh,

Frankly, I am surprised to see such a request.
Diacritics are almost never used in Hebrew translations, whether for Hollywood blockbusters or for independent and arthouse movies.
The reason is the total unpredictability of the final outcome, just as you have described.

There is only one way to control the position of the diacritics: render the subtitles as graphic files.
Since DOM currently does not support graphic subtitles, there is no way to produce subtitles with diacritic marks in the proper position.

There is, however, a workaround for this situation, cumbersome as it may be. You could render your subtitles as graphics for Blu-ray and then mux them into a Blu-ray compatible .ts stream. Bring that stream into a DOM project and burn the subtitles into the picture.
Alex Asp
Posts: 92
Joined: Mon Apr 11, 2016 3:59 am

Re: RTL subtitles

Post by Alex Asp »

One more thing. Diacritic marks, at least in Hebrew and apart from a few specific cases, are not composite characters (a character plus a diacritic mark in one unit); each mark is simply a separate character. Therefore, even if macOS or Windows displays them properly, the servers are not given special instructions on how to place them, and DOM has no specific tools to control that. This is why they seem to be all over the place.
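To make the point about separate characters concrete, here is a minimal Python sketch that lists the code points stored in one pointed word. The sample word ("shalom" with nikud) and the helper name `describe` are illustrative choices, not something taken from the subtitle file under discussion.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name, general category) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "UNNAMED"), unicodedata.category(ch))
        for ch in text
    ]


# "shalom" with nikud, written as escapes so the stored sequence is explicit:
# shin, shin dot, qamats, lamed, vav, holam, final mem
POINTED = "\u05E9\u05C1\u05B8\u05DC\u05D5\u05B9\u05DD"

rows = describe(POINTED)
# Category "Mn" (nonspacing mark) identifies the diacritics.
marks = [row for row in rows if row[2] == "Mn"]

for code_point, name, category in rows:
    print(code_point, category, name)
print(len(rows), "code points,", len(marks), "of them nonspacing marks")
```

Each mark is stored after the letter it belongs to, so a renderer that does not stack nonspacing marks onto the preceding letter will draw them one after another, which matches the symptom described in this thread.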
joelh
Posts: 2
Joined: Tue May 22, 2018 9:00 pm

Re: RTL subtitles

Post by joelh »

Hi Alex,

Many thanks for addressing this issue here. I am an absolute novice in DCP, and I am surprised to hear that no guaranteed solution exists at present. I felt that some non-Hebrew names (people, places) should be written with diacritics, otherwise the spectators couldn't make much of them when transcribed in Hebrew. Without diacritics all these names are practically unsolvable puzzles. I used them very sparingly, so more than 99% of the text is free of diacritics.
As I remarked previously, I created a DCP/Interop output with Subtitle Edit, an outstanding free software program. It created an XML file containing pointers to the subs in the proper order, while each sub was rendered to a separate small PNG file. So if I understand your description correctly, this might be a graphical workaround as well. Or isn't it? I sent these to the NFA lab and am now waiting for their response.
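Before sending such a package to a lab, it may be worth checking that every PNG the XML points to is actually present. The Python sketch below does that under stated assumptions: the element name `Image` and the sample layout are my guess at what an Interop image-subtitle file looks like, and the file names are made up, so compare them against the XML that Subtitle Edit really wrote.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real export may use different attributes and names.
SAMPLE = """<DCSubtitle Version="1.0">
  <Subtitle SpotNumber="1" TimeIn="00:00:05:000" TimeOut="00:00:07:000">
    <Image>sub0001.png</Image>
  </Subtitle>
  <Subtitle SpotNumber="2" TimeIn="00:00:08:000" TimeOut="00:00:10:000">
    <Image>sub0002.png</Image>
  </Subtitle>
</DCSubtitle>"""


def referenced_images(xml_text):
    """Collect the text of every element whose local tag name is 'Image'."""
    root = ET.fromstring(xml_text)
    names = []
    for element in root.iter():
        local_name = element.tag.rsplit("}", 1)[-1]  # ignore any XML namespace
        if local_name == "Image" and element.text:
            names.append(element.text.strip())
    return names


def missing_images(xml_text, available):
    """Return referenced image names that are not in the set of available files."""
    return [name for name in referenced_images(xml_text) if name not in available]


print(referenced_images(SAMPLE))
print(missing_images(SAMPLE, {"sub0001.png"}))
```

For a real package, `available` could be built with `set(os.listdir(folder))` on the export folder.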
In the worst case I will replace those 30 to 40 or so words with plain Hebrew characters to keep it simple and usable for now. However, this is not a local, isolated issue. I have another three movies with subtitles in the making, all with the same problem.
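Finding and replacing the pointed words could be partly automated. The following Python sketch lists the words that contain Hebrew marks and produces an unpointed version of a line. The function names are hypothetical, and the rule used (any nonspacing mark in the range U+0591 to U+05C7) is my assumption about what should count as a diacritic here. Note that simply deleting the marks is not always the spelling a reader expects, since unpointed Hebrew often adds vav or yod, so the output still needs a manual check.

```python
import unicodedata


def is_hebrew_mark(ch):
    """True for nonspacing marks (points and accents) in the Hebrew block."""
    return "\u0591" <= ch <= "\u05C7" and unicodedata.category(ch) == "Mn"


def strip_hebrew_marks(text):
    """Remove Hebrew marks and keep letters, spaces, and punctuation."""
    return "".join(ch for ch in text if not is_hebrew_mark(ch))


def words_with_marks(text):
    """List the whitespace-separated words that contain at least one mark."""
    return [word for word in text.split() if any(is_hebrew_mark(ch) for ch in word)]


# One pointed word ("shalom") followed by one unpointed word ("olam")
LINE = "\u05E9\u05C1\u05B8\u05DC\u05D5\u05B9\u05DD \u05E2\u05D5\u05DC\u05DD"

print(words_with_marks(LINE))
print(strip_hebrew_marks(LINE))
```

Running `words_with_marks` over every cue of the SRT file would give the full list of affected words to review.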
Since part of my background is in computing and engineering (subtitling is a recent hobby), I wonder whether any initiative is under way to introduce proper subtitling standards that account for RTL languages, which need practical, workable solutions. The UTF-8 character set seems to provide the proper foundation, so what stops the DCP server developers from implementing such (or similar) solutions?
Alex Asp
Posts: 92
Joined: Mon Apr 11, 2016 3:59 am

Re: RTL subtitles

Post by Alex Asp »

Hi Joelh,

I'm not sure that transliterated names would be such a problem for an experienced Hebrew reader, even without the diacritic marks. But you can send me those words/names in question, and I'll see if this is such a critical issue.

Unfortunately, the graphical workaround will not work with Interop XML + graphics in DOM. Not yet, anyway, as DOM does not support this functionality.
It will work only if you create a Blu-ray image (you can do it with black for video and silence for audio). Then you can import this file (.m2ts) into DOM and use the subtitle stream to burn the subtitles into the image.
I used this method in DOM version 1, when you couldn't even import an RTL SRT file into the program.

The big and expensive DCP creation programs will let you use XML + graphics, if you want to go to a commercial DCP facility.

I don't know what stops DCP server developers from implementing full support for all RTL languages, but my guess is that the market for them (especially Hebrew and Farsi) is too small to bother with.