[off-topic dev question] About the clustered encoding

ydedy
Posts: 18
Joined: Thu Jan 14, 2016 11:56 am


Post by ydedy »

Just out of curiosity, I was wondering how clustered encoding is handled in DCP-o-matic, since I plan to build a similar app in the same spirit, also using ffmpeg.
Does the main server send the frames one by one to each of the encoding servers and then, once all frames are done, concatenate them into one file? Or are the chunks sent to the nodes larger (a couple of seconds long, say)?

I tried reading the source code but couldn't find a clear answer to this.

Cheers
carl
Site Admin
Posts: 2548
Joined: Thu Nov 14, 2013 2:53 pm

Re: [off-topic dev question] About the clustered encoding

Post by carl »

Hi,

In the main DCP-o-matic there is a thread per encoding server core (so if there's a server which says it can do 4 in parallel there will be 4 threads in the main DCP-o-matic for that server). There is also a queue of frames that need to be encoded, and a queue of encoded frames that need to be written to disk.

Each encoding thread takes a frame off the queue, sends it to the server, waits for the reply and puts the encoded frame on the "to be written" queue. A separate thread then takes frames off that queue and writes them to disk.
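A minimal sketch of that pattern (illustrative names only, not DoM's actual code): one thread per remote "slot", a shared to-encode queue, and a single writer thread draining a to-write queue. `fake_encode()` stands in for the network round-trip to an encoding server.

```python
# Sketch of the pattern above: one thread per remote "slot", a shared
# to-encode queue, and one writer thread draining a to-write queue.
# Illustrative only; fake_encode() stands in for the network round-trip.
import queue
import threading

to_encode = queue.Queue()   # (index, frame) tuples, or None as a stop signal
to_write = queue.Queue()    # (index, encoded_frame) tuples

def fake_encode(frame):
    """Placeholder for 'send frame to server, wait for encoded reply'."""
    return frame[::-1]

def encoder_worker():
    while True:
        item = to_encode.get()
        if item is None:                 # stop signal for this slot
            break
        index, frame = item
        to_write.put((index, fake_encode(frame)))

def writer(total, out):
    """Drain the to-write queue; a real writer would reorder onto disk."""
    for _ in range(total):
        index, encoded = to_write.get()
        out[index] = encoded

frames = [bytes(range(i, i + 4)) for i in range(8)]
results = {}
slots = 4                                # e.g. one server advertising 4-way parallelism
workers = [threading.Thread(target=encoder_worker) for _ in range(slots)]
writer_thread = threading.Thread(target=writer, args=(len(frames), results))
for t in workers:
    t.start()
writer_thread.start()
for i, frame in enumerate(frames):
    to_encode.put((i, frame))
for _ in workers:
    to_encode.put(None)                  # one stop signal per worker
for t in workers:
    t.join()
writer_thread.join()
```

The queues give you natural back-pressure: slow servers simply pull frames less often, and the writer never has to know how many servers exist.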

I haven't done much careful optimisation of this stuff so there may well be a better way!

Cheers,
Carl

Re: [off-topic dev question] About the clustered encoding

Post by ydedy »

Hello,

Thanks, this is pretty interesting! About the to-be-encoded frames: are they processed by ffmpeg in a lossless fashion before being queued and sent to the server, or is it another library that slices the source data onto the queue?

We are using a rather heavy cluster (6x Xeon E5-2630 v4 spread across three servers) and we are very satisfied with the performance. I personally find there have been a lot of improvements in the encoder itself, so not much to complain about.

Re: [off-topic dev question] About the clustered encoding

Post by carl »

It depends where they come from. If they are from a "video" file (i.e. one which needs a decoder that maintains state between frames) they are sent as plain uncompressed bitmaps. If they are from "image" files (e.g. TIFF or whatever) they are sent compressed.

Sometimes the outgoing network connection from the "main" DoM is the bottleneck; in those cases it would probably be better to pass all the compressed video data over the network to each server and give each one a decoder, then tell each one which particular frames it should encode. But that is quite complicated!
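One possible shape for that scheme (purely illustrative, nothing DoM does today): give every node the whole compressed stream plus a contiguous range of frame indices, so each node keeps one stateful decoder and encodes only its own range. The index assignment might look like:

```python
# Sketch: assign contiguous frame ranges to nodes so each node runs one
# stateful decoder over the shared compressed stream and encodes only
# its own range. Purely illustrative; not DCP-o-matic code.
def assign_ranges(total_frames, nodes):
    base, extra = divmod(total_frames, nodes)
    ranges, start = [], 0
    for n in range(nodes):
        size = base + (1 if n < extra else 0)  # spread the remainder evenly
        ranges.append(range(start, start + size))
        start += size
    return ranges

print(assign_ranges(10, 3))  # [range(0, 4), range(4, 7), range(7, 10)]
```

Contiguous ranges matter here: with a long-GOP codec each node would still have to decode from the previous keyframe, so interleaving frames across nodes would multiply the decode work.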

Re: [off-topic dev question] About the clustered encoding

Post by ydedy »

This makes everything much clearer! I had always wondered why the outgoing connection from the main DoM wasn't saturating the link; I didn't know there was image processing beforehand.

All "video" files are sent as plain uncompressed bitmaps, even intra-frame codecs like ProRes, DNxHD/HR or FFV1?

One way to improve performance would be to make sure that every node uses the same mount point containing the source sequences.
But on DoM that would be quite painful for a non-tech-savvy user to deal with.

Re: [off-topic dev question] About the clustered encoding

Post by carl »

ydedy wrote:
All "video" files are sent as plain uncompressed bitmaps, even intra-frame codecs like ProRes, DNxHD/HR or FFV1?
Yes. I don't know if we can find out from FFmpeg that a particular codec's frames can be decoded without all frames being passed into the decoder. It would be nice...
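For what it's worth, FFmpeg's codec descriptors do carry an AV_CODEC_PROP_INTRA_ONLY flag (queried via avcodec_descriptor_get() in libavcodec), which looks like exactly this. A sketch of the decision, using an illustrative hand-kept whitelist instead of the real FFmpeg call:

```python
# Sketch: decide whether a codec's frames can be decoded independently.
# FFmpeg itself exposes this as AV_CODEC_PROP_INTRA_ONLY on the codec
# descriptor; the whitelist below is illustrative, not exhaustive.
INTRA_ONLY_CODECS = {"prores", "dnxhd", "ffv1", "mjpeg", "jpeg2000"}

def is_intra_only(codec_name):
    """True if every frame of this codec is self-contained (no inter-frame refs)."""
    return codec_name.lower() in INTRA_ONLY_CODECS

# Intra-only sources could be sent compressed and decoded on the node;
# long-GOP sources (e.g. h264) still need a stateful decoder on the master.
```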
ydedy wrote:
One way to improve performance would be to make sure that every node uses the same mount point containing the source sequences.
But on DoM that would be quite painful for a non-tech-savvy user to deal with.
Yes, I think it's quite important for the user that all of that is handled by DoM. DoM could 'stream' the sources out to the decoders to solve that: it would send all the source data rather than just particular frames. But I guess there's a point where this again becomes less efficient: if you're streaming the source data for 100 frames but you only actually need 1, it would probably have been better to just send that 1 frame as an uncompressed bitmap, if you see what I mean.
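That last trade-off is easy to put in rough numbers. A back-of-envelope sketch (every figure here is an assumption for illustration, not a measurement): a 2K frame as a 16-bit RGB bitmap is roughly 13 MB, so streaming a 100-frame run to deliver a single frame only pays off if the whole run is smaller than that one bitmap.

```python
# Back-of-envelope for the trade-off above. All numbers are assumptions
# for illustration, not measurements of DoM or any real source.
width, height, channels, bytes_per_sample = 2048, 1080, 3, 2
bitmap_bytes = width * height * channels * bytes_per_sample   # one uncompressed frame

stream_frame_bytes = 1_000_000    # assumed per-frame size of the streamed source
run_length = 100                  # frames streamed in order to deliver...
frames_needed = 1                 # ...this many frames a node actually encodes

send_bitmaps = frames_needed * bitmap_bytes
send_stream = run_length * stream_frame_bytes
# Streaming only wins when the whole run is smaller than the bitmaps it replaces.
streaming_wins = send_stream < send_bitmaps
```

With these assumed figures the single bitmap (~13 MB) beats the 100 MB run easily; the crossover moves as the needed-frames fraction grows, which is presumably why neither scheme is best in all cases.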