View Bug Details

ID: 0000328
Project: DCP-o-matic
Category: Features
View Status: public
Last Update: 2024-01-03 12:54
Reporter: Theo Kooijmans
Assigned To:
Priority: normal
Severity: feature
Reproducibility: have not tried
Status: acknowledged
Resolution: open
Platform: 64 bit
OS: Windows
OS Version: 7
Summary: 0000328: Use Nvidia CUDA to speedup render times
Description

Is it possible to use Nvidia CUDA technology to speed up render times?
Adobe Creative Cloud and others have used this technology very successfully to dramatically reduce render times.

Currently rendering takes place on the CPU cores, whereas an Nvidia GTX card has hundreds of cores that could do this...

Tags: No tags attached.
Branch:
Estimated weeks required:
Estimated work required: Major

Relationships

related to 0001778 closed carl Explore use of fastvideo SDK
related to 0002586 confirmed carl Support grok GPU acceleration
has duplicate 0001606 closed carl Make use of GPU for rendering

Activities

carl

2014-03-31 12:12

administrator   ~0000313

There is some work being done by a guy on the openjpeg mailing list to accelerate that library with CUDA. If he succeeds it will work in DCP-o-matic too...

tomashnyk@gmail.com

2017-02-18 12:49

reporter   ~0001616

Hm, it seems it is not happening as open source in the end: https://groups.google.com/d/msg/openjpeg/A_OSmrEoNkM/x7hGc6L3BgAJ and https://groups.google.com/d/msg/openjpeg/A_OSmrEoNkM/mwM7JI0XBQAJ

Any possibility of using this: http://apps.man.poznan.pl/trac/jpeg2k ? They claim to be much faster than OpenJPEG.

Also, in the announcement of the latest OpenJPEG version ( http://www.openjpeg.org/2016/09/28/OpenJPEG-2.1.2-released ), they claim some performance improvements are around the corner: "Note that meanwhile, in the master branch, an important improvement has been merged, namely T1 optimizations and multithreading support (contribution from Even Rouault … Thanks a lot !). A (much) faster OpenJPEG is on track … Stay tuned for v2.2.0." Would that be relevant for the performance of DOM? It already uses my four cores at 100%.

Carsten

2017-02-21 13:23

manager   ~0001617

DOM already contains some optimizations that are not part of the official OpenJPEG line.

http://dcpomatic.com/release-notes.php?v=2.7.1

Technically, it would be no problem to add GPU support to DOM. However, Carl doesn't want to add proprietary or commercial code to the project. Also, GPU support should be compatible with the multi-platform development approach of DCP-o-matic.

Keep in mind that cinema servers do not play 'any' JPEG2000 codestream, but only a special restricted subtype. Not every high-speed J2C codec is able to keep the codestream within these limits. Deviating from this route means a risk of crashing servers.
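
To make the point about restricted codestreams a bit more concrete, here is a minimal Python sketch of just one such limit: the per-frame size budget implied by a maximum bit rate. The 250 Mbit/s ceiling and the 24 fps default are assumptions based on the commonly cited DCI figure, not something stated in this thread, and the function names are hypothetical. A real compliance check covers more than frame size.

```python
# Assumption: a 250 Mbit/s ceiling on the image codestream (commonly cited DCI figure).
ASSUMED_MAX_BIT_RATE = 250_000_000  # bits per second


def max_frame_bytes(fps: int = 24, max_bit_rate: int = ASSUMED_MAX_BIT_RATE) -> int:
    """Largest codestream size (in bytes) one frame may have at the given rate."""
    return max_bit_rate // 8 // fps


def frame_within_budget(codestream: bytes, fps: int = 24,
                        max_bit_rate: int = ASSUMED_MAX_BIT_RATE) -> bool:
    """True if a single encoded frame fits the per-frame byte budget."""
    return len(codestream) <= max_frame_bytes(fps, max_bit_rate)
```

Under these assumptions, `max_frame_bytes()` gives 1302083 bytes per frame at 24 fps, so a faster encoder that occasionally overshoots this budget would not be safe to use, however quick it is.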

So, Carl is definitely keeping an eye on GPU support, but he is, for good reasons, also careful about it. The robustness of DOM's J2C codestream has never been questioned for as long as I have been following this project, and that is a very valuable asset for a free/open source software project.

- Carsten

carl

2017-02-22 13:48

administrator   ~0001620

As far as I know the stuff that OpenJPEG are now including is basically already in DoM. DoM multi-threads encoding itself, and it has a set of similar T1 optimizations that I offered to the project years ago.

I am not an expert on the innards of either J2K or CUDA, so this is a difficult task. I have done some work on the poznan.pl code, but as far as I can see it is not suitable for cinema work out of the box, so it needs some modifications.

carl

2019-02-13 21:31

administrator   ~0003083

Last edited: 2019-02-13 23:46

@carl: poznan-jpeg2k test script is run/test. dumpj2k shows numerous differences.

g2only

2019-09-22 02:21

reporter   ~0003423

I'm curious (guess that's why they named me George, ha!): on the macOS/OS X side of life, is utilizing the "Metal" GPU API (https://developer.apple.com/metal/) of any use? At least on the Mac side...

carl

2020-08-23 22:40

administrator   ~0003906

@carl: poznan notes/queries

  • tier1/coeff_coder vs tier1/ebcot/ ?
  • encode_tile() i.e. T1 by far the biggest problem (ca 150ms, next worst is conversion 27ms, fwt 13ms)
  • curious that T2 appears so quick
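
As a rough illustration of what the timings above imply, the Python sketch below applies Amdahl's law to the three stages that were measured (T1 150 ms, conversion 27 ms, FWT 13 ms). It assumes those three stages are the whole per-frame cost, which is a simplification since T2 and other steps are not timed in the note. The names are hypothetical.

```python
# Stage timings in milliseconds, taken from the note above.
STAGE_MS = {"t1": 150.0, "conversion": 27.0, "fwt": 13.0}


def overall_speedup(stage_ms: dict, stage: str, factor: float) -> float:
    """Whole-frame speedup if only `stage` becomes `factor` times faster."""
    total = sum(stage_ms.values())
    new_total = total - stage_ms[stage] + stage_ms[stage] / factor
    return total / new_total


def speedup_ceiling(stage_ms: dict, stage: str) -> float:
    """Upper bound on the speedup if `stage` took no time at all."""
    total = sum(stage_ms.values())
    return total / (total - stage_ms[stage])
```

With these numbers, making T1 ten times faster gives about a 3.45x overall speedup (190 ms down to 55 ms), and the ceiling from accelerating T1 alone is 4.75x. Beyond that point the conversion and FWT stages would dominate.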

Bug History

Date Modified     Username             Field                    Change
2014-03-30 12:45  Theo Kooijmans       New Bug
2014-03-31 12:12  carl                 Note Added: 0000313
2014-03-31 12:12  carl                 Status                   new => acknowledged
2015-06-12 11:49  carl                 Summary                  Use Nvidia CUDA to speedup rendertimes => Use Nvidia CUDA to speedup render times
2015-06-12 11:51  carl                 Target Version           => 2.x
2015-06-12 12:19  carl                 Estimated work required  => Major
2015-06-12 12:20  carl                 Severity                 major => feature
2017-02-18 12:49  tomashnyk@gmail.com  Note Added: 0001616
2017-02-21 13:23  Carsten              Note Added: 0001617
2017-02-22 13:48  carl                 Note Added: 0001620
2019-02-13 21:31  carl                 Note Added: 0003083
2019-02-13 23:46  carl                 Note Edited: 0003083
2019-09-19 23:58  carl                 Relationship added       related to 0001606
2019-09-22 02:21  g2only               Note Added: 0003423
2020-08-23 22:40  carl                 Note Added: 0003906
2020-08-23 22:41  carl                 Relationship replaced    has duplicate 0001606
2020-08-23 22:41  carl                 Relationship added       related to 0001778
2024-01-03 12:54  carl                 Relationship added       related to 0002586