
ID: 0002592
Project: DCP-o-matic
Category: Features
View Status: public
Last Update: 2023-08-13 12:17
Reporter: overlookmotel
Assigned To: carl
Priority: normal
Severity: minor
Reproducibility: have not tried
Status: acknowledged
Resolution: open
Product Version: 2.16.59
Target Version: 2.16.x
Summary: 0002592: Use 3D LUT for RGB/YUV to XYZ colour space conversion?
Description

As far as I can see, libdcp does RGB to XYZ colour space conversion using a fairly calculation-heavy process.
https://github.com/cth103/libdcp/blob/main/src/rgb_xyz.cc#L270-L338

Applying the 2.6 gamma for XYZ uses a LUT (presumably to avoid slow pow() operations), and the lookup is done "piecewise": one LUT for small values and a second LUT for larger values.

Judging by rgb_xyz_lut_test, this LUT is a slight approximation, and produces inaccuracies of +/-1 for at least some values.
https://github.com/cth103/libdcp/blob/main/test/rgb_xyz_test.cc#L154-L162

(also I'm not sure if rgb_xyz_lut_test is testing all relevant values - 0.000001 may not be small enough to cover the very bottom of the scale where errors are most likely)
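
To make the concern about small values concrete, here is a minimal sketch of a two-table ("piecewise") gamma LUT and an error check against the exact pow() result. This is not libdcp's code: the class name, the boundary of 0.062, the table sizes and the round-to-nearest indexing are all assumptions chosen for illustration, so the size of any error it reports says nothing about libdcp's actual tables.

```python
class PiecewiseGammaLUT:
    """Two tables for x ** (1 / gamma): a fine one below `boundary`, a coarser one above.

    All parameters are hypothetical defaults, not values taken from libdcp.
    Inputs are assumed to lie in [0, 1].
    """

    def __init__(self, boundary=0.062, low_bits=16, high_bits=12, gamma=2.6, scale=4095):
        self.boundary = boundary
        self.low_scale = ((1 << low_bits) - 1) / boundary
        self.high_scale = ((1 << high_bits) - 1) / (1.0 - boundary)
        power = 1.0 / gamma
        self.low = [round(scale * (i / self.low_scale) ** power)
                    for i in range(1 << low_bits)]
        self.high = [round(scale * min(1.0, boundary + i / self.high_scale) ** power)
                     for i in range(1 << high_bits)]

    def lookup(self, x):
        # Pick the table by range, then index by the nearest entry.
        if x < self.boundary:
            return self.low[round(x * self.low_scale)]
        return self.high[round((x - self.boundary) * self.high_scale)]


def exact(x, gamma=2.6, scale=4095):
    """Reference result computed directly with a power function."""
    return round(scale * x ** (1.0 / gamma))


def max_error(lut, samples):
    """Largest absolute difference between the LUT and the exact result."""
    return max(abs(lut.lookup(x) - exact(x)) for x in samples)


lut = PiecewiseGammaLUT()
# Sweep evenly spaced values plus a logarithmic run towards zero.
samples = [i / 10000 for i in range(10001)] + [10.0 ** -k for k in range(1, 13)]
worst = max_error(lut, samples)
```

The logarithmic run is the point of the exercise: the curve is steepest near zero, so a sweep that stops at 0.000001 only touches the first one or two entries of the low table in this layout.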

For at least some input colour spaces, I wonder if it'd be workable to use a 3D LUT which maps from all possible RGB/YUV inputs to XYZ output.

The two most common sources I make DCPs from are:

  • H264 (usually 8-bit YUV)
  • ProResHQ (usually 10-bit YUV)

Memory required for 3D LUTs mapping all possible input RGB/YUV values to 12-bit XYZ would be:

8 bit input: 2^24 * 2 bytes/pixel = 32 MiB
10 bit input: 2^30 * 2 bytes/pixel = 2 GiB
12 bit input: 2^36 * 2 bytes/pixel = 128 GiB

These numbers could be reduced by 25% by packing the bits, at the cost of extra operations to unpack them.
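
As a rough illustration of the packing trade-off, the sketch below stores two 12-bit values in three bytes instead of four, which is where the 25% saving comes from. The function names and the byte layout are my own choices for the example, not anything taken from libdcp.

```python
def pack12(values):
    """Pack 12-bit integers (0..4095) two at a time into three bytes."""
    values = list(values)
    if len(values) % 2:
        values.append(0)  # pad to an even count
    out = bytearray()
    for a, b in zip(values[0::2], values[1::2]):
        out.append(a >> 4)                        # top 8 bits of a
        out.append(((a & 0xF) << 4) | (b >> 8))   # low 4 bits of a, top 4 bits of b
        out.append(b & 0xFF)                      # low 8 bits of b
    return bytes(out)


def unpack12(packed, i):
    """Read the i-th 12-bit value back out of the packed buffer."""
    base = (i // 2) * 3
    if i % 2 == 0:
        return (packed[base] << 4) | (packed[base + 1] >> 4)
    return ((packed[base + 1] & 0xF) << 8) | packed[base + 2]


values = [0, 4095, 0xABC, 0xDEF]
packed = pack12(values)  # 6 bytes, versus 8 bytes at 2 bytes per value
```

The extra shifts and masks in `unpack12` are the "extra operations" mentioned above; whether they cost more than the cache misses they save would need measuring.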

Obviously 128 GiB is too much memory! But 32 MiB seems quite modest, and for computers with plenty of RAM, 2 GiB is also not entirely unreasonable. I often see a single browser tab in Chrome eating up 200+ MB.

I assume that a single LUT lookup per pixel would be faster than all the calculations involved in YUV -> RGB -> XYZ conversion, and it'd also improve accuracy by dispensing with the approximated gamma LUT.
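
A minimal sketch of what I have in mind, in Python for brevity. The conversion function is only a stand-in: it assumes full-range BT.709 YUV, a plain 2.2 input gamma, the BT.709/D65 RGB to XYZ matrix and a 48/52.37 scaling before the 2.6 output gamma, none of which is checked against what libdcp or DCP-o-matic actually do. The part that matters is the table build and the single indexed lookup per pixel.

```python
from array import array

# BT.709 RGB -> XYZ (D65) matrix; an assumed colour space for this sketch.
RGB_TO_XYZ = (
    (0.4124564, 0.3575761, 0.1804375),
    (0.2126729, 0.7151522, 0.0721750),
    (0.0193339, 0.1191920, 0.9503041),
)


def yuv_to_xyz12(y, u, v, bits):
    """Stand-in per-pixel conversion (assumptions described in the text above)."""
    peak = (1 << bits) - 1
    half = 1 << (bits - 1)
    yn, cb, cr = y / peak, (u - half) / peak, (v - half) / peak
    rgb = (yn + 1.5748 * cr,
           yn - 0.1873 * cb - 0.4681 * cr,
           yn + 1.8556 * cb)
    linear = [min(max(c, 0.0), 1.0) ** 2.2 for c in rgb]
    out = []
    for row in RGB_TO_XYZ:
        value = sum(m * c for m, c in zip(row, linear)) * 48.0 / 52.37
        out.append(round(4095 * min(max(value, 0.0), 1.0) ** (1 / 2.6)))
    return tuple(out)


def build_lut(bits):
    """Precompute the XYZ triple for every possible (y, u, v) input."""
    size = 1 << bits
    lut = array("H")  # unsigned 16-bit slots; the 12-bit outputs are left unpacked
    for y in range(size):
        for u in range(size):
            for v in range(size):
                lut.extend(yuv_to_xyz12(y, u, v, bits))
    return lut


def lookup(lut, bits, y, u, v):
    """One index calculation and one read per pixel."""
    index = ((y << (2 * bits)) | (u << bits) | v) * 3
    return tuple(lut[index:index + 3])


# 4-bit input keeps the demo table tiny (4096 entries); 8-bit would be 2^24.
demo = build_lut(4)
```

Because every possible input is precomputed with the exact function, the table reproduces it bit-for-bit, which is the accuracy argument above. The speed argument is less certain: a table this large will not fit in cache, so it would need benchmarking.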

This would be a trade-off between memory usage and processing speed and/or accuracy, but for some common inputs, I think the trade-off might be worthwhile (though I have no idea if colour space conversion is a drop in the ocean compared to the computation required to compress JPEG2000).

Do you think this might be worth investigating? I'd be happy to write a test implementation and benchmark it if you think this may have potential (but in Rust, not C++, I'm afraid).

Tags: No tags attached.
Branch:
Estimated weeks required:
Estimated work required: Undecided

Activities

overlookmotel

2023-07-28 21:44

developer   ~0005862

I got my maths wrong! Size of the LUTs would be:

8 bit: 2^24 x 36 bits = 72 MiB
10 bit: 2^30 x 36 bits = 4.5 GiB
12 bit: 2^36 x 36 bits = 288 GiB

(this time assuming tightly packed data)

If input is legal range YUV (typical for ProRes), and it's acceptable to clamp values to legal range before using the LUT, LUTs could be smaller:

8 bit: 47.8 MiB
10 bit: 3.0 GiB
12 bit: 191 GiB

4.5 GiB or 3.0 GiB are a bit large for comfort, but I'd say still workable for some users. On a machine with 16 GB RAM, it'd work for me, for example.
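
For anyone who wants to check the arithmetic, the sizes can be reproduced with a few lines of Python. The sketch assumes 36 bits per entry (three tightly packed 12-bit values) and counts legal range as luma 16..235 and chroma 16..240 at 8 bits, scaled up for higher bit depths; the helper names are made up for this example.

```python
def full_range(bits):
    """Number of levels per channel when every code value is allowed."""
    n = 1 << bits
    return (n, n, n)


def legal_range(bits):
    """Levels per channel for legal-range YUV (assumed 16..235 luma, 16..240 chroma at 8 bits)."""
    step = 1 << (bits - 8)
    return (219 * step + 1, 224 * step + 1, 224 * step + 1)


def lut_bytes(levels, bits_per_entry=36):
    """Size in bytes of a 3D LUT with one tightly packed entry per input triple."""
    y, u, v = levels
    return y * u * v * bits_per_entry / 8


MIB, GIB = 2 ** 20, 2 ** 30
sizes = {
    "8 bit full (MiB)": lut_bytes(full_range(8)) / MIB,
    "10 bit full (GiB)": lut_bytes(full_range(10)) / GIB,
    "12 bit full (GiB)": lut_bytes(full_range(12)) / GIB,
    "8 bit legal (MiB)": lut_bytes(legal_range(8)) / MIB,
    "10 bit legal (GiB)": lut_bytes(legal_range(10)) / GIB,
    "12 bit legal (GiB)": lut_bytes(legal_range(12)) / GIB,
}
```

Counted this way, the full-range figures and the 8-bit and 10-bit legal-range figures match the ones above after rounding. The 12-bit legal-range size comes out a little under the 191 GiB quoted (roughly 189 GiB), which depends on exactly how the legal range is counted and does not change the conclusion.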

Bug History

Date Modified Username Field Change
2023-07-28 17:05 overlookmotel New Bug
2023-07-28 21:44 overlookmotel Note Added: 0005862
2023-08-13 11:19 carl Assigned To => carl
2023-08-13 11:19 carl Status new => acknowledged
2023-08-13 12:17 carl Target Version => 2.16.x
2023-08-13 12:17 carl Estimated work required => Undecided