Keeping notes here, mostly for myself, but also in case someone else is looking at this.
I bisected it down to ply_pixel_buffer_fill_with_argb32_data_at_opacity_with_clip(), using gcc 4.4's shiny new #pragma GCC optimize ("1") feature. That doesn't work within a function, so I now had to resort to adding some debug statements and diff'ing logs.
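For reference, a minimal sketch of how the pragma can fence off individual functions while bisecting. The two function names are hypothetical stand-ins, not plymouth code; push_options and pop_options limit the override to the region between them.

```c
#include <stdint.h>

/* Hypothetical helpers, only here to show where the pragmas go. */

#pragma GCC push_options
#pragma GCC optimize ("1")   /* functions up to pop_options are built as with -O1 */
uint32_t
suspect_scale (uint32_t value, uint8_t factor)
{
  return (value * factor + 0x80) >> 8;
}
#pragma GCC pop_options      /* back to the command-line level, e.g. -O2 */

uint32_t
trusted_scale (uint32_t value, uint8_t factor)
{
  return (value * factor + 0x80) >> 8;
}
```

Moving the push/pop pair around function by function narrows things down to one function, but no further, since the pragma is only accepted at file scope.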
It seems fine up until this point:
x += cropped_area.x - fill_area->x;
y += cropped_area.y - fill_area->y;
opacity_as_byte = (uint8_t) (opacity * 255.0);
The call to make_pixel_value_translucent() is not to blame either.
Problem happens in ply_pixel_buffer_blend_value_at_pixel() in the if statement: apparently blend_two_pixel_values() delivers a different result under -O2 and -O1:
-ply_pixel_buffer_blend_value_at_pixel old: 4281008158 new: 4281073951
+ply_pixel_buffer_blend_value_at_pixel old: 4281008158 new: 4281008414
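To relate these packed values to the per-channel dumps further down, here is a small decoding sketch. It assumes the buffer stores ARGB32 with alpha in the top byte; the helper names are made up for illustration.

```c
#include <stdint.h>

/* Channel extraction, assuming packed ARGB32 (alpha in bits 24..31). */
uint8_t pixel_alpha (uint32_t pixel) { return (uint8_t) (pixel >> 24); }
uint8_t pixel_red   (uint32_t pixel) { return (uint8_t) (pixel >> 16); }
uint8_t pixel_green (uint32_t pixel) { return (uint8_t) (pixel >> 8); }
uint8_t pixel_blue  (uint32_t pixel) { return (uint8_t) pixel; }
```

Under that assumption, old (4281008158 = 0xff2b001e) decodes to rgb(43 0 30), the new value on the "-" line (4281073951 = 0xff2c011f) to rgb(44 1 31), and the new value on the "+" line (4281008414 = 0xff2b011e) to rgb(43 1 30).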
Indeed, making blend_two_pixel_values() non-inline and just building this function with -O1 makes it work.
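One way to pin a single function like that is with GCC function attributes (the optimize attribute also arrived in gcc 4.4). This is only a sketch: blend_stub and its body are hypothetical placeholders, not the real blend code.

```c
#include <stdint.h>

/* noinline keeps the function from being merged into its -O2 caller;
   optimize ("O1") builds just this function at the lower level.
   Hypothetical placeholder body: alpha from the first value, rgb from the second. */
__attribute__ ((noinline, optimize ("O1")))
uint32_t
blend_stub (uint32_t pixel_value_1, uint32_t pixel_value_2)
{
  return (pixel_value_1 & 0xff000000u) | (pixel_value_2 & 0x00ffffffu);
}
```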
Within blend_two_pixel_values(), I only get hits for the "then" branch. The decomposition of the pixel_value_{1,2} input args into rgba are identical in both cases, but the output value is not in many cases:
$ diff -u /tmp/good /tmp/bad | wdiff -d
# some examples only:
THEN rgba1(0 0 0 1) rgb2(43 0 30) rgb(43 0 30)
THEN rgba1(1 1 1 2) rgb2(43 0 30) [-rgb(44-] {+rgb(43+} 1 [-31)-] {+30)+}
THEN rgba1(3 3 3 4) rgb2(43 0 30) [-rgb(45 3 33)-] {+rgb(42 1 30)+}
With that we can hopefully construct a minimal test case. To be continued tomorrow...
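A possible starting point for that test case, as a sketch only. It assumes the "then" branch is a premultiplied source-over blend onto an opaque destination with the usual rounded division by 255. That formula is an assumption, not copied from the source, although it does reproduce the three good values listed above. All names here are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Rounded v / 255 for v up to 255 * 255. */
static uint8_t
div_255_rounded (uint32_t v)
{
  return (uint8_t) ((v + (v >> 8) + 0x80) >> 8);
}

/* Assumed blend: premultiplied ARGB32 src over an opaque dst. */
static uint32_t
reference_blend (uint32_t src, uint32_t dst)
{
  uint32_t alpha = src >> 24;
  uint32_t red   = ((src >> 16) & 0xff) * 255 + ((dst >> 16) & 0xff) * (255 - alpha);
  uint32_t green = ((src >> 8) & 0xff) * 255 + ((dst >> 8) & 0xff) * (255 - alpha);
  uint32_t blue  = (src & 0xff) * 255 + (dst & 0xff) * (255 - alpha);

  return 0xff000000u
         | ((uint32_t) div_255_rounded (red) << 16)
         | ((uint32_t) div_255_rounded (green) << 8)
         | div_255_rounded (blue);
}

struct blend_case { uint32_t src, dst, expected; };

/* Inputs and good (-O1) outputs taken from the wdiff excerpt above. */
static const struct blend_case blend_cases[] = {
  { 0x01000000u, 0xff2b001eu, 0xff2b001eu }, /* rgba1(0 0 0 1) -> rgb(43 0 30) */
  { 0x02010101u, 0xff2b001eu, 0xff2c011fu }, /* rgba1(1 1 1 2) -> rgb(44 1 31) */
  { 0x04030303u, 0xff2b001eu, 0xff2d0321u }, /* rgba1(3 3 3 4) -> rgb(45 3 33) */
};

/* Returns how many cases differ from the expected good value. */
int
count_blend_mismatches (void)
{
  int mismatches = 0;
  size_t i;

  for (i = 0; i < sizeof blend_cases / sizeof blend_cases[0]; i++)
    if (reference_blend (blend_cases[i].src, blend_cases[i].dst)
        != blend_cases[i].expected)
      mismatches++;

  return mismatches;
}
```

Swapping reference_blend() for the real blend_two_pixel_values() and building once with -O1 and once with -O2 should then show whether the miscompile reproduces outside of plymouth.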