get_local_id etc. cannot be called from functions
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
pocl | Fix Released | High | Carlos Sánchez de La Lama |
Bug Description
get_local_id(), get_group_id(), and some similar routines cannot be called from functions that are called from kernel functions. The example below illustrates this, using the "noinline" attribute to avoid having to use a much larger function that LLVM would not inline.
The OpenCL code below does not run; it aborts with this error message:
Undefined symbols for architecture x86_64:
"__local_id_x", referenced from:
_dot_product1 in parallel.so.o
ld: symbol(s) not found for inferred architecture x86_64
Assertion failed: (error == 0), function pocl_pthread_run, file devices/
void
dot_product1 (__global const float4 *a,
__global const float4 *b, __global float *c)
__attribute__((noinline));
void
dot_product1 (__global const float4 *a,
__global const float4 *b, __global float *c)
{
int gid = get_local_id(0);
c[gid] = dot(a[gid], b[gid]);
}
__kernel void
dot_product (__global const float4 *a,
__global const float4 *b, __global float *c)
{
dot_product1 (a, b, c);
}
I removed the forced inlining from the passes, because in some cases it might not be optimal to inline everything (for example, to make better use of instruction caches), and it is not a strict requirement for the replication algorithm. But then this can happen. The "privatization" step in the passes has to be fixed to either 1) always inline any function that leads to a get_xxx call, or 2) not privatize the related global variables if a get_xxx is called from a subfunction. I think I will do 1), as the other option is thread-unsafe.
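To make option 1) concrete, here is a minimal sketch in plain C of how one might decide which functions need forced inlining: mark every function from which a work-item builtin is reachable, then inline the non-kernel ones. This is not pocl's actual pass; the `struct func` call-graph representation, the function names `mark_reaching`, `needs_forced_inline` and `example_needs_inline`, and the extra helper "scale" are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CALLEES 4

/* Hypothetical call-graph node; not a pocl or LLVM data structure. */
struct func {
    const char *name;
    bool is_kernel;              /* entry point, never inlined itself */
    bool is_workitem_builtin;    /* e.g. get_local_id, get_group_id */
    int callees[MAX_CALLEES];    /* indices into the function table */
    int num_callees;
    bool reaches_builtin;        /* result of the analysis */
};

/* Mark every function that calls a work-item builtin directly or through
   other functions. Iterates to a fixed point, so call chains of any depth
   are covered. */
static void mark_reaching(struct func *funcs, int n)
{
    bool changed = true;
    while (changed) {
        changed = false;
        for (int i = 0; i < n; i++) {
            if (funcs[i].reaches_builtin)
                continue;
            for (int j = 0; j < funcs[i].num_callees; j++) {
                int c = funcs[i].callees[j];
                assert(c >= 0 && c < n);
                if (funcs[c].is_workitem_builtin || funcs[c].reaches_builtin) {
                    funcs[i].reaches_builtin = true;
                    changed = true;
                    break;
                }
            }
        }
    }
}

/* Under option 1), a non-kernel function that leads to a get_xxx call
   would have to be inlined into its callers. */
static bool needs_forced_inline(const struct func *f)
{
    return f->reaches_builtin && !f->is_kernel && !f->is_workitem_builtin;
}

/* Call graph of the example in this report, plus a hypothetical helper
   "scale" that calls no work-item builtin. */
static bool example_needs_inline(const char *name)
{
    struct func funcs[] = {
        { "get_local_id", false, true,  {0}, 0, false },
        { "dot_product1", false, false, {0}, 1, false },
        { "dot_product",  true,  false, {1}, 1, false },
        { "scale",        false, false, {0}, 0, false },
    };
    int n = (int)(sizeof funcs / sizeof funcs[0]);

    mark_reaching(funcs, n);
    for (int i = 0; i < n; i++)
        if (strcmp(funcs[i].name, name) == 0)
            return needs_forced_inline(&funcs[i]);
    return false;
}
```

In this sketch, dot_product1 is flagged for inlining because it calls get_local_id, while a function such as "scale" could stay out of line, which keeps the instruction-cache benefit mentioned above for code that never touches the privatized variables.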