non-pointer struct kernel arguments fail due to varying ABIs

Bug #987905 reported by Pekka Jääskeläinen on 2012-04-24
This bug affects 2 people

Bug Description

pocl assumes that kernel arguments are mapped exactly as they appear in the source. Structs are problematic because different ABIs map them differently. For example, on AMD64/Linux the ABI sometimes seems to split a struct into two integers, which makes the kernel launch fail because the arguments end up mapped to the wrong positions.

This is not trivial to fix. Some possibilities:
- force the kernel ABI to be fixed at Clang side
- create a more sophisticated work group launcher that converts from the pocl argument table to the correct types and then calls the kernel with the correct ABI

The first is cleaner but possibly hard to get through. PTX uses a PTX_Kernel calling convention for the OpenCL kernels. For pocl the best would be that all targets used some (sensible) fixed calling convention for the kernels. For other functions, the default ABI can be used.

A possible solution is to write the launcher in C and let Clang compile it, so that it picks up whatever ABI conventions are in effect. Such a launcher would take the kernel parameters from globals (they are common to all work groups, so they do not need to be thread safe), and those globals would be where clSetKernelArg writes the arguments.
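A minimal sketch of such a C launcher follows. Everything in it is an assumption made for illustration: `pair_t`, `sum_pair`, `arg_table`, `set_kernel_arg` and `launch_sum_pair` are hypothetical names, the kernel is modelled as a plain C function, and the table layout is not pocl's actual one. The point is only that the launcher makes an ordinary typed C call, so the compiler applies whatever struct-passing rules the target ABI requires.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical struct passed by value to the kernel. */
typedef struct {
    int a;
    int b;
} pair_t;

/* Stand-in for a compiled kernel. In OpenCL C this would be roughly
   `kernel void sum_pair(global int *out, pair_t p)`. */
static void sum_pair(int *out, pair_t p)
{
    *out = p.a + p.b;
}

/* Hypothetical global argument table: one untyped slot per source-level
   argument, filled by a clSetKernelArg-like setter. */
#define MAX_ARGS 8
#define MAX_ARG_SIZE 64
static unsigned char arg_table[MAX_ARGS][MAX_ARG_SIZE];

/* Copies the raw bytes of one argument into its slot. */
static void set_kernel_arg(unsigned index, size_t size, const void *value)
{
    assert(index < MAX_ARGS && size <= MAX_ARG_SIZE);
    memcpy(arg_table[index], value, size);
}

/* Launcher: recovers the C types from the table and performs a normal
   C call, leaving the ABI-specific lowering of `pair_t` to the compiler. */
static void launch_sum_pair(void)
{
    int *out;
    pair_t p;

    memcpy(&out, arg_table[0], sizeof out);
    memcpy(&p, arg_table[1], sizeof p);
    sum_pair(out, p);
}
```

The launcher needs the C types of the arguments (here hard-coded as `int *` and `pair_t`), so generating it automatically requires type information about each kernel.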

Now I remember that something like this was planned, but the problem is that at the LLVM IR level you do not know the C types of the kernel arguments, so I think parsing the kernel sources would be needed to implement this. Then again, that is also needed to implement clGetKernelArgInfo().

The -cl-kernel-arg-info seems to be implemented in Clang SVN (3.2). That is the official switch used to store and fetch the argument info which allows implementing clGetKernelArgInfo and allows generating a launcher that adheres to the target's ABI.

llvm/tools/clang$ cat test/CodeGenOpenCL/
// RUN: %clang_cc1 %s -cl-kernel-arg-info -emit-llvm -o - | FileCheck %s

kernel void foo(int *X, int Y, int anotherArg) {
  *X = Y + anotherArg;
}

// CHECK: metadata !{metadata !"kernel_arg_name", metadata !"X", metadata !"Y", metadata !"anotherArg"}

The SPIR kernel calling conventions, when adopted, should fix issues like these.

Andreas Klöckner (inform) wrote :

Still no structs? :(

Unfortunately no. Personally I see passing structs to kernels as a portability trap due to the various ways structs can be laid out in memory, so this hasn't caused us trouble in our work projects, as we have avoided such code. Also, one can always work around it by using a pointer to the struct. Thus, it hasn't been at the top of my personal priority list, and it seems the situation has been the same for the other pocl developers.
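The pointer workaround can be sketched as follows. This is an illustration under stated assumptions: `params_t` and `apply` are hypothetical names, and the kernel is modelled as a plain C function with an explicit loop in place of work-items. In a real OpenCL program the struct would be placed in a buffer (for example one created with clCreateBuffer) and the buffer passed with clSetKernelArg, so the kernel receives only pointer arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parameter block shared between host and kernel. */
typedef struct {
    float scale;
    int offset;
} params_t;

/* Problematic form (struct by value), in OpenCL C:
     kernel void apply(global float *data, params_t p)
   Workaround form (pointer to struct), in OpenCL C:
     kernel void apply(global float *data, global const params_t *p)
   Below, the workaround form is modelled as a plain C function; the loop
   over `n` elements stands in for the work-items. */
static void apply(float *data, const params_t *p, size_t n)
{
    assert(data != NULL && p != NULL);
    for (size_t i = 0; i < n; ++i)
        data[i] = data[i] * p->scale + (float)p->offset;
}
```

With this form every kernel argument is a pointer, so the launcher does not depend on how the ABI passes structs by value. The host and the device still have to agree on the layout of `params_t` in memory.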

I cannot promise any delivery time for this feature from my side as I have my hands full with higher priority features, but if you (or anyone else) wishes to contribute to pocl development, and get this admittedly irritating bit done with, I can give some pointers:

This patch can serve as a model for adding a way to force the SPIR_KERNEL calling convention for OpenCL kernels at the source level. This calling convention would be nice because it would be the same for all targets, and the arguments would always be at positions matching those used with clSetKernelArg.

Right now it seems it's not possible to force a particular CC without using the SPIR target for building, which is overkill and might ruin optimizations (as Kalle pointed out on the list) unless we figure out a handy way to convert the SPIR target back to the real target before optimizations.

The other way is to look at the argument metadata and create a launcher that casts from the clSetKernelArg args to the real types before calling the kernel, which uses the target's CC. IMO the single kernel calling convention way is much cleaner and should be relatively easy, as there's a patch we can mimic to add the needed parts to Clang. Then we just need to #define the 'kernel' keyword to '__attribute__((spir_kernel)) kernel' or similar, and we get a kernel we always know how to call regardless of the target ABI/CC.

Furthermore, I only took a quick look at the Clang sources, so there might be an even easier way to override the CC for the kernels now that we access the Clang/LLVM API directly instead of via the opt/clang binaries.

Andreas Klöckner (inform) wrote :

Thanks for outlining a possible solution. I don't have the spare cycles at the moment to work on this, but knowing what needs to be done is certainly helpful. :)

Moved here: it seems this issue (struct alignment) is clearly not specified in the OpenCL specs either.

Changed in pocl:
status: New → Incomplete
status: Incomplete → Won't Fix
status: Won't Fix → Triaged