IRC conversation log (partially redacted for privacy of other users):
[Jan24 12:30:44] BTW Oblomov, I was just looking over our convo a while ago about ICDs and ICD loaders, and I finally checked /etc/OpenCL/vendors and saw the nvidia icd, but I still don't see it in clinfo
[Jan24 12:33:09] So you said that could mean that there is missing hardware, which is not true, or that there are missing libraries, which I am not sure how to look for
[Jan24 12:36:14] ldd $(cat /etc/OpenCL/vendors/nvidia.icd)
[Jan24 12:38:27] Yeah no such file or directory
[Jan24 12:42:54] so the file is not there?
[Jan24 12:42:40] Oblomov: libnvidia-opencl.so.1 is the output so it is there
[Jan24 12:43:12] locate libnvidia-opencl.so.1
[Jan24 12:43:59] Wait lol no I just thought about it, Oblomov where is it supposed to be?
[Jan24 12:44:10] nvidia.icd is a regular file
[Jan24 12:44:36] yes
[Jan24 12:44:51] the file contains a string which is the name or path of the library
[Jan24 12:45:07] but the actual library is searched for in the due paths
[Jan24 12:44:50] Oh so libnvidia-opencl.so.1 is the name of an apt package?
[Jan24 12:45:12] so you have to find that library
[Jan24 12:45:20] no it's the name of the library
[Jan24 12:45:37] most likely installed by some nvidia-compute-* package if this is ubuntu
[Jan24 12:45:33] Ok it's not anywhere in PATH then, where would it be
[Jan24 12:52:11] Like what "due paths" do you mean
[Jan24 12:53:03] I have not looked through my libs before, I apologize if it's a dumb question, but where am I supposed to look to find such a library?
[Jan24 13:43:46] Liver_K: dpkg -L nvidia-compute-525
[Jan24 14:46:54] fwiw I'll be tagging the next clinfo release soon (ideally before the end of the week), so if any of you guys want to do some testing of the latest properties, please do
[Jan24 14:53:28] Athas: btw, if you want to test that double vs float thing, ARM on android doesn't support cl_khr_fp64
[Jan24 15:19:31] Oblomov: Ok; I'll substitute my version after nvidia-compute-
[Jan24 15:35:19] does anybody have access to IBM's opencl platform?
[Jan24 15:35:31] I think it's the only thing I haven't seen a clinfo for
[Jan24 18:28:30] Oblomov: I have libnvidia-compute-390 - libnvidia-opencl.so.1 is at /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1
[Jan24 18:48:23] Now what lol
[Jan24 19:39:46] Wait it is a symlink
[Jan24 19:40:06] Points to /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1
[Jan24 19:41:19] *Points to libnvidia-opencl.so.390.157
[Jan24 19:43:19] What do I do with that linked file?
[Jan25 00:13:43] can you ldd it?
[Jan25 10:35:20] Oblomov: https://dpaste.com/AHM6UB8AQ
[Jan25 10:36:49] looks like it should load, so I don't understand why it wouldn't work
[Jan25 10:36:57] do you have the nvidia and nvidia-uvm modules loaded?
[Jan25 10:38:29] I have nvidia-headless-390 and all its deps
[Jan25 10:40:05] lsmod | grep nvidia
[Jan25 10:41:39] Liver_K: ^
[Jan25 10:42:54] And everything there in /proc/modules is already loaded?
[Jan25 10:44:34] yes, but lsmod is typically used instead
[Jan25 10:44:43] do you have the nvidia and nvidia-uvm modules loaded?
[Jan25 10:45:17] Humm would have been helpful to know that a few times before heh. nvidia, nvidia_drm, and nvidia_modeset are loaded, but nothing with nvidia_uvm
[Jan25 10:47:47] modprobe nvidia-uvm
[Jan25 10:47:52] How do I know that is even an available module?
[Jan25 10:48:19] Where do I check all of them, loaded or not?
[Jan25 10:49:38] lsmod tells you if it's loaded
[Jan25 10:50:47] Oh is that the "used by" column?
[Jan25 10:51:23] no, it should be in the first column if it's loaded
[Jan25 10:51:03] It displays all of them and only loaded ones are used by something else?
[Jan25 10:51:05] Oh
[Jan25 10:51:38] the used by are the users
[Jan25 10:51:56] but if it says the nvidia module is used by nvidia_uvm, it means nvidia-uvm is loaded
[Jan25 10:51:58] Oblomov: I was asking where to check for ALL modules, not just loaded ones, but it looks like modprobe manpage says they are all in /lib/modules
[Jan25 10:59:05] Oblomov: None of them have uvm in the used by to clarify
[Jan25 10:59:17] I am just trying to find it somewhere before I try to load it
[Jan25 11:11:34] Liver_K: you need the uvm module to run compute loads
[Jan25 11:11:39] be it cuda or opencl
[Jan25 11:12:46] I actually have an enable-gpu module that modprobes nvidia and nvidia-uvm
[Jan25 11:14:59] Oblomov: I don't think there exists an nvidia-uvm module
[Jan25 11:15:47] Modprobing it fails, but I expected that since I grepped for it everywhere under /lib/modules/$(uname -r) and found absolutely nothing with the term "uvm"
[Jan25 11:16:00] So I don't have it for some reason
[Jan25 11:56:01] Liver_K: ok maybe 390 didn't have that yet
[Jan25 11:56:18] in which case I have no idea why it doesn't appear in your clinfo
[Jan25 11:56:24] even if cuda works
[Jan25 11:56:31] what gpu is this?
[Jan25 12:08:55] GeForce GTX 570
[Jan25 12:09:09] 390 is the newest version that works
[Jan25 12:09:19] Still being maintained AFAIK
[Jan25 12:09:45] Oblomov: CUDA does not work, I have not done anything to set that up
[Jan25 12:10:55] oh god fermi
[Jan25 12:11:12] Fermi?
[Jan25 12:12:56] Oblomov: What is this "nvidia-uvm" module supposed to do?
[Jan25 12:13:33] Liver_K: enable unified virtual memory
[Jan25 12:13:39] but it's a “recent” thing
[Jan25 12:13:43] Yeah lol no way is that gonna be on this card
[Jan25 12:13:48] But is it really required?
[Jan25 12:14:17] no
[Jan25 12:14:30] do you have the nvidia-smi tool installed?
[Jan25 12:14:09] Ok so must be something else
[Jan25 12:14:23] Oblomov: Not currently, I was going to do that soon
[Jan25 12:14:26] I'll do it now
[Jan25 12:24:51] https://dpaste.com/HJZFR4UC5
[Jan25 12:28:57] does clinfo work now?
[Jan25 12:30:14] wait, is this a headless setup?
[Jan25 12:29:59] No the GPU was visible with lshw before, clinfo still only reports my pocl platform
[Jan25 12:30:15] Oblomov: Here is the bug report though if you want it: http://0x0.st/oFZI.log.gz
[Jan25 12:30:51] Should I restart the machine though and run clinfo after that to try?
[Jan25 12:31:32] no
[Jan25 12:31:14] Oblomov: Oh sorry didn't see your last question, yes this is a headless setup
[Jan25 12:31:21] nvidia-headless-390 is installed
[Jan25 12:32:16] can you run strace -o /tmp/strace.log clinfo -l and pastebin /tmp/strace.log?
[Jan25 12:32:15] Sure give me a min
[Jan25 12:36:16] Oblomov: http://0x0.st/oFZD.log
[Jan25 12:36:25] Oh sorry you said pastebin that
[Jan25 12:38:01] Here https://dpaste.com/4SS4J8V54
[Jan25 12:40:53] so, it's looking for the nvidia-uvm stuff without finding it
[Jan25 12:41:04] so I'm guessing that's the reason why it won't work
[Jan25 12:41:19] it even tries to modprobe it
[Jan25 12:41:15] Lol
[Jan25 12:41:46] Does that mean the platform is actually there but clinfo doesn't like the old driver without uvm?
[Jan25 12:42:10] "there" as in installed and usable
[Jan25 12:42:47] it means the ubuntu driver is fsckd up for some reason because it's missing components
[Jan25 12:43:16] do you have apt-file installed?
[Jan25 12:43:05] Heh fsckd up that's funny
[Jan25 12:43:30] Oblomov: Yes
[Jan25 12:44:41] which version of ubuntu is this?
[Jan25 12:44:28] 22.04 LTS
[Jan25 12:44:33] *Ubuntu server
[Jan25 12:44:57] jammy?
[Jan25 12:44:40] yea
[Jan25 12:46:21] I mean should I try getting the actual Linux driver from nvidia and trying to install that?
[Jan25 12:46:48] no
[Jan25 12:46:33] Ok
[Jan25 12:48:00] did you install nvidia-dkms-390
[Jan25 12:47:48] It was a dep yes
[Jan25 12:49:58] and nvidia-kernel-common-390
[Jan25 12:50:23] so from what I'm seeing it should do the thing: https://packages.ubuntu.com/jammy/amd64/nvidia-kernel-common-390/filelist
[Jan25 12:50:22] Yes I have that installed as well
[Jan25 12:50:51] can you run sudo /sbin/create-uvm-dev-node ?
[Jan25 12:50:59] if it's there
[Jan25 12:58:06] Yeah just let me look through that first
[Jan25 13:06:50] Alright Oblomov I looked for nvidia-uvm manually in the /proc/devices file like the script does, and there is none, so I ran it anyway to see what happened and still nothing happened that I can see (since the string isn't found there)
[Jan25 13:07:41] I give up
[Jan25 13:07:44] report the issue to ubuntu
[Jan25 13:07:49] So this is a driver issue or ubuntu's fault?
[Jan25 13:09:35] Cause if ubuntu has the wrong driver or something in their apt db then why can't I get the one from nvidia?
[Jan25 13:14:35] Except I don't think I could find a headless driver from nvidia when I looked
[Jan25 13:16:04] it's weird because judging from the files everything should be there, but the uvm component is not being loaded
[Jan25 13:16:15] and that's needed for compute
[Jan25 13:16:10] What about the loader profile? Could that be causing any problem? I don't know if it's something you set or it's automatically set for each loaded platform, but what I see for pocl is an opencl 3.0 profile
[Jan25 13:16:24] And I don't think this older nvidia driver would be ocl 3.0
[Jan25 13:16:54] that's irrelevant
[Jan25 13:16:36] K
[Jan25 13:17:20] the nvidia-opencl platform doesn't load up because it can't find the uvm device
[Jan25 13:17:24] (by the strace)
[Jan25 13:17:45] so the question is, why isn't the uvm device being created and/or why don't you have nvidia-uvm
[Jan25 13:17:57] find /lib/modules -name \*uvm\* finds nothing then?
[Jan25 13:17:44] Maybe I can look for a package relating to that
[Jan25 13:18:20] no it's built with the nvidia dkms kernel sources
[Jan25 13:18:51] can you tell me what this finds? find /lib/modules -name \*uvm*
[Jan25 13:18:54] ehm
[Jan25 13:18:55] can you tell me what this finds? find /lib/modules -name \*uvm\*
[Jan25 13:23:39] No that finds nothing
[Jan25 13:25:17] Also la -R /lib/modules/5.15.0-58-generic/ | grep uvm outputs nothing
[Jan25 13:25:46] And double checked with something I knew was there to make sure it is working
[Jan25 13:29:06] what about in the dkms tree?
[Jan25 13:28:53] Hmm where's that again
[Jan25 13:29:18] something like /var/lib/dkms
[Jan25 13:31:37] Yeah grepping recursively in there also yields nil
[Jan25 13:31:46] Checked with a known true term again
[Jan25 13:33:08] you definitely have some issue. the uvm part should be part of the nvidia 390 kernel module
[Jan25 13:33:09] Gah I am mad at ubuntu now
[Jan25 13:33:12] And nvidia
[Jan25 14:00:24] Oblomov: You are probably more qualified than I to report that bug, if you are willing to :)
[Jan25 14:01:22] I don't report bugs for situations I don't have full access to
[Jan25 14:52:03] Alright then so I can report it better, can you tell me a bit more about what these components for the kernel module are?
[Jan25 14:54:12] Gotta love vague question heh
[Jan25 14:59:00] Liver_K: there should be either an nvidia-uvm module produced with the other, or something that produces a /dev/nvidia-uvm device
[Jan25 14:59:06] and this is not happening
[Jan25 14:59:47] Alright I'll slap it on launchpad sometime then
[Jan25 15:00:07] This is extremely annoying >| I will have to use cuda
[Jan25 15:02:05] Oh so nvidia-uvm is supposed to be its own kernel module
[Jan25 19:14:54] Oblomov: And you're sure the nvidia provided driver wouldn't be any different?
[Jan26 01:45:55] Liver_K: you can try it if you want
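
The manual checks from the log (reading the .icd file, looking at the first column of lsmod, searching /proc/devices and /lib/modules for uvm) could be gathered into one small script. The following is only a sketch in Python, not anything used in the conversation: the function names are hypothetical, the default paths are the ones mentioned in the log, and it assumes /proc/modules and /proc/devices have their usual whitespace-separated layout.

```python
from pathlib import Path


def icd_libraries(vendors_dir="/etc/OpenCL/vendors"):
    """Map each .icd file name to the library name or path stored in it."""
    found = {}
    root = Path(vendors_dir)
    if not root.is_dir():
        return found
    for icd in sorted(root.glob("*.icd")):
        lines = icd.read_text().split("\n")
        # The file holds a single string: the name or path of the library.
        found[icd.name] = lines[0].strip()
    return found


def loaded_modules(proc_modules="/proc/modules"):
    """Names of loaded kernel modules (the first column, as lsmod shows)."""
    path = Path(proc_modules)
    if not path.is_file():
        return set()
    return {line.split()[0] for line in path.read_text().splitlines() if line.strip()}


def has_uvm_device(proc_devices="/proc/devices"):
    """True if a device called nvidia-uvm is registered in /proc/devices."""
    path = Path(proc_devices)
    if not path.is_file():
        return False
    for line in path.read_text().splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[1] == "nvidia-uvm":
            return True
    return False


def find_uvm_files(modules_root="/lib/modules"):
    """Rough equivalent of: find /lib/modules -name '*uvm*'."""
    root = Path(modules_root)
    if not root.is_dir():
        return []
    return sorted(str(p) for p in root.rglob("*uvm*"))


if __name__ == "__main__":
    print("ICD files:", icd_libraries())
    # Loaded modules are listed with underscores, e.g. nvidia_uvm.
    print("nvidia_uvm loaded:", "nvidia_uvm" in loaded_modules())
    print("nvidia-uvm in /proc/devices:", has_uvm_device())
    print("uvm files under /lib/modules:", find_uvm_files())
```

On the machine discussed in the log, the last three checks would be expected to come back negative, which is the symptom that was to be reported to Ubuntu. The script does not try to load the library or the module; it only reports what is present.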