NVIDIA
Grid k2 on Intel S5520UR board
Hi guys, I'm trying to get a GRID K2 to work on an Intel S5520 board with ESXi 6.5 installed. The device is visible on the online portal under the hardware PCI devices, but when I SSH in and run "nvidia-smi" it gives me "Failed to initialize NVML: Unknown Error".
So I ran "dmesg | grep NVIDIA"
and it showed:
"
[root@localhost:~] dmesg | grep NVIDIA
VMB: 323: name: /NVIDIA_k.v00
2018-04-26T20:32:02.331Z cpu0:65536)VisorFSTar: 1954: NVIDIA_k.v00 for 0x42af88e bytes
2018-04-26T20:32:07.707Z cpu15:65999)Elf: 2043: module nvidia has license NVIDIA
2018-04-26T20:32:08.101Z cpu15:65999)NVRM: loading NVIDIA UNIX x86_64 Kernel Module 367.124 Wed Jan 3 15:55:25 PST 2018
2018-04-26T20:32:08.111Z cpu11:65979)DMA: 945: Protecting DMA engine 'NVIDIADmaEngine'. Putting parent PCI device 0000:07:00.0 in IOMMU domain 0x4302ebbbadd0.
2018-04-26T20:32:08.111Z cpu11:65979)DMA: 646: DMA Engine 'NVIDIADmaEngine' created using mapper 'DMAIOMMU'.
2018-04-26T20:32:08.111Z cpu11:65979)NVRM: Can't find an IRQ for NVIDIA device!
2018-04-26T20:32:10.109Z cpu11:65979)DMA: 691: DMA Engine 'NVIDIADmaEngine' destroyed.
2018-04-26T20:32:10.118Z cpu11:65979)DMA: 945: Protecting DMA engine 'NVIDIADmaEngine'. Putting parent PCI device 0000:08:00.0 in IOMMU domain 0x4302ebbbadd0.
2018-04-26T20:32:10.118Z cpu11:65979)DMA: 646: DMA Engine 'NVIDIADmaEngine' created using mapper 'DMAIOMMU'.
2018-04-26T20:32:10.118Z cpu11:65979)NVRM: Can't find an IRQ for NVIDIA device!
2018-04-26T20:32:10.913Z cpu11:65979)DMA: 691: DMA Engine 'NVIDIADmaEngine' destroyed.
2018-04-26T20:32:13.009Z cpu7:66069)DMA: 945: Protecting DMA engine 'NVIDIADmaEngine'. Putting parent PCI device 0000:07:00.0 in IOMMU domain 0x4302ebbbadd0.
2018-04-26T20:32:13.009Z cpu7:66069)DMA: 646: DMA Engine 'NVIDIADmaEngine' created using mapper 'DMAIOMMU'.
2018-04-26T20:32:13.009Z cpu7:66069)NVRM: Can't find an IRQ for NVIDIA device!
2018-04-26T20:32:13.269Z cpu7:66069)DMA: 691: DMA Engine 'NVIDIADmaEngine' destroyed.
2018-04-26T20:32:13.275Z cpu7:66069)DMA: 945: Protecting DMA engine 'NVIDIADmaEngine'. Putting parent PCI device 0000:08:00.0 in IOMMU domain 0x4302ebbbadd0.
2018-04-26T20:32:13.275Z cpu7:66069)DMA: 646: DMA Engine 'NVIDIADmaEngine' created using mapper 'DMAIOMMU'.
2018-04-26T20:32:13.275Z cpu7:66069)NVRM: Can't find an IRQ for NVIDIA device!
2018-04-26T20:32:13.921Z cpu7:66069)DMA: 691: DMA Engine 'NVIDIADmaEngine' destroyed.
2018-04-26T20:32:14.343Z cpu7:66069)DMA: 945: Protecting DMA engine 'NVIDIADmaEngine'. Putting parent PCI device 0000:07:00.0 in IOMMU domain 0x4302ebbbadd0.
2018-04-26T20:32:14.343Z cpu7:66069)DMA: 646: DMA Engine 'NVIDIADmaEngine' created using mapper 'DMAIOMMU'.
2018-04-26T20:32:14.343Z cpu7:66069)NVRM: Can't find an IRQ for NVIDIA device!
2018-04-26T20:32:15.115Z cpu7:66069)DMA: 691: DMA Engine 'NVIDIADmaEngine' destroyed.
2018-04-26T20:32:15.121Z cpu7:66069)DMA: 945: Protecting DMA engine 'NVIDIADmaEngine'. Putting parent PCI device 0000:08:00.0 in IOMMU domain 0x4302ebbbadd0.
2018-04-26T20:32:15.121Z cpu7:66069)DMA: 646: DMA Engine 'NVIDIADmaEngine' created using mapper 'DMAIOMMU'.
2018-04-26T20:32:15.121Z cpu7:66069)NVRM: Can't find an IRQ for NVIDIA device!
2018-04-26T20:32:15.917Z cpu7:66069)DMA: 691: DMA Engine 'NVIDIADmaEngine' destroyed.
2018-04-26T20:32:15.929Z cpu7:66069)DMA: 945: Protecting DMA engine 'NVIDIADmaEngine'. Putting parent PCI device 0000:07:00.0 in IOMMU domain 0x4302ebbbadd0.
2018-04-26T20:32:15.929Z cpu7:66069)DMA: 646: DMA Engine 'NVIDIADmaEngine' created using mapper 'DMAIOMMU'.
2018-04-26T20:32:15.929Z cpu7:66069)NVRM: Can't find an IRQ for NVIDIA device!
2018-04-26T20:32:16.713Z cpu7:66069)DMA: 691: DMA Engine 'NVIDIADmaEngine' destroyed.
2018-04-26T20:32:16.722Z cpu7:66069)DMA: 945: Protecting DMA engine 'NVIDIADmaEngine'. Putting parent PCI device 0000:08:00.0 in IOMMU domain 0x4302ebbbadd0.
2018-04-26T20:32:16.722Z cpu7:66069)DMA: 646: DMA Engine 'NVIDIADmaEngine' created using mapper 'DMAIOMMU'.
2018-04-26T20:32:16.722Z cpu7:66069)NVRM: Can't find an IRQ for NVIDIA device!
2018-04-26T20:32:17.519Z cpu7:66069)DMA: 691: DMA Engine 'NVIDIADmaEngine' destroyed.
2018-04-26T20:32:17.532Z cpu7:66069)DMA: 945: Protecting DMA engine 'NVIDIADmaEngine'. Putting parent PCI device 0000:07:00.0 in IOMMU domain 0x4302ebbbadd0.
2018-04-26T20:32:17.532Z cpu7:66069)DMA: 646: DMA Engine 'NVIDIADmaEngine' created using mapper 'DMAIOMMU'.
2018-04-26T20:32:17.532Z cpu7:66069)NVRM: Can't find an IRQ for NVIDIA device!
2018-04-26T20:32:18.316Z cpu7:66069)DMA: 691: DMA Engine 'NVIDIADmaEngine' destroyed.
2018-04-26T20:32:18.321Z cpu7:66069)DMA: 945: Protecting DMA engine 'NVIDIADmaEngine'. Putting parent PCI device 0000:08:00.0 in IOMMU domain 0x4302ebbbadd0.
2018-04-26T20:32:18.321Z cpu7:66069)DMA: 646: DMA Engine 'NVIDIADmaEngine' created using mapper 'DMAIOMMU'.
2018-04-26T20:32:18.321Z cpu7:66069)NVRM: Can't find an IRQ for NVIDIA device!
2018-04-26T20:32:19.119Z cpu7:66069)DMA: 691: DMA Engine 'NVIDIADmaEngine' destroyed.
2018-04-26T20:32:20.157Z cpu7:66069)DMA: 945: Protecting DMA engine 'NVIDIADmaEngine'. Putting parent PCI device 0000:07:00.0 in IOMMU domain 0x4302ebbbadd0.
2018-04-26T20:32:20.157Z cpu7:66069)DMA: 646: DMA Engine 'NVIDIADmaEngine' created using mapper 'DMAIOMMU'.
2018-04-26T20:32:20.157Z cpu7:66069)NVRM: Can't find an IRQ for NVIDIA device!
2018-04-26T20:32:20.218Z cpu7:66069)DMA: 691: DMA Engine 'NVIDIADmaEngine' destroyed.
2018-04-26T20:32:20.223Z cpu7:66069)DMA: 945: Protecting DMA engine 'NVIDIADmaEngine'. Putting parent PCI device 0000:08:00.0 in IOMMU domain 0x4302ebbbadd0.
2018-04-26T20:32:20.223Z cpu7:66069)DMA: 646: DMA Engine 'NVIDIADmaEngine' created using mapper 'DMAIOMMU'.
2018-04-26T20:32:20.223Z cpu7:66069)NVRM: Can't find an IRQ for NVIDIA device!
2018-04-26T20:32:20.920Z cpu7:66069)DMA: 691: DMA Engine 'NVIDIADmaEngine' destroyed.
2018-04-26T20:32:48.658Z cpu11:67584)ALERT: NVIDIA: module load failed during VIB install/upgrade.
2018-04-26T20:32:48.677Z cpu14:67585)NVIDIA: Starting vGPU Services.
2018-04-26T20:32:48.715Z cpu7:67588)NVIDIA: Starting Xorg service.
"
Can somebody help, please?
Thanks.

#1
Posted 04/27/2018 02:49 AM   
http://nvidia.custhelp.com/app/answers/detail/a_id/4119/~/incorrect-bios-settings-on-a-server-when-used-with-a-hypervisor-can-cause-mmio
Hi sschaber, I did set the BIOS to disable memory mapping above 4 GB. Still the same.

#3
Posted 04/30/2018 01:57 AM   
"2018-04-26T20:32:08.111Z cpu11:65979)NVRM: Can't find an IRQ for NVIDIA device!"

A few things you can check in the BIOS and in ESXi:
- Try "dmesg | grep NVRM" to see only the messages from the NVIDIA kernel module.
- List the hardware resource assignments with "esxcfg-info -a | less", search for the NVIDIA device, and look at "Bar Info" (check the "Address" assignments for the I/O and memory BARs) and "Vector"/"Old IRQ" (the interrupt assignment). (I cannot give you the expected output because I am not using ESXi with vGPU.)
- VT-d/"VMDirectPath" must be enabled, working, and certified for your motherboard. (It is unclear whether the problem starts when VT-d is enabled, because paravirtualized/mediated vGPU needs VT-d support too; VT-d includes "Interrupt Remapping".)
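
To see at a glance which PCI functions are hit by the IRQ error, the log from the first post can be summarized with a small script. This is only a rough sketch: the function name `count_irq_failures` is made up for illustration, and it assumes that each "Can't find an IRQ" line belongs to the PCI device named in the closest preceding "Putting parent PCI device" line, which is how the log above appears to be ordered.

```python
import re
from collections import Counter

# Matches the vmkernel line that names the PCI device being set up.
DEVICE_RE = re.compile(r"Putting parent PCI device ([0-9a-fA-F:.]+) in IOMMU domain")
IRQ_FAIL = "NVRM: Can't find an IRQ for NVIDIA device!"


def count_irq_failures(log_text):
    """Count IRQ failures per PCI address in `dmesg | grep NVIDIA` output.

    Assumption: an NVRM failure line refers to the device named in the
    most recent "Putting parent PCI device" line before it.
    """
    failures = Counter()
    current = None  # PCI address from the most recent device line
    for line in log_text.splitlines():
        match = DEVICE_RE.search(line)
        if match:
            current = match.group(1)
        elif IRQ_FAIL in line and current is not None:
            failures[current] += 1
    return failures


# Four lines copied from the log in the first post (two probe attempts).
SAMPLE = """\
2018-04-26T20:32:08.111Z cpu11:65979)DMA: 945: Protecting DMA engine 'NVIDIADmaEngine'. Putting parent PCI device 0000:07:00.0 in IOMMU domain 0x4302ebbbadd0.
2018-04-26T20:32:08.111Z cpu11:65979)NVRM: Can't find an IRQ for NVIDIA device!
2018-04-26T20:32:10.118Z cpu11:65979)DMA: 945: Protecting DMA engine 'NVIDIADmaEngine'. Putting parent PCI device 0000:08:00.0 in IOMMU domain 0x4302ebbbadd0.
2018-04-26T20:32:10.118Z cpu11:65979)NVRM: Can't find an IRQ for NVIDIA device!
"""

if __name__ == "__main__":
    for address, count in sorted(count_irq_failures(SAMPLE).items()):
        print(f"{address}: {count} IRQ failure(s)")
```

Both 0000:07:00.0 and 0000:08:00.0 show up in the sample, which fits the GRID K2 carrying two GPUs: neither one gets an interrupt, so the problem looks board-wide rather than specific to one GPU.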

You can also look through the general search hits at https://www.google.com/search?q=NVRM%3A+Can%27t+find+an+IRQ+for+NVIDIA+device and/or update your BIOS (some of the hits point to an ACPI error in certain BIOSes), for example:
[    2.172069] NVRM: Can't find an IRQ for your NVIDIA card!
[    2.172070] NVRM: Please check your BIOS settings.
[    2.172070] NVRM: [Plug & Play OS] should be set to NO
[    2.172071] NVRM: [Assign IRQ to VGA] should be set to YES

#4
Posted 04/30/2018 07:27 AM   