Dell 730 with M60 - nvidia-smi throwing power error
Hello,

Three brand new Dell R730s with an M60 in each (factory shipped); each of them is throwing the same error with nvidia-smi:

Unable to determine the device handle for GPU 0000:05:00.0: Unable to communicate with GPU because it is insufficiently powered.
This may be because not all required external power cables are attached, or the attached cables are not seated properly.
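
For reference, a quick way to dump the per-GPU power readings from the ESXi shell (assuming the GRID host driver is loaded) is:

nvidia-smi -q -d POWER    # prints power draw and limits per GPU; an insufficiently powered board just returns the error above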


Running the latest ESXi build (3825889) and applied the Dell-recommended BIOS settings as per http://www.nvidia.com/content/grid/pdf/grid-vgpu-deployment-guide.pdf


Installed the latest GRID driver (361.45.09-1OEM.600.0.0.2494585) from the licensing portal.
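
The installed VIB can be double-checked from the ESXi shell, for example:

esxcli software vib list | grep -i nvidia    # should list the NVIDIA vGPU manager VIB at 361.45.09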

Ran gpumodeswitch on all hosts to switch all GPUs to graphics mode and confirmed with lspci -n | grep 10de:
0000:05:00.0 Class 0300: 10de:13f2 [vmgfx0]
0000:06:00.0 Class 0300: 10de:13f2 [vmgfx1]
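
(For completeness, the mode switch itself follows the gpumodeswitch user guide; roughly, from the ESXi shell:)

gpumodeswitch --listgpumodes       # show the current mode of each GPU
gpumodeswitch --gpumode graphics   # switch all GPUs to graphics mode, then reboot the host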


A VM with a vGPU starts on one host but not on another:

Failed to start the virtual machine.
Module DevicePowerOn power on failed.
Could not initialize plugin '/usr/lib64/vmware/plugin/libnvidia-vgx.so' for vGPU 'grid_m60-0b'.
No graphics device is available for vGPU 'grid_m60-0b'.
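
On the host that fails, it's worth confirming that the vGPU manager module is loaded and that both GPUs are visible, for example:

esxcli system module list | grep -i nvidia    # the NVIDIA module should show as loaded and enabled
nvidia-smi                                    # on a healthy host this lists both M60 GPUs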


Maybe it's just a wiring issue; I will have to check that tomorrow when I go on site.

Thanks

#1
Posted 07/19/2016 07:25 PM   
Check they're cabled correctly.

The M60s should have a 300 W power connector, not a standard 8-pin PCIe cable. I'm not sure whether Dell has a specific cable or uses the adapter cable that takes the feed from 2x PCIe cables to deliver 300 W.

That, or underpowered PSUs, is the likely cause.
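
Once the cabling is corrected, a quick sanity check from the ESXi shell (field names as per the nvidia-smi documentation) is:

nvidia-smi --query-gpu=name,pci.bus_id,power.draw,power.limit --format=csv    # both GPUs should report a sensible draw and limit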

Jason Southern, Regional Lead for ProVis Sales - EMEA: NVIDIA Ltd.

#2
Posted 07/19/2016 11:21 PM   
Jason is right: the Dell R720 and R730 require a GPU enablement kit, including power cables: https://qrl.dell.com/Files/en-us/Html/Manuals/R730/GPU%20Card%20Installation%20Guidelines=GUID-C3605F65-C4AE-4BEB-9A32-907A90753B81=1=en-us=.html

I seem to recall it was something called an "8pin to 8pin+6pin" cable, but this is one to go back to Dell on: check that you have the right power supply and cables as per the GPU enablement kit.

#3
Posted 07/21/2016 11:17 AM   
Hi Jason and Rachel,

Thanks for your replies,

It was a wiring issue. After connecting the cables as you instructed, we no longer see the nvidia-smi errors and are now able to use vGPU on all hosts.

They came from Dell direct, so it's strange that one host was wired correctly and the others were not.

Thanks,

J. Wirth

#4
Posted 07/22/2016 12:56 PM   
Hi @Technicalmt, have you got any feedback on the specifics, or a reference SR with Dell? I think I have one with a similar issue at the moment and am going back and forth with Dell support trying to resolve it. Can you please provide an explanation or a photo of how the GPUs are cabled in a working config?

#5
Posted 08/10/2016 12:28 PM   
Hi Everybody,

I have the same issue with brand new Dell R730 servers and vSphere 6.0 U2.
Each server has 2x NVIDIA M60 factory installed, but only one of the servers is able to see and use both GPU boards. The others show the message "Unable to communicate with GPU because it is insufficiently powered."
I checked the BIOS and software components and couldn't find any difference. It's currently not possible for me to check the cables inside the servers. Please help.
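
In case it helps with the comparison, the software side on each host can be captured quickly, for example:

esxcli software vib list | grep -i nvidia                                 # installed vGPU manager version
nvidia-smi --query-gpu=name,vbios_version,driver_version --format=csv     # only answers on hosts where the GPUs are powered correctly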


Regards,
VM_master

#6
Posted 02/06/2017 04:48 PM   