Nvidia VMware vSphere-6.7
Hi,

I have installed VMware-VMvisor-Installer-6.7.0.update02-13006603.x86_64 on my server, together with the supported NVIDIA-VMware_ESXi_6.7_Host_Driver-430.27-1OEM.670.0.0.8169922.x86_64.vib downloaded from the NVIDIA Enterprise website, but the virtual machine will not start with vGPU.

When running 'nvidia-smi' on the host, it shows the cards:


[root@localhost:~] nvidia-smi
Tue Jul  2 09:35:54 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.27       Driver Version: 430.27       CUDA Version: N/A      |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro RTX 6000     On   | 00000000:1A:00.0 Off |                  Off |
| 34%   36C    P8   146W / 260W |    159MiB / 24575MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Quadro RTX 6000     On   | 00000000:1B:00.0 Off |                  Off |
| 34%   37C    P8   150W / 260W |    159MiB / 24575MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Quadro RTX 6000     On   | 00000000:60:00.0 Off |                  Off |
| 33%   36C    P8   137W / 260W |    159MiB / 24575MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Quadro RTX 6000     On   | 00000000:61:00.0 Off |                  Off |
| 34%   37C    P8   140W / 260W |    159MiB / 24575MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   4  Quadro RTX 6000     On   | 00000000:B1:00.0 Off |                  Off |
| 34%   37C    P8   147W / 260W |    159MiB / 24575MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   5  Quadro RTX 6000     On   | 00000000:B2:00.0 Off |                  Off |
| 34%   37C    P8   146W / 260W |    159MiB / 24575MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   6  Quadro RTX 6000     On   | 00000000:DA:00.0 Off |                  Off |
| 33%   31C    P8   140W / 260W |    159MiB / 24575MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   7  Quadro RTX 6000     On   | 00000000:DB:00.0 Off |                  Off |
| 33%   35C    P8   144W / 260W |    159MiB / 24575MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+


I changed the default GPU mode from “Shared” (vSGA) to “Shared Direct” (vGPU) via vCenter to enable vGPU support for VMs.
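
For reference, I believe the same default can also be set from the ESXi shell (esxcli graphics is the CLI equivalent of the vCenter setting, and Xorg needs a restart afterwards):

# Set the default graphics type to Shared Direct, then restart Xorg
esxcli graphics host set --default-type SharedPassthru
/etc/init.d/xorg restart

# Verify the setting
esxcli graphics host get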


Here is the error message:

Failed to start the virtual machine.
Module 'DevicePowerOn' power on failed.
Could not initialize plugin '/usr/lib64/vmware/plugin/libnvidia-vgx.so' for vGPU 'grid_rtx6000-24q'
passthrough device 'pciPassthru0' vGPU 'grid_rtx6000-24q' disallowed by vmkernel


Thanks for your help

#1
Posted 07/02/2019 10:11 AM   
Hi

Which server chassis are you running?

What happens when you try a smaller profile? Try a 1Q and see if the VM powers on.
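
If your host driver build includes the vgpu subcommand (recent vGPU host drivers do), you can also ask it which profiles the host can actually offer; a quick sketch:

# List the vGPU types the installed host driver supports
nvidia-smi vgpu -s

# List the vGPU types that can currently be created on each physical GPU
nvidia-smi vgpu -c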

Also ...

Have you disabled ECC? ...

Check it by running: nvidia-smi -q

Disable it by running: nvidia-smi -e 0

You'll need to reboot the chassis after running this command.
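
For example, to check ECC across all eight cards at once (standard nvidia-smi query fields):

# Show the current and pending ECC mode per GPU
nvidia-smi --query-gpu=index,name,ecc.mode.current,ecc.mode.pending --format=csv

# Disable ECC on all GPUs (takes effect after the reboot)
nvidia-smi -e 0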

You can also try adding the following to the VM's "Advanced Configuration":

pciPassthru.use64bitMMIO = "TRUE"

pciPassthru.64bitMMIOSizeGB = "64"
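
In the VM's .vmx file those two entries should end up looking roughly like this; the profile line is added automatically when you attach the shared PCI device and is shown here only for context:

pciPassthru.use64bitMMIO = "TRUE"
pciPassthru.64bitMMIOSizeGB = "64"
pciPassthru0.vgpu = "grid_rtx6000-24q"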


Regards

Ben

#2
Posted 07/03/2019 07:44 AM   
Hi

My server is a TYAN Model: B7109F77DV10E4HR-2T-N

I checked with 1Q and it's the same error.
I have disabled ECC.
And I have added the following to the VM's "Advanced Configuration":

pciPassthru.use64bitMMIO = "TRUE"
pciPassthru.64bitMMIOSizeGB = "64"

Always the same error.

Thanks

Hyssam

#3
Posted 07/03/2019 01:09 PM   
Hi

Just checking ... but when you added those entries, I assume you added the values without the quotation marks on each end?

When you configured the VM, did you select the option on the VM to "Reserve all guest memory" ?

Also make sure that the memory allocated to the VM vs the "reserved memory" values are the same. If you've changed the amount of memory allocated to the VM, you need to un-check, then re-check the "reserved memory" option, as it doesn't automatically update and the VM will then fail to power on.
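
If you want to double-check outside the UI, the reservation normally shows up in the VM's .vmx along these lines, where memSize is the allocated memory in MB, sched.mem.min is the reservation (it must match memSize for vGPU), and sched.mem.pin is set by "Reserve all guest memory (All locked)". A sketch for an example 32 GB VM; exact keys can vary by vSphere version:

memSize = "32768"
sched.mem.min = "32768"
sched.mem.pin = "TRUE"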

In the past I've found that changing from "Shared" to "Shared Direct" has required a host reboot. Although you can manually restart Xorg, sometimes that isn't enough and a full reboot has made the difference.

Something you could try, just to see whether it's vGPU- or system-related: put one of the GPUs into passthrough mode, replace the vGPU profile on the VM with it, and try powering it on.
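
To find a card's PCI address for the passthrough toggle (Host > Manage > Hardware > PCI Devices in the ESXi Host Client), something like this from the ESXi shell should do; the addresses should match the Bus-Id column in your nvidia-smi output:

lspci | grep -i nvidia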

Regards

Ben

#4
Posted 07/04/2019 09:07 AM   
Hi

I have added the values without the quotation marks.

I have selected the "Reserve all guest memory" option on the VM.

The reserved memory and the allocated memory are the same.

I have manually restarted Xorg and rebooted ESXi.

How do I put one of the GPUs into passthrough mode?

thanks

Hyssam

#5
Posted 07/04/2019 09:54 AM   
Hi

As I can't actually see how you've configured things, I'm not able to suggest anything else.

Can you take a few screenshots of your VM configuration and also GPU configuration from vCenter and post it on here? Maybe that will show a configuration issue somewhere.

Regards

Ben

#6
Posted 07/04/2019 10:32 AM   
VM configuration:
http://zupimages.net/viewer.php?id=19/27/m4go.png
http://zupimages.net/viewer.php?id=19/27/icr1.png
http://zupimages.net/viewer.php?id=19/27/d8kn.png

vCenter configuration:
http://zupimages.net/viewer.php?id=19/27/mag0.png
http://zupimages.net/viewer.php?id=19/27/qqye.png

VMware ESXi configuration:
http://zupimages.net/viewer.php?id=19/27/jwdj.png
http://zupimages.net/viewer.php?id=19/27/vyjs.png
http://zupimages.net/viewer.php?id=19/27/9y9b.png
Thanks for taking the time to do that.

The VM has a lot of vCPUs added, but that won't stop it powering on. Apart from that, the general config looks ok initially with no obvious issues to me.

Have you made any changes in the BIOS? Can you have a look at the MMIO settings and make sure they're configured correctly. I've not used a Tyan before, so am unsure what options are available, but here's a reference on what you should be looking for: https://nvidia.custhelp.com/app/answers/detail/a_id/4119/~/incorrect-bios-settings-on-a-server-when-used-with-a-hypervisor-can-cause-mmio
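
It's also worth seeing what the vmkernel actually complained about at the failed power-on; a rough filter to run on the host (log wording varies by driver and ESXi build):

# Look for vGPU / NVIDIA / MMIO-related messages around the power-on attempt
grep -iE "vgpu|nvrm|mmio" /var/log/vmkernel.log | tail -n 50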


Regards

Ben

#8
Posted 07/04/2019 12:21 PM   
Just to be sure, what license is your vCenter?

#9
Posted 07/04/2019 12:37 PM   
Also, do you have a GPU profile that ends with "A"?
Does this work or does it give the same error?

#10
Posted 07/04/2019 12:46 PM   
You solved my problem: in the BIOS, Intel VT for Directed I/O was disabled.

I activated the option and my virtual machine works.
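
For anyone hitting the same thing: after enabling VT-d and rebooting, a quick sanity check is to look for IOMMU messages in the vmkernel log (no matches usually means the log has rotated or VT-d is still disabled):

grep -i iommu /var/log/vmkernel.log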

Thanks for your help

Hyssam

#11
Posted 07/04/2019 01:07 PM   
No worries, glad it's now working :-)

By the way .... that's a kick-ass configuration! Just out of interest, are you able to say what you plan to use it for?

And FYI, you can put 4 of those RTX 6000s with the 24Q profile inside the same VM if using vGPU, as vGPU now supports Multi-GPU configurations with up to 4 GPUs (but you have to use the top profile, in this case 24Q). But if you switch to Passthrough, then you can put all of them inside a single VM !! ... :-D
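
As a rough illustration, assuming the UI adds one pciPassthruN.vgpu entry per shared PCI device (which is how it has looked in .vmx files I've seen), four 24Q devices would end up as:

pciPassthru0.vgpu = "grid_rtx6000-24q"
pciPassthru1.vgpu = "grid_rtx6000-24q"
pciPassthru2.vgpu = "grid_rtx6000-24q"
pciPassthru3.vgpu = "grid_rtx6000-24q"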

Regards

Ben

#12
Posted 07/04/2019 01:17 PM   
It's for a server certification (3D virtualisation), and all GPUs are allocated inside a single VM.

thank you for all

Hyssam

#13
Posted 07/04/2019 01:39 PM   
Nice!

Thanks for the information

Best of luck with your project!

Regards

Ben

#14
Posted 07/04/2019 02:14 PM   
Hi, I am unable to download the VIB for ESXi 6.7. I have a Tesla V100d.

Can anyone help.

#15
Posted 08/19/2019 01:20 PM   