NVIDIA VMware vSphere-6.5
We upgraded an ESXi host to 6.5 and the VIB to the supported NVIDIA-kepler-vSphere-6.5-367.64-369.71 downloaded from NVIDIA's website, but the base machine will not start with the GPU (PCI shared device) enabled; it complains that there is not enough GPU memory. Running 'nvidia-smi' on the host shows the cards:

nvidia-smi
Thu Nov 24 00:04:52 2016
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.64                 Driver Version: 367.64                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GRID K2             On   | 0000:05:00.0     Off |                  Off |
| N/A   25C    P8    28W / 117W |     18MiB /  4095MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GRID K2             On   | 0000:06:00.0     Off |                  Off |
| N/A   23C    P8    27W / 117W |     18MiB /  4095MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GRID K2             On   | 0000:84:00.0     Off |                  Off |
| N/A   26C    P8    28W / 117W |     18MiB /  4095MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GRID K2             On   | 0000:85:00.0     Off |                  Off |
| N/A   24C    P8    27W / 117W |     18MiB /  4095MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0     68574    G   Xorg                                             7MiB |
|    1     68600    G   Xorg                                             7MiB |
|    2     68641    G   Xorg                                             7MiB |
|    3     68660    G   Xorg                                             7MiB |
+-----------------------------------------------------------------------------+
[root@k2-3:~]

Um, Xorg? The older ESXi host doesn't show that. Output from 'gpuvm':

gpuvm
Xserver unix:0, PCI ID 0:5:0:0, vSGA mode, GPU maximum memory 4173824KB
GPU memory left 4173824KB.
Xserver unix:1, PCI ID 0:6:0:0, vSGA mode, GPU maximum memory 4173824KB
GPU memory left 4173824KB.
Xserver unix:2, PCI ID 0:132:0:0, vSGA mode, GPU maximum memory 4173824KB
GPU memory left 4173824KB.
Xserver unix:3, PCI ID 0:133:0:0, vSGA mode, GPU maximum memory 4173824KB
GPU memory left 4173824KB.

To me, this suggests the VIB is not correct, but it is the only one available on NVIDIA's website. Downgrading to NVIDIA-GRID-vGPU-kepler-vSphere-6.0-367.64-369.71 on the ESXi host allows the base machine to start with the GPU enabled, but View won't compose a pool because it does not recognize the older GPU.

Anyway, has anyone else upgraded to vSphere 6.5 and run into this issue, or are we missing something simple?

Thanks.

#1
Posted 11/24/2016 01:07 AM   
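The Xorg entries in the nvidia-smi process table above line up with the four X servers that gpuvm reports in vSGA mode. As a minimal sketch, the Python below lists which GPUs have an Xorg process attached. The function name `xorg_gpus` and the regular expression are illustrative assumptions, and the row layout is taken only from the listing quoted in this thread, so other driver versions may need a different pattern.

```python
import re

# Matches a process row such as "| 0 68574 G Xorg 7MiB |".
# Assumption: columns are GPU index, PID, type, process name, memory in MiB,
# exactly as in the listing quoted in this thread.
PROCESS_ROW = re.compile(
    r"^\|\s*(\d+)\s+(\d+)\s+(\S+)\s+(.+?)\s+(\d+)MiB\s*\|\s*$"
)


def xorg_gpus(nvidia_smi_text):
    """Return the sorted GPU indices that have an Xorg process attached."""
    gpus = set()
    for line in nvidia_smi_text.splitlines():
        match = PROCESS_ROW.match(line.strip())
        if match and match.group(4) == "Xorg":
            gpus.add(int(match.group(1)))
    return sorted(gpus)


# Process rows copied from the first post.
SAMPLE = """\
| 0 68574 G Xorg 7MiB |
| 1 68600 G Xorg 7MiB |
| 2 68641 G Xorg 7MiB |
| 3 68660 G Xorg 7MiB |
"""

if __name__ == "__main__":
    print(xorg_gpus(SAMPLE))
```

On a host, the text could come from the captured output of `nvidia-smi` instead of `SAMPLE`. An empty result only means no Xorg row was matched; it is not proof of the host graphics mode.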
Never mind. The host graphics setting on each ESXi host that had been updated to 6.5 had reverted to "Shared" instead of "Shared Direct". After setting the host to "Shared Direct" and restarting xorg, all is well.

#2
Posted 11/24/2016 03:23 AM   
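One way to confirm whether a host is still in the reverted state is to look for "vSGA mode" entries in the gpuvm output. The following is a rough sketch that parses the two-line entries exactly as they were printed in the first post. The names `parse_gpuvm` and `gpus_in_vsga` are hypothetical, and what gpuvm prints once the host is in "Shared Direct" is not shown in this thread, so the code only reports entries that explicitly say vSGA.

```python
import re

# One entry per X server, as printed in this thread:
#   Xserver unix:0, PCI ID 0:5:0:0, vSGA mode, GPU maximum memory 4173824KB
#   GPU memory left 4173824KB.
ENTRY = re.compile(
    r"Xserver\s+(?P<display>[^,\s]+),\s*PCI ID\s+(?P<pci>[\d:]+),\s*"
    r"(?P<mode>\w+) mode,\s*GPU maximum memory\s+(?P<max_kb>\d+)KB\s+"
    r"GPU memory left\s+(?P<left_kb>\d+)KB"
)


def parse_gpuvm(text):
    """Return one dict per X server entry found in gpuvm output."""
    entries = []
    for m in ENTRY.finditer(text):
        entries.append({
            "display": m.group("display"),
            "pci": m.group("pci"),
            "mode": m.group("mode"),
            "max_kb": int(m.group("max_kb")),
            "left_kb": int(m.group("left_kb")),
        })
    return entries


def gpus_in_vsga(text):
    """Return the PCI IDs of GPUs that gpuvm reports in vSGA mode."""
    return [e["pci"] for e in parse_gpuvm(text) if e["mode"] == "vSGA"]


# Output copied from the first post.
SAMPLE = """\
Xserver unix:0, PCI ID 0:5:0:0, vSGA mode, GPU maximum memory 4173824KB
GPU memory left 4173824KB.
Xserver unix:1, PCI ID 0:6:0:0, vSGA mode, GPU maximum memory 4173824KB
GPU memory left 4173824KB.
Xserver unix:2, PCI ID 0:132:0:0, vSGA mode, GPU maximum memory 4173824KB
GPU memory left 4173824KB.
Xserver unix:3, PCI ID 0:133:0:0, vSGA mode, GPU maximum memory 4173824KB
GPU memory left 4173824KB.
"""

if __name__ == "__main__":
    for pci in gpus_in_vsga(SAMPLE):
        print("still in vSGA mode:", pci)
```

The bus numbers 132 and 133 here appear to be the decimal form of the 84 and 85 (hex) bus IDs that nvidia-smi shows for the same cards.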
This is exactly the problem I was running into. Thanks for sharing the solution.

#3
Posted 11/29/2016 03:49 AM   
vSphere 6.5 and November 2016 GRID drivers (both Kepler and Maxwell) require changing the default GPU mode from “Shared” (vSGA) to “Shared Direct” (vGPU) via vCenter to enable vGPU support for VMs.

If this is not changed, VMs with a vGPU profile assigned will fail to start with the standard “graphics resources not available” error.

For those that may be starting to evaluate the November 2016 GRID drivers with vSphere 6.5, an additional step to configure the GPU mode is required.

Procedure:
- Select the ESXi 6.5 host in vCenter 6.5, then select the “Configure” tab and scroll down to “Graphics”.
- Highlight each GPU that you want to use for vGPU, then select the edit icon to modify the graphics device settings.
- Select “Shared Direct” for vGPU.
- Reboot the host for the changes to take effect. After that, your vGPU VMs should start normally.

This new requirement and procedure will be added to the documentation shortly. Thank you for reporting this issue.

Jeremy Main
Lead Solution Architect - NVIDIA GRID
GRID Resources : http://www.nvidia.com/object/grid-enterprise-resources.html
GPUProfiler : http://gpuprofiler.com/

#4
Posted 11/30/2016 02:34 PM   
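The vCenter procedure above can probably also be scripted from the ESXi shell. The sketch below is an assumption and is not quoted from anyone in this thread: it presumes that the `esxcli graphics host` namespace on ESXi 6.5 accepts `--default-type SharedPassthru` as the command-line name for "Shared Direct", and that `/etc/init.d/xorg restart` is the xorg restart mentioned earlier in the thread. Check `esxcli graphics host set --help` on your build before relying on it. The function name `apply_shared_direct` and its parameters are hypothetical, and by default it only prints the commands.

```python
import subprocess

# Assumption: these are the ESXi 6.5 command-line counterparts of the
# vCenter steps. They are not taken from the thread; verify each one with
# "esxcli graphics host set --help" before running anything.
COMMANDS = [
    ["esxcli", "graphics", "host", "set", "--default-type", "SharedPassthru"],
    ["/etc/init.d/xorg", "restart"],
    ["esxcli", "graphics", "host", "get"],
]


def apply_shared_direct(dry_run=True, runner=subprocess.run):
    """Print each command; execute it only when dry_run is False."""
    for argv in COMMANDS:
        print(" ".join(argv))
        if not dry_run:
            # check=True raises CalledProcessError if a command fails.
            runner(argv, check=True)
    return COMMANDS


if __name__ == "__main__":
    # Dry run: shows what would be executed without touching the host.
    apply_shared_direct(dry_run=True)
```

The thread gives two accounts of what is needed afterwards: the procedure above asks for a host reboot, while other posters report that restarting xorg was enough. The sketch covers only the xorg restart.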
I found this and configured my server this way. It caused issues for all my VMs that are set to use VMware SVGA. I don't need them to use the GPU at all; I only wanted to enable it for some.

Is this the new way we need to configure it, with all VMs using the GPU regardless of whether they need it?

This happened to VMs that did not have the Shared PCI device added with a profile.

#5
Posted 12/09/2016 12:47 AM   
Hi,

Thanks a lot for this info. I was working with the NVIDIA support team on SR 161202-000639 to no avail until I came across this community.

Once again, thanks a lot, Jeremy Main.

#6
Posted 12/13/2016 07:29 PM   
This worked perfectly. Make sure to restart xorg as mentioned by Yem above. I have edited my comments per @Sschaber below.

#7
Posted 03/06/2017 11:57 PM   
@Taskman: There are different versions of the vGPU manager. Our documentation is fully correct. We reference the Maxwell-based vGPU manager (>GRID 2.0), but the Kepler one is still available for public download, as that version is for K1/K2 only and doesn't require a GRID license.

Regards

Simon

#8
Posted 03/08/2017 07:33 AM   
@Jmain: I tried to follow your procedure to change the GPU from "Shared" to "Shared Direct", but I don't see an Edit option under the Graphics settings for my ESXi host. I am running vSphere 6.0.0. Where else can I change the graphics settings?

#9
Posted 05/26/2017 10:52 PM   
Hi, this is an option only for vSphere 6.5. You won't find it on 6.0!!!!

#10
Posted 05/27/2017 08:29 AM   
Hi - Any thoughts on how to fix the 'GPU memory' error if we are not on vSphere 6.5? I am on vSphere 6.0.0 rev 3018524. I just upgraded some ESXi hosts to 6.0 Patch 5 (i.e. rev 5572656), and now I cannot power on any VMs with a K2 card. Do I need to force them to vGPU mode? I'm trying to figure out how to do that with the revs I'm at. Any ideas? Thanks.

#11
Posted 06/27/2017 08:35 PM   
@bobtheslob: We have the same issue after upgrading to ESXi 6.0 Patch (Build 5572656). I've opened a case with VMware. I will let you know if I have any news.

#12
Posted 06/29/2017 12:38 PM   
Friends, we have the same problem with the K1 card. I worked around it temporarily by using Shared Direct. We are waiting for a fix.

#13
Posted 06/29/2017 02:10 PM   
Hi - I've received an answer from VMware; they sent me a link to kb2150498: https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2150498
I followed the instructions and copied the attached xorg file. After that I was able to start the service and the VMs again without changing the graphics settings. It seems that there is no other fix for this issue on ESXi 6.0 Patch 5 (Build 5572656) with vCenter 6.

#14
Posted 06/30/2017 11:21 AM   
@Neo2k4: Thanks for the link to the article. I will track the decision

#15
Posted 07/03/2017 07:07 AM   