NVIDIA
vGPU Utilization Per VM
@Taskman, I think the K1 may be underpowered for your purposes. The M60 solution with passthrough licensing is quite affordable, will give you a lot more GPU power, and will scale much better. That's the route we're in the process of implementing.

-=Tobias

#16
Posted 05/06/2016 04:10 PM   
Taskman,

any update on this 20-25% GPU utilization issue when initiating a Horizon PCoIP session?

This is definitely a contributing factor to my lack of GRID performance.

Pascal

#17
Posted 06/09/2016 02:20 PM   
Hi Pascal,

It's an issue identified in the VMware stack (i.e. not one NVIDIA can resolve), so you need to raise a ticket with them and request a fix (although I don't believe one has been released yet). We are tracking it and passing on cases to VMware. We have a KB article in draft:


Symptom / Error
High GPU load is seen with vSphere/View deployments and NVIDIA GRID vGPU; this may be seen even when sessions/VMs are idle. nvidia-smi shows high GPU utilization for vGPU VMs with an active Horizon session. vGPU VMs with an active Horizon connection utilize a high percentage of the GPU on the ESXi host. The GPU utilization remains high for the duration of the Horizon session even if there are no active applications running on the VM. NVIDIA Ref. #1735009
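
To check for this symptom on a host, something like the following Python sketch might help. It is only an illustration, not part of the KB draft: it assumes Python 3 and nvidia-smi are available on the machine being checked, the function names and the 20% threshold are made up for this example, and it reports utilization per physical GPU rather than per VM. It uses the standard nvidia-smi query `--query-gpu=index,utilization.gpu --format=csv,noheader,nounits`.

```python
import subprocess
import time

# Standard nvidia-smi query: one "index, utilization" line per physical GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_utilization(csv_text):
    """Turn 'index, percent' lines into {gpu_index: percent}.

    Lines whose values are not plain integers (for example
    '[Not Supported]') are skipped rather than guessed at.
    """
    result = {}
    for line in csv_text.strip().splitlines():
        parts = line.split(",", 1)
        if len(parts) != 2:
            continue
        try:
            result[int(parts[0].strip())] = int(parts[1].strip())
        except ValueError:
            continue
    return result


def busy_gpus(samples, threshold=20):
    """Return GPUs at or above `threshold` percent in every sample.

    `threshold` is a hypothetical cut-off chosen for illustration.
    """
    if not samples:
        return []
    common = set(samples[0])
    for sample in samples[1:]:
        common &= set(sample)
    return sorted(
        gpu for gpu in common
        if all(sample[gpu] >= threshold for sample in samples)
    )


def sample_host(count=5, interval=2.0):
    """Collect `count` utilization samples, `interval` seconds apart."""
    samples = []
    for _ in range(count):
        output = subprocess.run(
            QUERY, capture_output=True, text=True, check=True
        ).stdout
        samples.append(parse_utilization(output))
        time.sleep(interval)
    return samples
```

Calling `busy_gpus(sample_host())` while the Horizon sessions are connected but the desktops are idle lists the physical GPUs that stayed at or above the threshold for the whole sampling window, which is the pattern described in the symptom.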

Workaround / Solution
There is currently no workaround. Affected customers need to raise a support case with VMware, who hope to release a fix in a future release of their product. The issue is within the Horizon View product and as such this is not an issue NVIDIA can resolve.

Affected Products
VMware Horizon View 7.0 and earlier when using NVIDIA GRID vGPU and NVIDIA GRID Cards (K1, K2, M60, M6, M10).

Citrix Products
This issue only affects VMware Horizon View and related Blast Extreme and PCoIP protocols. Citrix XenDesktop/XenApp and HDX/ICA are unaffected by this issue.

References
This issue is documented in the latest release notes (Version 361.40 / 362.13) for NVIDIA GRID vGPU for VMware:

#18
Posted 06/09/2016 02:42 PM   
Many thanks, Rachel.

I have just raised a ticket with VMware.

Thanks again.

Pascal

#19
Posted 06/09/2016 03:24 PM   
Re: Symptom / Error
High GPU load is seen with vSphere/View deployments and NVIDIA GRID vGPU; this may be seen even when sessions/VMs are idle. nvidia-smi shows high GPU utilization for vGPU VMs with an active Horizon session. vGPU VMs with an active Horizon connection utilize a high percentage of the GPU on the ESXi host. The GPU utilization remains high for the duration of the Horizon session even if there are no active applications running on the VM. NVIDIA Ref. #1735009


VMware have released a fix for the Blast Extreme protocol in the VMware Horizon 7.0.1 update.

Users with issues on PCoIP need to continue to press VMware for a fix for that protocol.

Best wishes,
Rachel

#20
Posted 06/28/2016 11:01 AM   
Hi Rachel,

Indeed, the 7.0.1 update did not resolve the issue for PCoIP sessions. What is even more confusing is that when I raised a ticket with VMware, they did not have anything in their records that touches this subject.

They would like a contact person at NVIDIA to tell them more about the problem so that they can properly send the issue to their engineering department. I find their answer very strange. Would you be so kind as to tell me who you were in contact with at VMware when you passed along KB 1735009 to them? Somebody must know about this problem, since they fixed it for the Blast protocol...

Thanks Rachel,

Pascal

#21
Posted 08/17/2016 07:54 PM   
I have sent you a contact via private message.

How strange...

Rachel

#22
Posted 08/19/2016 12:33 PM   
"I want to know if it is possible to see the vGPU utilization per VM."

You asked, so we have now added it :-D


https://blogs.nvidia.com/blog/2016/08/24/nvidia-grid-monitoring/


There's a webinar at 4pm UK/8am PDT today where you can ask questions:

http://info.nvidianews.com/aug24_webinar.html

#23
Posted 08/24/2016 01:24 PM   
Great news! It is unfortunate that these changes were not implemented for the GRID version that supports K2 boards.

Maybe next time, or is this a message advising me to jump to the Tesla M60 with GRID 2.0? :-)

#24
Posted 08/24/2016 04:30 PM   
Historically, most GPU vendors have sold GPUs as pure hardware. Whilst this meant no ongoing software licensing, it is a model without support for the software, where the functionality of the card is fixed by the physical hardware specification purchased. The Kepler K1/K2 cards were sold under such a model. Recognizing that this model does not fit customer expectations for 24/7 support and software enhancements, newer cards are developed and sold with a software component that entitles the customer to new software features on their existing hardware. As such, the new monitoring functionality has only been developed for newer products sold under a software model, such as the M10/M6/M60.

NVIDIA continues to provide the features and functionality that were available when you bought your cards. NVIDIA _is_ committed to long term availability and continual regression testing. The Kepler GRID cards are coming towards their end of availability. NVIDIA is fully committed to supporting and maintaining the functionality that was available on the Kepler generation when customers bought them.

The GRID software platform allows users to manage environments with hosts that have Kepler cards alongside hosts with Maxwell generation cards, so users can expand to use new features on newer generations.

In a virtualized VDI environment, the server and hypervisor certification is driven by the hypervisor vendors and server OEMs. Server OEMs prefer to support limited combinations to ensure full test coverage, and as such rarely add support for newer GPUs to older servers. Likewise, hypervisor vendors rarely add new features to older stable and tested platforms, choosing to focus testing on new releases to ensure quality.

With so many technologies interacting, every variation adds to the test matrix. At some point organisations have to balance adding new architectures and features against testing them fully, and so expand the test matrix on newer technologies rather than older versions.

The software licensing of GRID 2.0 and later products is invested in the development of supported and tested software features such as the new monitoring functionality.

Similarly, newer features such as VMware Blast Extreme support and Linux VDA support have been focused on the software-supported cards M60/M10/M6.

#25
Posted 08/25/2016 12:58 PM   