NVIDIA
vGPU Utilization Per VM
I want to know if it is possible to see the vGPU utilization per VM. Not within the OS but from the Grid K1 card. For instance, if GPU 0 is at 80%, it would be great if I knew 45% of that number is coming from a specific VM. I looked through the forums but didn't find any specific posts on this.

I tried the nvidia-smi CLI, for instance 'nvidia-smi -q', but while it showed detailed info on each physical GPU, including utilization, there was no per-VM utilization.

Thanks and I appreciate any help on this request.
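
For reference, the per-physical-GPU figures described above can be pulled in a script-friendly form. This is a minimal sketch, assuming the nvidia-smi build on the host accepts `--query-gpu` with CSV output; the sample text is invented for illustration and was not captured from a real host. It returns one value per physical GPU, which is exactly the limitation in question: there is no per-VM breakdown.

```python
import csv
import io

# Assumed command (run on the hypervisor host):
#   nvidia-smi --query-gpu=index,name,utilization.gpu --format=csv,noheader,nounits
def parse_gpu_utilization(text):
    """Return {gpu_index: utilization_percent} from nvidia-smi CSV output."""
    result = {}
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        index, _name, util = (field.strip() for field in row[:3])
        try:
            result[int(index)] = int(util)
        except ValueError:
            continue  # e.g. a field reported as not supported
    return result

# Hypothetical sample output for one GRID K1 board (four physical GPUs).
SAMPLE = """0, GRID K1, 80
1, GRID K1, 12
2, GRID K1, 0
3, GRID K1, 95
"""

if __name__ == "__main__":
    for gpu, util in parse_gpu_utilization(SAMPLE).items():
        print(f"GPU {gpu}: {util}%")
```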

#1
Posted 05/04/2016 06:15 PM   
Also, the reason for this request is we are finding high GPU utilization with a low number of users in our first vGPU deployment. Here is the setup.

  • Grid K1

  • K120Q vGPU Profile

  • 3-5 users per physical GPU equals constant 95%+ utilization



User behavior is fairly standard with MS office applications and Chrome/IE with HW acceleration on.

#2
Posted 05/04/2016 06:28 PM   
Hi Taskman,

I'm afraid it isn't, but this need has been highlighted to product management. You can monitor the framebuffer for each VM, though, just not the GPU processing.


https://virtuallyvisual.wordpress.com/2015/07/27/limitations-in-monitoring-shared-nvidia-gpu-technologies/
(this is worth reading as it explains how trying to monitor in the VM would be very misleading)


https://virtuallyvisual.wordpress.com/2015/09/09/monitoring-nvidia-gpu-usage-of-the-framebuffer-for-vgpu-and-gpu-passthrough/


The framebuffer usage may give you an idea of which applications are using the GPU, but it has to be checked via the process manager in the VM.

I will pass on this feedback to the product managers.

Best wishes,
Rachel

#3
Posted 05/04/2016 06:31 PM   
BTW

What is the stack, e.g. XenDesktop+vSphere? The K1 is essentially 4xK600 cards, and Chrome (and browsers in general) can be very GPU hungry; see: https://www.virtualexperience.no/2015/11/05/mythbusting-browser-gpu-usage-on-xenapp/

So with 4-5 users per pGPU, each user gets roughly 1/4 of a K600 or less, which is not much if they are watching a lot of video.

The codecs/graphics mode in use will also use CPU and/or GPU (the new Blast Extreme on View uses the GPU).

Best wishes,
Rachel

#4
Posted 05/04/2016 06:38 PM   
Thanks Rachel for the quick response. We are currently using vSphere ESXi 6 with Horizon View 6.2.

I looked at both links and I'm trying to find where they mention how to monitor Frame Buffer. Is that a perfmon counter or CLI command? Thanks!

#5
Posted 05/04/2016 08:00 PM   
JS made an explanation video a year ago saying that it is not possible to monitor vGPU performance.

https://www.youtube.com/watch?v=lW_mt0kKY-w

Some third parties sell vGPU performance monitoring tools, e.g. http://goliathtechnologies.com/software/goliath-nvidia-performance-monitor/ (but it seems to use the same global "NVML/nvidia-smi" performance metric. Updated by RachelBerry.)

Image: http://cdn.goliathtechnologies.com/wp-content/uploads/2015/08/vGPU.png


Is there any reliable API for detailed vGPU performance monitoring today?

It would be very useful if "nvidia-smi pmon" worked in Dom0 (for monitoring per DomU) or in a vGPU DomU (for monitoring per process inside the DomU). This is again about the mysterious GPU time-sliced scheduler configuration and observability (https://gridforums.nvidia.com/default/topic/743/talks-with-the-developers/gpu-scheduler-for-vgpu/)!
It brings shame on NVidia if it is still not possible to monitor detailed vGPU performance per DomU or per process in a DomU after 3 years.

Best regards, M.C.

Edited 05/06/2016

#6
Posted 05/04/2016 08:16 PM   
You can use passthrough mode to see gpusizer for one VM and test how much one VM is using before deciding on a vGPU profile. Another way is to make sure you have only one VM with vGPU active on one pGPU; then you can use GPU-Z or uberAgent to get GPU usage per process and per VM with correct results. If you have multiple VMs on the same physical GPU, you cannot rely on in-VM metrics.

Browser video usage on a K1 typically allows 3-4 users per physical GPU (pGPU), but CPU usage is still quite high with GPU-enabled browsers.

Have a look at these blogposts:

http://www.virtualexperience.no/2015/11/05/mythbusting-browser-gpu-usage-on-xenapp/

http://www.virtualexperience.no/2015/01/07/im-100-sure-that-100-is-not-100/

#7
Posted 05/04/2016 08:51 PM   
Thank you everyone for the quick responses, this is a very active forum.

Now, let me go back to the reason for this post. We are doing our first vGPU deployment and I'm noticing 99% utilization with 3-5 users on a pGPU.

I have completed more testing with process-explorer and GPU-Z. When a VM is on a pGPU by itself, it is consuming 20-25% GPU utilization. At first it looked like the tools would not help as they showed no process using GPU resources. However, once I loaded Chrome (GPU accelerated), process-explorer then registered it as using GPU resources.

No other process is using any GPU resources, and I have tried stripping down all running processes to just the system, VMware, and NVIDIA processes.

I duplicated the behavior on two different hosts, each with two K1 cards being used. As soon as my user logs into the desktop on the VM, pGPU utilization hits 20-25%. This makes it seem like it's the parent image or an issue with the configuration on the K1.
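
As a rough sanity check, the two observations in this post (20-25% for a single logged-in VM, and 95%+ with 3-5 users) are consistent with each other if per-session load simply adds up. The sketch below makes that arithmetic explicit. The additive model is an assumption made only for illustration; the time-sliced vGPU scheduler does not necessarily behave this way, and `estimate_pgpu_load` is a hypothetical helper, not part of any NVIDIA tool.

```python
def estimate_pgpu_load(sessions, per_session_pct):
    """Naive additive estimate of pGPU utilization in percent, capped at 100."""
    return min(100.0, sessions * per_session_pct)

if __name__ == "__main__":
    # 20% and 25% are the single-VM figures reported in the post above.
    for sessions in (3, 4, 5):
        low = estimate_pgpu_load(sessions, 20.0)
        high = estimate_pgpu_load(sessions, 25.0)
        print(f"{sessions} sessions: {low:.0f}-{high:.0f}%")
```

Under this simple model the estimate already reaches 80-100% at four sessions, so a baseline of 20-25% per idle session would be enough on its own to saturate a pGPU at the user counts described.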

#8
Posted 05/04/2016 09:37 PM   
There can be many hidden vGPU applications:
- Windows Aero compositor; try switching it off
- the remoting protocol also uses vGPU resources (nvifr/nvfbc/nvenc); try accessing the desktop over the direct console
- there can also be a problem with power management; many times I see 25% utilization after two Windows OS instances start but only 5% after the 3rd Windows OS. Check the "Perf" column from nvidia-smi to see whether it stays in P8 (power saving state). It cannot be regulated from outside (https://gridforums.nvidia.com/default/topic/378/)
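
The "Perf" column check can also be scripted. This is a minimal sketch, assuming nvidia-smi on the host accepts `--query-gpu=index,pstate,utilization.gpu` with CSV output; the sample rows are invented for illustration, and `gpus_in_state` is a hypothetical helper name.

```python
import csv
import io

# Assumed command (run on the host):
#   nvidia-smi --query-gpu=index,pstate,utilization.gpu --format=csv,noheader,nounits
def gpus_in_state(text, state="P8"):
    """Return indices of GPUs whose reported performance state equals `state`."""
    matches = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        index, pstate = row[0].strip(), row[1].strip()
        if pstate == state:
            matches.append(int(index))
    return matches

# Hypothetical sample: index, performance state, utilization percent.
PSTATE_SAMPLE = """0, P8, 25
1, P0, 5
2, P8, 0
3, P0, 95
"""

if __name__ == "__main__":
    print("GPUs in P8:", gpus_in_state(PSTATE_SAMPLE))
```

Comparing the utilization figure alongside the performance state makes it easier to spot the case described above, where a GPU sitting in P8 reports a relatively high percentage.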

#9
Posted 05/04/2016 09:52 PM   
Thanks @mcerveny, I didn't realize the PCoIP protocol was a factor in the pGPU utilization number. As soon as I disconnect from the VM, utilization drops to 0%, as it was the only VM on that pGPU. It then goes back to 22-25% when I log back in.

I'm guessing (hoping) this is not normal. In case it is a factor, here are the GPO settings being used for PCoIP.

PCoIP Session Variables/Not Overridable Administrator Settings

Policy: Configure clipboard redirection
Setting: Enabled
    Configure clipboard redirection: Enabled client to server only

Policy: Configure the PCoIP session bandwidth floor
Setting: Enabled
    Set PCoIP session bandwidth floor in kilobits per second to: 2000

Policy: Turn off Build-to-Lossless feature
Setting: Enabled

#10
Posted 05/04/2016 10:10 PM   
New update, I was digging through the release notes and came across this in the known issues section. This is exactly what I'm seeing but it is odd that I didn't find anyone else reporting it online. Can anyone at Nvidia provide a status on Ref# 1735009? Thanks


From Release notes of 361.40/362.13

nvidia-smi shows high GPU utilization for vGPU VMs with active Horizon sessions

Description: vGPU VMs with an active Horizon connection utilize a high percentage of the GPU on the ESXi host. The GPU utilization remains high for the duration of the Horizon session even if there are no active applications running on the VM.
Version:
Workaround: None
Status: Open
Ref. #: 1735009

#11
Posted 05/04/2016 10:54 PM   
Hi Taskman,

That issue is open with VMware for resolution. I don't know the root cause or any workaround I'm afraid, and it doesn't affect every session.

PCoIP itself doesn't use the GPU for encoding, but it does query the APIs to read directly from the framebuffer.

BLAST (since 7.0) will use the GPU for encoding when accessing from a client with a single display.

Magnar & mcerveny have both pretty much covered all the other likely causes. Remember the K1 is a pretty small GPU, so it's easy to load it up with a few browser apps. Often, though it is counterintuitive, the cards with just 2 GPUs (K2 / M60) can give better performance and density if the application load requires GPU resource over graphics memory.

Jason Southern, Regional Lead for ProVis Sales - EMEA: NVIDIA Ltd.

#12
Posted 05/05/2016 09:14 AM   
As M.C points out, there are third-party tools like Goliath, which is very good. They use the NVIDIA APIs and those derived from them by the hypervisor vendors, and they work closely with us to ensure the APIs are used properly and interoperability is good. However, like nvidia-smi, they are limited by the underlying capabilities of the card to provide per-VM GPU resource usage, so it's not functionality a third party can provide either.

Best wishes,
Rachel

#13
Posted 05/05/2016 10:47 AM   
Thanks everyone. I just spoke with VMware and my issue matches the known issue in the release notes. They are escalating it on their end to Nvidia.

For the POC we are in, I had changed the deployment to Depth-First instead of breadth-first in order to do a load test and identify potential issues like this. For now, I'll switch back to breadth-first to mitigate this issue until a resolution is released by Nvidia.
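
To illustrate why the placement policy matters here, the toy model below compares how VMs spread across the four physical GPUs of one K1 under each policy. It is a sketch under stated assumptions, not VMware's actual placement algorithm: `place_vms` is a hypothetical helper, and the limit of 8 vGPUs per pGPU for the K120Q profile is an assumption that should be checked against the vGPU documentation.

```python
def place_vms(num_vms, num_pgpus, max_per_pgpu, policy):
    """Return the number of VMs placed on each pGPU under a simple policy model."""
    counts = [0] * num_pgpus
    for _ in range(num_vms):
        candidates = [i for i, c in enumerate(counts) if c < max_per_pgpu]
        if not candidates:
            raise ValueError("no free vGPU slot left")
        if policy == "breadth-first":
            # Pick the least loaded pGPU (lowest index wins ties).
            target = min(candidates, key=lambda i: counts[i])
        elif policy == "depth-first":
            # Pick the most loaded pGPU that still has room (lowest index wins ties).
            target = max(candidates, key=lambda i: counts[i])
        else:
            raise ValueError(f"unknown policy: {policy}")
        counts[target] += 1
    return counts

if __name__ == "__main__":
    # 5 VMs on one K1 (4 pGPUs); 8 per pGPU is an assumed K120Q limit.
    for policy in ("breadth-first", "depth-first"):
        print(policy, place_vms(5, 4, 8, policy))
```

In this model, depth-first stacks all five VMs on a single pGPU, while breadth-first leaves at most two per pGPU, which is why switching back reduces the chance of any one pGPU saturating.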

Given @JasonSouthern's comments on the capabilities of the K1s and the numbers we are seeing, I am also going to contact Nvidia about an eval for the M60 as part of our POC.

I'll update this post once that occurs. Thanks again.

#14
Posted 05/05/2016 07:00 PM   
Hi folks,

Support have now published a KB explaining framebuffer monitoring both on host and per VM. So while you can't get GPU resource per VM this may be of use for understanding your application use:
http://nvidia.custhelp.com/app/answers/detail/a_id/4108/

Best wishes,
Rachel

#15
Posted 05/05/2016 10:40 PM   