NVIDIA
Performance/power management problem on shared vGPU
Hello.

Situation 0: one "idle" Windows Aero desktop on one physical GPU:

If I run 1 vGPU per physical GPU, everything seems fine; I get the maximum frame rate of 25 FPS (limited by plugin0.frl_config).

Situation 1: two "idle" Windows Aero desktops on one physical GPU:

If I run 2 vGPUs per physical GPU, the frame rate on both drops below 17 FPS.

Observation:

# nvidia-smi
+------------------------------------------------------+
| NVIDIA-SMI 340.34     Driver Version: 340.34         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GRID K1             On   | 0000:06:00.0     Off |                  N/A |
| N/A   33C    P8    11W /  31W |   1820MiB /  4095MiB |     20%      Default |
+-------------------------------+----------------------+----------------------+
... snip


# lspci -s 06:00.0 -vvv | grep LnkSta:
LnkSta:	Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-


# nvidia-smi -q -i 0 -d CLOCK | head -12
==============NVSMI LOG==============

Timestamp : Sun Jan 11 15:55:58 2015
Driver Version : 340.34

Attached GPUs : 4
GPU 0000:06:00.0
Clocks
Graphics : 324 MHz
SM : 324 MHz
Memory : 324 MHz


Interpretation:

The physical GPU stays in the lowest power state, "P8" (lowest memory, GPU and PCIe clocks), and is not able to deliver the required processing power.
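
To compare these readings between situations without reading the log by eye, a minimal Python sketch can pull the clock values out of the nvidia-smi -q -d CLOCK text. It is only an illustration: the names parse_clocks and SAMPLE are hypothetical, the sample is the log quoted in this post, and the parser assumes the layout shown here (a "Clocks" line followed by "Name : value MHz" lines).

```python
import re

# Sample text copied from this post (nvidia-smi -q -i 0 -d CLOCK | head -12).
SAMPLE = """\
==============NVSMI LOG==============

Timestamp : Sun Jan 11 15:55:58 2015
Driver Version : 340.34

Attached GPUs : 4
GPU 0000:06:00.0
Clocks
Graphics : 324 MHz
SM : 324 MHz
Memory : 324 MHz
"""


def parse_clocks(nvsmi_text):
    """Return the first 'Clocks' section as a dict of clock name -> MHz."""
    clocks = {}
    in_clocks = False
    for line in nvsmi_text.splitlines():
        stripped = line.strip()
        if stripped == "Clocks":
            in_clocks = True  # the clock lines follow this header
            continue
        if in_clocks:
            match = re.match(r"(\w+)\s*:\s*(\d+)\s*MHz$", stripped)
            if match:
                clocks[match.group(1)] = int(match.group(2))
            elif stripped:
                break  # a different section has started
    return clocks


print(parse_clocks(SAMPLE))  # {'Graphics': 324, 'SM': 324, 'Memory': 324}
```

Running the same function on a log captured under load makes the difference between the idle and loaded clocks easy to record alongside the measured frame rate.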

Situation 2: one "idle" Windows Aero desktop and one "3D application" on one physical GPU:

If I run a small 3D application on one vGPU, the GPU switches its power state to "P0", but it is still unable to reach the requested frame rate. On both shared vGPUs the frame rate is now about 23 FPS.

Observation:

# nvidia-smi
+------------------------------------------------------+
| NVIDIA-SMI 340.34     Driver Version: 340.34         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GRID K1             On   | 0000:06:00.0     Off |                  N/A |
| N/A   38C    P0    17W /  31W |   1820MiB /  4095MiB |     31%      Default |
+-------------------------------+----------------------+----------------------+
... snip


# lspci -s 06:00.0 -vvv | grep LnkSta:
LnkSta:	Speed unknown, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-

(Speed unknown == PCIe Gen3)
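
The link check can be scripted as well. The following Python sketch parses an lspci LnkSta line into a PCIe generation and lane width. The names parse_lnksta and GEN_BY_SPEED are hypothetical, and treating "Speed unknown" as Gen3 is this post's own reading of the output, not something lspci reports, so it is exposed as a parameter rather than hard-coded.

```python
import re

# Per-lane transfer rate as printed by lspci -> PCIe generation.
GEN_BY_SPEED = {"2.5GT/s": 1, "5GT/s": 2, "8GT/s": 3}


def parse_lnksta(line, unknown_means_gen=3):
    """Return (generation, width) for one 'LnkSta:' line of lspci -vvv.

    unknown_means_gen is an assumption taken from the post: this lspci
    version prints 'Speed unknown' where the link is believed to be Gen3.
    """
    match = re.search(r"LnkSta:\s*Speed ([^,]+), Width x(\d+)", line)
    if match is None:
        raise ValueError("not a LnkSta line: %r" % line)
    speed = match.group(1).strip()
    width = int(match.group(2))
    generation = GEN_BY_SPEED.get(speed)
    if generation is None and speed == "unknown":
        generation = unknown_means_gen
    return generation, width


idle = "LnkSta:\tSpeed 2.5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-"
load = "LnkSta:\tSpeed unknown, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-"
print(parse_lnksta(idle))  # (1, 8)
print(parse_lnksta(load))  # (3, 8)
```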

# nvidia-smi -q -i 0 -d CLOCK | head -12
==============NVSMI LOG==============

Timestamp : Sun Jan 11 16:11:00 2015
Driver Version : 340.34

Attached GPUs : 4
GPU 0000:06:00.0
Clocks
Graphics : 680 MHz
SM : 680 MHz
Memory : 891 MHz


Interpretation:

Now the physical GPU is no longer in the lowest power state, but performance is still being throttled!

QUESTION:

I tried direct clock management, but it is unsupported on the GRID K1.

nvidia-smi -q -d SUPPORTED_CLOCKS              # Show supported clock frequencies
nvidia-smi -ac <MEM clock>,<Graphics clock>    # Set the memory and graphics clock frequencies


How can I reprogram the automatic power management?

Thanks for answers, Martin Cerveny

#1
Posted 01/11/2015 03:30 PM   
http://xenserver.org/partners/developing-products-for-xenserver/19-dev-help/138-xs-dev-perf-turbo.html might help. Are the fans turned up?

#2
Posted 01/17/2015 02:13 PM   
rachel said: http://xenserver.org/partners/developing-products-for-xenserver/19-dev-help/138-xs-dev-perf-turbo.html Might help, are the fans turned up?


Thanks, but it does not help. The problem is with GPU power management, not CPU power management :-(

M.C>

#3
Posted 01/17/2015 02:50 PM   
Hi Martin,
What system are you using (from the HCL list)?
It looks like the card is in an x8 PCIe slot. It needs to be in an x16 slot. Are you able to switch slots and try again?

#4
Posted 01/22/2015 07:08 PM   
tonyberholt said:
It looks like the card is in a x8 PCIe slot. It needs to be in a x16 slot. Are you able to switch slots and try again?


Thanks for your hint, but it does not help. The K1 card is inserted in an x16 PCIe Gen3 slot, and there is a PLX bridge on the card (PEX 8747 in a 16+4*8 configuration, http://www.plxtech.com/download/file/1824 ) that connects each of the 4 GPUs through only an x8 PCIe Gen3 link.
The side of the PLX bridge facing the CPU always runs at x16 and full Gen3 speed (i.e. "Speed unknown"). But the PLX<->GPU side runs at x8, at a speed negotiated by GPU power management.

M.C>

# lspci -vvv | egrep '^04:|^05|LnkSta:'
...
04:00.0 PCI bridge: PLX Technology, Inc. Device 8747 (rev ca) (prog-if 00 [Normal decode])
LnkSta: Speed unknown, Width x16, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
05:08.0 PCI bridge: PLX Technology, Inc. Device 8747 (rev ca) (prog-if 00 [Normal decode])
LnkSta: Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk- DLActive+ BWMgmt+ ABWMgmt+
05:09.0 PCI bridge: PLX Technology, Inc. Device 8747 (rev ca) (prog-if 00 [Normal decode])
LnkSta: Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk- DLActive+ BWMgmt+ ABWMgmt+
05:10.0 PCI bridge: PLX Technology, Inc. Device 8747 (rev ca) (prog-if 00 [Normal decode])
LnkSta: Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk- DLActive+ BWMgmt+ ABWMgmt+
05:11.0 PCI bridge: PLX Technology, Inc. Device 8747 (rev ca) (prog-if 00 [Normal decode])
LnkSta: Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk- DLActive+ BWMgmt+ ABWMgmt+
...
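
For a sense of scale, the theoretical per-direction bandwidth of these links can be worked out from the standard PCIe per-lane transfer rates and line encodings (8b/10b for Gen1 and Gen2, 128b/130b for Gen3). The Python sketch below is illustrative only: the names PCIE_GEN and link_bandwidth_gbytes are hypothetical, and the figures ignore packet and protocol overhead, so real throughput is lower.

```python
# PCIe generation -> (per-lane transfer rate in GT/s, encoding efficiency)
PCIE_GEN = {
    1: (2.5, 8 / 10),     # 8b/10b encoding
    2: (5.0, 8 / 10),     # 8b/10b encoding
    3: (8.0, 128 / 130),  # 128b/130b encoding
}


def link_bandwidth_gbytes(generation, width):
    """Theoretical one-direction link bandwidth in GB/s, before protocol overhead."""
    rate_gt, efficiency = PCIE_GEN[generation]
    gbits_per_second = rate_gt * efficiency * width
    return gbits_per_second / 8  # bits -> bytes


# PLX<->GPU link as negotiated in the low power state (2.5GT/s, x8)
print(round(link_bandwidth_gbytes(1, 8), 2))   # 2.0
# The same x8 link at Gen3 speed
print(round(link_bandwidth_gbytes(3, 8), 2))   # 7.88
# CPU-facing x16 Gen3 link of the PLX bridge
print(round(link_bandwidth_gbytes(3, 16), 2))  # 15.75
```

Under these assumptions, a GPU link held at 2.5GT/s has roughly a quarter of the bandwidth it would have at Gen3 speed.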

#5
Posted 01/22/2015 08:58 PM   
I see.
A couple of suggestions:
- Verify that your system is capable of providing enough gfx power
- Try with another card to verify if it is the system or the specific card that is giving you problems

Any results from this?

#6
Posted 01/23/2015 12:02 PM   
tonyberholt said: - Verify that your system is capable of providing enough gfx power

I do not understand. The system is OK (DomU (Win7) / Dom0 (Xen) / 2x Xeon E5 v2 / 32 GB RAM and K1), and yes, the problem is that the GPU is throttled down by the GPU's own bad power management. As I wrote in the original post, I do not know how to disable or reprogram the GPU's power management. If you mean the 12V power supply, it is sufficient too (running at 1/2 of the 665W limit, with max 54A on 12V). If you mean cooling, it is OK too (sensed temperature ranges from 29C at idle to 43C under load; throttling starts at 95C according to the specs).

tonyberholt said: - Try with another card to verify if it is the system or the specific card that is giving you problems

There are only two cards, K1 and K2, capable of vGPU, and I have access to only two K1 cards.
I will build a second server after next week and try some other version of XenServer and another version of the GPU BIOS to test whether the problem persists.

M.C>

#7
Posted 01/23/2015 08:02 PM   