NVIDIA
Full profiles k180q/k280q
Hello.

The full profiles k180/k280 occupy the whole K1/K2 GPU except for the reserved 320MB of VRAM and the vgpu/libnvidia-vgpu process in Dom0. These profiles do not share vGPU computing power, so the Dom0 scheduler is not needed.

Why is "frame_rate_limiter" enabled by default for these profiles?
Does disabling it remove the virtualization overhead (Dom0 scheduler) from the GPU?

Why is "cuda_enabled" disabled by default for these profiles?
I suppose the Dom0 scheduler is not able to handle sharing (time slicing) for CUDA processes, but here there is only DomU concurrency, which should be possible.
I know that this parameter is intentionally protected by a digital signature in newer drivers (after 6/2015, in vgpuConfig.xml), so I am using an older driver.

Thanks for a technical answer, M.C>

#1
Posted 03/07/2016 08:50 AM   
Hi MC,

Why do you want the FRL off? In a VDI environment bandwidth usage is a concern, and I'd be curious to know when one would want to use more bandwidth to go above 60fps. IMO this is one that I would leave off by default.

I'm not sure about CUDA. I agree that one makes less sense, but on K2/K1 CUDA is not enabled, so it may be a hangover. Long term, these are things I expect to see change as GPU architecture makes CUDA on shared GPUs more sensible. One good reason could be that whilst a full GPU is not shared under XenDesktop, it is shared under XenApp, and support there is only experimental. Turning it on by default under XenApp would turn on an unsupported feature, I think.

Just my best guesses!

Best wishes,
Rachel

#2
Posted 03/07/2016 09:06 AM   
I supposed that disabling FRL would remove the virtualization overhead (Dom0 scheduler), so that graphics-intensive applications (e.g. those usually not hitting 60 FPS) could run faster without FRL, but the actual results are less optimistic: disabling it leads to lower and unstable vGPU performance. The VDI streaming encoder can pick up frames at a lower rate.

For example, the Unigine Heaven extreme preset (test window automatically resized to 1290x900 because of the 1280x1024 monitor) on the full k280 vGPU profile:

drivers 346.68/348.27:

- frame_rate_limiter=1,frl_config=0x3c - max ~68 FPS, result ~38.5 FPS, score ~970
- frame_rate_limiter=1,frl_config=0x2d - max ~50 FPS, result ~36.5 FPS, score ~920
- frame_rate_limiter=1,frl_config=0x1e - max ~33 FPS, result ~30.4 FPS, score ~766
- frame_rate_limiter=1,frl_config=0x78 - max ~89 FPS, result ~39.7 FPS, score ~999

- frame_rate_limiter=0 - max 40-86 FPS, result 29-39 FPS, score 748-990 (why are the results lower with FRL disabled? unstable results, random visual hangs)

drivers 352.83/354.80:

- frame_rate_limiter=1,frl_config=0x3c - max ~68 FPS, result ~37.4 FPS, score ~943
- frame_rate_limiter=1,frl_config=0x78 - max ~78 FPS, result ~38.1 FPS, score ~960

- frame_rate_limiter=0 - max 51-62 FPS, result 31.5-34.5 FPS, score 793-870 (why are the results lower with FRL disabled? unstable results, random visual hangs)

drivers 361.40/362.13:

- frame_rate_limiter=1,frl_config=0x3c - max ~67 FPS, result ~37.0 FPS, score ~932
- frame_rate_limiter=1,frl_config=0x78 - max ~67 FPS, result ~37.6 FPS, score ~946

- frame_rate_limiter=0 - max 60-80 FPS, result ~36 FPS, score 914-926
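
The frl_config values above read naturally as the frame-rate cap written in hexadecimal. That interpretation is an assumption based on the measured maxima (which sit a little above each decoded value), not something documented in this thread. A minimal Python sketch of the decoding, with a hypothetical helper name:

```python
# Assumption: frl_config encodes the frame-rate cap in FPS as a hex integer.
# The four values are the ones used in the benchmark runs above.
FRL_CONFIGS = ["0x3c", "0x2d", "0x1e", "0x78"]


def frl_cap_fps(frl_config: str) -> int:
    """Decode a hex frl_config string into a nominal FPS cap (hypothetical helper)."""
    return int(frl_config, 16)


caps = {value: frl_cap_fps(value) for value in FRL_CONFIGS}

for value, fps in caps.items():
    print(f"frl_config={value} -> nominal cap {fps} FPS")
```

Under that reading the tested caps are 60, 45, 30 and 120 FPS, which lines up with the observed maxima of roughly 68, 50, 33 and 89 FPS on the 346.68/348.27 drivers.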

M.C>

PS: Is every newer driver slower and slower?
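
To put a rough number on the PS, the sketch below compares the scores reported above at frame_rate_limiter=1, frl_config=0x3c across the three driver pairs. The scores are the approximate single-run values from this post, so the percentages are only indicative; the variable names are illustrative.

```python
# Scores at frame_rate_limiter=1, frl_config=0x3c, copied from the runs above.
# They are approximate ("~") single-run values, so treat the deltas as indicative.
scores_0x3c = {
    "346.68/348.27": 970,
    "352.83/354.80": 943,
    "361.40/362.13": 932,
}

baseline = scores_0x3c["346.68/348.27"]

# Percentage change of each driver pair relative to the oldest one.
change_pct = {
    driver: (score - baseline) / baseline * 100.0
    for driver, score in scores_0x3c.items()
}

for driver, pct in change_pct.items():
    print(f"{driver}: score {scores_0x3c[driver]}, {pct:+.1f}% vs 346.68/348.27")
```

With these figures the drop is about 2.8% for 352.83/354.80 and about 3.9% for 361.40/362.13 relative to 346.68/348.27.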

#3
Posted 03/07/2016 12:57 PM   
mcerveny said:

The full profiles k180/k280 occupy all GPU K1/K2 except reserved 320MB vram and vgpu/libnvidia-vgpu process in Dom0. The profile does not share vGPU computing power and Dom0 scheduler is not needed.


Scheduling is in the GPU silicon, not in DomU or Dom0.


The memory reservation is for mapping system memory into the GPU memory. With PCI passthrough this is handled slightly differently in the OS and is not required at the hypervisor level.

mcerveny said:
Why is there disabled "cuda_enabled" by default for that profiles ?
I suppose that Dom0 scheduler does not able to handle sharing (time slicing) for CUDA processes but there is only DomU concurrency that should be possible.


As above, scheduling is not in the hypervisor; it's in the GPU silicon.

Changes in the Maxwell architecture allow us to deliver this in vGPU, whilst we have to retain this limitation in Kepler.

We have to differentiate between what may work and what is fully QA'd and supported. This is the reason that drivers are signed and why modifications to such settings place environments into an unsupported state, not just unsupported by NVIDIA, but also by the hypervisor vendors.

Jason Southern, Regional Lead for ProVis Sales - EMEA: NVIDIA Ltd.

#4
Posted 03/16/2016 01:00 PM   
Jason_Southern said: Scheduling is in the GPU silicon not DomU nor Dom0.

Yes, the GPU has a hardware "round-robin" scheduler that schedules processes (C+G) on SMX+ENC+DEC, but Dom0 (or DomU, which should do this for the k180q/k280q profiles) controls and monitors this scheduler.
It would not be possible to start the vgpu Dom0 process (secondary_device_emulator for the virtualized vGPU PCI BAR) or to monitor that process with "nvidia-smi pmon" without some Dom0 control and monitoring of this scheduler.

Jason_Southern said: We have to differentiate between what may work, and what is fully QA'd and supported.

"He who wants looks for a way; he who doesn't looks for an excuse." (Jan Werich)
"Nothing ventured, nothing gained."

Yours sincerely, M.C>

#5
Posted 03/16/2016 05:24 PM   