NVIDIA
video transcoding in virtualized environment with NVIDIA GPU
Hello, I have read some topics in this forum, and based on those Q&As I'll try to formulate my questions.

We currently run a virtualized environment on HP BL460 blade servers with VMware vSphere 5.5 and external NetApp storage.
We would like to build two new VMs on top of it running the Wowza video streaming software, which would use the GPU capability of the NVIDIA M6 (with NVENC video codec support). We are aware that HP WS460c blades must be used, because the technically identical BL460 is for some reason not supported by HP. So we would buy two new WS460c blades. HA redundancy is the reason for two: in case of a host failure the VM would restart on the other host.
These two VMs would use all of the GPU memory but only a relatively small share of host resources (4 vCPUs and 32 GB RAM each), which is why we do not want to dedicate two hosts solely to this HPC-style real-time video transcoding.


https://www.wowza.com/forums/content.php?310-Server-specifications-for-NVIDIA-NVENC-and-NVIDIA-CUDA-acceleration-with-Wowza-Transcoder


Could you please advise me on the following:
1. NVIDIA M6 software licensing is described only in desktop virtualization terms. The three licensing models refer to desktop virtualization only, so I have no clue how I would license a video transcoding VM, if at all.
2. Somewhere in this forum I read that HPC is not supported in a virtualized environment, but that comment was from 2014. Why not? Is that really still the case?

So, thank you very much, Daniel

HP WS460c datasheet with a list of supported GPUs:

https://www.google.cz/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&cad=rja&uact=8&ved=0ahUKEwjkvNew09DRAhXFBiwKHfetAJEQFggeMAE&url=https%3A%2F%2Fwww.hpe.com%2Fh20195%2Fv2%2FGetPDF.aspx%2F4AA5-7517ENN.pdf&usg=AFQjCNEhu88SEdiGzlT175xIHbGNvJ8XNg&sig2=IyhZWboR_T6MsrKpmrQtWA&bvm=bv.144224172,d.bGs

#1
Posted 01/20/2017 01:05 PM   
Hi Daniel,

From the link provided I couldn't tell whether you're going to run the software on a client or a server OS.
In general this is a passthrough use case and needs a vWorkstation GRID license for a client OS. If you run it on a server OS, a vApps license might be sufficient, as you don't need the Quadro features, and you would also be able to use passthrough with vApps and a server OS.

Regards

Simon

#2
Posted 01/21/2017 09:00 AM   
More information/clarification needed:

Wowza Transcoder:

1a) only NVIDIA NVENC accelerated encoding
1b) NVIDIA NVENC accelerated encoding + NVIDIA CUDA accelerated video scaling

If 1a), vGPU can be used.
If 1b), only passthrough (or the vGPU profile M6-8Q) is available, due to CUDA requirements.

Hardware setup:

2a) One mezzanine with an M6 (805132-B21); only one MXM mezzanine can be installed.
2b) Partner PCIe blade (Expansion Blade) (836738-B21+775168-B21) with a maximum of 2x multi-GPU carrier (805133-B21); each carrier has 4x MXM slots, but a maximum of 2x MXM M6 can be installed in one card.
2c) Partner PCIe blade (Expansion Blade) (836738-B21+775168-B21) and 2x Tesla P4.

If 2a), two guests must share one M6 (e.g. M6-4Q) and must be 1a) only (no CUDA).
If 2b), passthrough can be used (for 1-2 cards, depending on the required performance and on whether multi-GPU is supported in Wowza Transcoder).
2c) can be mentioned as a passthrough option (see HPC). This is a new card and it is not in the support matrix yet, but it should be a cheaper solution.

Performance, memory and codec requirements (https://developer.nvidia.com/nvenc-application-note): the FPS, resolution, quality and latency required for encoding should be determined first.

3a) 1x M6: ~430 FPS (1920x1080/YUV 4:2:0, High Performance, H.264) * 2 encoders * 0.8 (underclocked M6) = 688 FPS
3b) 1x P4: 648 FPS (1920x1080/YUV 4:2:0, High Performance, H.264) * 2 encoders * 0.7 (underclocked P4) = 907 FPS

(The underclocking ratio is estimated and unverified. I originally wrote that NVIDIA had not published accurate benchmarks per card model. UPDATE: NVIDIA has since published a detailed performance list, but it does not reflect the impact of underclocked cards: https://developer.nvidia.com/nvidia-video-codec-sdk#NVENCPerf)
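
For anyone who wants to redo this arithmetic with their own numbers, here is a minimal Python sketch of the estimate above. The function names are made up for illustration, the 0.8 and 0.7 factors are the same unverified guesses as above, and the 30 FPS per output stream is purely an assumption to turn the FPS budget into a rough stream count. It says nothing about quality settings, latency, or any session limits.

```python
def estimated_fps(per_encoder_fps: float, encoders: int, clock_factor: float) -> float:
    """Aggregate NVENC throughput: per-encoder FPS * encoder count * derating factor."""
    return per_encoder_fps * encoders * clock_factor


def max_streams(total_fps: float, stream_fps: float) -> int:
    """Number of whole streams at stream_fps that fit into the aggregate FPS budget."""
    return int(total_fps // stream_fps)


# Figures from 3a) and 3b); the derating factors are unverified estimates.
m6 = estimated_fps(430, 2, 0.8)
p4 = estimated_fps(648, 2, 0.7)

# Assumption for illustration only: every output is 1080p at 30 FPS.
STREAM_FPS = 30

print(f"M6: {m6:.0f} FPS, roughly {max_streams(m6, STREAM_FPS)} streams")
print(f"P4: {p4:.0f} FPS, roughly {max_streams(p4, STREAM_FPS)} streams")
```

Swap in the FPS figure for your own resolution and preset from the NVIDIA performance list, and treat the result as an upper bound to be confirmed in a POC.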

Licenses (http://images.nvidia.com/content/pdf/grid/guides/GRID-Packaging-and-Licensing-Guide.pdf, http://images.nvidia.com/content/pdf/grid/guides/GRID-Licensing-Guide.pdf):

- vAPP: for vGPU (M6-*A) and passthrough. Warning: I don't know and have not tested whether the vAPP license limits the number of encoder sessions (GeForce cards, i.e. cards without Quadro features, have an enforced software limit of 2 encoder sessions per system for the Video SDK).
- vWS: for vGPU (M6-*Q) and passthrough.
- Without a license: switch to "Tesla" (and use some vSGA as the primary display).

An HPC/Tesla card presents a large memory region to the PCIe address space, and this is (or was) a problem for VMware ESXi, because ESXi can handle only a 32-bit PCIe space for unknown reasons (but it should not be needed for passthrough). There is a success story (https://cto.vmware.com/gpgpu-computing-with-the-nvidia-k80-on-vmware-vsphere-6/). You can also try the forum for NVIDIA Video (SDK) technologies (https://devtalk.nvidia.com/default/board/175/) with questions not specific to GRID and licensing, to see whether there are customers with HPC setups for the Video SDK.

Most local HPE representatives or HPE partners should be able to help you organize a POC.

#3
Posted 01/21/2017 01:11 PM   
Hello gentlemen, I'm pleasantly surprised to have got two relevant answers just after returning to work after the weekend. Thank you very much for that!
Now I'm going to dive into the topic; it looks like it is going to be fun. :)

#4
Posted 01/23/2017 08:01 AM   