M5000 GPU Pass-through to VM
I am having an issue getting a VM to load and display a graphics package.
The VM runs RHEL 7 and uses PCI pass-through.


I created a Linux RHEL x64 VM.
I was able to activate the M5000 GPU for the VM via PCI.

When the VM boots, a modified XORG RHEL OS is loaded via a PXE menu.

I can see the OS load as there are several messages being displayed on the VM screen.
It continues and then it stops at the message:
“Started OpenSSH server daemon”

Looking at the Xorg.log, I see messages such as:

NVIDIA GLX Module 390.25 ....
Loading /usr/lib/xorg/modules/drivers/nvidia_drv.so
NVIDIA Unified Driver for all Supported NVIDIA GPUs

Using VT number 1

No device detected

Fatal Server Error

No screens found


I am not sure what this specialized OS runs, but it loads and works on a physical server.

Any ideas on what I need to add or modify to get the OS to load and display graphics on the VM screen?

#1
Posted 06/11/2018 06:18 PM   
You should also check a few things in the guest:
# ### device visible:
# lspci -vvv -d 10de:*
# ### driver loaded:
# lsmod | grep nvidia
# ### no errors in driver:
# dmesg | egrep 'nvidia|NVRM'
# ### driver responding to basic command (try two times):
# nvidia-smi

#2
Posted 06/17/2018 09:49 PM   
Thanks, I will run these and report later.

I did create a new RHEL 7 VM and loaded the PXE image; same result, see below.

The xorg file shows

[ 19.153] (II) Module glx: vendor="NVIDIA Corporation"
[ 19.154] compiled for 4.0.2, module version = 1.0.0
[ 19.154] Module class: X.Org Server Extension
[ 19.154] (II) NVIDIA GLX Module 390.25 Wed Jan 24 19:23:51 PST 2018
[ 19.154] (II) LoadModule: "nvidia"
[ 19.154] (II) Loading /usr/lib64/xorg/modules/drivers/nvidia_drv.so
[ 19.154] (II) Module nvidia: vendor="NVIDIA Corporation"
[ 19.154] compiled for 4.0.2, module version = 1.0.0
[ 19.154] Module class: X.Org Video Driver
[ 19.154] (II) NVIDIA dlloader X Driver 352.93 Wed Jan 24 18:57:05 PST 2018
[ 19.154] (II) NVIDIA Unified Driver for all Supported NVIDIA GPUs
[ 19.154] (++) using VT number 1

[ 19.157] (EE) No devices detected.
[ 19.157] (EE)
Fatal server error:

If I run nvidia-smi, I can see the driver is loaded and working with the card:

Mon Jun 18 15:56:53 2018
+------------------------------------------------------+
| NVIDIA-SMI 390.25 Driver Version: 390.25 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Quadro M5000 Off | 00000000:0B:00.0 Off | N/A |
| 0% 34C P0 46W / 150W | 0MiB / 8126MiB | 0% Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+

#3
Posted 06/19/2018 02:33 PM   
Try this (explicit 'BusID "PCI:11:00:0"'): https://gridforums.nvidia.com/default/topic/962/#3387

#4
Posted 06/19/2018 05:27 PM   
thanks

Are you saying place "PCI:11:00:0" for BusID in the Xorg.conf file, as in below?

What exactly does 11:00:0 represent?


Section "Device"
Identifier "Device0"
Driver "nvidia"
VendorName "NVIDIA Corporation"
BoardName "GRID M60-4Q"
BusID "PCI:11:00:0"
EndSection

#5
Posted 06/20/2018 11:19 AM   
nvidia-smi "00000000:0B:00.0" -> xorg.conf "PCI:11:00:0" (see "Xorg -scanpci", "man xorg.conf", documentation ...).
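
The mapping above is a hexadecimal-to-decimal conversion: nvidia-smi and lspci print bus, device and function in hex, while the BusID entry in xorg.conf takes decimal values. A minimal sketch of that conversion follows; the function name to_xorg_busid is made up for illustration and is not part of any NVIDIA or X.Org tool.

```shell
# Convert a PCI address as printed by nvidia-smi or lspci
# ([domain:]bus:device.function, hexadecimal) into the decimal
# "PCI:bus:device:function" form used for BusID in xorg.conf.
# to_xorg_busid is a hypothetical helper name.
to_xorg_busid() (
    addr=$1
    func=${addr##*.}     # text after the last "."  -> function
    rest=${addr%.*}      # drop ".function"
    dev=${rest##*:}      # text after the last ":"  -> device
    rest=${rest%:*}      # drop ":device"
    bus=${rest##*:}      # last remaining field     -> bus (domain is ignored)
    # printf accepts 0x-prefixed hex constants for %d
    printf 'PCI:%d:%d:%d\n' "0x$bus" "0x$dev" "0x$func"
)

to_xorg_busid 00000000:0B:00.0   # prints PCI:11:0:0
```

"Xorg -scanpci" remains the authoritative way to see the addresses the X server itself detects.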

#6
Posted 06/20/2018 12:09 PM   
Thanks, I will look into this next week and respond when I find out something.

#7
Posted 06/21/2018 11:42 AM   
Update:

I modified Xorg.conf, adding the BusID as shown below:

Section "Device"
Identifier "Device0"
Driver "nvidia"
VendorName "NVIDIA Corporation"
BusID "PCI:11:0:0"
EndSection

I restarted the VM, but it still hangs and I get a different error...

[20.913] (--) NVIDIA(GPU-0)
[20.913] (EE) NVIDIA(0): Failed to assign any connected display devices to X screen 0.
[20.913] (EE) NVIDIA(0): Set AllowEmptyInitialConfiguration if you want the server
[20.913] (EE) NVIDIA(0): to start anyway
[20.913] (EE) NVIDIA(0): Failing initialization of X screen 0
[20.913] (II) UnloadModule: "nvidia"
[20.913] (II) UnloadSubModule: "wfb"
[20.913] (II) UnloadSubModule: "fb"
[20.913] (EE) Screen(s) found, but none have a usable configuration.
[20.913] (EE)
Fatal server error:
[20.913] (EE) no screens found(EE)
[20.913] (EE)
Please consult the The X.Org Foundation support at http://wiki.x.org for help.
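
For reference, the option named in the error message is set in xorg.conf. A minimal sketch, assuming a Screen section called "Screen0" that points at the "Device0" section above (the identifier and the choice of the Screen section are assumptions, not taken from the PXE image's actual config):

```
Section "Screen"
Identifier "Screen0"
Device "Device0"
# Assumption: let the X server start even though no display is
# detected on the passed-through GPU.
Option "AllowEmptyInitialConfiguration" "True"
EndSection
```

This only lets the server start without a connected display; whether it is enough to get graphics on the VM console is not confirmed in this thread.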

#8
Posted 07/02/2018 06:39 PM   
said:
[ 19.154] (II) NVIDIA GLX Module 390.25 Wed Jan 24 19:23:51 PST 2018
[ 19.154] (II) NVIDIA dlloader X Driver 352.93 Wed Jan 24 18:57:05 PST 2018

The driver is not correctly installed (the GLX module and the X driver report different versions). Again, you should also analyze dmesg after Xorg fails (for example, "dmesg | egrep 'nvidia|NVRM'").
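
The version comparison can be scripted instead of read by eye. This is a rough sketch that pulls the two version strings out of an Xorg log and reports whether they match. The function name check_nvidia_versions and the log path in the usage line are assumptions; the line patterns are taken from the log excerpt quoted above.

```shell
# Print the NVIDIA GLX module and X driver versions found in an Xorg log.
# Returns 0 if both are present and identical, non-zero otherwise.
# check_nvidia_versions is a hypothetical helper name.
check_nvidia_versions() (
    log=$1
    # Take the first version number following each banner line.
    glx=$(sed -n 's/.*NVIDIA GLX Module  *\([0-9.]*\).*/\1/p' "$log" | head -n 1)
    drv=$(sed -n 's/.*NVIDIA dlloader X Driver  *\([0-9.]*\).*/\1/p' "$log" | head -n 1)
    echo "GLX module: ${glx:-not found}, X driver: ${drv:-not found}"
    [ -n "$glx" ] && [ "$glx" = "$drv" ]
)

# Example (log path is an assumption; adjust to where the image writes it):
# check_nvidia_versions /var/log/Xorg.0.log || echo "version mismatch or missing"
```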

#9
Posted 07/03/2018 06:13 AM   
I typed the log file by hand and may have transposed the driver version and date.
Below is from the old Xorg log.

[ 19.154] (II) NVIDIA GLX Module 390.25 Wed Jan 24 19:23:51 PST 2018

[ 19.154] (II) NVIDIA dlloader X Driver 390.25 Wed Jan 24 18:57:05 PST 2018

[ 19.154] (++) using VT number 1

[ 19.157] (EE) No devices detected.
[ 19.157] (EE)
Fatal server error

I ran the commands:

# ### device visible:
# lspci -vvv -d 10de:*
---- no match
# ### driver loaded:
# lsmod | grep nvidia
---- comes back with a list of nvidia modules (nvidia-drm, nvidia-modeset, nvidia, etc.)
# ### no errors in driver:
# dmesg | egrep 'nvidia|NVRM'
---- comes back with a listing showing irq info, Allocated GPU, Freed GPU (nvidia 0000:0b:00.0 irq 69 for MSI/MSI-X, nvidia-modeset: Freed GPU:0 (GPU-25aa9821-………………..)
# ### driver responding to basic command (try two times):
# nvidia-smi
---- comes back with a page showing info for the M5000 both times

#10
Posted 07/03/2018 02:26 PM   
You can try this:
- Deactivate SELinux with "setenforce 0" (see https://docs.nvidia.com/grid/latest/grid-vgpu-release-notes-citrix-xenserver/index.html#bug-200167868-gdm-start-failure-on-rhel-7-2).
- Start X under the root account: "Xorg :0 &".
- Start X with tracing, "strace -o /tmp/out Xorg :0 &", and analyze the errors in /tmp/out.
- Ask support at http://www.nvidia.com/object/support.html (but support is busy with running Skynet, moving to "Endeavor" or building "Voyager").
- Sell Quadro Mxxxx and buy Quadro >= K2000 (usually works better) or try to buy AMD cards.

That is all from me for this topic.

#11
Posted 07/04/2018 08:21 AM   
I will try the commands.

Selling the card and buying a K2000 is not an option; we first have to prove the GPU will work in a VMware VM on ESXi 6.7 with pass-through.

#12
Posted 07/10/2018 12:16 PM   