NVIDIA
nvidia-smi returns failed result on Cisco b200-m4
I have a B200-M4 server with an M6 card installed. However, after installing the VIB using the GRID quick start guide, nvidia-smi returns an error. Any idea why? I see this Cisco server is certified with GRID:


[root@localhost:~] nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

[root@localhost:~] esxcli hardware platform get
Platform Information
UUID: ????
Product Name: UCSB-B200-M4
Vendor Name: Cisco Systems Inc
Serial Number: ???
IPMI Supported: true


[root@localhost:/tmp] esxcli hardware pci list | grep -i Nvidia -A 20 -B 20
0000:81:00.0
   Address: 0000:81:00.0
   Segment: 0x0000
   Bus: 0x81
   Slot: 0x00
   Function: 0x0
   VMkernel Name:
   Vendor Name: nVidia Corporation
   Device Name: <class> 3D controller
   Configured Owner: Unknown
   Current Owner: VMkernel
   Vendor ID: 0x10de
   Device ID: 0x13f3
   SubVendor ID: 0x10de
   SubDevice ID: 0x1143
   Device Class: 0x0302
   Device Class Name: 3D controller
   Programming Interface: 0x00
   Revision ID: 0xa1
   Interrupt Line: 0x0b
   IRQ: 11
   Interrupt Vector: 0x35
   PCI Pin: 0x00
   Spawned Bus: 0x00
   Flags: 0x0201
   Module ID: -1
   Module Name: None
   Chassis: 0


[root@localhost:~] esxcli software vib list | grep -i nvidia
NVIDIA-vGPU-VMware_ESXi_6.0_Host_Driver 367.64-1OEM.600.0.0.2494585 NVIDIA VMwareAccepted 2017-01-02
[root@localhost:~]

#1
Posted 01/02/2017 08:48 AM   
Here is the ESXi version:
[root@localhost:~] esxcli system version get
Product: VMware ESXi
Version: 6.0.0
Build: Releasebuild-2494585
Update: 0
Patch: 0

#2
Posted 01/02/2017 08:53 AM   
You probably need to switch the card from "3D controller" to "VGA controller" mode. See http://nvidia.custhelp.com/app/answers/detail/a_id/4164/~/running-the-gpumodeswitch-tool-on-grid-m60%2Fm6-cards-on-vmware-esxi-returns-an and study the package "NVIDIA-gpumodeswitch-2016-04.zip" and "GRID gpumodeswitch User Guide.pdf", as well as blogs like https://virtuallyvisual.wordpress.com/2016/04/19/nvidia-m60-m6-problems-check-your-card-in-graphics-mode/ and https://www.youtube.com/watch?v=VAQhiNNFXxQ
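To make the "3D controller" versus "VGA controller" check quicker to repeat, here is a minimal sketch that reads "esxcli hardware pci list" output on stdin and prints the PCI device class for each NVIDIA function (vendor ID 0x10de). The helper name nvidia_pci_class is made up for illustration, and the sketch assumes awk is available in the ESXi shell. Class 0x0302 is the standard PCI code for a 3D controller and 0x0300 for a VGA-compatible controller; which GPU mode each corresponds to is described in the gpumodeswitch guide mentioned above.

```shell
#!/bin/sh
# Sketch only: nvidia_pci_class is a hypothetical helper, not part of any
# NVIDIA or VMware tool. It reads `esxcli hardware pci list` output on stdin.
nvidia_pci_class() {
    awk '
        # A device record starts with a bare PCI address line, e.g. 0000:81:00.0
        /^[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+\.[0-9a-fA-F]+[ \t]*$/ {
            addr = $1
            vendor = ""
        }
        # Anchored so that "SubVendor ID:" lines are not picked up
        /^[ \t]*Vendor ID:/ { vendor = tolower($NF) }
        # Matches "Device Class:" but not "Device Class Name:"
        /^[ \t]*Device Class:/ {
            if (vendor == "0x10de") {
                class = tolower($NF)
                if (class == "0x0300")      label = "VGA controller"
                else if (class == "0x0302") label = "3D controller"
                else                        label = "other"
                print addr, class, label
            }
        }
    '
}
```

Possible usage on the host: esxcli hardware pci list | nvidia_pci_class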

#3
Posted 01/02/2017 09:04 AM   
Thanks, I followed it, but I am now getting the error below. I see the same error both over an SSH connection and from the hypervisor's own console:

[root@localhost:~] gpumodeswitch --gpumode graphics

NVIDIA GPU Mode Switch Utility Version 1.23.0
Copyright (C) 2015, NVIDIA Corporation. All Rights Reserved.


Update GPU Mode of all adapters to "graphics"?
Press 'y' to confirm or 'n' to choose adapters or any other key to abort:
y

Updating GPU Mode of all eligible adapters to "graphics"


ERROR: Read card info failed by using character device based.
[root@localhost:~]
[root@localhost:~]
[root@localhost:~]
[root@localhost:~] gpumodeswitch --gpumode graphics

#4
Posted 01/02/2017 03:51 PM   
1) You can try to switch the mode over the console (CIMC) with the included bootable Linux gpumodeswitch.iso image (the selected mode is persistent on the card).
2) You can try to detect errors with "gpumodeswitch --listgpumodes" and/or the output log ("more /tmp/listgpumodes.txt"), as described in the PDF. You can also try "esxcli hardware pci list" or "esxcfg-info -a", or search "/var/log/vmkernel.log" for relevant errors ...
3) You can try to contact NVIDIA support if you paid for a Support Updates and Maintenance agreement (SUMS).
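For the log search in step 2, something like the sketch below may save some scrolling. It prints the last 20 lines of a log file that mention the driver. The function name find_gpu_log_lines is hypothetical, and the search patterns ("nvidia" and "nvrm", the prefix the NVIDIA kernel module normally uses in its messages) are an assumption about what the relevant lines look like on your host, so adjust them as needed.

```shell
#!/bin/sh
# Sketch only: find_gpu_log_lines is a hypothetical helper.
# Argument 1 (optional): path to the log file, default /var/log/vmkernel.log
find_gpu_log_lines() {
    log_file="${1:-/var/log/vmkernel.log}"
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 1
    fi
    # Case-insensitive match; the patterns are an assumption, not an official list
    grep -i -E 'nvidia|nvrm' "$log_file" | tail -n 20
}
```

Possible usage on the host: find_gpu_log_lines /var/log/vmkernel.log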

#5
Posted 01/02/2017 04:46 PM   
I did, and the card now appears to be in graphics mode. But nvidia-smi is still returning the same crap. I am out of here.

#6
Posted 01/02/2017 09:06 PM   
You can try more blogs (or go to Cisco directly for a validated setup), such as https://virtuallyvisual.wordpress.com/2016/06/20/new-cisco-validated-design-featuring-ucs-b200-m4-with-nvidia-grid-m6-vgpu-available-now/ - for example: use ESXi 6.0 Update 1, upgrade your Cisco UCS system to a version of Cisco UCS Manager that supports this card, and check the BIOS settings (the card must be mapped under the 4G address space for ESXi) ...
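Since ESXi 6.0 Update 1 is one of the items mentioned, here is a small sketch for pulling the update level out of "esxcli system version get" output, so it can be compared against whatever the validated design calls for. The function name esxi_update_level is made up for illustration, and the sketch assumes the output has an "Update:" line like the one posted earlier in this thread.

```shell
#!/bin/sh
# Sketch only: esxi_update_level is a hypothetical helper.
# Reads `esxcli system version get` output on stdin and prints the Update value.
esxi_update_level() {
    awk -F': *' '
        /^[ \t]*Update:/ {
            gsub(/[ \t]/, "", $2)   # strip stray whitespace around the value
            print $2
            exit
        }
    '
}
```

Possible usage on the host: esxcli system version get | esxi_update_level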

#7
Posted 01/02/2017 09:50 PM   
As an M6 user you should be entitled to support via your licenses. Please do raise a support case with the organisation you bought the licenses and support from e.g. Cisco or NVIDIA as appropriate.
Best wishes,
Rachel

#8
Posted 01/03/2017 03:03 PM   