NVIDIA
Dell R730, Tesla K20Xm, CentOS 7, Matlab
Hi There,


I have two Dell R730 servers, each with a Tesla K20Xm GPU card installed. I want to run Matlab (and/or other software) on these servers using GPU rendering.

First, I installed CentOS 7 on these servers using the "Server with GUI" option. Then I installed the NVIDIA driver via ELRepo using the following procedure:

  • systemctl set-default multi-user.target
  • reboot
  • yum install epel-release
  • yum install http://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm
  • yum install nvidia-detect
  • nvidia-detect (the output is kmod-nvidia)
  • yum install kmod-nvidia nvidia-x11-drv bumblebee
  • Modify /etc/bumblebee/bumblebee.conf according to (http://elrepo.org/tiki/bumblebee)
  • reboot
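
For reference, the steps above can be collected into a single script. This is only a sketch of the procedure as listed, not a verified fix: the `DRY_RUN` variable and the `run` and `install_steps` helpers are names introduced here for illustration, and by default the script only prints each command instead of executing it. The two reboots and the manual edit of /etc/bumblebee/bumblebee.conf are left as comments.

```shell
#!/usr/bin/env bash
# Sketch of the install procedure listed above.
# DRY_RUN, run and install_steps are hypothetical names, not part of any package.
# With DRY_RUN=1 (the default) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_steps() {
    # Boot to text mode instead of the graphical target.
    run systemctl set-default multi-user.target
    # (reboot here, as in the original procedure)

    # Add the EPEL and ELRepo repositories.
    run yum install epel-release
    run yum install http://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm

    # Ask nvidia-detect which driver package matches the card.
    run yum install nvidia-detect
    run nvidia-detect

    # Install the driver packages reported above, plus bumblebee.
    run yum install kmod-nvidia nvidia-x11-drv bumblebee
    # (edit /etc/bumblebee/bumblebee.conf by hand, then reboot)
}

install_steps
```

Running it with `DRY_RUN=0` as root would execute the commands for real; yum will still prompt for confirmation at each install.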


Note that if I do not install bumblebee, X cannot start and fails with the error (no screen found). The reason may be that the GPU has no display output.

Now, after the reboot, when I run the command nvidia-smi, the outputs on the two servers are:

  • Failed to initialize NVML: Function Not Found
  • Unable to determine the device handle for GPU 0000:04:00.0: Unable to communicate with GPU because it is insufficiently powered.
    This may be because not all required external power cables are attached, or the attached cables are not seated properly.
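
When comparing two machines that report different nvidia-smi errors, it can help to capture the same basic driver state on each. The following is a minimal sketch, assuming a Linux host with the usual /proc layout; `collect_gpu_info` is a hypothetical helper name, and the checks only report what is present without trying to diagnose either message.

```shell
#!/usr/bin/env bash
# Sketch: print basic NVIDIA driver state so two servers can be compared.
# collect_gpu_info is a hypothetical helper name introduced for illustration.
collect_gpu_info() {
    echo "== PCI devices =="
    if command -v lspci >/dev/null 2>&1; then
        # List NVIDIA devices visible on the PCI bus.
        lspci | grep -i nvidia || echo "no NVIDIA device listed"
    else
        echo "lspci not available"
    fi

    echo "== loaded modules =="
    if [ -r /proc/modules ]; then
        # Show whether the nvidia kernel module is currently loaded.
        grep -i '^nvidia' /proc/modules || echo "no nvidia module loaded"
    else
        echo "/proc/modules not readable"
    fi

    echo "== driver version =="
    if [ -r /proc/driver/nvidia/version ]; then
        # Version of the kernel module, to compare with the installed packages.
        cat /proc/driver/nvidia/version
    else
        echo "/proc/driver/nvidia/version not present"
    fi
}

collect_gpu_info
```

Running this on both servers and diffing the output should show whether they really have the same kernel module loaded.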


Both servers should have the same power cables attached to the GPU card. Can anyone help or suggest the correct installation procedure for CentOS 7 with this configuration?

Thanks

#1
Posted 01/09/2018 02:06 PM   