Showing posts with label Linux. Show all posts

Wednesday, August 19, 2009

SAAHPC presentation

I recently presented my work at the SAAHPC conference, held at NCSA in Urbana, IL. The keynote talk, by Pradeep Dubey, was on Massive Data Computing using Intel Larrabee. It was an excellent overview of why data transfer and management are the real challenges in today's computing. I liked the Connected Computing framing: Content, Connect and Compute. I have learned from real-time problems that having the fastest computation platform is not enough; it is even more important to sustain the streaming bandwidth of data into the platform. Once the data is in memory, computation is fast, but the real bottleneck is importing and offloading the data and keeping the streams synchronized (some of my PhD dissertation grumbling).

I liked the talk by Michael Garland on GPU Computing using CUDA. It was informative, particularly his insights on the Thrust template library. Thrust is open source and is hosted on Google Code. I have to start using it for my next CUDA project.

Finally, here's a link to my presentation on "Accelerating Particle Image Velocimetry using Hybrid Architectures".

Friday, July 31, 2009

Installing CUDA 2.3

Installing CUDA 2.3 is pretty easy and straightforward. However, on my Mac, the CUDA SDK examples are now in
/Developer/GPU Computing

In addition to the drivers for Leopard, there is now a separate version for Snow Leopard :) New documentation includes the CUDA Best Practices Guide. This is the correct link (thanks to a post on the NVIDIA forums); the CUDA Resource page does not point to it.

Have a good time accelerating your apps using CUDA :)

Monday, June 29, 2009

cuda-gdb error on CentOS

CUDA 2.2 comes with native debugger support through cuda-gdb. However, I had some problems configuring cuda-gdb on my Linux distro (CentOS).
When I run cuda-gdb fresh after installation, here's what I see:
$ cuda-gdb
cuda-gdb: error while loading shared libraries: libtermcap.so.2: cannot open shared object file: No such file or directory

I tried searching for the missing library libtermcap.so.2 using
locate libtermcap

A google search led me to http://forums.nvidia.com/index.php?showtopic=96987
and I installed libncurses and linked it correctly.
sudo yum install ncurses.x86_64
sudo ln -s /usr/lib64/libncurses.so /usr/local/cuda/lib/libtermcap.so.2
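If your distro keeps its libraries elsewhere, the same fix generalizes: point a libtermcap.so.2 symlink at any ABI-compatible curses library. Here is a minimal sketch of the pattern, run in a scratch directory so it can be tried without root (the paths are illustrative; in practice the link target is the real libncurses.so and the link goes into /usr/local/cuda/lib, as above):

```shell
# Sketch of the symlink workaround in a scratch directory (no root needed).
libdir=$(mktemp -d)
touch "$libdir/libncurses.so"                  # stand-in for the installed library
ln -s "$libdir/libncurses.so" "$libdir/libtermcap.so.2"
readlink "$libdir/libtermcap.so.2"             # prints the path of the link target
```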

That seemed to do the trick !!!
$ cuda-gdb
NVIDIA (R) CUDA Debugger
BETA release
Portions Copyright (C) 2008,2009 NVIDIA Corporation
GNU gdb 6.6
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu".
(cuda-gdb)

For more info on cuda-gdb refer to: http://developer.download.nvidia.com/compute/cuda/2_2/toolkit/docs/CUDA_GDB_User_Manual_2.2beta.pdf

Tuesday, May 12, 2009

Installing CUDA 2.2 on MacPro running CentOS 64-bit

My test machine is an early 2008 Mac Pro with an NVIDIA Tesla C1060. Upgrading to CUDA 2.2 on this Mac was relatively easy. Please read my previous post on installing CUDA 2.1 before attempting to install CUDA 2.2 on a CentOS 64-bit distro.

Step 1: Download packages:
wget http://developer.download.nvidia.com/compute/cuda/2_2/drivers/cudadriver_2.2_linux_64_185.18.08-beta.run
wget http://developer.download.nvidia.com/compute/cuda/2_2/toolkit/cudatoolkit_2.2_linux_64_rhel5.3.run
wget http://developer.download.nvidia.com/compute/cuda/2_2/sdk/cudasdk_2.2_linux.run
wget http://developer.download.nvidia.com/compute/cuda/2_2/toolkit/cudagdb_2.2_linux_64_rhel5.3.run
Change permissions
chmod +x cudadriver_2.2_linux_64_185.18.08-beta.run
chmod +x cudatoolkit_2.2_linux_64_rhel5.3.run
chmod +x cudagdb_2.2_linux_64_rhel5.3.run
chmod +x cudasdk_2.2_linux.run
Step 2: Find your kernel source and install the corresponding kernel-devel and kernel-headers:
uname -r
2.6.27.21-170.2.56.fc10.x86_64

sudo yum install kernel-devel
sudo yum install kernel-headers
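The generic kernel-devel package may not match a custom or older running kernel, and the driver build in Step 3 needs headers for the kernel that is actually running. As a sketch, the version can be pinned from uname (the yum line is commented out so the version string can be inspected first):

```shell
# Match kernel-devel/kernel-headers to the running kernel (sketch).
kver=$(uname -r)
echo "need kernel-devel-$kver and kernel-headers-$kver"
# sudo yum install -y "kernel-devel-$kver" "kernel-headers-$kver"
```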
Step 3: Install the CUDA driver first, followed by the toolkit, gdb and finally the SDK.
sudo ./cudadriver_2.2_linux_64_185.18.08-beta.run --kernel-source-path /usr/src/kernels/2.6.27.21-170.2.56.fc10.x86_64
sudo ./cudatoolkit_2.2_linux_64_rhel5.3.run
sudo ./cudagdb_2.2_linux_64_rhel5.3.run
./cudasdk_2.2_linux.run
Make sure that your path is configured correctly. (Refer to previous post)
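For reference, these are the path settings from the CUDA 2.1 post; append them to ~/.bashrc (or export them in the current shell) so the toolchain is found:

```shell
# CUDA toolchain locations (from the CUDA 2.1 setup).
export PATH=$PATH:/usr/local/cuda/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib
# Quick sanity check that the bin directory is now on PATH:
echo "$PATH" | grep -q "/usr/local/cuda/bin" && echo "cuda on PATH"
```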

Step 4: Enable the cuda script mentioned in the previous post.
sudo service cuda start
Check the NVIDIA driver status; you should see this:
ls /dev/nv*
/dev/nvidia0 /dev/nvidiactl /dev/nvram
Step 5: Make sure you do a fresh install of the SDK if you are upgrading.
cd NVIDIA_CUDA_SDK/
make
This should make all your projects in the following directory:
$HOME/NVIDIA_CUDA_SDK/bin/linux/release
Executing the deviceQuery executable shows me this:
./deviceQuery
CUDA Device Query (Runtime API) version (CUDART static linking)
There is 1 device supporting CUDA

Device 0: "Tesla C1060"
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 3
Total amount of global memory: 4294705152 bytes
Number of multiprocessors: 30
Number of cores: 240
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 16384
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 1.30 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: No
Integrated: No
Support host page-locked memory mapping: Yes
Compute mode: Default (multiple host threads can use this device simultaneously)

Test PASSED

Press ENTER to exit...
Congrats !!! CUDA 2.2 is configured on your 64-bit CentOS Mac Pro, and the NVIDIA Tesla C1060 is recognized. Looking forward to posting some examples with the zero-copy feature :)


Saturday, February 21, 2009

Installing Nvidia Tesla C1060 on a Mac Pro with CentOS

After my last post on getting CUDA 2.0 to run on my Mac, I will go through the steps for installing an Nvidia Tesla C1060 on an early 2008 Mac Pro with CentOS or an equivalent RHEL 5.xx Linux distro. The Tesla C1060 has 240 streaming processor cores (1.3 GHz), 4GB of onboard memory, requires PCIe x16, occupies two slots and has a 6-pin and an 8-pin connector.

Instructions:

1. Physical install: Please download the C1060 User Manual and go through it before installing the C1060 physically. The C1060 has two power connectors, 6-pin and 8-pin; for a Mac Pro, please order the booster cable (that's the correct Apple jargon). The booster cable is mentioned in the Mac Pro documentation for connecting the Nvidia Quadro FX 5600. After inserting the C1060, refer to page 16 of the manual to see the LED status. Red indicates something wrong in connecting the booster cables. (As an afterthought, if you have a PC, you can buy the standard 6-pin/8-pin connector from Newegg for about $6.99)

2. Software install:
Step 1: Get the latest CUDA drivers, toolkit and SDK from here. I selected the CUDA 2.1 drivers for Linux 64-bit for RHEL 5.xx. (You can paste the code directly to your terminal)

wget http://developer.download.nvidia.com/compute/cuda/2_1/drivers/NVIDIA-Linux-x86_64-180.22-pkg2.run
wget http://developer.download.nvidia.com/compute/cuda/2_1/toolkit/cudatoolkit_2.1_linux64_rhel5.2.run
wget http://developer.download.nvidia.com/compute/cuda/2_1/SDK/cuda-sdk-linux-2.10.1215.2015-3233425.run


Change permissions on the downloaded files
chmod +x NVIDIA-Linux-x86_64-180.22-pkg2.run
chmod +x cuda-sdk-linux-2.10.1215.2015-3233425.run
chmod +x cudatoolkit_2.1_linux64_rhel5.2.run


Step 2: Install the following packages or see if you have them installed on your system already.

sudo yum install freeglut-devel
sudo yum install libXi-devel


Step 3: Install the CUDA driver first, the toolkit and the SDK finally.
sudo ./NVIDIA-Linux-x86_64-180.22-pkg2.run
sudo ./cudatoolkit_2.1_linux64_rhel5.2.run
./cuda-sdk-linux-2.10.1215.2015-3233425.run


The Nvidia driver builds the kernel module for your Linux distro if it cannot find a precompiled one. Click yes and go on. The toolkit installs CUDA to /usr/local/cuda by default. The SDK installs into the NVIDIA_CUDA_SDK directory in your home folder by default.

Paste the following in your terminal (thanks to the Ubuntu instructions here)
echo "# CUDA stuff
PATH=\$PATH:/usr/local/cuda/bin
LD_LIBRARY_PATH=\$LD_LIBRARY_PATH:/usr/local/cuda/lib
export PATH
export LD_LIBRARY_PATH" >> ~/.bashrc
Step 4: On a RHEL Linux distro, a startup script is required to turn on and off the Nvidia device. This startup script is posted on the Nvidia forums.
#!/bin/bash
#
# Startup/shutdown script for nVidia CUDA
#
# chkconfig: 345 80 20
# description: Startup/shutdown script for nVidia CUDA

# Source function library.
. /etc/init.d/functions

DRIVER=nvidia
RETVAL=0

# Create /dev nodes for nvidia devices
function createnodes() {
    # Count the number of NVIDIA controllers found.
    N3D=`/sbin/lspci | grep -i NVIDIA | grep "3D controller" | wc -l`
    NVGA=`/sbin/lspci | grep -i NVIDIA | grep "VGA compatible controller" | wc -l`

    N=`expr $N3D + $NVGA - 1`
    for i in `seq 0 $N`; do
        mknod -m 666 /dev/nvidia$i c 195 $i
        RETVAL=$?
        [ "$RETVAL" = 0 ] || exit $RETVAL
    done

    mknod -m 666 /dev/nvidiactl c 195 255
    RETVAL=$?
    [ "$RETVAL" = 0 ] || exit $RETVAL
}

# Remove /dev nodes for nvidia devices
function removenodes() {
    rm -f /dev/nvidia*
}

# Start daemon
function start() {
    echo -n $"Loading $DRIVER kernel module: "
    modprobe $DRIVER && success || failure
    RETVAL=$?
    echo
    [ "$RETVAL" = 0 ] || exit $RETVAL

    echo -n $"Initializing CUDA /dev entries: "
    createnodes && success || failure
    RETVAL=$?
    echo
    [ "$RETVAL" = 0 ] || exit $RETVAL
}

# Stop daemon
function stop() {
    echo -n $"Unloading $DRIVER kernel module: "
    rmmod -f $DRIVER && success || failure
    RETVAL=$?
    echo
    [ "$RETVAL" = 0 ] || exit $RETVAL

    echo -n $"Removing CUDA /dev entries: "
    removenodes && success || failure
    RETVAL=$?
    echo
    [ "$RETVAL" = 0 ] || exit $RETVAL
}

# See how we were called
case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    restart)
        stop
        start
        ;;
    *)
        echo $"Usage: $0 {start|stop|restart}"
        RETVAL=1
esac
exit $RETVAL


Copy this startup script into a file, "cuda_startup_script".
cp $HOME/cuda_drivers_2.1/cuda_startup_script cuda
sudo cp cuda /etc/init.d/cuda
sudo chmod 755 /etc/init.d/cuda
sudo chkconfig --add cuda
sudo chkconfig cuda on
sudo service cuda start


After this, check if the device is recognized by executing
ls /dev/nv*


You should see the following in your terminal
/dev/nvidia0  /dev/nvidiactl  /dev/nvram


The "/dev/nvidiactl" indicates that the Nvidia card is recognized on your system. This is necessary when you need to compile your projects in the SDK/projects folder.
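A quick way to script that check, hedged for the case where the driver has not been loaded yet:

```shell
# Check that the NVIDIA control node exists and is a character device.
if [ -c /dev/nvidiactl ]; then
    echo "nvidiactl present: card is recognized"
else
    echo "nvidiactl missing: try 'sudo service cuda start'"
fi
```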

Also, on executing lspci, you should see the following:
sudo /sbin/lspci -d "10de:*" -v -xxx
02:00.0 3D controller: nVidia Corporation Unknown device 05e7 (rev a1)
Subsystem: nVidia Corporation Unknown device 066a
Flags: bus master, fast devsel, latency 0, IRQ 193
Memory at 96000000 (32-bit, non-prefetchable) [size=16M]
Memory at 90000000 (64-bit, prefetchable) [size=64M]
Memory at 94000000 (64-bit, non-prefetchable) [size=32M]
I/O ports at 3000 [size=128]
[virtual] Expansion ROM at 97000000 [disabled] [size=512K]
Capabilities: [60] Power Management version 3
Capabilities: [68] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
Capabilities: [78] Express Endpoint IRQ 0
00: de 10 e7 05 07 00 10 00 a1 00 02 03 00 00 00 00
10: 00 00 00 96 0c 00 00 90 00 00 00 00 04 00 00 94
20: 00 00 00 00 01 30 00 00 00 00 00 00 de 10 6a 06
30: 00 00 00 00 60 00 00 00 00 00 00 00 0b 01 00 00
40: de 10 6a 06 00 00 00 00 00 00 00 00 00 00 00 00
50: 01 00 00 00 01 00 00 00 ce d6 23 00 00 00 00 00
60: 01 68 03 00 08 00 00 00 05 78 80 00 00 00 00 00
70: 00 00 00 00 00 00 00 00 10 00 01 00 e0 84 00 00
80: 10 29 00 00 01 3d 00 08 08 00 01 01 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00


Step 5:
cd NVIDIA_CUDA_SDK/
make

This should make all your projects in the following directory
$HOME/NVIDIA_CUDA_SDK/bin/linux/release

Executing the deviceQuery executable should show this:
./deviceQuery
There is 1 device supporting CUDA

Device 0: "Tesla C1060"
Major revision number: 1
Minor revision number: 3
Total amount of global memory: 4294705152 bytes
Number of multiprocessors: 30
Number of cores: 240
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 16384
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 1.30 GHz
Concurrent copy and execution: Yes

Test PASSED

Press ENTER to exit...


Congratulations !!! Your Nvidia Tesla C1060 is configured on your system running CentOS or a similar RHEL distro. I shall post some exercises soon. For further questions, refer to the Nvidia forums.