Tuesday, May 12, 2009

Installing CUDA 2.2 on MacPro running CentOS 64-bit

My test machine is an early 200 Mac Pro with an NVIDIA Tesla C1060. Upgrading to CUDA 2.2 on this Mac was relatively easy. Please read my previous post on installing CUDA 2.1 before attempting to install CUDA 2.2 on a CentOS 64-bit distro.

Step 1: Download packages:
wget http://developer.download.nvidia.com/compute/cuda/2_2/drivers/cudadriver_2.2_linux_64_185.18.08-beta.run
wget http://developer.download.nvidia.com/compute/cuda/2_2/toolkit/cudatoolkit_2.2_linux_64_rhel5.3.run
wget http://developer.download.nvidia.com/compute/cuda/2_2/sdk/cudasdk_2.2_linux.run
wget http://developer.download.nvidia.com/compute/cuda/2_2/toolkit/cudagdb_2.2_linux_64_rhel5.3.run
Change permissions:
chmod +x cudadriver_2.2_linux_64_185.18.08-beta.run
chmod +x cudatoolkit_2.2_linux_64_rhel5.3.run
chmod +x cudagdb_2.2_linux_64_rhel5.3.run
chmod +x cudasdk_2.2_linux.run
Step 2: Find the version of your running kernel and install the corresponding kernel-devel and kernel-headers packages:
uname -r
2.6.27.21-170.2.56.fc10.x86_64

sudo yum install kernel-devel
sudo yum install kernel-headers
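A plain "yum install kernel-devel" pulls the newest package, which may not match the kernel you are actually running. Here is a minimal sketch of how to derive the matching package name and the source path passed to the driver installer in Step 3. The helper names kernel_devel_pkg and kernel_src_path are my own, and the /usr/src/kernels/<release> layout is assumed from the path used in this post; other distros may lay it out differently.

```shell
# Hypothetical helpers: build the kernel-devel package name and the
# kernel source path for a given kernel release string.
kernel_devel_pkg() { printf 'kernel-devel-%s\n' "$1"; }
kernel_src_path()  { printf '/usr/src/kernels/%s\n' "$1"; }

release="$(uname -r)"
echo "package: $(kernel_devel_pkg "$release")"
echo "source:  $(kernel_src_path "$release")"

# To install the matching package (needs root):
#   sudo yum install "$(kernel_devel_pkg "$(uname -r)")"
```

If the printed source directory does not exist after installing, the installed kernel-devel version probably differs from the running kernel.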
Step 3: Install the CUDA driver first, followed by the toolkit, cuda-gdb, and finally the SDK.
sudo ./cudadriver_2.2_linux_64_185.18.08-beta.run --kernel-source-path /usr/src/kernels/2.6.27.21-170.2.56.fc10.x86_64
sudo ./cudatoolkit_2.2_linux_64_rhel5.3.run
sudo ./cudagdb_2.2_linux_64_rhel5.3.run
./cudasdk_2.2_linux.run
Make sure that your PATH is configured correctly (refer to the previous post).
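For readers who do not have the previous post handy, this is roughly what the path setup looks like. It is only a sketch: it assumes the toolkit went into its default prefix /usr/local/cuda with libraries under lib (adjust if your install uses a different prefix or library directory), and prepend_path is a helper name I made up to avoid adding the same directory twice.

```shell
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda}"   # assumed install prefix

# Print $2 with directory $1 prepended, unless $1 is already listed in it.
prepend_path() {
  case ":$2:" in
    *":$1:"*) printf '%s\n' "$2" ;;
    *)        printf '%s\n' "$1${2:+:$2}" ;;
  esac
}

PATH="$(prepend_path "$CUDA_HOME/bin" "$PATH")"
LD_LIBRARY_PATH="$(prepend_path "$CUDA_HOME/lib" "${LD_LIBRARY_PATH:-}")"
export PATH LD_LIBRARY_PATH
```

Put these lines in your shell startup file (for example ~/.bash_profile) so that nvcc and the CUDA runtime libraries are found in every new shell.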

Step 4: Enable the cuda script mentioned in the previous post.
sudo service cuda start
Check the NVIDIA driver status; you should see these device nodes:
ls /dev/nv*
/dev/nvidia0 /dev/nvidiactl /dev/nvram
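The same check can be scripted, which is handy if you want to verify the driver after every reboot. A minimal sketch, assuming a single GPU (so only nvidia0 and nvidiactl are expected); check_nvidia_nodes is a hypothetical helper name, and the optional directory argument exists only so the function can be tried against a scratch directory.

```shell
# Succeeds if the NVIDIA control node and the first GPU node exist under
# the given directory (default /dev); otherwise names the first missing node.
check_nvidia_nodes() {
  dir="${1:-/dev}"
  for node in nvidiactl nvidia0; do
    if [ ! -e "$dir/$node" ]; then
      echo "missing: $dir/$node"
      return 1
    fi
  done
  echo "ok: NVIDIA device nodes present in $dir"
}
```

Run it as "check_nvidia_nodes" with no arguments; if it reports a missing node, the cuda service from Step 4 has probably not been started.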
Step 5: Make sure you do a fresh install of the SDK if you are upgrading.
cd NVIDIA_CUDA_SDK/
make
This should build all the SDK projects into the following directory:
$HOME/NVIDIA_CUDA_SDK/bin/linux/release
Executing the deviceQuery executable shows me this:
./deviceQuery
CUDA Device Query (Runtime API) version (CUDART static linking)
There is 1 device supporting CUDA

Device 0: "Tesla C1060"
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 3
Total amount of global memory: 4294705152 bytes
Number of multiprocessors: 30
Number of cores: 240
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 16384
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 1.30 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: No
Integrated: No
Support host page-locked memory mapping: Yes
Compute mode: Default (multiple host threads can use this device simultaneously)

Test PASSED

Press ENTER to exit...
Congrats !!! CUDA 2.2 is configured on your 64-bit CentOS Mac Pro and also recognizes the NVIDIA Tesla C1060. Looking forward to posting some examples with the zero copy feature :)
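If you would rather not eyeball the deviceQuery output each time, the check can be sketched as a small filter. It relies only on two strings visible in the outputs shown in these posts, "Test PASSED" and "Device Emulation"; the function name device_query_ok is my own invention.

```shell
# Reads deviceQuery output on stdin. Succeeds only if the test passed
# and the reported device is real hardware, not the CPU emulation device.
device_query_ok() {
  out="$(cat)"
  printf '%s\n' "$out" | grep -q 'Test PASSED' || return 1
  if printf '%s\n' "$out" | grep -q 'Device Emulation'; then
    return 1
  fi
  return 0
}
```

Usage would look something like "echo | ./deviceQuery | device_query_ok && echo GPU detected", where the leading echo answers the "Press ENTER to exit..." prompt.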


Friday, May 8, 2009

Installing CUDA 2.2 on Mac OS X

Installing CUDA 2.2 on Mac OS X is pretty trivial. My only warning/suggestion: manually remove the CUDA 2.1 SDK directory from /Developer/CUDA and do a fresh install (always recommended).

I tried upgrading without removing the CUDA 2.1 SDK, and my image denoising project failed to build. This was the linker error I saw: "ld: duplicate symbol _cutFree in ../../lib/libcutil.a(cutil.cpp.o)"

My deviceQuery output:
hogwarts:emurelease vivekv$ ./deviceQuery
CUDA Device Query (Runtime API) version (CUDART static linking)
There is no device supporting CUDA.

Device 0: "Device Emulation (CPU)"
CUDA Capability Major revision number: 9999
CUDA Capability Minor revision number: 9999
Total amount of global memory: 4294967295 bytes
Number of multiprocessors: 16
Number of cores: 128
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 8192
Warp size: 1
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 1.35 GHz
Concurrent copy and execution: No
Run time limit on kernels: No
Integrated: Yes
Support host page-locked memory mapping: Yes
Compute mode: Default (multiple host threads can use this device simultaneously)

Test PASSED

Press ENTER to exit...



I shall also post my experience installing CUDA 2.2 on a Mac Pro running CentOS 64-bit.