I recently started a postdoctoral position and one of my first tasks was to get Sun Grid Engine up and running on our new Xserve cluster. I downloaded the sge-6.1u2 darwin binaries from http://gridengine.sunsource.net/ and then began the installation. The installation scripts are very picky about DNS and hostnames.

The first step in installing Sun Grid Engine is to set up the qmaster. In this case the head node will act as the master and export the Sun Grid Engine installation directory to the nodes via NFS. OS X needed some prodding before it would behave as expected.

Don’t use the automounter via LDAP; for whatever reason it does not work. I created a /Cluster directory and unpacked the Sun Grid Engine tarballs into /Cluster/sge. Share the /Cluster directory on the head node via NFS, but be sure to untick the three checkboxes in the NFS export settings and stop sharing it via any other methods. The path to the root SGE directory needs to be the same on all nodes.
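
For reference, creating the directory and unpacking the distribution on the head node looked something like this; the exact tarball names depend on which sge-6.1u2 packages were downloaded,

mkdir -p /Cluster/sge
cd /Cluster/sge
# Common files plus the Darwin binary package; the filenames below are illustrative.
tar -xzf ~/sge-6.1u2-common.tar.gz
tar -xzf ~/sge-6.1u2-bin-darwin.tar.gz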

Once the directory is exported, issue the following commands on each node. I used KDE’s Konsole application to send input to all sessions; the Apple Remote Desktop tool can also send the commands to each node. The commands should be run as root or preceded with sudo.

mkdir /Cluster
nicl . -create /mounts/head:\\/Cluster
nicl . -append /mounts/head:\\/Cluster type nfs
nicl . -append /mounts/head:\\/Cluster opts ""
nicl . -append /mounts/head:\\/Cluster dir /Cluster
kill -1 `cat /var/run/automount.pid`
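
To check that the automounter has picked up the share, listing the directory from a node should trigger the mount,

ls /Cluster/sge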

This step tells the automounter about the share and causes it to reread its database and mount the share. The next steps are interactive, so Konsole’s send input to all sessions feature was the most appropriate tool. (I didn’t find a similar program for Mac OS X. Apple Remote Desktop can send scripts, but not in an interactive manner, so I used my Gentoo laptop with KDE.) The sge_qmaster and sge_execd services must be added to /etc/services. Add the following lines to /etc/services on the master and all compute nodes.

sge_qmaster 536/tcp
sge_execd 537/tcp
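
If you would rather script this step than edit the file by hand on every machine, something along these lines should work when run as root,

grep -q '^sge_qmaster' /etc/services || echo 'sge_qmaster 536/tcp' >> /etc/services
grep -q '^sge_execd' /etc/services || echo 'sge_execd 537/tcp' >> /etc/services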

Due to the master node having two interfaces and using both an internal and an external hostname, it is necessary to add the hostnames to the /etc/hosts file and perform some extra configuration on the qmaster. Add the following to the head node’s /etc/hosts file,

10.1.1.1 external external.dns.example
192.168.1.100 head head.cluster.private
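
A quick way to confirm that both names now resolve is to ping each of them once,

ping -c 1 head
ping -c 1 external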

Then run the following,

cd /Cluster/sge
export SGE_ROOT=/Cluster/sge
./install_qmaster

This script will interactively set up the qmaster. The node name is “external”; other defaults can be accepted. Once complete it will probably complain a little about hostnames not matching. I chose classic spooling along with the standard scheduler. All nodes were added at this step, but more can easily be added later.
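
To have the SGE commands available in every new shell on the head node, the settings file created by the installer can be sourced from your shell profile (~/.profile or ~/.bash_profile, depending on your setup), for example,

echo 'source /Cluster/sge/external/common/settings.sh' >> ~/.profile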

Once the qmaster installation was complete, the file /Cluster/sge/external/common/host_aliases was created with the line,

head external
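
This can be done from the head node with, for example,

echo "head external" > /Cluster/sge/external/common/host_aliases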

The host_aliases file simply tells SGE that the two hostnames refer to the same machine. I then ran /Cluster/sge/external/common/sgemaster to restart the master. Next each node must be configured. Log in to the node, become root and issue the following commands,

source /Cluster/sge/external/common/settings.sh
cd /Cluster/sge
./install_execd

Follow the prompts and accept the defaults. Be sure to define a local spool directory for each node; I used /Volumes/Scratch/sge. Once complete, each node will be added to the grid and can be seen by issuing a qhost command on the head node.
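
Once the settings file is sourced, qhost prints one line per execution host showing its architecture, CPU count, load and memory usage,

source /Cluster/sge/external/common/settings.sh
qhost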

qconf -as head

This adds the head node as a submit host. It is not necessary to log into any of the nodes to perform tasks, as qsub will now submit tasks to the nodes and monitor them for completion. You should also add some manager accounts: these are user accounts that have full administrative access to SGE.

qconf -am admin,user123
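
The current list of managers can be checked with,

qconf -sm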

The next step is to set up OpenMPI and configure the parallel environment. Issue the following command,

qconf -ap mpi

Then I used the following configuration for the parallel environment,

pe_name mpi
slots 64
user_lists NONE
xuser_lists NONE
start_proc_args /Cluster/sge/mpi/startmpi.sh $pe_hostfile
stop_proc_args /Cluster/sge/mpi/stopmpi.sh
allocation_rule $pe_slots
control_slaves FALSE
job_is_first_task TRUE
urgency_slots min
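
One point worth checking: a parallel environment normally also has to be listed in a queue’s pe_list before jobs can request it. Assuming the default all.q queue created by the installer, attaching the new PE looks something like,

qconf -aattr queue pe_list mpi all.q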

This allows jobs to be submitted to the grid using the following command, which requests four processors,

qsub -R y -pe mpi 4 test.sh

This job will reserve its resources and use four slots on a single node. The slot count can be changed to reserve two slots or just one.
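
For completeness, test.sh might look something like the following sketch; my_mpi_program is a placeholder for your own executable, mpirun is assumed to be the Open MPI launcher on the PATH, and $NSLOTS is filled in by SGE with the number of slots granted to the job,

#!/bin/sh
#$ -S /bin/sh
#$ -cwd
# my_mpi_program is a placeholder for the real MPI executable.
mpirun -np $NSLOTS ./my_mpi_program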