mpich2 question: mpdboot fails with "invalid port info: no_port"
I apologize if this is the wrong forum to ask this.
I'm the sysadmin for a group of Macs. We're trying to set up a cluster which will use mpich2 for the time being. gfortran is also required.
I also have an installation of about 50 other Macs on the network which have idle cycles we could use, if I can get this going.
I have two computers, which we'll call Cluster1 and Cluster2, with identical OS versions (10.4.11), mpich2 installs, etc. They're attached to the network and bound to an Open Directory server, which automounts an mpi directory where I keep the executables.
I also have a seldom-used regular machine on the network, on which I installed mpich2 and have it working locally. We'll call that one Network1.
Each computer is a single processor G4.
I have a network-based account, mpiaccount, which is accessible from any machine on the network. This account has the .mpd.conf and mpd.hosts files in it. mpd.hosts is configured with the IP addresses for Cluster1, Cluster2 and Network1.
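In case the file layout matters, here is a rough sketch of how those two files are set up. The addresses and the secret word are placeholders, not our real values, and the sketch writes into a scratch directory (mpd-demo, an illustrative name) rather than into the account itself.

```shell
#!/bin/sh
# Sketch only: the directory, addresses, and secret word are placeholders.
conf_dir=./mpd-demo
mkdir -p "$conf_dir"

# mpd.hosts: one host per line (documentation-range addresses standing in
# for Cluster1, Cluster2 and Network1).
cat > "$conf_dir/mpd.hosts" <<'EOF'
192.0.2.11
192.0.2.12
192.0.2.13
EOF

# .mpd.conf: holds the shared secret word; mpd expects this file to be
# readable and writable by its owner only.
printf 'secretword=%s\n' 'change-me' > "$conf_dir/.mpd.conf"
chmod 600 "$conf_dir/.mpd.conf"
```

Every machine sees the same copy of both files, since the account is network-based.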
If I am logged in to mpiaccount on Cluster1 and Cluster2, and I fire up mpdboot -n 2, it works great.
If I am logged into mpiaccount on Cluster1 only, and I fire up mpdboot -n 2, I get the error "mpdboot_cluster2 (handle_mpd_output 406): from mpd on cluster2, invalid port info: no_port"
If I am logged into mpiaccount on Cluster2 only, and I fire up mpdboot -n 2, I get the error "mpdboot_cluster2 (handle_mpd_output 406): from mpd on cluster1, invalid port info: no_port"
If I am logged into mpiaccount on Cluster1, Cluster2 and Network1, and I fire up mpdboot -n 3 from Cluster1, I get the error "mpdboot_cluster1 (handle_mpd_output 406): from mpd on network1, invalid port info: no_port"
I get the same error if I fire it up from Cluster2.
If I fire it up from Network1, any number n greater than 1 results in that same error message.
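For concreteness, the sequence described above looks roughly like the sketch below. It assumes mpd.hosts is in the current directory, and try_ring is just a made-up helper name for this post. The guard at the top is only there so the script fails cleanly on a machine where the mpich2 tools are not on the PATH.

```shell
#!/bin/sh
# Sketch of the boot/trace/exit sequence; try_ring is a hypothetical helper.
try_ring() {
    n="$1"
    # Fail cleanly if the mpich2 tools are not on the PATH.
    if ! command -v mpdboot >/dev/null 2>&1; then
        echo "mpdboot not found on PATH" >&2
        return 127
    fi
    # Start n mpds: one locally, the rest on hosts taken from mpd.hosts.
    mpdboot -n "$n" -f mpd.hosts || return 1
    mpdtrace      # list the hosts that joined the ring
    mpdallexit    # shut the ring down again
}
```

Running try_ring 2 on Cluster1 is the case that only works while mpiaccount is also logged in on Cluster2.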
I've spent the last couple of days scouring the web, and maybe I'm simply blind, but I can't seem to find a solution.
Does anyone here have any ideas?
Thank you
John


