Message passing with MPL


MPL is IBM's proprietary message passing library. It is not as portable as PVM; however, a number of tools developed specifically for the SP2 require the use of MPL message passing. (Information on these tools will soon be added to this primer.)

We assume that the reader has a basic understanding of message passing on distributed-memory parallel architectures. MPL is described here primarily by example; for the interested reader, MPL documentation is available online.

The function/subroutine calls covered in this tutorial are:

        In C:                 In Fortran:
        -----------------------------------------
        mpc_environ()         call MP_ENVIRON()
        mpc_task_query()      call MP_TASK_QUERY()
        mpc_send()            call MP_SEND()
        mpc_wait()            call MP_WAIT()
        mpc_recv()            call MP_RECV()
        mpc_brecv()           call MP_BRECV()
        mpc_bcast()           call MP_BCAST()

Getting started

As with PVM, calls to MPL communication routines are made in the same way as ordinary subroutine calls in Fortran or function calls in C. Corresponding to the pvm_mytid() call in PVM, MPL provides the MP_ENVIRON() call, which returns the number of processors (set at runtime) and the task id of the processor running this particular instance of the program. The syntax of the MP_ENVIRON call is:

call MP_ENVIRON(nprocs,nodeid) 

Or, in C:

rc=mpc_environ(&nprocs, &nodeid);
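
Once nprocs and nodeid are known, a common next step is for each task to work out which part of the problem it owns. The following is a minimal sketch of one way to do that in plain C. Note that my_share() is a hypothetical helper, not an MPL routine, and the even-split rule it uses is only an assumption made for illustration.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of MPL): split n work items across
 * nprocs tasks as evenly as possible. Task nodeid is given the
 * half-open range [*first, *last). The first (n % nprocs) tasks
 * each receive one extra item.
 */
void my_share(int nprocs, int nodeid, int n, int *first, int *last)
{
    int base, extra;

    assert(nprocs > 0 && nodeid >= 0 && nodeid < nprocs && n >= 0);

    base  = n / nprocs;   /* items that every task gets      */
    extra = n % nprocs;   /* leftover items to be spread out */

    *first = nodeid * base + (nodeid < extra ? nodeid : extra);
    *last  = *first + base + (nodeid < extra ? 1 : 0);
}
```

For example, with 10 items on 4 tasks, tasks 0 through 3 would be given the ranges [0,3), [3,6), [6,8) and [8,10).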

MPL provides the MP_TASK_QUERY call to obtain information about system variables and constants (beyond what's returned by the MP_ENVIRON call). Depending on the last argument to the MP_TASK_QUERY call (the "query-type"), various information can be retrieved (see the man-page for complete information). The most common use of this call is to obtain the system values for "dontcare" and "allgrp". These are, respectively, the "wildcard" value, used for indiscriminately receiving messages, and the default processor group containing all nodes, often used in broadcasting information across the machine.

Sample of a Fortran MP_TASK_QUERY call:


      integer nbuf(4), dontcare, allgrp
c
c     Learn value of allgrp and dontcare, for later use with broadcast:
c
      call MP_TASK_QUERY(nbuf, 4, 3)
      dontcare = nbuf(1)
      allgrp = nbuf(4)


Sample of a C mpc_task_query() call:

   int nbuf[4], allgrp, dontcare;

/*
/  Learn the value of allgrp and dontcare, for use with broadcast.
*/

   rc = mpc_task_query(nbuf, 4, 3);
   dontcare = nbuf[0];
   allgrp = nbuf[3];

Sending an MPL message

Messages are sent in MPL with a call to MP_SEND (or mpc_send) for each piece of data to be sent (contrasting with the pack and send model in PVM). The call to the send routine returns immediately, without waiting for the receive to actually take place at the destination node. Because of this "asynchronous" behavior, it is important to pair a send call with a call to MP_WAIT (or mpc_wait) if execution should not proceed until the destination processor has actually received the data. For example, if the data is going to be modified at the source after a send, a wait call before this modification will ensure that the original data is transferred, rather than the modified data.

Sample of Fortran message sending with MPL:


c
c       Send msgleng consecutive bytes of data, starting with 
c       location nodemsg, to node 0 (host) with a tag of 100.
c       The message id is returned in 'msgid'.
c
        call MP_SEND(nodemsg, msgleng, 0, 100, msgid)
c
c       Before proceeding, wait for nodemsg to be received at the host:
c
        call MP_WAIT(msgid, nbytes)
c
c       Now, nodemsg can be safely modified...
c


Sample of C message sending with MPL:


/*
/     Send msgleng consecutive bytes of data, starting with
/     location nodemsg, to node 0 (host) with a tag of 100.
/     The message id is returned in 'msgid'.
*/

      rc = mpc_send(nodemsg, msgleng, 0, 100, &msgid);
/*
/     Before proceeding, wait for nodemsg to be received at the host:
*/
      rc = mpc_wait(&msgid, &nbytes);
      if (rc) {
         printf("Error in mpc_wait(). Return value: %i\nExiting.\n", rc);
         exit(-1);
      }
/*
/     Now, nodemsg can be safely modified...
*/
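
To see why the wait matters, it may help to look at a toy model of a non-blocking send. The sketch below is plain C and does not use or imitate the MPL implementation; toy_send_t, toy_send() and toy_wait() are hypothetical names invented for this illustration. The model assumes the worst case, in which the data is not copied out of the user's buffer until the wait is made.

```c
#include <assert.h>
#include <string.h>

/* Toy model only: NOT the MPL implementation or API. */
typedef struct {
    const char *src;      /* user's buffer: remembered, not yet copied */
    size_t      len;      /* number of bytes to transfer               */
    char        wire[64]; /* what the destination ends up seeing       */
    int         done;     /* nonzero once the transfer has happened    */
} toy_send_t;

/* Start a "send": note where the data lives and return at once. */
void toy_send(toy_send_t *s, const char *buf, size_t len)
{
    assert(len <= sizeof s->wire);
    s->src  = buf;
    s->len  = len;
    s->done = 0;
}

/* "Wait": in this model the transfer takes place here. */
void toy_wait(toy_send_t *s)
{
    if (!s->done) {
        memcpy(s->wire, s->src, s->len);
        s->done = 1;
    }
}
```

In this model, if the buffer is overwritten between toy_send() and toy_wait(), the overwritten contents are what gets transferred. Calling toy_wait() first and modifying the buffer afterwards transfers the original contents.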

Receiving an MPL message

Messages can be received selectively based on a combination of the source (which node is sending the message) and tag (a qualifier specified by the user in the corresponding "send" call). A "wildcard" value can be specified for one or both of these receive call arguments to allow a message to be received regardless of the source or tag. When a message fitting the specified qualifications arrives at the node, it is copied into the location specified in the receive call. Unless the specific "blocking" receive call is made (MP_BRECV, as opposed to the non-blocking MP_RECV), the program will not wait for the message to actually arrive before it continues processing. Therefore, if you need to use the data being received, you must either use the MP_BRECV call, or post an MP_WAIT call, which will halt execution until the message is received. It is the non-blocking strategy that makes it possible to perform useful work during network delays. However, it can also give the false impression that data has been received when it has not, creating bugs that are difficult to locate!

Sample of Fortran message receiving with MPL:


c
c  Receive a message msgleng bytes long from node i with tag 100 into the
c  consecutive memory locations beginning at 'nodemsg'
c  NOTE: This is a 'blocking' receive.
c
          call MP_BRECV(nodemsg, msgleng, i, 100, msgid)


Sample of C message receiving with MPL:


/* 
/  Receive a message msgleng bytes long from node i with the tag held in
/  'type' (set to 100 beforehand) into the consecutive memory locations
/  beginning at 'nodemsg'. The source and tag are passed by address.
/  NOTE: This is a 'blocking' receive.
*/
      rc = mpc_brecv(nodemsg, msgleng, &i, &type, &msgid);
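
The selection rule described above (match on source and tag, with a wildcard allowed for either) can be sketched as a small matching function. This is a toy model in plain C, not MPL code: toy_msg_t, toy_matches(), toy_find() and the value of TOY_DONTCARE are all hypothetical. In a real program the wildcard value is whatever the task query call returns for "dontcare".

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these names are not part of MPL. */
#define TOY_DONTCARE (-1)  /* stand-in for the real "dontcare" value */

typedef struct {
    int source;  /* node that sent the message   */
    int type;    /* tag given in the "send" call */
} toy_msg_t;

/* Does message m satisfy the (source, type) asked for by a receive? */
int toy_matches(const toy_msg_t *m, int want_source, int want_type)
{
    return (want_source == TOY_DONTCARE || want_source == m->source)
        && (want_type   == TOY_DONTCARE || want_type   == m->type);
}

/* Return the index of the first queued message that matches, or -1. */
int toy_find(const toy_msg_t *queue, size_t n, int want_source, int want_type)
{
    size_t i;

    assert(queue != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (toy_matches(&queue[i], want_source, want_type))
            return (int)i;
    return -1;
}
```

With a wildcard in both positions, the first message in the queue always matches; with neither wildcard, only a message from exactly that source with exactly that tag is accepted.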

Broadcasting a message in MPL

Broadcasts are an easy way to distribute data from one node to all other nodes (for example, sending program parameters from a host program to the node programs). The MP_BCAST command performs the broadcast, taking the location of the data item to be sent as the first argument, the length of the message as the second argument, and the source node and participants as the final two arguments. Typically, all nodes are included in the participant list, and there is a special value used to indicate this. The broadcast can be tailored to use only subsets of the nodes, however. See the online POE documentation for more details.

Sample of a Fortran broadcast with MPL:


c
c     Broadcast the 8 byte double rstart from node 0 to all others:
c
      call MP_BCAST(rstart,8,0,allgrp)
c
c     Broadcast the 4 byte integer chunksize from node 0 to all others:
c
      call MP_BCAST(chunksize,4,0,allgrp)


Sample of a C broadcast with MPL:



/* Broadcast the 8 byte double rstart from node 0 to all others:      */

   rc = mpc_bcast(&rstart,sizeof(double),0,allgrp);

/* Broadcast the 4 byte integer chunksize from node 0 to all others: */

   rc = mpc_bcast(&chunksize,sizeof(int),0,allgrp);
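
Every participating task makes the same broadcast call and names the same source node; the source's buffer is the one whose contents end up everywhere. The following toy model, in plain C, illustrates that behavior within a single process. It is only a sketch: toy_bcast() is a hypothetical function, not an MPL routine, and the array of buffers stands in for the separate memories of the tasks.

```c
#include <assert.h>
#include <string.h>

/*
 * Toy model only (not MPL): bufs[t] stands for task t's buffer.
 * Every task names the same source; afterwards every buffer holds
 * a copy of the len bytes that were in the source's buffer.
 */
void toy_bcast(void *bufs[], int ntasks, size_t len, int source)
{
    int t;

    assert(source >= 0 && source < ntasks);
    for (t = 0; t < ntasks; t++)
        if (t != source)
            memcpy(bufs[t], bufs[source], len);
}
```

If each task were to name itself as the source instead, there would be no single buffer to copy from, which is why the source argument must be the same on all nodes.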