OOMMF 2D Micromagnetic Solver Batch System

The OOMMF Batch System (OBS) provides a scriptable interface to the same micromagnetic solver engine used by mmSolve2D, in the form of three Tcl applications (batchmaster, batchslave, and batchsolve) that provide support for complex job scheduling. All OBS script files are in the OOMMF distribution directory app/mmsolve/scripts.

Unlike much of the OOMMF package, the OBS is meant to be driven primarily from the command line or from a shell (batch) script. OBS applications are launched from the command line using the bootstrap application.


2D Micromagnetic Solver Batch Interface: batchsolve

Overview
The application batchsolve provides a simple command line interface to the OOMMF 2D micromagnetic solver engine.

Launching
The application batchsolve is launched by the command line:

tclsh oommf.tcl batchsolve [standard options] \
   [-end_exit <0|1>] [-end_paused] [-interface <0|1>] \
   [-restart <0|1>] [-start_paused] [file]
where
-end_exit <0|1>
Whether or not to explicitly call exit at bottom of batchsolve.tcl. When launched from the command line, the default is to exit after solving the problem in file. When sourced into another script, like batchslave.tcl, the default is to wait for the caller script to provide further instructions.
-interface <0|1>
Whether to register with the account service directory application, so that mmLaunch can provide an interactive interface. Default = 1 (do register), which will automatically start account service directory and host service directory applications as necessary.
-start_paused
Pause solver after loading problem.
-end_paused
Pause solver and enter event loop at bottom of batchsolve.tcl rather than just falling off the end (the effect of which will depend on whether or not Tk is loaded).
-restart <0|1>
Determines solver behavior when a new problem is loaded. If 1, then the solver will look for basename.log and basename*.omf files to restart a previous run from the last saved state (where basename is the ``Base Output Filename'' specified in the input problem specification). If these files cannot be found, then a warning is issued and the solver falls back to the default behavior (equivalent to -restart 0) of starting the problem from scratch. The specified -restart setting holds for all problems fed to the solver, not just the first.
file
Immediately load and run the specified MIF 1.x file.
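
As a rough illustration of the -restart 1 behavior described above, a wrapper script could check beforehand whether the restart files appear to exist. This is only a sketch: the proc name RestartFilesPresent and the base name myrun are hypothetical, and it assumes that both the log file and at least one matching .omf file are needed. The solver's own test may differ.

```tcl
# Hypothetical helper, not part of batchsolve.tcl: reports whether the
# files that -restart 1 looks for appear to exist for a given basename.
proc RestartFilesPresent {basename} {
    set haveLog [file exists "$basename.log"]
    set omfFiles [glob -nocomplain -- "${basename}*.omf"]
    return [expr {$haveLog && [llength $omfFiles] > 0}]
}

# Example with a placeholder basename
if {[RestartFilesPresent "myrun"]} {
    puts "restart files found; -restart 1 should resume"
} else {
    puts "no restart files; -restart 1 would start from scratch"
}
```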

The input file file should contain a Micromagnetic Input Format 1.x problem description, such as produced by mmProbEd. The batch solver searches several directories for this file, including the current working directory, the data and scripts subdirectories, and parallel directories relative to the directories app/mmsolve and app/mmpe in the OOMMF distribution. Refer to the mif_path variable in batchsolve.tcl for the complete list.

If -interface is set to 1 (enabled), batchsolve registers with the account service directory application, and mmLaunch will be able to provide an interactive interface. Using this interface, batchsolve may be controlled in a manner similar to mmSolve2D. The interface allows you to pause, un-pause, and terminate the current simulation, as well as to attach data display applications to monitor the solver's progress. If more interactive control is needed, mmSolve2D should be used.

If -interface is 0 (disabled), batchsolve does not register, leaving it without an interface, unless it is sourced into another script (e.g., batchslave.tcl) that arranges for an interface on the behalf of batchsolve.

Use the -start_paused switch to monitor the progress of batchsolve from the very start of a simulation. With this switch the solver will be paused immediately after loading the specified MIF file, so you can bring up the interactive interface and connect display applications before the simulation begins. Start the simulation by selecting the Run command from the interactive interface. This option cannot be used if -interface is disabled.

The -end_paused switch ensures that the solver does not automatically terminate after completing the specified simulation. This is not generally useful, but may find application when batchsolve is called from inside a Tcl-only wrapper script.

Note on Tk dependence: If a problem is loaded that uses a bitmap mask file, and if that mask file is not in the PPM P3 (text) format, then batchsolve will launch any2ppm to convert it into the PPM P3 format. Since any2ppm requires Tk, at the time the mask file is read a valid display must be available. See the any2ppm documentation for details.

Output
The output may be changed by a Tcl wrapper script, but the default output behavior of batchsolve is to write tabular text data and the magnetization state at the control point for each applied field step. The tabular data are appended to the file basename.odt, where basename is the ``Base Output Filename'' specified in the input MIF 1.x file. See the routine GetTextData in batchsolve.tcl for details, but at present the output consists of the solver iteration count, nominal applied field B, reduced average magnetization m, and total energy. This output is in the ODT file format.

The magnetization data are written to a series of OVF (OOMMF Vector Field) files, basename.fieldnnnn.omf, where nnnn starts at 0000 and is incremented at each applied field step. (The ASCII text header inside each file records the nominal applied field at that step.) These files are viewable using mmDisp.
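
The naming scheme above can be sketched in a few lines of Tcl. The proc name FieldStepFilename and the base name sample are hypothetical; the pattern is taken from the description above, and the authoritative naming code is in batchsolve.tcl.

```tcl
# Hypothetical helper: builds the magnetization file name for a given
# applied field step, following the basename.fieldnnnn.omf pattern.
proc FieldStepFilename {basename step} {
    return [format "%s.field%04d.omf" $basename $step]
}

# Names for the first three field steps of a run with base name "sample"
foreach step {0 1 2} {
    puts [FieldStepFilename sample $step]
}
```

Because the step number is zero padded, a plain lsort of the matching file names returns them in field step order, at least while the count stays within four digits.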

The solver also automatically appends the input problem specification and miscellaneous runtime information to the log file basename.log.

Programmer's interface
In addition to directly launching batchsolve from the command line, batchsolve.tcl may also be sourced into another Tcl script that provides additional control structures. Within the scheduling system of OBS, batchsolve.tcl is sourced into batchslave, which provides additional control structures that support scheduling control by batchmaster. There are several variables and routines inside batchsolve.tcl that may be accessed and redefined from such a wrapper script to provide enhanced functionality.

Global variables

mif
A Tcl handle to a global mms_mif object holding the problem description defined by the input MIF 1.x file.
solver
A Tcl handle to the mms_solver object.
search_path
Directory search path used by the FindFile proc.
Refer to the source code and sample scripts for details on manipulation of these variables.

Batchsolve procs
The following Tcl procedures are designed for external use and/or redefinition:

SolverTaskInit
Called at the start of each task.
BatchTaskIterationCallback
Called after each iteration in the simulation.
BatchTaskRelaxCallback
Called at each control point reached in the simulation.
SolverTaskCleanup
Called at the conclusion of each task.
FindFile
Searches the directories specified by the global variable search_path for a specified file. The default SolverTaskInit proc uses this routine to locate the requested input MIF file.
SolverTaskInit and SolverTaskCleanup accept an arbitrary argument list (args), which is copied over from the args argument to the BatchTaskRun and BatchTaskLaunch procs in batchsolve.tcl. Typically one copies the default procs (as needed) into a task script, and makes appropriate modifications. You may (re-)define these procs either before or after sourcing batchsolve.tcl.

2D Micromagnetic Solver Batch Scheduling System

Overview
The OBS supports complex scheduling of multiple batch jobs with two applications, batchmaster and batchslave. The user launches batchmaster and provides it with a task script. The task script is a Tcl script that describes the set of tasks for batchmaster to accomplish. The work is actually done by instances of batchslave that are launched by batchmaster. The task script may be modeled after the included simpletask.tcl or multitask.tcl sample scripts.

The OBS has been designed to control multiple sequential and concurrent micromagnetic simulations, but batchmaster and batchslave are completely general and may be used to schedule other types of jobs as well.

Master Scheduling Control: batchmaster

The application batchmaster is launched by the command line:

tclsh oommf.tcl batchmaster [standard options] task_script \
      [host [port]]
task_script
is the user defined task (job) definition Tcl script,
host
specifies the network address for the master to use (default is localhost),
port
is the port address for the master (default is 0, which selects an arbitrary open port).

When batchmaster is run, it sources the task script. Tcl commands in the task script should modify the global object $TaskInfo to inform the master what tasks to perform and optionally how to launch slaves to perform those tasks. The easiest way to create a task script is to modify one of the included example scripts. More detailed instructions are in the Batch task scripts section.

After sourcing the task script, batchmaster launches all the specified slaves, initializes each with a slave initialization script, and then feeds tasks sequentially from the task list to the slaves. When a slave completes a task it reports back to the master and is given the next unclaimed task. If there are no more tasks, the slave is shut down. When all the tasks are complete, the master prints a summary of the tasks and exits.
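
The hand-out rule in the preceding paragraph (each slave that finishes is given the next unclaimed task) can be illustrated with a toy model. This is not batchmaster code: the proc name SimulateSchedule is hypothetical, the task durations are made-up numbers, and the real master is event driven and communicates with slaves over sockets.

```tcl
# Toy model of the hand-out rule described above. Not batchmaster code.
# Returns a list of {label slaveIndex startTime} triples.
proc SimulateSchedule {tasks slaveCount} {
    # free($i) is the time at which slave i next becomes idle
    for {set i 0} {$i < $slaveCount} {incr i} {
        set free($i) 0
    }
    set schedule {}
    foreach task $tasks {
        lassign $task label duration
        # The next unclaimed task goes to the slave that finishes first
        # (ties go to the lowest slave index in this model).
        set best 0
        for {set i 1} {$i < $slaveCount} {incr i} {
            if {$free($i) < $free($best)} {
                set best $i
            }
        }
        lappend schedule [list $label $best $free($best)]
        set free($best) [expr {$free($best) + $duration}]
    }
    return $schedule
}

# Six tasks with made-up durations (arbitrary units), three slaves
set tasks {{A 5} {B 3} {C 4} {D 2} {E 6} {F 1}}
foreach entry [SimulateSchedule $tasks 3] {
    lassign $entry label slave start
    puts "task $label -> slave $slave at t=$start"
}
```

In this model the first three tasks start at once, and each later task starts as soon as any slave becomes idle, so the order in which tasks complete depends on how long each one runs.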

When the task script requests the launching and controlling of jobs off the local machine, with slaves running on remote machines, then the command line argument host must be set to the local machine's network name, and the $TaskInfo methods AppendSlave and ModifyHostList will need to be called from inside the task script. Furthermore, OOMMF does not currently supply any methods for launching jobs on remote machines, so a task script which requests the launching of jobs on remote machines requires a working rsh command or equivalent.

Task Control: batchslave

The application batchslave may be launched by the command line:

tclsh oommf.tcl batchslave [standard options] \
   host port id password [auxscript [arg ...]]
host, port
Host and port at which to contact the master to serve.
id, password
ID and password to send to the master for identification.
auxscript arg ...
The name of an optional script to source (which actually performs the task the slave is assigned), and any arguments it needs.

In normal operation, the user does not launch batchslave. Instead, instances of batchslave are launched by batchmaster as instructed by a task script. Although batchmaster may launch any slaves requested by its task script, by default it launches instances of batchslave.

The function of batchslave is to make a connection to a master program, source the auxscript, and pass it the list of arguments arg .... Then it receives commands from the master, and evaluates them, making use of the facilities provided by auxscript. Each command is typically a long-running one, such as solving a complete micromagnetic problem. When each command is complete, the batchslave reports back to its master program, asking for the next command. When the master program has no more commands batchslave terminates.

Inside batchmaster, each instance of batchslave is launched by evaluating a Tcl command. This command is called the spawn command, and it may be redefined by the task script in order to completely control which slave applications are launched and how they are launched. When batchslave is to be launched, the spawn command might be:

exec tclsh oommf.tcl batchslave -tk 0 -- $server(host) $server(port) \
   $slaveid $passwd batchsolve.tcl -restart 1 &
The Tcl command exec is used to launch subprocesses. When the last argument to exec is &, the subprocess runs in the background. The rest of the spawn command should look familiar as the command line syntax for launching batchslave.

The example spawn command above cannot be completely provided by the task script, however, because parts of it are only known by batchmaster. Because of this, the task script should define the spawn command using ``percent variables'' which are substituted by batchmaster. Continuing the example, the task script provides the spawn command:

exec %tclsh %oommf batchslave -tk 0 %connect_info \
   batchsolve.tcl -restart 1
batchmaster replaces %tclsh with the path to tclsh, and %oommf with the path to the OOMMF bootstrap application. It also replaces %connect_info with the five arguments from -- through $passwd that provide batchslave the hostname and port where batchmaster is waiting for it to report to, and the ID and password it should pass back. In this example, the task script instructs batchslave to source the file batchsolve.tcl and pass it the arguments -restart 1. Finally, batchmaster always appends the argument & to the spawn command so that all slave applications are launched in the background.

The communication protocol between batchmaster and batchslave is evolving and is not described here. Check the source code for the latest details.


Batch Task Scripts

The application batchmaster creates an instance of a BatchTaskObj object with the name $TaskInfo. The task script uses method calls to this object to set up tasks to be performed. The only required call is to the AppendTask method, e.g.,

$TaskInfo AppendTask A "BatchTaskRun taskA.mif"
This method expects two arguments, a label for the task (here ``A'') and a script to accomplish the task. The script will be passed across a network socket from batchmaster to a slave application, and then the script will be interpreted by the slave. In particular, keep in mind that the file system seen by the script will be that of the machine on which the slave process is running.
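
Since the task script is an ordinary Tcl command string that is evaluated later by the slave, arguments containing spaces need care. The sketch below is an illustration under assumptions: it uses a stub BatchTaskRun so that it runs without OOMMF, and the file name "my task.mif" is hypothetical. It shows that building the script with the Tcl list command keeps such a name together as a single argument.

```tcl
# Stub standing in for the real BatchTaskRun, so the sketch runs
# without OOMMF. It just reports the arguments it received.
proc BatchTaskRun {args} {
    return "[llength $args] argument(s): $args"
}

set miffile "my task.mif"   ;# hypothetical file name containing a space

# Plain string interpolation splits the name into two words ...
set naive "BatchTaskRun $miffile"
# ... whereas list quoting keeps it as one argument.
set quoted [list BatchTaskRun $miffile]

puts [eval $naive]
puts [eval $quoted]
```

In a task script the same idea would read $TaskInfo AppendTask A [list BatchTaskRun $miffile]. For simple names such as taskA.mif the two forms are equivalent.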

This example uses the default batchsolve.tcl procs to run the simulation defined by the taskA.mif MIF 1.x file. If you want to make changes to the MIF problem specifications on the fly, you will need to modify the default procs. This is done by creating a slave initialization script, via the call

$TaskInfo SetSlaveInitScript { <insert script here> }
The slave initialization script does global initializations, and also usually redefines the SolverTaskInit proc; optionally the BatchTaskIterationCallback, BatchTaskRelaxCallback and SolverTaskCleanup procs may be redefined as well. At the start of each task SolverTaskInit is called by BatchTaskRun (in batchsolve.tcl), after each iteration BatchTaskIterationCallback is executed, at each control point BatchTaskRelaxCallback is run, and at the end of each task SolverTaskCleanup is called. SolverTaskInit and SolverTaskCleanup are passed the arguments that were passed to BatchTaskRun. A simple SolverTaskInit proc could be
proc SolverTaskInit { args } {
   global mif basename outtextfile
   set A [lindex $args 0]
   set outbasename "$basename-A$A"
   $mif SetA $A
   $mif SetOutBaseName $outbasename
   set outtextfile [open "$outbasename.odt" "a+"]
   puts $outtextfile [GetTextData header \
         "Run on $basename.mif, with A=[$mif GetA]"]
}
This proc receives the exchange constant A for this task on the argument list, and makes use of the global variables mif and basename. (Both should be initialized in the slave initialization script outside the SolverTaskInit proc.) It then stores the requested value of A in the mif object, sets up the base filename to use for output, and opens a text file to which tabular data will be appended. The handle to this text file is stored in the global outtextfile, which is closed by the default SolverTaskCleanup proc. A corresponding task script could be
$TaskInfo AppendTask "A=13e-12 J/m" "BatchTaskRun 13e-12"
which runs a simulation with A set to 13e-12 J/m. This example is taken from the multitask.tcl sample script. (For commands accepted by mif objects, see the file mmsinit.cc. Another object that can be gainfully manipulated is solver, which is defined in solver.tcl.)

If you want to run more than one task at a time, then the $TaskInfo method AppendSlave will have to be invoked. This takes the form

$TaskInfo AppendSlave <spawn count> <spawn command>
where <spawn command> is the command to launch the slave process, and <spawn count> is the number of slaves to launch with this command. (Typically <spawn count> should not be larger than the number of processors on the target system.) The default value for this item (which gets overwritten with the first call to $TaskInfo AppendSlave) is
 1 {Oc_Application Exec batchslave -tk 0 %connect_info batchsolve.tcl}
The Tcl command Oc_Application Exec is supplied by OOMMF and provides access to the same application-launching capability that is used by the OOMMF bootstrap application. Using a <spawn command> of Oc_Application Exec instead of exec %tclsh %oommf saves the spawning of an additional process. The default <spawn command> launches the batchslave application, with connection information provided by batchmaster, and using the auxscript batchsolve.tcl.

Before evaluating the <spawn command>, batchmaster applies several percent-style substitutions useful in slave launch scripts: %tclsh, %oommf, %connect_info, %oommf_root, and %%. The first is the Tcl shell to use, the second is an absolute path to the OOMMF bootstrap program on the master machine, the third is connection information needed by the batchslave application, the fourth is the path to the OOMMF root directory on the master machine, and the last is interpreted as a single percent. batchmaster automatically appends the argument & to the <spawn command> so that the slave applications are launched in the background.
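
To make the substitutions concrete, here is a minimal sketch of how such percent expansion could be written in Tcl. It is an illustration only and not the code batchmaster uses: the proc name ExpandSpawnCommand, the paths, the port, the ID and the password are all made-up values.

```tcl
# Illustrative only: batchmaster's real substitution code may differ.
# The proc name and all sample values below are hypothetical.
proc ExpandSpawnCommand {template values} {
    # string map tests keys in list order at each position, so %% and
    # %oommf_root must precede the shorter key %oommf.
    set map [list \
        %%            % \
        %oommf_root   [dict get $values oommf_root] \
        %oommf        [dict get $values oommf] \
        %tclsh        [dict get $values tclsh] \
        %connect_info [dict get $values connect_info]]
    # Mimic batchmaster appending & so the slave runs in the background
    return "[string map $map $template] &"
}

set values [dict create \
    tclsh        /usr/bin/tclsh \
    oommf        /opt/oommf/oommf.tcl \
    oommf_root   /opt/oommf \
    connect_info "-- localhost 12345 0 secret"]

puts [ExpandSpawnCommand \
    {exec %tclsh %oommf batchslave -tk 0 %connect_info batchsolve.tcl} \
    $values]
```

The template is written in braces so that Tcl performs no substitution of its own before the percent variables are expanded.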

To launch batchslave on a remote host, use rsh in the spawn command, e.g.,

$TaskInfo AppendSlave 1 {exec rsh foo tclsh oommf/oommf.tcl \
      batchslave -tk 0 %connect_info batchsolve.tcl}
This example assumes tclsh is in the execution path on the remote machine foo, and OOMMF is installed in the oommf directory under your home directory on that machine. In addition, you will have to add the machine foo to the host connect list with
$TaskInfo ModifyHostList +foo
and batchmaster must be run with the network interface specified as the server host (instead of the default localhost), e.g.,
tclsh oommf.tcl batchmaster multitask.tcl bar
where bar is the name of the local machine.

This may seem a bit complicated, but the examples in the next section should make things clearer.


Sample task scripts

The first sample task script is a simple example that runs the 3 micromagnetic simulations described by the MIF 1.x files taskA.mif, taskB.mif and taskC.mif. It is launched with the command

tclsh oommf.tcl batchmaster simpletask.tcl
This example uses the default slave launch script, so a single slave is launched on the current machine, and the 3 simulations will be run sequentially. Also, no slave initialization script is given, so the default procs in batchsolve.tcl are used. Output will be magnetization states and tabular data at each control point, stored in files on the local machine with base names as specified in the MIF files.



# FILE: simpletask.tcl
#
# This is a sample batch task file.  Usage example:
#
#   tclsh oommf.tcl batchmaster simpletask.tcl
#
# Form task list
$TaskInfo AppendTask A "BatchTaskRun taskA.mif"
$TaskInfo AppendTask B "BatchTaskRun taskB.mif"
$TaskInfo AppendTask C "BatchTaskRun taskC.mif"
Figure 1: Sample task script simpletask.tcl.


The second sample task script builds on the previous example by defining BatchTaskIterationCallback and BatchTaskRelaxCallback procedures in the slave init script. The first is set up to write tabular data every 10 iterations, while the second writes tabular data on each control point event. The data is written to the output file specified by the Base Output Filename entry in the input MIF files. Note that there is no magnetization vector field output in this example. This task script is launched the same way as the previous example:

tclsh oommf.tcl batchmaster octrltask.tcl



# FILE: octrltask.tcl
#
# This is a sample batch task file, with expanded output control.
# Usage example:
#
#        tclsh oommf.tcl batchmaster octrltask.tcl
#
# "Every" output selection count
set SKIP_COUNT 10

# Initialize solver. This is run at global scope
set init_script {
    # Text output routine
    proc MyTextOutput {} {
        global outtextfile
        puts $outtextfile [GetTextData data]
        flush $outtextfile
    }
    # Change control point output
    proc BatchTaskRelaxCallback {} {
        MyTextOutput
    }
    # Add output on iteration events
    proc BatchTaskIterationCallback {} {
        global solver
        set count [$solver GetODEStepCount]
        if { ($count % __SKIP_COUNT__) == 0 } { MyTextOutput }
    }
}

# Substitute $SKIP_COUNT in for __SKIP_COUNT__ in above "init_script"
regsub -all -- __SKIP_COUNT__ $init_script $SKIP_COUNT init_script
$TaskInfo SetSlaveInitScript $init_script

# Form task list
$TaskInfo AppendTask A "BatchTaskRun taskA.mif"
$TaskInfo AppendTask B "BatchTaskRun taskB.mif"
$TaskInfo AppendTask C "BatchTaskRun taskC.mif"
Figure 2: Task script with iteration output octrltask.tcl.
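
The placeholder technique used in octrltask.tcl can be tried in isolation. The init script is written in braces, so Tcl does not substitute variables inside it, and the slave has no SKIP_COUNT variable of its own; the value therefore has to be written into the script text before it is sent. The sketch below is an illustration under assumptions: a fresh Tcl interpreter stands in for the slave, and the proc name ShouldWrite is hypothetical.

```tcl
set SKIP_COUNT 10

# Script template; braces prevent substitution, hence the placeholder
set template {
    proc ShouldWrite {count} {
        expr {($count % __SKIP_COUNT__) == 0}
    }
}

# Write the value of SKIP_COUNT into the script text
regsub -all -- __SKIP_COUNT__ $template $SKIP_COUNT script

# A separate interpreter mimics a slave with no SKIP_COUNT variable
set slave [interp create]
$slave eval $script
puts [$slave eval {ShouldWrite 20}]   ;# prints 1
puts [$slave eval {ShouldWrite 25}]   ;# prints 0
interp delete $slave
```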


The third task script is a more complicated example running concurrent processes on two machines. This script should be run with the command

tclsh oommf.tcl batchmaster multitask.tcl bar
where bar is the name of the local machine.

Near the top of the multitask.tcl script several Tcl variables (RMT_MACHINE through A_list) are defined; these are used farther down in the script. The remote machine is specified as foo, which is used in the $TaskInfo AppendSlave and $TaskInfo ModifyHostList commands.

There are two AppendSlave commands, one to run two slaves on the local machine, and one to run a single slave on the remote machine (foo). The latter changes to a specified working directory before launching the batchslave application on the remote machine. (For this to work you must have rsh configured properly. In the future it may be possible to launch remote commands using the OOMMF account server application, thereby lessening the reliance on system commands like rsh.)

Below this the slave initialization script is defined. The Tcl regsub command is used to place the task script defined value of BASEMIF into the init script template. The init script is run on the slave when the slave is first brought up. It first reads the base MIF file into a newly created mms_mif instance. (The MIF file needs to be accessible by the slave process, irrespective of which machine it is running on.) Then replacement SolverTaskInit and SolverTaskCleanup procs are defined. The new SolverTaskInit interprets its first argument as a value for the exchange constant A. Note that this is different from the default SolverTaskInit proc, which interprets its first argument as the name of a MIF 1.x file to load. With this task script, a MIF file is read once when the slave is brought up, and then each task redefines only the value of A for the simulation (and corresponding changes to the output filenames and data table header).

Finally, the Tcl loop structure

foreach A $A_list {
    $TaskInfo AppendTask "A=$A" "BatchTaskRun $A"
}
is used to build up a task list consisting of one task for each value of A in A_list (defined at the top of the task script). For example, the first value of A is 10e-13, so the first task will have the label A=10e-13 and the corresponding script is BatchTaskRun 10e-13. The value 10e-13 is passed on by BatchTaskRun to the SolverTaskInit proc, which has been redefined to process this argument as the value for A, as described above.

There are 6 tasks in all, and 3 slave processes, so the first three tasks will run concurrently in the 3 slaves. As each slave finishes it will be given the next task, until all the tasks are complete.



# FILE: multitask.tcl
#
# This is a sample batch task file.  Usage example:
#
#   tclsh oommf.tcl batchmaster multitask.tcl hostname [port]
#
# Task script configuration
set RMT_MACHINE   foo 
set RMT_TCLSH      tclsh
set RMT_OOMMF      "/path/to/oommf/oommf.tcl"
set RMT_WORK_DIR   "/path/to/oommf/app/mmsolve/data"
set BASEMIF taskA
set A_list { 10e-13 10e-14 10e-15 10e-16 10e-17 10e-18 }

# Slave launch commands
$TaskInfo ModifyHostList +$RMT_MACHINE
$TaskInfo AppendSlave 2 "exec %tclsh %oommf batchslave -tk 0 \
        %connect_info batchsolve.tcl"
$TaskInfo AppendSlave 1 "exec rsh $RMT_MACHINE \
        cd $RMT_WORK_DIR \\\;\
        $RMT_TCLSH $RMT_OOMMF batchslave -tk 0 %connect_info batchsolve.tcl"

# Slave initialization script (with batchsolve.tcl proc
# redefinitions)
set init_script {
    # Initialize solver. This is run at global scope
    set basename __BASEMIF__      ;# Base mif filename (global)
    mms_mif New mif
    $mif Read [FindFile ${basename}.mif]
    # Redefine TaskInit and TaskCleanup proc's
    proc SolverTaskInit { args } {
        global mif outtextfile basename
        set A [lindex $args 0]
        set outbasename "$basename-A$A"
        $mif SetA $A
        $mif SetOutBaseName $outbasename
        set outtextfile [open "$outbasename.odt" "a+"]
        puts $outtextfile [GetTextData header \
                "Run on $basename.mif, with A=[$mif GetA]"]
        flush $outtextfile
    }
    proc SolverTaskCleanup { args } {
        global outtextfile
        close $outtextfile
    }
}
# Substitute $BASEMIF in for __BASEMIF__ in above script
regsub -all -- __BASEMIF__ $init_script $BASEMIF init_script
$TaskInfo SetSlaveInitScript $init_script

# Create task list
foreach A $A_list {
    $TaskInfo AppendTask "A=$A" "BatchTaskRun $A"
}
Figure 3: Advanced sample task script multitask.tcl.




OOMMF Documentation Team
September 29, 2017