Overview
The application Boxsi provides a batch mode interface to the Oxs
micromagnetic computation engine. A restricted graphical interface is
provided, but Boxsi is primarily intended to be controlled by
command line arguments, and launched by the user either directly from
the shell prompt or from inside a batch file.
Within the OOMMF architecture, Boxsi is both a server and a client application. It is a client of data table display and storage applications, and vector field display and storage applications. Boxsi is the server of a solver control service for which the only client is mmLaunch. It is through this service that mmLaunch provides a user interface window (shown above) on behalf of Boxsi.
A micromagnetic problem is communicated to Boxsi through a MIF 2 file specified on the command line and loaded from disk. The MIF 1.x formats are also accepted; they are converted to the MIF 2 format by an automatic call to mifconvert.
Launching
Boxsi must be started from the command line. The syntax is
tclsh oommf.tcl boxsi [standard options] [-exitondone <0|1>] [-kill tags] \
   [-logfile logname] [-loglevel level] [-nice <0|1>] [-nocrccheck <0|1>] \
   [-numanodes nodes] [-outdir dir] [-parameters params] [-pause <0|1>] \
   [-regression_test flag] [-regression_testname basename] \
   [-restart <0|1|2>] [-restartfiledir dir] [-threads count] miffile
where
The default value for nodes is ``none'', which allows the operating system to assign and move threads based on overall system usage. This is also the behavior obtained when the Oxs build is not NUMA-aware. On the other hand, if a machine is dedicated primarily to running one instance of Boxsi, then Boxsi will likely run fastest if the thread count is set to the number of processing cores on the machine, and nodes is set to ``auto''. If you want to run multiple copies of Boxsi simultaneously, or run Boxsi in parallel with some other application(s), then set the thread count to a number smaller than the number of processing cores and restrict Boxsi to some subset of the memory nodes with the -numanodes option and an explicit nodes list.
The default behavior is modified (in increasing order of priority) by the numanodes setting in the active oommf/config/platform/ platform file, by the numanodes setting in the oommf/config/options.tcl or oommf/config/local/options.tcl file, or by the environment variable OOMMF_NUMANODES. The -numanodes command line option, if any, overrides all.
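The priority order described above can be modeled with a short shell function. This is only an illustrative sketch, not OOMMF code: the function name resolve_numanodes and its four positional arguments are hypothetical, and each argument stands for the value supplied by one configuration source (empty if that source is unset).

```shell
# Illustrative model only; this is not how OOMMF is implemented.
# Documented priority, lowest to highest:
#   default "none" < platform file < options.tcl < OOMMF_NUMANODES < -numanodes
resolve_numanodes() {
    platform_value=$1   # numanodes setting in the active platform file
    options_value=$2    # numanodes setting in options.tcl or local/options.tcl
    env_value=$3        # value of the OOMMF_NUMANODES environment variable
    cli_value=$4        # value of the -numanodes command line option
    result=none         # documented default
    # Later (higher priority) non-empty values overwrite earlier ones.
    for candidate in "$platform_value" "$options_value" "$env_value" "$cli_value"; do
        if [ -n "$candidate" ]; then
            result=$candidate
        fi
    done
    printf '%s\n' "$result"
}

resolve_numanodes "" "auto" "" ""         # only options.tcl sets a value
resolve_numanodes "" "auto" "0,1" "2,3"   # the command line option wins
```

The first call prints auto and the second prints 2,3, since the command line value overrides both the options file and the environment variable.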
For example, the option
-parameters "A 13e-12 Ms 800e3"
could be used to set A to 13e-12 and Ms to 800e3. The quoting mechanism is specific to the shell/operating system; refer to your system documentation for details.
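In a POSIX-style shell, the effect of the double quotes can be checked without running Boxsi at all. The sketch below uses a hypothetical helper, count_args, that simply reports how many arguments it received; problem.mif is a placeholder file name. Other shells (for example the Windows command prompt) follow different quoting rules.

```shell
# count_args is a hypothetical helper: it prints the number of arguments
# the shell passed to it, which shows how the quoting was interpreted.
count_args() {
    echo "$#"
}

# Quoted: the whole name/value list reaches the program as ONE argument.
count_args -parameters "A 13e-12 Ms 800e3" problem.mif

# Unquoted: the shell splits the list into four separate arguments.
count_args -parameters A 13e-12 Ms 800e3 problem.mif
```

The first call prints 3 and the second prints 6. Only the first form hands the full parameter list to the -parameters option as a single value.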
Although Boxsi cannot be launched by mmLaunch, a limited interactive graphical interface for Boxsi is provided through mmLaunch, in the same manner as for Oxsii. Each running instance of Boxsi is included in the Threads list of mmLaunch, along with a checkbutton. This button toggles the presence of a user interface window.
Inputs
Boxsi loads problem specifications directly from disk as requested on
the command line. The format for these files is the MIF 2 format, the
same as used by the Oxsii interactive interface. The MIF 1.1 and
MIF 1.2 formats used by the 2D solver mmSolve2D can also be input to
Boxsi, which will automatically call the command line tool mifconvert
to convert from the MIF 1.x format to the MIF 2 format ``on-the-fly.''
Sample MIF 2 files can be found in the directory oommf/app/oxs/examples.
Outputs
The lower panel of the Boxsi interactive interface presents
Output, Destination, and Schedule sub-windows that display the current
output configuration and allow interactive modification of that
configuration. These controls are identical to those in the Oxsii
user interface; refer to the
Oxsii documentation for details.
The only difference between Boxsi and Oxsii with respect to outputs is
that in practice Boxsi tends to rely primarily on Destination and
Schedule commands in the input MIF file to set up the output
configuration. The interactive output interface is used for incidental
runtime monitoring of the job.
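As a rough illustration of what such commands look like, the shell sketch below prints a small MIF 2 fragment; nothing in it runs OOMMF. Several details are assumptions rather than facts from this section: the destination tag "archive" is arbitrary, mmArchive is taken as the receiving application, the output name Oxs_TimeDriver::Magnetization only exists if the MIF file uses a driver of that class, and emit_output_config is a hypothetical helper name. Consult the MIF 2 documentation for the authoritative syntax.

```shell
# Prints an example output configuration for pasting into a MIF 2 file.
# All tags and output names below are illustrative assumptions.
emit_output_config() {
    cat <<'EOF'
Destination archive mmArchive
Schedule DataTable archive Stage 1
Schedule Oxs_TimeDriver::Magnetization archive Stage 1
EOF
}

emit_output_config
```

With lines like these in the input file, table data and the magnetization field would be sent to the archive destination at the end of every stage, with no interaction required.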
Controls
The runtime controls provided by the Boxsi interactive interface
are a restricted subset of those available in the Oxsii interface.
If the runtime controls provided by Boxsi are found to be
insufficient for a given task, consider using Oxsii instead.
The File menu holds four entries: Show Console, Close Interface, Clear Schedule, and Exit Boxsi. File|Show Console brings up a Tcl shell console running off the Boxsi interface Tcl interpreter. This console is intended primarily for debugging purposes. File|Close Interface will remove the interface window from the display, but leaves the solver running. This effect may also be obtained by deselecting the Boxsi interface button in the Threads list in mmLaunch. File|Clear Schedule will disable all currently active output schedules, exactly as if the user clicked through the interactive schedule interface one output and destination at a time and disabled each schedule-enabling checkbutton. The final entry, File|Exit Boxsi, terminates the Boxsi solver and closes the interface window. Note that there is no File|Load... menu item; the problem specification file must be declared on the Boxsi command line.
The Help menu provides the usual help facilities.
The row of buttons immediately below the menu bar provides simulation progress control. These buttons—Run, Relax, Step and Pause—become active once the micromagnetic problem has been initialized. These buttons allow the user to change the run state of the solver. In the Pause state, the solver sits idle awaiting further instructions. If Step is selected, then the solver will move forward one iteration and then Pause. In Relax mode, the solver takes at least one step, and then runs until it reaches a stage boundary, at which point the solver is paused. In Run mode, the solver runs until the end of the problem is reached. When the problem end is reached, the solver will either pause or exit, depending upon the setting of the -exitondone command line option.
Normally the solver progresses automatically from problem initialization into Run mode, but this can be changed by the -pause command line switch. Interactive output is available in all modes; the scheduled outputs occur appropriately as the step and stage counts advance.
Directly below the run state control buttons are three display lines, showing the name of the input MIF file, the current run-state, and the current stage number/maximum stage number. Both stage numbers are 0-indexed.
Details
As with Oxsii, the simulation model construction is governed by the
Specify blocks in the input MIF file, and all aspects of the simulation
are determined by the specified Oxs_Ext classes. Refer to the
appropriate Oxs_Ext class documentation for simulation and
computational details.
Threading considerations
As an example, suppose you are running on a four dual-core processor
box, where each of the four processors is connected to a separate memory
node. In other words, there are eight cores in total, and each pair of
cores shares a memory node. Further assume that the processors are
connected via point-to-point links such as AMD's HyperTransport or
Intel's QuickPath Interconnect.
If you want to run a single instance of Boxsi as quickly as possible, you might use the -threads 8 option, which, assuming the default value of -numanodes none is in effect, would allow the operating system to schedule the eight threads among the system's eight cores as it sees fit. Or, you might reduce the thread count to reserve one or more cores for other applications. If the job is long running, however, you may find that the operating system tries to run multiple threads on a single core, perhaps in order to leave other cores idle so that they can be shut down to save energy. Or, the operating system may move threads away from the memory node where they have allocated memory, which effectively reduces memory bandwidth. In such cases you might want to launch Boxsi with the -numanodes auto option. This overrides the operating system's preferences, and ties threads to particular memory nodes for the lifetime of the process. (On Linux boxes, you should also check the ``cpu frequency governor'' and ``huge page support'' selection and settings.)
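For the single-instance case, a command line along these lines could be assembled in a shell script. The sketch is a dry run that only prints the command. It assumes that getconf _NPROCESSORS_ONLN is available (as on Linux and macOS) and note that this reports logical processors, which may exceed the number of physical cores on machines with simultaneous multithreading. The function name build_single_job and the file name problem.mif are placeholders.

```shell
# Dry-run sketch: prints a Boxsi command line instead of executing it.
build_single_job() {
    miffile=$1
    # Logical processor count; falls back to 1 if getconf cannot report it.
    cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    printf 'tclsh oommf.tcl boxsi -threads %s -numanodes auto %s\n' \
        "$cores" "$miffile"
}

build_single_job problem.mif
```

On the eight-core example machine this would print a command containing -threads 8 -numanodes auto; lower the thread count by hand if some cores should stay free for other work.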
If you want to run two instances of Boxsi concurrently, you might launch each with the -threads 4 option, so that each job has four threads for the operating system to schedule. If you don't like the default scheduling by the operating system, you can use the -numanodes option, but what you don't want to do is launch two jobs with -numanodes auto, because the ``auto'' option assigns threads to memory nodes from a fixed sequence list, so both jobs will be assigned to the same nodes. Instead, you should manually assign the nodes, with a different set to each job. For example, you may launch the first job with -numanodes 0,1 and the second job with -numanodes 2,3. One point to keep in mind when assigning nodes is that some node pairs are ``closer'' (with respect to memory latency and bandwidth) than others. For example, memory node 0 and memory node 1 may be directly connected via a point-to-point link, so that data can be transferred in a single ``hop.'' But sending data from node 0 to node 2 may require two hops (from node 0 to node 1, and then from node 1 to node 2). In this case -numanodes 0,1 will probably run faster than -numanodes 0,2.
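The two-job arrangement described above might be scripted as follows. This is a dry-run sketch that only prints the two command lines; boxsi_cmd is a hypothetical helper, and job1.mif and job2.mif are placeholder file names.

```shell
# Dry-run sketch: prints one Boxsi command line per job.
boxsi_cmd() {
    # $1 = memory node list, $2 = thread count, $3 = MIF file
    printf 'tclsh oommf.tcl boxsi -threads %s -numanodes %s %s\n' "$2" "$1" "$3"
}

# Each job gets four threads and its own pair of memory nodes.
boxsi_cmd 0,1 4 job1.mif
boxsi_cmd 2,3 4 job2.mif
```

To actually run the jobs side by side from one script, each printed command would be started in the background with a trailing & and followed by a single wait.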
On Linux boxes, the -numanodes option is available only if the ``numactl'' and ``numactl-devel'' packages are installed. The numactl command itself can be used to tie jobs to particular memory nodes, similar to the boxsi -numanodes option, except that -numanodes ties threads whereas numactl ties jobs. The numactl --hardware command reports how many memory nodes are in the system, along with a measure of the (memory latency and bandwidth) distance between nodes. This information can be used in selecting nodes for the boxsi -numanodes option, but in practice the distance information reported by numactl is often not reliable. For best results one should experiment with different settings, or run memory bandwidth tests with different node pairs.
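The node count can be pulled out of the numactl hardware report with a one-line filter. The sketch below assumes the report is requested with the long-option spelling numactl --hardware and that its first line has the form "available: 4 nodes (0-3)"; count_numa_nodes is a hypothetical helper name. Sample text stands in for real output so that the example runs on machines without numactl.

```shell
# Reads "numactl --hardware" style output on stdin and prints the node count.
# Assumes a leading line of the form: available: <N> nodes (<range>)
count_numa_nodes() {
    sed -n 's/^available: \([0-9][0-9]*\) nodes.*/\1/p'
}

# Sample report text in place of a real numactl call.
printf 'available: 4 nodes (0-3)\nnode 0 cpus: 0 1\n' | count_numa_nodes

# On a Linux machine with numactl installed, one would instead run:
#   numactl --hardware | count_numa_nodes
```

For the sample text this prints 4, matching the four-node example machine discussed in this section.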
Batch Scheduling Systems
OOMMF jobs submitted to a batch queuing system (e.g., Condor, PBS,
NQS) can experience sporadic failures caused by interactions between
separate OOMMF jobs running simultaneously on the same compute node.
These problems can be prevented by using the OOMMF command line
utility launchhost to isolate each job.
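A per-job wrapper using launchhost might look roughly like the sketch below. It is written from memory of the launchhost documentation and should be verified against your OOMMF release. The assumptions are that "launchhost 0" starts a host server on a free port and prints the port number, that the OOMMF_HOSTPORT environment variable directs subsequent OOMMF commands to that server, and that "killoommf all" shuts down the job's OOMMF processes afterwards. The function name run_isolated_job is hypothetical.

```shell
# Sketch only: defines a wrapper function; nothing runs until it is called.
run_isolated_job() {
    miffile=$1
    # Start a private host server and remember the port it reports.
    OOMMF_HOSTPORT=$(tclsh oommf.tcl launchhost 0) || return 1
    export OOMMF_HOSTPORT
    # Run the simulation against that private host server.
    tclsh oommf.tcl boxsi "$miffile"
    status=$?
    # Clean up the OOMMF processes that belong to this job.
    tclsh oommf.tcl killoommf all
    return "$status"
}
```

A batch script would call it from the OOMMF root directory as, for example, run_isolated_job sample.mif, where sample.mif is a placeholder.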