(The author wishes to thank the National Science Foundation for support for much of this work.)
svn co https://dmtcp.svn.sourceforge.net/svnroot/dmtcp dmtcp

This is a collaboration among many students and myself. Some of the contributors include: Jason Ansel, Kapil Arya, Michael Rieker, Artem Polyakov, Preveen Solanki, and Ana Maria Visan. It works as simply as:
dmtcp_checkpoint a.out              # Run a.out with the DMTCP checkpoint library
dmtcp_command --checkpoint          # Create a checkpoint image
dmtcp_restart ckpt_a.out_*.dmtcp    # Restart from the checkpoint image

The software transparently tracks the spawning of threads, the forking of child processes, and the use of ssh to start remote processes. All parts of the distributed computation (all threads and processes) are checkpointed, with one checkpoint image per process. Many more options are offered and can be found in the documentation.
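For example, the same workflow extends to distributed and unattended runs. The two command lines below are only illustrative; they assume a DMTCP release whose dmtcp_checkpoint accepts an -i (checkpoint-interval) option, so check dmtcp_checkpoint --help for the exact flags.

dmtcp_checkpoint mpirun -np 16 a.out   # Checkpoint an ssh-launched MPI job: one image per process
dmtcp_checkpoint -i 600 a.out          # Request an automatic checkpoint every 600 seconds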
TOP-C has three fundamental concepts: the task, the public shared data, and the action chosen after a task is completed. Communication between processes occurs only through these three mechanisms. The design goals of TOP-C are that it should follow a simple, easy-to-use programmer's model and that it should have high latency tolerance. This allows for economical parallel computing on commodity hardware. The simple parallel model turned out to be surprisingly adaptable to parallelizing legacy software, as was demonstrated by the parallelization of Geant4, a 1,000,000-line C++ program for simulation of high-energy particle showers (see the description below).
TOP-C is invoked by linking the application program with a TOP-C library. You can choose to link with a library customized for your hardware architecture: distributed memory, sequential (for debugging), or shared memory (experimental). Because the application code remains identical in all cases, it is planned to also provide a TOP-C library for a hierarchy of distributed memory over shared memory, as one might see in a Beowulf cluster of PCs, each with dual processors.
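As a concrete illustration of "same application code, different library," the link lines might look as follows. The library names here are placeholders invented for this example, not the names shipped in the distribution; the README gives the actual build instructions.

cc -o myapp myapp.c -ltopc-mpi      # hypothetical name: distributed-memory (MPI) version
cc -o myapp myapp.c -ltopc-seq      # hypothetical name: sequential version, for debugging
cc -o myapp myapp.c -ltopc-pthread  # hypothetical name: shared-memory version (experimental)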
A TOP-C application is built around a single library call: TOPC_master_slave(). When TOPC_master_slave() is called, it reads a file, procgroup, from the local directory, which specifies which slave machines to start. Typical applications have the slave processes execute the same binary as the master, thus running in SPMD mode. TOPC_master_slave() takes four arguments, the four application-defined procedures:
generate_task_input() -> input,
do_task(input) -> output,
check_task_result(input, output) -> action,
update_shared_data(input, output).
An action is one of NO_ACTION, REDO, UPDATE_ENVIRONMENT, or CONTINUATION(new_input).
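To make the model concrete, here is a minimal sketch of a complete TOP-C application that sums the squares of 0 through 9 in parallel, using the four callbacks above. TOPC_init(), TOPC_finalize(), TOPC_is_master(), the TOPC_BUF and TOPC_ACTION types, TOPC_MSG(), NOTASK, and the exact spelling of UPDATE_ENVIRONMENT are written from memory of the TOP-C 2.x header; take the authoritative spellings from topc.h and the bundled example.c rather than from this sketch.

/* Minimal sketch of a TOP-C application: sum of squares of 0..9.
 * Check names such as TOPC_BUF, TOPC_MSG, NOTASK, UPDATE_ENVIRONMENT
 * against the topc.h shipped with your TOP-C release. */
#include <stdio.h>
#include "topc.h"

static long sum_of_squares = 0;   /* the public shared data, replicated everywhere */
static int  next_task = 0;        /* next task number to hand out (master only)    */
static int  task_input;           /* buffer holding the current task input         */

TOPC_BUF generate_task_input(void) {
  if (next_task >= 10) return NOTASK;          /* no more tasks to hand out  */
  task_input = next_task++;
  return TOPC_MSG(&task_input, sizeof(task_input));
}

TOPC_BUF do_task(void *input) {                /* runs on an idle slave      */
  static long square;
  int i = *(int *)input;
  square = (long)i * i;
  return TOPC_MSG(&square, sizeof(square));    /* task output for the master */
}

TOPC_ACTION check_task_result(void *input, void *output) {
  /* Accept every result and fold it into the shared data everywhere. */
  return UPDATE_ENVIRONMENT;
}

void update_shared_data(void *input, void *output) {
  sum_of_squares += *(long *)output;           /* executed on master and all slaves */
}

int main(int argc, char *argv[]) {
  TOPC_init(&argc, &argv);                     /* starts slaves per procgroup */
  TOPC_master_slave(generate_task_input, do_task,
                    check_task_result, update_shared_data);
  if (TOPC_is_master())
    printf("sum of squares 0..9 = %ld\n", sum_of_squares);
  TOPC_finalize();
  return 0;
}

The point of the model is that do_task() runs on whichever slave is free, while update_shared_data() is executed on every process, so the public shared data stays identical everywhere without any explicit message passing in the application.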
TOP-C version 2.x is now available. You can find it here. To find out more about the TOP-C model, a good place to start is the README file and example.c (a sample TOP-C application).
Certain interesting issues of user interface became immediately apparent in this project. These issues had to be addressed before the package could be refereed as part of GAP's formal process for refereeing share packages. For example, when the user types ^C (interrupt) in a parallel application, which processes should be interrupted? What if a slave process goes into an infinite loop? Which UNIX signals should be ignored, or given a special handler? After an interrupt, what should happen to pending messages in the network buffer or in transit in the network? Since I decided to expose the underlying MPI calls by providing bindings to GAP commands that could be called interactively, which arguments should be defaulted? Resolving these issues while doing the least damage to a large software package originally designed for sequential computation was an interesting exercise.
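To illustrate one of these design questions, the sketch below shows one possible answer (not the actual ParGAP implementation) to the ^C question for an MPI-based master/slave setup: slaves ignore SIGINT, while the master catches it, drains any replies still in the network, and then tells each slave to abandon its current task. The tags and message format are invented for the example.

/* Illustrative sketch only, not the ParGAP implementation: one way a master
 * process can handle ^C in an MPI master/slave computation.  The signal
 * handler only sets a flag; the MPI work happens in the master's main loop. */
#include <signal.h>
#include <mpi.h>

#define TAG_RESULT    1   /* invented tag for normal slave replies        */
#define TAG_INTERRUPT 2   /* invented tag telling a slave to reset itself */

static volatile sig_atomic_t got_sigint = 0;

static void sigint_handler(int sig) { (void)sig; got_sigint = 1; }

/* Called from the master's dispatch loop whenever got_sigint is set. */
static void master_handle_interrupt(void) {
  char scratch[4096];              /* large enough for this example's replies */
  int nprocs, flag, rank;
  MPI_Status status;
  MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

  /* Drain replies already queued or in transit, so no stale result is
   * mistaken for the answer to a later task. */
  do {
    MPI_Iprobe(MPI_ANY_SOURCE, TAG_RESULT, MPI_COMM_WORLD, &flag, &status);
    if (flag)
      MPI_Recv(scratch, sizeof(scratch), MPI_BYTE, status.MPI_SOURCE,
               TAG_RESULT, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
  } while (flag);

  /* Tell each slave to abandon its current task and return to its
   * read-eval loop; a slave stuck in an infinite loop would instead
   * need a UNIX signal of its own. */
  for (rank = 1; rank < nprocs; rank++)
    MPI_Send(NULL, 0, MPI_BYTE, rank, TAG_INTERRUPT, MPI_COMM_WORLD);

  got_sigint = 0;
}

int main(int argc, char **argv) {
  int my_rank;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
  if (my_rank == 0) {                     /* master reacts to ^C            */
    signal(SIGINT, sigint_handler);
    /* ... dispatch loop would go here ... */
    if (got_sigint) master_handle_interrupt();
  } else {                                /* slaves ignore ^C; master decides */
    signal(SIGINT, SIG_IGN);
    /* ... slave read-eval loop would go here ... */
  }
  MPI_Finalize();
  return 0;
}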
Finally, in Fall 1999, a graduate student, Predrag Petkovic, produced a better implementation for a course project. His implementation includes almost all of the layers of MPI (but not topology, for example), and is surprisingly professional for software done in such a short time. You can find both versions in the mpinu directory.
If one is designing an array of 1,000 detectors, it is better to test design tradeoffs in a software simulation before fixing the design. Parallel simulation allows one to iterate the design process more easily, since the results of each simulation are available many times sooner, supporting more rapid design development.
I have parallelized this code using TOP-C. The parallelization appeared in Geant4 version 4.1. Read about the latest version here.