(This material is based in part upon work supported by the National Science Foundation under Grants No. 9732330, 9872114 and 0204113. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.)
TOP-C version 2.5.2 is now the latest version. Some new features include: --TOPC-aggregated-tasks=XXX (for finer grain computations, network latency can form a bottleneck; automatically aggregating several smaller messages into one larger message often fixes this problem); --TOPC-stats.
This is a copy of the TOP-C home page at http://www.ccs.neu.edu/home/gene/topc.html. TOP-C is described below, along with pointers to an online manual and some other resources for learning more about TOP-C.
TOP-C is free open source software. TOP-C is a library that links with your existing sequential code (after small modifications) in order to parallelize it. TOP-C also has a small footprint. It consists of 5,000 lines of C code, and the binary library is about 35 KB on Linux (e.g. Intel/Linux) when built with gcc and after "stripping" the libraries. It is distributed under the GNU LGPL (Lesser General Public License), which allows you to use and re-distribute the TOP-C library for both non-commercial and commercial purposes without payments for licenses or royalties.
The design goals of TOP-C are that it should follow a simple, easy-to-use programmer's model, and that it should have high latency tolerance. This allows for economical parallel computing on commodity hardware, as well as high performance on the latest supercomputers. TOP-C hides the details of parallel programming, and presents the application programmer with a simple task-oriented interface, for which the application programmer need only define four callback functions. Yet, this simple model has been shown to readily adapt to a wide variety of algorithmic requirements.
TOP-C runs on most variants of UNIX/Linux. The source code is layered, making it easy to modify and easy to add a new communication module for a new hardware architecture. Current communication modules include one for distributed memory using sockets (e.g. networks of workstations), one for shared memory using POSIX threads, and one for a single CPU (useful for debugging). The same TOP-C application code runs with any of the three communication modules.
TOP-C has three fundamental concepts: the task (specified by the master and executed by a slave process), the global shared data, and the action chosen after a task is completed. Communication between processes occurs only through these three mechanisms. Yet, a TOP-C application is built around a single library call: TOPC_master_slave(). It takes four arguments, which are four application-defined callback procedures:
GenerateTaskInput() -> input
DoTask(input) -> output
CheckTaskResult(input, output) -> action
UpdateSharedData(input, output) (executed only if UPDATE action returned)
Upon receiving the output of a task, the master then decides on one of four actions:
NO_ACTION
UPDATE (update the shared data)
REDO (in case the result of some other task has altered the shared data)
CONTINUATION(new_input_for_same_slave)
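The flow of the four callbacks and the action they select can be sketched in plain C. This is a single-process mock-up of the model only: it does not link against TOP-C, and every name in it (`action_t`, `run_master_worker`, the lower-case callback names, and the toy problem itself) is an assumption made for illustration, not the library's API. The real callback signatures are given in the TOP-C manual.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical action codes, named after the four actions listed above. */
typedef enum { NO_ACTION, UPDATE, REDO, CONTINUATION } action_t;

/* Toy problem: find x in 1..limit that maximizes f(x) = x * (10 - x). */
static int  next_input;  /* next task input the master hands out */
static int  limit;       /* last task input */
static int  best_input;  /* shared data: best x seen so far (0 = none) */
static long best_value;  /* shared data: f(best_input) */

/* Master side: produce the next task input; false means no tasks remain. */
static bool generate_task_input(int *input) {
    if (next_input > limit) return false;
    *input = next_input++;
    return true;
}

/* Worker side: compute the output for one task. */
static long do_task(int input) {
    return (long)input * (10 - input);
}

/* Master side: choose an action for a finished task. */
static action_t check_task_result(int input, long output) {
    (void)input;  /* unused in this toy problem */
    return output > best_value ? UPDATE : NO_ACTION;
}

/* Runs only when the chosen action is UPDATE. */
static void update_shared_data(int input, long output) {
    best_input = input;
    best_value = output;
}

/* Sequential stand-in for the master/worker loop (hypothetical helper). */
int run_master_worker(int n) {
    int input;
    next_input = 1;
    limit = n;
    best_input = 0;
    best_value = LONG_MIN;
    while (generate_task_input(&input)) {
        long output = do_task(input);
        if (check_task_result(input, output) == UPDATE)
            update_shared_data(input, output);
        /* REDO and CONTINUATION never arise in this mock-up: with one
           process, the shared data cannot change while a task runs. */
    }
    return best_input;
}
```

Calling `run_master_worker(9)` walks the inputs 1 through 9 and leaves the maximizing input in the shared data. The point of the sketch is the division of labor: only `check_task_result` decides whether a result is accepted, and only `update_shared_data` modifies the shared data.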
A task re-executed due to a REDO action can be executed more quickly, because it is executed in the same process as the original task. Hence any information from the previous task computation and previous update can be saved in a global variable and used to accelerate the re-computation of the task under the new task input.
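The idea of reusing earlier work during a REDO can be illustrated with a small, self-contained C sketch. It is a mock-up under stated assumptions rather than TOP-C code: the names `do_task_with_cache`, `set_shared_offset` and `expensive_call_count` are hypothetical, and the task is a toy one whose output is an expensive part that depends only on the input, plus a cheap part that depends on the shared data.

```c
#include <stdbool.h>

/* Stand-in for the global shared data (hypothetical). */
static long shared_offset;

/* State saved in global variables between executions in one process. */
static bool cache_valid;
static int  cached_input;
static long cached_partial;
static int  expensive_calls;  /* counts how often the slow path ran */

/* Expensive step that depends only on the task input: sum of squares. */
static long expensive_part(int input) {
    long sum = 0;
    expensive_calls++;
    for (int i = 1; i <= input; i++)
        sum += (long)i * i;
    return sum;
}

/* Simulate an update to the shared data made by some other task. */
void set_shared_offset(long value) { shared_offset = value; }

/* Report how many times the expensive step has run so far. */
int expensive_call_count(void) { return expensive_calls; }

/* Task body: recompute the expensive part only when the input is new.
   A re-execution with the same input reuses the cached partial result
   and only redoes the cheap step that reads the shared data. */
long do_task_with_cache(int input) {
    if (!cache_valid || cached_input != input) {
        cached_partial = expensive_part(input);
        cached_input = input;
        cache_valid = true;
    }
    return cached_partial + shared_offset;
}
```

Running the task for the same input twice, with a change to the shared offset in between, yields a fresh result while the expensive step runs only once. How much can be cached in practice depends on which parts of the computation actually read the shared data.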
This simple parallel model turns out to be surprisingly adaptable for parallelizing existing sequential software, as demonstrated by the parallelization of Geant4 at CERN, a 1,000,000-line C++ program for simulation of particle-matter interaction, with applications to physics, engineering and biomedicine. A version of TOP-C (called ParGAP) has also been used to parallelize GAP (Groups, Algorithms and Programming), and is distributed from the GAP site.