[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
It is best to postpone reading this section until the basic features discussed in the previous chapters are clear.
DoTask() was a result of a REDO or CONTINUATION() action, respectively.
The result is not meaningful if called outside of DoTask().
TOPC_abort_tasks() should be called in CheckTaskResult().
`TOP-C' then makes a best effort (no guarantee) to notify each
slave. `TOP-C' does not directly abort tasks. However,
TOPC_is_abort_pending() returns 1 (true) when invoked in DoTask()
on a slave. A typical DoTask() callback uses this to poll for an
abort request from the master, upon which it returns early with a
special task output. At the beginning of the next new task, REDO
or CONTINUATION, `TOP-C' resets the pending abort to 0 (false).
See `examples/README' of the `TOP-C' distribution for example code.
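The polling pattern described above might look roughly as follows. This is a minimal sketch, not code from the `TOP-C' distribution: TOPC_is_abort_pending() is replaced here by a stand-in stub so that the fragment compiles without the `TOP-C' headers, and sum_with_abort_poll(), ABORTED and the polling interval of 1000 iterations are hypothetical names and values chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in stub (assumption): in a real application this function is
   provided by TOP-C and reports whether the master requested an abort. */
static int abort_pending = 0;
static int TOPC_is_abort_pending(void) { return abort_pending; }

/* Hypothetical "special task output" meaning "task was aborted". */
#define ABORTED (-1L)

/* Core loop of a hypothetical DoTask(): sums 1..n, but polls for an
   abort request every 1000 iterations and returns early if one is seen. */
static long sum_with_abort_poll(long n)
{
    long sum = 0;
    for (long i = 1; i <= n; i++) {
        if (i % 1000 == 0 && TOPC_is_abort_pending())
            return ABORTED;            /* give up early on request */
        sum += i;
    }
    return sum;
}
```

In a real DoTask(), the value returned by the loop would be wrapped in TOPC_MSG() before being returned, and CheckTaskResult() on the master would recognize the special output. Polling only every so many iterations keeps the cost of the check small relative to the work of the task.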
The principle of memory allocation in `TOP-C' is that if an
application allocates memory, then it is the responsibility of the
application to free that memory. This issue typically arises around
task buffers (see section 3.1.3 Task Input and Task Output Buffers)
and calls to TOPC_MSG(buf,buf_size). An application often calls
buf = malloc(...); or buf = new ...; (in C++) and copies data into
that buffer before the call to TOPC_MSG.
Since the last action of GenerateTaskInput() or DoTask() is typically
to return TOPC_MSG(buf,buf_size), there remains the question of how
to free buf.
malloc and new with Task Buffers
The best memory allocation solution for task buffers is to implement the
buffers as local variables, and therefore on the stack. This avoids the
need for malloc and new, and the question of how to later free that
memory.
If you use TOPC_MSG (as opposed to TOPC_MSG_PTR, see section 8.3.2
Using TOPC_MSG_PTR() to Avoid Copying Large Buffers), then recall that
TOPC_MSG copies its buffer to a separate `TOP-C' space. For example,
    {
      int x;
      ...
      return TOPC_MSG(&x, sizeof(x));
    }
If your task buffer is of fixed size, one can allocate it as a character
array on the stack: char buf[BUF_SIZE];
If your buffer contains variable size data, consider using alloca in
place of malloc to allocate on the stack.
    {
      ...
      buf = alloca(buf_size);
      return TOPC_MSG(buf, buf_size);
    }
In all of the above cases, there is no need to free the buffer, since
TOPC_MSG will make a `TOP-C'-private copy and the stack-allocated
buffer will disappear when the current routine exits.
Note that alloca may be unavailable on your system. Alternatively, the
use of alloca may be undesirable due to very large buffers and O/S
limits on stack size. In such cases, consider the following alternative.
    {
      TOPC_BUF tmp;
      ...
      buf = malloc(buf_size);
      tmp = TOPC_MSG(buf, buf_size);
      free(buf);
      return tmp;
    }
Using TOPC_MSG_PTR() to Avoid Copying Large Buffers
If the cost of copying a large buffer is a concern, `TOP-C' provides an
alternative function, which avoids copying into `TOP-C' space.
TOPC_MSG_PTR(buf, buf_size) is the same as TOPC_MSG(), except that it
does not copy buf into `TOP-C' space. It is the responsibility of the
application not to free or modify buf as long as `TOP-C' might
potentially pass it to an application callback function.
TOPC_MSG_PTR() is inherently dangerous if the application
modifies or frees a buffer and `TOP-C' later passes that buffer
to a callback function. It may be useful when the cost of copying
large buffers is an issue, or if one is concerned about `TOP-C'
making a call to malloc(). Note that the invocation
    ./a.out --TOPC-safety=4
converts calls to TOPC_MSG_PTR() into calls to TOPC_MSG(). This is
useful in deciding if a bug is related to the use of TOPC_MSG_PTR().
An application should not pass a buffer on the stack to TOPC_MSG_PTR().
This can be avoided either by declaring a local variable to be
`static', or else using a global variable (or a class member
in the case of C++). In such cases, it is the responsibility of the
application to dynamically create and free buffers.
An example of how this can be done follows in the next section.
Note that if the application code must also be compatible with the
shared memory model, then the static local variable or global variable
must also be thread-private (see section 8.4.2 Thread-Private Global
Variables).
For examples of coding with TOPC_MSG_PTR() that are compatible
with all memory models, including the shared memory model,
see `examples/README' and the corresponding examples
in the `TOP-C' distribution.
TOPC_MSG_PTR()
Recall the syntax for creating a message buffer of type TOPC_BUF
using TOPC_MSG_PTR(buf, buf_size). The two callback functions
GenerateTaskInput() and DoTask() both return such a message buffer.
In the case of GenerateTaskInput(), `TOP-C' saves a copy of the buffer,
which becomes an input argument to CheckTaskResult() and to
UpdateSharedData() on the master.
Hence, if buf points to a temporarily allocated buffer,
it is the responsibility of the `TOP-C' callback function to free the
buffer only after the callback function has returned.
This seeming contradiction can be easily handled by the following code.
    TOPC_BUF GenerateTaskInput() {
      static void *buf = NULL;
      /* Allocate once; the result of malloc() must be stored in buf. */
      if ( buf == NULL ) { buf = malloc(buf_size); }
      ...
      [ Add new message data to buf ]
      ...
      return TOPC_MSG_PTR(buf, buf_size);
    }
If buf_size might vary dynamically between calls, the following
fragment solves the same problem.
    TOPC_BUF GenerateTaskInput() {
      static void *buf = NULL;
      /* Free the buffer that was returned by the previous call. */
      if ( buf != NULL ) { free(buf); }
      ...
      [ Compute buf_size for new message ]
      ...
      buf = malloc( buf_size );
      ...
      [ Add new message data to buf ]
      ...
      return TOPC_MSG_PTR(buf, buf_size);
    }
Note that buf is allocated as a static local variable. `TOP-C'
restricts the buf of TOPC_MSG_PTR(buf, buf_size) to point to a buffer
that is in the heap (not on the stack). Hence, buf must not point to
non-static local data.
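The same idea can be written as a fragment that compiles on its own. The following is a minimal sketch under stated assumptions, not code from the `TOP-C' distribution: the TOPC_BUF struct and TOPC_MSG_PTR() defined here are stand-ins so that the example builds without the `TOP-C' headers, make_task_input() is a hypothetical helper, and realloc() is used in place of a free()/malloc() pair so that the buffer only grows when a larger message arrives.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins (assumption) so the sketch compiles without TOP-C. */
typedef struct { void *data; size_t data_size; } TOPC_BUF;
static TOPC_BUF TOPC_MSG_PTR(void *buf, size_t buf_size)
{
    TOPC_BUF msg = { buf, buf_size };   /* no copy: only the pointer is kept */
    return msg;
}

/* Hypothetical helper: builds a message from a string. The buffer is
   static and on the heap, so it stays valid after the function returns
   and is reused (and grown if needed) by the next call. */
static TOPC_BUF make_task_input(const char *text)
{
    static void *buf = NULL;
    static size_t capacity = 0;
    size_t buf_size;

    assert(text != NULL);
    buf_size = strlen(text) + 1;        /* include the terminating NUL */
    if (buf_size > capacity) {
        void *bigger = realloc(buf, buf_size);
        if (bigger == NULL)
            abort();                    /* out of memory */
        buf = bigger;
        capacity = buf_size;
    }
    memcpy(buf, text, buf_size);
    return TOPC_MSG_PTR(buf, buf_size);
}
```

Because the previous message is overwritten on each call, this sketch is only appropriate where the earlier buffer is no longer needed by the time the next message is generated. In the shared memory model the static variables would additionally have to be thread-private.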
If you use a distributed memory model and the buffer pointed to by
input includes fields with their own pointers, the application
must first follow all pointers and copy into a new buffer all data
referenced directly or indirectly by input. The new buffer can
then be passed to TOPC_MSG(). This copying process is called
marshaling. See section Marshaling and Heterogeneous Architectures.
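As an illustration of marshaling, the following sketch flattens a structure that contains a pointer into one contiguous buffer and rebuilds it on the receiving side. Everything here is hypothetical and chosen for the example (struct job, marshal_job(), unmarshal_job(), and the buffer layout); it also assumes that master and slave agree on the size and byte order of int and size_t, which does not hold on heterogeneous architectures.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical task input containing an embedded pointer. */
struct job {
    int    id;
    size_t name_len;   /* bytes in name, including the terminating NUL */
    char  *name;
};

/* Marshal: copy the fields and the pointed-to data into one contiguous
   buffer laid out as [id][name_len][name bytes]. Caller frees the result. */
static void *marshal_job(const struct job *j, size_t *out_size)
{
    size_t size = sizeof j->id + sizeof j->name_len + j->name_len;
    unsigned char *buf = malloc(size);
    unsigned char *p = buf;

    if (buf == NULL)
        return NULL;
    memcpy(p, &j->id, sizeof j->id);             p += sizeof j->id;
    memcpy(p, &j->name_len, sizeof j->name_len); p += sizeof j->name_len;
    memcpy(p, j->name, j->name_len);
    *out_size = size;
    return buf;
}

/* Unmarshal: rebuild the structure from the flat buffer, allocating a
   fresh copy of the name. Returns 0 on success, -1 if out of memory. */
static int unmarshal_job(const void *buf, struct job *out)
{
    const unsigned char *p = buf;

    memcpy(&out->id, p, sizeof out->id);             p += sizeof out->id;
    memcpy(&out->name_len, p, sizeof out->name_len); p += sizeof out->name_len;
    out->name = malloc(out->name_len);
    if (out->name == NULL)
        return -1;
    memcpy(out->name, p, out->name_len);
    return 0;
}
```

The flat buffer and its size are what would be handed to TOPC_MSG(). The fields are copied with memcpy() rather than through casts so that the code does not depend on the alignment of the buffer.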
If following all pointers is a burden, then one can
load the application on the master and slaves at a common absolute
address, and ensure that all pointer references have been initialized
before the first call to TOPC_master_slave(). In `gcc',
one specifies an absolute load address with code such as:
    gcc -Wl,-Tdata -Wl,-Thex_addr ...
Specifying an absolute load address carries many risks, for example when the master and slaves use different versions of the operating system, the compiler, or other software, or different hardware configurations. Hence, this technique is recommended only as a last resort.
The `TOP-C' programmer's model changes slightly for shared
memory. With careful design, one can use the same application source
code both for distributed memory and shared memory architectures.
Processes are replaced by threads. UpdateSharedData() is
executed only by the master thread, and not by any slave thread. As
with distributed memory, TOPC_MSG() buffers are copied to
`TOP-C' space (shallow copy). As usual, the application is responsible
for freeing any application buffers outside of `TOP-C' space.
Furthermore, since the master and slaves share memory, `TOP-C'
creates the slaves only during the first call to TOPC_master_slave().
If a slave needs to initialize any private data (see
TOPC_thread_private, below), then this can be done by the slave
the first time that it gains control through DoTask().
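Two issues arise in porting a distributed memory `TOP-C' application to shared memory.
The first is reader-writer synchronization: DoTask() must not read
shared data while UpdateSharedData() (on the master)
simultaneously writes to the shared data.
The second is that any global or static local variables accessed by
DoTask() must be made thread-private (see section 8.4.2 Thread-Private
Global Variables).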
Most `TOP-C' applications for the distributed memory model will run unchanged in the shared memory model. In some cases, one must add additional `TOP-C' code to handle these additional issues. In all cases, one can easily retain compatibility with the distributed memory model.
8.4.1 Reader-Writer Synchronization
8.4.2 Thread-Private Global Variables
8.4.3 Sharing Variables between Master and Slave and Volatile Variables
8.4.4 SMP Performance
In shared memory, `TOP-C' uses a classical single-writer,
multiple-reader strategy with writer-preferred for lock requests.
By default, DoTask() acts as the critical section of the
readers (the slave threads) and UpdateSharedData() acts as the
critical section of the writer (the master thread).
`TOP-C' sets a read lock around all of DoTask() and a write
lock around all of UpdateSharedData().
As always in the `TOP-C' model,
it is an error if an application writes to shared data outside
of UpdateSharedData(). Note that GenerateTaskInput()
and CheckTaskResult() can safely read the shared data without
a lock in this case, since these routines and UpdateSharedData()
are all invoked only by the master thread.
The default behavior implies that DoTask() and
UpdateSharedData() never run simultaneously. Optionally, one
can achieve greater concurrency through a finer level of granularity
by declaring to `TOP-C' which sections of code read or write
shared data. If `TOP-C' detects any call to TOPC_ATOMIC_READ(0),
`TOP-C' will follow the critical sections declared by the
application inside of DoTask() and UpdateSharedData().
It is not useful to use TOPC_ATOMIC_READ() outside of DoTask(),
nor to use TOPC_ATOMIC_WRITE() outside of UpdateSharedData().
The number 0 refers to page 0 of shared data. `TOP-C' currently supports only a single common page of shared data, but future versions will support multiple pages. In the future, two threads will be able to simultaneously hold write locks if they are for different pages.
The following alternatives to TOPC_ATOMIC_READ() and
TOPC_ATOMIC_WRITE() are provided for greater flexibility.
TOPC_ATOMIC_READ and TOPC_ATOMIC_WRITE.
In the distributed memory model of `TOP-C', all of the above invocations for atomic reading and writing are ignored, thus retaining full compatibility between the shared and distributed memory models.
A thread-private variable is a variable whose data is not
shared among threads: i.e., each thread has a private copy of
the variable.
The only variables that are thread-private by default in
shared memory are those on the stack (non-static, local variables). All
other variables exist as a single copy, shared by all threads.
This is inherent in the POSIX standard for threads in C/C++.
If DoTask() accesses any global variables or local static variables,
then those variables must be made thread-private.
Ideally, if C allowed it, we would just write something like:
    THREAD_PRIVATE int myvar = 0;  /* NOT SUPPORTED */
    TOPC_thread_private_t TOPC_thread_private;
    typedef int TOPC_thread_private_t;
    #define myvar TOPC_thread_private
    int myvar_debug() {return myvar;}  /* needed to access myvar in gdb */
`TOP-C' provides primitives to declare a single thread-private global variable. `TOP-C' allows the application programmer to declare the type of that variable.
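The variable TOPC_thread_private has type TOPC_thread_private_t. It may
be used like any C variable, and each thread has its own private
copy that will not be shared.
The application must define the type TOPC_thread_private_t with a
typedef if TOPC_thread_private is used.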
If more than one thread-private variable is desired, define
TOPC_thread_private_t as a struct, and use each
field as a separate thread-private variable.
EXAMPLE:
    /* Ideally, if C allowed it, we would just write:
     *   THREAD_PRIVATE struct {int my_rank; int rnd;} mystruct;
     * We emulate this using TOP-C's implicitly declared thread-private var:
     *   TOPC_thread_private_t TOPC_thread_private;
     */
    typedef struct {int my_rank; int rnd;} TOPC_thread_private_t;
    #define mystruct TOPC_thread_private
    void set_info() {
      mystruct.my_rank = TOPC_rank();
      mystruct.rnd = rand();
    }
    void get_info() {
      foo();
      if (mystruct.my_rank != TOPC_rank()) printf("ERROR\n");
      printf("Slave %d random num: %d\n", mystruct.my_rank, mystruct.rnd);
    }
    TOPC_BUF do_Task() {
      set_info(); /* info in mystruct is NOT shared among threads */
      get_info();
      ...;
    }
Additional examples can be found by reading `examples/README' in the `TOP-C' distribution.
The shared memory model, like any `SMP' code,
allows the master and slaves to
communicate through global variables, which are shared by default.
It is recommended not to use this feature, and instead to maintain
communication through TOPC_MSG(), for ease of code maintenance,
and to maintain portability with the other `TOP-C' models
(distributed memory and sequential). If you do use your own global shared
variables between master and slaves, be sure to declare them volatile.
    volatile int myvar;
To be more precise, if a non-local variable is accessed more than once in a procedure, the compiler is allowed to keep the first access value in a thread register and reuse it at later occurrences, without consulting the shared memory. A volatile declaration tells the compiler to re-read the value from shared memory at each occurrence. Similarly, a write to a volatile variable causes the corresponding transfer of its value from a register to shared memory to occur at a time not much later than the execution of the write instruction.
If you suspect a missing volatile declaration, note that `gcc' supports the following command-line options.
    gcc -fvolatile -fvolatile-global ...
    # If topcc uses gcc:
    topcc --pthread -fvolatile -fvolatile-global myfile.c
The option -fvolatile tells `gcc' to compile all memory
references through pointers as volatile, and the option
-fvolatile-global tells `gcc' to compile all memory
references to extern and global data as volatile.
However, note that this implies a performance penalty
since the compiler will issue
a load/store instruction for each volatile access, and
will not keep volatile values in registers.
Note that `SMP' involves certain performance issues that do not arise
in other modes. If you find a lack of performance, please read
section 7.3 Improving Performance. Also, note that the vendor-supplied
compiler, cc, is often recommended over gcc for
`SMP', due to specialized vendor-specific architectural issues.
`TOP-C' also provides a sequential memory model. That model is useful for first debugging an application in a sequential context, and then re-compiling it with one of the parallel `TOP-C' libraries for production use. The application code for the sequential library is usually both source and object compatible with the application code for a parallel library. The sequential library emulates an application with a single `TOP-C' library.
The sequential memory model emulates an application in which
DoTask() is executed in the context of the single slave
process/thread, and all other code is executed in the context of the
master process/thread. This affects the values returned by
TOPC_is_master() and TOPC_rank(). In particular,
conditional code for execution on the master will work
correctly in the sequential memory model, but the following conditional
code for execution on the slave will probably not work correctly.
    int main( int argc, char *argv[] )
    {
      TOPC_init( &argc, &argv );
      if ( TOPC_is_master() )
        ...;  /* is executed in sequential model */
      else
        ...;  /* is never executed in sequential model */
      TOPC_master_slave( ..., ..., ..., ...);
      TOPC_finalize();
    }
A slave calls alarm() before waiting to receive a
message from the master. By default, if the master does not reply in a
half hour (1800 seconds), then the slave receives SIGALRM and dies.
This is to prevent runaway processes in the distributed memory version
when the master dies without killing all slaves. See section
7.4 Long Jobs and Courtesy to Others, in order to change this default.
If your application also uses SIGALRM, then run your
application with --TOPC-slave-timeout=0 and `TOP-C'
will not use SIGALRM.
GenerateTaskInput()
and DoTask()
This memory is managed by `TOP-C'.
The slave process attempts to set its current directory to that of the
master inside TOPC_init() and produces a warning if unsuccessful.
When a task buffer is copied into `TOP-C' space, it becomes word-aligned. If the buffer was originally not word-aligned, but some field in the buffer was word-aligned, the internal field will no longer be word-aligned. On some architectures, casting a non-word-aligned field to `int' or certain other types will cause a bus error.
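One common way to avoid the bus error is to copy the field out of the buffer with memcpy() instead of casting a pointer into the buffer. The following is a minimal sketch; read_int_at() is a hypothetical helper name, not part of `TOP-C'.

```c
#include <assert.h>
#include <string.h>

/* Read an int stored at an arbitrary byte offset in a buffer, without
   assuming that the address buf + offset is suitably aligned for int. */
static int read_int_at(const void *buf, size_t offset)
{
    int value;
    memcpy(&value, (const unsigned char *)buf + offset, sizeof value);
    return value;
}
```

A callback that receives a task buffer could call read_int_at(input, offset) where it would otherwise have written *(int *)((char *)input + offset).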