(j3.2006) (SC22WG5.4935) [ukfortran] WG5 ballot on first draft TS 18508, Additional Parallel Features in Fortran

Bill Long longb
Thu Mar 14 14:58:35 EDT 2013



On 3/14/13 9:36 AM, N.M. Maclaren wrote:
>> >What is proposed is very similar to the way we treat I/O errors.  There
>> >is a mechanism for notification of a problem (STAT=,  like I/O) and a
>> >way to identify where the error occurred (failed images index values;
>> >the I/O unit number is already available to the users).  Unlike I/O
>> >where we have singled out some failure modes (end-of-file, for example),
>> >we did not specify particular modes of failure for images. In current
>> >experience, it is almost always a non-recoverable memory error, but I
>> >think we should wait for more data before being more specific.   The
>> >current spec is intentionally minimal.

> The major difference is that I/O errors affect just one file, and the
> minor one is that many of them are actually recoverable (though not, at
> present, in Fortran).  The killer about node failure is that they are
> necessarily NOT so localised.
>

I/O errors like reading past the end of file will affect just that file. 
Errors related to hardware failure of a disk array might affect all of 
the files used by the program.

Any incomplete data transfer into or out of a failed image is probably 
corrupt, and the standard needs to be written assuming that is the case. 
How many other images are affected will depend highly on the nature of 
the program.

I would note that we already have STAT= specifiers on existing 
statements like SYNC ALL.  These already provide a means to register a 
failed image by defining the status variable with a processor-dependent 
value.  The new feature in the TS draft is to make that particular error 
status equal to the value of a standard-defined named constant.  This 
change is motivated by the new capability of effectively changing the 
number of images in the job, so it is potentially possible for the 
program to actually do something about the problem.
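
To illustrate the point above, here is a minimal sketch of how a program 
might test the status from SYNC ALL.  It assumes the named constant is 
spelled STAT_FAILED_IMAGE and is provided by ISO_FORTRAN_ENV, which is 
the spelling later adopted in Fortran 2018; this message itself does not 
name the constant.  The program name is made up for illustration and the 
recovery step is only a placeholder comment, since what a program can 
usefully do there depends on the application.

```fortran
program sync_with_stat
  ! Assumption: a processor that supports the Fortran 2018 constant
  ! STAT_FAILED_IMAGE (STAT_STOPPED_IMAGE is already in Fortran 2008).
  use, intrinsic :: iso_fortran_env, only: stat_failed_image, &
                                           stat_stopped_image
  implicit none
  integer :: istat

  ! With STAT= present, an error condition defines istat with a
  ! nonzero value instead of initiating error termination.
  sync all (stat=istat)

  if (istat == 0) then
     ! Normal case: every image reached the synchronization.
     continue
  else if (istat == stat_failed_image) then
     ! Portable test: the value is the same named constant on every
     ! processor, rather than a processor-dependent error code.
     write (*, '(a,i0,a)') 'image ', this_image(), &
          ': a failed image was detected at SYNC ALL'
     ! Placeholder: application-specific recovery would go here.
  else if (istat == stat_stopped_image) then
     write (*, '(a,i0,a)') 'image ', this_image(), &
          ': an image has already stopped'
  else
     ! Any other nonzero value is processor dependent.
     error stop 'SYNC ALL reported an unexpected error status'
  end if
end program sync_with_stat
```

Without the named constant, the second branch could only be written 
against a value documented by one particular vendor, which is the 
portability gap the TS change is meant to close.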

Cheers,
Bill


-- 
Bill Long                                           longb at cray.com
Fortran Technical Support    &                 voice: 651-605-9024
Bioinformatics Software Development            fax:   651-605-9142
Cray Inc./Cray Plaza, Suite 210/380 Jackson St./St. Paul, MN 55101




