[J3] floating-point status

Malcolm Cohen malcolm at nag-j.co.jp
Mon Oct 22 20:48:35 EDT 2018


> But, focusing more closely, a typical image runs on a processor core or node that has many threads. How does a programmer deal with flags and traps that occur on a specific thread? GPUs are a notable instance of this issue, but even hyperthreads or different cores on the same chip are problems. 



No, hyperthreads and different cores are not problems.  If the IEEE flags and controls were shared, they would be a contended-for resource, so in the context of threads they really ought to be separate for performance reasons!  They are indeed separate on x86/x64 cores and hyperthreads, and I believe they are on ARM too, though I am a bit less familiar with the gory details there.  I don’t know of any hardware where different CPUs share a single floating-point mode control word, let alone the flags (if they did, which one would trap if a flag were raised?).

 

> Does the IEEE spec take into account threads? Or does it assume that every entity in the processor that can perform floating point operations has its own set of flags and trap modes (and rounding modes)?

 

It makes no such demands on the processor.  The earlier 754 standard was written for hardware; the current one is not.  Because few people program in assembler, it is written for programming environments.
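
To make "programming environment" concrete, here is a minimal sketch using Python's decimal module, a software decimal arithmetic that keeps its own flags, traps, and rounding mode in a per-thread context. It is only an analogy at the language level: it says nothing about what x86 or ARM hardware does, and nothing about Fortran. The names worker and flags_seen_by_each_thread, and the keys of the report dictionary, are invented for this illustration.

```python
import threading
from decimal import Decimal, DivisionByZero, getcontext


def worker(report):
    # getcontext() returns the context of the *calling* thread, so these
    # changes affect only the worker's own flags and trap settings.
    ctx = getcontext()
    ctx.traps[DivisionByZero] = False   # record the flag instead of trapping
    report["result"] = Decimal(1) / Decimal(0)
    report["worker_flag"] = bool(ctx.flags[DivisionByZero])


def flags_seen_by_each_thread():
    # Hypothetical helper: run the division in a second thread, then look
    # at the main thread's own context afterwards.
    main_ctx = getcontext()
    main_ctx.clear_flags()
    report = {}
    t = threading.Thread(target=worker, args=(report,))
    t.start()
    t.join()
    report["main_flag"] = bool(main_ctx.flags[DivisionByZero])
    report["main_trap"] = bool(main_ctx.traps[DivisionByZero])
    return report


if __name__ == "__main__":
    print(flags_seen_by_each_thread())
```

In this sketch the worker sees its own division-by-zero flag raised, while the main thread's flag stays clear and its trap setting is untouched by the worker's change. That is the "every entity has its own set of flags and trap modes" arrangement from the question, provided here by the language runtime rather than demanded of the processor.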

 

GPUs can indeed be problematic, mostly because in the past few of them made a serious attempt to conform.  If the GPU doesn’t bother to handle NaNs or subnormals, or to raise flags, that is hardly the fault of IEEE.  To be fair, such things are usually not all that useful when a GPU is driving a display; they matter when it is being used for general-purpose computation.  “GPUs” intended for the latter usage are far more likely to conform to IEEE.

 

(The people participating in the IEEE 754 revision include those who are producing highly parallel hardware, so it would be quite wrong to think that they’ve never heard of threads before, or to think they’ve never discussed the implications of highly-parallel computation.  Whether they’ve gotten it right is of course a different matter.)

 

Cheers,

-- 

..............Malcolm Cohen, NAG Oxford/Tokyo.
