(j3.2006) Get_Command_Argument

Tobias Burnus
Thu Dec 15 05:50:24 EST 2011


On 12/15/2011 10:02 AM, Malcolm Cohen wrote:
>> Maybe it would be reasonable as well to allow the file name in OPEN and
>> INQUIRE statements to be of kind c_char.
>
> No.  Again, the canonical filename on some recent operating systems is 
> Unicode, not an 8-bit C char.

Yes, I have already seen a request that the FILE= argument of
OPEN/INQUIRE should accept [as a vendor extension] values of the
character kind selected_char_kind('ISO_10646'). It shouldn't be very
difficult to implement: on Unix systems with UTF-8, one just needs to
convert the string to UTF-8, as one already does for I/O, and on
Windows one can also pass UTF-8 strings to "fopen".
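
To illustrate the conversion step mentioned above, here is a minimal
sketch in Python (not Fortran, and not how any compiler actually does
it). It assumes the ISO_10646 kind stores one 4-byte code point per
character, as I believe gfortran does; the function name ucs4_to_utf8
and the sample file name are made up for the example.

```python
import struct

def ucs4_to_utf8(code_points):
    """Re-encode a sequence of Unicode code points as UTF-8 bytes.

    The code points are first packed as little-endian 4-byte units,
    which is the storage layout ASSUMED here for an ISO_10646 string.
    """
    ucs4 = struct.pack(f"<{len(code_points)}I", *code_points)
    return ucs4.decode("utf-32-le").encode("utf-8")

# Hypothetical file name containing one non-ASCII character (U+00F1).
code_points = [ord(c) for c in "a\u00f1o.txt"]
utf8_name = ucs4_to_utf8(code_points)
```

The resulting byte string is what a run-time library could hand to a
byte-oriented open call on a UTF-8 system; here the 7 characters become
8 bytes.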

Fortunately, on systems based on UTF-8 (like most Unix systems), using
UTF-8 characters in default-kind character strings mostly works. (It
primarily fails where the length plays a role, e.g. when manipulating
the strings or for I/O with an "(a10)" format, where the width will
depend on the number of bytes per character.)
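
The length problem can be shown with a small Python sketch, where a
bytes object stands in for a default-kind string holding UTF-8 data.
The sample name and the helper fits_cleanly are hypothetical; the point
is only that a byte-counted width can cut a multi-byte character in
half.

```python
def fits_cleanly(raw: bytes, width: int) -> bool:
    """Return True if the first `width` bytes of `raw` are valid UTF-8."""
    try:
        raw[:width].decode("utf-8")
        return True
    except UnicodeDecodeError:
        # The cut fell in the middle of a multi-byte character.
        return False

name = "Gr\u00fc\u00dfe.dat"      # 9 characters, two of them non-ASCII
raw = name.encode("utf-8")        # what a default-kind string would hold

char_count = len(name)            # counted in characters
byte_count = len(raw)             # counted in bytes, as a byte-based LEN would

# A 3-byte field splits the two-byte character; a 4-byte field does not.
cut_at_3 = fits_cleanly(raw, 3)
cut_at_4 = fits_cleanly(raw, 4)
```

So a fixed-width edit descriptor applied to such a string may emit a
partial character, and the visible column width no longer matches the
requested width.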

I have no idea how well that will work with file names on Windows,
however. One can "fopen" a file with a UTF-8 file name ("ccs=UTF-8"),
but without Unicode support in OPEN/INQUIRE that won't happen, and I
would not be surprised if, depending on the input, one gets garbage
file names. (Especially since Windows likes 16-bit Unicode characters,
e.g. for its wchar_t type and the w* API functions.)

Actually, for Windows, I also saw a request that it should be easy to
convert an ISO_10646 string into something that can be passed to the
Windows API. I think it was a UTF-16 string, but I may be
misremembering; I recall that it was an encoding (like UTF-8) in which
a single character can occupy a varying number of bytes.
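
If it was indeed UTF-16, the variable width looks as follows. This is a
Python sketch only; the helper utf16_code_units is a made-up name, and
it relies on Python's built-in "utf-16-le" codec rather than on
anything a Fortran compiler provides.

```python
def utf16_code_units(text: str) -> list:
    """Return the 16-bit code units of `text` when encoded as UTF-16."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]

# A character inside the Basic Multilingual Plane needs one 16-bit unit.
bmp_units = utf16_code_units("\u00fc")          # U+00FC

# A character outside it (here U+1D11E) needs a surrogate pair, i.e. two
# 16-bit units, so one character occupies 4 bytes instead of 2.
astral_units = utf16_code_units("\U0001D11E")
```

A conversion routine from the 4-byte kind therefore cannot assume that
the output has exactly twice as many bytes as the input has characters.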

Tobias

PS: I wonder how widely non-8-bit characters are used with Fortran; how
many users are (not) satisfied with UTF-8 in default-kind characters;
and how widely the ISO_10646-kind characters are used. (I think
compiler support is still a bit limited; GCC has supported the kind
since 4.4, but still does not make it easy to type non-ASCII characters
as character literals in the source code.)

Given that Fortran seems to have a large user base in countries with
non-Latin scripts, I assume that the potential user base is large.
(http://www.google.com/trends?q=Fortran lists as top languages: Korean,
Japanese, Greek, Chinese; and as top regions: South Korea, Iran, Japan,
Taiwan, Greece, China, India, Russian Federation.)


