(j3.2006) Get_Command_Argument
Tobias Burnus
burnus
Thu Dec 15 05:50:24 EST 2011
On 12/15/2011 10:02 AM, Malcolm Cohen wrote:
>> Maybe it would be reasonable as well to allow the file name in OPEN and
>> INQUIRE statements to be of kind c_char.
>
> No. Again, the canonical filename on some recent operating systems is
> Unicode, not an 8-bit C char.
Yes, I have already seen a request that the file argument of OPEN/INQUIRE
should accept [as a vendor extension] arguments of the character kind
selected_char_kind('ISO_10646'). It shouldn't be very difficult to
implement: on Unix systems with UTF-8, one just needs to convert the name
to UTF-8, as one already does for I/O, and on Windows one can also pass
UTF-8 strings to "fopen".
Fortunately, on systems based on UTF-8 (like most Unix systems), using
UTF-8 characters in default-kind character strings mostly works. (It
primarily fails if the length plays a role, e.g. for manipulating the
strings or for I/O with "(a10)" format where the width will depend on
the number of bytes per character.)
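To illustrate the length problem, here is a minimal sketch in Python (not Fortran, purely for demonstration). The sample string and the helper name are made up for the example; it only shows that the byte count and the character count of a UTF-8 string differ, and that cutting at a fixed byte width can split a multi-byte character.

```python
# Hypothetical sample: "Gruesse" written with u-umlaut and sharp s (5 characters).
SAMPLE = "Gr\u00fc\u00dfe"


def utf8_lengths(text):
    """Return (number of characters, number of UTF-8 bytes) for text."""
    return len(text), len(text.encode("utf-8"))


def cut_at_byte_width(text, width):
    """Mimic a byte-oriented fixed-width field: keep only the first
    `width` bytes of the UTF-8 encoding, then decode what is left.
    A character split by the cut shows up as U+FFFD."""
    raw = text.encode("utf-8")[:width]
    return raw.decode("utf-8", errors="replace")


chars, nbytes = utf8_lengths(SAMPLE)
# A field that is 5 bytes wide cannot hold the 5-character string,
# because the u-umlaut and the sharp s each occupy two bytes.
clipped = cut_at_byte_width(SAMPLE, 5)
print(chars, nbytes, repr(clipped))
```

A default-kind character variable that stores UTF-8 bytes sees the byte count, which is why length-dependent operations are where things go wrong.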
I have no idea how well that will work with file names on Windows,
however. One can "fopen" a file with a UTF-8 file name ("ccs=UTF-8")
but without Unicode support for OPEN/INQUIRE that won't happen and I
would not be surprised if - depending on the input - one will get
garbage file names. (Especially, since Windows likes 16-bit Unicode
characters [e.g. for their wchar_t type and the w* API functions].)
Actually, for Windows, I also saw the request that it should be easy to
convert an ISO_10646 string into something passable to the Windows API. I
think it was a UTF-16 string, but I might misremember it; I recall that it
was an encoding (like UTF-8) where a single character can have different
byte-widths.
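For what it is worth, UTF-16 does have that property. A small Python sketch (again only for illustration; the function name is hypothetical) counts the 16-bit code units per character: characters in the Basic Multilingual Plane take one unit (2 bytes), while characters beyond it take a surrogate pair (4 bytes).

```python
def utf16_code_units(ch):
    """Number of 16-bit code units needed to encode one character in UTF-16."""
    # "utf-16-le" is used so that no byte-order mark is prepended.
    return len(ch.encode("utf-16-le")) // 2


# One code unit each: ASCII 'A' and e-acute (U+00E9).
# Two code units (a surrogate pair): U+1F600, which lies outside the BMP.
for ch in ("A", "\u00e9", "\U0001F600"):
    print(hex(ord(ch)), utf16_code_units(ch))
```

So a conversion from the ISO_10646 kind (fixed 4 bytes per character) to UTF-16 cannot simply assume two bytes per character.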
Tobias
PS: I wonder how widely non-8bit characters are used with Fortran; how
many users are (not) satisfied with UTF-8 and default characters; and
how widely the ISO_10646-kind characters are used. (I think the compiler
support is still a bit limited; GCC has supported it since 4.4, but still
does not make it easy to type non-ASCII characters as character literals
in the source code.)
Given that Fortran seems to have a large user base in countries with
non-Latin scripts, I assume that the potential user base is large.
(http://www.google.com/trends?q=Fortran lists as top Languages: Korean,
Japanese, Greek, Chinese and as top regions South Korea, Iran, Japan,
Taiwan, Greece, China, India, Russian Federation.)