(j3.2006) (SC22WG5.4678) Part 2 of the standard
John Reid
John.Reid
Mon Apr 16 11:35:17 EDT 2012
WG5,
Following helpful comments from Clive Page, Bill Long, and Malcolm
Cohen, I have revised both my paper and the code.
New version attached.
Once again, comments will be very welcome.
John.
-------------- next part --------------
ISO/IEC JTC1/SC22/WG5 N1913-2
The module for Part 2, varying length character strings
John Reid
16 April 2012
I promised at Garching to look into the conversion to Fortran 2003 of the
procedures of subclauses 3.6 and 3.7 of Part 2 of the Standard. These are
the I/O procedures GET, PUT, and PUT_LINE and the elemental procedures
EXTRACT, INSERT, REMOVE, REPLACE, and INSERT. The other procedures of
Part 2, except for CHAR with a character argument and VAR_STR, are already
in Part 1.
Lawrie Schonfelder wrote a sample module to implement Part 2. This is
available from
ftp://ftp.nag.co.uk/sc22wg5/ISO_VARYING_STRING/
He also wrote two sample programs ilustrating the use of the facilites.
These are included as an annex to Part 2 and are available from the same
ftp site. Bill Long has pointed out that the sample programs are not
portable because they rely on the IOSTAT value -1 indicating that the
end of the file has been reached.
I have made a new module and new versions of the two sample programs. These
are appended to this paper. They run successfully under
the Nag compiler. Each occurrence of
type(VARYING_STRING),INTENT(IN)
was replaced by
CHARACTER(LEN=*),INTENT(IN)
Each other occurrence of
type(VARYING_STRING)
is replaced by
CHARACTER(LEN=:),ALLOCATABLE
This resulted in arguments that were previously documented as either of
type CHARACTER or VARYING_STRING becoming only of type CHARACTER. As well
as the procedures subclauses 3.6 and 3.7, I included CHAR with a
character argument and VAR_STR. The number of specific procedures reduced
from 43 to 18.
There were obvious mechanical changes needed within the procedures.
The functions EXTRACT, INSERT, REMOVE, and REPLACE cannot be elemental
because each function result is allocatable with deferred character length.
The subroutine SPLIT cannot be elemental because the arguments string, word,
and separator are allocatable with deferred character length. This is because
the elements of a character array have to have the same character length.
A similar problem occurs in the vocabulary sample program, where an array
holds words of varying length. Following a suggestion from Malcolm Cohen, I
declared my own derived type with a single component of deferred character
length and worked with arrays of this type. In this example, none of the
functions was called elementally, but a user that wishes to do this could
write impure elemental wrappers to get the effect.
It might be thought that non-elemental invocations of EXTRACT, INSERT, REMOVE,
REPLACE, and SPLIT might be replaced by simple use of substrings and
concatenation. For example, REMOVE(string,start,finish) can usually be
replaced by string(1:start-1)//string(finish+1:), but there are end cases
where it cannot. The reader can see the details in the code that is appended.
In both sample programs, I access ISO_FORTRAN_ENV to get the IOSTAT value that
indicates that the end of the file has been reached.
Lawrie Schonfelder wrote a test suite that is available from:
http://www.fortran.com/vst_95a.zip
I have adapted codes from here to test the new module, but I am not confident
that these adapted codes are adequate to fully test the revised code. In
particular, they rely on interactive input from the keyboard and I was unsure
of what is needed. If this work is to be taken further, new tests are needed.
Two questions were asked at the meeting in Garching:
1. Can programs that use the varying length strings module be so easily
converted that Part 2 of the Standard should be withdrawn?
2. Should extra intrinsic procedures be added to part 1?
The user with arrays of type VARYING_STRING may declare a new derived type
with a single component of deferred character length and write impure
elemental wrappers to support elemental calls. Apart from this, the module
given here (after full testing) should allow simple conversion. Changes
are needed for the extraction of a single character; for example,
set%chars(j) needs to be replaced by set(j:j). They are also needed
if array features are employed for the character array component. For
example
IF( ALL(string%chars(ipos:ipos+lt-1) == target%chars) )THEN
becomes
IF( string(ipos:ipos+lt-1) == target )THEN
I prefer not to comment on whether these functions should be added to Part 1.
The case needs to be made by those who need them. We might consider providing
an open source and fully tested version of the appended module.
...............................................................................
MODULE ISO_VARYING_STRING
USE,INTRINSIC :: ISO_FORTRAN_ENV
IMPLICIT NONE
! Adapted by John Reid from the module written by J.L.Schonfelder.
! Version produced (12 April 2012)
!-----------------------------------------------------------------------------
! This module provides revised versions of the generic procedures GET, PUT,
! PUT_LINE, EXTRACT, INSERT, REMOVE, REPLACE, and INSERT of
! ISO/IEC 1539-2:2000(E).
!-----------------------------------------------------------------------------
! NB This module has not been fully tested.
PRIVATE
!-----------------------------------------------------------------------------!
! By default all entities declared or defined in this module are private to !
! the module. Only those entities declared explicitly as being public are !
! accessible to programs using the module. In particular, the procedures !
! defined herein are made accessible via their generic identifiers only; !
! their specific names are private. !
!-----------------------------------------------------------------------------!
!----- GENERIC PROCEDURE INTERFACE DEFINITIONS -------------------------------!
!----- Conversion procedure interfaces ---------------------------------------!
INTERFACE VAR_STR
MODULE PROCEDURE c_to_s ! character to string
ENDINTERFACE
INTERFACE CHAR
MODULE PROCEDURE s_to_c, & ! string to character
s_to_fix_c ! string to specified length character
ENDINTERFACE
!----- Input procedure interfaces --------------------------------------------!
INTERFACE GET
MODULE PROCEDURE get_d_eor, & ! default unit, EoR termination
get_u_eor, & ! specified unit, EoR termination
get_d_tset_c, & ! default unit, char set termination
get_u_tset_c ! specified unit, char set termination
ENDINTERFACE
!----- Output procedure interfaces -------------------------------------------!
INTERFACE PUT
MODULE PROCEDURE put_d_c, & ! char to default unit
put_u_c ! char to specified unit
ENDINTERFACE
INTERFACE PUT_LINE
MODULE PROCEDURE putline_d_c, & ! char to default unit
putline_u_c ! char to specified unit
ENDINTERFACE
!----- Insert procedure interfaces -------------------------------------------!
INTERFACE INSERT
MODULE PROCEDURE insert_cc ! string in string
ENDINTERFACE
!----- Replace procedure interfaces ------------------------------------------!
INTERFACE REPLACE
MODULE PROCEDURE replace_ss, & ! string by string, at specified
replace_ss_sf,& ! string by string, between
replace_sss ! in string replace string by string
ENDINTERFACE
!----- Remove procedure interface --------------------------------------------!
INTERFACE REMOVE
MODULE PROCEDURE remove_c ! characters from char , and finish
ENDINTERFACE
!----- Extract procedure interface -------------------------------------------!
INTERFACE EXTRACT
MODULE PROCEDURE extract_c ! from char extract string, and finish
ENDINTERFACE
!----- Split procedure interface ---------------------------------------------!
INTERFACE SPLIT
MODULE PROCEDURE split_c ! character in set
ENDINTERFACE
!----- specification of publicly accessible entities -------------------------!
PUBLIC :: CHAR,GET,PUT,PUT_LINE,EXTRACT,INSERT,REMOVE,REPLACE,SPLIT,VAR_STR
CONTAINS
!----- Conversion Procedures ------------------------------------------------!
FUNCTION c_to_s(chr) ! generic VAR_STR
CHARACTER(LEN=:),ALLOCATABLE :: c_to_s
CHARACTER(LEN=*),INTENT(IN) :: chr
c_to_s = chr
ENDFUNCTION c_to_s
PURE FUNCTION s_to_c(string) ! generic CHAR
CHARACTER(LEN=*),INTENT(IN) :: string
CHARACTER(LEN=:),ALLOCATABLE :: s_to_c
! returns the characters of string as an automatically sized character
s_to_c = string
ENDFUNCTION s_to_c
PURE FUNCTION s_to_fix_c(string,length) ! generic CHAR
CHARACTER(LEN=*),INTENT(IN) :: string
INTEGER,INTENT(IN) :: length
CHARACTER(LEN=length) :: s_to_fix_c
! returns the character of fixed length, length, containing the characters
! of string either padded with blanks or truncated on the right to fit
s_to_fix_c = string
ENDFUNCTION s_to_fix_c
!----- Input string procedure -----------------------------------------------!
SUBROUTINE get_d_eor(string,maxlen,iostat)
CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT) :: string
! the variable to be filled with
! characters read from the
! file connected to the default unit
INTEGER,INTENT(IN),OPTIONAL :: maxlen
! if present indicates the maximum
! number of characters that will be
! read from the file
INTEGER,INTENT(OUT),OPTIONAL :: iostat
! if present used to return the status
! of the data transfer
! if absent errors cause termination
! reads string from the default unit starting at next character in the file
! and terminating at the end of record or after maxlen characters.
call get_u_eor (input_unit, string, maxlen, iostat)
ENDSUBROUTINE get_d_eor
SUBROUTINE get_u_eor(unit,string,maxlen,iostat)
INTEGER,INTENT(IN) :: unit
! identifies the input unit which must be
! connected for sequential formatted read
CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT) :: string
! the variable to be filled with
! characters read from the
! file connected to the unit
INTEGER,INTENT(IN),OPTIONAL :: maxlen
! if present indicates the maximum
! number of characters that will be
! read from the file
INTEGER,INTENT(OUT),OPTIONAL :: iostat
! if present used to return the status
! of the data transfer
! if absent errors cause termination
! reads string from unit starting at next character in the file and
! terminating at the end of record or after maxlen characters.
CHARACTER(LEN=80) :: buffer
INTEGER :: ist,nch,toread,nb
IF(PRESENT(maxlen))THEN
toread=maxlen
ELSE
toread=HUGE(1)
ENDIF
string="" ! Start with an empty string
DO ! repeatedly read buffer and add to string until EoR
! or maxlen reached
IF(toread <= 0)EXIT
nb=MIN(80,toread)
READ(unit,FMT='(A)',ADVANCE='NO',EOR=9999,SIZE=nch,IOSTAT=ist) buffer(1:nb)
IF( ist /= 0 )THEN
IF(PRESENT(iostat)) THEN
iostat=ist
RETURN
ELSE
WRITE(*,*) " Error No.",ist, &
" during READ_STRING of varying string on UNIT ",unit
STOP
ENDIF
ENDIF
string = string //buffer(1:nb)
toread = toread - nb
ENDDO
IF(PRESENT(iostat)) iostat = 0
RETURN
9999 string = string //buffer(1:nch)
IF(PRESENT(iostat)) iostat = ist
ENDSUBROUTINE get_u_eor
SUBROUTINE get_d_tset_c(string,set,separator,maxlen,iostat)
CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT) :: string
! the variable to be filled with
! characters read from the
! file connected to the default unit
CHARACTER(LEN=*),INTENT(IN) :: set
! the set of characters which if found in
! the input terminate the read
CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT),OPTIONAL :: separator
! the actual separator character from set
! found as the input string terminator
! returned as zero length if termination
! by maxlen or EOR
INTEGER,INTENT(IN),OPTIONAL :: maxlen
! if present indicates the maximum
! number of characters that will be
! read from the file
INTEGER,INTENT(OUT),OPTIONAL :: iostat
! if present used to return the status
! of the data transfer
! if absent errors cause termination
! reads string from the default unit starting at next character in the file and
! terminating at the end of record, occurance of a character in set,
! or after reading maxlen characters.
CALL get_u_tset_c(input_unit,string,set,separator,maxlen,iostat)
ENDSUBROUTINE get_d_tset_c
SUBROUTINE get_u_tset_c(unit,string,set,separator,maxlen,iostat)
INTEGER,INTENT(IN) :: unit
! identifies the input unit which must be
! connected for sequential formatted read
CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT) :: string
! the variable to be filled with
! characters read from the
! file connected to the unit
CHARACTER(LEN=*),INTENT(IN) :: set
! the set of characters which if found in
! the input terminate the read
CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT),OPTIONAL :: separator
! the actual separator character from set
! found as the input string terminator
! returned as zero length if termination
! by maxlen or EOR
INTEGER,INTENT(IN),OPTIONAL :: maxlen
! if present indicates the maximum
! number of characters that will be
! read from the file
INTEGER,INTENT(OUT),OPTIONAL :: iostat
! if present used to return the status
! of the data transfer
! if absent errors cause termination
! reads string from unit starting at next character in the file and
! terminating at the end of record, occurance of a character in set,
! or after reading maxlen characters.
CHARACTER :: buffer ! characters must be read one at a time to detect
! first terminator character in set
INTEGER :: ist,j,toread,lenset
lenset = LEN(set)
IF(PRESENT(maxlen))THEN
toread=maxlen
ELSE
toread=HUGE(1)
ENDIF
string = "" ! clears return string N.B. will also deallocate string via the
! assignment operation
IF(PRESENT(separator)) separator="" ! clear separator
readchar:DO ! repeatedly read buffer and add to string
IF(toread <= 0)EXIT readchar ! maxlen reached
READ(unit,FMT='(A)',ADVANCE='NO',EOR=9999,IOSTAT=ist) buffer
IF( ist /= 0 )THEN
IF(PRESENT(iostat)) THEN
iostat=ist
RETURN
ELSE
WRITE(*,*) " Error No.",ist, &
" during GET of varying string on unit ",unit
STOP
ENDIF
ENDIF
! check for occurance of set character in buffer
DO j = 1,lenset
IF(buffer == set(j:j))THEN
IF(PRESENT(separator)) separator=buffer
EXIT readchar ! separator terminator found
ENDIF
ENDDO
string = string//buffer
toread = toread - 1
ENDDO readchar
IF(PRESENT(iostat)) iostat = 0
RETURN
9999 CONTINUE ! EOR terminator read
IF(PRESENT(iostat)) iostat = ist
ENDSUBROUTINE get_u_tset_c
!----- Output string procedures ----------------------------------------------!
SUBROUTINE put_d_c(string,iostat)
CHARACTER(LEN=*),INTENT(IN) :: string
! the character variable to be appended to
! the current record or to the start of
! the next record if there is no
! current record
! uses the default unit
INTEGER,INTENT(OUT),OPTIONAL :: iostat
! if present used to return the status
! of the data transfer
! if absent errors cause termination
CALL put_u_c(output_unit,string,iostat)
ENDSUBROUTINE put_d_c
SUBROUTINE put_u_c(unit,string,iostat)
INTEGER,INTENT(IN) :: unit
! identifies the output unit which must
! be connected for sequential formatted
! write
CHARACTER(LEN=*),INTENT(IN) :: string
! the character variable to be appended to
! the current record or to the start of
! the next record if there is no
! current record
INTEGER,INTENT(OUT),OPTIONAL :: iostat
! if present used to return the status
! of the data transfer
! if absent errors cause termination
INTEGER :: ist
WRITE(unit,FMT='(A)',ADVANCE='NO',IOSTAT=ist) string
IF( ist /= 0 )THEN
IF(PRESENT(iostat))THEN
iostat = ist
RETURN
ELSE
WRITE(*,*) " Error No.",ist," during PUT of character on UNIT ",unit
STOP
ENDIF
ENDIF
IF(PRESENT(iostat)) iostat=0
ENDSUBROUTINE put_u_c
SUBROUTINE putline_d_c(string,iostat)
CHARACTER(LEN=*),INTENT(IN) :: string
! the character variable to be appended to
! the current record or to the start of
! the next record if there is no
! current record
! uses the default unit
INTEGER,INTENT(OUT),OPTIONAL :: iostat
! if present used to return the status
! of the data transfer
! if absent errors cause termination
! appends the string to the current record and then ends the record
! leaves the file positioned after the record just completed which then
! becomes the previous and last record in the file.
CALL putline_u_c(output_unit,string,iostat)
ENDSUBROUTINE putline_d_c
SUBROUTINE putline_u_c(unit,string,iostat)
INTEGER,INTENT(IN) :: unit
! identifies the output unit which must
! be connected for sequential formatted
! write
CHARACTER(LEN=*),INTENT(IN) :: string
! the character variable to be appended to
! the current record or to the start of
! the next record if there is no
! current record
INTEGER,INTENT(OUT),OPTIONAL :: iostat
! if present used to return the status
! of the data transfer
! if absent errors cause termination
! appends the string to the current record and then ends the record
! leaves the file positioned after the record just completed which then
! becomes the previous and last record in the file.
INTEGER :: ist
WRITE(unit,FMT='(A,/)',ADVANCE='NO',IOSTAT=ist) string
IF(PRESENT(iostat))THEN
iostat = ist
RETURN
ELSEIF( ist /= 0 )THEN
WRITE(*,*) " Error No.",ist, &
" during WRITE_LINE of character on UNIT",unit
STOP
ENDIF
ENDSUBROUTINE putline_u_c
!----- Insert procedure ----------------------------------------------------!
FUNCTION insert_cc(string,start,substring)
CHARACTER(LEN=:),ALLOCATABLE :: insert_cc
CHARACTER(LEN=*),INTENT(IN) :: string
INTEGER,INTENT(IN) :: start
CHARACTER(LEN=*),INTENT(IN) :: substring
! calculates result string by inserting the substring into string
! beginning at position start pushing the remainder of the string
! to the right and enlarging it accordingly,
! if start is greater than LEN(string) the substring is simply appended ! to string by concatenation. if start is less than 1
! substring is inserted before string, ie. start is treated as if it were 1
INTEGER :: ip,is,lsub,ls
lsub = LEN(substring); ls = LEN(string)
is = MAX(start,1)
ip = MIN(ls+1,is)
insert_cc = string(1:ip-1)//substring//string(ip:ls)
ENDFUNCTION insert_cc
!----- Replace procedures ---------------------------------------------------!
FUNCTION replace_ss(string,start,substring)
CHARACTER(LEN=*),INTENT(IN) :: string
INTEGER,INTENT(IN) :: start
CHARACTER(LEN=*),INTENT(IN) :: substring
CHARACTER(LEN=:),ALLOCATABLE :: replace_ss
! calculates the result string by the following actions:
! inserts the substring into string beginning at position
! start replacing the following LEN(substring) characters of the string
! and enlarging string if necessary. if start is greater than LEN(string)
! substring is simply appended to string by concatenation. If start is less
! than 1, substring replaces characters in string starting at 1
INTEGER :: ip,is,lsub,ls
lsub = LEN(substring); ls = LEN(string)
is = MAX(start,1)
ip = MIN(ls+1,is)
replace_ss = string(1:ip-1)//substring//string(ip+lsub:)
ENDFUNCTION replace_ss
FUNCTION replace_ss_sf(string,start,finish,substring)
CHARACTER(LEN=*),INTENT(IN) :: string
INTEGER,INTENT(IN) :: start,finish
CHARACTER(LEN=*),INTENT(IN) :: substring
CHARACTER(LEN=:),ALLOCATABLE :: replace_ss_sf
! calculates the result string by the following actions:
! inserts the substring into string beginning at position
! start replacing the following finish-start+1 characters of the string
! and enlarging or shrinking the string if necessary.
! If start is greater than LEN(string) substring is simply appended to string
! by concatenation. If start is less than 1, start = 1 is used
! If finish is greater than LEN(string), finish = LEN(string) is used
! If finish is less than start, substring is inserted before start
INTEGER :: ip,is,if,nw,lsub,ls
lsub = LEN(substring); ls = LEN(string)
is = MAX(start,1)
ip = MIN(ls+1,is)
if = MAX(ip-1,MIN(finish,ls))
nw = lsub + ls - if+ip-1
replace_ss_sf = string(1:ip-1)//substring//string(if+1:ls)
ENDFUNCTION replace_ss_sf
FUNCTION replace_sss(string,target,substring,every,back)
CHARACTER(LEN=*),INTENT(IN) :: string,target,substring
LOGICAL,INTENT(IN),OPTIONAL :: every,back
! calculates the result string by the following actions:
! searches for occurences of target in string, and replaces these with
! substring. if back present with value true search is backward otherwise
! search is done forward. if every present with value true all occurences
! of target in string are replaced, otherwise only the first found is
! replaced. if target is not found the result is the same as string.
LOGICAL :: dir_switch, rep_search
CHARACTER(LEN=:),ALLOCATABLE :: work
INTEGER :: ls,lt,lsub,ipos,ipow
CHARACTER(LEN=:),ALLOCATABLE :: replace_sss
ls = LEN(string); lt = LEN(target); lsub = LEN(substring)
IF(lt==0)THEN
IF(ls==0)THEN
replace_sss = substring
ELSE
replace_sss = string
ENDIF
RETURN
ENDIF
work = string
IF( PRESENT(back) )THEN
dir_switch = back
ELSE
dir_switch = .FALSE.
ENDIF
IF( PRESENT(every) )THEN
rep_search = every
ELSE
rep_search = .FALSE.
ENDIF
IF( dir_switch )THEN ! backwards search
ipos = ls-lt+1
DO
IF( ipos < 1 )EXIT ! search past start of string
! test for occurance of target in string at this position
IF( string(ipos:ipos+lt-1) == target )THEN
! match found allocate space for string with this occurance of
! target replaced by substring
! copy work into temp replacing this occurance of target by
! substring
work = work(1:ipos-1)//substring//work(ipos+lt:)
IF(.NOT.rep_search)EXIT ! exit if only first replacement wanted
! move search and replacement positions over the effected positions
ipos = ipos-lt+1
ENDIF
ipos=ipos-1
ENDDO
ELSE ! forward search
ipos = 1; ipow = 1
DO
IF( ipos > ls-lt+1 )EXIT ! search past end of string
! test for occurance of target in string at this position
IF( string(ipos:ipos+lt-1) == target )THEN
! match found allocate space for string with this occurance of
! target replaced by substring
work = work(1:ipow-1)//substring//work(ipow+lt:)
IF(.NOT.rep_search)EXIT ! exit if only first replacement wanted
! move search and replacement positions over the effected positions
ipos = ipos+lt-1; ipow = ipow+lsub-1
ENDIF
ipos=ipos+1; ipow=ipow+1
ENDDO
ENDIF
replace_sss = work
ENDFUNCTION replace_sss
!----- Remove procedure ----------------------------------------------------!
FUNCTION remove_c(string,start,finish)
CHARACTER(LEN=:),ALLOCATABLE :: remove_c
CHARACTER(LEN=*),INTENT(IN) :: string
INTEGER,INTENT(IN),OPTIONAL :: start
INTEGER,INTENT(IN),OPTIONAL :: finish
! returns as result the string produced by the actions
! removes the characters between start and finish from string reducing it in
! size by MAX(0,ABS(finish-start+1))
! if start < 1 or is missing then assumes start=1
! if finish > LEN(string) or is missing then assumes finish=LEN(string)
INTEGER :: is,if,ls
ls = LEN(string)
IF (PRESENT(start)) THEN
is = MAX(1,start)
ELSE
is = 1
ENDIF
IF (PRESENT(finish)) THEN
if = MIN(ls,finish)
ELSE
if = ls
ENDIF
IF( if < is ) THEN ! zero characters to be removed, string is unchanged
remove_c = string
ELSE
remove_c = string(1:is-1)//string(if+1:)
ENDIF
ENDFUNCTION remove_c
!----- Extract procedure ---------------------------------------------------!
FUNCTION extract_c(string,start,finish)
CHARACTER(LEN=*),INTENT(IN) :: string
INTEGER,INTENT(IN),OPTIONAL :: start
INTEGER,INTENT(IN),OPTIONAL :: finish
CHARACTER(LEN=:),ALLOCATABLE :: extract_c
! extracts the characters between start and finish from character string and
! delivers these as the result of the function, string is unchanged
! if start < 1 or is missing then it is treated as 1
! if finish > LEN(string) or is missing then it is treated as LEN(string)
INTEGER :: is,if
IF (PRESENT(start)) THEN
is = MAX(1,start)
ELSE
is = 1
ENDIF
IF (PRESENT(finish)) THEN
if = MIN(LEN(string),finish)
ELSE
if = LEN(string)
ENDIF
extract_c = string(is:if)
ENDFUNCTION extract_c
!----- Split procedure ------------------------------------------------------!
SUBROUTINE split_c(string,word,set,separator,back)
CHARACTER(LEN=:),ALLOCATABLE,INTENT(INOUT) :: string
CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT) :: word
CHARACTER(LEN=*),INTENT(IN) :: set
CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT),OPTIONAL :: separator
LOGICAL,INTENT(IN),OPTIONAL :: back
! splits the input string at the first(last) character in set
! returns the leading(trailing) substring in word and the trailing(leading)
! substring in string. The search is done in the forward or backward
! direction depending on back. If separator is present, the actual separator
! character found is returned in separator.
! If no character in set is found string and separator are returned as
! zero length and the whole input string is returned in word.
LOGICAL :: dir_switch
INTEGER :: i,ls,tpos,lset
CHARACTER(LEN=:),ALLOCATABLE :: wst ! working copy of string
ls = LEN(string); lset = LEN(set)
wst=string
IF( PRESENT(back) )THEN
dir_switch = back
ELSE
dir_switch = .FALSE.
ENDIF
IF(dir_switch)THEN ! backwards search
BSEARCH:DO tpos = ls,1,-1
DO i=1,lset
IF(wst(tpos:tpos) == set(i:i))EXIT BSEARCH
ENDDO
ENDDO BSEARCH
word = wst(tpos+1:ls)
IF(PRESENT(separator))THEN
IF(tpos==0)THEN
separator = ""
ELSE
separator = wst(tpos:tpos)
ENDIF
ENDIF
string = wst(1:tpos-1)
ELSE ! forwards search
FSEARCH:DO tpos =1,ls
DO i=1,lset
IF(wst(tpos:tpos) == set(i:i))EXIT FSEARCH
ENDDO
ENDDO FSEARCH
word = wst(1:tpos-1)
IF(PRESENT(separator))THEN
IF(tpos==ls+1)THEN
separator = ""
ELSE
separator = wst(tpos:tpos)
ENDIF
ENDIF
string = wst(tpos+1:ls)
ENDIF
ENDSUBROUTINE split_c
ENDMODULE ISO_VARYING_STRING
...............................................................................
PROGRAM word_count
!-----------------------------------------------------------------------------!
! Counts the number of "words" contained in a file. The words are assumed to !
! be terminated by any one of: !
! space,comma,period,!,?, or the EoR !
! The file may have records of any length and the file may contain any number !
! of records. !
! The program prompts for the name of the file to be subject to a word count !
! and the result is written to the default output unit !
!-----------------------------------------------------------------------------!
USE ISO_VARYING_STRING
USE,INTRINSIC :: ISO_FORTRAN_ENV
IMPLICIT NONE
CHARACTER(len=:),ALLOCATABLE :: line,fname
INTEGER :: ierr,nd,wcount=0
WRITE(*,ADVANCE='NO',FMT='(A)') " Input name of file?"
CALL GET(STRING=fname) ! read the required filename from the default
! input unit assumed to be the whole of the record read
OPEN(UNIT=10,FILE=fname)
file_read: DO ! until EoF reached
CALL GET(10,line,IOSTAT=ierr) ! read next line of file
IF(ierr == IOSTAT_END .OR. ierr > 0 )EXIT file_read
word_scan: DO ! until end of line
nd=SCAN(line," ,.!?") ! scan to find end of word
IF(nd == 0)THEN ! EoR is end of word
nd = LEN(line)
EXIT word_scan
ENDIF
IF(nd > 1) wcount=wcount+1 ! at least one non-terminator character
! in the word
line = REMOVE(line,1,nd) ! strips the counted word and its terminator
! from the line reducing its length before
! rescanning for the next word
ENDDO word_scan
IF(nd > 0) wcount=wcount+1
ENDDO file_read
IF(ierr < 0)THEN
WRITE(*,*) "No. of words in file =",wcount
ELSEIF(ierr > 0)THEN
WRITE(*,*) "Error in GET file in word_count, No. ",ierr
ENDIF
ENDPROGRAM word_count
...............................................................................
PROGRAM vocabulary_word_count
!-----------------------------------------------------------------------------!
! Counts the number of "words" contained in a file. The words are assumed to !
! be terminated by any one of: !
! space,comma,period,!,?, or the EoR !
! The file may have records of any length and the file may contain any number !
! of records. !
! The program prompts for the name of the file to be subject to a word count !
! and the result is written to the default output unit !
! Also builds a list of the vocabulary found and the frequency of occurence !
! of each different word. !
!-----------------------------------------------------------------------------!
USE ISO_VARYING_STRING
USE,INTRINSIC :: ISO_FORTRAN_ENV
IMPLICIT NONE
type VARYING_STRING
CHARACTER(LEN=:),ALLOCATABLE :: chars
end type VARYING_STRING
CHARACTER(LEN=:),ALLOCATABLE :: line,word,fname
INTEGER :: ierr,nd,wcount=0
!-----------------------------------------------------------------------------!
! Vocabulary list and frequency count arrays. The size of these arrays will !
! be extended dynamically in steps of 100 as the used vocabulary grows !
!-----------------------------------------------------------------------------!
type(VARYING_STRING),ALLOCATABLE,DIMENSION(:) :: vocab
INTEGER,ALLOCATABLE,DIMENSION(:) :: freq
INTEGER :: list_size=200,list_top=0
INTEGER :: i ! loop index
!-----------------------------------------------------------------------------!
! Initialise the lists and determine the file to be processed !
!-----------------------------------------------------------------------------!
ALLOCATE(vocab(1:list_size),freq(1:list_size))
WRITE(*,ADVANCE='NO',FMT='(A)') " Input name of file?"
CALL GET(STRING=fname) ! read the required filename from the default
! input unit assumed to be the whole of the record read
OPEN(UNIT=1,FILE=CHAR(fname)) ! CHAR(fname) converts to the type
! required by FILE= specifier
file_read: DO ! until EoF reached
CALL GET(1,line,IOSTAT=ierr) ! read next line of file
IF(ierr == IOSTAT_END .OR. ierr > 0)EXIT file_read
word_scan: DO ! until end of line
nd=SCAN(line," ,.!?") ! scan to find end of word
IF(nd == 0)THEN ! EoR is end of word
nd = LEN(line)+1
EXIT word_scan
ENDIF
IF(nd > 1)THEN ! at least one non-terminator character in the word
wcount=wcount+1
word = EXTRACT(line,1,nd-1)
CALL update_vocab_lists
ENDIF
line = REMOVE(line,1,nd) ! strips the counted word and its terminator
! from the line reducing its length before
! rescanning for the next word
ENDDO word_scan
IF(nd > 1)THEN ! at least one character in the word
wcount=wcount+1
word = EXTRACT(line,1,nd-1)
CALL update_vocab_lists
ENDIF
ENDDO file_read
IF(ierr < 0)THEN
WRITE(*,*) "No. of words in file =",wcount
WRITE(*,*) "There are ",list_top," distinct words"
WRITE(*,*) "with the following frequencies of occurence"
print_loop: DO i=1,list_top
WRITE(*,FMT='(1X,I6,2X)',ADVANCE='NO') freq(i)
CALL PUT_LINE(STRING=vocab(i)%chars)
ENDDO print_loop
ELSEIF(ierr > 0)THEN
WRITE(*,*) "Error in GET in vocabulary_word_count, No.",ierr
ENDIF
CONTAINS
SUBROUTINE extend_lists
!-----------------------------------------------------------------------------!
! Accesses the host variables: !
! type(VARYING_STRING),ALLOCATABLE,DIMENSION(:) :: vocab !
! INTEGER,ALLOCATABLE,DIMENSION(:) :: freq !
! INTEGER :: list_size !
! so as to extend the size of the lists preserving the existing vocabulary !
! and frequency information in the new extended lists !
!-----------------------------------------------------------------------------!
type(VARYING_STRING),DIMENSION(list_size) :: vocab_swap
INTEGER,DIMENSION(list_size) :: freq_swap
INTEGER,PARAMETER :: list_increment=100
INTEGER :: new_list_size,alerr
vocab_swap = vocab ! copy old list into temporary space
freq_swap =freq
new_list_size = list_size + list_increment
DEALLOCATE(vocab,freq)
ALLOCATE(vocab(1:new_list_size),freq(1:new_list_size),STAT=alerr)
IF(alerr /= 0)THEN
WRITE(*,*) "Unable to extend vocabulary list"
STOP
ENDIF
vocab(1:list_size) = vocab_swap ! copy old list back into bottom
freq(1:list_size) = freq_swap ! of new extended list
list_size = new_list_size
ENDSUBROUTINE extend_lists
SUBROUTINE update_vocab_lists
!-----------------------------------------------------------------------------!
! Accesses the host variables: !
! type(VARYING_STRING),ALLOCATABLE,DIMENSION(:) :: vocab !
! INTEGER,ALLOCATABLE,DIMENSION(:) :: freq !
! INTEGER :: list_size,list_top !
! type(VARYING_STRING) :: word !
! searches the existing words in vocab to find a match for word !
! if found increments the freq if not found adds word to !
! list_top + 1 vocab list and sets corresponding freq to 1 !
! if list_size exceeded extend the list size before updating !
!-----------------------------------------------------------------------------!
INTEGER :: i ! loop index
list_search: DO i=1,list_top
IF(word == vocab(i)%chars)THEN
freq(i) = freq(i) + 1
RETURN
ENDIF
ENDDO list_search
IF(list_top == list_size)THEN
CALL extend_lists
ENDIF
list_top = list_top + 1
vocab(list_top)%chars = word
freq(list_top) = 1
ENDSUBROUTINE update_vocab_lists
ENDPROGRAM vocabulary_word_count
...............................................................................
More information about the J3
mailing list