(j3.2006) (SC22WG5.4676) Part 2 of the standard
John Reid
John.Reid
Thu Apr 12 03:49:19 EDT 2012
WG5,
I have done a bit more work on the procedures of Part 2 of the standard.
I attach a draft WG5 paper on what I have done. Any comments?
John.
-------------- next part --------------
ISO/IEC JTC1/SC22/WG5 N1913-1
The module for Part 2, varying length character strings
John Reid
11 April 2012
I promised at Garching to look into the conversion to Fortran 2003 of the
procedures of subclauses 3.6 and 3.7 of Part 2 of the Standard. These are
the I/O procedures GET, PUT, and PUT_LINE and the elemental procedures
EXTRACT, INSERT, REMOVE, REPLACE, and INSERT. The other procedures of
Part 2, except for CHAR with a character argument and VAR_STR, are already
in Part 1.
Lawrie Schonfelder wrote a sample module to implement Part 2. This is
available from
ftp://ftp.nag.co.uk/sc22wg5/ISO_VARYING_STRING/
He also wrote two sample programs ilustrating the use of the facilites.
These are included as an annex to Part 2 and are available from the same
ftp site.
I have made a new module and new versions of the two sample programs. These
are appended to this paper. They run successfully under
the Nag compiler. Each occurrence of
type(VARYING_STRING),INTENT(IN)
was replaced by
CHARACTER(LEN=*),INTENT(IN)
Each other occurrence of
type(VARYING_STRING)
are replaced by
CHARACTER(LEN=:),ALLOCATABLE
This resulted in arguments that were previously documented as either of
type CHARACTER or VARYING_STRING becoming only of type CHARACTER. The number
of specific procedures reduced from 40 to 15.
There were obvious mechanical changes needed within the procedures.
The functions EXTRACT, INSERT, REMOVE, and REPLACE cannot be elemental
because each function result is allocatable with deferred character
length. The subroutine SPLIT cannot be elemental because the arguments string,
word, and separator are allocatable with deferred character length. This is
because the elements of a character array have to have the same character length.
A similar problem occurs in the vocabulary sample program, where an array holds
words of varying length. I have used a linked list instead.
Lawrie Schonfelder wrote a test suite that is available from:
http://www.fortran.com/vst_95a.zip
I have adapted codes from here to test the new module, but I am not confident
that these adapted codes are adequate to fully test the revised code. In
particular, they rely on interactive input from the keyboard and I was unsure
of what is needed. If this work is to be taken further, new tests are needed.
Two questions were asked at the meeting in Garching:
1. Can programs that use the varying length strings module be so easily
converted that Part 2 of the Standard should be withdrawn?
2. Should extra intrinsic procedures be added to part 1?
The existing module is needed if the elemental features of the procedures
EXTRACT, INSERT, REMOVE, REPLACE, and INSERT are required. Otherwise, the
module given here (after full testing) should allow simple conversion. Some
detailed changes will be needed. The use of the Part 2 function CHAR with
an argument of type character will need to be removed (perhaps this should be
added to the new module). Similar considerations apply to the Part 2 function
VAR_STR. Changes are needed for the extraction of a single character; for
example, set%chars(j) needs to be replaced by set(j:j). They are also needed
if array features are employed for the character array component. For
example
IF( ALL(string%chars(ipos:ipos+lt-1) == target%chars) )THEN
becomes
IF( string(ipos:ipos+lt-1) == target )THEN
I prefer not to comment on whether these functions should be added to Part 1.
The case needs to be made by those who need them. We might consider providing
an open source and fully tested version of the module here.
...............................................................................
MODULE ISO_VARYING_STRING
! Adapted by John Reid from the module written by J.L.Schonfelder.
! Version produced (11 April 2012)
!-----------------------------------------------------------------------------
! This module provides revised versions of the generic procedures GET, PUT,
! PUT_LINE, EXTRACT, INSERT, REMOVE, REPLACE, and INSERT of
! ISO/IEC 1539-2:2000(E).
!-----------------------------------------------------------------------------
! NB This module has not been fully tested.
PRIVATE
!-----------------------------------------------------------------------------!
! By default all entities declared or defined in this module are private to !
! the module. Only those entities declared explicitly as being public are !
! accessible to programs using the module. In particular, the procedures !
! defined herein are made accessible via their generic identifiers only; !
! their specific names are private. !
!-----------------------------------------------------------------------------!
!----- GENERIC PROCEDURE INTERFACE DEFINITIONS -------------------------------!
!----- Input procedure interfaces --------------------------------------------!
INTERFACE GET
MODULE PROCEDURE get_d_eor, & ! default unit, EoR termination
get_u_eor, & ! specified unit, EoR termination
get_d_tset_c, & ! default unit, char set termination
get_u_tset_c ! specified unit, char set termination
ENDINTERFACE
!----- Output procedure interfaces -------------------------------------------!
INTERFACE PUT
MODULE PROCEDURE put_d_c, & ! char to default unit
put_u_c ! char to specified unit
ENDINTERFACE
INTERFACE PUT_LINE
MODULE PROCEDURE putline_d_c, & ! char to default unit
putline_u_c ! char to specified unit
ENDINTERFACE
!----- Insert procedure interfaces -------------------------------------------!
INTERFACE INSERT
MODULE PROCEDURE insert_cc ! string in string
ENDINTERFACE
!----- Replace procedure interfaces ------------------------------------------!
INTERFACE REPLACE
MODULE PROCEDURE replace_ss, & ! string by string, at specified
replace_ss_sf,& ! string by string, between
replace_sss ! in string replace string by string
ENDINTERFACE
!----- Remove procedure interface --------------------------------------------!
INTERFACE REMOVE
MODULE PROCEDURE remove_c ! characters from char , and finish
ENDINTERFACE
!----- Extract procedure interface -------------------------------------------!
INTERFACE EXTRACT
MODULE PROCEDURE extract_c ! from char extract string, and finish
ENDINTERFACE
!----- Split procedure interface ---------------------------------------------!
INTERFACE SPLIT
MODULE PROCEDURE split_c ! character in set
ENDINTERFACE
!----- specification of publicly accessible entities -------------------------!
PUBLIC :: GET,PUT,PUT_LINE,EXTRACT,INSERT,REMOVE,REPLACE,SPLIT
CONTAINS
!----- Input string procedure -----------------------------------------------!
SUBROUTINE get_d_eor(string,maxlen,iostat)
CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT) :: string
! the variable to be filled with
! characters read from the
! file connected to the default unit
INTEGER,INTENT(IN),OPTIONAL :: maxlen
! if present indicates the maximum
! number of characters that will be
! read from the file
INTEGER,INTENT(OUT),OPTIONAL :: iostat
! if present used to return the status
! of the data transfer
! if absent errors cause termination
! reads string from the default unit starting at next character in the file
! and terminating at the end of record or after maxlen characters.
CHARACTER(LEN=80) :: buffer
INTEGER :: ist,nch,toread,nb
IF(PRESENT(maxlen))THEN
toread=maxlen
ELSE
toread=HUGE(1)
ENDIF
string = "" ! clears return string N.B. will also deallocate string via the
! assignment operation
DO ! repeatedly read buffer and add to string until EoR
! or maxlen reached
IF(toread <= 0)EXIT
nb=MIN(80,toread)
READ(*,FMT='(A)',ADVANCE='NO',EOR=9999,SIZE=nch,IOSTAT=ist) buffer(1:nb)
IF( ist /= 0 )THEN
IF(PRESENT(iostat)) THEN
iostat=ist
RETURN
ELSE
WRITE(*,*) " Error No.",ist, &
" during READ_STRING of varying string on default unit"
STOP
ENDIF
ENDIF
string = string //buffer(1:nb)
toread = toread - nb
ENDDO
IF(PRESENT(iostat)) iostat = 0
RETURN
9999 string = string //buffer(1:nch)
IF(PRESENT(iostat)) iostat = ist
ENDSUBROUTINE get_d_eor
SUBROUTINE get_u_eor(unit,string,maxlen,iostat)
INTEGER,INTENT(IN) :: unit
! identifies the input unit which must be
! connected for sequential formatted read
CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT) :: string
! the variable to be filled with
! characters read from the
! file connected to the unit
INTEGER,INTENT(IN),OPTIONAL :: maxlen
! if present indicates the maximum
! number of characters that will be
! read from the file
INTEGER,INTENT(OUT),OPTIONAL :: iostat
! if present used to return the status
! of the data transfer
! if absent errors cause termination
! reads string from unit starting at next character in the file and
! terminating at the end of record or after maxlen characters.
CHARACTER(LEN=80) :: buffer
INTEGER :: ist,nch,toread,nb
IF(PRESENT(maxlen))THEN
toread=maxlen
ELSE
toread=HUGE(1)
ENDIF
string="" ! clears return string N.B. will also deallocate string via the
! assignment operation
DO ! repeatedly read buffer and add to string until EoR
! or maxlen reached
IF(toread <= 0)EXIT
nb=MIN(80,toread)
READ(unit,FMT='(A)',ADVANCE='NO',EOR=9999,SIZE=nch,IOSTAT=ist) buffer(1:nb)
IF( ist /= 0 )THEN
IF(PRESENT(iostat)) THEN
iostat=ist
RETURN
ELSE
WRITE(*,*) " Error No.",ist, &
" during READ_STRING of varying string on UNIT ",unit
STOP
ENDIF
ENDIF
string = string //buffer(1:nb)
toread = toread - nb
ENDDO
IF(PRESENT(iostat)) iostat = 0
RETURN
9999 string = string //buffer(1:nch)
IF(PRESENT(iostat)) iostat = ist
ENDSUBROUTINE get_u_eor
SUBROUTINE get_d_tset_c(string,set,separator,maxlen,iostat)
CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT) :: string
! the variable to be filled with
! characters read from the
! file connected to the default unit
CHARACTER(LEN=*),INTENT(IN) :: set
! the set of characters which if found in
! the input terminate the read
CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT),OPTIONAL :: separator
! the actual separator character from set
! found as the input string terminator
! returned as zero length if termination
! by maxlen or EOR
INTEGER,INTENT(IN),OPTIONAL :: maxlen
! if present indicates the maximum
! number of characters that will be
! read from the file
INTEGER,INTENT(OUT),OPTIONAL :: iostat
! if present used to return the status
! of the data transfer
! if absent errors cause termination
! reads string from the default unit starting at next character in the file and
! terminating at the end of record, occurance of a character in set,
! or after reading maxlen characters.
CHARACTER :: buffer ! characters must be read one at a time to detect
! first terminator character in set
INTEGER :: ist,toread,lenset
lenset = LEN(set)
IF(PRESENT(maxlen))THEN
toread=maxlen
ELSE
toread=HUGE(1)
ENDIF
string = "" ! clears return string N.B. will also deallocate string via the
! assignment operation
IF(PRESENT(separator)) separator="" ! clear separator
readchar:DO ! repeatedly read buffer and add to string
IF(toread <= 0)EXIT readchar ! maxlen reached
READ(*,FMT='(A)',ADVANCE='NO',EOR=9999,IOSTAT=ist) buffer
IF( ist /= 0 )THEN
IF(PRESENT(iostat)) THEN
iostat=ist
RETURN
ELSE
WRITE(*,*) " Error No.",ist, &
" during GET of varying string on default unit"
STOP
ENDIF
ENDIF
! check for occurance of set character in buffer
DO j = 1,lenset
IF(buffer == set(j:j))THEN
IF(PRESENT(separator)) separator=buffer
EXIT readchar
ENDIF
ENDDO
string = string//buffer
toread = toread - 1
ENDDO readchar
IF(PRESENT(iostat)) iostat = 0
RETURN
9999 CONTINUE ! EOR terminator read
IF(PRESENT(iostat)) iostat = ist
ENDSUBROUTINE get_d_tset_c
SUBROUTINE get_u_tset_c(unit,string,set,separator,maxlen,iostat)
INTEGER,INTENT(IN) :: unit
! identifies the input unit which must be
! connected for sequential formatted read
CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT) :: string
! the variable to be filled with
! characters read from the
! file connected to the unit
CHARACTER(LEN=*),INTENT(IN) :: set
! the set of characters which if found in
! the input terminate the read
CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT),OPTIONAL :: separator
! the actual separator character from set
! found as the input string terminator
! returned as zero length if termination
! by maxlen or EOR
INTEGER,INTENT(IN),OPTIONAL :: maxlen
! if present indicates the maximum
! number of characters that will be
! read from the file
INTEGER,INTENT(OUT),OPTIONAL :: iostat
! if present used to return the status
! of the data transfer
! if absent errors cause termination
! reads string from unit starting at next character in the file and
! terminating at the end of record, occurance of a character in set,
! or after reading maxlen characters.
CHARACTER :: buffer ! characters must be read one at a time to detect
! first terminator character in set
INTEGER :: ist,toread,lenset
lenset = LEN(set)
IF(PRESENT(maxlen))THEN
toread=maxlen
ELSE
toread=HUGE(1)
ENDIF
string = "" ! clears return string N.B. will also deallocate string via the
! assignment operation
IF(PRESENT(separator)) separator="" ! clear separator
readchar:DO ! repeatedly read buffer and add to string
IF(toread <= 0)EXIT readchar ! maxlen reached
READ(unit,FMT='(A)',ADVANCE='NO',EOR=9999,IOSTAT=ist) buffer
IF( ist /= 0 )THEN
IF(PRESENT(iostat)) THEN
iostat=ist
RETURN
ELSE
WRITE(*,*) " Error No.",ist, &
" during GET of varying string on unit ",unit
STOP
ENDIF
ENDIF
! check for occurance of set character in buffer
DO j = 1,lenset
IF(buffer == set(j:j))THEN
IF(PRESENT(separator)) separator=buffer
EXIT readchar ! separator terminator found
ENDIF
ENDDO
string = string//buffer
toread = toread - 1
ENDDO readchar
IF(PRESENT(iostat)) iostat = 0
RETURN
9999 CONTINUE ! EOR terminator read
IF(PRESENT(iostat)) iostat = ist
ENDSUBROUTINE get_u_tset_c
!----- Output string procedures ----------------------------------------------!
SUBROUTINE put_d_c(string,iostat)
CHARACTER(LEN=*),INTENT(IN) :: string
! the character variable to be appended to
! the current record or to the start of
! the next record if there is no
! current record
! uses the default unit
INTEGER,INTENT(OUT),OPTIONAL :: iostat
! if present used to return the status
! of the data transfer
! if absent errors cause termination
INTEGER :: ist
WRITE(*,FMT='(A)',ADVANCE='NO',IOSTAT=ist) string
IF( ist /= 0 )THEN
IF(PRESENT(iostat))THEN
iostat = ist
RETURN
ELSE
WRITE(*,*) " Error No.",ist, &
" during PUT of character on default unit"
STOP
ENDIF
ENDIF
IF(PRESENT(iostat)) iostat=0
ENDSUBROUTINE put_d_c
SUBROUTINE put_u_c(unit,string,iostat)
INTEGER,INTENT(IN) :: unit
! identifies the output unit which must
! be connected for sequential formatted
! write
CHARACTER(LEN=*),INTENT(IN) :: string
! the character variable to be appended to
! the current record or to the start of
! the next record if there is no
! current record
INTEGER,INTENT(OUT),OPTIONAL :: iostat
! if present used to return the status
! of the data transfer
! if absent errors cause termination
INTEGER :: ist
WRITE(unit,FMT='(A)',ADVANCE='NO',IOSTAT=ist) string
IF( ist /= 0 )THEN
IF(PRESENT(iostat))THEN
iostat = ist
RETURN
ELSE
WRITE(*,*) " Error No.",ist," during PUT of character on UNIT ",unit
STOP
ENDIF
ENDIF
IF(PRESENT(iostat)) iostat=0
ENDSUBROUTINE put_u_c
SUBROUTINE putline_d_c(string,iostat)
CHARACTER(LEN=*),INTENT(IN) :: string
! the character variable to be appended to
! the current record or to the start of
! the next record if there is no
! current record
! uses the default unit
INTEGER,INTENT(OUT),OPTIONAL :: iostat
! if present used to return the status
! of the data transfer
! if absent errors cause termination
! appends the string to the current record and then ends the record
! leaves the file positioned after the record just completed which then
! becomes the previous and last record in the file.
INTEGER :: ist
WRITE(*,FMT='(A,/)',ADVANCE='NO',IOSTAT=ist) string
IF(PRESENT(iostat))THEN
iostat = ist
RETURN
ELSEIF( ist /= 0 )THEN
WRITE(*,*) " Error No.",ist, &
" during PUT_LINE of character on default unit"
STOP
ENDIF
ENDSUBROUTINE putline_d_c
SUBROUTINE putline_u_c(unit,string,iostat)
INTEGER,INTENT(IN) :: unit
! identifies the output unit which must
! be connected for sequential formatted
! write
CHARACTER(LEN=*),INTENT(IN) :: string
! the character variable to be appended to
! the current record or to the start of
! the next record if there is no
! current record
INTEGER,INTENT(OUT),OPTIONAL :: iostat
! if present used to return the status
! of the data transfer
! if absent errors cause termination
! appends the string to the current record and then ends the record
! leaves the file positioned after the record just completed which then
! becomes the previous and last record in the file.
INTEGER :: ist
WRITE(unit,FMT='(A,/)',ADVANCE='NO',IOSTAT=ist) string
IF(PRESENT(iostat))THEN
iostat = ist
RETURN
ELSEIF( ist /= 0 )THEN
WRITE(*,*) " Error No.",ist, &
" during WRITE_LINE of character on UNIT",unit
STOP
ENDIF
ENDSUBROUTINE putline_u_c
!----- Insert procedure ----------------------------------------------------!
FUNCTION insert_cc(string,start,substring)
CHARACTER(LEN=:),ALLOCATABLE :: insert_cc
CHARACTER(LEN=*),INTENT(IN) :: string
INTEGER,INTENT(IN) :: start
CHARACTER(LEN=*),INTENT(IN) :: substring
! calculates result string by inserting the substring into string
! beginning at position start pushing the remainder of the string
! to the right and enlarging it accordingly,
! if start is greater than LEN(string) the substring is simply appended ! to string by concatenation. if start is less than 1
! substring is inserted before string, ie. start is treated as if it were 1
INTEGER :: ip,is,lsub,ls
lsub = LEN(substring); ls = LEN(string)
is = MAX(start,1)
ip = MIN(ls+1,is)
insert_cc = string(1:ip-1)//substring//string(ip:ls)
ENDFUNCTION insert_cc
!----- Replace procedures ---------------------------------------------------!
FUNCTION replace_ss(string,start,substring)
CHARACTER(LEN=*),INTENT(IN) :: string
INTEGER,INTENT(IN) :: start
CHARACTER(LEN=*),INTENT(IN) :: substring
CHARACTER(LEN=:),ALLOCATABLE :: replace_ss
! calculates the result string by the following actions:
! inserts the substring into string beginning at position
! start replacing the following LEN(substring) characters of the string
! and enlarging string if necessary. if start is greater than LEN(string)
! substring is simply appended to string by concatenation. If start is less
! than 1, substring replaces characters in string starting at 1
INTEGER :: ip,is,lsub
lsub = LEN(substring)
is = MAX(start,1)
ip = MIN(ls+1,is)
replace_ss = string(1:ip-1)//substring//string(ip+lsub:)
ENDFUNCTION replace_ss
FUNCTION replace_ss_sf(string,start,finish,substring)
CHARACTER(LEN=*),INTENT(IN) :: string
INTEGER,INTENT(IN) :: start,finish
CHARACTER(LEN=*),INTENT(IN) :: substring
CHARACTER(LEN=:),ALLOCATABLE :: replace_ss_sf
! calculates the result string by the following actions:
! inserts the substring into string beginning at position
! start replacing the following finish-start+1 characters of the string
! and enlarging or shrinking the string if necessary.
! If start is greater than LEN(string) substring is simply appended to string
! by concatenation. If start is less than 1, start = 1 is used
! If finish is greater than LEN(string), finish = LEN(string) is used
! If finish is less than start, substring is inserted before start
INTEGER :: ip,is,if,nw,lsub,ls
lsub = LEN(substring); ls = LEN(string)
is = MAX(start,1)
ip = MIN(ls+1,is)
if = MAX(ip-1,MIN(finish,ls))
nw = lsub + ls - if+ip-1
replace_ss_sf = string(1:ip-1)//substring//string(if+1:ls)
ENDFUNCTION replace_ss_sf
FUNCTION replace_sss(string,target,substring,every,back)
CHARACTER(LEN=*),INTENT(IN) :: string,target,substring
LOGICAL,INTENT(IN),OPTIONAL :: every,back
! calculates the result string by the following actions:
! searches for occurences of target in string, and replaces these with
! substring. if back present with value true search is backward otherwise
! search is done forward. if every present with value true all occurences
! of target in string are replaced, otherwise only the first found is
! replaced. if target is not found the result is the same as string.
LOGICAL :: dir_switch, rep_search
CHARACTER(LEN=:),ALLOCATABLE :: work
INTEGER :: ls,lt,lsub,ipos,ipow
CHARACTER(LEN=:),ALLOCATABLE :: replace_sss
ls = LEN(string); lt = LEN(target); lsub = LEN(substring)
IF(lt==0)THEN
IF(ls==0)THEN
replace_sss = substring
ELSE
replace_sss = string
ENDIF
RETURN
ENDIF
work = string
IF( PRESENT(back) )THEN
dir_switch = back
ELSE
dir_switch = .FALSE.
ENDIF
IF( PRESENT(every) )THEN
rep_search = every
ELSE
rep_search = .FALSE.
ENDIF
IF( dir_switch )THEN ! backwards search
ipos = ls-lt+1
DO
IF( ipos < 1 )EXIT ! search past start of string
! test for occurance of target in string at this position
IF( string(ipos:ipos+lt-1) == target )THEN
! match found allocate space for string with this occurance of
! target replaced by substring
! copy work into temp replacing this occurance of target by
! substring
work = work(1:ipos-1)//substring//work(ipos+lt:)
IF(.NOT.rep_search)EXIT ! exit if only first replacement wanted
! move search and replacement positions over the effected positions
ipos = ipos-lt+1
ENDIF
ipos=ipos-1
ENDDO
ELSE ! forward search
ipos = 1; ipow = 1
DO
IF( ipos > ls-lt+1 )EXIT ! search past end of string
! test for occurance of target in string at this position
IF( string(ipos:ipos+lt-1) == target )THEN
! match found allocate space for string with this occurance of
! target replaced by substring
work = work(1:ipow-1)//substring//work(ipow+lt:)
IF(.NOT.rep_search)EXIT ! exit if only first replacement wanted
! move search and replacement positions over the effected positions
ipos = ipos+lt-1; ipow = ipow+lsub-1
ENDIF
ipos=ipos+1; ipow=ipow+1
ENDDO
ENDIF
replace_sss = work
ENDFUNCTION replace_sss
!----- Remove procedure ----------------------------------------------------!
FUNCTION remove_c(string,start,finish)
CHARACTER(LEN=:),ALLOCATABLE :: remove_c
CHARACTER(LEN=*),INTENT(IN) :: string
INTEGER,INTENT(IN),OPTIONAL :: start
INTEGER,INTENT(IN),OPTIONAL :: finish
! returns as result the string produced by the actions
! removes the characters between start and finish from string reducing it in
! size by MAX(0,ABS(finish-start+1))
! if start < 1 or is missing then assumes start=1
! if finish > LEN(string) or is missing then assumes finish=LEN(string)
INTEGER :: is,if,ls
ls = LEN(string)
IF (PRESENT(start)) THEN
is = MAX(1,start)
ELSE
is = 1
ENDIF
IF (PRESENT(finish)) THEN
if = MIN(ls,finish)
ELSE
if = ls
ENDIF
IF( if < is ) THEN ! zero characters to be removed, string is unchanged
remove_c = string
ELSE
remove_c = string(1:is-1)//string(if+1:)
ENDIF
ENDFUNCTION remove_c
!----- Extract procedure ---------------------------------------------------!
FUNCTION extract_c(string,start,finish)
CHARACTER(LEN=*),INTENT(IN) :: string
INTEGER,INTENT(IN),OPTIONAL :: start
INTEGER,INTENT(IN),OPTIONAL :: finish
CHARACTER(LEN=:),ALLOCATABLE :: extract_c
! extracts the characters between start and finish from character string and
! delivers these as the result of the function, string is unchanged
! if start < 1 or is missing then it is treated as 1
! if finish > LEN(string) or is missing then it is treated as LEN(string)
INTEGER :: is,if
IF (PRESENT(start)) THEN
is = MAX(1,start)
ELSE
is = 1
ENDIF
IF (PRESENT(finish)) THEN
if = MIN(LEN(string),finish)
ELSE
if = LEN(string)
ENDIF
extract_c = string(is:if)
ENDFUNCTION extract_c
!----- Split procedure ------------------------------------------------------!
SUBROUTINE split_c(string,word,set,separator,back)
CHARACTER(LEN=:),ALLOCATABLE,INTENT(INOUT) :: string
CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT) :: word
CHARACTER(LEN=*),INTENT(IN) :: set
CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT),OPTIONAL :: separator
LOGICAL,INTENT(IN),OPTIONAL :: back
! splits the input string at the first(last) character in set
! returns the leading(trailing) substring in word and the trailing(leading)
! substring in string. The search is done in the forward or backward
! direction depending on back. If separator is present, the actual separator
! character found is returned in separator.
! If no character in set is found string and separator are returned as
! zero length and the whole input string is returned in word.
LOGICAL :: dir_switch
INTEGER :: ls,tpos,lset
CHARACTER(LEN=:),ALLOCATABLE :: wst ! working copy of string
ls = LEN(string); lset = LEN(set)
wst=string
IF( PRESENT(back) )THEN
dir_switch = back
ELSE
dir_switch = .FALSE.
ENDIF
IF(dir_switch)THEN ! backwards search
BSEARCH:DO tpos = ls,1,-1
DO i=1,lset
IF(wst(tpos:tpos) == set(i:i))EXIT BSEARCH
ENDDO
ENDDO BSEARCH
word = wst(tpos+1:ls)
IF(PRESENT(separator))THEN
IF(tpos==0)THEN
separator = ""
ELSE
separator = wst(tpos:tpos)
ENDIF
ENDIF
string = wst(1:tpos-1)
ELSE ! forwards search
FSEARCH:DO tpos =1,ls
DO i=1,lset
IF(wst(tpos:tpos) == set(i:i))EXIT FSEARCH
ENDDO
ENDDO FSEARCH
word = wst(1:tpos-1)
IF(PRESENT(separator))THEN
IF(tpos==ls+1)THEN
separator = ""
ELSE
separator = wst(tpos:tpos)
ENDIF
ENDIF
string = wst(tpos+1:ls)
ENDIF
ENDSUBROUTINE split_c
ENDMODULE ISO_VARYING_STRING
...............................................................................
PROGRAM word_count
!-----------------------------------------------------------------------------!
! Counts the number of "words" contained in a file. The words are assumed to !
! be terminated by any one of: !
! space,comma,period,!,?, or the EoR !
! The file may have records of any length and the file may contain any number !
! of records. !
! The program prompts for the name of the file to be subject to a word count !
! and the result is written to the default output unit !
!-----------------------------------------------------------------------------!
USE ISO_VARYING_STRING
IMPLICIT NONE
CHARACTER(len=:),ALLOCATABLE :: line,fname
INTEGER :: ierr,nd,wcount=0
fname = "initial string"
WRITE(*,ADVANCE='NO',FMT='(A)') " Input name of file?"
CALL GET(STRING=fname) ! read the required filename from the default
! input unit assumed to be the whole of the record read
OPEN(UNIT=10,FILE=fname)
file_read: DO ! until EoF reached
CALL GET(10,line,IOSTAT=ierr) ! read next line of file
IF(ierr == -1 .OR. ierr > 0 )EXIT file_read
word_scan: DO ! until end of line
nd=SCAN(line," ,.!?") ! scan to find end of word
IF(nd == 0)THEN ! EoR is end of word
nd = LEN(line)
EXIT word_scan
ENDIF
IF(nd > 1) wcount=wcount+1 ! at least one non-terminator character
! in the word
line = REMOVE(line,1,nd) ! strips the counted word and its terminator
! from the line reducing its length before
! rescanning for the next word
ENDDO word_scan
IF(nd > 0) wcount=wcount+1
ENDDO file_read
IF(ierr < 0)THEN
WRITE(*,*) "No. of words in file =",wcount
ELSEIF(ierr > 0)THEN
WRITE(*,*) "Error in GET file in word_count, No. ",ierr
ENDIF
ENDPROGRAM word_count
...............................................................................
PROGRAM vocabulary_word_count
!-----------------------------------------------------------------------------!
! Counts the number of "words" contained in a file. The words are assumed to !
! be terminated by any one of: !
! space,comma,period,!,?, or the EoR !
! The file may have records of any length and the file may contain any number !
! of records. !
! The program prompts for the name of the file to be subject to a word count !
! and the result is written to the default output unit !
! Also builds a list of the vocabulary found and the frequency of occurence !
! of each different word. !
!-----------------------------------------------------------------------------!
USE ISO_VARYING_STRING
IMPLICIT NONE
type node
character(len=:),allocatable :: word
integer :: freq = 0
type(node), pointer :: next
end type
type(node), pointer :: top=>null(), current
CHARACTER(LEN=:),ALLOCATABLE:: line,word,fname
INTEGER :: ierr,nd,wcount=0, dcount=0
!-----------------------------------------------------------------------------!
! Vocabulary list and frequency count arrays. The size of these arrays will !
! be extended dynamically in steps of 100 as the used vocabulary grows !
!-----------------------------------------------------------------------------!
CHARACTER(LEN=:),ALLOCATABLE,DIMENSION(:) :: vocab
INTEGER :: list_size=200,list_top=0
INTEGER :: i ! loop index
!-----------------------------------------------------------------------------!
! Initialise the lists and determine the file to be processed !
!-----------------------------------------------------------------------------!
! WRITE(*,ADVANCE='NO',FMT='(A)') " Input name of file?"
! CALL GET(STRING=fname) ! read the required filename from the default
! input unit assumed to be the whole of the record read
fname = '../readme.txt'
OPEN(UNIT=1,FILE=fname)
file_read: DO ! until EoF reached
CALL GET(1,line,IOSTAT=ierr) ! read next line of file
write(*,*)'line=',line
IF(ierr == -1 .OR. ierr > 0)EXIT file_read
word_scan: DO ! until end of line
nd=SCAN(line," ,.!?") ! scan to find end of word
IF(nd == 0)THEN ! EoR is end of word
nd = LEN(line)+1
EXIT word_scan
ENDIF
IF(nd > 1)THEN ! at least one non-terminator character in the word
wcount=wcount+1
word = EXTRACT(line,1,nd-1)
write(*,*)word
CALL update_vocab_lists
ENDIF
line = REMOVE(line,1,nd) ! strips the counted word and its terminator
! from the line reducing its length before
! rescanning for the next word
ENDDO word_scan
IF(nd > 1)THEN ! at least one character in the word
wcount=wcount+1
word = EXTRACT(line,1,nd-1)
CALL update_vocab_lists
ENDIF
ENDDO file_read
IF(ierr < 0)THEN
WRITE(*,*) "No. of words in file =",wcount
WRITE(*,*) "There are ",dcount," distinct words"
WRITE(*,*) "with the following frequencies of occurence"
current => top
print_loop: DO
if(.not.associated(current))exit
WRITE(*,FMT='(1X,I6,2X)',ADVANCE='NO') current%freq
CALL PUT_LINE(STRING=current%word)
current => current%next
ENDDO print_loop
ELSEIF(ierr > 0)THEN
WRITE(*,*) "Error in GET in vocabulary_word_count, No.",ierr
ENDIF
CONTAINS
SUBROUTINE update_vocab_lists
!-----------------------------------------------------------------------------!
! Accesses the host variables: !
! type(VARYING_STRING),ALLOCATABLE,DIMENSION(:) :: vocab !
! INTEGER,ALLOCATABLE,DIMENSION(:) :: freq !
! INTEGER :: list_size,list_top !
! CHARACTER(LEN=:),ALLOCATABLE :: word !
! searches the existing words in vocab to find a match for word !
! if found increments the freq if not found adds word to !
! list_top + 1 vocab list and sets corresponding freq to 1 !
! if list_size exceeded extend the list size before updating !
!-----------------------------------------------------------------------------!
current => top
list_search: DO
if(.not.associated(current))exit
IF(word == current%word)THEN
current%freq = current%freq + 1
RETURN
ENDIF
current => current%next
ENDDO list_search
dcount = dcount+1
allocate(current)
current%next => top
current%word = word
current%freq = 1
top => current
ENDSUBROUTINE update_vocab_lists
ENDPROGRAM vocabulary_word_count
More information about the J3
mailing list