(j3.2006) (SC22WG5.4676) Part 2 of the standard

John Reid John.Reid
Thu Apr 12 03:49:19 EDT 2012


WG5,

I have done a bit more work on the procedures of Part 2 of the standard.

I attach a draft WG5 paper on what I have done. Any comments?

John.

-------------- next part --------------
                                         ISO/IEC JTC1/SC22/WG5 N1913-1

        The module for Part 2, varying length character strings
        
                           John Reid

                          11 April 2012


I promised at Garching to look into the conversion to Fortran 2003 of the 
procedures of subclauses 3.6 and 3.7 of Part 2 of the Standard. These are 
the I/O procedures GET, PUT, and PUT_LINE and the elemental procedures 
EXTRACT, INSERT, REMOVE, REPLACE, and INSERT. The other procedures of
Part 2, except for CHAR with a character argument and VAR_STR, are already 
in Part 1. 

Lawrie Schonfelder wrote a sample module to implement Part 2. This is 
available from
     ftp://ftp.nag.co.uk/sc22wg5/ISO_VARYING_STRING/
He also wrote two sample programs ilustrating the use of the facilites.
These are included as an annex to Part 2 and are available from the same
ftp site. 

I have made a new module and new versions of the two sample programs. These 
are appended to this paper. They run successfully under 
the Nag compiler. Each occurrence of
   type(VARYING_STRING),INTENT(IN)
was replaced by
   CHARACTER(LEN=*),INTENT(IN)
Each other occurrence of
   type(VARYING_STRING)
are replaced by
   CHARACTER(LEN=:),ALLOCATABLE
This resulted in arguments that were previously documented as either of
type CHARACTER or VARYING_STRING becoming only of type CHARACTER. The number 
of specific procedures reduced from 40 to 15.

There were obvious mechanical changes needed within the procedures.

The functions EXTRACT, INSERT, REMOVE, and REPLACE cannot be elemental 
because each function result is allocatable with deferred character
length. The subroutine SPLIT cannot be elemental because the arguments string,
word, and separator are allocatable with deferred character length. This is 
because the elements of a character array have to have the same character length. 

A similar problem occurs in the vocabulary sample program, where an array holds 
words of varying length. I have used a linked list instead.

Lawrie Schonfelder wrote a test suite that is available from:
   http://www.fortran.com/vst_95a.zip
I have adapted codes from here to test the new module, but I am not confident
that these adapted codes are adequate to fully test the revised code. In
particular, they rely on interactive input from the keyboard and I was unsure
of what is needed. If this work is to be taken further, new tests are needed. 

Two questions were asked at the meeting in Garching:

1. Can programs that use the varying length strings module be so easily 
converted that Part 2 of the Standard should be withdrawn?
   
2. Should extra intrinsic procedures be added to part 1?

The existing module is needed if the elemental features of the procedures 
EXTRACT, INSERT, REMOVE, REPLACE, and INSERT are required. Otherwise, the
module given here (after full testing) should allow simple conversion. Some
detailed changes will be needed. The use of the Part 2 function CHAR with 
an argument of type character will need to be removed (perhaps this should be
added to the new module). Similar considerations apply to the Part 2 function 
VAR_STR. Changes are needed for the extraction of a single character; for
example, set%chars(j) needs to be replaced by set(j:j). They are also needed
if array features are employed for the character array component. For
example
      IF( ALL(string%chars(ipos:ipos+lt-1) == target%chars) )THEN
becomes
      IF( string(ipos:ipos+lt-1) == target )THEN

I prefer not to comment on whether these functions should be added to Part 1. 
The case needs to be made by those who need them. We might consider providing
an open source and fully tested version of the module here. 

...............................................................................

MODULE ISO_VARYING_STRING
  
! Adapted by John Reid from the module written by J.L.Schonfelder.  

! Version produced (11 April 2012)

!-----------------------------------------------------------------------------
! This module provides revised versions of the generic procedures GET, PUT,  
! PUT_LINE, EXTRACT, INSERT, REMOVE, REPLACE, and INSERT of                  
! ISO/IEC 1539-2:2000(E).
!----------------------------------------------------------------------------- 

! NB This module has not been fully tested.   
  
PRIVATE 
  
!-----------------------------------------------------------------------------! 
! By default all entities declared or defined in this module are private to   !
! the module. Only those entities declared explicitly as being public are     ! 
! accessible to programs using the module. In particular, the procedures      ! 
! defined herein are made accessible via their generic identifiers only;      ! 
! their specific names are private.                                     ! 
!-----------------------------------------------------------------------------! 
  
 
!----- GENERIC PROCEDURE INTERFACE DEFINITIONS -------------------------------! 
 
!----- Input procedure interfaces --------------------------------------------!
INTERFACE GET
  MODULE PROCEDURE get_d_eor, &    ! default unit, EoR termination
                   get_u_eor, &    ! specified unit, EoR termination
                   get_d_tset_c, & ! default unit, char set termination
                   get_u_tset_c    ! specified unit, char set termination
ENDINTERFACE   
!----- Output procedure interfaces -------------------------------------------!
INTERFACE PUT
  MODULE PROCEDURE put_d_c, & ! char to default unit
                   put_u_c    ! char to specified unit
ENDINTERFACE 
  
INTERFACE PUT_LINE
  MODULE PROCEDURE putline_d_c, & ! char to default unit
                   putline_u_c    ! char to specified unit
ENDINTERFACE 
  
!----- Insert procedure interfaces -------------------------------------------!
INTERFACE INSERT 
  MODULE PROCEDURE insert_cc ! string in string
ENDINTERFACE 

!----- Replace procedure interfaces ------------------------------------------!
INTERFACE REPLACE 
  MODULE PROCEDURE replace_ss, &   ! string by string, at specified
                   replace_ss_sf,& ! string by string, between
                   replace_sss     ! in string replace string by string
ENDINTERFACE 

!----- Remove procedure interface --------------------------------------------! 
INTERFACE REMOVE 
  MODULE PROCEDURE remove_c    ! characters from char  , and finish
ENDINTERFACE 
  
!----- Extract procedure interface -------------------------------------------!
INTERFACE EXTRACT 
  MODULE PROCEDURE extract_c    ! from char   extract string, and finish
ENDINTERFACE 
  
!----- Split procedure interface ---------------------------------------------!
INTERFACE SPLIT
  MODULE PROCEDURE split_c    !   character in set
ENDINTERFACE

!----- specification of publicly accessible entities -------------------------! 
PUBLIC :: GET,PUT,PUT_LINE,EXTRACT,INSERT,REMOVE,REPLACE,SPLIT
          
CONTAINS 

!----- Input string procedure -----------------------------------------------!
SUBROUTINE get_d_eor(string,maxlen,iostat)
 CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT) :: string
                                  ! the variable to be filled with
                                  ! characters read from the
                                  ! file connected to the default unit
 INTEGER,INTENT(IN),OPTIONAL      :: maxlen
                                  ! if present indicates the maximum
                                  ! number of characters that will be
                                  ! read from the file
 INTEGER,INTENT(OUT),OPTIONAL     :: iostat
                                  ! if present used to return the status
                                  ! of the data transfer
                                  ! if absent errors cause termination
! reads string from the default unit starting at next character in the file
! and terminating at the end of record or after maxlen characters.
 CHARACTER(LEN=80) :: buffer
 INTEGER           :: ist,nch,toread,nb
 IF(PRESENT(maxlen))THEN
   toread=maxlen
 ELSE
   toread=HUGE(1)
 ENDIF
 string = "" ! clears return string N.B. will also deallocate string via the
             ! assignment operation
 DO  ! repeatedly read buffer and add to string until EoR
     ! or maxlen reached
   IF(toread <= 0)EXIT
   nb=MIN(80,toread)
   READ(*,FMT='(A)',ADVANCE='NO',EOR=9999,SIZE=nch,IOSTAT=ist) buffer(1:nb)
   IF( ist /= 0 )THEN 
     IF(PRESENT(iostat)) THEN 
       iostat=ist 
       RETURN 
     ELSE 
       WRITE(*,*) " Error No.",ist, &
                  " during READ_STRING of varying string on default unit"
       STOP 
     ENDIF 
   ENDIF 
   string = string //buffer(1:nb)
   toread = toread - nb
 ENDDO
 IF(PRESENT(iostat)) iostat = 0
 RETURN
 9999 string = string //buffer(1:nch) 
 IF(PRESENT(iostat)) iostat = ist
ENDSUBROUTINE get_d_eor
  
SUBROUTINE get_u_eor(unit,string,maxlen,iostat)
 INTEGER,INTENT(IN)               :: unit
                                  ! identifies the input unit which must be
                                  ! connected for sequential formatted read
 CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT) :: string
                                  ! the variable to be filled with
                                  ! characters read from the
                                  ! file connected to the unit
 INTEGER,INTENT(IN),OPTIONAL      :: maxlen
                                  ! if present indicates the maximum
                                  ! number of characters that will be
                                  ! read from the file
 INTEGER,INTENT(OUT),OPTIONAL     :: iostat
                                  ! if present used to return the status
                                  ! of the data transfer
                                  ! if absent errors cause termination
! reads string from unit starting at next character in the file and 
! terminating at the end of record or after maxlen characters.
 CHARACTER(LEN=80) :: buffer
 INTEGER           :: ist,nch,toread,nb
 IF(PRESENT(maxlen))THEN
   toread=maxlen
 ELSE
   toread=HUGE(1)
 ENDIF
 string=""  ! clears return string N.B. will also deallocate string via the
             ! assignment operation
 DO  ! repeatedly read buffer and add to string until EoR
     ! or maxlen reached
   IF(toread <= 0)EXIT
   nb=MIN(80,toread)
   READ(unit,FMT='(A)',ADVANCE='NO',EOR=9999,SIZE=nch,IOSTAT=ist) buffer(1:nb)
   IF( ist /= 0 )THEN 
     IF(PRESENT(iostat)) THEN 
       iostat=ist 
       RETURN 
     ELSE 
       WRITE(*,*) " Error No.",ist, &
                  " during READ_STRING of varying string on UNIT ",unit
       STOP 
     ENDIF 
   ENDIF 
   string = string //buffer(1:nb)
   toread = toread - nb
 ENDDO
 IF(PRESENT(iostat)) iostat = 0
 RETURN
 9999 string = string //buffer(1:nch) 
 IF(PRESENT(iostat)) iostat = ist
ENDSUBROUTINE get_u_eor

SUBROUTINE get_d_tset_c(string,set,separator,maxlen,iostat)
 CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT) :: string
                                  ! the variable to be filled with
                                  ! characters read from the
                                  ! file connected to the default unit
 CHARACTER(LEN=*),INTENT(IN)      :: set
                                  ! the set of characters which if found in
                                  ! the input terminate the read
 CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT),OPTIONAL :: separator
                                  ! the actual separator character from set
                                  ! found as the input string terminator
                                  ! returned as zero length if termination
                                  ! by maxlen or EOR
 INTEGER,INTENT(IN),OPTIONAL      :: maxlen
                                  ! if present indicates the maximum
                                  ! number of characters that will be
                                  ! read from the file
 INTEGER,INTENT(OUT),OPTIONAL     :: iostat
                                  ! if present used to return the status
                                  ! of the data transfer
                                  ! if absent errors cause termination
! reads string from the default unit starting at next character in the file and
! terminating at the end of record, occurance of a character in set,
! or after reading maxlen characters.
 CHARACTER :: buffer  ! characters must be read one at a time to detect
                      ! first terminator character in set
 INTEGER   :: ist,toread,lenset
 lenset = LEN(set)
 IF(PRESENT(maxlen))THEN
   toread=maxlen
 ELSE
   toread=HUGE(1)
 ENDIF
 string = ""  ! clears return string N.B. will also deallocate string via the
             ! assignment operation
 IF(PRESENT(separator)) separator=""  ! clear separator
 readchar:DO  ! repeatedly read buffer and add to string
   IF(toread <= 0)EXIT readchar ! maxlen reached
   READ(*,FMT='(A)',ADVANCE='NO',EOR=9999,IOSTAT=ist) buffer
   IF( ist /= 0 )THEN 
     IF(PRESENT(iostat)) THEN 
       iostat=ist 
       RETURN 
     ELSE 
       WRITE(*,*) " Error No.",ist, &
                  " during GET of varying string on default unit"
       STOP 
     ENDIF 
   ENDIF 
   ! check for occurance of set character in buffer
     DO j = 1,lenset
       IF(buffer == set(j:j))THEN
         IF(PRESENT(separator)) separator=buffer
         EXIT readchar
       ENDIF
     ENDDO
   string = string//buffer
   toread = toread - 1
 ENDDO readchar
 IF(PRESENT(iostat)) iostat = 0
 RETURN
 9999 CONTINUE ! EOR terminator read
 IF(PRESENT(iostat)) iostat = ist
ENDSUBROUTINE get_d_tset_c

SUBROUTINE get_u_tset_c(unit,string,set,separator,maxlen,iostat)
 INTEGER,INTENT(IN)               :: unit
                                  ! identifies the input unit which must be
                                  ! connected for sequential formatted read
 CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT) :: string
                                  ! the variable to be filled with
                                  ! characters read from the
                                  ! file connected to the unit
 CHARACTER(LEN=*),INTENT(IN)      :: set
                                  ! the set of characters which if found in
                                  ! the input terminate the read
 CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT),OPTIONAL :: separator
                                  ! the actual separator character from set
                                  ! found as the input string terminator
                                  ! returned as zero length if termination
                                  ! by maxlen or EOR
 INTEGER,INTENT(IN),OPTIONAL      :: maxlen
                                  ! if present indicates the maximum
                                  ! number of characters that will be
                                  ! read from the file
 INTEGER,INTENT(OUT),OPTIONAL     :: iostat
                                  ! if present used to return the status
                                  ! of the data transfer
                                  ! if absent errors cause termination
! reads string from unit starting at next character in the file and
! terminating at the end of record, occurance of a character in set,
! or after reading maxlen characters.
 CHARACTER :: buffer  ! characters must be read one at a time to detect
                      ! first terminator character in set
 INTEGER   :: ist,toread,lenset
 lenset = LEN(set)
 IF(PRESENT(maxlen))THEN
   toread=maxlen
 ELSE
   toread=HUGE(1)
 ENDIF
 string = ""  ! clears return string N.B. will also deallocate string via the
             ! assignment operation
 IF(PRESENT(separator)) separator="" ! clear separator
 readchar:DO  ! repeatedly read buffer and add to string
   IF(toread <= 0)EXIT readchar ! maxlen reached
   READ(unit,FMT='(A)',ADVANCE='NO',EOR=9999,IOSTAT=ist) buffer
   IF( ist /= 0 )THEN 
     IF(PRESENT(iostat)) THEN 
       iostat=ist 
       RETURN 
     ELSE 
       WRITE(*,*) " Error No.",ist, &
                  " during GET of varying string on unit ",unit
       STOP 
     ENDIF 
   ENDIF 
   ! check for occurance of set character in buffer
     DO j = 1,lenset
       IF(buffer == set(j:j))THEN
         IF(PRESENT(separator)) separator=buffer
         EXIT readchar ! separator terminator found
       ENDIF
     ENDDO
   string = string//buffer
   toread = toread - 1
 ENDDO readchar
 IF(PRESENT(iostat)) iostat = 0
 RETURN
 9999 CONTINUE ! EOR terminator read
 IF(PRESENT(iostat)) iostat = ist
ENDSUBROUTINE get_u_tset_c

!----- Output string procedures ----------------------------------------------!
  
SUBROUTINE put_d_c(string,iostat)
 CHARACTER(LEN=*),INTENT(IN)     :: string
                                  ! the character variable to be appended to
                                  ! the current record or to the start of
                                  ! the next record if there is no
                                  ! current record
                                  ! uses the default unit
 INTEGER,INTENT(OUT),OPTIONAL    :: iostat
                                  ! if present used to return the status
                                  ! of the data transfer
                                  ! if absent errors cause termination
 INTEGER :: ist
 WRITE(*,FMT='(A)',ADVANCE='NO',IOSTAT=ist) string
 IF( ist /= 0 )THEN
  IF(PRESENT(iostat))THEN
   iostat = ist
   RETURN
  ELSE
   WRITE(*,*) " Error No.",ist, &
              " during PUT of character on default unit"
   STOP
  ENDIF
 ENDIF
 IF(PRESENT(iostat)) iostat=0
ENDSUBROUTINE put_d_c

SUBROUTINE put_u_c(unit,string,iostat)
 INTEGER,INTENT(IN)              :: unit
                                  ! identifies the output unit which must
                                  ! be connected for sequential formatted
                                  ! write
 CHARACTER(LEN=*),INTENT(IN)     :: string
                                  ! the character variable to be appended to
                                  ! the current record or to the start of
                                  ! the next record if there is no
                                  ! current record
 INTEGER,INTENT(OUT),OPTIONAL    :: iostat
                                  ! if present used to return the status
                                  ! of the data transfer
                                  ! if absent errors cause termination
 INTEGER :: ist
 WRITE(unit,FMT='(A)',ADVANCE='NO',IOSTAT=ist) string
 IF( ist /= 0 )THEN
  IF(PRESENT(iostat))THEN
   iostat = ist
   RETURN
  ELSE
   WRITE(*,*) " Error No.",ist," during PUT of character on UNIT ",unit
   STOP
  ENDIF
 ENDIF
 IF(PRESENT(iostat)) iostat=0
ENDSUBROUTINE put_u_c


SUBROUTINE putline_d_c(string,iostat)
 CHARACTER(LEN=*),INTENT(IN)     :: string
                                  ! the character variable to be appended to
                                  ! the current record or to the start of
                                  ! the next record if there is no
                                  ! current record
                                  ! uses the default unit
 INTEGER,INTENT(OUT),OPTIONAL    :: iostat
                                  ! if present used to return the status
                                  ! of the data transfer
                                  ! if absent errors cause termination
! appends the string to the current record and then ends the record
! leaves the file positioned after the record just completed which then
! becomes the previous and last record in the file.
 INTEGER :: ist
 WRITE(*,FMT='(A,/)',ADVANCE='NO',IOSTAT=ist) string
 IF(PRESENT(iostat))THEN
  iostat = ist
  RETURN
 ELSEIF( ist /= 0 )THEN
  WRITE(*,*) " Error No.",ist, &
              " during PUT_LINE of character on default unit"
  STOP
 ENDIF
ENDSUBROUTINE putline_d_c
  
SUBROUTINE putline_u_c(unit,string,iostat)
 INTEGER,INTENT(IN)              :: unit
                                  ! identifies the output unit which must
                                  ! be connected for sequential formatted
                                  ! write
 CHARACTER(LEN=*),INTENT(IN)     :: string
                                  ! the character variable to be appended to
                                  ! the current record or to the start of
                                  ! the next record if there is no
                                  ! current record
 INTEGER,INTENT(OUT),OPTIONAL    :: iostat
                                  ! if present used to return the status
                                  ! of the data transfer
                                  ! if absent errors cause termination
! appends the string to the current record and then ends the record
! leaves the file positioned after the record just completed which then
! becomes the previous and last record in the file.
 INTEGER :: ist
 WRITE(unit,FMT='(A,/)',ADVANCE='NO',IOSTAT=ist) string
 IF(PRESENT(iostat))THEN
  iostat = ist
  RETURN
 ELSEIF( ist /= 0 )THEN
  WRITE(*,*) " Error No.",ist, &
              " during WRITE_LINE of character on UNIT",unit
  STOP
 ENDIF
ENDSUBROUTINE putline_u_c
  
!----- Insert procedure ----------------------------------------------------!
 FUNCTION insert_cc(string,start,substring)
  CHARACTER(LEN=:),ALLOCATABLE        :: insert_cc
  CHARACTER(LEN=*),INTENT(IN) :: string
  INTEGER,INTENT(IN)          :: start
  CHARACTER(LEN=*),INTENT(IN) :: substring
  ! calculates result string by inserting the substring into string
  ! beginning at position start pushing the remainder of the string
  ! to the right and enlarging it accordingly,
  ! if start is greater than LEN(string) the substring is simply appended  ! to string by concatenation. if start is less than 1
  ! substring is inserted before string, ie. start is treated as if it were 1 
  INTEGER                        :: ip,is,lsub,ls 
  lsub = LEN(substring); ls = LEN(string)
  is = MAX(start,1) 
  ip = MIN(ls+1,is) 
  insert_cc =  string(1:ip-1)//substring//string(ip:ls)
 ENDFUNCTION insert_cc
  

!----- Replace procedures ---------------------------------------------------!
 FUNCTION replace_ss(string,start,substring)
 CHARACTER(LEN=*),INTENT(IN) :: string
 INTEGER,INTENT(IN)          :: start
 CHARACTER(LEN=*),INTENT(IN) :: substring
 CHARACTER(LEN=:),ALLOCATABLE  :: replace_ss
  !  calculates the result string by the following actions:
 !  inserts the substring into string beginning at position 
 !  start replacing the following LEN(substring) characters of the string 
 !  and enlarging string if necessary. if start is greater than LEN(string) 
 !  substring is simply appended to string by concatenation. If start is less 
 !  than 1, substring replaces characters in string starting at 1
 INTEGER                        :: ip,is,lsub
 lsub = LEN(substring)
 is = MAX(start,1)
 ip = MIN(ls+1,is)
 replace_ss = string(1:ip-1)//substring//string(ip+lsub:)
ENDFUNCTION replace_ss
  
 FUNCTION replace_ss_sf(string,start,finish,substring)
 CHARACTER(LEN=*),INTENT(IN)  :: string
 INTEGER,INTENT(IN)           :: start,finish
 CHARACTER(LEN=*),INTENT(IN)  :: substring
 CHARACTER(LEN=:),ALLOCATABLE   :: replace_ss_sf
 !  calculates the result string by the following actions:
 !  inserts the substring into string beginning at position 
 !  start replacing the following finish-start+1 characters of the string
 !  and enlarging or shrinking the string if necessary.
 !  If start is greater than LEN(string) substring is simply appended to string
 !  by concatenation. If start is less than 1, start = 1 is used
 !  If finish is greater than LEN(string), finish = LEN(string) is used
 !  If finish is less than start, substring is inserted before start
 INTEGER                        :: ip,is,if,nw,lsub,ls
 lsub = LEN(substring); ls = LEN(string)
 is = MAX(start,1)
 ip = MIN(ls+1,is)
 if = MAX(ip-1,MIN(finish,ls))
 nw = lsub + ls - if+ip-1
 replace_ss_sf = string(1:ip-1)//substring//string(if+1:ls)
ENDFUNCTION replace_ss_sf

FUNCTION replace_sss(string,target,substring,every,back)
 CHARACTER(LEN=*),INTENT(IN) :: string,target,substring
 LOGICAL,INTENT(IN),OPTIONAL     :: every,back
 !  calculates the result string by the following actions:
 !  searches for occurences of target in string, and replaces these with
 !  substring. if back present with value true search is backward otherwise
 !  search is done forward. if every present with value true all occurences
 !  of target in string are replaced, otherwise only the first found is
 !  replaced. if target is not found the result is the same as string.
 LOGICAL                      :: dir_switch, rep_search
 CHARACTER(LEN=:),ALLOCATABLE :: work
 INTEGER                      :: ls,lt,lsub,ipos,ipow
 CHARACTER(LEN=:),ALLOCATABLE  :: replace_sss
 ls = LEN(string); lt = LEN(target); lsub = LEN(substring)
 IF(lt==0)THEN
   IF(ls==0)THEN
     replace_sss = substring
   ELSE
     replace_sss = string
   ENDIF
   RETURN
 ENDIF
 work = string
 IF( PRESENT(back) )THEN
   dir_switch = back
 ELSE
   dir_switch = .FALSE.
 ENDIF
 IF( PRESENT(every) )THEN
   rep_search = every
 ELSE
   rep_search = .FALSE.
 ENDIF
 IF( dir_switch )THEN ! backwards search
   ipos = ls-lt+1
   DO
     IF( ipos < 1 )EXIT ! search past start of string
     ! test for occurance of target in string at this position
     IF( string(ipos:ipos+lt-1) == target )THEN
       ! match found allocate space for string with this occurance of
       ! target replaced by substring
       ! copy work into temp replacing this occurance of target by
       ! substring
       work = work(1:ipos-1)//substring//work(ipos+lt:)
       IF(.NOT.rep_search)EXIT ! exit if only first replacement wanted
       ! move search and replacement positions over the effected positions
       ipos = ipos-lt+1
     ENDIF
     ipos=ipos-1
   ENDDO
 ELSE ! forward search
   ipos = 1; ipow = 1
   DO
     IF( ipos > ls-lt+1 )EXIT ! search past end of string
     ! test for occurance of target in string at this position
     IF( string(ipos:ipos+lt-1) == target )THEN
       ! match found allocate space for string with this occurance of
       ! target replaced by substring
       work = work(1:ipow-1)//substring//work(ipow+lt:)
       IF(.NOT.rep_search)EXIT ! exit if only first replacement wanted
       ! move search and replacement positions over the effected positions
       ipos = ipos+lt-1; ipow = ipow+lsub-1
     ENDIF
     ipos=ipos+1; ipow=ipow+1
   ENDDO
 ENDIF
 replace_sss = work
ENDFUNCTION replace_sss


!----- Remove procedure ----------------------------------------------------!
  
 FUNCTION remove_c(string,start,finish)
 CHARACTER(LEN=:),ALLOCATABLE        :: remove_c
 CHARACTER(LEN=*),INTENT(IN) :: string
 INTEGER,INTENT(IN),OPTIONAL :: start
 INTEGER,INTENT(IN),OPTIONAL :: finish
 !  returns as result the string produced by the actions
 !  removes the characters between start and finish from string reducing it in 
 !  size by MAX(0,ABS(finish-start+1)) 
 !  if start < 1 or is missing then assumes start=1 
 !  if finish > LEN(string) or is missing then assumes finish=LEN(string) 
 INTEGER                        :: is,if,ls
 
 ls = LEN(string)
 IF (PRESENT(start)) THEN
   is = MAX(1,start)
 ELSE
   is = 1
 ENDIF
 IF (PRESENT(finish)) THEN
   if = MIN(ls,finish)
 ELSE
   if = ls
 ENDIF
 IF( if < is ) THEN  ! zero characters to be removed, string is unchanged
   remove_c = string
 ELSE
   remove_c = string(1:is-1)//string(if+1:)
 ENDIF
ENDFUNCTION remove_c
  
!----- Extract procedure ---------------------------------------------------!
  
FUNCTION extract_c(string,start,finish)
  CHARACTER(LEN=*),INTENT(IN) :: string 
  INTEGER,INTENT(IN),OPTIONAL :: start   
  INTEGER,INTENT(IN),OPTIONAL :: finish  
  CHARACTER(LEN=:),ALLOCATABLE        :: extract_c 
  ! extracts the characters between start and finish from character string and 
  ! delivers these as the result of the function, string is unchanged 
  ! if start < 1 or is missing then it is treated as 1 
  ! if finish > LEN(string) or is missing then it is treated as LEN(string) 
  INTEGER                      :: is,if 
  IF (PRESENT(start)) THEN    
     is = MAX(1,start) 
  ELSE 
     is = 1 
  ENDIF 
  IF (PRESENT(finish)) THEN  
     if = MIN(LEN(string),finish) 
  ELSE 
     if = LEN(string) 
  ENDIF 
  extract_c = string(is:if) 
 ENDFUNCTION extract_c 

!----- Split procedure ------------------------------------------------------!

SUBROUTINE split_c(string,word,set,separator,back)
  CHARACTER(LEN=:),ALLOCATABLE,INTENT(INOUT)        :: string
  CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT)          :: word
  CHARACTER(LEN=*),INTENT(IN)               :: set
  CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT),OPTIONAL :: separator
  LOGICAL,INTENT(IN),OPTIONAL               :: back
  ! splits the input string at the first(last) character in set
  ! returns the leading(trailing) substring in word and the trailing(leading)
  ! substring in string. The search is done in the forward or backward
  ! direction depending on back. If separator is present, the actual separator
  ! character found is returned in separator.
  ! If no character in set is found string and separator are returned as
  ! zero length and the whole input string is returned in word.
  LOGICAL                    :: dir_switch 
  INTEGER                    :: ls,tpos,lset
  CHARACTER(LEN=:),ALLOCATABLE :: wst ! working copy of string
  ls = LEN(string); lset = LEN(set)
  wst=string
  IF( PRESENT(back) )THEN 
    dir_switch = back 
  ELSE 
    dir_switch = .FALSE. 
  ENDIF 
  IF(dir_switch)THEN ! backwards search 
    BSEARCH:DO tpos = ls,1,-1
       DO i=1,lset
         IF(wst(tpos:tpos) == set(i:i))EXIT BSEARCH
       ENDDO
    ENDDO BSEARCH
    word = wst(tpos+1:ls)
    IF(PRESENT(separator))THEN
      IF(tpos==0)THEN
        separator = ""
      ELSE
        separator = wst(tpos:tpos)
      ENDIF
    ENDIF
    string = wst(1:tpos-1)
  ELSE ! forwards search
    FSEARCH:DO tpos =1,ls
       DO i=1,lset
         IF(wst(tpos:tpos) == set(i:i))EXIT FSEARCH
       ENDDO
    ENDDO FSEARCH
    word = wst(1:tpos-1)
    IF(PRESENT(separator))THEN
      IF(tpos==ls+1)THEN
        separator = ""
      ELSE
        separator = wst(tpos:tpos)
      ENDIF
    ENDIF
    string = wst(tpos+1:ls)
  ENDIF
 ENDSUBROUTINE split_c
  
ENDMODULE ISO_VARYING_STRING

............................................................................... 

PROGRAM word_count 
!-----------------------------------------------------------------------------!
! Counts the number of "words" contained in a file. The words are assumed to  ! 
! be terminated by any one of:                                                ! 
! space,comma,period,!,?, or the EoR                                          ! 
! The file may have records of any length and the file may contain any number ! 
! of records.                                                                 ! 
! The program prompts for the name of the file to be subject to a word count  ! 
! and the result is written to the default output unit                        ! 
!-----------------------------------------------------------------------------! 
USE ISO_VARYING_STRING 
IMPLICIT NONE
CHARACTER(len=:),ALLOCATABLE :: line,fname
INTEGER              :: ierr,nd,wcount=0 
fname = "initial string"
WRITE(*,ADVANCE='NO',FMT='(A)') " Input name of file?" 
CALL GET(STRING=fname) ! read the required filename from the default 
                     ! input unit assumed to be the whole of the record read
OPEN(UNIT=10,FILE=fname) 
file_read: DO  ! until EoF reached 
  CALL GET(10,line,IOSTAT=ierr)  ! read next line of file 
  IF(ierr == -1 .OR. ierr > 0 )EXIT file_read
  word_scan: DO ! until end of line 
    nd=SCAN(line," ,.!?")  ! scan to find end of word 
    IF(nd == 0)THEN  ! EoR is end of word 
      nd = LEN(line)
      EXIT word_scan 
    ENDIF 
    IF(nd > 1) wcount=wcount+1  ! at least one non-terminator character 
                                ! in the word 
    line = REMOVE(line,1,nd)  ! strips the counted word and its terminator 
                            ! from the line reducing its length before 
                            ! rescanning for the next word 
  ENDDO word_scan 
  IF(nd > 0) wcount=wcount+1 
ENDDO file_read 
IF(ierr < 0)THEN 
  WRITE(*,*) "No. of words in file =",wcount 
ELSEIF(ierr > 0)THEN 
  WRITE(*,*) "Error in GET file in word_count, No. ",ierr 
ENDIF 
ENDPROGRAM word_count 

............................................................................... 

PROGRAM vocabulary_word_count
!-----------------------------------------------------------------------------!
! Counts the number of "words" contained in a file. The words are assumed to  ! 
! be terminated by any one of:                                                ! 
! space,comma,period,!,?, or the EoR                                          ! 
! The file may have records of any length and the file may contain any number ! 
! of records.                                                                 ! 
! The program prompts for the name of the file to be subject to a word count  ! 
! and the result is written to the default output unit                        ! 
! Also builds a list of the vocabulary found and the frequency of occurence   !
! of each different word.                                                     !
!-----------------------------------------------------------------------------! 
USE ISO_VARYING_STRING 
IMPLICIT NONE
type node
   character(len=:),allocatable :: word
   integer :: freq = 0
   type(node), pointer :: next
end type 
   type(node), pointer :: top=>null(), current
CHARACTER(LEN=:),ALLOCATABLE:: line,word,fname 
INTEGER              :: ierr,nd,wcount=0, dcount=0 
!-----------------------------------------------------------------------------!
! Vocabulary list and frequency count arrays. The size of these arrays will   !
! be extended dynamically in steps of 100 as the used vocabulary grows        !
!-----------------------------------------------------------------------------!
CHARACTER(LEN=:),ALLOCATABLE,DIMENSION(:) :: vocab
INTEGER                                       :: list_size=200,list_top=0
INTEGER :: i   ! loop index
!-----------------------------------------------------------------------------!
! Initialise the lists and determine the file to be processed                 !
!-----------------------------------------------------------------------------!
! WRITE(*,ADVANCE='NO',FMT='(A)') " Input name of file?" 
! CALL GET(STRING=fname)  ! read the required filename from the default 
                  ! input unit assumed to be the whole of the record read
fname = '../readme.txt'
OPEN(UNIT=1,FILE=fname) 
file_read: DO ! until EoF reached 
  CALL GET(1,line,IOSTAT=ierr)  ! read next line of file 
  write(*,*)'line=',line
  IF(ierr == -1 .OR. ierr > 0)EXIT file_read
  word_scan: DO ! until end of line 
    nd=SCAN(line," ,.!?")  ! scan to find end of word 
    IF(nd == 0)THEN  ! EoR is end of word 
      nd = LEN(line)+1
      EXIT word_scan 
    ENDIF 
    IF(nd > 1)THEN  ! at least one non-terminator character in the word
       wcount=wcount+1
       word = EXTRACT(line,1,nd-1)
       write(*,*)word
       CALL update_vocab_lists
    ENDIF
    line = REMOVE(line,1,nd)  ! strips the counted word and its terminator 
                              ! from the line reducing its length before
                              ! rescanning for the next word
  ENDDO word_scan 
  IF(nd > 1)THEN  ! at least one character in the word
     wcount=wcount+1
     word = EXTRACT(line,1,nd-1)
     CALL update_vocab_lists
   ENDIF
ENDDO file_read 
IF(ierr < 0)THEN 
  WRITE(*,*) "No. of words in file =",wcount 
  WRITE(*,*) "There are ",dcount,"  distinct words"
  WRITE(*,*) "with the following frequencies of occurence"
  current => top
  print_loop: DO 
    if(.not.associated(current))exit
    WRITE(*,FMT='(1X,I6,2X)',ADVANCE='NO') current%freq
    CALL PUT_LINE(STRING=current%word)
    current => current%next
  ENDDO print_loop
ELSEIF(ierr > 0)THEN 
  WRITE(*,*) "Error in GET in vocabulary_word_count, No.",ierr
ENDIF 

CONTAINS

SUBROUTINE update_vocab_lists
!-----------------------------------------------------------------------------!
! Accesses the host variables:                                                !
!  type(VARYING_STRING),ALLOCATABLE,DIMENSION(:) :: vocab                     !
!  INTEGER,ALLOCATABLE,DIMENSION(:)              :: freq                      !
!  INTEGER                                       :: list_size,list_top        !
!  CHARACTER(LEN=:),ALLOCATABLE                         :: word                      !
! searches the existing words in vocab to find a match for word               !
! if found increments the freq if not found adds word to                      !
! list_top + 1  vocab list and sets corresponding freq to 1                   !
! if list_size exceeded extend the list size before updating                  !
!-----------------------------------------------------------------------------!
current => top
list_search: DO 
  if(.not.associated(current))exit
  IF(word == current%word)THEN
    current%freq = current%freq + 1
    RETURN
  ENDIF
  current => current%next
ENDDO list_search
dcount = dcount+1
allocate(current)
current%next => top
current%word = word
current%freq = 1
top => current 
ENDSUBROUTINE update_vocab_lists

ENDPROGRAM vocabulary_word_count



More information about the J3 mailing list