(j3.2006) (SC22WG5.4678) Part 2 of the standard

John Reid John.Reid
Mon Apr 16 11:35:17 EDT 2012


WG5,

Following helpful comments from Clive Page, Bill Long, and Malcolm 
Cohen, I have revised both my paper and the code.

New version attached.

Once again, comments will be very welcome.

John.

-------------- next part --------------
                                         ISO/IEC JTC1/SC22/WG5 N1913-2

        The module for Part 2, varying length character strings
        
                           John Reid

                          16 April 2012


I promised at Garching to look into the conversion to Fortran 2003 of the 
procedures of subclauses 3.6 and 3.7 of Part 2 of the Standard. These are 
the I/O procedures GET, PUT, and PUT_LINE and the elemental procedures 
EXTRACT, INSERT, REMOVE, REPLACE, and INSERT. The other procedures of
Part 2, except for CHAR with a character argument and VAR_STR, are already 
in Part 1. 

Lawrie Schonfelder wrote a sample module to implement Part 2. This is 
available from
     ftp://ftp.nag.co.uk/sc22wg5/ISO_VARYING_STRING/
He also wrote two sample programs ilustrating the use of the facilites.
These are included as an annex to Part 2 and are available from the same
ftp site. Bill Long has pointed out that the sample programs are not
portable because they rely on the IOSTAT value -1 indicating that the
end of the file has been reached. 

I have made a new module and new versions of the two sample programs. These 
are appended to this paper. They run successfully under 
the Nag compiler. Each occurrence of
   type(VARYING_STRING),INTENT(IN)
was replaced by
   CHARACTER(LEN=*),INTENT(IN)
Each other occurrence of
   type(VARYING_STRING)
is replaced by
   CHARACTER(LEN=:),ALLOCATABLE
This resulted in arguments that were previously documented as either of
type CHARACTER or VARYING_STRING becoming only of type CHARACTER. As well 
as the procedures subclauses 3.6 and 3.7, I included CHAR with a 
character argument and VAR_STR. The number of specific procedures reduced 
from 43 to 18.

There were obvious mechanical changes needed within the procedures. 

The functions EXTRACT, INSERT, REMOVE, and REPLACE cannot be elemental 
because each function result is allocatable with deferred character length. 
The subroutine SPLIT cannot be elemental because the arguments string, word, 
and separator are allocatable with deferred character length. This is because 
the elements of a character array have to have the same character length. 
A similar problem occurs in the vocabulary sample program, where an array 
holds words of varying length. Following a suggestion from Malcolm Cohen, I 
declared my own derived type with a single component of deferred character 
length and worked with arrays of this type. In this example, none of the 
functions was called elementally, but a user that wishes to do this could 
write impure elemental wrappers to get the effect. 

It might be thought that non-elemental invocations of EXTRACT, INSERT, REMOVE, 
REPLACE, and SPLIT might be replaced by simple use of substrings and 
concatenation. For example, REMOVE(string,start,finish) can usually be
replaced by string(1:start-1)//string(finish+1:), but there are end cases
where it cannot. The reader can see the details in the code that is appended. 

In both sample programs, I access ISO_FORTRAN_ENV to get the IOSTAT value that
indicates that the end of the file has been reached. 

Lawrie Schonfelder wrote a test suite that is available from:
   http://www.fortran.com/vst_95a.zip
I have adapted codes from here to test the new module, but I am not confident
that these adapted codes are adequate to fully test the revised code. In
particular, they rely on interactive input from the keyboard and I was unsure
of what is needed. If this work is to be taken further, new tests are needed. 

Two questions were asked at the meeting in Garching:

1. Can programs that use the varying length strings module be so easily 
converted that Part 2 of the Standard should be withdrawn?
   
2. Should extra intrinsic procedures be added to part 1?

The user with arrays of type VARYING_STRING may declare a new derived type
with a single component of deferred character length and write impure 
elemental wrappers to support elemental calls. Apart from this, the module 
given here (after full testing) should allow simple conversion. Changes 
are needed for the extraction of a single character; for example, 
set%chars(j) needs to be replaced by set(j:j). They are also needed
if array features are employed for the character array component. For
example
      IF( ALL(string%chars(ipos:ipos+lt-1) == target%chars) )THEN
becomes
      IF( string(ipos:ipos+lt-1) == target )THEN

I prefer not to comment on whether these functions should be added to Part 1. 
The case needs to be made by those who need them. We might consider providing
an open source and fully tested version of the appended module. 

...............................................................................

MODULE ISO_VARYING_STRING
USE,INTRINSIC :: ISO_FORTRAN_ENV 
IMPLICIT NONE
! Adapted by John Reid from the module written by J.L.Schonfelder.  

! Version produced (12 April 2012)

!-----------------------------------------------------------------------------
! This module provides revised versions of the generic procedures GET, PUT,  
! PUT_LINE, EXTRACT, INSERT, REMOVE, REPLACE, and INSERT of                  
! ISO/IEC 1539-2:2000(E).
!----------------------------------------------------------------------------- 

! NB This module has not been fully tested.   
  
PRIVATE 
  
!-----------------------------------------------------------------------------! 
! By default all entities declared or defined in this module are private to   !
! the module. Only those entities declared explicitly as being public are     ! 
! accessible to programs using the module. In particular, the procedures      ! 
! defined herein are made accessible via their generic identifiers only;      ! 
! their specific names are private.                                     ! 
!-----------------------------------------------------------------------------! 
  
 
!----- GENERIC PROCEDURE INTERFACE DEFINITIONS -------------------------------! 
 
!----- Conversion procedure interfaces ---------------------------------------!
INTERFACE VAR_STR
  MODULE PROCEDURE c_to_s   ! character to string
ENDINTERFACE 
  
INTERFACE CHAR
  MODULE PROCEDURE s_to_c, &   ! string to character
                   s_to_fix_c  ! string to specified length character
ENDINTERFACE   

!----- Input procedure interfaces --------------------------------------------!
INTERFACE GET
  MODULE PROCEDURE get_d_eor, &    ! default unit, EoR termination
                   get_u_eor, &    ! specified unit, EoR termination
                   get_d_tset_c, & ! default unit, char set termination
                   get_u_tset_c    ! specified unit, char set termination
ENDINTERFACE   
!----- Output procedure interfaces -------------------------------------------!
INTERFACE PUT
  MODULE PROCEDURE put_d_c, & ! char to default unit
                   put_u_c    ! char to specified unit
ENDINTERFACE 
  
INTERFACE PUT_LINE
  MODULE PROCEDURE putline_d_c, & ! char to default unit
                   putline_u_c    ! char to specified unit
ENDINTERFACE 
  
!----- Insert procedure interfaces -------------------------------------------!
INTERFACE INSERT 
  MODULE PROCEDURE insert_cc ! string in string
ENDINTERFACE 

!----- Replace procedure interfaces ------------------------------------------!
INTERFACE REPLACE 
  MODULE PROCEDURE replace_ss, &   ! string by string, at specified
                   replace_ss_sf,& ! string by string, between
                   replace_sss     ! in string replace string by string
ENDINTERFACE 

!----- Remove procedure interface --------------------------------------------! 
INTERFACE REMOVE 
  MODULE PROCEDURE remove_c    ! characters from char  , and finish
ENDINTERFACE 
  
!----- Extract procedure interface -------------------------------------------!
INTERFACE EXTRACT 
  MODULE PROCEDURE extract_c    ! from char   extract string, and finish
ENDINTERFACE 
  
!----- Split procedure interface ---------------------------------------------!
INTERFACE SPLIT
  MODULE PROCEDURE split_c    !   character in set
ENDINTERFACE

!----- specification of publicly accessible entities -------------------------! 
PUBLIC :: CHAR,GET,PUT,PUT_LINE,EXTRACT,INSERT,REMOVE,REPLACE,SPLIT,VAR_STR
          
CONTAINS 

  
!----- Conversion Procedures ------------------------------------------------! 
  FUNCTION c_to_s(chr) ! generic VAR_STR
  CHARACTER(LEN=:),ALLOCATABLE :: c_to_s 
  CHARACTER(LEN=*),INTENT(IN) :: chr 
    c_to_s = chr 
  ENDFUNCTION c_to_s 
  
 PURE FUNCTION s_to_c(string) ! generic CHAR
  CHARACTER(LEN=*),INTENT(IN)   :: string 
  CHARACTER(LEN=:),ALLOCATABLE :: s_to_c 
  ! returns the characters of string as an automatically sized character 
    s_to_c = string
 ENDFUNCTION s_to_c 
  
 PURE FUNCTION s_to_fix_c(string,length) ! generic CHAR
  CHARACTER(LEN=*),INTENT(IN)   :: string 
  INTEGER,INTENT(IN)            :: length
  CHARACTER(LEN=length)         :: s_to_fix_c
  ! returns the character of fixed length, length, containing the characters
  ! of string either padded with blanks or truncated on the right to fit
    s_to_fix_c = string
  ENDFUNCTION s_to_fix_c


!----- Input string procedure -----------------------------------------------!
SUBROUTINE get_d_eor(string,maxlen,iostat)
 CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT) :: string
                                  ! the variable to be filled with
                                  ! characters read from the
                                  ! file connected to the default unit
 INTEGER,INTENT(IN),OPTIONAL      :: maxlen
                                  ! if present indicates the maximum
                                  ! number of characters that will be
                                  ! read from the file
 INTEGER,INTENT(OUT),OPTIONAL     :: iostat
                                  ! if present used to return the status
                                  ! of the data transfer
                                  ! if absent errors cause termination
! reads string from the default unit starting at next character in the file
! and terminating at the end of record or after maxlen characters.
  call get_u_eor (input_unit, string, maxlen, iostat)
ENDSUBROUTINE get_d_eor
  
SUBROUTINE get_u_eor(unit,string,maxlen,iostat)
 INTEGER,INTENT(IN)               :: unit
                                  ! identifies the input unit which must be
                                  ! connected for sequential formatted read
 CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT) :: string
                                  ! the variable to be filled with
                                  ! characters read from the
                                  ! file connected to the unit
 INTEGER,INTENT(IN),OPTIONAL      :: maxlen
                                  ! if present indicates the maximum
                                  ! number of characters that will be
                                  ! read from the file
 INTEGER,INTENT(OUT),OPTIONAL     :: iostat
                                  ! if present used to return the status
                                  ! of the data transfer
                                  ! if absent errors cause termination
! reads string from unit starting at next character in the file and 
! terminating at the end of record or after maxlen characters.
 CHARACTER(LEN=80) :: buffer
 INTEGER           :: ist,nch,toread,nb
 IF(PRESENT(maxlen))THEN
   toread=maxlen
 ELSE
   toread=HUGE(1)
 ENDIF
 string=""  ! Start with an empty string
 DO  ! repeatedly read buffer and add to string until EoR
     ! or maxlen reached
   IF(toread <= 0)EXIT
   nb=MIN(80,toread)
   READ(unit,FMT='(A)',ADVANCE='NO',EOR=9999,SIZE=nch,IOSTAT=ist) buffer(1:nb)
   IF( ist /= 0 )THEN 
     IF(PRESENT(iostat)) THEN 
       iostat=ist 
       RETURN 
     ELSE 
       WRITE(*,*) " Error No.",ist, &
                  " during READ_STRING of varying string on UNIT ",unit
       STOP 
     ENDIF 
   ENDIF 
   string = string //buffer(1:nb)
   toread = toread - nb
 ENDDO
 IF(PRESENT(iostat)) iostat = 0
 RETURN
 9999 string = string //buffer(1:nch) 
 IF(PRESENT(iostat)) iostat = ist
ENDSUBROUTINE get_u_eor

SUBROUTINE get_d_tset_c(string,set,separator,maxlen,iostat)
 CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT) :: string
                                  ! the variable to be filled with
                                  ! characters read from the
                                  ! file connected to the default unit
 CHARACTER(LEN=*),INTENT(IN)      :: set
                                  ! the set of characters which if found in
                                  ! the input terminate the read
 CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT),OPTIONAL :: separator
                                  ! the actual separator character from set
                                  ! found as the input string terminator
                                  ! returned as zero length if termination
                                  ! by maxlen or EOR
 INTEGER,INTENT(IN),OPTIONAL      :: maxlen
                                  ! if present indicates the maximum
                                  ! number of characters that will be
                                  ! read from the file
 INTEGER,INTENT(OUT),OPTIONAL     :: iostat
                                  ! if present used to return the status
                                  ! of the data transfer
                                  ! if absent errors cause termination
! reads string from the default unit starting at next character in the file and
! terminating at the end of record, occurance of a character in set,
! or after reading maxlen characters.
 CALL get_u_tset_c(input_unit,string,set,separator,maxlen,iostat) 
ENDSUBROUTINE get_d_tset_c

SUBROUTINE get_u_tset_c(unit,string,set,separator,maxlen,iostat)
 INTEGER,INTENT(IN)               :: unit
                                  ! identifies the input unit which must be
                                  ! connected for sequential formatted read
 CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT) :: string
                                  ! the variable to be filled with
                                  ! characters read from the
                                  ! file connected to the unit
 CHARACTER(LEN=*),INTENT(IN)      :: set
                                  ! the set of characters which if found in
                                  ! the input terminate the read
 CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT),OPTIONAL :: separator
                                  ! the actual separator character from set
                                  ! found as the input string terminator
                                  ! returned as zero length if termination
                                  ! by maxlen or EOR
 INTEGER,INTENT(IN),OPTIONAL      :: maxlen
                                  ! if present indicates the maximum
                                  ! number of characters that will be
                                  ! read from the file
 INTEGER,INTENT(OUT),OPTIONAL     :: iostat
                                  ! if present used to return the status
                                  ! of the data transfer
                                  ! if absent errors cause termination
! reads string from unit starting at next character in the file and
! terminating at the end of record, occurance of a character in set,
! or after reading maxlen characters.
 CHARACTER :: buffer  ! characters must be read one at a time to detect
                      ! first terminator character in set
 INTEGER   :: ist,j,toread,lenset
 lenset = LEN(set)
 IF(PRESENT(maxlen))THEN
   toread=maxlen
 ELSE
   toread=HUGE(1)
 ENDIF
 string = ""  ! clears return string N.B. will also deallocate string via the
             ! assignment operation
 IF(PRESENT(separator)) separator="" ! clear separator
 readchar:DO  ! repeatedly read buffer and add to string
   IF(toread <= 0)EXIT readchar ! maxlen reached
   READ(unit,FMT='(A)',ADVANCE='NO',EOR=9999,IOSTAT=ist) buffer
   IF( ist /= 0 )THEN 
     IF(PRESENT(iostat)) THEN 
       iostat=ist 
       RETURN 
     ELSE 
       WRITE(*,*) " Error No.",ist, &
                  " during GET of varying string on unit ",unit
       STOP 
     ENDIF 
   ENDIF 
   ! check for occurance of set character in buffer
     DO j = 1,lenset
       IF(buffer == set(j:j))THEN
         IF(PRESENT(separator)) separator=buffer
         EXIT readchar ! separator terminator found
       ENDIF
     ENDDO
   string = string//buffer
   toread = toread - 1
 ENDDO readchar
 IF(PRESENT(iostat)) iostat = 0
 RETURN
 9999 CONTINUE ! EOR terminator read
 IF(PRESENT(iostat)) iostat = ist
ENDSUBROUTINE get_u_tset_c

!----- Output string procedures ----------------------------------------------!
  
SUBROUTINE put_d_c(string,iostat)
 CHARACTER(LEN=*),INTENT(IN)     :: string
                                  ! the character variable to be appended to
                                  ! the current record or to the start of
                                  ! the next record if there is no
                                  ! current record
                                  ! uses the default unit
 INTEGER,INTENT(OUT),OPTIONAL    :: iostat
                                  ! if present used to return the status
                                  ! of the data transfer
                                  ! if absent errors cause termination
 CALL put_u_c(output_unit,string,iostat)
ENDSUBROUTINE put_d_c

SUBROUTINE put_u_c(unit,string,iostat)
 INTEGER,INTENT(IN)              :: unit
                                  ! identifies the output unit which must
                                  ! be connected for sequential formatted
                                  ! write
 CHARACTER(LEN=*),INTENT(IN)     :: string
                                  ! the character variable to be appended to
                                  ! the current record or to the start of
                                  ! the next record if there is no
                                  ! current record
 INTEGER,INTENT(OUT),OPTIONAL    :: iostat
                                  ! if present used to return the status
                                  ! of the data transfer
                                  ! if absent errors cause termination
 INTEGER :: ist
 WRITE(unit,FMT='(A)',ADVANCE='NO',IOSTAT=ist) string
 IF( ist /= 0 )THEN
  IF(PRESENT(iostat))THEN
   iostat = ist
   RETURN
  ELSE
   WRITE(*,*) " Error No.",ist," during PUT of character on UNIT ",unit
   STOP
  ENDIF
 ENDIF
 IF(PRESENT(iostat)) iostat=0
ENDSUBROUTINE put_u_c


SUBROUTINE putline_d_c(string,iostat)
 CHARACTER(LEN=*),INTENT(IN)     :: string
                                  ! the character variable to be appended to
                                  ! the current record or to the start of
                                  ! the next record if there is no
                                  ! current record
                                  ! uses the default unit
 INTEGER,INTENT(OUT),OPTIONAL    :: iostat
                                  ! if present used to return the status
                                  ! of the data transfer
                                  ! if absent errors cause termination
! appends the string to the current record and then ends the record
! leaves the file positioned after the record just completed which then
! becomes the previous and last record in the file.
 CALL putline_u_c(output_unit,string,iostat)
ENDSUBROUTINE putline_d_c
  
SUBROUTINE putline_u_c(unit,string,iostat)
 INTEGER,INTENT(IN)              :: unit
                                  ! identifies the output unit which must
                                  ! be connected for sequential formatted
                                  ! write
 CHARACTER(LEN=*),INTENT(IN)     :: string
                                  ! the character variable to be appended to
                                  ! the current record or to the start of
                                  ! the next record if there is no
                                  ! current record
 INTEGER,INTENT(OUT),OPTIONAL    :: iostat
                                  ! if present used to return the status
                                  ! of the data transfer
                                  ! if absent errors cause termination
! appends the string to the current record and then ends the record
! leaves the file positioned after the record just completed which then
! becomes the previous and last record in the file.
 INTEGER :: ist
 WRITE(unit,FMT='(A,/)',ADVANCE='NO',IOSTAT=ist) string
 IF(PRESENT(iostat))THEN
  iostat = ist
  RETURN
 ELSEIF( ist /= 0 )THEN
  WRITE(*,*) " Error No.",ist, &
              " during WRITE_LINE of character on UNIT",unit
  STOP
 ENDIF
ENDSUBROUTINE putline_u_c
  
!----- Insert procedure ----------------------------------------------------!
 FUNCTION insert_cc(string,start,substring)
  CHARACTER(LEN=:),ALLOCATABLE        :: insert_cc
  CHARACTER(LEN=*),INTENT(IN) :: string
  INTEGER,INTENT(IN)          :: start
  CHARACTER(LEN=*),INTENT(IN) :: substring
  ! calculates result string by inserting the substring into string
  ! beginning at position start pushing the remainder of the string
  ! to the right and enlarging it accordingly,
  ! if start is greater than LEN(string) the substring is simply appended  ! to string by concatenation. if start is less than 1
  ! substring is inserted before string, ie. start is treated as if it were 1 
  INTEGER                        :: ip,is,lsub,ls 
  lsub = LEN(substring); ls = LEN(string)
  is = MAX(start,1) 
  ip = MIN(ls+1,is) 
  insert_cc =  string(1:ip-1)//substring//string(ip:ls)
 ENDFUNCTION insert_cc
  

!----- Replace procedures ---------------------------------------------------!
 FUNCTION replace_ss(string,start,substring)
 CHARACTER(LEN=*),INTENT(IN) :: string
 INTEGER,INTENT(IN)          :: start
 CHARACTER(LEN=*),INTENT(IN) :: substring
 CHARACTER(LEN=:),ALLOCATABLE  :: replace_ss
  !  calculates the result string by the following actions:
 !  inserts the substring into string beginning at position 
 !  start replacing the following LEN(substring) characters of the string 
 !  and enlarging string if necessary. if start is greater than LEN(string) 
 !  substring is simply appended to string by concatenation. If start is less 
 !  than 1, substring replaces characters in string starting at 1
 INTEGER                        :: ip,is,lsub,ls
 lsub = LEN(substring); ls = LEN(string)
 is = MAX(start,1)
 ip = MIN(ls+1,is)
 replace_ss = string(1:ip-1)//substring//string(ip+lsub:)
ENDFUNCTION replace_ss
  
 FUNCTION replace_ss_sf(string,start,finish,substring)
 CHARACTER(LEN=*),INTENT(IN)  :: string
 INTEGER,INTENT(IN)           :: start,finish
 CHARACTER(LEN=*),INTENT(IN)  :: substring
 CHARACTER(LEN=:),ALLOCATABLE   :: replace_ss_sf
 !  calculates the result string by the following actions:
 !  inserts the substring into string beginning at position 
 !  start replacing the following finish-start+1 characters of the string
 !  and enlarging or shrinking the string if necessary.
 !  If start is greater than LEN(string) substring is simply appended to string
 !  by concatenation. If start is less than 1, start = 1 is used
 !  If finish is greater than LEN(string), finish = LEN(string) is used
 !  If finish is less than start, substring is inserted before start
 INTEGER                        :: ip,is,if,nw,lsub,ls
 lsub = LEN(substring); ls = LEN(string)
 is = MAX(start,1)
 ip = MIN(ls+1,is)
 if = MAX(ip-1,MIN(finish,ls))
 nw = lsub + ls - if+ip-1
 replace_ss_sf = string(1:ip-1)//substring//string(if+1:ls)
ENDFUNCTION replace_ss_sf

FUNCTION replace_sss(string,target,substring,every,back)
 CHARACTER(LEN=*),INTENT(IN) :: string,target,substring
 LOGICAL,INTENT(IN),OPTIONAL     :: every,back
 !  calculates the result string by the following actions:
 !  searches for occurences of target in string, and replaces these with
 !  substring. if back present with value true search is backward otherwise
 !  search is done forward. if every present with value true all occurences
 !  of target in string are replaced, otherwise only the first found is
 !  replaced. if target is not found the result is the same as string.
 LOGICAL                      :: dir_switch, rep_search
 CHARACTER(LEN=:),ALLOCATABLE :: work
 INTEGER                      :: ls,lt,lsub,ipos,ipow
 CHARACTER(LEN=:),ALLOCATABLE  :: replace_sss
 ls = LEN(string); lt = LEN(target); lsub = LEN(substring)
 IF(lt==0)THEN
   IF(ls==0)THEN
     replace_sss = substring
   ELSE
     replace_sss = string
   ENDIF
   RETURN
 ENDIF
 work = string
 IF( PRESENT(back) )THEN
   dir_switch = back
 ELSE
   dir_switch = .FALSE.
 ENDIF
 IF( PRESENT(every) )THEN
   rep_search = every
 ELSE
   rep_search = .FALSE.
 ENDIF
 IF( dir_switch )THEN ! backwards search
   ipos = ls-lt+1
   DO
     IF( ipos < 1 )EXIT ! search past start of string
     ! test for occurance of target in string at this position
     IF( string(ipos:ipos+lt-1) == target )THEN
       ! match found allocate space for string with this occurance of
       ! target replaced by substring
       ! copy work into temp replacing this occurance of target by
       ! substring
       work = work(1:ipos-1)//substring//work(ipos+lt:)
       IF(.NOT.rep_search)EXIT ! exit if only first replacement wanted
       ! move search and replacement positions over the effected positions
       ipos = ipos-lt+1
     ENDIF
     ipos=ipos-1
   ENDDO
 ELSE ! forward search
   ipos = 1; ipow = 1
   DO
     IF( ipos > ls-lt+1 )EXIT ! search past end of string
     ! test for occurance of target in string at this position
     IF( string(ipos:ipos+lt-1) == target )THEN
       ! match found allocate space for string with this occurance of
       ! target replaced by substring
       work = work(1:ipow-1)//substring//work(ipow+lt:)
       IF(.NOT.rep_search)EXIT ! exit if only first replacement wanted
       ! move search and replacement positions over the effected positions
       ipos = ipos+lt-1; ipow = ipow+lsub-1
     ENDIF
     ipos=ipos+1; ipow=ipow+1
   ENDDO
 ENDIF
 replace_sss = work
ENDFUNCTION replace_sss


!----- Remove procedure ----------------------------------------------------!
  
 FUNCTION remove_c(string,start,finish)
 CHARACTER(LEN=:),ALLOCATABLE        :: remove_c
 CHARACTER(LEN=*),INTENT(IN) :: string
 INTEGER,INTENT(IN),OPTIONAL :: start
 INTEGER,INTENT(IN),OPTIONAL :: finish
 !  returns as result the string produced by the actions
 !  removes the characters between start and finish from string reducing it in 
 !  size by MAX(0,ABS(finish-start+1)) 
 !  if start < 1 or is missing then assumes start=1 
 !  if finish > LEN(string) or is missing then assumes finish=LEN(string) 
 INTEGER                        :: is,if,ls
 
 ls = LEN(string)
 IF (PRESENT(start)) THEN
   is = MAX(1,start)
 ELSE
   is = 1
 ENDIF
 IF (PRESENT(finish)) THEN
   if = MIN(ls,finish)
 ELSE
   if = ls
 ENDIF
 IF( if < is ) THEN  ! zero characters to be removed, string is unchanged
   remove_c = string
 ELSE
   remove_c = string(1:is-1)//string(if+1:)
 ENDIF
ENDFUNCTION remove_c
  
!----- Extract procedure ---------------------------------------------------!
  
FUNCTION extract_c(string,start,finish)
  CHARACTER(LEN=*),INTENT(IN) :: string 
  INTEGER,INTENT(IN),OPTIONAL :: start   
  INTEGER,INTENT(IN),OPTIONAL :: finish  
  CHARACTER(LEN=:),ALLOCATABLE        :: extract_c 
  ! extracts the characters between start and finish from character string and 
  ! delivers these as the result of the function, string is unchanged 
  ! if start < 1 or is missing then it is treated as 1 
  ! if finish > LEN(string) or is missing then it is treated as LEN(string) 
  INTEGER                      :: is,if 
  IF (PRESENT(start)) THEN    
     is = MAX(1,start) 
  ELSE 
     is = 1 
  ENDIF 
  IF (PRESENT(finish)) THEN  
     if = MIN(LEN(string),finish) 
  ELSE 
     if = LEN(string) 
  ENDIF 
  extract_c = string(is:if) 
 ENDFUNCTION extract_c 

!----- Split procedure ------------------------------------------------------!

SUBROUTINE split_c(string,word,set,separator,back)
  CHARACTER(LEN=:),ALLOCATABLE,INTENT(INOUT)        :: string
  CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT)          :: word
  CHARACTER(LEN=*),INTENT(IN)               :: set
  CHARACTER(LEN=:),ALLOCATABLE,INTENT(OUT),OPTIONAL :: separator
  LOGICAL,INTENT(IN),OPTIONAL               :: back
  ! splits the input string at the first(last) character in set
  ! returns the leading(trailing) substring in word and the trailing(leading)
  ! substring in string. The search is done in the forward or backward
  ! direction depending on back. If separator is present, the actual separator
  ! character found is returned in separator.
  ! If no character in set is found string and separator are returned as
  ! zero length and the whole input string is returned in word.
  LOGICAL                    :: dir_switch 
  INTEGER                    :: i,ls,tpos,lset
  CHARACTER(LEN=:),ALLOCATABLE :: wst ! working copy of string
  ls = LEN(string); lset = LEN(set)
  wst=string
  IF( PRESENT(back) )THEN 
    dir_switch = back 
  ELSE 
    dir_switch = .FALSE. 
  ENDIF 
  IF(dir_switch)THEN ! backwards search 
    BSEARCH:DO tpos = ls,1,-1
       DO i=1,lset
         IF(wst(tpos:tpos) == set(i:i))EXIT BSEARCH
       ENDDO
    ENDDO BSEARCH
    word = wst(tpos+1:ls)
    IF(PRESENT(separator))THEN
      IF(tpos==0)THEN
        separator = ""
      ELSE
        separator = wst(tpos:tpos)
      ENDIF
    ENDIF
    string = wst(1:tpos-1)
  ELSE ! forwards search
    FSEARCH:DO tpos =1,ls
       DO i=1,lset
         IF(wst(tpos:tpos) == set(i:i))EXIT FSEARCH
       ENDDO
    ENDDO FSEARCH
    word = wst(1:tpos-1)
    IF(PRESENT(separator))THEN
      IF(tpos==ls+1)THEN
        separator = ""
      ELSE
        separator = wst(tpos:tpos)
      ENDIF
    ENDIF
    string = wst(tpos+1:ls)
  ENDIF
 ENDSUBROUTINE split_c
  
ENDMODULE ISO_VARYING_STRING

...............................................................................

PROGRAM word_count 
!-----------------------------------------------------------------------------!
! Counts the number of "words" contained in a file. The words are assumed to  ! 
! be terminated by any one of:                                                ! 
! space,comma,period,!,?, or the EoR                                          ! 
! The file may have records of any length and the file may contain any number ! 
! of records.                                                                 ! 
! The program prompts for the name of the file to be subject to a word count  ! 
! and the result is written to the default output unit                        ! 
!-----------------------------------------------------------------------------! 
USE ISO_VARYING_STRING 
USE,INTRINSIC :: ISO_FORTRAN_ENV 
IMPLICIT NONE
CHARACTER(len=:),ALLOCATABLE :: line,fname
INTEGER              :: ierr,nd,wcount=0 
WRITE(*,ADVANCE='NO',FMT='(A)') " Input name of file?" 
CALL GET(STRING=fname) ! read the required filename from the default 
                     ! input unit assumed to be the whole of the record read
OPEN(UNIT=10,FILE=fname) 
file_read: DO  ! until EoF reached 
  CALL GET(10,line,IOSTAT=ierr)  ! read next line of file 
  IF(ierr == IOSTAT_END .OR. ierr > 0 )EXIT file_read
  word_scan: DO ! until end of line 
    nd=SCAN(line," ,.!?")  ! scan to find end of word 
    IF(nd == 0)THEN  ! EoR is end of word 
      nd = LEN(line)
      EXIT word_scan 
    ENDIF 
    IF(nd > 1) wcount=wcount+1  ! at least one non-terminator character 
                                ! in the word 
    line = REMOVE(line,1,nd)  ! strips the counted word and its terminator 
                            ! from the line reducing its length before 
                            ! rescanning for the next word 
  ENDDO word_scan 
  IF(nd > 0) wcount=wcount+1 
ENDDO file_read 
IF(ierr < 0)THEN 
  WRITE(*,*) "No. of words in file =",wcount 
ELSEIF(ierr > 0)THEN 
  WRITE(*,*) "Error in GET file in word_count, No. ",ierr 
ENDIF 
ENDPROGRAM word_count 

...............................................................................

PROGRAM vocabulary_word_count
!-----------------------------------------------------------------------------!
! Counts the number of "words" contained in a file. The words are assumed to  ! 
! be terminated by any one of:                                                ! 
! space,comma,period,!,?, or the EoR                                          ! 
! The file may have records of any length and the file may contain any number ! 
! of records.                                                                 ! 
! The program prompts for the name of the file to be subject to a word count  ! 
! and the result is written to the default output unit                        ! 
! Also builds a list of the vocabulary found and the frequency of occurence   !
! of each different word.                                                     !
!-----------------------------------------------------------------------------! 
USE ISO_VARYING_STRING 
USE,INTRINSIC :: ISO_FORTRAN_ENV 
IMPLICIT NONE
type VARYING_STRING
  CHARACTER(LEN=:),ALLOCATABLE :: chars
end type VARYING_STRING
CHARACTER(LEN=:),ALLOCATABLE :: line,word,fname 
INTEGER              :: ierr,nd,wcount=0 
!-----------------------------------------------------------------------------!
! Vocabulary list and frequency count arrays. The size of these arrays will   !
! be extended dynamically in steps of 100 as the used vocabulary grows        !
!-----------------------------------------------------------------------------!
type(VARYING_STRING),ALLOCATABLE,DIMENSION(:) :: vocab
INTEGER,ALLOCATABLE,DIMENSION(:)              :: freq
INTEGER                                       :: list_size=200,list_top=0
INTEGER :: i   ! loop index
!-----------------------------------------------------------------------------!
! Initialise the lists and determine the file to be processed                 !
!-----------------------------------------------------------------------------!
ALLOCATE(vocab(1:list_size),freq(1:list_size))
WRITE(*,ADVANCE='NO',FMT='(A)') " Input name of file?" 
CALL GET(STRING=fname)  ! read the required filename from the default 
                  ! input unit assumed to be the whole of the record read
OPEN(UNIT=1,FILE=CHAR(fname))  ! CHAR(fname) converts to the type 
                               ! required by FILE= specifier 
file_read: DO ! until EoF reached 
  CALL GET(1,line,IOSTAT=ierr)  ! read next line of file 
  IF(ierr == IOSTAT_END .OR. ierr > 0)EXIT file_read
  word_scan: DO ! until end of line 
    nd=SCAN(line," ,.!?")  ! scan to find end of word 
    IF(nd == 0)THEN  ! EoR is end of word 
      nd = LEN(line)+1
      EXIT word_scan 
    ENDIF 
    IF(nd > 1)THEN  ! at least one non-terminator character in the word
       wcount=wcount+1
       word = EXTRACT(line,1,nd-1)
       CALL update_vocab_lists
    ENDIF
    line = REMOVE(line,1,nd)  ! strips the counted word and its terminator 
                              ! from the line reducing its length before
                              ! rescanning for the next word
  ENDDO word_scan 
  IF(nd > 1)THEN  ! at least one character in the word
     wcount=wcount+1
     word = EXTRACT(line,1,nd-1)
     CALL update_vocab_lists
   ENDIF
ENDDO file_read 
IF(ierr < 0)THEN 
  WRITE(*,*) "No. of words in file =",wcount 
  WRITE(*,*) "There are ",list_top,"  distinct words"
  WRITE(*,*) "with the following frequencies of occurence"
  print_loop: DO i=1,list_top
    WRITE(*,FMT='(1X,I6,2X)',ADVANCE='NO') freq(i)
    CALL PUT_LINE(STRING=vocab(i)%chars)
  ENDDO print_loop
ELSEIF(ierr > 0)THEN 
  WRITE(*,*) "Error in GET in vocabulary_word_count, No.",ierr
ENDIF 

CONTAINS

SUBROUTINE extend_lists
!-----------------------------------------------------------------------------!
! Accesses the host variables:                                                !
!  type(VARYING_STRING),ALLOCATABLE,DIMENSION(:) :: vocab                     !
!  INTEGER,ALLOCATABLE,DIMENSION(:)              :: freq                      !
!  INTEGER                                       :: list_size                 !
! so as to extend the size of the lists preserving the existing vocabulary    !
! and frequency information in the new extended lists                         !
!-----------------------------------------------------------------------------!
type(VARYING_STRING),DIMENSION(list_size) :: vocab_swap
INTEGER,DIMENSION(list_size)              :: freq_swap
INTEGER,PARAMETER :: list_increment=100
INTEGER           :: new_list_size,alerr
vocab_swap = vocab  ! copy old list into temporary space
freq_swap =freq
new_list_size = list_size + list_increment
DEALLOCATE(vocab,freq)
ALLOCATE(vocab(1:new_list_size),freq(1:new_list_size),STAT=alerr)
IF(alerr /= 0)THEN
  WRITE(*,*) "Unable to extend vocabulary list"
  STOP
ENDIF
vocab(1:list_size) = vocab_swap   ! copy old list back into bottom
freq(1:list_size) = freq_swap     ! of new extended list
list_size = new_list_size
ENDSUBROUTINE extend_lists

SUBROUTINE update_vocab_lists
!-----------------------------------------------------------------------------!
! Accesses the host variables:                                                !
!  type(VARYING_STRING),ALLOCATABLE,DIMENSION(:) :: vocab                     !
!  INTEGER,ALLOCATABLE,DIMENSION(:)              :: freq                      !
!  INTEGER                                       :: list_size,list_top        !
!  type(VARYING_STRING)                          :: word                      !
! searches the existing words in vocab to find a match for word               !
! if found increments the freq if not found adds word to                      !
! list_top + 1  vocab list and sets corresponding freq to 1                   !
! if list_size exceeded extend the list size before updating                  !
!-----------------------------------------------------------------------------!
INTEGER :: i   ! loop index 
list_search: DO i=1,list_top
  IF(word == vocab(i)%chars)THEN
    freq(i) = freq(i) + 1
    RETURN
  ENDIF
ENDDO list_search
IF(list_top == list_size)THEN
  CALL extend_lists
ENDIF
list_top = list_top + 1
vocab(list_top)%chars = word
freq(list_top) = 1
ENDSUBROUTINE update_vocab_lists

ENDPROGRAM vocabulary_word_count

...............................................................................




More information about the J3 mailing list