rank(3f) - [M_datapac:SORT] rank a vector of sample observations
SUBROUTINE RANK(X,N,Xr)
REAL(kind=wp),intent(in) :: X(:)
INTEGER,intent(in) :: N
REAL(kind=wp),intent(out) :: Xr(:)
RANK(3f) ranks (in ascending order) the N elements of the precision
precision vector X, and puts the resulting N ranks into the precision
precision vector XR.
RANK(3f) gives the data analyst the ability to (for example) rank
the data preliminary to certain distribution-free analyses.
The rank of the first element of the vector X will be placed in the first position of the vector XR, the rank of the second element of the vector X will be placed in the second position of the vector XR, etc.
The smallest element in the vector X will have a rank of 1 (unless ties exist). the largest element in the vector X will have a rank of N (unless ties exist).
Although ranks are usually (unless ties exist) integral values from 1 to N, it is to be noted that they are outputted as REAL values in the vector XR. XR is so as to be consistent with the fact that all vector arguments in all other datapac subroutines are REAL; but more importantly, because ties frequently do exist in data sets and so some of the resulting ranks will be non-integral and so the output vector of ranks must necessarily be REAL and not INTEGER.
The input vector X remains unaltered.
Due to conflicting use of labeled common /block2_real32/ by this rank subroutine and the SPCORR (Spearman rank correlation coefficient) subroutine, the vector XS of this rank subroutine has been placed in labeled common /block4_real32/
The first and third arguments in the calling sequence may be identical; that is, an ‘in place’ ranking is permitted. The calling sequence call RANK(X,N,X) is valid, if desired.
The sorting algorthm used herein is the binary sort. This algorthim is extremely fast as the following time trials indicate. These time trials were carried out on the UNIVAC 1108 EXEC 8 system at NBS in August of 1974. by way of comparison, the time trial values for the easy-to-program but extremely inefficient bubble sort algorithm have also been included–
Number of random Binary sort Bubble sort numbers sorted n = 10 .002 sec .002 sec n = 100 .011 sec .045 sec n = 1000 .141 sec 4.332 sec n = 3000 .476 sec 37.683 sec n = 10000 1.887 sec not computed
X The vector of observations to be ranked.
N The integer number of observations in the vector X.
The maximum allowable value of N for this subroutine is 7500.
XR The vector into which the ranks of the X values will be placed
(in ascending order)
Sample program:
program demo_rank
use M_datapac, only : rank
implicit none
! call rank(x,y)
end program demo_rank
Results:
The original DATAPAC library was written by James Filliben of the
Statistical Engineering Division, National Institute of Standards
and Technology.
John Urban, 2022.05.31
CC0-1.0
Type | Intent | Optional | Attributes | Name | ||
---|---|---|---|---|---|---|
real(kind=wp), | intent(in) | :: | X(:) | |||
integer, | intent(in) | :: | N | |||
real(kind=wp), | intent(out) | :: | Xr(:) |
Type | Attributes | Name | Initial | |||
---|---|---|---|---|---|---|
real | :: | XS(7500) |
SUBROUTINE RANK(X,N,Xr) REAL(kind=wp),intent(in) :: X(:) INTEGER,intent(in) :: N REAL(kind=wp),intent(out) :: Xr(:) REAL(kind=wp) :: an , avrank , hold , rprev , xprev , XS INTEGER :: i , ibran , iupper , j , jmin , jp1 , k , nm1 COMMON /BLOCK4_real32/ XS(7500) !--------------------------------------------------------------------- an = N iupper = 7500 ! ! CHECK THE INPUT ARGUMENTS FOR ERRORS ! IF ( N<1 .OR. N>iupper ) THEN WRITE (G_IO,99001) iupper 99001 FORMAT(& & ' ***** FATAL ERROR--THE SECOND INPUT ARGUMENT TO RANK(3f) IS OUTSIDE THE ALLOWABLE (1,',I0,') INTERVAL *****') WRITE (G_IO,99002) N 99002 FORMAT (' ','***** THE VALUE OF THE ARGUMENT IS ',I0,' *****') RETURN ELSE IF ( N==1 ) THEN WRITE (G_IO,99003) 99003 FORMAT (' ***** NON-FATAL DIAGNOSTIC--THE SECOND INPUT ARGUMENT TO RANK(3f) HAS THE VALUE 1 *****') Xr(1) = 1.0_wp RETURN ELSE hold = X(1) DO i = 2 , N IF ( X(i)/=hold ) GOTO 50 ENDDO WRITE (G_IO,99004) hold 99004 FORMAT (' ***** NON-FATAL DIAGNOSTIC--THE FIRST INPUT ARGUMENT (A VECTOR) TO RANK(3f) HAS ALL ELEMENTS = ', & & E15.8,' *****') avrank = (an+1.0_wp)/2.0_wp DO i = 1 , N Xr(i) = avrank ENDDO RETURN ENDIF ! !-----START POINT----------------------------------------------------- ! ! FIRST SORT THE DATA FROM THE INPUT VECTOR X ! INTO THE INTERMEDIATE STORAGE VECTOR XS. ! 50 CALL SORT(X,N,XS) ! ! NOW DETERMINE THE RANKS. ! THE BASIC ALGORITHM IS TO TAKE A GIVEN ELEMENT ! IN THE ORIGINAL INPUT VECTOR X, ! AND SCAN THE SORTED VALUES IN THE XS VECTOR ! UNTIL A MATCH IS FOUND; ! WHEN A MATCH IS FOUND, THEN THE RANK FOR THAT ! VALUE IN THE XS VECTOR IS DETERMINED. ! THAT RANK IS THEN WRITTEN INTO THAT POSITION ! IN THE OUTPUT Y VECTOR WHICH CORRESPONDS TO THE POSITION OF THE ! GIVEN ELEMENT OF INTEREST IN THE ORIGINAL X VECTOR. ! THE CODE IS LENGTHENED FROM THIS BASIC ALGORITHM ! BY A SECTION WHICH CUTS DOWN THE SEARCH IN THE XS VECTOR, ! AND BY A SECTION WHICH OBVIATES (UNDER CERTAIN CIRCUMSTANCES) ! THE NEED FOR RECALCULATING THE RANK OF AN ELEMENT IN XS. ! nm1 = N - 1 xprev = X(1) DO i = 1 , N jmin = 1 IF ( X(i)>xprev ) THEN jmin = k IF ( jmin>=N ) THEN IF ( jmin==N ) GOTO 60 ibran = 1 WRITE (G_IO,99007) ibran WRITE (G_IO,99005) jmin 99005 FORMAT (' ','JMIN = ',I0) STOP ENDIF ELSEIF ( i/=1 ) THEN IF ( X(i)==xprev ) THEN Xr(i) = rprev GOTO 80 ENDIF ENDIF DO j = jmin , nm1 IF ( X(i)==XS(j) ) THEN jp1 = j + 1 DO k = jp1 , N IF ( XS(k)/=XS(j) ) GOTO 55 ENDDO k = N + 1 55 avrank = j + k - 1 avrank = avrank/2.0_wp Xr(i) = avrank GOTO 80 ENDIF ENDDO 60 j = N k = N + 1 IF ( X(i)==XS(j) ) THEN Xr(i) = N ELSE ibran = 2 WRITE (G_IO,99007) ibran WRITE (G_IO,99006) X(i) , XS(j) 99006 FORMAT (' ','X(I) = ',F15.7,' XS(J) = ',F15.7) STOP ENDIF 80 xprev = X(i) rprev = Xr(i) ENDDO ENDIF 99007 FORMAT (' ','*****INTERNAL ERROR IN RANK SUBROUTINE-- IMPOSSIBLE BRANCH CONDITION AT BRANCH POINT = ',I0) END SUBROUTINE RANK