C Library Functions - M_orderpack (3)

NAME

M_orderpack(3f) - [M_orderpack::INTRO]General and Specialized Ranking and Sorting Routines

Synopsis
Description
Sorting
Ranking
Unique
Multiplicity
Permutation
Rationale
Introduction
Ranking Versus Sorting
Ranking
Optimization Choices
Interface
A Word Of Apology
Authors
Maintainers
License

SYNOPSIS

Procedure names and syntax:

    use M_orderpack, only : &
     Sort,             & ! Subroutine Sort             (INOUTVALS)
     Sort_Special,     & ! Subroutine Sort_Special     (INOUTVALS)
     Psort,            & ! Subroutine Psort            (INOUTVALS, nord)

     Rank,             & ! Subroutine Rank             (INVALS, imult)
     Rank_Basic,       & ! Subroutine Rank_Basic       (INVALS, irngt)
     Rank_Decreasing,  & ! Subroutine Rank_Decreasing  (INVALS, igoest)
     Rank_Unique,      & ! Subroutine Rank_Unique      (INVALS, irngt, nuni)

     Prank,            & ! Subroutine Prank            (INVALS, irngt, nord)
     Prank_Basic,      & ! Subroutine Prank_Basic      (INVALS, irngt, nord)
     Prank_Decreasing, & ! Subroutine Prank_Decreasing (INVALS, irngt, nord)
     Prank_Special,    & ! Subroutine Prank_Special,   (INVALS, irngt, nord)
     Prank_Unique,     & ! Subroutine Prank_Unique     (INVALS, irngt, nord)

     Median,           & ! Function Median             (INVALS)
     MedianVal,        & ! Function MedianVal          (INVALS)
     MedianLoc,        & ! Subroutine MedianLoc        (INVALS, indm)

     Orderval,         & ! Function OrderVal           (INVALS, nord)
     OrderLoc,         & ! Integer Function OrderLoc   (INVALS, nord)
     Orderval_Special, & ! Function OrderVal_Special   (INVALS, nord)

     Occurrences,      & ! Subroutine Occurrences      (INVALS, imult)
     Unique,           & ! Subroutine Unique           (INOUTVALS, nuni)
     Perturb             ! Subroutine Perturb          (INOUTVALS, CLOSENESS)

The procedures may be accessed via their original names in ORDERPACK2.0 as well, one per module:

! previous ORDERPACK2.0 name ! ORDERPACK 2.1 name use M_orderpack__refsor, only : refsor ! Sort use M_orderpack__inssor, only : inssor ! Sort_special use M_orderpack__inspar, only : inspar ! psort use M_orderpack__mrgrnk, only : mrgrnk ! rank use M_orderpack__mrgref, only : mrgref ! rank_basic use M_orderpack__uniinv, only : uniinv ! rank_decreasing use M_orderpack__unirnk, only : unirnk ! rank_unique use M_orderpack__rnkpar, only : rnkpar ! prank use M_orderpack__refpar, only : refpar ! prank_basic use M_orderpack__rapknr, only : rapknr ! prank_decreasing use M_orderpack__rinpar, only : rinpar ! prank_special use M_orderpack__unipar, only : unipar ! prank_unique use M_orderpack__median, only : median ! median use M_orderpack__valmed, only : valmed ! medianval use M_orderpack__indmed, only : indmed ! medianloc use M_orderpack__valnth, only : valnth ! orderval use M_orderpack__indnth, only : indnth ! orderloc use M_orderpack__fndnth, only : fndnth ! orderval_special use M_orderpack__mulcnt, only : mulcnt ! occurrences use M_orderpack__unista, only : unista ! unique use M_orderpack__ctrper, only : ctrper ! perturb

DESCRIPTION

M_ORDERPACK 2.1 - Unconditional, Unique and Partial Ranking, Sorting, and Permutation
M_ORDERPACK 2.1 performs both conventional sorting and ranking as well as the rarer specialized ordering tasks such as partial sorting, partial ranking, unique sorting, unique ranking, inverse unique ranking, and more. These partial sort and ranking routines can greatly accelerate many computations when users need only the M largest or smallest elements out of a N-element vector.
All the specialized procedures have a range over which they far outperform a basic sort, and most have a range where they dramatically underperform. If you are not limited by memory requirements or have no issues with runtimes the simplest solution may be just to use SORT(3f) and RANK(3f).
Otherwise, your solution method may very well depend on the size of the input arrays, whether the data is already close to the required order, or how costly it is to create work arrays or an index array.
So, if you want the smallest value in an array call the intrinsic MINVAL(3f), not ORDERVAL(3f).

SORTING

FULL SORTING

Sort Sorts array into ascending order (Quick-sort)

Sort_Special
Sorts array into ascending order (Insertion sort, generally for small or nearly sorted arrays)

PARTIAL SORTING

Psort partially sorts an array

Orderval
Return VALUE of Nth lowest value of array (Quick-Sort)

Orderval_Special
Return Nth lowest value of an array (Insert-sort, generally for small or nearly sorted arrays))

MedianVal
finds the median of an array

Median Return median value of array. If number of elements is even, return average of the two "medians"

RANKING

UNCONDITIONAL RANKING

Rank ranks array (optimized merge-sort)

Rank_Basic
ranks array (basic merge-sort)

PARTIAL RANKING

Prank partially ranks array (Optimized Quick-Sort)

Prank_Basic
partially ranks array

Prank_Decreasing
partially ranks array in DECREASING order

Prank_Special
partially ranks array (Basic Insert-Sort)

Orderloc
Return INDEX of Nth value of array (Quick-Sort-like)

MedianLoc
Returns INDEX of median value of an array.

UNIQUE RANKING

Rank_Unique
performs a Merge-Sort ranking of an array, with removal of duplicate entries.

Rank_Decreasing
an inverse ranking of an array, with duplicate entries assigned the same rank.

Prank_Unique
partially rank an array removing duplicates

UNIQUE

Unique Removes duplicates from an array otherwise retaining original order

MULTIPLICITY

Occurrences
Give the multiplicity for each array value

PERMUTATION

Perturb
a random permutation of an array, optionally leaving elements close to initial locations

RATIONALE

While Fortran 90 and later variants have made life much easier for scientific programmers than Fortran 77, the language still lacks depth in public domain utilities. The following package, M_ORDERPACK 2.1, provides important but uncommon routines needed to complete the Fortran programming environment.

INTRODUCTION

The existing fortran code base provides many conventional ranking or sorting routines, but very few specialized ranking or sorting routines. Specifically, we know of no other Fortran code which sorts or ranks only a small proportion of an array (partial ordering). Such partial ranking routines have applications in statistics for rapidly computing extreme order statistics, finding nearest neighbors, and other clustering operations. In addition, many applications need to work with only the unique values in an array (unique ordering). Such unique ranking routines allow users to isolate individual cases out of a mass of discrete data. Many times the frequency of the unique values proves interesting (e.g., empirical distributions).
M_ORDERPACK handles all of these ordering needs.
Also, M_ORDERPACK contains a partial unique ranking routine. Such a routine would prove useful in finding a limited number of unique values in an array.
Inversion of orderings becomes difficult when duplicates exist (not a one-to-one relation). The M_ORDERPACK inverse ranking routine handles this difficult case.
As an added bonus M_ORDERPACK provides an unusual routine which allows user controllable partial random permutation of arrays.
M_ORDERPACK of course contains conventional or unconditional sorting routines as well.
Finally, many Fortran sorting or ranking routines do not take advantage of available memory and cache to maximize performance. The routines in M_ORDERPACK have been designed to take advantage of modern machines.

RANKING VERSUS SORTING

Ranking consists in finding, for each element of a set, its order (rank) in the sorted set, without effectively changing the initial order (or disorder! ) of the set. In many instances, it better suits the actual need of the user than sorting, as the ranks can then be used to order other related sets or components of a user type.
Ranking is especially needed when the sizes of the elements are large, and therefore moving them around is resource-consuming.

RANKING

In some instances, one is not actually interested in modifying the order of the elements in a set, but only in knowing how to access them in increasing -- or decreasing -- order. Ranking, as it is called, provides the index array I(:) such as the set S(I(:)) is ordered. One of the advantages of carrying out ranking rather than sorting is that the index array can be computed without the performance penalty of moving the elements around when they are of large sizes. A similar point is that the index array can be used to index other data.

OPTIMIZATION CHOICES

We tried to take into account the recent trends in computing to make our compromise choices. Of course, no two problems are the same, and for some of them the following decisions may happen to be wrong. We just hope that for most cases, they will be right.

o Make extensive use of work arrays: Memory can be extended, time cannot.

o Try to reduce the number of operations in the inner loops, even if it increases code size.

o Assume that cache size is relatively small, and try to maximize cache hits.

INTERFACE

Robust routines make their interface known to the calling program. There are three main ways to implement this in Fortran:

o Explicit interfaces, either included in the body of the calling routine, or gathered in an ’interface module’. An example of including an interface block in the calling program can be found in the sample program sort7.f90.

o Embedding the routine of interest as a "contained routine" into the calling procedure. An example of such way can be found in the follow.f90 program, that rebuilds a curve from a set of X, Y coordinates.

o Embedding the routine of interest into a MODULE, and USEing that module in the procedure that calls the routine. This creates order dependencies when compiling code, generally resulting in requiring such tools as Makefiles but has many other benefits, such as most easily allowing for generic versions of the routines, This is the way we used here. An example of use is provided as the test program tstvalnth.f90.

A WORD OF APOLOGY

When one looks at the description of a sorting algorithm, the process seems pretty simple, and can usually be held in 10 to 20 lines of pseudo-code. But if one wants an optimized program, one takes this simple implementation, and looks for redundant operations, investigates runs with sample data sets with a profiling tool, and is led to duplicate code with slight modifications rather than use tests in inner loops, to process differently the first and the last iterations, or to take into account some special cases that are only special in that they can be done faster.
In the end, the number of lines of source code may be multiplied tenfold, and the readability decreased in a similar proportion. Unfortunately, this is the price to pay for speed of execution. It was that way when I started programming more than 20 years ago, and I have forsaken any hope that it might become otherwise before I return to dust. So please accept my apologies that this code is often complex and difficult to read.

AUTHORS

Michel Olagnon IFREMER Brest / Michel.Olagnon@ifremer.fr
2000- 2013/11/06

MAINTAINERS

John S. Urban, 2022-04-16

LICENSE

CC0-1.0

Nemo Release 3.1

M_orderpack (3)

February 23, 2025

Generated by manServer 1.08 from 7915ee66-426e-4a8e-9704-54499ac97b13 using man macros.

Sort	Sorts array into ascending order (Quick-sort)
Sort_Special
	Sorts array into ascending order (Insertion sort, generally for small or nearly sorted arrays)

Psort	partially sorts an array
Orderval
	Return VALUE of Nth lowest value of array (Quick-Sort)
Orderval_Special
	Return Nth lowest value of an array (Insert-sort, generally for small or nearly sorted arrays))
MedianVal
	finds the median of an array
Median	Return median value of array. If number of elements is even, return average of the two "medians"

Rank	ranks array (optimized merge-sort)
Rank_Basic
	ranks array (basic merge-sort)

Prank	partially ranks array (Optimized Quick-Sort)
Prank_Basic
	partially ranks array
Prank_Decreasing
	partially ranks array in DECREASING order
Prank_Special
	partially ranks array (Basic Insert-Sort)
Orderloc
	Return INDEX of Nth value of array (Quick-Sort-like)
MedianLoc
	Returns INDEX of median value of an array.

Rank_Unique
	performs a Merge-Sort ranking of an array, with removal of duplicate entries.
Rank_Decreasing
	an inverse ranking of an array, with duplicate entries assigned the same rank.
Prank_Unique
	partially rank an array removing duplicates

Perturb
	a random permutation of an array, optionally leaving elements close to initial locations

o	Make extensive use of work arrays: Memory can be extended, time cannot.
o	Try to reduce the number of operations in the inner loops, even if it increases code size.
o	Assume that cache size is relatively small, and try to maximize cache hits.

o	Explicit interfaces, either included in the body of the calling routine, or gathered in an ’interface module’. An example of including an interface block in the calling program can be found in the sample program sort7.f90.
o	Embedding the routine of interest as a "contained routine" into the calling procedure. An example of such way can be found in the follow.f90 program, that rebuilds a curve from a set of X, Y coordinates.
o	Embedding the routine of interest into a MODULE, and USEing that module in the procedure that calls the routine. This creates order dependencies when compiling code, generally resulting in requiring such tools as Makefiles but has many other benefits, such as most easily allowing for generic versions of the routines, This is the way we used here. An example of use is provided as the test program tstvalnth.f90.

C Library Functions - M_orderpack (3)

NAME

CONTENTS

SYNOPSIS

DESCRIPTION

SORTING

FULL SORTING

PARTIAL SORTING

RANKING

UNCONDITIONAL RANKING

PARTIAL RANKING

UNIQUE RANKING

UNIQUE

MULTIPLICITY

PERMUTATION

RATIONALE

INTRODUCTION

RANKING VERSUS SORTING

RANKING

OPTIMIZATION CHOICES

INTERFACE

A WORD OF APOLOGY

AUTHORS

MAINTAINERS

LICENSE