NAME
sdbm_hash(3f) - [M_hashkeys:bucket_hash] sdbm string hash
(LICENSE:PD)
SYNOPSIS
use,intrinsic : ISO_FORTRAN_ENV, only : int64
function sdbm_hash_arr(anything,continue) result(hash_128)
class(*),intent(in) :: anything(:)
logical,intent(in),optional :: continue
integer(kind=int128) :: hash_128
DESCRIPTION
sdbm_hash(3f) is based on the string hash routine commonly known as
sdbm(3c).
this algorithm was created for the sdbm (a public-domain
reimplementation of ndbm) database library. It was found to do well
in scrambling bits, causing good distribution of the keys and fewer
splits. it also happens to be a good general hashing function with
good distribution. the actual function is
hash(i) = hash(i - 1) * 65599 + str[i]
what is available here is the faster version used
in gawk. [there is even a faster, duff-device version]. The magic
constant 65599 was picked out of thin air while experimenting with
different constants, and turns out to be a prime. this is one of the
algorithms used in berkeley db (see sleepycat) and elsewhere.
This version returns a value calculated using a 64-bit hash, which
is returned as a 128bit value (not always available in Fortran) to
allow the value to always be a positive value; as Fortran does not
(currently) support a standard unsigned integer. If the value is
changed to be a 64-bit value on platforms that do not support 128-bit
INTEGER values the value may be negative, but is otherwise usable.
Such non-reversible hashes may be used for data or file fingerprints,
to confirm unchanging results during regression testing, ...
More information is widely available on string hashes (including the
well-known sdbm(3c) algorithm) on such sources as Wikipedia. Consult
such resources to confirm the suitability of this algorithm for
your use.
The algorithm does not consider the Endian of the programming
environment.
OPTIONS
STR May be a CHARACTER string or an array of common intrinsic
types. Currently, the types defined in the procedure
are character(len=*); complex; integer(kind=int8);
integer(kind=int16); integer(kind=int32); integer(kind=int64);
integer(kind=int128); real(kind=real32); real(kind=real64);
real(kind=real128).
CONTINUE indicate whether to continue accumulating the hash value
from the last call. This is not threadsafe. This allows
for continued hashes so that a hash can be calculated for
a series of calls.
RETURNS
sdbm_hash A 128-bit INTEGER hash value for the (possibly accumulated) data.
EXAMPLE
Sample program:
program demo_sdbm_hash
use M_hashkeys, only : sdbm_hash, int128
implicit none
integer(kind=int128) :: hash
character(len=:),allocatable :: string
integer :: i
! string
string='test sdbm_hash'
hash=sdbm_hash(string)
write(*,*)'string=',string,' hash=',hash
! array of characters
hash=sdbm_hash(['t','e','s','t',' ','s','d','b','m','_','h','a','s','h'])
write(*,*)'string=',string,' hash=',hash
! continued hash
hash=sdbm_hash(['t','e','s','t'])
hash=sdbm_hash([' ','s','d','b','m'],continue=.true.)
hash=sdbm_hash(['_','h','a','s','h'],continue=.true.)
write(*,*)'string=',string,' hash=',hash
! array of integers
hash=sdbm_hash([(i,i=0,100)])
write(*,*)'hash for values 0 to 100 is ',hash
!
end program demo_sdbm_hash
Arguments
Type |
Intent | Optional | Attributes |
|
Name |
|
class(*),
|
intent(in) |
|
|
:: |
anything(:) |
|
logical,
|
intent(in), |
optional |
|
:: |
continue |
|
Return Value
integer(kind=int128)