sdbm_hash(3f) - [M_hashkeys:bucket_hash] sdbm string hash (LICENSE:PD)
Synopsis
Description
Options
Returns
Example
use,intrinsic : ISO_FORTRAN_ENV, only : int64 function sdbm_hash_arr(anything,continue) result(hash_128)
class(*),intent(in) :: anything(:) logical,intent(in),optional :: continue integer(kind=int128) :: hash_128
sdbm_hash(3f) is based on the string hash routine commonly known as sdbm(3c).this algorithm was created for the sdbm (a public-domain reimplementation of ndbm) database library. It was found to do well in scrambling bits, causing good distribution of the keys and fewer splits. it also happens to be a good general hashing function with good distribution. the actual function is
hash(i) = hash(i - 1) * 65599 + str[i]what is available here is the faster version used in gawk. [there is even a faster, duff-device version]. The magic constant 65599 was picked out of thin air while experimenting with different constants, and turns out to be a prime. this is one of the algorithms used in berkeley db (see sleepycat) and elsewhere.
This version returns a value calculated using a 64-bit hash, which is returned as a 128bit value (not always available in Fortran) to allow the value to always be a positive value; as Fortran does not (currently) support a standard unsigned integer. If the value is changed to be a 64-bit value on platforms that do not support 128-bit INTEGER values the value may be negative, but is otherwise usable.
Such non-reversible hashes may be used for data or file fingerprints, to confirm unchanging results during regression testing, ...
More information is widely available on string hashes (including the well-known sdbm(3c) algorithm) on such sources as Wikipedia. Consult such resources to confirm the suitability of this algorithm for your use.
The algorithm does not consider the Endian of the programming environment.
STR May be a CHARACTER string or an array of common intrinsic types. Currently, the types defined in the procedure are character(len=*); complex; integer(kind=int8); integer(kind=int16); integer(kind=int32); integer(kind=int64); integer(kind=int128); real(kind=real32); real(kind=real64); real(kind=real128). CONTINUE indicate whether to continue accumulating the hash value from the last call. This is not threadsafe. This allows for continued hashes so that a hash can be calculated for a series of calls.
sdbm_hash A 128-bit INTEGER hash value for the (possibly accumulated) data.
Sample program:
program demo_sdbm_hash use M_hashkeys, only : sdbm_hash, int128 implicit none integer(kind=int128) :: hash character(len=:),allocatable :: string integer :: i ! string string=test sdbm_hash hash=sdbm_hash(string) write(*,*)string=,string, hash=,hash ! array of characters hash=sdbm_hash([t,e,s,t, ,s,d,b,m,_,h,a,s,h]) write(*,*)string=,string, hash=,hash ! continued hash hash=sdbm_hash([t,e,s,t]) hash=sdbm_hash([ ,s,d,b,m],continue=.true.) hash=sdbm_hash([_,h,a,s,h],continue=.true.) write(*,*)string=,string, hash=,hash ! array of integers hash=sdbm_hash([(i,i=0,100)]) write(*,*)hash for values 0 to 100 is ,hash ! end program demo_sdbm_hash
Nemo Release 3.1 | sdbm_hash (3) | February 23, 2025 |