Package squidpony.squidmath
Class CrossHash
java.lang.Object
squidpony.squidmath.CrossHash
public class CrossHash extends Object
64-bit and 32-bit hashing functions that we can rely on staying the same cross-platform.
Several algorithms are present here, each with some tradeoffs for performance, quality,
and extra features. Each algorithm was designed for speed and general-purpose usability,
but not cryptographic security.
The hashes this returns are always 0 when given null to hash. Arrays with identical elements of identical types will hash identically. Arrays with identical numerical values but different types will sometimes hash differently. This class always provides 64-bit hashes via hash64() and 32-bit hashes via hash(), and Wisp provides a hash32() method that matches older behavior and uses only 32-bit math. The hash64() and hash() methods, except in Hive, use 64-bit math even when producing 32-bit hashes, for GWT reasons. GWT doesn't have the same behavior as desktop and Android applications when using ints because it treats doubles mostly like ints, sometimes, due to it using JavaScript. If we use mainly longs, though, GWT emulates the longs with a more complex technique behind-the-scenes, that behaves the same on the web as it does on desktop or on a phone. Since CrossHash is supposed to be stable cross-platform, this is the way we need to go, despite it being slightly slower.
The static methods in CrossHash, like
IHasher values are provided as static fields, and use Water to hash a specific type or fall back to Object.hashCode if given an object with the wrong type. IHasher values are optional parts of OrderedMap, OrderedSet, Arrangement, and the various classes that use Arrangement like K2 and K2V1, and allow arrays to be used as keys in those collections while keeping hashing by value instead of the normal hashing by reference for arrays. You probably won't ever need to make a class that implements IHasher yourself; for some cases you may want to look at the
Note: This class was formerly called StableHash, but since that refers to a specific category of hashing algorithm that this is not, and since the goal is to be cross- platform, the name was changed to CrossHash. Note 2: FNV-1a was removed from SquidLib on July 25, 2017, and replaced as default with Wisp; Wisp was later replaced as default by Hive, and in June 2019 Hive was replaced by Water. Wisp was used because at the time SquidLib preferred 64-bit math when math needed to be the same across platforms; math on longs behaves the same on GWT as on desktop, despite being slower. Hive passed an older version of SMHasher, a testing suite for hashes, where Wisp does not (it fails just like Arrays.hashCode() does). Hive uses a cross-platform subset of the possible 32-bit math operations when producing 32-bit hashes of data that doesn't involve longs or doubles, and this should speed up the CrossHash.Hive.hash() methods a lot on GWT, but it slows down 32-bit output on desktop-class JVMs. Water became the default when newer versions of SMHasher showed that Hive wasn't as high-quality as it had appeared, and the recently-debuted wyhash by Wang Yi, a variation on a hash called MUM, opened some possibilities for structures that are simple but also very fast. Water is modeled after wyhash and uses the same constants in its hash64() methods, but avoids the 128-bit multiplication that wyhash uses. Because both wyhash and Water operate on 4 items at a time, they tend to be very fast on desktop platforms, but Water probably won't be amazing at GWT performance. Similarly, the recently-added Curlup performs very well due to SIMD optimizations that HotSpot performs, and probably won't do as well on GWT or Android.
Created by Tommy Ettinger on 1/16/2016.
The hashes this returns are always 0 when given null to hash. Arrays with identical elements of identical types will hash identically. Arrays with identical numerical values but different types will sometimes hash differently. This class always provides 64-bit hashes via hash64() and 32-bit hashes via hash(), and Wisp provides a hash32() method that matches older behavior and uses only 32-bit math. The hash64() and hash() methods, except in Hive, use 64-bit math even when producing 32-bit hashes, for GWT reasons. GWT doesn't have the same behavior as desktop and Android applications when using ints because it treats doubles mostly like ints, sometimes, due to it using JavaScript. If we use mainly longs, though, GWT emulates the longs with a more complex technique behind-the-scenes, that behaves the same on the web as it does on desktop or on a phone. Since CrossHash is supposed to be stable cross-platform, this is the way we need to go, despite it being slightly slower.
The static methods in CrossHash, like
hash64(int[])
, delegate to the CrossHash.Water
algorithm. This is a fairly fast and heavily-tested hash that developed from something like
Wang Yi's wyhash algorithm, though only the constants and the general concept of a mum()
function are shared with wyhash. There are several static inner classes in CrossHash
CrossHash.Water
(already mentioned), CrossHash.Yolk
(which is very close to Water but allows a
64-bit salt or seed), CrossHash.Curlup
(which is the fastest hash here for larger inputs, and
also allows a 64-bit seed), CrossHash.Mist
(which allows a 128-bit salt, but has mediocre
quality), CrossHash.Hive
(which is mostly here for compatibility, but has OK quality and good
collision rates), and CrossHash.Wisp
(which is fast for small inputs but has bad collision
rates). There's also the inner IHasher interface, and the classes that implement it.
Water, Yolk, and Curlup all pass the rigorous SMHasher test battery. The others don't pass
it in full, or sometimes at all.
IHasher values are provided as static fields, and use Water to hash a specific type or fall back to Object.hashCode if given an object with the wrong type. IHasher values are optional parts of OrderedMap, OrderedSet, Arrangement, and the various classes that use Arrangement like K2 and K2V1, and allow arrays to be used as keys in those collections while keeping hashing by value instead of the normal hashing by reference for arrays. You probably won't ever need to make a class that implements IHasher yourself; for some cases you may want to look at the
Hashers
class for additional functions.
Note: This class was formerly called StableHash, but since that refers to a specific category of hashing algorithm that this is not, and since the goal is to be cross- platform, the name was changed to CrossHash. Note 2: FNV-1a was removed from SquidLib on July 25, 2017, and replaced as default with Wisp; Wisp was later replaced as default by Hive, and in June 2019 Hive was replaced by Water. Wisp was used because at the time SquidLib preferred 64-bit math when math needed to be the same across platforms; math on longs behaves the same on GWT as on desktop, despite being slower. Hive passed an older version of SMHasher, a testing suite for hashes, where Wisp does not (it fails just like Arrays.hashCode() does). Hive uses a cross-platform subset of the possible 32-bit math operations when producing 32-bit hashes of data that doesn't involve longs or doubles, and this should speed up the CrossHash.Hive.hash() methods a lot on GWT, but it slows down 32-bit output on desktop-class JVMs. Water became the default when newer versions of SMHasher showed that Hive wasn't as high-quality as it had appeared, and the recently-debuted wyhash by Wang Yi, a variation on a hash called MUM, opened some possibilities for structures that are simple but also very fast. Water is modeled after wyhash and uses the same constants in its hash64() methods, but avoids the 128-bit multiplication that wyhash uses. Because both wyhash and Water operate on 4 items at a time, they tend to be very fast on desktop platforms, but Water probably won't be amazing at GWT performance. Similarly, the recently-added Curlup performs very well due to SIMD optimizations that HotSpot performs, and probably won't do as well on GWT or Android.
Created by Tommy Ettinger on 1/16/2016.
- Author:
- Tommy Ettinger
-
Nested Class Summary
Nested Classes Modifier and Type Class Description static class
CrossHash.Curlup
Like Yolk, this is a class for hash functors, each an object with a 64-bit long seed.static class
CrossHash.Hive
A reasonably-fast hashing function that passes some of SMHasher's quality tests, but neither critically fails nor overwhelmingly succeeds the full SMHasher test battery.static interface
CrossHash.IHasher
An interface that can be used to move the logic for the hashCode() and equals() methods from a class' methods to an implementation of IHasher that certain collections in SquidLib can use.static class
CrossHash.Lightning
A quick, simple hashing function that seems to have good results.static class
CrossHash.Mist
A whole cluster of Wisp-like hash functions that sacrifice a small degree of speed, but can be built with up to 128 bits of salt values that help to obscure what hashing function is actually being used.static class
CrossHash.Water
A fairly fast hashing algorithm in general, Water performs especially well on large arrays, and passes SMHasher's newest and most stringent version of tests.static class
CrossHash.Wisp
The fastest hash in CrossHash, with middling quality.static class
CrossHash.Yolk
Like Mist, this is a class for hash functors, each an object with a 64-bit long seed, but it uses about the same algorithm asCrossHash.Water
instead of the older, less-robust style Mist uses. -
Field Summary
-
Constructor Summary
Constructors Constructor Description CrossHash()
-
Method Summary
Modifier and Type Method Description static boolean
equalityHelper(Object[] left, Object[] right, CrossHash.IHasher inner)
Not a general-purpose method; meant to ease implementation ofCrossHash.IHasher.areEqual(Object, Object)
methods when the type being compared is a multi-dimensional array (which normally requires the heavyweight methodArrays.deepEquals(Object[], Object[])
or doing more work yourself; this reduces the work needed to implement fixed-depth equality).static int
hash(boolean[] data)
static int
hash(byte[] data)
static int
hash(char[] data)
static int
hash(char[][] data)
static int
hash(char[] data, int start, int end)
Hashes only a subsection of the given data, starting at start (inclusive) and ending before end (exclusive).static int
hash(double[] data)
static int
hash(float[] data)
static int
hash(int[] data)
static int
hash(int[][] data)
static int
hash(long[] data)
static int
hash(long[][] data)
static int
hash(short[] data)
static int
hash(CharSequence data)
static int
hash(CharSequence[] data)
static int
hash(CharSequence[]... data)
static int
hash(CharSequence data, int start, int end)
Hashes only a subsection of the given data, starting at start (inclusive) and ending before end (exclusive).static int
hash(Iterable<? extends CharSequence> data)
static int
hash(Object data)
static int
hash(Object[] data)
static int
hash(List<? extends CharSequence> data)
static long
hash64(boolean[] data)
static long
hash64(byte[] data)
static long
hash64(char[] data)
static long
hash64(char[][] data)
static long
hash64(char[] data, int start, int end)
Hashes only a subsection of the given data, starting at start (inclusive) and ending before end (exclusive).static long
hash64(double[] data)
static long
hash64(float[] data)
static long
hash64(int[] data)
static long
hash64(int[][] data)
static long
hash64(long[] data)
static long
hash64(long[][] data)
static long
hash64(short[] data)
static long
hash64(CharSequence data)
static long
hash64(CharSequence[] data)
static long
hash64(CharSequence[]... data)
static long
hash64(CharSequence data, int start, int end)
Hashes only a subsection of the given data, starting at start (inclusive) and ending before end (exclusive).static long
hash64(Iterable<? extends CharSequence> data)
static long
hash64(Object data)
static long
hash64(Object[] data)
static long
hash64(List<? extends CharSequence> data)
-
Field Details
-
booleanHasher
-
byteHasher
-
shortHasher
-
charHasher
-
intHasher
-
longHasher
-
floatHasher
-
doubleHasher
-
char2DHasher
-
int2DHasher
-
long2DHasher
-
stringHasher
-
stringArrayHasher
Though the name suggests this only hashes String arrays, it can actually hash any CharSequence array as well. -
objectArrayHasher
-
defaultHasher
-
mildHasher
The most basic IHasher type; effectively delegates toObjects.hashCode(Object)
andObjects.equals(Object, Object)
. Might not scramble the bits of a hash well enough to have good performance in a hash table lkeOrderedMap
orUnorderedSet
, unless the objects being hashed have good hashCode() implementations already. -
identityHasher
-
generalHasher
This IHasher is the one you should use if you aren't totally certain what types will go in an OrderedMap's keys or an OrderedSet's items, since it can handle mixes of elements.
-
-
Constructor Details
-
Method Details
-
hash64
-
hash64
-
hash64
-
hash64
-
hash64
-
hash64
-
hash64
-
hash64
-
hash64
-
hash64
Hashes only a subsection of the given data, starting at start (inclusive) and ending before end (exclusive).- Parameters:
data
- the char array to hashstart
- the start of the section to hash (inclusive)end
- the end of the section to hash (exclusive)- Returns:
- a 64-bit hash code for the requested section of data
-
hash64
Hashes only a subsection of the given data, starting at start (inclusive) and ending before end (exclusive).- Parameters:
data
- the String or other CharSequence to hashstart
- the start of the section to hash (inclusive)end
- the end of the section to hash (exclusive)- Returns:
- a 64-bit hash code for the requested section of data
-
hash64
-
hash64
-
hash64
-
hash64
-
hash64
-
hash64
-
hash64
-
hash64
-
hash64
-
hash
-
hash
-
hash
-
hash
-
hash
-
hash
-
hash
-
hash
-
hash
-
hash
Hashes only a subsection of the given data, starting at start (inclusive) and ending before end (exclusive).- Parameters:
data
- the char array to hashstart
- the start of the section to hash (inclusive)end
- the end of the section to hash (exclusive)- Returns:
- a 32-bit hash code for the requested section of data
-
hash
Hashes only a subsection of the given data, starting at start (inclusive) and ending before end (exclusive).- Parameters:
data
- the String or other CharSequence to hashstart
- the start of the section to hash (inclusive)end
- the end of the section to hash (exclusive)- Returns:
- a 32-bit hash code for the requested section of data
-
hash
-
hash
-
hash
-
hash
-
hash
-
hash
-
hash
-
hash
-
hash
-
equalityHelper
Not a general-purpose method; meant to ease implementation ofCrossHash.IHasher.areEqual(Object, Object)
methods when the type being compared is a multi-dimensional array (which normally requires the heavyweight methodArrays.deepEquals(Object[], Object[])
or doing more work yourself; this reduces the work needed to implement fixed-depth equality). As mentioned in the docs forCrossHash.IHasher.areEqual(Object, Object)
, example code that hashes 2D char arrays can be done using an IHasher for 1D char arrays called charHasher:return left == right || ((left instanceof char[][] && right instanceof char[][]) ? equalityHelper((char[][]) left, (char[][]) right, charHasher) : Objects.equals(left, right));
- Parameters:
left
- an array of some kind of Object, usually an array, that the given IHasher can compareright
- an array of some kind of Object, usually an array, that the given IHasher can compareinner
- an IHasher to compare items in left with items in right- Returns:
- true if the contents of left and right are equal by the given IHasher, otherwise false
-