Class CrossHash

java.lang.Object
squidpony.squidmath.CrossHash

public class CrossHash
extends Object
64-bit and 32-bit hashing functions that we can rely on staying the same cross-platform. Several algorithms are present here, each with some tradeoffs for performance, quality, and extra features. Each algorithm was designed for speed and general-purpose usability, but not cryptographic security.
The hashes this returns are always 0 when given null to hash. Arrays with identical elements of identical types will hash identically. Arrays with identical numerical values but different types will sometimes hash differently. This class always provides 64-bit hashes via hash64() and 32-bit hashes via hash(), and Wisp provides a hash32() method that matches older behavior and uses only 32-bit math. The hash64() and hash() methods, except in Hive, use 64-bit math even when producing 32-bit hashes, for GWT reasons. GWT doesn't behave the same as desktop and Android applications when using ints, because it compiles to JavaScript and so sometimes treats ints mostly like doubles. If we use mainly longs, though, GWT emulates them behind the scenes with a more complex technique that behaves the same on the web as it does on desktop or on a phone. Since CrossHash is supposed to be stable cross-platform, this is the way we need to go, despite it being slightly slower.
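To see why plain int math is a portability risk, the following sketch simulates in ordinary Java what can go wrong when a 32-bit multiplication is carried out in double-precision floats, which is how JavaScript represents numbers. The class and method names are hypothetical and not part of CrossHash; this only approximates the failure mode and is not a model of the code GWT actually emits.

```java
// Hypothetical demo class; none of these names exist in SquidLib.
final class IntMathPortabilityDemo {
    /** Plain Java int multiplication: wraps modulo 2^32 on every JVM. */
    static int wrappingInt(int a, int b) {
        return a * b;
    }

    /** Multiplies as doubles, then keeps the low 32 bits of the rounded result. */
    static int viaDouble(int a, int b) {
        // A double is exact only up to 2^53, so low bits of a big product are rounded away.
        double product = (double) a * (double) b;
        return (int) (long) product;
    }

    /** Multiplies as longs (always exact for two ints), then keeps the low 32 bits. */
    static int viaLong(int a, int b) {
        return (int) ((long) a * (long) b);
    }

    public static void main(String[] args) {
        int a = Integer.MAX_VALUE; // 2^31 - 1
        System.out.println(wrappingInt(a, a)); // prints 1
        System.out.println(viaDouble(a, a));   // prints 0, because the low bits were lost
        System.out.println(viaLong(a, a));     // prints 1, matching the int result
    }
}
```

The long-based version agrees with wrapping int math because the exact product of two ints always fits in 64 bits, which is the property the paragraph above relies on.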
The static methods in CrossHash, like hash64(int[]), delegate to the CrossHash.Water algorithm. This is a fairly fast and heavily-tested hash that developed from something like Wang Yi's wyhash algorithm, though only the constants and the general concept of a mum() function are shared with wyhash. There are several static inner classes in CrossHash: CrossHash.Water (already mentioned), CrossHash.Yolk (which is very close to Water but allows a 64-bit salt or seed), CrossHash.Curlup (which is the fastest hash here for larger inputs, and also allows a 64-bit seed), CrossHash.Mist (which allows a 128-bit salt, but has mediocre quality), CrossHash.Hive (which is mostly here for compatibility, but has OK quality and good collision rates), and CrossHash.Wisp (which is fast for small inputs but has bad collision rates). There's also the inner IHasher interface, and the classes that implement it. Water, Yolk, and Curlup all pass the rigorous SMHasher test battery. The others don't pass it in full, or sometimes at all.
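As a rough illustration of the mum() idea (multiply two values, then mix the upper part of the product back into the lower part), the sketch below contrasts a fold over a full 128-bit product with a cheaper fold that needs only a 64-bit multiply. The class and both methods are assumptions made for illustration; they are not the code used by Water or by wyhash. The wide version uses Math.multiplyHigh (Java 9+), which returns the signed upper half, whereas wyhash works with unsigned values.

```java
// Hypothetical sketch; not the actual Water or wyhash implementation.
final class MumSketch {
    /** Folds a full 128-bit product: upper 64 bits XOR lower 64 bits. */
    static long mumWide(long a, long b) {
        return Math.multiplyHigh(a, b) ^ (a * b);
    }

    /** Uses only a 64-bit multiply: XORs the upper 32 bits of the truncated product onto it. */
    static long mumNarrow(long a, long b) {
        long n = a * b; // the upper 64 bits of the true product are discarded here
        return n ^ (n >>> 32);
    }

    public static void main(String[] args) {
        long x = 1L << 32;
        System.out.println(mumWide(x, x));    // prints 1: the product 2^64 survives in the upper half
        System.out.println(mumNarrow(x, x));  // prints 0: the product overflowed and was lost
        System.out.println(mumNarrow(x, 1L)); // prints 4294967297
    }
}
```

The narrow form discards part of the product, which is the kind of tradeoff involved in avoiding a 128-bit multiplication.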
IHasher values are provided as static fields, and use Water to hash a specific type or fall back to Object.hashCode if given an object with the wrong type. IHasher values are optional parts of OrderedMap, OrderedSet, Arrangement, and the various classes that use Arrangement like K2 and K2V1, and allow arrays to be used as keys in those collections while keeping hashing by value instead of the normal hashing by reference for arrays. You probably won't ever need to make a class that implements IHasher yourself; for some cases you may want to look at the Hashers class for additional functions.
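The difference between hashing arrays by value and by reference can be sketched without SquidLib. The ValueHasher interface below is a hypothetical stand-in for the role IHasher plays, and it uses java.util.Arrays where the IHasher fields described above use Water; the fallback to hashCode() for an object of the wrong type mirrors the behavior described above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical demo; ValueHasher is not SquidLib's IHasher.
final class ArrayKeyDemo {
    /** Stand-in for a pluggable hashing strategy that a collection could consult. */
    interface ValueHasher {
        int hash(Object data);
        boolean areEqual(Object left, Object right);
    }

    /** Hashes int[] by content; any other type falls back to its own hashCode(). */
    static final ValueHasher INT_ARRAY = new ValueHasher() {
        public int hash(Object data) {
            if (data == null) return 0;
            if (data instanceof int[]) return Arrays.hashCode((int[]) data);
            return data.hashCode();
        }

        public boolean areEqual(Object left, Object right) {
            if (left instanceof int[] && right instanceof int[])
                return Arrays.equals((int[]) left, (int[]) right);
            return Objects.equals(left, right);
        }
    };

    /** A plain HashMap compares array keys by reference, so an equal-content copy is not found. */
    static boolean hashMapFindsEqualArray() {
        Map<int[], String> map = new HashMap<>();
        map.put(new int[]{1, 2, 3}, "found");
        return map.get(new int[]{1, 2, 3}) != null;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3}, b = {1, 2, 3};
        System.out.println(hashMapFindsEqualArray());                // prints false
        System.out.println(INT_ARRAY.hash(a) == INT_ARRAY.hash(b));  // prints true
        System.out.println(INT_ARRAY.areEqual(a, b));                // prints true
    }
}
```

A collection that consults such a strategy object instead of calling hashCode() and equals() on its keys can treat two arrays with the same contents as the same key.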
Note: This class was formerly called StableHash, but since that refers to a specific category of hashing algorithm that this is not, and since the goal is to be cross-platform, the name was changed to CrossHash.
Note 2: FNV-1a was removed from SquidLib on July 25, 2017, and replaced as default with Wisp; Wisp was later replaced as default by Hive, and in June 2019 Hive was replaced by Water. Wisp was used because at the time SquidLib preferred 64-bit math when math needed to be the same across platforms; math on longs behaves the same on GWT as on desktop, despite being slower. Hive passed an older version of SMHasher, a testing suite for hashes, which Wisp does not pass (it fails just like Arrays.hashCode() does). Hive uses a cross-platform subset of the possible 32-bit math operations when producing 32-bit hashes of data that doesn't involve longs or doubles, and this should speed up the CrossHash.Hive.hash() methods a lot on GWT, but it slows down 32-bit output on desktop-class JVMs. Water became the default when newer versions of SMHasher showed that Hive wasn't as high-quality as it had appeared, and the recently-debuted wyhash by Wang Yi, a variation on a hash called MUM, opened some possibilities for structures that are simple but also very fast. Water is modeled after wyhash and uses the same constants in its hash64() methods, but avoids the 128-bit multiplication that wyhash uses. Because both wyhash and Water operate on 4 items at a time, they tend to be very fast on desktop platforms, but Water is unlikely to perform as well on GWT. Similarly, the recently-added Curlup performs very well due to SIMD optimizations that HotSpot performs, and probably won't do as well on GWT or Android.
Created by Tommy Ettinger on 1/16/2016.
Author:
Tommy Ettinger