Class OrderedSet<K>
- All Implemented Interfaces:
Serializable, Cloneable, Iterable<K>, Collection<K>, Set<K>, SortedSet<K>
- Direct Known Subclasses:
EnumOrderedSet
public class OrderedSet<K> extends Object implements SortedSet<K>, Serializable, Cloneable
Instances of this class use a hash table to represent a set. The table is filled up to a specified load factor, and then doubled in size to accommodate new entries. If the table is emptied below one fourth of the load factor, it is halved in size. However, halving is not performed when deleting entries from an iterator, as it would interfere with the iteration process.
Note that clear() does not modify the hash table size. Rather, a
family of trimming methods lets you control the size of
the table; this is particularly useful if you reuse instances of this class.
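The growth and shrink rule described above can be sketched as follows. This is only an illustration under assumptions: the class name ResizePolicySketch, the helper names, and the exact comparisons (grow when the entry count exceeds the fill threshold, shrink when it drops below a quarter of that threshold) are hypothetical and are not taken from this class's source.

```java
class ResizePolicySketch {
    /** Assumed fill threshold for a table of n slots: n times f, rounded up. */
    static int threshold(int n, float f) {
        return (int) Math.ceil(n * (double) f);
    }

    /** Table size after an add leaves the set holding newSize entries. */
    static int afterAdd(int n, int newSize, float f) {
        // Double the table once the threshold is exceeded.
        return newSize > threshold(n, f) ? n * 2 : n;
    }

    /** Table size after a removal (not through an iterator) leaves newSize entries. */
    static int afterRemove(int n, int newSize, float f) {
        // Halve the table once it is emptied below one fourth of the threshold.
        return newSize < threshold(n, f) / 4 ? n / 2 : n;
    }

    public static void main(String[] args) {
        System.out.println(afterAdd(16, 13, 0.75f));
        System.out.println(afterRemove(64, 11, 0.75f));
    }
}
```

Because clear() leaves the table size alone, a sketch like this only models add and remove; shrinking after a clear is left to the trimming methods.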
Iterators generated by this set will enumerate elements in the same order in
which they have been added to the set (addition of elements already present
in the set does not change the iteration order). Note that this order has
nothing in common with the natural order of the keys. The order is kept by
means of an array list, represented via an IntVLA parallel to the
table that can be modified with methods like shuffle(IRNG).
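To illustrate the idea of keeping a membership structure and a parallel ordering side by side, here is a minimal sketch that uses only java.util classes. It is not how this class is implemented (the real class stores int positions in an IntVLA); InsertionOrderSketch and insertionOrder are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class InsertionOrderSketch {
    /** Adds each item once and remembers the order of first insertion. */
    static List<String> insertionOrder(String... items) {
        Set<String> seen = new HashSet<>();      // answers "is it present?"
        List<String> order = new ArrayList<>();  // remembers who arrived first
        for (String item : items) {
            // add() returns true only the first time an item appears, so
            // re-adding an existing item does not change the order.
            if (seen.add(item)) {
                order.add(item);
            }
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(insertionOrder("pear", "apple", "pear", "fig"));
    }
}
```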
This class implements the interface of a sorted set so as to allow easy access
to the iteration order: for instance, you can get the first element in
iteration order with first() without having to create an iterator.
However, this class partially violates the SortedSet
contract, because all subset methods throw an exception and
comparator() always returns null.
Additional methods, such as addAndMoveToFirst(), make it easy to
use instances of this class as a cache (e.g., with LRU policy).
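The LRU pattern mentioned above can be sketched with standard-library classes. This is an assumption-laden stand-in, not this class's API: LruSketch, touch, and contents are hypothetical names, and the ArrayDeque used here removes in linear time, unlike a hash-backed set.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class LruSketch {
    private final Deque<String> order = new ArrayDeque<>();
    private final int capacity;

    LruSketch(int capacity) {
        this.capacity = capacity;
    }

    /** Moves key to the front; returns true if the key was not present before. */
    boolean touch(String key) {
        boolean wasAbsent = !order.remove(key); // linear time in this sketch
        order.addFirst(key);                    // most recently used goes first
        if (order.size() > capacity) {
            order.removeLast();                 // evict the least recently used
        }
        return wasAbsent;
    }

    /** Snapshot of the keys from most to least recently used. */
    List<String> contents() {
        return new ArrayList<>(order);
    }
}
```

With the real class, the same shape would presumably combine addAndMoveToFirst() with removeLast() once size() exceeds the chosen capacity.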
This class allows approximately constant-time lookup of keys or values by their index in the ordering, which can
allow some novel usage of the data structure. OrderedSet can be used like a list of unique elements, keeping order
like a list does but also allowing rapid checks for whether an item exists in the OrderedSet, and OrderedMap
can be used like that but with values associated as well (where OrderedSet uses contains(), OrderedMap uses
containsKey()). You can also set the item at a position with addAt(Object, int), or alter an item while
keeping index the same with alter(Object, Object). Reordering works here too, both with completely random
orders from shuffle(IRNG) or with a previously-generated ordering from reorder(int...) (you can
produce such an ordering for a given size and reuse it across multiple Ordered data structures with
IRNG.randomOrdering(int)).
You can pass a CrossHash.IHasher instance such as CrossHash.generalHasher as an extra parameter to
most of this class's constructors, which allows the OrderedSet to use arrays (usually primitive arrays) as items. If
you expect only one type of array, you can use an instance like CrossHash.intHasher to hash int arrays, or
the aforementioned generalHasher to hash most kinds of arrays (it can't handle most multi-dimensional arrays well).
If you aren't using array items, you don't need to give an IHasher to the constructor and can ignore this feature.
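The reason arrays need a separate hasher is that Java arrays hash and compare by identity, not by content. The sketch below shows the problem and one content-based workaround using only java.util; IntArrayKey is a hypothetical wrapper written for this example and is not part of CrossHash.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ArrayKeySketch {
    /** Hypothetical wrapper that hashes and compares an int[] by its content. */
    static final class IntArrayKey {
        final int[] data;

        IntArrayKey(int[] data) {
            this.data = data;
        }

        @Override
        public int hashCode() {
            return Arrays.hashCode(data);
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof IntArrayKey && Arrays.equals(data, ((IntArrayKey) o).data);
        }
    }

    /** Two arrays with equal content are distinct items under identity hashing. */
    static int identityCount() {
        Set<int[]> set = new HashSet<>();
        set.add(new int[]{1, 2, 3});
        set.add(new int[]{1, 2, 3});
        return set.size();
    }

    /** With content-based hashing, the second array is recognized as a duplicate. */
    static int contentCount() {
        Set<IntArrayKey> set = new HashSet<>();
        set.add(new IntArrayKey(new int[]{1, 2, 3}));
        set.add(new IntArrayKey(new int[]{1, 2, 3}));
        return set.size();
    }
}
```

An IHasher passed to the constructor plays the role of the wrapper here, without requiring each array to be wrapped.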
Thank you, Sebastiano Vigna, for making FastUtil available to the public with such high quality.
See https://github.com/vigna/fastutil for the original library.
- Author:
- Sebastiano Vigna (responsible for all the hard parts), Tommy Ettinger (mostly responsible for squashing several layers of parent classes into one monster class)
- See Also:
- Serialized Form
-
Field Summary
Fields
Modifier and Type, Field - Description
protected boolean containsNull - Whether this set contains the null key.
static int DEFAULT_INITIAL_SIZE - The initial default size of a hash table.
static float DEFAULT_LOAD_FACTOR - The default load factor of a hash table.
float f - The acceptable load factor.
static float FAST_LOAD_FACTOR - The load factor for a (usually small) table that is meant to be particularly fast.
protected CrossHash.IHasher hasher
protected K[] key - The array of keys.
protected int mask - The mask for wrapping a position counter.
protected int maxFill - Threshold after which we rehash.
protected int n - The current table size.
protected IntVLA order - An IntVLA (variable-length int sequence) that stores the positions in the key array of specific keys, with the positions in insertion order.
protected int size - Number of entries in the set (including the null key, if present).
static float VERY_FAST_LOAD_FACTOR - The load factor for a (usually very small) table that is meant to be extremely fast.
-
Constructor Summary
Constructors
Constructor - Description
OrderedSet() - Creates a new hash set with initial expected DEFAULT_INITIAL_SIZE elements and DEFAULT_LOAD_FACTOR as load factor.
OrderedSet(int expected) - Creates a new hash set with DEFAULT_LOAD_FACTOR as load factor.
OrderedSet(int expected, float f) - Creates a new hash set.
OrderedSet(int expected, float f, CrossHash.IHasher hasher) - Creates a new hash set.
OrderedSet(int expected, CrossHash.IHasher hasher) - Creates a new hash set with DEFAULT_LOAD_FACTOR as load factor.
OrderedSet(Collection<? extends K> c) - Creates a new hash set with DEFAULT_LOAD_FACTOR as load factor copying a given collection.
OrderedSet(Collection<? extends K> c, float f) - Creates a new hash set copying a given collection.
OrderedSet(Collection<? extends K> c, float f, CrossHash.IHasher hasher) - Creates a new hash set copying a given collection.
OrderedSet(Collection<? extends K> c, CrossHash.IHasher hasher) - Creates a new hash set with DEFAULT_LOAD_FACTOR as load factor copying a given collection.
OrderedSet(Iterator<? extends K> i) - Creates a new hash set with DEFAULT_LOAD_FACTOR as load factor using elements provided by a type-specific iterator.
OrderedSet(Iterator<? extends K> i, float f) - Creates a new hash set using elements provided by a type-specific iterator.
OrderedSet(K[] a) - Creates a new hash set with DEFAULT_LOAD_FACTOR as load factor copying the elements of an array.
OrderedSet(K[] a, float f) - Creates a new hash set copying the elements of an array.
OrderedSet(K[] a, float f, CrossHash.IHasher hasher) - Creates a new hash set copying the elements of an array.
OrderedSet(K[] a, int offset, int length) - Creates a new hash set with DEFAULT_LOAD_FACTOR as load factor and fills it with the elements of a given array.
OrderedSet(K[] a, int offset, int length, float f) - Creates a new hash set and fills it with the elements of a given array.
OrderedSet(K[] a, int offset, int length, float f, CrossHash.IHasher hasher) - Creates a new hash set and fills it with the elements of a given array.
OrderedSet(K[] a, int offset, int length, CrossHash.IHasher hasher) - Creates a new hash set with DEFAULT_LOAD_FACTOR as load factor and fills it with the elements of a given array.
OrderedSet(K[] a, CrossHash.IHasher hasher) - Creates a new hash set with DEFAULT_LOAD_FACTOR as load factor copying the elements of an array.
OrderedSet(CrossHash.IHasher hasher) - Creates a new hash set with DEFAULT_LOAD_FACTOR as load factor.
-
Method Summary
Modifier and Type, Method - Description
boolean add(K k)
boolean addAll(Collection<? extends K> c)
boolean addAll(K[] a)
boolean addAndMoveToFirst(K k) - Adds a key to the set; if the key is already present, it is moved to the first position of the iteration order.
boolean addAndMoveToLast(K k) - Adds a key to the set; if the key is already present, it is moved to the last position of the iteration order.
boolean addAt(K k, int idx)
K addOrGet(K k) - Adds the given element if not present, or gets the existing value if already present.
boolean alter(K original, K replacement) - Changes a K, original, to another, replacement, while keeping replacement at the same point in the ordering.
boolean alterAt(int index, K replacement) - Changes the K at the given index to replacement while keeping replacement at the same point in the ordering.
static int arraySize(int expected, float f) - Returns the least power of two smaller than or equal to 2^30 and larger than or equal to Math.ceil( expected / f ).
void clear()
Object clone() - Returns a deep copy of this set.
Comparator<? super K> comparator()
boolean contains(Object k)
boolean containsAll(Collection<?> c) - Checks whether this collection contains all elements from the given collection.
boolean equals(Object o)
K first() - Returns the first element of this set in iteration order.
protected int fixOrder(int i) - Modifies the link vector so that the given entry is removed.
protected void fixOrder(int s, int d) - Modifies the ordering for a shift from s to d.
K get(Object k) - Returns the element of this set that is equal to the given key, or null.
K getAt(int idx) - Gets the item at the given index in the iteration order in constant time (random-access).
long hash64()
int hashCode() - Returns a hash code for this set.
SortedSet<K> headSet(K to)
int indexOf(Object k) - Gets the position in the ordering of the given key, though not as efficiently as some data structures can do it (e.g. Arrangement).
boolean isEmpty()
ListIterator<K> iterator()
K last() - Returns the last element of this set in iteration order.
static int maxFill(int n, float f) - Returns the maximum number of entries that can be filled before rehashing.
static long maxFill(long n, float f) - Returns the maximum number of entries that can be filled before rehashing.
protected int positionOf(Object k)
K randomItem(IRNG rng) - Gets a random value from this OrderedSet in constant time, using the given IRNG to generate a random number.
protected void rehash(int newN) - Rehashes the set.
protected boolean rem(Object k)
boolean remove(Object o)
boolean removeAll(Collection<?> c) - Removes from this collection all elements in the given collection.
boolean removeAt(int idx) - Removes the item at the given index in the iteration order in not-exactly constant time (though it still should be efficient).
K removeFirst() - Removes the first key in iteration order.
K removeLast() - Removes the last key in iteration order.
OrderedSet<K> reorder(int... ordering) - Given an array or varargs of replacement indices for this OrderedSet's iteration order, reorders this so the first item in the returned version is the same as getAt(ordering[0]) (with some care taken for negative or too-large indices), the second item in the returned version is the same as getAt(ordering[1]), etc.
boolean retainAll(Collection<?> c) - Retains in this collection only elements from the given collection.
void reverse() - Reverses the iteration order in linear time.
protected void shiftKeys(int pos) - Shifts left entries with the specified hash code, starting at the specified position, and empties the resulting free entry.
OrderedSet<K> shuffle(IRNG rng) - Randomly alters the iteration order for this OrderedSet using the given IRNG to shuffle.
int size()
void sort(Comparator<? super K> comparator) - Sorts this whole OrderedSet using the supplied Comparator.
void sort(Comparator<? super K> comparator, int start, int end) - Sorts a sub-range of this OrderedSet from what is currently the index start up to (but not including) the index end, using the supplied Comparator.
SortedSet<K> subSet(K from, K to)
boolean swap(K left, K right) - Swaps the positions in the ordering for the given items, if they are both present.
boolean swapIndices(int left, int right) - Swaps the given indices in the ordering, if they are both ints between 0 and size.
SortedSet<K> tailSet(K from)
Object[] toArray()
<T> T[] toArray(T[] a)
String toString()
boolean trim() - Rehashes the set, making the table as small as possible.
boolean trim(int n) - Rehashes this set if the table is too large.
-
Field Details
-
key
The array of keys. -
mask
The mask for wrapping a position counter. -
containsNull
Whether this set contains the null key. -
order
An IntVLA (variable-length int sequence) that stores the positions in the key array of specific keys, with the positions in insertion order. The order can be changed with reorder(int...) and other methods. -
n
The current table size. -
maxFill
Threshold after which we rehash. It must be the table size times f. -
size
Number of entries in the set (including the null key, if present). -
f
The acceptable load factor. -
DEFAULT_INITIAL_SIZE
The initial default size of a hash table.
- See Also:
- Constant Field Values
-
DEFAULT_LOAD_FACTOR
The default load factor of a hash table.
- See Also:
- Constant Field Values
-
FAST_LOAD_FACTOR
The load factor for a (usually small) table that is meant to be particularly fast.
- See Also:
- Constant Field Values
-
VERY_FAST_LOAD_FACTOR
The load factor for a (usually very small) table that is meant to be extremely fast.
- See Also:
- Constant Field Values
-
hasher
-
-
Constructor Details
-
OrderedSet
Creates a new hash set. The actual table size will be the least power of two greater than
expected/f.
- Parameters:
expected - the expected number of elements in the hash set.
f - the load factor.
-
OrderedSet
Creates a new hash set with DEFAULT_LOAD_FACTOR as load factor.
- Parameters:
expected - the expected number of elements in the hash set.
-
OrderedSet
public OrderedSet()
Creates a new hash set with initial expected DEFAULT_INITIAL_SIZE elements and DEFAULT_LOAD_FACTOR as load factor.
-
OrderedSet
Creates a new hash set copying a given collection.
- Parameters:
c - a Collection to be copied into the new hash set.
f - the load factor.
-
OrderedSet
Creates a new hash set with DEFAULT_LOAD_FACTOR as load factor copying a given collection.
- Parameters:
c - a Collection to be copied into the new hash set.
-
OrderedSet
Creates a new hash set using elements provided by a type-specific iterator.
- Parameters:
i - a type-specific iterator whose elements will fill the set.
f - the load factor.
-
OrderedSet
Creates a new hash set with DEFAULT_LOAD_FACTOR as load factor using elements provided by a type-specific iterator.
- Parameters:
i - a type-specific iterator whose elements will fill the set.
-
OrderedSet
Creates a new hash set and fills it with the elements of a given array.
- Parameters:
a - an array whose elements will be used to fill the set.
offset - the first element to use.
length - the number of elements to use.
f - the load factor.
-
OrderedSet
Creates a new hash set with DEFAULT_LOAD_FACTOR as load factor and fills it with the elements of a given array.
- Parameters:
a - an array whose elements will be used to fill the set.
offset - the first element to use.
length - the number of elements to use.
-
OrderedSet
Creates a new hash set copying the elements of an array.
- Parameters:
a - an array to be copied into the new hash set.
f - the load factor.
-
OrderedSet
Creates a new hash set with DEFAULT_LOAD_FACTOR as load factor copying the elements of an array.
- Parameters:
a - an array to be copied into the new hash set.
-
OrderedSet
Creates a new hash set. The actual table size will be the least power of two greater than
expected/f.
- Parameters:
expected - the expected number of elements in the hash set.
f - the load factor.
hasher - used to hash items; typically only needed when K is an array, where CrossHash has implementations
-
OrderedSet
Creates a new hash set with DEFAULT_LOAD_FACTOR as load factor.
- Parameters:
hasher - used to hash items; typically only needed when K is an array, where CrossHash has implementations
-
OrderedSet
Creates a new hash set with DEFAULT_LOAD_FACTOR as load factor.
- Parameters:
hasher - used to hash items; typically only needed when K is an array, where CrossHash has implementations
-
OrderedSet
Creates a new hash set copying a given collection.
- Parameters:
c - a Collection to be copied into the new hash set.
f - the load factor.
hasher - used to hash items; typically only needed when K is an array, where CrossHash has implementations
-
OrderedSet
Creates a new hash set with DEFAULT_LOAD_FACTOR as load factor copying a given collection.
- Parameters:
c - a Collection to be copied into the new hash set.
hasher - used to hash items; typically only needed when K is an array, where CrossHash has implementations
-
OrderedSet
Creates a new hash set and fills it with the elements of a given array.
- Parameters:
a - an array whose elements will be used to fill the set.
offset - the first element to use.
length - the number of elements to use.
f - the load factor.
-
OrderedSet
Creates a new hash set with DEFAULT_LOAD_FACTOR as load factor and fills it with the elements of a given array.
- Parameters:
a - an array whose elements will be used to fill the set.
offset - the first element to use.
length - the number of elements to use.
-
OrderedSet
Creates a new hash set copying the elements of an array.
- Parameters:
a - an array to be copied into the new hash set.
f - the load factor.
-
OrderedSet
Creates a new hash set with DEFAULT_LOAD_FACTOR as load factor copying the elements of an array.
- Parameters:
a - an array to be copied into the new hash set.
-
-
Method Details
-
addAll
-
addAll
-
add
-
addAt
-
addOrGet
Adds the given element if not present, or gets the existing value if already present. This is equivalent to (but faster than) doing a:
    K exist = set.get(k);
    if (exist == null) {
        set.add(k);
        exist = k;
    }
-
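One use for this add-or-get pattern is canonicalizing equal instances, so that later lookups hand back the first instance that was stored. The sketch below imitates it with a java.util.HashMap; AddOrGetSketch is a hypothetical name, and the map-based approach is a stand-in rather than this class's implementation.

```java
import java.util.HashMap;
import java.util.Map;

class AddOrGetSketch {
    // Maps each key to the canonical instance that was stored first.
    private final Map<String, String> canonical = new HashMap<>();

    /** Stores k if no equal key exists; otherwise returns the stored equal key. */
    String addOrGet(String k) {
        String existing = canonical.putIfAbsent(k, k);
        return existing == null ? k : existing;
    }

    public static void main(String[] args) {
        AddOrGetSketch set = new AddOrGetSketch();
        String first = new String("key");
        String second = new String("key");
        // The second call returns the first instance, not the argument.
        System.out.println(set.addOrGet(first) == first);
        System.out.println(set.addOrGet(second) == first);
    }
}
```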
shiftKeys
Shifts left entries with the specified hash code, starting at the specified position, and empties the resulting free entry.
- Parameters:
pos - a starting position.
-
rem
-
remove
-
removeFirst
Removes the first key in iteration order.
- Returns:
- the first key.
- Throws:
NoSuchElementException - if this set is empty.
-
removeLast
Removes the last key in iteration order.
- Returns:
- the last key.
- Throws:
NoSuchElementException - if this set is empty.
-
addAndMoveToFirst
Adds a key to the set; if the key is already present, it is moved to the first position of the iteration order.
- Parameters:
k - the key.
- Returns:
- true if the key was not present.
-
addAndMoveToLast
Adds a key to the set; if the key is already present, it is moved to the last position of the iteration order.
- Parameters:
k - the key.
- Returns:
- true if the key was not present.
-
get
Returns the element of this set that is equal to the given key, or null.
- Returns:
- the element of this set that is equal to the given key, or null.
-
contains
-
positionOf
-
indexOf
Gets the position in the ordering of the given key, though not as efficiently as some data structures can do it (e.g. Arrangement can access ordering position very quickly but doesn't store other values on its own). Returns a value that is at least 0 if it found k, or -1 if k was not present.
- Parameters:
k - a key or possible key that this should find the index of
- Returns:
- the index of k, if present, or -1 if it is not present in this OrderedSet
-
swap
Swaps the positions in the ordering for the given items, if they are both present. Returns true if the ordering changed as a result of this call, or false if it stayed the same (which can be because left or right was not present, or because left and right are the same reference, so swapping would do nothing).
- Parameters:
left - an item that should be present in this OrderedSet
right - an item that should be present in this OrderedSet
- Returns:
- true if this OrderedSet changed in ordering as a result of this call, or false otherwise
-
swapIndices
Swaps the given indices in the ordering, if they are both ints between 0 and size. Returns true if the ordering changed as a result of this call, or false if it stayed the same (which can be because left or right referred to an out-of-bounds index, or because left and right are equal (so swapping would do nothing)). -
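The return-value rule for swapIndices can be sketched on a plain list. This sketch assumes that "between 0 and size" means 0 inclusive to size exclusive; SwapSketch is a hypothetical name and the list stands in for the set's ordering.

```java
import java.util.Collections;
import java.util.List;

class SwapSketch {
    /** Swaps two positions; returns false when nothing could or needed to change. */
    static <T> boolean swapIndices(List<T> order, int left, int right) {
        int size = order.size();
        if (left < 0 || right < 0 || left >= size || right >= size || left == right) {
            return false; // out of bounds, or swapping an index with itself
        }
        Collections.swap(order, left, right);
        return true;
    }
}
```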
clear
-
size
-
containsAll
Checks whether this collection contains all elements from the given collection.
- Specified by:
containsAll in interface Collection<K>
- Specified by:
containsAll in interface Set<K>
- Parameters:
c - a collection.
- Returns:
true if this collection contains all elements of the argument.
-
retainAll
Retains in this collection only elements from the given collection. -
removeAll
Removes from this collection all elements in the given collection. If the collection is an instance of this class, it uses faster iterators. -
isEmpty
-
fixOrder
Modifies the link vector so that the given entry is removed. This method will complete in linear time.
- Parameters:
i - the index of an entry.
-
fixOrder
Modifies the ordering for a shift from s to d.
This method will complete in linear time or better.
- Parameters:
s - the source position.
d - the destination position.
-
first
Returns the first element of this set in iteration order. -
last
Returns the last element of this set in iteration order. -
tailSet
-
headSet
-
subSet
-
comparator
- Specified by:
comparator in interface SortedSet<K>
-
iterator
-
trim
Rehashes the set, making the table as small as possible. This method rehashes the table to the smallest size satisfying the load factor. It can be used when the set will not be changed anymore, so as to optimize access speed and size.
If the table size is already the minimum possible, this method does nothing.
- Returns:
- true if there was enough memory to trim the set.
- See Also:
trim(int)
-
trim
Rehashes this set if the table is too large. Let N be the smallest table size that can hold
max(n, size()) entries, still satisfying the load factor. If the current table size is smaller than or equal to N, this method does nothing. Otherwise, it rehashes this set in a table of size N.
This method is useful when reusing sets. Clearing a set leaves the table size untouched. If you are reusing a set many times, you can call this method with a typical size to avoid keeping around a very large table just because of a few large transient sets.
- Parameters:
n - the threshold for the trimming.
- Returns:
- true if there was enough memory to trim the set.
- See Also:
trim()
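The choice of N in trim(int) can be worked through with a small sketch. It assumes table sizes are powers of two (as the constructor documentation suggests) and ignores the memory check behind the boolean return value; TrimSketch and its method names are hypothetical.

```java
class TrimSketch {
    /** Smallest power-of-two table that holds the given entries at load factor f. */
    static int smallestTable(int entries, float f) {
        double needed = Math.ceil(entries / (double) f);
        int size = 1;
        // Stop at 2^30 so the shift cannot overflow an int.
        while (size < needed && size < (1 << 30)) {
            size <<= 1;
        }
        return size;
    }

    /** Table size after trim(n): unchanged if already at or below N, else N. */
    static int tableSizeAfterTrim(int currentTable, int n, int size, float f) {
        int target = smallestTable(Math.max(n, size), f);
        return currentTable <= target ? currentTable : target;
    }

    public static void main(String[] args) {
        // A 1024-slot table holding 3 entries, trimmed with a typical size of 16.
        System.out.println(tableSizeAfterTrim(1024, 16, 3, 0.75f));
    }
}
```

Passing a typical size rather than calling trim() keeps some headroom, so a reused set does not have to regrow on its next fill.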
-
rehash
Rehashes the set. This method implements the basic rehashing strategy, and may be overridden by subclasses implementing different rehashing strategies (e.g., disk-based rehashing). However, you should not override this method unless you understand the internal workings of this class.
- Parameters:
newN - the new size
-
clone
Returns a deep copy of this set. This method performs a deep copy of this hash set; the data stored in the set, however, is not cloned. Note that this makes a difference only for object keys.
-
hashCode
Returns a hash code for this set. This method overrides the generic method provided by the superclass. Since
equals() is not overridden, it is important that the value returned by this method is the same value as the one returned by the overridden method. -
hash64
-
maxFill
Returns the maximum number of entries that can be filled before rehashing.
- Parameters:
n - the size of the backing array.
f - the load factor.
- Returns:
- the maximum number of entries before rehashing.
-
maxFill
Returns the maximum number of entries that can be filled before rehashing.
- Parameters:
n - the size of the backing array.
f - the load factor.
- Returns:
- the maximum number of entries before rehashing.
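A plausible reading of both maxFill overloads is "table size times load factor, rounded up". The sketch below also caps the result at n - 1 so that one slot always stays free; that cap is an assumption borrowed from how open-addressing tables are commonly written and is not stated on this page. MaxFillSketch is a hypothetical name.

```java
class MaxFillSketch {
    /** Entries allowed in an n-slot table before rehashing (int version). */
    static int maxFill(int n, float f) {
        // Assumed cap: always leave at least one free slot.
        return Math.min((int) Math.ceil(n * (double) f), n - 1);
    }

    /** Same rule for table sizes that need a long. */
    static long maxFill(long n, float f) {
        return Math.min((long) Math.ceil(n * (double) f), n - 1);
    }

    public static void main(String[] args) {
        System.out.println(maxFill(16, 0.75f));
        System.out.println(maxFill(16, 1.0f));
    }
}
```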
-
arraySize
Returns the least power of two smaller than or equal to 2^30 and larger than or equal to Math.ceil( expected / f ).
- Parameters:
expected - the expected number of elements in a hash table.
f - the load factor.
- Returns:
- the minimum possible size for a backing array.
- Throws:
IllegalArgumentException - if the necessary size is larger than 2^30.
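The documented rule (round Math.ceil(expected / f) up to a power of two, and reject anything above 2^30) can be sketched as below. This follows the description on this page rather than the class's source; ArraySizeSketch, MAX_TABLE, and isTooLarge are hypothetical names, and any minimum table size the real method may enforce is not modeled.

```java
class ArraySizeSketch {
    static final int MAX_TABLE = 1 << 30;

    /** Least power of two that is at least ceil(expected / f), up to 2^30. */
    static int arraySize(int expected, float f) {
        long needed = (long) Math.ceil(expected / (double) f);
        if (needed > MAX_TABLE) {
            throw new IllegalArgumentException(
                "Too large (" + expected + " expected elements with load factor " + f + ")");
        }
        // Round up to a power of two; values of 0 or 1 map to a table of size 1.
        long size = needed <= 1 ? 1 : Long.highestOneBit(needed - 1) << 1;
        return (int) size;
    }

    /** Convenience check used to show the exception case without a try block at the call site. */
    static boolean isTooLarge(int expected, float f) {
        try {
            arraySize(expected, f);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }
}
```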
-
toArray
-
toArray
-
toString
-
equals
-
getAt
Gets the item at the given index in the iteration order in constant time (random-access).
- Parameters:
idx - the index in the iteration order of the key to fetch
- Returns:
- the key at the index, if the index is valid, otherwise null
-
removeAt
Removes the item at the given index in the iteration order in not-exactly constant time (though it still should be efficient).
- Parameters:
idx - the index in the iteration order of the item to remove
- Returns:
- true if this Set was changed as a result of this call, or false if nothing changed.
-
randomItem
Gets a random value from this OrderedSet in constant time, using the given IRNG to generate a random number.
- Parameters:
rng - used to generate a random index for a value
- Returns:
- a random value from this OrderedSet
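Constant-time random selection follows directly from indexed access: pick a random index, then fetch the item at that index. The sketch below uses java.util.Random as a stand-in for IRNG and a List as a stand-in for the set's ordering; RandomItemSketch is a hypothetical name, and the sketch assumes a non-empty list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

class RandomItemSketch {
    /** Picks one item uniformly at random; the list must not be empty. */
    static <T> T randomItem(List<T> order, Random rng) {
        int index = rng.nextInt(order.size()); // 0 (inclusive) to size (exclusive)
        return order.get(index);
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("sword", "shield", "potion");
        System.out.println(randomItem(items, new Random()));
    }
}
```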
-
shuffle
Randomly alters the iteration order for this OrderedSet using the given IRNG to shuffle.
- Parameters:
rng - used to generate a random ordering
- Returns:
- this for chaining
-
reorder
Given an array or varargs of replacement indices for this OrderedSet's iteration order, reorders this so the first item in the returned version is the same as getAt(ordering[0]) (with some care taken for negative or too-large indices), the second item in the returned version is the same as getAt(ordering[1]), etc.
Negative indices are considered reversed distances from the end of ordering, so -1 refers to the same index as ordering[ordering.length - 1]. If ordering is smaller than size(), only the indices up to the length of ordering will be modified. If ordering is larger than size(), only as many indices will be affected as size(), and reversed distances are measured from the end of this Set's entries instead of the end of ordering. Duplicate values in ordering will produce duplicate values in the returned Set.
This method modifies this OrderedSet in-place and also returns it for chaining.
- Parameters:
ordering - an array or varargs of int indices, where the nth item in ordering changes the nth item in this Set to have the value currently in this Set at the index specified by the value in ordering
- Returns:
- this for chaining, after modifying it in-place
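For the simple case where ordering is a permutation of valid indices, the effect of reorder can be sketched on a plain list. The handling of negative or too-large indices described above is deliberately not modeled, and ReorderSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

class ReorderSketch {
    /** Sets position i to the item previously at ordering[i]; modifies and returns items. */
    static <T> List<T> reorder(List<T> items, int... ordering) {
        List<T> snapshot = new ArrayList<>(items); // values before any change
        int limit = Math.min(ordering.length, items.size());
        for (int i = 0; i < limit; i++) {
            // Assumes 0 <= ordering[i] < items.size(); other values are not handled here.
            items.set(i, snapshot.get(ordering[i]));
        }
        return items;
    }
}
```

Reading from a snapshot matters: without it, an early write could overwrite a value that a later index still needs.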
-
alter
Changes a K, original, to another, replacement, while keeping replacement at the same point in the ordering.
- Parameters:
original - a K value that will be removed from this Set if present, and its iteration index remembered
replacement - another K value that will replace original at the remembered index
- Returns:
- true if the Set changed, or false if it didn't (such as if the two arguments are equal, or replacement was already in the Set but original was not)
-
alterAt
Changes the K at the given index to replacement while keeping replacement at the same point in the ordering.
- Parameters:
index - an index to replace the K item at
replacement - another K value that will replace the original at the remembered index
- Returns:
- true if the Set changed, or false if it didn't (such as if the replacement was already present at the given index)
-
sort
Sorts this whole OrderedSet using the supplied Comparator.
- Parameters:
comparator - a Comparator that can be used on the same type this uses for its keys (may need wildcards)
-
sort
Sorts a sub-range of this OrderedSet from what is currently the index start up to (but not including) the index end, using the supplied Comparator.
- Parameters:
comparator - a Comparator that can be used on the same type this uses for its keys (may need wildcards)
start - the first index of a key to sort (the index can change after this)
end - the exclusive bound on the indices to sort; often this is just size()
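The half-open range convention (start inclusive, end exclusive) matches java.util.List.subList, so the effect of the ranged sort can be sketched on a plain list. SortRangeSketch is a hypothetical name, and the list again stands in for the set's ordering.

```java
import java.util.Comparator;
import java.util.List;

class SortRangeSketch {
    /** Sorts items at indices start (inclusive) to end (exclusive), leaving the rest in place. */
    static <T> void sort(List<T> order, Comparator<? super T> comparator, int start, int end) {
        // subList is a view, so sorting it reorders the backing list in place.
        order.subList(start, end).sort(comparator);
    }
}
```

Calling it with start 0 and end equal to the list's size sorts the whole ordering, which corresponds to the one-argument sort.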
-
reverse
Reverses the iteration order in linear time.
-