Class OrderedSet<K>
- All Implemented Interfaces:
Serializable
,Cloneable
,Iterable<K>
,Collection<K>
,Set<K>
,SortedSet<K>
- Direct Known Subclasses:
EnumOrderedSet
public class OrderedSet<K> extends Object implements SortedSet<K>, Serializable, Cloneable
Instances of this class use a hash table to represent a set. The table is filled up to a specified load factor, and then doubled in size to accommodate new entries. If the table is emptied below one fourth of the load factor, it is halved in size. However, halving is not performed when deleting entries from an iterator, as it would interfere with the iteration process.
Note that clear()
does not modify the hash table size. Rather, a
family of trimming methods lets you control the size of
the table; this is particularly useful if you reuse instances of this class.
Iterators generated by this set will enumerate elements in the same order in
which they have been added to the set (addition of elements already present
in the set does not change the iteration order). Note that this order has
nothing in common with the natural order of the keys. The order is kept by
means of an array list, represented via an IntVLA parallel to the
table that can be modified with methods like shuffle(IRNG)
.
This class implements the interface of a sorted set, so to allow easy access
of the iteration order: for instance, you can get the first element in
iteration order with first()
without having to create an iterator;
however, this class partially violates the SortedSet
contract because all subset methods throw an exception and
comparator()
returns always null
.
Additional methods, such as addAndMoveToFirst()
, make it easy to
use instances of this class as a cache (e.g., with LRU policy).
This class allows approximately constant-time lookup of keys or values by their index in the ordering, which can
allow some novel usage of the data structure. OrderedSet can be used like a list of unique elements, keeping order
like a list does but also allowing rapid checks for whether an item exists in the OrderedSet, and OrderedMap
can be used like that but with values associated as well (where OrderedSet uses contains(), OrderedMap uses
containsKey()). You can also set the item at a position with addAt(Object, int)
, or alter an item while
keeping index the same with alter(Object, Object)
. Reordering works here too, both with completely random
orders from shuffle(IRNG)
or with a previously-generated ordering from reorder(int...)
(you can
produce such an ordering for a given size and reuse it across multiple Ordered data structures with
IRNG.randomOrdering(int)
).
You can pass an CrossHash.IHasher
instance such as CrossHash.generalHasher
as an extra parameter to
most of this class' constructors, which allows the OrderedSet to use arrays (usually primitive arrays) as items. If
you expect only one type of array, you can use an instance like CrossHash.intHasher
to hash int arrays, or
the aforementioned generalHasher to hash most kinds of arrays (it can't handle most multi-dimensional arrays well).
If you aren't using array items, you don't need to give an IHasher to the constructor and can ignore this feature.
Thank you, Sebastiano Vigna, for making FastUtil available to the public with such high quality.
See https://github.com/vigna/fastutil for the original library.
- Author:
- Sebastiano Vigna (responsible for all the hard parts), Tommy Ettinger (mostly responsible for squashing several layers of parent classes into one monster class)
- See Also:
- Serialized Form
-
Field Summary
Fields Modifier and Type Field Description protected boolean
containsNull
Whether this set contains the key zero.static int
DEFAULT_INITIAL_SIZE
The initial default size of a hash table.static float
DEFAULT_LOAD_FACTOR
The default load factor of a hash table.float
f
The acceptable load factor.static float
FAST_LOAD_FACTOR
The load factor for a (usually small) table that is meant to be particularly fast.protected CrossHash.IHasher
hasher
protected K[]
key
The array of keys.protected int
mask
The mask for wrapping a position counter.protected int
maxFill
Threshold after which we rehash.protected int
n
The current table size.protected IntVLA
order
An IntVLA (variable-length int sequence) that stores the positions in the key array of specific keys, with the positions in insertion order.protected int
size
Number of entries in the set (including the key zero, if present).static float
VERY_FAST_LOAD_FACTOR
The load factor for a (usually very small) table that is meant to be extremely fast. -
Constructor Summary
Constructors Constructor Description OrderedSet()
Creates a new hash set with initial expectedDEFAULT_INITIAL_SIZE
elements andDEFAULT_LOAD_FACTOR
as load factor.OrderedSet(int expected)
Creates a new hash set withDEFAULT_LOAD_FACTOR
as load factor.OrderedSet(int expected, float f)
Creates a new hash map.OrderedSet(int expected, float f, CrossHash.IHasher hasher)
Creates a new hash map.OrderedSet(int expected, CrossHash.IHasher hasher)
Creates a new hash set withDEFAULT_LOAD_FACTOR
as load factor.OrderedSet(Collection<? extends K> c)
Creates a new hash set withDEFAULT_LOAD_FACTOR
as load factor copying a given collection.OrderedSet(Collection<? extends K> c, float f)
Creates a new hash set copying a given collection.OrderedSet(Collection<? extends K> c, float f, CrossHash.IHasher hasher)
Creates a new hash set copying a given collection.OrderedSet(Collection<? extends K> c, CrossHash.IHasher hasher)
Creates a new hash set withDEFAULT_LOAD_FACTOR
as load factor copying a given collection.OrderedSet(Iterator<? extends K> i)
Creates a new hash set withDEFAULT_LOAD_FACTOR
as load factor using elements provided by a type-specific iterator.OrderedSet(Iterator<? extends K> i, float f)
Creates a new hash set using elements provided by a type-specific iterator.OrderedSet(K[] a)
Creates a new hash set withDEFAULT_LOAD_FACTOR
as load factor copying the elements of an array.OrderedSet(K[] a, float f)
Creates a new hash set copying the elements of an array.OrderedSet(K[] a, float f, CrossHash.IHasher hasher)
Creates a new hash set copying the elements of an array.OrderedSet(K[] a, int offset, int length)
Creates a new hash set withDEFAULT_LOAD_FACTOR
as load factor and fills it with the elements of a given array.OrderedSet(K[] a, int offset, int length, float f)
Creates a new hash set and fills it with the elements of a given array.OrderedSet(K[] a, int offset, int length, float f, CrossHash.IHasher hasher)
Creates a new hash set and fills it with the elements of a given array.OrderedSet(K[] a, int offset, int length, CrossHash.IHasher hasher)
Creates a new hash set withDEFAULT_LOAD_FACTOR
as load factor and fills it with the elements of a given array.OrderedSet(K[] a, CrossHash.IHasher hasher)
Creates a new hash set withDEFAULT_LOAD_FACTOR
as load factor copying the elements of an array.OrderedSet(CrossHash.IHasher hasher)
Creates a new hash set withDEFAULT_LOAD_FACTOR
as load factor. -
Method Summary
Modifier and Type Method Description boolean
add(K k)
boolean
addAll(Collection<? extends K> c)
boolean
addAll(K[] a)
boolean
addAndMoveToFirst(K k)
Adds a key to the set; if the key is already present, it is moved to the first position of the iteration order.boolean
addAndMoveToLast(K k)
Adds a key to the set; if the key is already present, it is moved to the last position of the iteration order.boolean
addAt(K k, int idx)
K
addOrGet(K k)
Add a random element if not present, get the existing value if already present.boolean
alter(K original, K replacement)
Changes a K, original, to another, replacement, while keeping replacement at the same point in the ordering.boolean
alterAt(int index, K replacement)
Changes the K at the given index to replacement while keeping replacement at the same point in the ordering.static int
arraySize(int expected, float f)
Returns the least power of two smaller than or equal to 230 and larger than or equal toMath.ceil( expected / f )
.void
clear()
Object
clone()
Returns a deep copy of this map.Comparator<? super K>
comparator()
boolean
contains(Object k)
boolean
containsAll(Collection<?> c)
Checks whether this collection contains all elements from the given collection.boolean
equals(Object o)
K
first()
Returns the first element of this set in iteration order.protected int
fixOrder(int i)
Modifies the link vector so that the given entry is removed.protected void
fixOrder(int s, int d)
Modifies the ordering for a shift from s to d.K
get(Object k)
Returns the element of this set that is equal to the given key, ornull
.K
getAt(int idx)
Gets the item at the given index in the iteration order in constant time (random-access).long
hash64()
int
hashCode()
Returns a hash code for this set.SortedSet<K>
headSet(K to)
int
indexOf(Object k)
Gets the position in the ordering of the given key, though not as efficiently as some data structures can do it (e.g.boolean
isEmpty()
ListIterator<K>
iterator()
K
last()
Returns the last element of this set in iteration order.static int
maxFill(int n, float f)
Returns the maximum number of entries that can be filled before rehashing.static long
maxFill(long n, float f)
Returns the maximum number of entries that can be filled before rehashing.protected int
positionOf(Object k)
K
randomItem(IRNG rng)
Gets a random value from this OrderedSet in constant time, using the given IRNG to generate a random number.protected void
rehash(int newN)
Rehashes the map.protected boolean
rem(Object k)
boolean
remove(Object o)
boolean
removeAll(Collection<?> c)
Remove from this collection all elements in the given collection.boolean
removeAt(int idx)
Removes the item at the given index in the iteration order in not-exactly constant time (though it still should be efficient).K
removeFirst()
Removes the first key in iteration order.K
removeLast()
Removes the the last key in iteration order.OrderedSet<K>
reorder(int... ordering)
Given an array or varargs of replacement indices for this OrderedSet's iteration order, reorders this so the first item in the returned version is the same asgetAt(ordering[0])
(with some care taken for negative or too-large indices), the second item in the returned version is the same asgetAt(ordering[1])
, etc.boolean
retainAll(Collection<?> c)
Retains in this collection only elements from the given collection.void
reverse()
Reverses the iteration order in linear time.protected void
shiftKeys(int pos)
Shifts left entries with the specified hash code, starting at the specified position, and empties the resulting free entry.OrderedSet<K>
shuffle(IRNG rng)
Randomly alters the iteration order for this OrderedSet using the given IRNG to shuffle.int
size()
void
sort(Comparator<? super K> comparator)
Sorts this whole OrderedSet using the supplied Comparator.void
sort(Comparator<? super K> comparator, int start, int end)
Sorts a sub-range of this OrderedSet from what is currently the indexstart
up to (but not including) the indexend
, using the supplied Comparator.SortedSet<K>
subSet(K from, K to)
boolean
swap(K left, K right)
Swaps the positions in the ordering for the given items, if they are both present.boolean
swapIndices(int left, int right)
Swaps the given indices in the ordering, if they are both ints between 0 and size.SortedSet<K>
tailSet(K from)
Object[]
toArray()
<T> T[]
toArray(T[] a)
String
toString()
boolean
trim()
Rehashes the map, making the table as small as possible.boolean
trim(int n)
Rehashes this map if the table is too large.
-
Field Details
-
key
The array of keys. -
mask
The mask for wrapping a position counter. -
containsNull
Whether this set contains the key zero. -
order
An IntVLA (variable-length int sequence) that stores the positions in the key array of specific keys, with the positions in insertion order. The order can be changed withreorder(int...)
and other methods. -
n
The current table size. -
maxFill
Threshold after which we rehash. It must be the table size timesf
. -
size
Number of entries in the set (including the key zero, if present). -
f
The acceptable load factor. -
DEFAULT_INITIAL_SIZE
The initial default size of a hash table.- See Also:
- Constant Field Values
-
DEFAULT_LOAD_FACTOR
The default load factor of a hash table.- See Also:
- Constant Field Values
-
FAST_LOAD_FACTOR
The load factor for a (usually small) table that is meant to be particularly fast.- See Also:
- Constant Field Values
-
VERY_FAST_LOAD_FACTOR
The load factor for a (usually very small) table that is meant to be extremely fast.- See Also:
- Constant Field Values
-
hasher
-
-
Constructor Details
-
OrderedSet
Creates a new hash map.The actual table size will be the least power of two greater than
expected
/f
.- Parameters:
expected
- the expected number of elements in the hash set.f
- the load factor.
-
OrderedSet
Creates a new hash set withDEFAULT_LOAD_FACTOR
as load factor.- Parameters:
expected
- the expected number of elements in the hash set.
-
OrderedSet
public OrderedSet()Creates a new hash set with initial expectedDEFAULT_INITIAL_SIZE
elements andDEFAULT_LOAD_FACTOR
as load factor. -
OrderedSet
Creates a new hash set copying a given collection.- Parameters:
c
- aCollection
to be copied into the new hash set.f
- the load factor.
-
OrderedSet
Creates a new hash set withDEFAULT_LOAD_FACTOR
as load factor copying a given collection.- Parameters:
c
- aCollection
to be copied into the new hash set.
-
OrderedSet
Creates a new hash set using elements provided by a type-specific iterator.- Parameters:
i
- a type-specific iterator whose elements will fill the set.f
- the load factor.
-
OrderedSet
Creates a new hash set withDEFAULT_LOAD_FACTOR
as load factor using elements provided by a type-specific iterator.- Parameters:
i
- a type-specific iterator whose elements will fill the set.
-
OrderedSet
Creates a new hash set and fills it with the elements of a given array.- Parameters:
a
- an array whose elements will be used to fill the set.offset
- the first element to use.length
- the number of elements to use.f
- the load factor.
-
OrderedSet
Creates a new hash set withDEFAULT_LOAD_FACTOR
as load factor and fills it with the elements of a given array.- Parameters:
a
- an array whose elements will be used to fill the set.offset
- the first element to use.length
- the number of elements to use.
-
OrderedSet
Creates a new hash set copying the elements of an array.- Parameters:
a
- an array to be copied into the new hash set.f
- the load factor.
-
OrderedSet
Creates a new hash set withDEFAULT_LOAD_FACTOR
as load factor copying the elements of an array.- Parameters:
a
- an array to be copied into the new hash set.
-
OrderedSet
Creates a new hash map.The actual table size will be the least power of two greater than
expected
/f
.- Parameters:
expected
- the expected number of elements in the hash set.f
- the load factor.hasher
- used to hash items; typically only needed when K is an array, where CrossHash has implementations
-
OrderedSet
Creates a new hash set withDEFAULT_LOAD_FACTOR
as load factor.- Parameters:
hasher
- used to hash items; typically only needed when K is an array, where CrossHash has implementations
-
OrderedSet
Creates a new hash set withDEFAULT_LOAD_FACTOR
as load factor.- Parameters:
hasher
- used to hash items; typically only needed when K is an array, where CrossHash has implementations
-
OrderedSet
Creates a new hash set copying a given collection.- Parameters:
c
- aCollection
to be copied into the new hash set.f
- the load factor.hasher
- used to hash items; typically only needed when K is an array, where CrossHash has implementations
-
OrderedSet
Creates a new hash set withDEFAULT_LOAD_FACTOR
as load factor copying a given collection.- Parameters:
c
- aCollection
to be copied into the new hash set.hasher
- used to hash items; typically only needed when K is an array, where CrossHash has implementations
-
OrderedSet
Creates a new hash set and fills it with the elements of a given array.- Parameters:
a
- an array whose elements will be used to fill the set.offset
- the first element to use.length
- the number of elements to use.f
- the load factor.
-
OrderedSet
Creates a new hash set withDEFAULT_LOAD_FACTOR
as load factor and fills it with the elements of a given array.- Parameters:
a
- an array whose elements will be used to fill the set.offset
- the first element to use.length
- the number of elements to use.
-
OrderedSet
Creates a new hash set copying the elements of an array.- Parameters:
a
- an array to be copied into the new hash set.f
- the load factor.
-
OrderedSet
Creates a new hash set withDEFAULT_LOAD_FACTOR
as load factor copying the elements of an array.- Parameters:
a
- an array to be copied into the new hash set.
-
-
Method Details
-
addAll
-
addAll
-
add
-
addAt
-
addOrGet
Add a random element if not present, get the existing value if already present.This is equivalent to (but faster than) doing a:
K exist = set.get(k); if (exist == null) { set.add(k); exist = k; }
-
shiftKeys
Shifts left entries with the specified hash code, starting at the specified position, and empties the resulting free entry.- Parameters:
pos
- a starting position.
-
rem
-
remove
-
removeFirst
Removes the first key in iteration order.- Returns:
- the first key.
- Throws:
NoSuchElementException
- is this set is empty.
-
removeLast
Removes the the last key in iteration order.- Returns:
- the last key.
- Throws:
NoSuchElementException
- is this set is empty.
-
addAndMoveToFirst
Adds a key to the set; if the key is already present, it is moved to the first position of the iteration order.- Parameters:
k
- the key.- Returns:
- true if the key was not present.
-
addAndMoveToLast
Adds a key to the set; if the key is already present, it is moved to the last position of the iteration order.- Parameters:
k
- the key.- Returns:
- true if the key was not present.
-
get
Returns the element of this set that is equal to the given key, ornull
.- Returns:
- the element of this set that is equal to the given key, or
null
.
-
contains
-
positionOf
-
indexOf
Gets the position in the ordering of the given key, though not as efficiently as some data structures can do it (e.g.Arrangement
can access ordering position very quickly but doesn't store other values on its own). Returns a value that is at least 0 if it found k, or -1 if k was not present.- Parameters:
k
- a key or possible key that this should find the index of- Returns:
- the index of k, if present, or -1 if it is not present in this OrderedSet
-
swap
Swaps the positions in the ordering for the given items, if they are both present. Returns true if the ordering changed as a result of this call, or false if it stayed the same (which can be because left or right was not present, or because left and right are the same reference (so swapping would do nothing)).- Parameters:
left
- an item that should be present in this OrderedSetright
- an item that should be present in this OrderedSet- Returns:
- true if this OrderedSet changed in ordering as a result of this call, or false otherwise
-
swapIndices
Swaps the given indices in the ordering, if they are both ints between 0 and size. Returns true if the ordering changed as a result of this call, or false if it stayed the same (which can be because left or right referred to an out-of-bounds index, or because left and right are equal (so swapping would do nothing)). -
clear
-
size
-
containsAll
Checks whether this collection contains all elements from the given collection.- Specified by:
containsAll
in interfaceCollection<K>
- Specified by:
containsAll
in interfaceSet<K>
- Parameters:
c
- a collection.- Returns:
true
if this collection contains all elements of the argument.
-
retainAll
Retains in this collection only elements from the given collection. -
removeAll
Remove from this collection all elements in the given collection. If the collection is an instance of this class, it uses faster iterators. -
isEmpty
-
fixOrder
Modifies the link vector so that the given entry is removed. This method will complete in linear time.- Parameters:
i
- the index of an entry.
-
fixOrder
Modifies the ordering for a shift from s to d.
This method will complete in linear time or better.- Parameters:
s
- the source position.d
- the destination position.
-
first
Returns the first element of this set in iteration order. -
last
Returns the last element of this set in iteration order. -
tailSet
-
headSet
-
subSet
-
comparator
- Specified by:
comparator
in interfaceSortedSet<K>
-
iterator
-
trim
Rehashes the map, making the table as small as possible.This method rehashes the table to the smallest size satisfying the load factor. It can be used when the set will not be changed anymore, so to optimize access speed and size.
If the table size is already the minimum possible, this method does nothing.
- Returns:
- true if there was enough memory to trim the map.
- See Also:
trim(int)
-
trim
Rehashes this map if the table is too large.Let N be the smallest table size that can hold
max(n,
entries, still satisfying the load factor. If the current table size is smaller than or equal to N, this method does nothing. Otherwise, it rehashes this map in a table of size N.size()
)This method is useful when reusing maps. Clearing a map leaves the table size untouched. If you are reusing a map many times, you can call this method with a typical size to avoid keeping around a very large table just because of a few large transient maps.
- Parameters:
n
- the threshold for the trimming.- Returns:
- true if there was enough memory to trim the map.
- See Also:
trim()
-
rehash
Rehashes the map.This method implements the basic rehashing strategy, and may be overriden by subclasses implementing different rehashing strategies (e.g., disk-based rehashing). However, you should not override this method unless you understand the internal workings of this class.
- Parameters:
newN
- the new size
-
clone
Returns a deep copy of this map.This method performs a deep copy of this hash map; the data stored in the map, however, is not cloned. Note that this makes a difference only for object keys.
-
hashCode
Returns a hash code for this set.This method overrides the generic method provided by the superclass. Since
equals()
is not overriden, it is important that the value returned by this method is the same value as the one returned by the overriden method. -
hash64
-
maxFill
Returns the maximum number of entries that can be filled before rehashing.- Parameters:
n
- the size of the backing array.f
- the load factor.- Returns:
- the maximum number of entries before rehashing.
-
maxFill
Returns the maximum number of entries that can be filled before rehashing.- Parameters:
n
- the size of the backing array.f
- the load factor.- Returns:
- the maximum number of entries before rehashing.
-
arraySize
Returns the least power of two smaller than or equal to 230 and larger than or equal toMath.ceil( expected / f )
.- Parameters:
expected
- the expected number of elements in a hash table.f
- the load factor.- Returns:
- the minimum possible size for a backing array.
- Throws:
IllegalArgumentException
- if the necessary size is larger than 230.
-
toArray
-
toArray
-
toString
-
equals
-
getAt
Gets the item at the given index in the iteration order in constant time (random-access).- Parameters:
idx
- the index in the iteration order of the key to fetch- Returns:
- the key at the index, if the index is valid, otherwise null
-
removeAt
Removes the item at the given index in the iteration order in not-exactly constant time (though it still should be efficient).- Parameters:
idx
- the index in the iteration order of the item to remove- Returns:
- true if this Set was changed as a result of this call, or false if nothing changed.
-
randomItem
Gets a random value from this OrderedSet in constant time, using the given IRNG to generate a random number.- Parameters:
rng
- used to generate a random index for a value- Returns:
- a random value from this OrderedSet
-
shuffle
Randomly alters the iteration order for this OrderedSet using the given IRNG to shuffle.- Parameters:
rng
- used to generate a random ordering- Returns:
- this for chaining
-
reorder
Given an array or varargs of replacement indices for this OrderedSet's iteration order, reorders this so the first item in the returned version is the same asgetAt(ordering[0])
(with some care taken for negative or too-large indices), the second item in the returned version is the same asgetAt(ordering[1])
, etc.
Negative indices are considered reversed distances from the end of ordering, so -1 refers to the same index asordering[ordering.length - 1]
. If ordering is smaller thansize()
, only the indices up to the length of ordering will be modified. If ordering is larger thansize()
, only as many indices will be affected assize()
, and reversed distances are measured from the end of this Set's entries instead of the end of ordering. Duplicate values in ordering will produce duplicate values in the returned Set.
This method modifies this OrderedSet in-place and also returns it for chaining.- Parameters:
ordering
- an array or varargs of int indices, where the nth item in ordering changes the nth item in this Set to have the value currently in this Set at the index specified by the value in ordering- Returns:
- this for chaining, after modifying it in-place
-
alter
Changes a K, original, to another, replacement, while keeping replacement at the same point in the ordering.- Parameters:
original
- a K value that will be removed from this Set if present, and its iteration index rememberedreplacement
- another K value that will replace original at the remembered index- Returns:
- true if the Set changed, or false if it didn't (such as if the two arguments are equal, or replacement was already in the Set but original was not)
-
alterAt
Changes the K at the given index to replacement while keeping replacement at the same point in the ordering.- Parameters:
index
- an index to replace the K item atreplacement
- another K value that will replace the original at the remembered index- Returns:
- true if the Set changed, or false if it didn't (such as if the replacement was already present at the given index)
-
sort
Sorts this whole OrderedSet using the supplied Comparator.- Parameters:
comparator
- a Comparator that can be used on the same type this uses for its keys (may need wildcards)
-
sort
Sorts a sub-range of this OrderedSet from what is currently the indexstart
up to (but not including) the indexend
, using the supplied Comparator.- Parameters:
comparator
- a Comparator that can be used on the same type this uses for its keys (may need wildcards)start
- the first index of a key to sort (the index can change after this)end
- the exclusive bound on the indices to sort; often this is justsize()
-
reverse
Reverses the iteration order in linear time.
-