Package squidpony
Class MarkovObject<T>
java.lang.Object
squidpony.MarkovObject<T>
- All Implemented Interfaces:
Serializable
public class MarkovObject<T> extends Object implements Serializable
A simple Markov chain generator that works with Lists of some type instead of text like MarkovTextLimited.
Call analyze(Iterable) or analyze(Object[]) once on a large sample Iterable or array where sequences of items matter (this is called a corpus, and could be e.g. a List or an array), then you can call chain(long) many times to get "remixes" of the sample Iterable/array as a List. This is meant to allow easy serialization of the necessary data to call chain(); if you can store the body and processed data structures in some serialized form, then you can reassign them to the same fields to avoid calling analyze(). This requires some way to serialize body, which is an Arrangement of T, and so T must be serializable in some way (not necessarily the Serializable interface, but possibly that).
Created by Tommy Ettinger on 2/26/2018.
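The store-and-reassign idea described above can be illustrated with a minimal, self-contained sketch. It does not use SquidLib at all: AnalyzedStateSketch is a hypothetical stand-in class, its body field uses an ArrayList<String> in place of Arrangement<T>, and its processed field uses ArrayList<int[]> in place of ArrayList<IntVLA>. It also assumes plain java.io serialization, which is only one of the possible ways to store the data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;

/** Hypothetical stand-in for already-analyzed data; not part of SquidLib. */
final class AnalyzedStateSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    // Stand-in for body (an Arrangement<T> in the real class).
    final ArrayList<String> body = new ArrayList<>();
    // Stand-in for processed (an ArrayList<IntVLA> in the real class).
    final ArrayList<int[]> processed = new ArrayList<>();

    /** Writes both structures to a byte array using Java serialization. */
    byte[] toBytes() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(this);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Restores the structures, so no re-analysis of the corpus is needed. */
    static AnalyzedStateSketch fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (AnalyzedStateSketch) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the restored object carries the same body and processed contents as the saved one, the expensive analysis step only has to happen once, before saving.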
- See Also:
- Serialized Form
-
Field Summary
Fields
Modifier and Type, Field, Description:
Arrangement<T> body
All unique T items that this encountered during the latest call to analyze(Iterable).
ArrayList<IntVLA> processed
Complicated data that mixes probabilities and the indices of items in body, generated during the latest call to analyze(Iterable).
ArrayList<IntVLA> raw
-
Constructor Summary
Constructors
Constructor, Description:
MarkovObject()
-
Method Summary
Modifier and Type, Method, Description:
void analyze(Iterable<T> corpus)
This is the main necessary step before using a MarkovObject; you must call this method at some point before you can call any other methods.
void analyze(T[] corpus)
This is the main necessary step before using a MarkovObject; you must call this method at some point before you can call any other methods.
List<T> chain(long seed)
Generates a 32-element List of T based on the given seed and previously analyzed corpus data (using analyze(Iterable)).
List<T> chain(long seed, int maxLength, boolean canStopEarly, List<T> buffer)
Adds T items to buffer to fill it up to maxLength, based on the given seed and previously analyzed corpus data (using analyze(Iterable)).
MarkovObject<T> copy()
-
Field Details
-
body
All unique T items that this encountered during the latest call to analyze(Iterable). Will be null if analyze() was never called.
-
processed
Complicated data that mixes probabilities and the indices of items in body, generated during the latest call to analyze(Iterable). Will be null if analyze() was never called.
-
raw
-
-
Constructor Details
-
MarkovObject
public MarkovObject()
-
-
Method Details
-
analyze
This is the main necessary step before using a MarkovObject; you must call this method at some point before you can call any other methods. This method analyzes the pairings of items in a (typically large) corpus Iterable. It only uses one preceding item to determine the subsequent item. It does not store any items as special stop terms, but it does use null to represent the start of a section (effectively treating any corpus as starting with null prepended), and will not produce null as output from chain(long). If null is encountered as part of corpus, it will be interpreted as a point to stop on and potentially start a new section. Since the last item in the corpus could have no known items to produce after it, the end of the corpus is treated as having null appended as well. When it finishes processing, it stores the results in body and processed, which allows other methods to be called (they will throw a NullPointerException if analyze() hasn't been called).
Unlike in MarkovTextLimited, you can analyze multiple corpus Iterables by calling this method more than once.
- Parameters:
corpus - a typically-large sample Iterable in the style that should be mimicked
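The pair analysis described above can be pictured with a small, self-contained sketch. This is not SquidLib's implementation (which stores its results in body and processed); the class PairCountSketch, the method countPairs, and the nested-Map layout are hypothetical names chosen for illustration. The sketch assumes that null marks a section boundary, as the description states, and additionally assumes that a null directly after another boundary is simply skipped.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of first-order pair counting; not SquidLib code. */
final class PairCountSketch {
    private PairCountSketch() {}

    /**
     * Counts how often each item follows each other item.
     * A null key means "start of a section"; a null successor means "end of a section".
     */
    static <T> Map<T, Map<T, Integer>> countPairs(Iterable<T> corpus) {
        Map<T, Map<T, Integer>> counts = new LinkedHashMap<>();
        T previous = null; // the corpus is treated as if null were prepended
        for (T item : corpus) {
            // Assumption of this sketch: boundary-to-boundary pairs carry no information.
            if (previous != null || item != null) {
                counts.computeIfAbsent(previous, k -> new LinkedHashMap<>())
                      .merge(item, 1, Integer::sum);
            }
            previous = item;
        }
        if (previous != null) {
            // The corpus is treated as if null were appended after its last item.
            counts.computeIfAbsent(previous, k -> new LinkedHashMap<>())
                  .merge(null, 1, Integer::sum);
        }
        return counts;
    }
}
```

Only one preceding item is used as the key, which is what makes this a first-order chain: the count table never looks further back than the immediately previous item.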
-
analyze
This is the main necessary step before using a MarkovObject; you must call this method at some point before you can call any other methods. This method analyzes the pairings of items in a (typically large) corpus array of T. It only uses one preceding item to determine the subsequent item. It does not store any items as special stop terms, but it does use null to represent the start of a section (effectively treating any corpus as starting with null prepended), and will not produce null as output from chain(long). If null is encountered as part of corpus, it will be interpreted as a point to stop on and potentially start a new section. Since the last item in the corpus could have no known items to produce after it, the end of the corpus is treated as having null appended as well. When it finishes processing, it stores the results in body and processed, which allows other methods to be called (they will throw a NullPointerException if analyze() hasn't been called).
Unlike in MarkovTextLimited, you can analyze multiple corpus arrays by calling this method more than once.
- Parameters:
corpus - a typically-large sample array of T in the style that should be mimicked
-
chain
Generates a 32-element List of T based on the given seed and previously analyzed corpus data (using analyze(Iterable)). This can't stop before generating a chain of 32 items unless analyze() hasn't been called or it was called on an empty or invalid Iterable/array (i.e. all null).
- Parameters:
seed - the seed for the random decisions this makes, as a long; any long can be used
- Returns:
a 32-element T List generated from the analyzed corpus Iterable/array's pairings of items
-
chain
Adds T items to buffer to fill it up to maxLength, based on the given seed and previously analyzed corpus data (using analyze(Iterable)). If buffer is already at least as long as maxLength, if analyze() hasn't been called, or if it was called on an empty or invalid Iterable/array (i.e. all null), then this won't change buffer and will return it as-is. If null was present in the analyzed corpus along with other items and canStopEarly is true, then if null would be generated this will instead stop adding items to buffer and return buffer as it is. If canStopEarly was false in the last case, the generated null would be discarded and a value from the start of the corpus or following a null in the corpus would be used instead.
- Parameters:
seed - the seed for the random decisions this makes, as a long; any long can be used
maxLength - the maximum length for the generated List, in items
canStopEarly - if true, this may add fewer than maxLength elements if null was present in the corpus
buffer - a List of T that will have elements added until maxLength is reached; if it is already at least as long as maxLength this won't do anything
- Returns:
buffer, after items were added to fill maxLength (or to fill less if this stopped early)
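The effect of canStopEarly described above can be sketched in a self-contained way. This is an illustration under stated assumptions, not SquidLib's algorithm: ChainSketch, successors, and the Map-of-Lists table are hypothetical, java.util.Random stands in for whatever random number generation the library uses, and duplicates in each successor list stand in for the probabilities kept in processed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Hypothetical illustration of chain generation with early stopping; not SquidLib code. */
final class ChainSketch {
    private ChainSketch() {}

    /** Builds successor lists; a null key is a section start, a null successor a section end. */
    static <T> Map<T, List<T>> successors(Iterable<T> corpus) {
        Map<T, List<T>> next = new HashMap<>();
        T previous = null;
        for (T item : corpus) {
            if (previous != null || item != null) {
                next.computeIfAbsent(previous, k -> new ArrayList<>()).add(item);
            }
            previous = item;
        }
        if (previous != null) {
            next.computeIfAbsent(previous, k -> new ArrayList<>()).add(null);
        }
        return next;
    }

    /** Adds items to buffer until it holds maxLength items, or stops early on a generated null. */
    static <T> List<T> chain(Map<T, List<T>> next, long seed, int maxLength,
                             boolean canStopEarly, List<T> buffer) {
        List<T> starts = next.get(null);
        if (starts == null || starts.isEmpty()) {
            return buffer; // empty or all-null corpus: leave buffer unchanged
        }
        Random random = new Random(seed);
        T current = null; // begin at a section start
        while (buffer.size() < maxLength) {
            List<T> options = next.get(current);
            T picked = options.get(random.nextInt(options.size()));
            if (picked == null) {
                if (canStopEarly) {
                    break; // a section end was generated, so stop here
                }
                // Discard the null and continue from an item that starts a section.
                picked = starts.get(random.nextInt(starts.size()));
            }
            buffer.add(picked);
            current = picked;
        }
        return buffer;
    }
}
```

With the three-item corpus "a", "b", "c", every choice in this sketch is forced, so a call with canStopEarly set to true yields a, b, c and stops, while a call with canStopEarly set to false and maxLength 7 wraps around and yields a, b, c, a, b, c, a.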
-
copy
Copies the T items in body and the int-based data structure processed into a new MarkovObject. None of the inner values, such as IntVLA values in processed, will be equivalent references, but the items in body will be the same objects in both MarkovObject instances.
- Returns:
a copy of this MarkovObject
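The copy semantics described above (fresh int-based structures, shared item objects) can be sketched with stand-in types. CopySketch is a hypothetical class, not SquidLib code; it assumes a plain List<T> in place of Arrangement<T> and int[] rows in place of IntVLA values.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the described copy semantics; not SquidLib code. */
final class CopySketch<T> {
    // Stand-in for body (an Arrangement<T> in the real class).
    final List<T> body = new ArrayList<>();
    // Stand-in for processed (an ArrayList<IntVLA> in the real class).
    final List<int[]> processed = new ArrayList<>();

    CopySketch<T> copy() {
        CopySketch<T> other = new CopySketch<>();
        other.body.addAll(body); // new container, but the very same item objects
        for (int[] row : processed) {
            other.processed.add(row.clone()); // new arrays with equal contents
        }
        return other;
    }
}
```

Under these assumptions, changing an int row in one copy does not affect the other, while mutating a shared T item would be visible through both.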
-