public class MarkovTextLimited
extends java.lang.Object
implements java.io.Serializable
Call analyze(CharSequence) once on a large sample text, then call chain(long) as many times as you like
to get odd-sounding "remixes" of the sample text. For more natural output, you can use MarkovText, which
is an order-2 Markov chain and so looks at the two previous words. This class is meant to allow easy
serialization of the data needed to call chain(): if you store the words and processed arrays in some
serialized form, you can reassign them to the same fields and avoid calling analyze(). A convenient way
to do this is to call serializeToString() after calling analyze() once and save the resulting String;
then, on future runs, call deserializeFromString(String) instead of analyze() to create the
MarkovTextLimited without repeating the analysis.
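As a rough illustration of the analyze-once, chain-many-times workflow described above, here is a minimal, self-contained sketch of an order-1 word chain in plain Java. It is not the MarkovTextLimited implementation: the class name TinyWordChain, its internal collections, and the way it picks starting words are assumptions made for this example, and it assumes (by contrast with the order-2 MarkovText) that only the one previous word is consulted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Hypothetical order-1 word chain; an illustration only, not the library's code. */
final class TinyWordChain {
    // For each word, every word seen directly after it (duplicates preserve frequency).
    private final Map<String, List<String>> followers = new HashMap<>();
    // Words that began the corpus or came right after stop punctuation.
    private final List<String> starters = new ArrayList<>();

    /** Run once: records which word follows which in the corpus. */
    void analyze(CharSequence corpus) {
        String previous = null;
        for (String token : corpus.toString().trim().split("\\s+")) {
            if (token.isEmpty()) continue;
            if (previous == null || endsSentence(previous)) {
                starters.add(token);
            } else {
                followers.computeIfAbsent(previous, k -> new ArrayList<>()).add(token);
            }
            previous = token;
        }
    }

    /** Run many times: the same seed always yields the same text. */
    String chain(long seed) {
        if (starters.isEmpty()) throw new IllegalStateException("call analyze() first");
        Random random = new Random(seed);
        String word = starters.get(random.nextInt(starters.size()));
        StringBuilder sentence = new StringBuilder(word);
        // Stop at sentence-ending punctuation, at a dead end, or at a rough size cap.
        while (!endsSentence(word) && sentence.length() < 200) {
            List<String> options = followers.get(word);
            if (options == null) break;
            word = options.get(random.nextInt(options.size()));
            sentence.append(' ').append(word);
        }
        return sentence.toString();
    }

    private static boolean endsSentence(String word) {
        return word.endsWith(".") || word.endsWith("!") || word.endsWith("?");
    }

    public static void main(String[] args) {
        TinyWordChain chain = new TinyWordChain();
        chain.analyze("The cat sat on the mat. The dog sat on the cat! Did the mat mind?");
        for (long seed = 1; seed <= 3; seed++) {
            System.out.println(chain.chain(seed));
        }
    }
}
```

The point of the sketch is the split of work: analyze() does the one-time bookkeeping, and chain() only consults the stored tables, so reusing a seed reproduces the same text.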
Modifier and Type | Field and Description |
---|---|
int[][] | processed: Complicated data that mixes probabilities and the indices of words in words, generated during the latest call to analyze(CharSequence). |
java.lang.String[] | words: All words (case-sensitive and counting some punctuation as part of words) that this encountered during the latest call to analyze(CharSequence). |
Constructor and Description |
---|
MarkovTextLimited() |
Modifier and Type | Method and Description |
---|---|
void | analyze(java.lang.CharSequence corpus): This is the main necessary step before using a MarkovTextLimited; you must call this method at some point before you can call any other methods. |
java.lang.String | chain(long seed): Generate a roughly-sentence-sized piece of text based on the previously analyzed corpus text (using analyze(CharSequence)) that terminates when stop punctuation is used (".", "!", "?", or "..."), or once the length would be greater than 200 characters without encountering stop punctuation (it terminates such a sentence with "." or "..."). |
java.lang.String | chain(long seed, int maxLength): Generate a roughly-sentence-sized piece of text based on the previously analyzed corpus text (using analyze(CharSequence)) that terminates when stop punctuation is used (".", "!", "?", or "..."), or once adding another word would exceed maxLength (it terminates such a sentence with "." or "..."). |
void | changeNames(NaturalLanguageCipher translator): After calling analyze(CharSequence), you can optionally call this to alter any words in this MarkovTextLimited that were used as a proper noun (determined by whether they were capitalized in the middle of a sentence), changing them to a ciphered version using the given NaturalLanguageCipher. |
MarkovTextLimited | copy() |
static MarkovTextLimited | deserializeFromString(java.lang.String data): Recreates an already-analyzed MarkovTextLimited given a String produced by serializeToString(). |
java.lang.String | serializeToString(): Returns a representation of this MarkovTextLimited as a String; use deserializeFromString(String) to get a MarkovTextLimited back from this String. |
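The stopping rule described for chain(long, int) in the table above can be sketched in isolation. This is only an illustration under stated assumptions: SentenceCap and capSentence are hypothetical names, the words arrive as a ready-made list rather than from a Markov chain, and the sketch always closes an unfinished sentence with "." whereas the documented method may also use "...".

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper; it only illustrates the stopping rule, not the library's code. */
final class SentenceCap {
    /** Appends words until one ends a sentence or the next word would not fit in maxLength. */
    static String capSentence(List<String> words, int maxLength) {
        StringBuilder sb = new StringBuilder();
        for (String word : words) {
            boolean stop = endsSentence(word);
            int separator = sb.length() == 0 ? 0 : 1;
            // Reserve one character for a closing "." unless this word already ends the sentence.
            int needed = sb.length() + separator + word.length() + (stop ? 0 : 1);
            if (needed > maxLength) break;
            if (separator == 1) sb.append(' ');
            sb.append(word);
            if (stop) return sb.toString();
        }
        // Assumption: always close with "."; the real class may also choose "...".
        return sb.append('.').toString();
    }

    static boolean endsSentence(String word) {
        return word.endsWith(".") || word.endsWith("!") || word.endsWith("?");
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("one", "two", "three", "four");
        System.out.println(capSentence(words, 200)); // prints "one two three four."
        System.out.println(capSentence(words, 9));   // prints "one two."
    }
}
```

Reserving one character for the closing period keeps the result within maxLength even when the sentence has to be cut short.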
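public java.lang.String[] words
All words (case-sensitive and counting some punctuation as part of words) that this encountered during the latest
call to analyze(CharSequence). Will be null if analyze() was never called.
public int[][] processed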
Complicated data that mixes probabilities and the indices of words in words, generated during the latest
call to analyze(CharSequence). This is a jagged 2D array. Will be null if analyze() was never called.
public void analyze(java.lang.CharSequence corpus)
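This is the main necessary step before using a MarkovTextLimited; you must call this method at some point before you can
call any other methods. Calling it assigns words and processed, which allows other
methods to be called (they will throw a NullPointerException if analyze() hasn't been called).
corpus - a typically large sample text in the style that should be mimicked
public void changeNames(NaturalLanguageCipher translator)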
After calling analyze(CharSequence), you can optionally call this to alter any words in this MarkovTextLimited that
were used as a proper noun (determined by whether they were capitalized in the middle of a sentence), changing
them to a ciphered version using the given NaturalLanguageCipher. Normally you would initialize a
NaturalLanguageCipher with a FakeLanguageGen that matches the style you want for all names in this text,
then pass that to this method during pre-processing (not necessarily at runtime, since this method isn't
especially fast if the corpus was large). This method modifies this MarkovTextLimited in-place.
translator - a NaturalLanguageCipher that will be used to translate proper nouns in this MarkovTextLimited's word array
public java.lang.String chain(long seed)
Generate a roughly-sentence-sized piece of text based on the previously analyzed corpus text (using
analyze(CharSequence)) that terminates when stop punctuation is used (".", "!", "?", or "..."), or once
the length would be greater than 200 characters without encountering stop punctuation (it terminates such a
sentence with "." or "...").
seed - the seed for the random decisions this makes, as a long; any long can be used
public java.lang.String chain(long seed, int maxLength)
Generate a roughly-sentence-sized piece of text based on the previously analyzed corpus text (using
analyze(CharSequence)) that terminates when stop punctuation is used (".", "!", "?", or "..."), or once
adding another word would exceed maxLength (it terminates such a sentence with "." or "...").
seed - the seed for the random decisions this makes, as a long; any long can be used
maxLength - the maximum length for the generated String, in number of characters
public java.lang.String serializeToString()
Returns a representation of this MarkovTextLimited as a String; use deserializeFromString(String) to get a
MarkovTextLimited back from this String. The words and processed fields must have been given values by
direct assignment, by calling analyze(CharSequence), or by building this MarkovTextLimited with the
aforementioned deserializeFromString method. Uses spaces to separate words and a tab to separate the two fields.
public static MarkovTextLimited deserializeFromString(java.lang.String data)
Recreates an already-analyzed MarkovTextLimited given a String produced by serializeToString().
data - a String returned by serializeToString()
Returns a MarkovTextLimited that is ready for use with chain(long).
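To make the save-and-restore idea concrete, here is a hedged sketch of a round trip for a String[] and a jagged int[][] in plain Java. It follows the one layout detail the documentation states (spaces between words, a tab between the two fields); everything else is an assumption for illustration, including the class name TwoFieldCodec and the use of ',' and ';' to encode the int rows. The real serializeToString() output may encode processed differently.

```java
import java.util.Arrays;

/** Hypothetical codec; the ',' and ';' encoding of the int rows is an assumption. */
final class TwoFieldCodec {
    /** Words joined by spaces, then a tab, then rows joined by ';' with ints joined by ','. */
    static String serialize(String[] words, int[][] processed) {
        StringBuilder sb = new StringBuilder(String.join(" ", words));
        sb.append('\t');
        for (int r = 0; r < processed.length; r++) {
            if (r > 0) sb.append(';');
            for (int c = 0; c < processed[r].length; c++) {
                if (c > 0) sb.append(',');
                sb.append(processed[r][c]);
            }
        }
        return sb.toString();
    }

    /** Reads the first field; assumes at least one word and no spaces inside words. */
    static String[] readWords(String data) {
        return data.substring(0, data.indexOf('\t')).split(" ");
    }

    /** Reads the second field; assumes at least one row was written. */
    static int[][] readProcessed(String data) {
        // The -1 limit keeps trailing empty rows instead of silently dropping them.
        String[] rows = data.substring(data.indexOf('\t') + 1).split(";", -1);
        int[][] result = new int[rows.length][];
        for (int r = 0; r < rows.length; r++) {
            result[r] = rows[r].isEmpty()
                    ? new int[0]
                    : Arrays.stream(rows[r].split(",")).mapToInt(Integer::parseInt).toArray();
        }
        return result;
    }

    public static void main(String[] args) {
        String[] words = {"Hello,", "world."};
        int[][] processed = {{1, 2, 3}, {}, {4}};
        String data = serialize(words, processed);
        System.out.println(Arrays.equals(words, readWords(data)));             // prints "true"
        System.out.println(Arrays.deepEquals(processed, readProcessed(data))); // prints "true"
    }
}
```

With the library itself, the equivalent step is to store the String from serializeToString() and pass it to deserializeFromString(String) on later runs, as described above.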
public MarkovTextLimited copy()
Copies the String array words and the 2D jagged int array processed into a new MarkovTextLimited.
None of the arrays will be shared references, but the Strings (being immutable) will be the same objects in
both MarkovTextLimited instances. This is primarily useful with changeNames(NaturalLanguageCipher), which can
produce several variants on names given several initial copies produced with this method.
Copyright © Eben Howard 2012–2022. All rights reserved.