com.github.yellowstonegames.text.MarkovTextLimited

public class MarkovTextLimited extends Object

A simple Markov chain text generator; it is called "Limited" because it can only be used as an order-1 Markov chain, meaning that only one prior word is considered when choosing the next. To use it, call analyze(CharSequence) once on a large sample text; you can then call chain(long) many times to get odd-sounding "remixes" of the sample text. For more natural output, you can use MarkovText, which is an order-2 Markov chain and so looks at two previous words.
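To make the order-1 idea concrete, here is a minimal, self-contained sketch. It is not this class's implementation: OrderOneSketch, its followers and starters collections, and its whitespace-based tokenizing are all assumptions made for illustration. It only shows the general technique of recording which word follows which, then walking those pairings with a seeded random generator until stop punctuation appears or the text would run past a length limit.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Hypothetical illustration only; this is not the library's MarkovTextLimited. */
final class OrderOneSketch {
    // For each word, every word that followed it in the sample; duplicates act as weights.
    private final Map<String, List<String>> followers = new HashMap<>();
    // Words that began a sentence in the sample.
    private final List<String> starters = new ArrayList<>();

    private static boolean endsSentence(String word) {
        return word.endsWith(".") || word.endsWith("!") || word.endsWith("?");
    }

    /** Records which word follows which, looking at only one prior word. */
    void analyze(CharSequence corpus) {
        String[] tokens = corpus.toString().trim().split("\\s+");
        boolean atStart = true;
        for (int i = 0; i < tokens.length; i++) {
            String word = tokens[i];
            if (word.isEmpty()) continue; // only happens for an empty corpus
            if (atStart) starters.add(word);
            atStart = endsSentence(word);
            if (!atStart && i + 1 < tokens.length) {
                followers.computeIfAbsent(word, k -> new ArrayList<>()).add(tokens[i + 1]);
            }
        }
    }

    /** Builds one sentence; the same seed always gives the same text. */
    String chain(long seed, int maxLength) {
        if (starters.isEmpty()) throw new IllegalStateException("call analyze() first");
        Random random = new Random(seed);
        String word = starters.get(random.nextInt(starters.size()));
        StringBuilder sb = new StringBuilder(word);
        while (!endsSentence(word)) {
            List<String> options = followers.get(word);
            if (options == null) break; // the sample ended on this word
            String next = options.get(random.nextInt(options.size()));
            if (sb.length() + 1 + next.length() > maxLength) break; // would run too long
            sb.append(' ').append(next);
            word = next;
        }
        if (!endsSentence(word)) sb.append('.'); // close a sentence that was cut short
        return sb.toString();
    }

    public static void main(String[] args) {
        OrderOneSketch sketch = new OrderOneSketch();
        sketch.analyze("The cat sat. The dog sat on the cat. A dog ran!");
        for (long seed = 0; seed < 3; seed++) {
            System.out.println(sketch.chain(seed, 200));
        }
    }
}
```

Because each step sees only the current word, the output can jump between sentences of the sample wherever they share a word, which is where the "remix" quality comes from. The length check in this sketch is approximate: the first word and a closing "." are not counted against the limit.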
This is meant to allow easy serialization of the data needed to call chain(); if you can store the words and processed arrays in some serialized form, then you can reassign them to the same fields to avoid calling analyze(). One convenient way to do this is to call stringSerialize() after calling analyze() once and to save the resulting String; then, rather than calling analyze() again on future runs, you would call stringDeserialize(String) to create the MarkovTextLimited without any repeated analysis.
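The save-once, reload-later pattern described above can be sketched with a plain String round trip. This is a hedged illustration, not the format stringSerialize() actually produces: the page only says that spaces separate words and a tab separates the two fields, so the ',' and ';' delimiters used here for the int rows are assumptions, as are the SerializeSketch class and its method names.

```java
/** Hypothetical illustration; the real stringSerialize() format may differ. */
final class SerializeSketch {
    /** Joins words with spaces, adds a tab, then writes the int rows (assumed ',' and ';' delimiters). */
    static String serialize(String[] words, int[][] processed) {
        StringBuilder sb = new StringBuilder(String.join(" ", words)).append('\t');
        for (int r = 0; r < processed.length; r++) {
            if (r > 0) sb.append(';');
            for (int c = 0; c < processed[r].length; c++) {
                if (c > 0) sb.append(',');
                sb.append(processed[r][c]);
            }
        }
        return sb.toString();
    }

    /** Reads back the word array from the part before the tab. */
    static String[] readWords(String data) {
        return data.substring(0, data.indexOf('\t')).split(" ");
    }

    /** Reads back the jagged int array from the part after the tab. */
    static int[][] readProcessed(String data) {
        String tail = data.substring(data.indexOf('\t') + 1);
        if (tail.isEmpty()) return new int[0][]; // an empty tail is read as zero rows
        String[] rows = tail.split(";", -1); // -1 keeps trailing empty rows
        int[][] processed = new int[rows.length][];
        for (int r = 0; r < rows.length; r++) {
            String[] cells = rows[r].isEmpty() ? new String[0] : rows[r].split(",");
            processed[r] = new int[cells.length];
            for (int c = 0; c < cells.length; c++) {
                processed[r][c] = Integer.parseInt(cells[c]);
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        String[] words = {"The", "cat", "sat."};
        int[][] processed = {{1, 2}, {2}, {}};
        String saved = serialize(words, processed); // store this String, e.g. in a file
        System.out.println(java.util.Arrays.toString(readWords(saved)));
        System.out.println(java.util.Arrays.deepToString(readProcessed(saved)));
    }
}
```

A format like this only works if no word contains a space or a tab, which holds when words come from splitting text on whitespace.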
This doesn't produce especially understandable sentences, but sometimes that's all you need. This class is mostly present to provide feature parity with SquidLib 3.x.

Field Summary

| Modifier and Type | Field | Description |
| --- | --- | --- |
| int[][] | processed | Complicated data that mixes probabilities and the indices of words in words, generated during the latest call to analyze(CharSequence). |
| String[] | words | All words (case-sensitive and counting some punctuation as part of words) that this encountered during the latest call to analyze(CharSequence). |

Constructor Summary

| Constructor | Description |
| --- | --- |
| MarkovTextLimited() | |
| MarkovTextLimited(MarkovTextLimited other) | |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| void | analyze(CharSequence corpus) | This is the main necessary step before using a MarkovTextLimited; you must call this method at some point before you can call any other methods. |
| <S extends CharSequence & Appendable> S | appendTo(S sb) | |
| String | chain(long seed) | Generate a roughly-sentence-sized piece of text based on the previously analyzed corpus text (using analyze(CharSequence)) that terminates when stop punctuation is used (".", "!", "?", or "..."), or once the length would be greater than 200 characters without encountering stop punctuation (it terminates such a sentence with "." or "..."). |
| String | chain(long seed, int maxLength) | Generate a roughly-sentence-sized piece of text based on the previously analyzed corpus text (using analyze(CharSequence)) that terminates when stop punctuation is used (".", "!", "?", or "..."), or once adding any further word would exceed maxLength (it terminates such a sentence with "." or "..."). |
| void | changeNames(Translator translator) | After calling analyze(CharSequence), you can optionally call this to alter any words in this MarkovTextLimited that were used as a proper noun (determined by whether they were capitalized in the middle of a sentence), changing them to a ciphered version using the given Translator. |
| MarkovTextLimited | copy() | Copies the String array words and the 2D jagged int array processed into a new MarkovTextLimited. |
| static MarkovTextLimited | stringDeserialize(String data) | Recreates an already-analyzed MarkovTextLimited given a String produced by stringSerialize(). |
| String | stringSerialize() | Returns a representation of this MarkovTextLimited as a String; use stringDeserialize(String) to get a MarkovTextLimited back from this String. |

Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Details
- words
  
  public String[] words
  
  All words (case-sensitive and counting some punctuation as part of words) that this encountered during the latest call to analyze(CharSequence). Will be null if analyze() was never called.
- processed
  
  public int[][] processed
  
  Complicated data that mixes probabilities and the indices of words in words, generated during the latest call to analyze(CharSequence). This is a jagged 2D array. Will be null if analyze() was never called.
Constructor Details
- MarkovTextLimited
  
  public MarkovTextLimited()
- MarkovTextLimited
  
  public MarkovTextLimited(MarkovTextLimited other)
Method Details
- analyze
  
  public void analyze(CharSequence corpus)
  
  This is the main necessary step before using a MarkovTextLimited; you must call this method at some point before you can call any other methods. You can serialize this MarkovTextLimited after calling this method to avoid needing to call it again on later runs, or even include serialized MarkovTextLimited objects with a game so that this only needs to be called during pre-processing. This method analyzes the pairings of words in a (typically large) corpus text, treating some punctuation as part of words and some kinds as their own "words." It uses only the one preceding word to determine the subsequent word. When it finishes processing, it stores the results in words and processed, which allows other methods to be called (they will throw a NullPointerException if analyze() hasn't been called).
  
  Parameters:
  
  corpus - a typically-large sample text in the style that should be mimicked
- changeNames
  
  public void changeNames(Translator translator)
  
  After calling analyze(CharSequence), you can optionally call this to alter any words in this MarkovTextLimited that were used as a proper noun (determined by whether they were capitalized in the middle of a sentence), changing them to a ciphered version using the given Translator. Normally you would initialize a Translator with a Language that matches the style you want for all names in this text, then pass that to this method during pre-processing (not necessarily at runtime, since this method isn't especially fast if the corpus was large). This method modifies this MarkovTextLimited in-place.
  
  Parameters:
  
  translator - a Translator that will be used to translate proper nouns in this MarkovTextLimited's word array
- chain
  
  public String chain(long seed)
  
  Generate a roughly-sentence-sized piece of text based on the previously analyzed corpus text (using analyze(CharSequence)) that terminates when stop punctuation is used (".", "!", "?", or "..."), or once the length would be greater than 200 characters without encountering stop punctuation (it terminates such a sentence with "." or "...").
  
  Parameters:
  
  seed - the seed for the random decisions this makes, as a long; any long can be used
  
  Returns:
  
  a String generated from the analyzed corpus text's word placement, usually a small sentence
- chain
  
  public String chain(long seed, int maxLength)
  
  Generate a roughly-sentence-sized piece of text based on the previously analyzed corpus text (using analyze(CharSequence)) that terminates when stop punctuation is used (".", "!", "?", or "..."), or once adding any further word would exceed maxLength (it terminates such a sentence with "." or "...").
  
  Parameters:
  
  seed - the seed for the random decisions this makes, as a long; any long can be used
  
  maxLength - the maximum length for the generated String, in number of characters
  
  Returns:
  
  a String generated from the analyzed corpus text's word placement, usually a small sentence
- stringSerialize
  
  public String stringSerialize()
  
  Returns a representation of this MarkovTextLimited as a String; use stringDeserialize(String) to get a MarkovTextLimited back from this String. The words and processed fields must have been given values by either direct assignment, calling analyze(CharSequence), or building this MarkovTextLimited with the aforementioned stringDeserialize method. Uses spaces to separate words and a tab to separate the two fields.
  
  Returns:
  
  a String that can be used to store the analyzed words and frequencies in this MarkovTextLimited
- appendTo
  
  public <S extends CharSequence & Appendable> S appendTo(S sb)
- stringDeserialize
  
  public static MarkovTextLimited stringDeserialize(String data)
  
  Recreates an already-analyzed MarkovTextLimited given a String produced by stringSerialize().
  
  Parameters:
  
  data - a String returned by stringSerialize()
  
  Returns:
  
  a MarkovTextLimited that is ready to generate text with chain(long)
- copy
  
  public MarkovTextLimited copy()
  
  Copies the String array words and the 2D jagged int array processed into a new MarkovTextLimited. None of the arrays will be equivalent references, but the Strings (being immutable) will be the same objects in both MarkovTextLimited instances. This is primarily useful with changeNames(Translator), which can produce several variants on names given several initial copies produced with this method.
  
  Returns:
  
  a copy of this MarkovTextLimited
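
The copying behavior described for copy() can be illustrated with plain arrays. This is a sketch of the general technique, not the method's actual code: CopySketch and copyJagged are hypothetical names. It shows why a jagged int[][] needs each row cloned separately, while cloning a String[] safely shares the immutable String objects.

```java
import java.util.Arrays;

/** Hypothetical illustration of copying a String[] and a jagged int[][]. */
final class CopySketch {
    /** Clones the outer array and every row, so no int[] is shared with the source. */
    static int[][] copyJagged(int[][] source) {
        int[][] result = new int[source.length][];
        for (int i = 0; i < source.length; i++) {
            result[i] = source[i].clone();
        }
        return result;
    }

    public static void main(String[] args) {
        String[] words = {"Alice", "went", "home."};
        int[][] processed = {{1, 2}, {2}, {}};

        String[] wordsCopy = words.clone(); // new array, same String objects
        int[][] processedCopy = copyJagged(processed);

        wordsCopy[0] = "Zorvath"; // stands in for a renamed proper noun
        processedCopy[0][0] = 99;

        // The originals are untouched by changes made to the copies.
        System.out.println(Arrays.toString(words));
        System.out.println(Arrays.deepToString(processed));
    }
}
```

Replacing an entry in the copied word array, as a renaming step would, leaves the original array pointing at its original Strings, so several independently renamed variants can be made from one analyzed source.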
