com.github.yellowstonegames.core.StringTools

public final class StringTools extends Object

Various utility functions for handling readable natural-language text. This has tools to wrap long CharSequences to fit in a maximum width, and to generally tidy up generated text. Tidying includes padding left and right (including a "strict" option that truncates Strings that are longer than the padded size), Capitalizing Each Word, capitalizing the first word in a sentence, replacing "a improper usage of a" with "an improved replacement using an," etc. This also has many predefined categories of chars that are considered widely enough supported by fonts, like COMMON_PUNCTUATION and LATIN_LETTERS_UPPER.
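The "a" to "an" correction mentioned above can be pictured as a small regex replacement. The following is only a sketch, not the library's code: it uses java.util.regex in place of RegExodus, and the class name ArticleSketch and the exact pattern are assumptions made for illustration.

```java
import java.util.regex.Pattern;

final class ArticleSketch {
    // A standalone "a" or "A", followed by whitespace and then a word starting with a vowel.
    private static final Pattern A_BEFORE_VOWEL =
            Pattern.compile("\\b([Aa])(?=\\s+[AEIOUaeiou])");

    /** Returns text with each such "a" replaced by "an", keeping its case. */
    static String correctABeforeVowel(CharSequence text) {
        return A_BEFORE_VOWEL.matcher(text).replaceAll("$1n");
    }

    public static void main(String[] args) {
        System.out.println(correctABeforeVowel("a item and a sword")); // an item and a sword
    }
}
```

The lookahead leaves the whitespace and the following word untouched, so only the article itself changes. Like the documented method, this checks spelling rather than pronunciation, so it would also rewrite "a unicorn".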

Field Summary

Fields

Modifier and Type

Field

Description

static final com.github.tommyettinger.ds.CharBitSetFixedSize

ALL_UNICODE_LETTER_SET

A CharBitSetFixedSize containing every letter char in the Unicode BMP as an index.

static final com.github.tommyettinger.ds.CharBitSetFixedSize

ALL_UNICODE_LOWERCASE_LETTER_SET

A CharBitSetFixedSize containing every lower-case letter char in the Unicode BMP as an index.

static final com.github.tommyettinger.ds.CharBitSetFixedSize

ALL_UNICODE_UPPERCASE_LETTER_SET

A CharBitSetFixedSize containing every upper-case letter char in the Unicode BMP as an index.

static final String

BOX_DRAWING

static final String

BOX_DRAWING_DOUBLE

static final String

BOX_DRAWING_SINGLE

static final String

COMMON_PUNCTUATION

static final String

CURRENCY

static final String

CYRILLIC_LETTERS

static final String

CYRILLIC_LETTERS_LOWER

static final String

CYRILLIC_LETTERS_UPPER

static final String

DIGITS

static final String

ENGLISH_LETTERS

static final String

ENGLISH_LETTERS_LOWER

static final String

ENGLISH_LETTERS_UPPER

static final String

GREEK_LETTERS

static final String

GREEK_LETTERS_LOWER

Includes both lower-case forms of Sigma, 'ς' and 'σ', matching the two upper-case Sigmas in GREEK_LETTERS_UPPER.

static final String

GREEK_LETTERS_UPPER

Includes the letter Sigma, 'Σ', twice because it has two lower-case forms in GREEK_LETTERS_LOWER.

static final String

GROUPING_SIGNS_CLOSE

An index in GROUPING_SIGNS_OPEN can be used here to find the closing char for that opening one.

static final String

GROUPING_SIGNS_OPEN

Can be used to match an index with one in GROUPING_SIGNS_CLOSE to find the closing char (this way only).

static final String

LATIN_EXTENDED_LETTERS

static final String

LATIN_EXTENDED_LETTERS_LOWER

static final String

LATIN_EXTENDED_LETTERS_UPPER

static final String

LATIN_LETTERS

static final String

LATIN_LETTERS_LOWER

static final String

LATIN_LETTERS_UPPER

static final String

LETTERS

static final String

LETTERS_AND_NUMBERS

static final String

LETTERS_LOWER

static final String

LETTERS_UPPER

static final String

MARKS

static final String

MODERN_PUNCTUATION

static final regexodus.Pattern

nonSpacePattern

static final String

PERMISSIBLE_CHARS

A constant containing only chars that are reasonably likely to be supported by broad fonts and thus display-able.

static final String

PUNCTUATION

static final String

SPACING

static final String

TECHNICAL_PUNCTUATION

static final String

UNCOMMON_PUNCTUATION

static final String

VISUAL_SYMBOLS

static final regexodus.Pattern

whitespacePattern

Method Summary

Modifier and Type

Method

Description

static StringBuilder

appendJoined(StringBuilder sb, CharSequence delimiter, boolean... elements)

static StringBuilder

appendJoined(StringBuilder sb, CharSequence delimiter, byte... elements)

static StringBuilder

appendJoined(StringBuilder sb, CharSequence delimiter, char... elements)

static StringBuilder

appendJoined(StringBuilder sb, CharSequence delimiter, double... elements)

static StringBuilder

appendJoined(StringBuilder sb, CharSequence delimiter, float... elements)

static StringBuilder

appendJoined(StringBuilder sb, CharSequence delimiter, int... elements)

static StringBuilder

appendJoined(StringBuilder sb, CharSequence delimiter, long... elements)

static StringBuilder

appendJoined(StringBuilder sb, CharSequence delimiter, short... elements)

static StringBuilder

appendJoined(StringBuilder sb, CharSequence delimiter, CharSequence... elements)

static StringBuilder

appendJoined(StringBuilder sb, CharSequence delimiter, Iterable<?> elements)

Deprecated.

static StringBuilder

appendJoined(StringBuilder sb, CharSequence delimiter, Object[] elements)

Deprecated.

static StringBuilder

appendJoinedArrays(StringBuilder sb, CharSequence delimiter, char[]... elements)

static StringBuilder

appendJoinedDense(StringBuilder sb, boolean... elements)

Deprecated.

static StringBuilder

appendJoinedDense(StringBuilder sb, char t, char f, boolean... elements)

Deprecated.

static StringBuilder

appendJoinedReadably(StringBuilder sb, CharSequence delimiter, long... elements)

Like Base.appendJoined(CharSequence, CharSequence, long[]), but this appends an 'L' to each number so they can be read in by Java.

static String

capitalize(CharSequence original)

Capitalizes Each Word In The Parameter original, Returning A New String.

static boolean

contains(CharSequence text, char[] search)

Deprecated.

static boolean

contains(CharSequence text, CharSequence search)

Deprecated.

static int

containsPart(CharSequence text, char[] search)

Deprecated.

static int

containsPart(CharSequence text, char[] search, CharSequence prefix, CharSequence suffix)

Tries to find as much of the sequence prefix search suffix as it can in text, where prefix and suffix are CharSequences for some reason and search is a char array.

static int

containsPart(CharSequence text, CharSequence search)

Deprecated.

static String

correctABeforeVowel(CharSequence text)

A simple method that looks for any occurrences of the word 'a' followed by some non-zero amount of whitespace and then any vowel starting the following word (such as 'a item'), then replaces each such improper 'a' with 'an' (such as 'an item').

static int

count(String source, int search)

Scans repeatedly in source for the codepoint search (which is usually a char literal), not scanning the same section twice, and returns the number of instances of search that were found, or 0 if source is null.

static int

count(String source, int search, int startIndex, int endIndex)

Scans repeatedly in source (only using the area from startIndex, inclusive, to endIndex, exclusive) for the codepoint search (which is usually a char literal), not scanning the same section twice, and returns the number of instances of search that were found, or 0 if source is null or if the searched area is empty.

static int

count(String source, String search)

Scans repeatedly in source for the String search, not scanning the same char twice except as part of a larger String, and returns the number of instances of search that were found, or 0 if source is null or if search is null or empty.

static int

count(String source, String search, int startIndex, int endIndex)

Scans repeatedly in source (only using the area from startIndex, inclusive, to endIndex, exclusive) for the String search, not scanning the same char twice except as part of a larger String, and returns the number of instances of search that were found, or 0 if source or search is null or if the searched area is empty.

static com.github.tommyettinger.ds.CharBitSetFixedSize

decompressCategory(regexodus.Category category)

Takes the compressed bitset inside a RegExodus Category and decompresses it to a jdkgdxds CharBitSetFixedSize.

static int

indexOf(CharSequence text, String regex)

static int

indexOf(CharSequence text, String regex, int beginIndex)

static int

indexOf(CharSequence text, regexodus.Pattern regex)

static int

indexOf(CharSequence text, regexodus.Pattern regex, int beginIndex)

static String

join(CharSequence delimiter, boolean... elements)

static String

join(CharSequence delimiter, byte... elements)

static String

join(CharSequence delimiter, char... elements)

static String

join(CharSequence delimiter, double... elements)

static String

join(CharSequence delimiter, float... elements)

static String

join(CharSequence delimiter, int... elements)

static String

join(CharSequence delimiter, long... elements)

static String

join(CharSequence delimiter, short... elements)

static String

join(CharSequence delimiter, CharSequence... elements)

Deprecated.

static String

join(CharSequence delimiter, Iterable<?> elements)

Deprecated.

static String

join(CharSequence delimiter, Object[] elements)

Deprecated.

static String

joinArrays(CharSequence delimiter, char[]... elements)

static String

joinDense(boolean... elements)

Deprecated.

static String

joinDense(char t, char f, boolean... elements)

Deprecated.

static String

joinReadably(CharSequence delimiter, long... elements)

Like Base.join(CharSequence, long[]), but this appends an 'L' to each number, so they can be read in by Java.

static String

padLeft(String text, char padChar, int minimumLength)

If text is shorter than the given minimumLength, returns a String with text padded on the left with padChar until it reaches that length; otherwise it simply returns text.

static String

padLeft(String text, int minimumLength)

If text is shorter than the given minimumLength, returns a String with text padded on the left with spaces until it reaches that length; otherwise it simply returns text.

static String

padLeftStrict(String text, char padChar, int totalLength)

Constructs a String with exactly the given totalLength by taking text (or a substring of it) and padding it on its left side with padChar until totalLength is reached.

static String

padLeftStrict(String text, int totalLength)

Constructs a String with exactly the given totalLength by taking text (or a substring of it) and padding it on its left side with spaces until totalLength is reached.

static String

padRight(String text, char padChar, int minimumLength)

If text is shorter than the given minimumLength, returns a String with text padded on the right with padChar until it reaches that length; otherwise it simply returns text.

static String

padRight(String text, int minimumLength)

If text is shorter than the given minimumLength, returns a String with text padded on the right with spaces until it reaches that length; otherwise it simply returns text.

static String

padRightStrict(String text, char padChar, int totalLength)

Constructs a String with exactly the given totalLength by taking text (or a substring of it) and padding it on its right side with padChar until totalLength is reached.

static String

padRightStrict(String text, int totalLength)

Constructs a String with exactly the given totalLength by taking text (or a substring of it) and padding it on its right side with spaces until totalLength is reached.

static String

replace(CharSequence text, CharSequence before, CharSequence after)

Deprecated.

static String

safeSubstring(String source, int beginIndex, int endIndex)

Like String.substring(int, int) but returns "" instead of throwing any sort of Exception.

static String

sentenceCase(CharSequence original)

Attempts to scan for sentences in original, capitalizes the first letter of each sentence, and otherwise leaves the CharSequence untouched as it returns it as a String.

static String[]

split(String source, String delimiter)

Like String.split(String) but doesn't use any regex for splitting (the delimiter is a literal String).

static List<String>

wrap(CharSequence longText, int width)

Word-wraps the given String (or other CharSequence, such as a StringBuilder) so it is split into zero or more Strings as lines of text, with the given width as the maximum width for a line.

static List<String>

wrap(List<String> receiving, CharSequence longText, int width)

Word-wraps the given String (or other CharSequence, such as a StringBuilder) so it is split into zero or more Strings as lines of text, with the given width as the maximum width for a line; appends the word-wrapped lines to the given List of Strings and does not create a new List.

Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
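
The difference between the padLeft and padLeftStrict entries in the summary can be sketched in a few lines. This is an illustrative stand-in rather than the library's implementation: the class name PadSketch is hypothetical, and the sketch assumes that strict padding keeps the leading chars when it has to truncate, which the summary does not spell out.

```java
final class PadSketch {
    /** Pads on the left only when text is shorter than minimumLength; otherwise returns text as-is. */
    static String padLeft(String text, char padChar, int minimumLength) {
        if (text.length() >= minimumLength) return text;
        StringBuilder sb = new StringBuilder(minimumLength);
        for (int i = text.length(); i < minimumLength; i++) sb.append(padChar);
        return sb.append(text).toString();
    }

    /** Always returns exactly totalLength chars; longer text is cut (assumption: leading chars are kept). */
    static String padLeftStrict(String text, char padChar, int totalLength) {
        if (totalLength <= 0) return "";
        if (text.length() > totalLength) return text.substring(0, totalLength);
        return padLeft(text, padChar, totalLength);
    }

    public static void main(String[] args) {
        System.out.println(padLeft("123456", '0', 5));       // 123456
        System.out.println(padLeftStrict("123456", '0', 5)); // 12345
    }
}
```

The strict form is the one to reach for when text must fit a fixed-width column, since the non-strict form can return a String longer than the requested length.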

Field Details
- whitespacePattern
  
  public static final regexodus.Pattern whitespacePattern
- nonSpacePattern
  
  public static final regexodus.Pattern nonSpacePattern
- PERMISSIBLE_CHARS
  public static final String PERMISSIBLE_CHARS
  
  A constant containing only chars that are reasonably likely to be supported by broad fonts and thus display-able. This assumes the font supports Latin, Greek, and Cyrillic alphabets, with good support for extended Latin (at least for European languages) but not required to be complete enough to support the very large Vietnamese set of extensions to Latin, nor to support any International Phonetic Alphabet (IPA) chars. It also assumes box drawing characters are supported and a handful of common dingbats, such as male and female signs. It does not include the tab, newline, or carriage return characters, since these don't usually make sense on a grid of chars.
  
  See Also:
  
  Constant Field Values
- BOX_DRAWING_SINGLE
  public static final String BOX_DRAWING_SINGLE
  
  See Also:
  
  Constant Field Values
- BOX_DRAWING_DOUBLE
  public static final String BOX_DRAWING_DOUBLE
  
  See Also:
  
  Constant Field Values
- BOX_DRAWING
  public static final String BOX_DRAWING
  
  See Also:
  
  Constant Field Values
- VISUAL_SYMBOLS
  public static final String VISUAL_SYMBOLS
  
  See Also:
  
  Constant Field Values
- DIGITS
  public static final String DIGITS
  
  See Also:
  
  Constant Field Values
- MARKS
  public static final String MARKS
  
  See Also:
  
  Constant Field Values
- GROUPING_SIGNS_OPEN
  public static final String GROUPING_SIGNS_OPEN
  
  Can be used to match an index with one in GROUPING_SIGNS_CLOSE to find the closing char (this way only).
  
  See Also:
  
  Constant Field Values
- GROUPING_SIGNS_CLOSE
  public static final String GROUPING_SIGNS_CLOSE
  
  An index in GROUPING_SIGNS_OPEN can be used here to find the closing char for that opening one.
  
  See Also:
  
  Constant Field Values
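  
  The index pairing between the two grouping constants can be sketched as follows. The OPEN and CLOSE strings here are short assumed samples, not the real contents of GROUPING_SIGNS_OPEN and GROUPING_SIGNS_CLOSE, and GroupingSketch is a hypothetical name.

  ```java
  final class GroupingSketch {
      // Assumed sample contents; the real constants may hold more (or different) chars.
      static final String OPEN = "([{<";
      static final String CLOSE = ")]}>";

      /** Returns the closing char paired with open, or open itself if it is not an opening sign. */
      static char closingFor(char open) {
          int idx = OPEN.indexOf(open);
          return idx < 0 ? open : CLOSE.charAt(idx);
      }

      public static void main(String[] args) {
          System.out.println(closingFor('{')); // }
      }
  }
  ```

  The lookup runs from the opening string to the closing string, which is the direction the field documentation describes.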
- COMMON_PUNCTUATION
  public static final String COMMON_PUNCTUATION
  
  See Also:
  
  Constant Field Values
- MODERN_PUNCTUATION
  public static final String MODERN_PUNCTUATION
  
  See Also:
  
  Constant Field Values
- UNCOMMON_PUNCTUATION
  public static final String UNCOMMON_PUNCTUATION
  
  See Also:
  
  Constant Field Values
- TECHNICAL_PUNCTUATION
  public static final String TECHNICAL_PUNCTUATION
  
  See Also:
  
  Constant Field Values
- PUNCTUATION
  public static final String PUNCTUATION
  
  See Also:
  
  Constant Field Values
- CURRENCY
  public static final String CURRENCY
  
  See Also:
  
  Constant Field Values
- SPACING
  public static final String SPACING
  
  See Also:
  
  Constant Field Values
- ENGLISH_LETTERS_UPPER
  public static final String ENGLISH_LETTERS_UPPER
  
  See Also:
  
  Constant Field Values
- ENGLISH_LETTERS_LOWER
  public static final String ENGLISH_LETTERS_LOWER
  
  See Also:
  
  Constant Field Values
- ENGLISH_LETTERS
  public static final String ENGLISH_LETTERS
  
  See Also:
  
  Constant Field Values
- LATIN_EXTENDED_LETTERS_UPPER
  public static final String LATIN_EXTENDED_LETTERS_UPPER
  
  See Also:
  
  Constant Field Values
- LATIN_EXTENDED_LETTERS_LOWER
  public static final String LATIN_EXTENDED_LETTERS_LOWER
  
  See Also:
  
  Constant Field Values
- LATIN_EXTENDED_LETTERS
  public static final String LATIN_EXTENDED_LETTERS
  
  See Also:
  
  Constant Field Values
- LATIN_LETTERS_UPPER
  public static final String LATIN_LETTERS_UPPER
  
  See Also:
  
  Constant Field Values
- LATIN_LETTERS_LOWER
  public static final String LATIN_LETTERS_LOWER
  
  See Also:
  
  Constant Field Values
- LATIN_LETTERS
  public static final String LATIN_LETTERS
  
  See Also:
  
  Constant Field Values
- GREEK_LETTERS_UPPER
  public static final String GREEK_LETTERS_UPPER
  
  Includes the letter Sigma, 'Σ', twice because it has two lower-case forms in GREEK_LETTERS_LOWER. This lets you use one index for both lower and upper case, like with Latin and Cyrillic.
  
  See Also:
  
  Constant Field Values
- GREEK_LETTERS_LOWER
  public static final String GREEK_LETTERS_LOWER
  
  Includes both lower-case forms of Sigma, 'ς' and 'σ', matching the two upper-case Sigmas in GREEK_LETTERS_UPPER. This lets you use one index for both lower and upper case, like with Latin and Cyrillic.
  
  See Also:
  
  Constant Field Values
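  
  Why Sigma is doubled in the upper-case constant becomes clearer with a tiny excerpt. The two strings here are an assumed four-letter slice (Rho through Tau) written with Unicode escapes, not the full constants, and GreekCaseSketch is a hypothetical name.

  ```java
  final class GreekCaseSketch {
      // Assumed excerpt: Rho, Sigma, Sigma, Tau (Sigma appears twice on purpose).
      static final String UPPER = "\u03A1\u03A3\u03A3\u03A4";
      // Assumed excerpt: rho, final sigma, sigma, tau.
      static final String LOWER = "\u03C1\u03C2\u03C3\u03C4";

      /** Maps a lower-case letter to the upper-case letter at the same index, or returns it unchanged. */
      static char toUpper(char lower) {
          int idx = LOWER.indexOf(lower);
          return idx < 0 ? lower : UPPER.charAt(idx);
      }

      public static void main(String[] args) {
          // Both lower-case sigma forms land on the same upper-case Sigma.
          System.out.println(toUpper('\u03C2') == toUpper('\u03C3')); // true
      }
  }
  ```

  Because both strings have the same length, every index after Sigma still lines up, which is what keeps Tau paired with tau in this excerpt.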
- GREEK_LETTERS
  public static final String GREEK_LETTERS
  
  See Also:
  
  Constant Field Values
- CYRILLIC_LETTERS_UPPER
  public static final String CYRILLIC_LETTERS_UPPER
  
  See Also:
  
  Constant Field Values
- CYRILLIC_LETTERS_LOWER
  public static final String CYRILLIC_LETTERS_LOWER
  
  See Also:
  
  Constant Field Values
- CYRILLIC_LETTERS
  public static final String CYRILLIC_LETTERS
  
  See Also:
  
  Constant Field Values
- LETTERS_UPPER
  public static final String LETTERS_UPPER
  
  See Also:
  
  Constant Field Values
- LETTERS_LOWER
  public static final String LETTERS_LOWER
  
  See Also:
  
  Constant Field Values
- LETTERS
  public static final String LETTERS
  
  See Also:
  
  Constant Field Values
- LETTERS_AND_NUMBERS
  public static final String LETTERS_AND_NUMBERS
  
  See Also:
  
  Constant Field Values
- ALL_UNICODE_LETTER_SET
  
  public static final com.github.tommyettinger.ds.CharBitSetFixedSize ALL_UNICODE_LETTER_SET
  
  A CharBitSetFixedSize containing every letter char in the Unicode BMP as an index. You can check if a char c is in this set with ALL_UNICODE_LETTER_SET.contains(c).
- ALL_UNICODE_UPPERCASE_LETTER_SET
  
  public static final com.github.tommyettinger.ds.CharBitSetFixedSize ALL_UNICODE_UPPERCASE_LETTER_SET
  
  A CharBitSetFixedSize containing every upper-case letter char in the Unicode BMP as an index. You can check if a char c is in this set with ALL_UNICODE_UPPERCASE_LETTER_SET.contains(c).
- ALL_UNICODE_LOWERCASE_LETTER_SET
  
  public static final com.github.tommyettinger.ds.CharBitSetFixedSize ALL_UNICODE_LOWERCASE_LETTER_SET
  
  A CharBitSetFixedSize containing every lower-case letter char in the Unicode BMP as an index. You can check if a char c is in this set with ALL_UNICODE_LOWERCASE_LETTER_SET.contains(c).
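  
  The idea of a char-indexed letter set can be approximated with the JDK alone. This sketch swaps in java.util.BitSet and Character.isLetter for the library's own set type and its RegExodus category data, so its membership may differ slightly depending on the Unicode version of the running JDK; LetterSetSketch is a hypothetical name.

  ```java
  import java.util.BitSet;

  final class LetterSetSketch {
      /** Builds a BitSet with one bit per BMP char, set when Character.isLetter reports a letter. */
      static BitSet buildLetterSet() {
          BitSet letters = new BitSet(0x10000);
          for (int c = 0; c <= 0xFFFF; c++) {
              if (Character.isLetter((char) c)) letters.set(c);
          }
          return letters;
      }

      public static void main(String[] args) {
          BitSet letters = buildLetterSet();
          System.out.println(letters.get('a')); // true
          System.out.println(letters.get('1')); // false
      }
  }
  ```

  Building the set once and then testing a bit is a constant-time membership check, which is the usage the contains(c) calls above describe.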
Method Details
- join
  
  @Deprecated public static String join(CharSequence delimiter, CharSequence... elements)
  
  Deprecated.
  
  Use TextTools.join(CharSequence, Object[]) instead.
  
  Parameters:
  
  delimiter -
  
  elements -
  
  Returns:
- joinArrays
  
  public static String joinArrays(CharSequence delimiter, char[]... elements)
- join
  
  public static String join(CharSequence delimiter, long... elements)
- join
  
  public static String join(CharSequence delimiter, double... elements)
- join
  
  public static String join(CharSequence delimiter, int... elements)
- join
  
  public static String join(CharSequence delimiter, float... elements)
- join
  
  public static String join(CharSequence delimiter, short... elements)
- join
  
  public static String join(CharSequence delimiter, char... elements)
- join
  
  public static String join(CharSequence delimiter, byte... elements)
- join
  
  public static String join(CharSequence delimiter, boolean... elements)
- joinReadably
  
  public static String joinReadably(CharSequence delimiter, long... elements)
  
  Like Base.join(CharSequence, long[]), but this appends an 'L' to each number, so they can be read in by Java. Replaced by Base.joinReadable(CharSequence, long[]) in most circumstances.
  
  Parameters:
  
  delimiter -
  
  elements -
  
  Returns:
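  
  A minimal sketch of the "readable" join, assuming that a null or empty array yields an empty String (the documentation above does not say). JoinReadablySketch is a hypothetical stand-in, not the library method.

  ```java
  final class JoinReadablySketch {
      /** Joins longs with delimiter, appending 'L' to each so the result is valid Java long literals. */
      static String joinReadably(CharSequence delimiter, long... elements) {
          if (elements == null || elements.length == 0) return "";
          StringBuilder sb = new StringBuilder();
          sb.append(elements[0]).append('L');
          for (int i = 1; i < elements.length; i++) {
              sb.append(delimiter).append(elements[i]).append('L');
          }
          return sb.toString();
      }

      public static void main(String[] args) {
          System.out.println(joinReadably(", ", 1L, -2L, 3000000000L)); // 1L, -2L, 3000000000L
      }
  }
  ```

  The suffix matters for values such as 3000000000, which do not fit in an int and would not compile as a plain literal when pasted into Java source.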
- appendJoinedReadably
  
  public static StringBuilder appendJoinedReadably(StringBuilder sb, CharSequence delimiter, long... elements)
  
  Like Base.appendJoined(CharSequence, CharSequence, long[]), but this appends an 'L' to each number so they can be read in by Java. Replaced by Base.appendJoinedReadable(CharSequence, CharSequence, long[]).
  
  Parameters:
  
  delimiter -
  
  elements -
  
  Returns:
- appendJoined
  
  public static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, CharSequence... elements)
- appendJoinedArrays
  
  public static StringBuilder appendJoinedArrays(StringBuilder sb, CharSequence delimiter, char[]... elements)
- appendJoined
  
  public static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, long... elements)
- appendJoined
  
  public static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, double... elements)
- appendJoined
  
  public static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, int... elements)
- appendJoined
  
  public static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, float... elements)
- appendJoined
  
  public static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, short... elements)
- appendJoined
  
  public static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, char... elements)
- appendJoined
  
  public static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, byte... elements)
- appendJoined
  
  public static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, boolean... elements)
- joinDense
  
  @Deprecated public static String joinDense(boolean... elements)
  
  Deprecated.
  
  Joins the boolean array elements without delimiters into a String, using "1" for true and "0" for false. This is "dense" because it doesn't have any delimiters between elements. Using TextTools.joinDense(boolean...) is recommended instead.
  
  Parameters:
  
  elements - an array or vararg of booleans
  
  Returns:
  
  a String using 1 for true elements and 0 for false, or the empty string if elements is null or empty
- joinDense
  
  @Deprecated public static String joinDense(char t, char f, boolean... elements)
  
  Deprecated.
  
  Joins the boolean array elements without delimiters into a String, using the char t for true and the char f for false. This is "dense" because it doesn't have any delimiters between elements. Using TextTools.joinDense(char, char, boolean...) is recommended instead.
  
  Parameters:
  
  t - the char to write for true values
  
  f - the char to write for false values
  
  elements - an array or vararg of booleans
  
  Returns:
  
  a String using t for true elements and f for false, or the empty string if elements is null or empty
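  
  Both dense joins can be sketched together, with the no-char overload delegating to the one that takes t and f. JoinDenseSketch is a hypothetical name, and this is an illustration of the documented behavior rather than the deprecated methods' actual source.

  ```java
  final class JoinDenseSketch {
      /** Writes t for each true element and f for each false one, with no delimiters. */
      static String joinDense(char t, char f, boolean... elements) {
          if (elements == null || elements.length == 0) return "";
          char[] out = new char[elements.length];
          for (int i = 0; i < elements.length; i++) {
              out[i] = elements[i] ? t : f;
          }
          return new String(out);
      }

      /** Same as above, using '1' for true and '0' for false. */
      static String joinDense(boolean... elements) {
          return joinDense('1', '0', elements);
      }

      public static void main(String[] args) {
          System.out.println(joinDense(true, false, true));     // 101
          System.out.println(joinDense('Y', 'N', true, false)); // YN
      }
  }
  ```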
- appendJoinedDense
  
  @Deprecated public static StringBuilder appendJoinedDense(StringBuilder sb, boolean... elements)
  
  Deprecated.
  
  Joins the boolean array elements without delimiters into a StringBuilder, using "1" for true and "0" for false. This is "dense" because it doesn't have any delimiters between elements. Using TextTools.appendJoinedDense(CharSequence, boolean...) is recommended instead.
  
  Parameters:
  
  sb - a StringBuilder that will be modified in-place
  
  elements - an array or vararg of booleans
  
  Returns:
  
  sb after modifications (if elements was non-null)
- appendJoinedDense
  
  @Deprecated public static StringBuilder appendJoinedDense(StringBuilder sb, char t, char f, boolean... elements)
  
  Deprecated.
  
  Joins the boolean array elements without delimiters into a StringBuilder, using the char t for true and the char f for false. This is "dense" because it doesn't have any delimiters between elements. Using TextTools.appendJoinedDense(CharSequence, char, char, boolean...) is recommended instead.
  
  Parameters:
  
  sb - a StringBuilder that will be modified in-place
  
  t - the char to write for true values
  
  f - the char to write for false values
  
  elements - an array or vararg of booleans
  
  Returns:
  
  sb after modifications (if elements was non-null)
- join
  
  @Deprecated public static String join(CharSequence delimiter, Object[] elements)
  
  Deprecated.
  
  Joins the items in elements by calling their toString method on them (or just using the String "null" for null items), and separating each item with delimiter. Unlike other join methods in this class, this does not take a vararg of Object items, since that would cause confusion with the overloads that take one object; it takes a non-vararg Object array instead. Using TextTools.join(CharSequence, Object[]) is recommended instead.
  
  Parameters:
  
  delimiter - the String or other CharSequence to separate items in elements with; if null, uses ""
  
  elements - the Object items to stringify and join into one String; if the array is null or empty, this returns an empty String, and if items are null, they are shown as "null"
  
  Returns:
  
  the String representations of the items in elements, separated by delimiter and put in one String
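  
  The null handling described above (null delimiter treated as "", null items shown as "null", null or empty array giving an empty String) can be sketched like this. JoinObjectsSketch is a hypothetical name and the body is an assumption about how such a join could be written, not the library's code.

  ```java
  final class JoinObjectsSketch {
      /** Joins the string forms of elements, separated by delimiter. */
      static String join(CharSequence delimiter, Object[] elements) {
          if (elements == null || elements.length == 0) return "";
          if (delimiter == null) delimiter = "";
          StringBuilder sb = new StringBuilder();
          for (int i = 0; i < elements.length; i++) {
              if (i > 0) sb.append(delimiter);
              sb.append(elements[i]); // StringBuilder.append(Object) writes "null" for null items
          }
          return sb.toString();
      }

      public static void main(String[] args) {
          System.out.println(join(", ", new Object[]{1, null, "x"})); // 1, null, x
      }
  }
  ```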
- join
  
  @Deprecated public static String join(CharSequence delimiter, Iterable<?> elements)
  
  Deprecated.
  
  Joins the items in elements by calling their toString method on them (or just using the String "null" for null items), and separating each item with delimiter. This can take any Iterable of any type for its elements parameter. Using TextTools.join(CharSequence, Iterable) is recommended instead.
  
  Parameters:
  
  delimiter - the String or other CharSequence to separate items in elements with; if null, uses ""
  
  elements - the Object items to stringify and join into one String; if Iterable is null or empty, this returns an empty String, and if items are null, they are shown as "null"
  
  Returns:
  
  the String representations of the items in elements, separated by delimiter and put in one String
- appendJoined
  
  @Deprecated public static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, Object[] elements)
  
  Deprecated.
  
  Joins the items in elements by calling their toString method on them (or just using the String "null" for null items), and separating each item with delimiter. Unlike other join methods in this class, this does not take a vararg of Object items, since that would cause confusion with the overloads that take one object; it takes a non-vararg Object array instead. Using TextTools.appendJoined(CharSequence, CharSequence, Object[]) is recommended instead.
  
  Parameters:
  
  sb - a StringBuilder that will be modified in-place
  
  delimiter - the String or other CharSequence to separate items in elements with; if null, uses ""
  
  elements - the Object items to stringify and append to sb; if the array is null or empty, nothing is appended, and if items are null, they are shown as "null"
  
  Returns:
  
  sb after modifications (if elements was non-null)
- appendJoined
  
  @Deprecated public static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, Iterable<?> elements)
  
  Deprecated.
  
  Joins the items in elements by calling their toString method on them (or just using the String "null" for null items), and separating each item with delimiter. This can take any Iterable of any type for its elements parameter. Using TextTools.appendJoined(CharSequence, CharSequence, Iterable) is recommended instead.
  
  Parameters:
  
  sb - a StringBuilder that will be modified in-place
  
  delimiter - the String or other CharSequence to separate items in elements with; if null, uses ""
  
  elements - the Object items to stringify and append to sb; if the Iterable is null or empty, nothing is appended, and if items are null, they are shown as "null"
  
  Returns:
  
  sb after modifications (if elements was non-null)
- contains
  
  @Deprecated public static boolean contains(CharSequence text, CharSequence search)
  
  Deprecated.
  
  Searches text for the exact contents of the CharSequence search; returns true if text contains search. Use TextTools.contains(CharSequence, CharSequence) instead.
  
  Parameters:
  
  text - a CharSequence, such as a String or StringBuilder, that might contain search
  
  search - a CharSequence to try to find in text
  
  Returns:
  
  true if search was found
- containsPart
  
  @Deprecated public static int containsPart(CharSequence text, CharSequence search)
  
  Deprecated.
  
  Tries to find as much of the CharSequence search as it can in the CharSequence text, always starting from the beginning of search (if the beginning isn't found, then it finds nothing), and returns the length of the found part of search (0 if not found). Use TextTools.containsPart(CharSequence, CharSequence) instead.
  
  Parameters:
  
  text - a CharSequence to search in
  
  search - a CharSequence to look for
  
  Returns:
  
  the length of the part of search that was found
- contains
  
  @Deprecated public static boolean contains(CharSequence text, char[] search)
  
  Deprecated.
  
  Searches text for the exact contents of the char array search; returns true if text contains search. Use TextTools.contains(CharSequence, char[]) instead.
  
  Parameters:
  
  text - a CharSequence, such as a String or StringBuilder, that might contain search
  
  search - a char array to try to find in text
  
  Returns:
  
  true if search was found
- containsPart
  
  @Deprecated public static int containsPart(CharSequence text, char[] search)
  
  Deprecated.
  
  Tries to find as much of the char array search as it can in the CharSequence text, always starting from the beginning of search (if the beginning isn't found, then it finds nothing), and returns the length of the found part of search (0 if not found). Use TextTools.containsPart(CharSequence, char[]) instead.
  
  Parameters:
  
  text - a CharSequence to search in
  
  search - a char array to look for
  
  Returns:
  
  the length of the searched-for char array that was found
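  
  One reading of the partial-match description is "the longest prefix of search that occurs anywhere in text". The sketch below implements that reading with a plain nested scan; it is an assumption about the behavior, not the deprecated method's source, and ContainsPartSketch is a hypothetical name.

  ```java
  final class ContainsPartSketch {
      /** Returns the length of the longest prefix of search found somewhere in text, or 0 if none. */
      static int containsPart(CharSequence text, char[] search) {
          if (text == null || search == null) return 0;
          int best = 0;
          for (int start = 0; start < text.length() && best < search.length; start++) {
              int matched = 0;
              // Extend the match from this start for as long as text and search agree.
              while (matched < search.length
                      && start + matched < text.length()
                      && text.charAt(start + matched) == search[matched]) {
                  matched++;
              }
              best = Math.max(best, matched);
          }
          return best;
      }

      public static void main(String[] args) {
          System.out.println(containsPart("hello world", "world!".toCharArray())); // 5
      }
  }
  ```

  Under this reading, a full match is simply the case where the returned length equals search.length.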
- containsPart
  
  public static int containsPart(CharSequence text, char[] search, CharSequence prefix, CharSequence suffix)
  
  Tries to find as much of the sequence prefix search suffix as it can in text, where prefix and suffix are CharSequences for some reason and search is a char array. Returns the length of the sequence it was able to match, up to prefix.length() + search.length + suffix.length(), or 0 if no part of the looked-for sequence could be found.
  This is almost certainly too specific to be useful outside a handful of cases, but it isn't marked as deprecated because it was removed from TextTools. If you for whatever reason need this, it is here.
  
  Parameters:
  
  text - a CharSequence to search in
  
  search - a char array to look for, surrounded by prefix and suffix
  
  prefix - a mandatory prefix before search, separated for some weird optimization reason
  
  suffix - a mandatory suffix after search, separated for some weird optimization reason
  
  Returns:
  
  the length of the searched-for prefix+search+suffix that was found
- replace
  
  @Deprecated public static String replace(CharSequence text, CharSequence before, CharSequence after)
  
  Deprecated.
  
  Use TextTools.replace(CharSequence, CharSequence, CharSequence) instead.
  
  Parameters:
  
  text -
  
  before -
  
  after -
  
  Returns:
- count
  
  public static int count(String source, String search)
  
  Scans repeatedly in source for the String search, not scanning the same char twice except as part of a larger String, and returns the number of instances of search that were found, or 0 if source is null or if search is null or empty.
  
  Parameters:
  
  source - a String to look through
  
  search - a String to look for
  
  Returns:
  
  the number of times search was found in source
- count
  
  public static int count(String source, int search)
  
  Scans repeatedly in source for the codepoint search (which is usually a char literal), not scanning the same section twice, and returns the number of instances of search that were found, or 0 if source is null.
  
  Parameters:
  
  source - a String to look through
  
  search - a codepoint or char to look for
  
  Returns:
  
  the number of times search was found in source
- count
  
  public static int count(String source, String search, int startIndex, int endIndex)
  
  Scans repeatedly in source (only using the area from startIndex, inclusive, to endIndex, exclusive) for the String search, not scanning the same char twice except as part of a larger String, and returns the number of instances of search that were found, or 0 if source or search is null or if the searched area is empty. If endIndex is negative, this will search from startIndex until the end of the source.
  
  Parameters:
  
  source - a String to look through
  
  search - a String to look for
  
  startIndex - the first index to search through, inclusive
  
  endIndex - the last index to search through, exclusive; if negative this will search the rest of source
  
  Returns:
  
  the number of times search was found in source
- count
  
  public static int count(String source, int search, int startIndex, int endIndex)
  
  Scans repeatedly in source (only using the area from startIndex, inclusive, to endIndex, exclusive) for the codepoint search (which is usually a char literal), not scanning the same section twice, and returns the number of instances of search that were found, or 0 if source is null or if the searched area is empty. If endIndex is negative, this will search from startIndex until the end of the source.
  
  Parameters:
  
  source - a String to look through
  
  search - a codepoint or char to look for
  
  startIndex - the first index to search through, inclusive
  
  endIndex - the last index to search through, exclusive; if negative this will search the rest of source
  
  Returns:
  
  the number of times search was found in source
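
The ranged codepoint variant might look roughly like the sketch below. RangedCountSketch is a hypothetical name, and clamping startIndex and endIndex into the bounds of source is an assumption that the description does not state.

```java
final class RangedCountSketch {
    /** Counts codepoint search in source from startIndex (inclusive) to endIndex (exclusive). */
    static int count(String source, int search, int startIndex, int endIndex) {
        if (source == null) return 0;
        // A negative endIndex means "search to the end of source", as described above.
        // Assumption: out-of-range indices are clamped rather than rejected.
        int end = endIndex < 0 ? source.length() : Math.min(endIndex, source.length());
        int amount = 0;
        int idx = Math.max(startIndex, 0);
        while (idx < end) {
            int cp = source.codePointAt(idx);
            if (cp == search) amount++;
            // Step over both halves of a surrogate pair when needed.
            idx += Character.charCount(cp);
        }
        return amount;
    }

    public static void main(String[] args) {
        System.out.println(count("a,b,,c", ',', 0, -1)); // prints 3
        System.out.println(count("a,b,,c", ',', 2, 4));  // prints 1
    }
}
```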
- safeSubstring
  
  public static String safeSubstring(String source, int beginIndex, int endIndex)
  
  Like String.substring(int, int) but returns "" instead of throwing any sort of Exception. This delegates to TextTools.safeSubstring(String, int, int).
  
  Parameters:
  
  source - the String to get a substring from
  
  beginIndex - the first index, inclusive; will be treated as 0 if negative
  
  endIndex - the index after the last character (exclusive); if negative this will be source.length()
  
  Returns:
  
  the substring of source between beginIndex and endIndex, or "" if any parameters are null/invalid
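
A minimal sketch of the "never throws" contract follows. SafeSubstringSketch is a hypothetical name; treating an endIndex beyond source.length() as source.length() is an assumption, since only the negative case is documented above.

```java
final class SafeSubstringSketch {
    /** Like String.substring(int, int), but returns "" where that method would throw. */
    static String safeSubstring(String source, int beginIndex, int endIndex) {
        if (source == null || source.isEmpty()) return "";
        if (beginIndex < 0) beginIndex = 0;
        // Negative means "to the end"; clamping an overlong endIndex is an assumption.
        if (endIndex < 0 || endIndex > source.length()) endIndex = source.length();
        if (beginIndex >= endIndex) return "";
        return source.substring(beginIndex, endIndex);
    }

    public static void main(String[] args) {
        System.out.println(safeSubstring("hello", -3, 2)); // prints he
        System.out.println(safeSubstring("hello", 2, -1)); // prints llo
    }
}
```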
- split
  
  public static String[] split(String source, String delimiter)
  
  Like String.split(String) but doesn't use any regex for splitting (the delimiter is a literal String).
  
  Parameters:
  
  source - the String to get split-up substrings from
  
  delimiter - the literal String to split on (not a regex); will not be included in the returned String array
  
  Returns:
  
  a String array consisting of at least one String (the entirety of source if nothing was split)
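
The difference from regex splitting is easiest to see with a delimiter such as ".", which is a wildcard in a regex but an ordinary character here. The sketch below is one way to split on a literal delimiter. LiteralSplitSketch is a hypothetical name, and keeping trailing empty Strings is an assumption (String.split(String) drops them).

```java
import java.util.ArrayList;
import java.util.List;

final class LiteralSplitSketch {
    /** Splits source on a literal delimiter; source is assumed to be non-null. */
    static String[] split(String source, String delimiter) {
        if (delimiter == null || delimiter.isEmpty()) return new String[]{source};
        List<String> parts = new ArrayList<>();
        int start = 0;
        int idx;
        while ((idx = source.indexOf(delimiter, start)) >= 0) {
            parts.add(source.substring(start, idx));
            start = idx + delimiter.length();
        }
        // Whatever follows the last delimiter (possibly "") is the final piece.
        parts.add(source.substring(start));
        return parts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // With a regex, "." would match every char; here it is only a period.
        System.out.println(String.join("|", split("a.b.c", "."))); // prints a|b|c
    }
}
```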
- padRight
  
  public static String padRight(String text, int minimumLength)
  
  If text is shorter than the given minimumLength, returns a String with text padded on the right with spaces until it reaches that length; otherwise it simply returns text.
  
  Parameters:
  
  text - the text to pad if necessary
  
  minimumLength - the minimum length of String to return
  
  Returns:
  
  text, potentially padded with spaces to reach the given minimum length
- padRight
  
  public static String padRight(String text, char padChar, int minimumLength)
  
  If text is shorter than the given minimumLength, returns a String with text padded on the right with padChar until it reaches that length; otherwise it simply returns text.
  
  Parameters:
  
  text - the text to pad if necessary
  
  padChar - the char to use to pad text, if necessary
  
  minimumLength - the minimum length of String to return
  
  Returns:
  
  text, potentially padded with padChar to reach the given minimum length
- padRightStrict
  
  public static String padRightStrict(String text, int totalLength)
  
  Constructs a String with exactly the given totalLength by taking text (or a substring of it) and padding it on its right side with spaces until totalLength is reached. If text is longer than totalLength, this only uses the portion of text needed to fill totalLength, and no more.
  
  Parameters:
  
  text - the String to pad if necessary, or truncate if too long
  
  totalLength - the exact length of String to return
  
  Returns:
  
  a String with exactly totalLength for its length, made from text and possibly extra spaces
- padRightStrict
  
  public static String padRightStrict(String text, char padChar, int totalLength)
  
  Constructs a String with exactly the given totalLength by taking text (or a substring of it) and padding it on its right side with padChar until totalLength is reached. If text is longer than totalLength, this only uses the portion of text needed to fill totalLength, and no more.
  
  Parameters:
  
  text - the String to pad if necessary, or truncate if too long
  
  padChar - the char to use to fill any remaining length
  
  totalLength - the exact length of String to return
  
  Returns:
  
  a String with exactly totalLength for its length, made from text and possibly padChar
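
The contrast between padRight and padRightStrict can be sketched as follows. PadRightSketch is a hypothetical name, the lengths are assumed to be non-negative, and keeping the leading chars of text when truncating is an assumption about which portion is used.

```java
final class PadRightSketch {
    /** Pads on the right up to minimumLength; longer text is returned unchanged. */
    static String padRight(String text, char padChar, int minimumLength) {
        if (text.length() >= minimumLength) return text;
        StringBuilder sb = new StringBuilder(minimumLength).append(text);
        while (sb.length() < minimumLength) sb.append(padChar);
        return sb.toString();
    }

    /** Always returns exactly totalLength chars, truncating if needed. */
    static String padRightStrict(String text, char padChar, int totalLength) {
        // Assumption: truncation keeps the start of text.
        if (text.length() >= totalLength) return text.substring(0, totalLength);
        return padRight(text, padChar, totalLength);
    }

    public static void main(String[] args) {
        System.out.println(padRight("abcdef", '.', 5));       // prints abcdef
        System.out.println(padRightStrict("abcdef", '.', 5)); // prints abcde
        System.out.println(padRightStrict("abc", '.', 5));    // prints abc..
    }
}
```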
- padLeft
  
  public static String padLeft(String text, int minimumLength)
  
  If text is shorter than the given minimumLength, returns a String with text padded on the left with spaces until it reaches that length; otherwise it simply returns text.
  
  Parameters:
  
  text - the text to pad if necessary
  
  minimumLength - the minimum length of String to return
  
  Returns:
  
  text, potentially padded with spaces to reach the given minimum length
- padLeft
  
  public static String padLeft(String text, char padChar, int minimumLength)
  
  If text is shorter than the given minimumLength, returns a String with text padded on the left with padChar until it reaches that length; otherwise it simply returns text.
  
  Parameters:
  
  text - the text to pad if necessary
  
  padChar - the char to use to pad text, if necessary
  
  minimumLength - the minimum length of String to return
  
  Returns:
  
  text, potentially padded with padChar to reach the given minimum length
- padLeftStrict
  
  public static String padLeftStrict(String text, int totalLength)
  
  Constructs a String with exactly the given totalLength by taking text (or a substring of it) and padding it on its left side with spaces until totalLength is reached. If text is longer than totalLength, this only uses the portion of text needed to fill totalLength, and no more.
  
  Parameters:
  
  text - the String to pad if necessary, or truncate if too long
  
  totalLength - the exact length of String to return
  
  Returns:
  
  a String with exactly totalLength for its length, made from text and possibly extra spaces
- padLeftStrict
  
  public static String padLeftStrict(String text, char padChar, int totalLength)
  
  Constructs a String with exactly the given totalLength by taking text (or a substring of it) and padding it on its left side with padChar until totalLength is reached. If text is longer than totalLength, this only uses the portion of text needed to fill totalLength, and no more.
  
  Parameters:
  
  text - the String to pad if necessary, or truncate if too long
  
  padChar - the char to use to fill any remaining length
  
  totalLength - the exact length of String to return
  
  Returns:
  
  a String with exactly totalLength for its length, made from text and possibly padChar
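
Left padding is the usual way to right-align values in a fixed-width column. The sketch below mirrors the descriptions above. PadLeftSketch is a hypothetical name, the lengths are assumed to be non-negative, and keeping the leading chars of text when truncating is an assumption.

```java
final class PadLeftSketch {
    /** Pads on the left up to minimumLength; longer text is returned unchanged. */
    static String padLeft(String text, char padChar, int minimumLength) {
        if (text.length() >= minimumLength) return text;
        StringBuilder sb = new StringBuilder(minimumLength);
        for (int i = text.length(); i < minimumLength; i++) sb.append(padChar);
        return sb.append(text).toString();
    }

    /** Always returns exactly totalLength chars, truncating if needed. */
    static String padLeftStrict(String text, char padChar, int totalLength) {
        // Assumption: truncation keeps the start of text.
        if (text.length() >= totalLength) return text.substring(0, totalLength);
        return padLeft(text, padChar, totalLength);
    }

    public static void main(String[] args) {
        System.out.println(padLeft("42", '0', 5));           // prints 00042
        System.out.println(padLeftStrict("123456", ' ', 4)); // prints 1234
    }
}
```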
- wrap
  
  public static List<String> wrap(CharSequence longText, int width)
  
  Word-wraps the given String (or other CharSequence, such as a StringBuilder) so it is split into zero or more Strings as lines of text, with the given width as the maximum width for a line. This correctly splits most (all?) text in European languages on spaces (treating all whitespace characters matched by the regex \s as breaking), and also uses the English-language rule (probably used in other languages as well) of splitting on hyphens and other dash characters (Unicode category Pd) in the middle of a word. This means for a phrase like "UN Secretary General Ban-Ki Moon", if the width was 12, then the Strings in the List returned would be:
  
  "UN Secretary"
  "General Ban-"
  "Ki Moon"
  
  Spaces are not preserved if they were used to split something into two lines, but dashes are.
  
  Parameters:
  
  longText - a probably-large piece of text that needs to be split into multiple lines with a max width
  
  width - the max width to use for any line, removing trailing whitespace at the end of a line
  
  Returns:
  
  a List of Strings for the lines after word-wrapping
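
The wrapping rules described above (break on whitespace, break after dashes in category Pd, keep the dash but drop the space) can be approximated with a greedy loop over java.util.regex pieces. This is a sketch under stated assumptions, not the library's implementation: WrapSketch is a hypothetical name, and a single piece longer than width is simply placed on its own line here.

```java
import java.util.ArrayList;
import java.util.List;

final class WrapSketch {
    /** Greedy word wrap that may also break after any dash (Unicode category Pd). */
    static List<String> wrap(CharSequence longText, int width) {
        List<String> lines = new ArrayList<>();
        if (longText == null || width <= 0) return lines;
        StringBuilder line = new StringBuilder();
        for (String word : longText.toString().split("\\s+")) {
            if (word.isEmpty()) continue;
            boolean first = true;
            // The lookbehind splits after each dash and keeps the dash with its left part.
            for (String piece : word.split("(?<=\\p{Pd})")) {
                // Only the first piece of a word is separated from the line by a space.
                String glue = (first && line.length() > 0) ? " " : "";
                if (line.length() > 0 && line.length() + glue.length() + piece.length() > width) {
                    lines.add(line.toString());
                    line.setLength(0);
                    glue = ""; // the breaking space is not preserved
                }
                line.append(glue).append(piece);
                first = false;
            }
        }
        if (line.length() > 0) lines.add(line.toString());
        return lines;
    }

    public static void main(String[] args) {
        // prints [UN Secretary, General Ban-, Ki Moon]
        System.out.println(wrap("UN Secretary General Ban-Ki Moon", 12));
    }
}
```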
- wrap
  
  public static List<String> wrap(List<String> receiving, CharSequence longText, int width)
  
  Word-wraps the given String (or other CharSequence, such as a StringBuilder) so it is split into zero or more Strings as lines of text, with the given width as the maximum width for a line; appends the word-wrapped lines to the given List of Strings and does not create a new List. This correctly splits most (all?) text in European languages on spaces (treating all whitespace characters matched by the regex \s as breaking), and also uses the English-language rule (probably used in other languages as well) of splitting on hyphens and other dash characters (Unicode category Pd) in the middle of a word. This means for a phrase like "UN Secretary General Ban-Ki Moon", if the width was 12, then the Strings appended to the List would be:
  
  "UN Secretary"
  "General Ban-"
  "Ki Moon"
  
  Spaces are not preserved if they were used to split something into two lines, but dashes are.
  
  Parameters:
  
  receiving - the List of String to append the word-wrapped lines to
  
  longText - a probably-large piece of text that needs to be split into multiple lines with a max width
  
  width - the max width to use for any line, removing trailing whitespace at the end of a line
  
  Returns:
  
  the given receiving parameter, after appending the lines from word-wrapping
- indexOf
  
  public static int indexOf(CharSequence text, regexodus.Pattern regex, int beginIndex)
- indexOf
  
  public static int indexOf(CharSequence text, String regex, int beginIndex)
- indexOf
  
  public static int indexOf(CharSequence text, regexodus.Pattern regex)
- indexOf
  
  public static int indexOf(CharSequence text, String regex)
- capitalize
  
  public static String capitalize(CharSequence original)
  
  Capitalizes Each Word In The Parameter original, Returning A New String.
  
  Parameters:
  
  original - a CharSequence, such as a StringBuilder or String, which could have CrAzY capitalization
  
  Returns:
  
  A String With Each Word Capitalized At The Start And The Rest In Lower Case
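
A rough sketch of title-casing each word follows. CapitalizeSketch is a hypothetical name, and treating whitespace as the only word boundary is an assumption; the library may draw word boundaries differently.

```java
final class CapitalizeSketch {
    /** Upper-cases the first char of each whitespace-separated word, lower-cases the rest. */
    static String capitalize(CharSequence original) {
        if (original == null) return "";
        StringBuilder sb = new StringBuilder(original.length());
        boolean startOfWord = true;
        for (int i = 0; i < original.length(); i++) {
            char c = original.charAt(i);
            if (Character.isWhitespace(c)) {
                sb.append(c);
                startOfWord = true; // assumption: only whitespace starts a new word
            } else {
                sb.append(startOfWord ? Character.toUpperCase(c) : Character.toLowerCase(c));
                startOfWord = false;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(capitalize("cRaZy cApS here")); // prints Crazy Caps Here
    }
}
```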
- sentenceCase
  
  public static String sentenceCase(CharSequence original)
  
  Attempts to scan for sentences in original, capitalizes the first letter of each sentence, and otherwise leaves the text unchanged, returning the result as a String. Sentences are detected with a crude heuristic of "does it have periods, exclamation marks, or question marks at the end, or does it reach the end of input? If yes, it's a sentence."
  
  Parameters:
  
  original - a CharSequence that is expected to contain sentence-like data that needs capitalization; existing upper-case letters will stay upper-case.
  
  Returns:
  
  a String where the first letter of each sentence (detected as best this can) is capitalized.
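
The heuristic above can be sketched as a single pass that capitalizes the first letter found after the start of input or after a period, exclamation mark, or question mark. SentenceCaseSketch is a hypothetical name, and this is an illustration of the described heuristic rather than the library's code.

```java
final class SentenceCaseSketch {
    /** Capitalizes the first letter of each detected sentence; other chars are unchanged. */
    static String sentenceCase(CharSequence original) {
        StringBuilder sb = new StringBuilder(original.length());
        boolean capitalizeNext = true; // the start of input begins a sentence
        for (int i = 0; i < original.length(); i++) {
            char c = original.charAt(i);
            if (capitalizeNext && Character.isLetter(c)) {
                sb.append(Character.toUpperCase(c));
                capitalizeNext = false;
            } else {
                sb.append(c);
                // Crude end-of-sentence test, as in the description above.
                if (c == '.' || c == '!' || c == '?') capitalizeNext = true;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints Hello there. How are you? Fine!
        System.out.println(sentenceCase("hello there. how are you? fine!"));
    }
}
```

Because the test is purely punctuation-based, a sketch like this also capitalizes after abbreviations and decimal points, which is the cost of the crude heuristic.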
- correctABeforeVowel
  
  public static String correctABeforeVowel(CharSequence text)
  
  A simple method that looks for any occurrences of the word 'a' followed by some non-zero amount of whitespace and then any vowel starting the following word (such as 'a item'), then replaces each such improper 'a' with 'an' (such as 'an item'). The regex used here isn't bulletproof, but it should be fairly robust, handling when you have multiple whitespace chars, different whitespace chars (like carriage return and newline), accented vowels in the following word (but not in the initial 'a', which is expected to use English spelling rules), and the case of the initial 'a' or 'A'. This also changes improper uses of "an" back to "a", such as by changing "an dog" to "a dog", or "an malevolent force" to "a malevolent force".
  Gotta love RegExodus; this is a two-liner that uses features specific to that regular expression library. This only matches text in the Latin script because a/an is a feature of English, and doesn't have a direct equivalent I know of in the Greek or Cyrillic scripts. There could easily be one! I just couldn't verify it.
  
  Parameters:
  
  text - the (probably generated English) multi-word text to search for 'a'/'an' in and possibly replace
  
  Returns:
  
  a new String with every improper 'a' and 'an' replaced
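
The same idea can be approximated with java.util.regex instead of RegExodus. This is a simplified sketch and not the library's regex: ArticleSketch and its patterns are hypothetical, and only unaccented ASCII vowels and consonants are handled. Being spelling-based, the sketch would also turn "an hour" into "a hour".

```java
import java.util.regex.Pattern;

final class ArticleSketch {
    // The word 'a' or 'A', then whitespace, then a word starting with an ASCII vowel.
    private static final Pattern A_BEFORE_VOWEL =
            Pattern.compile("\\b([aA])(\\s+)(?=[aeiouAEIOU])");
    // The word 'an' in any case, then whitespace, then a word starting with an ASCII consonant.
    private static final Pattern AN_BEFORE_CONSONANT =
            Pattern.compile("\\b([aA])[nN](\\s+)(?=[b-df-hj-np-tv-zB-DF-HJ-NP-TV-Z])");

    static String correct(CharSequence text) {
        // First pass: "a item" becomes "an item"; the original whitespace is kept.
        String fixed = A_BEFORE_VOWEL.matcher(text).replaceAll("$1n$2");
        // Second pass: "an dog" becomes "a dog".
        return AN_BEFORE_CONSONANT.matcher(fixed).replaceAll("$1$2");
    }

    public static void main(String[] args) {
        // prints An owl saw a malevolent force
        System.out.println(correct("A owl saw an malevolent force"));
    }
}
```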
- decompressCategory
  
  public static com.github.tommyettinger.ds.CharBitSetFixedSize decompressCategory(regexodus.Category category)
  
  Takes the compressed bitset inside a RegExodus Category and decompresses it to a jdkgdxds CharBitSetFixedSize. This may improve lookup time for frequently-checked Categories, since a lookup in the decompressed bit set is quite fast (it runs in O(1) time), while Category.contains(char) is not as fast (it runs in O(n) time, where n is the RLE-compressed size of the entire bitset). The decompressed bit set can also be modified if needed, whereas a Category cannot.
  
  Parameters:
  
  category - a RegExodus Category, such as Category.Lu for upper-case letters
  
  Returns:
  
  a new CharBitSetFixedSize storing the same contents as the given Category, but optimized for faster access
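
The trade-off between a run-length-encoded set and a decompressed one can be illustrated with the JDK's java.util.BitSet. Everything here is hypothetical: DecompressSketch, the (start, endExclusive) run layout, and the use of BitSet are stand-ins chosen for illustration, and they do not reflect how RegExodus or jdkgdxds store their data.

```java
import java.util.BitSet;

final class DecompressSketch {
    /** O(n) lookup over hypothetical sorted (start, endExclusive) run pairs. */
    static boolean containsCompressed(int[] runs, char c) {
        for (int i = 0; i < runs.length; i += 2) {
            if (c < runs[i]) return false;    // runs are sorted, so c cannot be in a later one
            if (c < runs[i + 1]) return true; // c falls inside this run
        }
        return false;
    }

    /** Expands the runs into one bit per char so that lookups become O(1). */
    static BitSet decompress(int[] runs) {
        BitSet bits = new BitSet(0x10000);
        for (int i = 0; i < runs.length; i += 2) {
            bits.set(runs[i], runs[i + 1]); // sets bits from start (inclusive) to end (exclusive)
        }
        return bits;
    }

    public static void main(String[] args) {
        int[] englishLetters = {'A', 'Z' + 1, 'a', 'z' + 1};
        BitSet bits = decompress(englishLetters);
        System.out.println(bits.get('Q'));                           // prints true
        System.out.println(containsCompressed(englishLetters, '5')); // prints false
    }
}
```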
