Class StringTools
java.lang.Object
com.github.yellowstonegames.core.StringTools
Various utility functions for handling readable natural-language text. This has tools to wrap long
CharSequences to fit in a maximum width, and generally tidy up generated text. This last step includes padding left
and right (including a "strict" option that truncates Strings that are longer than the padded size), Capitalizing
Each Word, Capitalizing the first word in a sentence, replacing "a improper usage of a" with "an improved
replacement using an," etc. This also has a lot of predefined categories of chars that are considered widely enough
supported by fonts, like COMMON_PUNCTUATION and LATIN_LETTERS_UPPER.
-
Field Summary
Fields
Modifier and Type, Field, Description
static final com.github.tommyettinger.ds.CharBitSetFixedSize ALL_UNICODE_LETTER_SET
    An OffsetBitSet containing every letter char in the Unicode BMP as an index.
static final com.github.tommyettinger.ds.CharBitSetFixedSize ALL_UNICODE_LOWERCASE_LETTER_SET
    An OffsetBitSet containing every lower-case letter char in the Unicode BMP as an index.
static final com.github.tommyettinger.ds.CharBitSetFixedSize ALL_UNICODE_UPPERCASE_LETTER_SET
    An OffsetBitSet containing every upper-case letter char in the Unicode BMP as an index.
static final String BOX_DRAWING
static final String BOX_DRAWING_DOUBLE
static final String BOX_DRAWING_SINGLE
static final String COMMON_PUNCTUATION
static final String CURRENCY
static final String CYRILLIC_LETTERS
static final String CYRILLIC_LETTERS_LOWER
static final String CYRILLIC_LETTERS_UPPER
static final String DIGITS
static final String ENGLISH_LETTERS
static final String ENGLISH_LETTERS_LOWER
static final String ENGLISH_LETTERS_UPPER
static final String GREEK_LETTERS
static final String GREEK_LETTERS_LOWER
    Includes both lower-case forms for Sigma, 'ς' and 'σ', but this matches the two upper-case Sigma in GREEK_LETTERS_UPPER.
static final String GREEK_LETTERS_UPPER
    Includes the letter Sigma, 'Σ', twice because it has two lower-case forms in GREEK_LETTERS_LOWER.
static final String GROUPING_SIGNS_CLOSE
    An index in GROUPING_SIGNS_OPEN can be used here to find the closing char for that opening one.
static final String GROUPING_SIGNS_OPEN
    Can be used to match an index with one in GROUPING_SIGNS_CLOSE to find the closing char (this way only).
static final String LATIN_EXTENDED_LETTERS
static final String LATIN_EXTENDED_LETTERS_LOWER
static final String LATIN_EXTENDED_LETTERS_UPPER
static final String LATIN_LETTERS
static final String LATIN_LETTERS_LOWER
static final String LATIN_LETTERS_UPPER
static final String LETTERS
static final String LETTERS_AND_NUMBERS
static final String LETTERS_LOWER
static final String LETTERS_UPPER
static final String MARKS
static final String MODERN_PUNCTUATION
static final regexodus.Pattern nonSpacePattern
static final String PERMISSIBLE_CHARS
    A constant containing only chars that are reasonably likely to be supported by broad fonts and thus display-able.
static final String PUNCTUATION
static final String SPACING
static final String TECHNICAL_PUNCTUATION
static final String UNCOMMON_PUNCTUATION
static final String VISUAL_SYMBOLS
static final regexodus.Pattern whitespacePattern
-
Method Summary
Modifier and Type, Method, Description
static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, boolean... elements)
static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, byte... elements)
static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, char... elements)
static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, double... elements)
static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, float... elements)
static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, int... elements)
static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, long... elements)
static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, short... elements)
static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, CharSequence... elements)
static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, Iterable<?> elements)
    Deprecated.
static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, Object[] elements)
    Deprecated.
static StringBuilder appendJoinedArrays(StringBuilder sb, CharSequence delimiter, char[]... elements)
static StringBuilder appendJoinedDense(StringBuilder sb, boolean... elements)
    Deprecated.
static StringBuilder appendJoinedDense(StringBuilder sb, char t, char f, boolean... elements)
    Deprecated.
static StringBuilder appendJoinedReadably(StringBuilder sb, CharSequence delimiter, long... elements)
    Like Base.appendJoined(CharSequence, CharSequence, long[]), but this appends an 'L' to each number so they can be read in by Java.
static String capitalize(CharSequence original)
    Capitalizes Each Word In The Parameter original, Returning A New String.
static boolean contains(CharSequence text, char[] search)
    Deprecated.
static boolean contains(CharSequence text, CharSequence search)
    Deprecated.
static int containsPart(CharSequence text, char[] search)
    Deprecated.
static int containsPart(CharSequence text, char[] search, CharSequence prefix, CharSequence suffix)
    Tries to find as much of the sequence prefix search suffix as it can in text, where prefix and suffix are CharSequences for some reason and search is a char array.
static int containsPart(CharSequence text, CharSequence search)
    Deprecated.
static String
    A simple method that looks for any occurrences of the word 'a' followed by some non-zero amount of whitespace and then any vowel starting the following word (such as 'a item'), then replaces each such improper 'a' with 'an' (such as 'an item').
static int count(source, search)
    Scans repeatedly in source for the codepoint search (which is usually a char literal), not scanning the same section twice, and returns the number of instances of search that were found, or 0 if source is null.
static int count(source, search, startIndex, endIndex)
    Scans repeatedly in source (only using the area from startIndex, inclusive, to endIndex, exclusive) for the codepoint search (which is usually a char literal), not scanning the same section twice, and returns the number of instances of search that were found, or 0 if source is null or if the searched area is empty.
static int count(source, search)
    Scans repeatedly in source for the String search, not scanning the same char twice except as part of a larger String, and returns the number of instances of search that were found, or 0 if source is null or if search is null or empty.
static int count(source, search, startIndex, endIndex)
    Scans repeatedly in source (only using the area from startIndex, inclusive, to endIndex, exclusive) for the String search, not scanning the same char twice except as part of a larger String, and returns the number of instances of search that were found, or 0 if source or search is null or if the searched area is empty.
static com.github.tommyettinger.ds.CharBitSetFixedSize decompressCategory(regexodus.Category category)
    Takes the compressed bitset inside a RegExodus Category and decompresses it to a jdkgdxds OffsetBitSet.
static int indexOf(CharSequence text, String regex)
static int indexOf(CharSequence text, String regex, int beginIndex)
static int indexOf(CharSequence text, regexodus.Pattern regex)
static int indexOf(CharSequence text, regexodus.Pattern regex, int beginIndex)
static String join(CharSequence delimiter, boolean... elements)
static String join(CharSequence delimiter, byte... elements)
static String join(CharSequence delimiter, char... elements)
static String join(CharSequence delimiter, double... elements)
static String join(CharSequence delimiter, float... elements)
static String join(CharSequence delimiter, int... elements)
static String join(CharSequence delimiter, long... elements)
static String join(CharSequence delimiter, short... elements)
static String join(CharSequence delimiter, CharSequence... elements)
    Deprecated.
static String join(CharSequence delimiter, Iterable<?> elements)
    Deprecated.
static String join(CharSequence delimiter, Object[] elements)
    Deprecated.
static String joinArrays(CharSequence delimiter, char[]... elements)
static String joinDense(boolean... elements)
    Deprecated.
static String joinDense(char t, char f, boolean... elements)
    Deprecated.
static String joinReadably(CharSequence delimiter, long... elements)
    Like Base.join(CharSequence, long[]), but this appends an 'L' to each number, so they can be read in by Java.
static String padLeft(text, padChar, minimumLength)
    If text is shorter than the given minimumLength, returns a String with text padded on the left with padChar until it reaches that length; otherwise it simply returns text.
static String padLeft(text, minimumLength)
    If text is shorter than the given minimumLength, returns a String with text padded on the left with spaces until it reaches that length; otherwise it simply returns text.
static String padLeftStrict(String text, char padChar, int totalLength)
    Constructs a String with exactly the given totalLength by taking text (or a substring of it) and padding it on its left side with padChar until totalLength is reached.
static String padLeftStrict(String text, int totalLength)
    Constructs a String with exactly the given totalLength by taking text (or a substring of it) and padding it on its left side with spaces until totalLength is reached.
static String padRight(text, padChar, minimumLength)
    If text is shorter than the given minimumLength, returns a String with text padded on the right with padChar until it reaches that length; otherwise it simply returns text.
static String padRight(text, minimumLength)
    If text is shorter than the given minimumLength, returns a String with text padded on the right with spaces until it reaches that length; otherwise it simply returns text.
static String padRightStrict(String text, char padChar, int totalLength)
    Constructs a String with exactly the given totalLength by taking text (or a substring of it) and padding it on its right side with padChar until totalLength is reached.
static String padRightStrict(String text, int totalLength)
    Constructs a String with exactly the given totalLength by taking text (or a substring of it) and padding it on its right side with spaces until totalLength is reached.
static String replace(CharSequence text, CharSequence before, CharSequence after)
    Deprecated.
static String safeSubstring(String source, int beginIndex, int endIndex)
    Like String.substring(int, int) but returns "" instead of throwing any sort of Exception.
static String sentenceCase(CharSequence original)
    Attempts to scan for sentences in original, capitalizes the first letter of each sentence, and otherwise leaves the CharSequence untouched as it returns it as a String.
static String[] split(source, delimiter)
    Like String.split(String) but doesn't use any regex for splitting (the delimiter is a literal String).
wrap(CharSequence longText, int width)
    Word-wraps the given String (or other CharSequence, such as a StringBuilder) so it is split into zero or more Strings as lines of text, with the given width as the maximum width for a line.
wrap(List<String> receiving, CharSequence longText, int width)
    Word-wraps the given String (or other CharSequence, such as a StringBuilder) so it is split into zero or more Strings as lines of text, with the given width as the maximum width for a line; appends the word-wrapped lines to the given List of Strings and does not create a new List.
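The 'a' to 'an' correction described in the method summary can be illustrated with a small standalone sketch. This is not the library's code: the class name ArticleSketch and the method name correctArticles are hypothetical, and it uses java.util.regex rather than RegExodus.

```java
import java.util.regex.Pattern;

// Hypothetical sketch; names here are illustrative, not part of StringTools.
final class ArticleSketch {
    // A standalone "a" or "A", then whitespace, then a word starting with a vowel letter.
    private static final Pattern A_BEFORE_VOWEL =
            Pattern.compile("\\b([Aa])(\\s+)(?=[AEIOUaeiou])");

    /** Replaces each improper "a" with "an", keeping the original case and whitespace. */
    static String correctArticles(CharSequence text) {
        return A_BEFORE_VOWEL.matcher(text).replaceAll("$1n$2");
    }

    public static void main(String[] args) {
        System.out.println(correctArticles("a item and a orb")); // an item and an orb
    }
}
```

Like the documented method, this looks only at the first letter of the next word, so it is a spelling heuristic rather than a pronunciation rule.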
-
Field Details
-
whitespacePattern
public static final regexodus.Pattern whitespacePattern
-
nonSpacePattern
public static final regexodus.Pattern nonSpacePattern
-
PERMISSIBLE_CHARS
A constant containing only chars that are reasonably likely to be supported by broad fonts and thus displayable. This assumes the font supports the Latin, Greek, and Cyrillic alphabets, with good support for extended Latin (at least for European languages), but the font is not required to be complete enough to support the very large Vietnamese set of extensions to Latin, nor any International Phonetic Alphabet (IPA) chars. It also assumes box drawing characters are supported, along with a handful of common dingbats, such as male and female signs. It does not include the tab, newline, or carriage return characters, since these don't usually make sense on a grid of chars.
-
BOX_DRAWING_SINGLE
-
BOX_DRAWING_DOUBLE
-
BOX_DRAWING
-
VISUAL_SYMBOLS
-
DIGITS
-
MARKS
-
GROUPING_SIGNS_OPEN
Can be used to match an index with one in GROUPING_SIGNS_CLOSE to find the closing char (this way only).
-
GROUPING_SIGNS_CLOSE
An index in GROUPING_SIGNS_OPEN can be used here to find the closing char for that opening one.
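The index pairing between the two constants can be sketched as follows. The contents of the real constants are not shown on this page, so the two short strings below are stand-ins, and closingFor is a hypothetical helper.

```java
// Hypothetical sketch of the open-to-close index lookup.
final class GroupingSketch {
    // Stand-in values; the real GROUPING_SIGNS_OPEN and GROUPING_SIGNS_CLOSE may differ.
    static final String OPEN = "([{<";
    static final String CLOSE = ")]}>";

    /** Returns the closing char paired with open, or open itself if it is not an opening sign. */
    static char closingFor(char open) {
        int idx = OPEN.indexOf(open);
        return idx < 0 ? open : CLOSE.charAt(idx);
    }

    public static void main(String[] args) {
        System.out.println(closingFor('[')); // ]
    }
}
```

The lookup runs from the opening string to the closing string, which is the direction the field descriptions say is supported.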
-
COMMON_PUNCTUATION
-
MODERN_PUNCTUATION
-
UNCOMMON_PUNCTUATION
-
TECHNICAL_PUNCTUATION
-
PUNCTUATION
-
CURRENCY
-
SPACING
-
ENGLISH_LETTERS_UPPER
-
ENGLISH_LETTERS_LOWER
-
ENGLISH_LETTERS
-
LATIN_EXTENDED_LETTERS_UPPER
-
LATIN_EXTENDED_LETTERS_LOWER
-
LATIN_EXTENDED_LETTERS
-
LATIN_LETTERS_UPPER
-
LATIN_LETTERS_LOWER
-
LATIN_LETTERS
-
GREEK_LETTERS_UPPER
Includes the letter Sigma, 'Σ', twice because it has two lower-case forms in GREEK_LETTERS_LOWER. This lets you use one index for both lower and upper case, like with Latin and Cyrillic.
-
GREEK_LETTERS_LOWER
Includes both lower-case forms for Sigma, 'ς' and 'σ', but this matches the two upper-case Sigma in GREEK_LETTERS_UPPER. This lets you use one index for both lower and upper case, like with Latin and Cyrillic.
-
GREEK_LETTERS
-
CYRILLIC_LETTERS_UPPER
-
CYRILLIC_LETTERS_LOWER
-
CYRILLIC_LETTERS
-
LETTERS_UPPER
-
LETTERS_LOWER
-
LETTERS
-
LETTERS_AND_NUMBERS
-
ALL_UNICODE_LETTER_SET
public static final com.github.tommyettinger.ds.CharBitSetFixedSize ALL_UNICODE_LETTER_SET
An OffsetBitSet containing every letter char in the Unicode BMP as an index. You can check if a char c is in this set with ALL_UNICODE_LETTER_SET.contains(c).
-
ALL_UNICODE_UPPERCASE_LETTER_SET
public static final com.github.tommyettinger.ds.CharBitSetFixedSize ALL_UNICODE_UPPERCASE_LETTER_SET
An OffsetBitSet containing every upper-case letter char in the Unicode BMP as an index. You can check if a char c is in this set with ALL_UNICODE_UPPERCASE_LETTER_SET.contains(c).
-
ALL_UNICODE_LOWERCASE_LETTER_SET
public static final com.github.tommyettinger.ds.CharBitSetFixedSize ALL_UNICODE_LOWERCASE_LETTER_SET
An OffsetBitSet containing every lower-case letter char in the Unicode BMP as an index. You can check if a char c is in this set with ALL_UNICODE_LOWERCASE_LETTER_SET.contains(c).
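To show what "every letter char in the BMP as an index" means, here is a rough stand-in built on java.util.BitSet and the JDK's own Character.isLetter. The library fills its sets from RegExodus category data instead, so membership may differ slightly depending on Unicode version; LetterSetSketch and buildLetterSet are hypothetical names.

```java
import java.util.BitSet;

// Hypothetical sketch: a BitSet indexed by char value, standing in for the library's set type.
final class LetterSetSketch {
    /** Sets one bit per BMP char that the running JDK classifies as a letter. */
    static BitSet buildLetterSet() {
        BitSet set = new BitSet(0x10000);
        for (int c = 0; c <= 0xFFFF; c++) {
            if (Character.isLetter((char) c)) set.set(c);
        }
        return set;
    }

    public static void main(String[] args) {
        BitSet letters = buildLetterSet();
        System.out.println(letters.get('q')); // true
        System.out.println(letters.get('7')); // false
    }
}
```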
-
-
Method Details
-
join
Deprecated. Use TextTools.join(CharSequence, Object[]) instead.
Parameters:
delimiter -
elements -
-
joinArrays
-
join
-
join
-
join
-
join
-
join
-
join
-
join
-
join
-
joinReadably
Like Base.join(CharSequence, long[]), but this appends an 'L' to each number, so they can be read in by Java. Replaced by Base.joinReadable(CharSequence, long[]) in most circumstances.
Parameters:
delimiter -
elements -
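As an illustration of the 'L' suffix, a minimal standalone version might look like the following. It is a sketch under the assumption that a null or empty array yields an empty String; ReadableJoinSketch is a hypothetical class and does not call Base.

```java
// Hypothetical sketch of joining longs as Java-readable literals.
final class ReadableJoinSketch {
    static String joinReadably(CharSequence delimiter, long... elements) {
        if (elements == null || elements.length == 0) return ""; // assumed behavior
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < elements.length; i++) {
            if (i > 0) sb.append(delimiter);
            sb.append(elements[i]).append('L'); // the suffix makes each number a valid long literal
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinReadably(", ", 1L, -2L, 3000000000L)); // 1L, -2L, 3000000000L
    }
}
```

The output can be pasted between braces to form a long[] initializer in Java source.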
-
appendJoinedReadably
public static StringBuilder appendJoinedReadably(StringBuilder sb, CharSequence delimiter, long... elements)
Like Base.appendJoined(CharSequence, CharSequence, long[]), but this appends an 'L' to each number so they can be read in by Java. Replaced by Base.appendJoinedReadable(CharSequence, CharSequence, long[]).
Parameters:
sb -
delimiter -
elements -
-
appendJoined
public static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, CharSequence... elements)
-
appendJoinedArrays
public static StringBuilder appendJoinedArrays(StringBuilder sb, CharSequence delimiter, char[]... elements)
-
appendJoined
public static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, long... elements)
-
appendJoined
public static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, double... elements)
-
appendJoined
-
appendJoined
public static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, float... elements)
-
appendJoined
public static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, short... elements)
-
appendJoined
public static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, char... elements)
-
appendJoined
public static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, byte... elements)
-
appendJoined
public static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, boolean... elements)
-
joinDense
Deprecated. Joins the boolean array elements without delimiters into a String, using "1" for true and "0" for false. This is "dense" because it doesn't have any delimiters between elements. Using TextTools.joinDense(boolean...) is recommended instead.
Parameters:
elements - an array or vararg of booleans
Returns:
a String using 1 for true elements and 0 for false, or the empty string if elements is null or empty
-
joinDense
Deprecated. Joins the boolean array elements without delimiters into a String, using the char t for true and the char f for false. This is "dense" because it doesn't have any delimiters between elements. Using TextTools.joinDense(char, char, boolean...) is recommended instead.
Parameters:
t - the char to write for true values
f - the char to write for false values
elements - an array or vararg of booleans
Returns:
a String using t for true elements and f for false, or the empty string if elements is null or empty
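The dense join described above is simple enough to sketch directly. This is an illustrative re-implementation, not the library's source; DenseJoinSketch is a hypothetical class name.

```java
// Hypothetical sketch of a delimiter-free boolean join.
final class DenseJoinSketch {
    static String joinDense(char t, char f, boolean... elements) {
        if (elements == null || elements.length == 0) return "";
        StringBuilder sb = new StringBuilder(elements.length);
        for (boolean b : elements) {
            sb.append(b ? t : f); // exactly one char per element, no delimiter
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinDense('1', '0', true, false, true)); // 101
    }
}
```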
-
appendJoinedDense
Deprecated. Joins the boolean array elements without delimiters into a StringBuilder, using "1" for true and "0" for false. This is "dense" because it doesn't have any delimiters between elements. Using TextTools.appendJoinedDense(CharSequence, boolean...) is recommended instead.
Parameters:
sb - a StringBuilder that will be modified in-place
elements - an array or vararg of booleans
Returns:
sb after modifications (if elements was non-null)
-
appendJoinedDense
@Deprecated public static StringBuilder appendJoinedDense(StringBuilder sb, char t, char f, boolean... elements)
Deprecated. Joins the boolean array elements without delimiters into a StringBuilder, using the char t for true and the char f for false. This is "dense" because it doesn't have any delimiters between elements. Using TextTools.appendJoinedDense(CharSequence, char, char, boolean...) is recommended instead.
Parameters:
sb - a StringBuilder that will be modified in-place
t - the char to write for true values
f - the char to write for false values
elements - an array or vararg of booleans
Returns:
sb after modifications (if elements was non-null)
-
join
Deprecated. Joins the items in elements by calling their toString method on them (or just using the String "null" for null items), and separating each item with delimiter. Unlike other join methods in this class, this does not take a vararg of Object items, since that would cause confusion with the overloads that take one object; it takes a non-vararg Object array instead. Using TextTools.join(CharSequence, Object[]) is recommended instead.
Parameters:
delimiter - the String or other CharSequence to separate items in elements with; if null, uses ""
elements - the Object items to stringify and join into one String; if the array is null or empty, this returns an empty String, and if items are null, they are shown as "null"
Returns:
the String representations of the items in elements, separated by delimiter and put in one String
-
join
Deprecated. Joins the items in elements by calling their toString method on them (or just using the String "null" for null items), and separating each item with delimiter. This can take any Iterable of any type for its elements parameter. Using TextTools.join(CharSequence, Iterable) is recommended instead.
Parameters:
delimiter - the String or other CharSequence to separate items in elements with; if null, uses ""
elements - the Object items to stringify and join into one String; if Iterable is null or empty, this returns an empty String, and if items are null, they are shown as "null"
Returns:
the String representations of the items in elements, separated by delimiter and put in one String
-
appendJoined
@Deprecated public static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, Object[] elements)
Deprecated. Joins the items in elements by calling their toString method on them (or just using the String "null" for null items), and separating each item with delimiter. Unlike other join methods in this class, this does not take a vararg of Object items, since that would cause confusion with the overloads that take one object; it takes a non-vararg Object array instead. Using TextTools.appendJoined(CharSequence, CharSequence, Object[]) is recommended instead.
Parameters:
sb - a StringBuilder that will be modified in-place
delimiter - the String or other CharSequence to separate items in elements with; if null, uses ""
elements - the Object items to stringify and join into one String; if the array is null or empty, this returns an empty String, and if items are null, they are shown as "null"
Returns:
sb after modifications (if elements was non-null)
-
appendJoined
@Deprecated public static StringBuilder appendJoined(StringBuilder sb, CharSequence delimiter, Iterable<?> elements)
Deprecated. Joins the items in elements by calling their toString method on them (or just using the String "null" for null items), and separating each item with delimiter. This can take any Iterable of any type for its elements parameter. Using TextTools.appendJoined(CharSequence, CharSequence, Iterable) is recommended instead.
Parameters:
sb - a StringBuilder that will be modified in-place
delimiter - the String or other CharSequence to separate items in elements with; if null, uses ""
elements - the Object items to stringify and join into one String; if Iterable is null or empty, this returns an empty String, and if items are null, they are shown as "null"
Returns:
sb after modifications (if elements was non-null)
-
contains
Deprecated. Searches text for the exact contents of the char array search; returns true if text contains search. Use TextTools.contains(CharSequence, CharSequence) instead.
Parameters:
text - a CharSequence, such as a String or StringBuilder, that might contain search
search - a char array to try to find in text
Returns:
true if search was found
-
containsPart
Deprecated. Tries to find as much of the char array search in the CharSequence text, always starting from the beginning of search (if the beginning isn't found, then it finds nothing), and returns the length of the found part of search (0 if not found). Use TextTools.containsPart(CharSequence, CharSequence) instead.
Parameters:
text - a CharSequence to search in
search - a char array to look for
Returns:
the length of the searched-for char array that was found
-
contains
Deprecated. Searches text for the exact contents of the char array search; returns true if text contains search. Use TextTools.contains(CharSequence, char[]) instead.
Parameters:
text - a CharSequence, such as a String or StringBuilder, that might contain search
search - a char array to try to find in text
Returns:
true if search was found
-
containsPart
Deprecated. Tries to find as much of the char array search in the CharSequence text, always starting from the beginning of search (if the beginning isn't found, then it finds nothing), and returns the length of the found part of search (0 if not found). Use TextTools.containsPart(CharSequence, char[]) instead.
Parameters:
text - a CharSequence to search in
search - a char array to look for
Returns:
the length of the searched-for char array that was found
-
containsPart
public static int containsPart(CharSequence text, char[] search, CharSequence prefix, CharSequence suffix)
Tries to find as much of the sequence prefix search suffix as it can in text, where prefix and suffix are CharSequences for some reason and search is a char array. Returns the length of the sequence it was able to match, up to prefix.length() + search.length + suffix.length(), or 0 if no part of the looked-for sequence could be found.
This is almost certainly too specific to be useful outside a handful of cases, but it isn't marked as deprecated because it was removed from TextTools. If you need this for whatever reason, it is here.
Parameters:
text - a CharSequence to search in
search - a char array to look for, surrounded by prefix and suffix
prefix - a mandatory prefix before search, separated for some weird optimization reason
suffix - a mandatory suffix after search, separated for some weird optimization reason
Returns:
the length of the searched-for prefix+search+suffix that was found
-
replace
@Deprecated public static String replace(CharSequence text, CharSequence before, CharSequence after)
Deprecated. Use TextTools.replace(CharSequence, CharSequence, CharSequence) instead.
Parameters:
text -
before -
after -
-
count
Scans repeatedly in source for the String search, not scanning the same char twice except as part of a larger String, and returns the number of instances of search that were found, or 0 if source is null or if search is null or empty.
Parameters:
source - a String to look through
search - a String to look for
Returns:
the number of times search was found in source
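One way to read the description is sketched below. It assumes that each new scan starts one char after the previous hit, so the same starting position is never tried twice but overlapping hits are still counted; that reading of "except as part of a larger String" is an assumption, and CountSketch is a hypothetical class rather than the library's source.

```java
// Hypothetical sketch of counting String occurrences.
final class CountSketch {
    static int count(String source, String search) {
        if (source == null || search == null || search.isEmpty()) return 0;
        int amount = 0;
        int idx = -1;
        // Assumption: restart one char past the last hit, so overlapping hits count.
        while ((idx = source.indexOf(search, idx + 1)) >= 0) {
            amount++;
        }
        return amount;
    }

    public static void main(String[] args) {
        System.out.println(count("banana", "an")); // 2
    }
}
```

Under the other reading (skipping past the whole match), only the restart offset would change, from idx + 1 to idx + search.length().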
-
count
Scans repeatedly in source for the codepoint search (which is usually a char literal), not scanning the same section twice, and returns the number of instances of search that were found, or 0 if source is null.
Parameters:
source - a String to look through
search - a codepoint or char to look for
Returns:
the number of times search was found in source
-
count
Scans repeatedly in source (only using the area from startIndex, inclusive, to endIndex, exclusive) for the String search, not scanning the same char twice except as part of a larger String, and returns the number of instances of search that were found, or 0 if source or search is null or if the searched area is empty. If endIndex is negative, this will search from startIndex until the end of the source.
Parameters:
source - a String to look through
search - a String to look for
startIndex - the first index to search through, inclusive
endIndex - the last index to search through, exclusive; if negative this will search the rest of source
Returns:
the number of times search was found in source
-
count
Scans repeatedly in source (only using the area from startIndex, inclusive, to endIndex, exclusive) for the codepoint search (which is usually a char literal), not scanning the same section twice, and returns the number of instances of search that were found, or 0 if source is null or if the searched area is empty. If endIndex is negative, this will search from startIndex until the end of the source.
Parameters:
source - a String to look through
search - a codepoint or char to look for
startIndex - the first index to search through, inclusive
endIndex - the last index to search through, exclusive; if negative this will search the rest of source
Returns:
the number of times search was found in source
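The ranged codepoint count, including the negative endIndex convention, could be sketched like this. Clamping a negative startIndex to 0 and an oversized endIndex to the length of source are assumptions of this sketch, and RangeCountSketch is a hypothetical class.

```java
// Hypothetical sketch of counting a codepoint inside [startIndex, endIndex).
final class RangeCountSketch {
    static int count(String source, int search, int startIndex, int endIndex) {
        if (source == null || source.isEmpty()) return 0;
        // A negative endIndex means "until the end of source"; too-large values are clamped (assumed).
        if (endIndex < 0 || endIndex > source.length()) endIndex = source.length();
        if (startIndex < 0) startIndex = 0; // assumed
        int amount = 0;
        int idx = source.indexOf(search, startIndex);
        while (idx >= 0 && idx < endIndex) {
            amount++;
            // Skip past the whole codepoint, which may be two chars long.
            idx = source.indexOf(search, idx + Character.charCount(search));
        }
        return amount;
    }

    public static void main(String[] args) {
        System.out.println(count("a,b,c,d", ',', 2, -1)); // 2
    }
}
```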
-
safeSubstring
Like String.substring(int, int) but returns "" instead of throwing any sort of Exception. This delegates to TextTools.safeSubstring(String, int, int).
Parameters:
source - the String to get a substring from
beginIndex - the first index, inclusive; will be treated as 0 if negative
endIndex - the index after the last character (exclusive); if negative this will be source.length()
Returns:
the substring of source between beginIndex and endIndex, or "" if any parameters are null/invalid
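The parameter rules above translate into a short guard sequence. The following is a sketch, not the delegated TextTools code; treating an endIndex past the end of source the same as a negative one is an assumption, and SafeSubstringSketch is a hypothetical class.

```java
// Hypothetical sketch of a substring that never throws.
final class SafeSubstringSketch {
    static String safeSubstring(String source, int beginIndex, int endIndex) {
        if (source == null || source.isEmpty()) return "";
        if (beginIndex < 0) beginIndex = 0;
        // Negative means "to the end"; clamping too-large values is an assumption.
        if (endIndex < 0 || endIndex > source.length()) endIndex = source.length();
        if (beginIndex >= endIndex) return "";
        return source.substring(beginIndex, endIndex);
    }

    public static void main(String[] args) {
        System.out.println(safeSubstring("hello", 2, -1)); // llo
    }
}
```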
-
split
Like String.split(String) but doesn't use any regex for splitting (the delimiter is a literal String).
Parameters:
source - the String to get split-up substrings from
delimiter - the literal String to split on (not a regex); will not be included in the returned String array
Returns:
a String array consisting of at least one String (the entirety of source if nothing was split)
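The difference from String.split shows up with delimiters that are regex metacharacters, such as ".". A minimal literal split might look like the sketch below; it assumes a non-null source and keeps empty pieces, neither of which is confirmed by this page, and SplitSketch is a hypothetical class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of splitting on a literal delimiter (assumes source is non-null).
final class SplitSketch {
    static String[] split(String source, String delimiter) {
        if (delimiter == null || delimiter.isEmpty()) return new String[] { source };
        List<String> parts = new ArrayList<>();
        int start = 0;
        int idx;
        while ((idx = source.indexOf(delimiter, start)) >= 0) {
            parts.add(source.substring(start, idx)); // the delimiter itself is dropped
            start = idx + delimiter.length();
        }
        parts.add(source.substring(start)); // the remainder, or all of source if nothing matched
        return parts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(String.join("|", split("a.b.c", "."))); // a|b|c
    }
}
```

By contrast, "a.b.c".split(".") treats the dot as "any char" and returns an empty array.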
-
padRight
If text is shorter than the given minimumLength, returns a String with text padded on the right with spaces until it reaches that length; otherwise it simply returns text.
Parameters:
text - the text to pad if necessary
minimumLength - the minimum length of String to return
Returns:
text, potentially padded with spaces to reach the given minimum length
-
padRight
If text is shorter than the given minimumLength, returns a String with text padded on the right with padChar until it reaches that length; otherwise it simply returns text.
Parameters:
text - the text to pad if necessary
padChar - the char to use to pad text, if necessary
minimumLength - the minimum length of String to return
Returns:
text, potentially padded with padChar to reach the given minimum length
-
padRightStrict
Constructs a String with exactly the given totalLength by taking text (or a substring of it) and padding it on its right side with spaces until totalLength is reached. If text is longer than totalLength, this only uses the portion of text needed to fill totalLength, and no more.
Parameters:
text - the String to pad if necessary, or truncate if too long
totalLength - the exact length of String to return
Returns:
a String with exactly totalLength for its length, made from text and possibly extra spaces
-
padRightStrict
Constructs a String with exactly the given totalLength by taking text (or a substring of it) and padding it on its right side with padChar until totalLength is reached. If text is longer than totalLength, this only uses the portion of text needed to fill totalLength, and no more.
Parameters:
text - the String to pad if necessary, or truncate if too long
padChar - the char to use to fill any remaining length
totalLength - the exact length of String to return
Returns:
a String with exactly totalLength for its length, made from text and possibly padChar
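The contrast between the lenient and strict variants is easiest to see side by side. The sketch below assumes a non-null text and assumes the strict variant keeps the leading chars when it truncates; PadRightSketch is a hypothetical class, not the library's source.

```java
// Hypothetical sketch of lenient vs. strict right-padding (assumes text is non-null).
final class PadRightSketch {
    /** Pads only if text is too short; longer text is returned unchanged. */
    static String padRight(String text, char padChar, int minimumLength) {
        if (text.length() >= minimumLength) return text;
        StringBuilder sb = new StringBuilder(minimumLength).append(text);
        while (sb.length() < minimumLength) sb.append(padChar);
        return sb.toString();
    }

    /** Always returns exactly totalLength chars, truncating text when it is too long. */
    static String padRightStrict(String text, char padChar, int totalLength) {
        if (totalLength <= 0) return "";
        if (text.length() >= totalLength) return text.substring(0, totalLength); // keeps leading chars (assumed)
        return padRight(text, padChar, totalLength);
    }

    public static void main(String[] args) {
        System.out.println(padRight("abcdef", '.', 3));       // abcdef
        System.out.println(padRightStrict("abcdef", '.', 3)); // abc
    }
}
```

The strict form is the one to use for fixed-width columns on a grid, since its result length never varies.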
-
padLeft
If text is shorter than the given minimumLength, returns a String with text padded on the left with spaces until it reaches that length; otherwise it simply returns text.
Parameters:
text - the text to pad if necessary
minimumLength - the minimum length of String to return
Returns:
text, potentially padded with spaces to reach the given minimum length
-
padLeft
If text is shorter than the given minimumLength, returns a String with text padded on the left with padChar until it reaches that length; otherwise it simply returns text.
Parameters:
text - the text to pad if necessary
padChar - the char to use to pad text, if necessary
minimumLength - the minimum length of String to return
Returns:
text, potentially padded with padChar to reach the given minimum length
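Left-padding is commonly used to right-align numbers. A minimal sketch of the behavior described above follows; it assumes a non-null text, and PadLeftSketch is a hypothetical class.

```java
// Hypothetical sketch of left-padding (assumes text is non-null).
final class PadLeftSketch {
    static String padLeft(String text, char padChar, int minimumLength) {
        int missing = minimumLength - text.length();
        if (missing <= 0) return text; // already long enough, returned unchanged
        StringBuilder sb = new StringBuilder(minimumLength);
        for (int i = 0; i < missing; i++) sb.append(padChar);
        return sb.append(text).toString();
    }

    public static void main(String[] args) {
        System.out.println(padLeft("42", '0', 5)); // 00042
    }
}
```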
-
padLeftStrict
Constructs a String with exactly the given totalLength by taking text (or a substring of it) and padding it on its left side with spaces until totalLength is reached. If text is longer than totalLength, this only uses the portion of text needed to fill totalLength, and no more.
Parameters:
text - the String to pad if necessary, or truncate if too long
totalLength - the exact length of String to return
Returns:
a String with exactly totalLength for its length, made from text and possibly extra spaces
-
padLeftStrict
Constructs a String with exactly the given totalLength by taking text (or a substring of it) and padding it on its left side with padChar until totalLength is reached. If text is longer than totalLength, this only uses the portion of text needed to fill totalLength, and no more.
Parameters:
text - the String to pad if necessary, or truncate if too long
padChar - the char to use to fill any remaining length
totalLength - the exact length of String to return
Returns:
a String with exactly totalLength for its length, made from text and possibly padChar
-
wrap
Word-wraps the given String (or other CharSequence, such as a StringBuilder) so it is split into zero or more Strings as lines of text, with the given width as the maximum width for a line. This correctly splits most (all?) text in European languages on spaces (treating all whitespace characters matched by the regex '\\s' as breaking), and also uses the English-language rule (probably used in other languages as well) of splitting on hyphens and other dash characters (Unicode category Pd) in the middle of a word. This means for a phrase like "UN Secretary General Ban-Ki Moon", if the width was 12, then the Strings in the List returned would be
"UN Secretary" "General Ban-" "Ki Moon"
Spaces are not preserved if they were used to split something into two lines, but dashes are.
Parameters:
longText - a probably-large piece of text that needs to be split into multiple lines with a max width
width - the max width to use for any line, removing trailing whitespace at the end of a line
Returns:
a List of Strings for the lines after word-wrapping
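The breaking rules above can be approximated with a greedy line filler. This sketch is an independent illustration using java.util.regex whitespace splitting and Character.getType for the Pd check; it is not the library's algorithm, WrapSketch is a hypothetical class, and a single piece longer than width is simply left on its own overlong line.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of greedy word-wrapping that can also break after dashes.
final class WrapSketch {
    static List<String> wrap(CharSequence longText, int width) {
        List<String> lines = new ArrayList<>();
        if (longText == null || width <= 0) return lines;
        StringBuilder line = new StringBuilder();
        for (String word : longText.toString().trim().split("\\s+")) {
            if (word.isEmpty()) continue;
            int start = 0;
            boolean firstPiece = true;
            for (int i = 0; i < word.length(); i++) {
                boolean atEnd = i == word.length() - 1;
                // A dash (category Pd) strictly inside the word ends a piece.
                boolean dashInside = !atEnd && i > 0
                        && Character.getType(word.charAt(i)) == Character.DASH_PUNCTUATION;
                if (!atEnd && !dashInside) continue;
                String piece = word.substring(start, i + 1);
                // A space is only needed before the first piece of a new word.
                String sep = (firstPiece && line.length() > 0) ? " " : "";
                if (line.length() > 0 && line.length() + sep.length() + piece.length() > width) {
                    lines.add(line.toString()); // line is full; the breaking space is dropped
                    line.setLength(0);
                    sep = "";
                }
                line.append(sep).append(piece);
                start = i + 1;
                firstPiece = false;
            }
        }
        if (line.length() > 0) lines.add(line.toString());
        return lines;
    }

    public static void main(String[] args) {
        // prints [UN Secretary, General Ban-, Ki Moon]
        System.out.println(wrap("UN Secretary General Ban-Ki Moon", 12));
    }
}
```

With width 12 the sketch reproduces the example from the description: the dash stays at the end of "General Ban-", while the spaces used as break points disappear.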
-
wrap
Word-wraps the given String (or other CharSequence, such as a StringBuilder), splitting it into zero or more Strings as lines of text, with the given width as the maximum width for a line; appends the word-wrapped lines to the given List of Strings and does not create a new List. This correctly splits most, possibly all, text in European languages on spaces (treating every whitespace character matched by the regex '\\s' as a break point), and also follows the English-language rule (probably used in other languages as well) of splitting on hyphens and other dash characters (Unicode category Pd) in the middle of a word. For example, wrapping the phrase "UN Secretary General Ban-Ki Moon" with a width of 12 appends these Strings to the List
"UN Secretary" "General Ban-" "Ki Moon"
Spaces are not preserved if they were used to split something into two lines, but dashes are.
Parameters:
receiving - the List of String to append the word-wrapped lines to
longText - a probably-large piece of text that needs to be split into multiple lines with a max width
width - the max width to use for any line, removing trailing whitespace at the end of a line
Returns:
the given receiving parameter, after appending the lines from word-wrapping
-
indexOf
-
indexOf
-
indexOf
-
indexOf
-
capitalize
Capitalizes Each Word In The Parameter original, Returning A New String.
Parameters:
original - a CharSequence, such as a StringBuilder or String, which could have CrAzY capitalization
Returns:
A String With Each Word Capitalized At The Start And The Rest In Lower Case
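A minimal sketch of this kind of title-casing in plain Java follows. It is not the library's implementation; the class CapitalizeSketch is hypothetical, and it assumes that only whitespace separates words, which may differ from how the real method treats punctuation such as apostrophes or hyphens.

```java
final class CapitalizeSketch {
    // Hypothetical helper: upper-cases the first char of each whitespace-separated word
    // and lower-cases every other char.
    static String capitalize(CharSequence original) {
        StringBuilder sb = new StringBuilder(original.length());
        boolean startOfWord = true;
        for (int i = 0; i < original.length(); i++) {
            char c = original.charAt(i);
            if (Character.isWhitespace(c)) {
                startOfWord = true; // assumption: only whitespace ends a word
                sb.append(c);
            } else {
                sb.append(startOfWord ? Character.toUpperCase(c) : Character.toLowerCase(c));
                startOfWord = false;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(capitalize("cRaZy capitalization hERE")); // Crazy Capitalization Here
    }
}
```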
-
sentenceCase
Attempts to scan for sentences in original, capitalizes the first letter of each sentence, and returns the result as a String, leaving every other char unchanged. Sentences are detected with a crude heuristic of "does it have periods, exclamation marks, or question marks at the end, or does it reach the end of input? If yes, it's a sentence."
Parameters:
original - a CharSequence that is expected to contain sentence-like data that needs capitalization; existing upper-case letters will stay upper-case.
Returns:
a String where the first letter of each sentence (detected as best this can) is capitalized.
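The heuristic can be sketched in plain Java as a single pass that tracks whether the next letter starts a sentence. This is an illustration only: the class SentenceCaseSketch is hypothetical and the real method may detect sentence boundaries differently. Like any heuristic this crude, the sketch also capitalizes after abbreviations such as "e.g." because it treats every period as a sentence end.

```java
final class SentenceCaseSketch {
    // Hypothetical helper: capitalizes the first letter after the start of input
    // or after '.', '!' or '?', and copies every other char unchanged.
    static String sentenceCase(CharSequence original) {
        StringBuilder sb = new StringBuilder(original.length());
        boolean sentenceStart = true;
        for (int i = 0; i < original.length(); i++) {
            char c = original.charAt(i);
            if (sentenceStart && Character.isLetter(c)) {
                c = Character.toUpperCase(c);
                sentenceStart = false;
            } else if (c == '.' || c == '!' || c == '?') {
                sentenceStart = true; // the next letter begins a new sentence
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints: Hello there. How are YOU? Fine!
        System.out.println(sentenceCase("hello there. how are YOU? fine!"));
    }
}
```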
-
correctABeforeVowel
A simple method that looks for any occurrences of the word 'a' followed by some non-zero amount of whitespace and then any vowel starting the following word (such as 'a item'), then replaces each such improper 'a' with 'an' (such as 'an item'). The regex used here isn't bulletproof, but it should be fairly robust, handling when you have multiple whitespace chars, different whitespace chars (like carriage return and newline), accented vowels in the following word (but not in the initial 'a', which is expected to use English spelling rules), and the case of the initial 'a' or 'A'. This also changes improper uses of "an" back to "a", such as by changing "an dog" to "a dog", or "an malevolent force" to "a malevolent force".
Gotta love Regexodus; this is a two-liner that uses features specific to that regular expression library. This only matches text in the Latin script because a/an is a feature of English, and doesn't have a direct equivalent I know of in the Greek or Cyrillic scripts. There could easily be one! I just couldn't verify it.
Parameters:
text - the (probably generated English) multi-word text to search for 'a'/'an' in and possibly replace
Returns:
a new String with every improper 'a' and 'an' replaced
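The two replacements can be approximated with java.util.regex instead of RegExodus. The following is a sketch under stated assumptions, not the library's pattern: the class ArticleSketch and both patterns are hypothetical, and only unaccented ASCII vowels and consonants are handled here. Because the rule is purely letter-based, the sketch mishandles sound-based cases such as "a user" or "an hour".

```java
import java.util.regex.Pattern;

final class ArticleSketch {
    // 'a' or 'A' as a whole word, then whitespace, then a word starting with a vowel.
    private static final Pattern A_BEFORE_VOWEL =
            Pattern.compile("\\b([Aa])(\\s+)(?=[aeiouAEIOU])");
    // 'an' or 'An' as a whole word, then whitespace, then a word starting with a consonant.
    private static final Pattern AN_BEFORE_CONSONANT =
            Pattern.compile("\\b([Aa])n(\\s+)(?=[b-df-hj-np-tv-zB-DF-HJ-NP-TV-Z])");

    // Hypothetical helper: fixes "a item" to "an item" and "an dog" to "a dog".
    static String correctArticles(CharSequence text) {
        // Group 1 keeps the original case of the article; group 2 keeps the original whitespace.
        String fixedA = A_BEFORE_VOWEL.matcher(text).replaceAll("$1n$2");
        return AN_BEFORE_CONSONANT.matcher(fixedA).replaceAll("$1$2");
    }

    public static void main(String[] args) {
        System.out.println(correctArticles("a item and an dog")); // an item and a dog
        System.out.println(correctArticles("A apple"));           // An apple
    }
}
```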
-
decompressCategory
public static com.github.tommyettinger.ds.CharBitSetFixedSize decompressCategory(regexodus.Category category)
Takes the compressed bitset inside a RegExodus Category and decompresses it to a jdkgdxds OffsetBitSet. This may improve lookup time for frequently-checked Categories, since OffsetBitSet.contains(int) is quite fast (it runs in O(1) time), while Category.contains(char) is... not as fast (it runs in O(n) time, where n is the RLE-compressed size of the entire bitset). An OffsetBitSet can also be modified if needed, whereas a Category cannot.
Parameters:
category - a RegExodus Category, such as Category.Lu for upper-case letters
Returns:
a new OffsetBitSet storing the same contents as the given Category, but optimized for faster access
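The trade-off between a run-length-encoded set and a decompressed bitset can be illustrated with java.util.BitSet. Everything here is a hypothetical stand-in: the class RleDecompressSketch, the alternating absent/present run format, and the method names are assumptions made for the example and do not describe RegExodus's internal encoding or the jdkgdxds API.

```java
import java.util.BitSet;

final class RleDecompressSketch {
    // Assumed format: runs alternate "absent" and "present" lengths, starting with
    // an absent run at char code 0. Lookup walks the runs, so it is O(n) in runs.length.
    static boolean containsRle(int[] runs, char c) {
        int pos = 0;
        for (int i = 0; i < runs.length; i++) {
            pos += runs[i];
            if (c < pos) return (i & 1) == 1; // odd-indexed runs hold present chars
        }
        return false;
    }

    // Expands the runs once into a BitSet, after which each lookup is O(1).
    static BitSet decompress(int[] runs) {
        BitSet bits = new BitSet(65536);
        int pos = 0;
        for (int i = 0; i < runs.length; i++) {
            if ((i & 1) == 1) bits.set(pos, pos + runs[i]); // end index is exclusive
            pos += runs[i];
        }
        return bits;
    }

    public static void main(String[] args) {
        // 65 absent, 26 present (A-Z), 6 absent, 26 present (a-z)
        int[] asciiLetters = {65, 26, 6, 26};
        BitSet fast = decompress(asciiLetters);
        System.out.println(containsRle(asciiLetters, 'Q')); // true
        System.out.println(fast.get('q'));                  // true
        System.out.println(fast.get('['));                  // false
    }
}
```

The decompressed form costs more memory but answers each membership test with a single indexed read, and unlike the run list it can be edited in place with set and clear.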
-