struct
Pf::GraphemeSeln
- Pf::GraphemeSeln
- Struct
- Value
- Object
Overview
A string selection type whose indivisible unit is the grapheme.
See String::Grapheme from the standard library for more info on what
graphemes are.
A string selection is very much like a selection in a text editor. That is,
it delimits a chunk of text, within a larger (or equal) body of text. Notably,
a selection is never "detached" from the body of text. In terms of lifetimes,
in order for a selection to be alive, the text must be alive. This is why
selections must not be used as a replacement for Strings. Just because
selections are cheap doesn't mean you can use them anywhere. In your mind's
eye, you must always picture selections as part of a larger body of text;
think "highlighted portions" of text. Whenever you store a selection on the heap,
think of it as storing the larger body of text, with a particular region
highlighted.
Internally, a grapheme selection consists of three things: a trunk, the starting grapheme index, and the ending grapheme index (exclusive). That the end index is exclusive means you can have empty selections. Think of them as "cursors" or "I-beams".
A selection trunk contains data shared by all selections from the same string. In a way, all selections stem from X, and the most fitting term for X is, arguably, "the trunk".
Most importantly, a selection trunk keeps a reference to the original string.
We call it the trunk string. The reference keeps the trunk string alive.
It is also the reason you must not use selections as a generic replacement for strings.
If you really need a selection that stands on its own, call #detach where appropriate.
There is also the notion of relative versus absolute indices. A relative index counts from the start of a selection. An absolute index counts from the start of the trunk string instead. Thus we get "absolute byte indices", "relative grapheme indices" and so on.
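The relationship between a trunk, a selection, and relative versus absolute indices can be sketched with a toy model. This is not the library's implementation: `ToySeln` and its fields are hypothetical names, and Python string indices (code points) stand in for grapheme indices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToySeln:
    """Hypothetical stand-in for a selection: a shared trunk plus [begin, end)."""
    trunk: str   # the trunk string; every selection holds a reference to it
    begin: int   # absolute start index
    end: int     # absolute end index (exclusive)

    def select(self, rel_from, rel_to):
        # Relative indices count from this selection's start, so shifting
        # them by `begin` gives absolute indices into the trunk.
        return ToySeln(self.trunk, self.begin + rel_from, self.begin + rel_to)

    def text(self):
        return self.trunk[self.begin:self.end]


whole = ToySeln("hello world", 0, 11)
world = whole.select(6, 11)   # indices relative to `whole`
orl = world.select(1, 4)      # indices relative to `world`
cursor = world.select(0, 0)   # empty selection, like an I-beam before "world"

print(orl.text(), (orl.begin, orl.end))   # orl (7, 10)
print(repr(cursor.text()), cursor.begin)  # '' 6
```

Every selection derived this way points at the same trunk object, which is the property that keeps the trunk string alive.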
EXPERIMENTAL Relies on features from the experimental String::Grapheme API.
Defined in:
permafrost/grapheme_seln.cr
Constructors
-
.new(string : String) : GraphemeSeln
Constructs a grapheme selection from string.
Instance Method Summary
-
#+(other : GraphemeSeln) : GraphemeSeln
Joins two selections into one.
-
#==(other : String) : Bool
Returns true if the content of this selection is equal, bytewise, to that of other.
-
#==(other : GraphemeSeln) : Bool
Two grapheme selections are equal if their strings are equal, and the selections "highlight" the same part of the string.
-
#after(index : Int32) : GraphemeSeln
Returns a selection of graphemes after index (relative; exclusive).
-
#after_end : GraphemeSeln
Returns an empty selection after the end of this one.
-
#at(index : Int32) : GraphemeSeln
Returns a selection of the grapheme at index (relative).
-
#at_byte(byte_index : Int32) : GraphemeSeln
Returns a selection of the grapheme at byte index (relative).
-
#at_byte_abs(byte_index : Int32) : GraphemeSeln
Returns a selection of the grapheme at byte index (absolute).
-
#before(index : Int32) : GraphemeSeln
Returns a selection of graphemes before index (relative; exclusive).
-
#before?(other : GraphemeSeln) : Bool
Returns true if this selection comes before other in the trunk string.
-
#before_start : GraphemeSeln
Returns an empty selection before the start of this one.
-
#begin : Int32
Returns the start grapheme index of this selection (absolute).
-
#byte_end : Int32
Returns the end byte index in the trunk string (absolute; exclusive).
-
#byte_mask_split(objects : Indexable(T), object : T, & : GraphemeSeln, GraphemeSeln -> ) : Nil forall T
Splits based on an external "mask" or annotation array, aligned with the trunk string's bytes.
-
#byte_range : Range(Int32, Int32)
Returns an exclusive range of selected grapheme bytes (absolute).
-
#byte_select(from : Int32, to : Int32) : GraphemeSeln
Selects the range of graphemes defined by from (byte index; relative) and to (byte index; exclusive; relative).
-
#byte_select_abs(from : Int32, to : Int32) : GraphemeSeln
Selects graphemes in the range defined by from (byte index; absolute) and to (byte index; absolute).
-
#byte_select_inclusive(from : Int32, to : Int32) : GraphemeSeln
Selects graphemes in the range defined by from (byte index; relative) and to (byte index; relative).
-
#byte_start : Int32
Returns the start byte index in the trunk string (absolute).
-
#bytesize : Int32
Returns the number of selected bytes.
-
#chomp : GraphemeSeln
Shrinks this selection so that it does not end with trailing LF \n or CRLF \r\n.
-
#contiguous?(other : GraphemeSeln) : Bool
Returns true if this selection directly precedes other in the trunk string.
-
#covers_fully? : Bool
Returns true if the whole trunk string is selected.
-
#detach : GraphemeSeln
Returns a copy of this selection which is detached from the trunk.
-
#each_line(& : GraphemeSeln -> ) : Nil
Splits this selection at newlines (\n) and yields each resulting fragment.
-
#each_segment(segmenter : GraphemeView -> T, & : T, GraphemeSeln -> ) forall T
Yields selections of contiguous runs of graphemes for which the given segmenter function produced the same value, along with that value.
-
#empty? : Bool
Returns true if this selection contains zero graphemes.
-
#end : Int32
Returns the end grapheme index of this selection (exclusive; absolute).
-
#ends_with?(object) : Bool
Returns true if this selection's last grapheme matches object.
-
#expand : GraphemeSeln
Expands this selection to enclose the entirety of the trunk string.
-
#first : GraphemeSeln
Returns a selection containing the first grapheme.
-
#grapheme : GraphemeView
Asserts that this selection includes exactly one grapheme, and returns a view of that grapheme.
-
#inside?(other lg : GraphemeSeln) : Bool
Returns true if this selection is fully inside other, or if it is the same as other.
-
#inspect(io)
-
#ix : Indexable(GraphemeView)
Returns an indexable of selected graphemes.
-
#last : GraphemeSeln
Returns a selection containing the last grapheme.
-
#partition(object) : Tuple(GraphemeSeln, GraphemeSeln, GraphemeSeln)
Splits this selection into three: one before the leftmost grapheme matching object, one with just that grapheme, and one after the grapheme.
-
#partition(& : GraphemeView -> Bool) : Tuple(GraphemeSeln, GraphemeSeln, GraphemeSeln)
Splits this selection into three: one before the leftmost grapheme for which the block returns true, one with just that grapheme, and one after the grapheme.
-
#pred : GraphemeSeln
Shifts this selection by one grapheme to the left (in the trunk string).
-
#present? : Bool
Returns true if this selection contains one or more graphemes.
-
#prior : GraphemeSeln
Returns a selection of graphemes before the last one.
-
#properly_inside?(other lg : GraphemeSeln) : Bool
Returns true if this selection is fully inside other.
-
#range : Range(Int32, Int32)
Returns an exclusive range of selected grapheme indices (absolute).
-
#rest : GraphemeSeln
Returns a selection of graphemes after the first one.
-
#select(from : Int32, to : Int32) : GraphemeSeln
Selects the range of graphemes defined by from (relative) and to (relative; exclusive).
-
#select(range : Range(Int32, Int32)) : GraphemeSeln
Selects the range of graphemes defined by range.
-
#size : Int32
Returns the number of selected graphemes.
-
#split(object, & : GraphemeSeln -> ) : Nil
Splits this selection at objects and yields each resulting fragment.
-
#starts_with?(object) : Bool
Returns true if this selection's first grapheme matches object.
-
#succ : GraphemeSeln
Shifts this selection by one grapheme to the right (in the trunk string).
-
#through(other : GraphemeSeln) : GraphemeSeln
Selects all graphemes between this selection's start and other's end.
-
#to_s(io)
-
#to_s
Returns a nicely readable and concise string representation of this object, typically intended for users.
-
#to_slice : Bytes
Returns a read-only view of the underlying bytes of this selection.
Constructor Detail
Instance Method Detail
Joins two selections into one. An important requirement is that self and
other must form a contiguous sequence (see #contiguous?).
Returns true if the content of this selection is equal, bytewise, to that
of other.
Two grapheme selections are equal if their strings are equal, and the selections "highlight" the same part of the string.
Returns a selection of graphemes after index (relative; exclusive).
Returns a selection of the grapheme at byte index (relative).
Returns a selection of the grapheme at byte index (absolute).
Returns a selection of graphemes before index (relative; exclusive).
Returns true if this selection comes before other in the trunk string.
Splits based on an external "mask" or annotation array, aligned with the trunk string's bytes. Yields the part before a delimiter (first block arg) separately from the delimiter itself (the second block arg). The delimiter is empty at the end.
Returns an exclusive range of selected grapheme bytes (absolute).
Selects the range of graphemes defined by from (byte index; relative) and to (byte index; exclusive; relative).
Selects graphemes in the range defined by from (byte index; absolute) and to (byte index; absolute). The from grapheme is one that includes the from byte index. The to grapheme is one that includes the to byte index. The to grapheme is excluded from the resulting selection.
Selects graphemes in the range defined by from (byte index; relative) and to (byte index; relative). The from grapheme is one that includes the from byte index. The to grapheme is one that includes the to byte index. The to grapheme is included in the resulting selection.
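The rule that a byte index resolves to the grapheme that includes it can be illustrated with a small model. This sketch is an assumption-laden approximation, not the library's code: `unit_starts`, `unit_at_byte` and `toy_byte_select` are hypothetical helpers, each code point is treated as one grapheme, and out-of-range byte indices are not handled.

```python
from bisect import bisect_right


def unit_starts(text):
    """UTF-8 byte offset at which each unit (here: one code point) starts."""
    starts, offset = [], 0
    for ch in text:
        starts.append(offset)
        offset += len(ch.encode("utf-8"))
    return starts


def unit_at_byte(starts, byte_index):
    # Index of the unit whose byte span includes byte_index.
    return bisect_right(starts, byte_index) - 1


def toy_byte_select(starts, frm, to, inclusive=False):
    first = unit_at_byte(starts, frm)
    last = unit_at_byte(starts, to)
    # Exclusive form drops the `to` unit; inclusive form keeps it.
    return (first, last + 1) if inclusive else (first, last)


text = "na\u00efve"            # the third character takes two bytes in UTF-8
starts = unit_starts(text)     # [0, 1, 2, 4, 5]

# Byte 3 falls in the middle of the two-byte character at unit index 2.
print(unit_at_byte(starts, 3))                         # 2
print(toy_byte_select(starts, 1, 3))                   # (1, 2)
print(toy_byte_select(starts, 1, 3, inclusive=True))   # (1, 3)
```

The returned pairs are exclusive unit ranges, so `(1, 2)` covers only "a" while `(1, 3)` also covers the two-byte character.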
Shrinks this selection so that it does not end with trailing LF \n
or CRLF \r\n.
Returns true if this selection directly precedes other in the trunk string.
That is, this selection must end where other begins for this method to
return true.
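As a rough illustration of the contiguity rule, and of why joining with #+ depends on it, selections can be modelled as `(begin, end)` pairs with an exclusive end. The helpers `contiguous` and `join` are hypothetical, and raising `ValueError` for a non-contiguous join is an assumption of this sketch, not documented library behavior.

```python
def contiguous(a, b):
    # `a` directly precedes `b` when a's exclusive end equals b's start.
    return a[1] == b[0]


def join(a, b):
    # Joining only makes sense when no gap or overlap separates the two.
    if not contiguous(a, b):
        raise ValueError("selections are not contiguous")
    return (a[0], b[1])


left, right, far = (0, 3), (3, 7), (8, 10)

print(contiguous(left, right))  # True
print(contiguous(left, far))    # False
print(join(left, right))        # (0, 7)
```

Note that an empty selection such as `(3, 3)` is contiguous with both `(0, 3)` and `(3, 7)` under this definition.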
Returns a copy of this selection which is detached from the trunk.
You should imagine this as opening a new buffer in a text editor with the contents of the selection. The benefit is that the old buffer can now be freed by the GC as soon as possible. The drawback, obviously, is that the selection loses its context.
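The trade-off can be shown by analogy with Python's `memoryview`, which behaves like an attached selection: it is cheap, but it holds a reference to the whole underlying buffer. Copying the bytes out plays the role of detaching. This is only an analogy under that assumption; it says nothing about how #detach is implemented.

```python
trunk = b"header: value\n"

# Attached: a zero-copy view that keeps `trunk` alive for as long as it exists.
view = memoryview(trunk)[8:13]

# Detached: an independent copy that no longer references `trunk`.
detached = bytes(view)

print(view.obj is trunk)    # True
print(detached)             # b'value'
print(detached is trunk)    # False
```

After the copy, releasing `view` and dropping `trunk` would leave `detached` usable, but nothing about it records where in the original text it came from.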
Splits this selection at newlines (\n) and yields each resulting fragment.
Note that newlines themselves are not removed, so fragments may or may not
end with a newline. An important property of this method is that the yielded
fragments form a contiguous sequence, so you can, for example, join them back
with #+ to get the original selection.
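A minimal sketch of splitting that keeps the newlines and yields contiguous fragments follows. `toy_each_line` is a hypothetical helper working on `(begin, end)` index pairs; how the real method treats an empty selection or a selection ending in a newline is not stated here, so the sketch simply yields nothing further in those cases.

```python
def toy_each_line(text, begin, end):
    """Yield (begin, end) fragments of text[begin:end], each ending right after a newline."""
    start = begin
    while start < end:
        nl = text.find("\n", start, end)
        stop = end if nl == -1 else nl + 1   # keep the newline in the fragment
        yield (start, stop)
        start = stop


text = "one\ntwo\nthree"
frags = list(toy_each_line(text, 0, len(text)))

print(frags)                          # [(0, 4), (4, 8), (8, 13)]
print([text[a:b] for a, b in frags])  # ['one\n', 'two\n', 'three']
```

Because each fragment starts where the previous one ended, concatenating them in order reproduces the original range exactly.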
Yields selections of contiguous runs of graphemes for which the given segmenter function produced the same value, along with that value.
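Grouping into runs by a segmenter value resembles what `itertools.groupby` does for adjacent items. The following sketch assumes that resemblance; `toy_each_segment` is a hypothetical helper, and characters stand in for grapheme views.

```python
from itertools import groupby


def toy_each_segment(units, begin, end, segmenter):
    """Yield (value, (begin, end)) for each run of adjacent units with equal segmenter value."""
    index = begin
    for value, run in groupby(units[begin:end], key=segmenter):
        length = sum(1 for _ in run)
        yield value, (index, index + length)
        index += length


text = "abc123de"
segments = list(toy_each_segment(text, 0, len(text), str.isdigit))

print(segments)  # [(False, (0, 3)), (True, (3, 6)), (False, (6, 8))]
```

Runs are only merged when they are adjacent, which is why the two non-digit runs come out as separate segments.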
Returns true if this selection's last grapheme matches object.
See GraphemeView#==.
Asserts that this selection includes exactly one grapheme, and returns a view of that grapheme.
Returns true if this selection is fully inside other, or if it is
the same as other.
Splits this selection into three: one before the leftmost grapheme matching
object, one with just that grapheme, and one after the grapheme. If there
is no grapheme matching object, the first selection is self, and the last
two point to #after_end.
NOTE Comparison of each grapheme with object is performed using GraphemeView#==.
Splits this selection into three: one before the leftmost grapheme for which
the block returns true, one with just that grapheme, and one after the grapheme.
If the block returns true for no grapheme, the first selection is self,
and the last two point to #after_end.
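Both forms of partitioning follow the same shape, which can be sketched over `(begin, end)` index pairs. `toy_partition` is a hypothetical helper; it takes a predicate, so the object form corresponds to passing an equality check.

```python
def toy_partition(units, begin, end, matches):
    """Split [begin, end) around the leftmost unit for which matches(unit) is true."""
    for i in range(begin, end):
        if matches(units[i]):
            return (begin, i), (i, i + 1), (i + 1, end)
    # No match: the whole selection comes first, followed by two empty
    # selections positioned after its end.
    return (begin, end), (end, end), (end, end)


text = "key=value"

print(toy_partition(text, 0, len(text), lambda g: g == "="))
# ((0, 3), (3, 4), (4, 9))

print(toy_partition(text, 0, 3, lambda g: g == "="))
# ((0, 3), (3, 3), (3, 3))
```

In both cases the three parts are contiguous, so they cover the original range without gaps.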
Returns true if this selection is fully inside other.
Returns an exclusive range of selected grapheme indices (absolute).
Selects the range of graphemes defined by from (relative) and to (relative; exclusive).
Selects the range of graphemes defined by range.
Splits this selection at objects and yields each resulting fragment. The yielded fragments form a contiguous sequence.
NOTE Comparison of each grapheme with object is performed using GraphemeView#==.
Returns true if this selection's first grapheme matches object.
See GraphemeView#==.
Selects all graphemes between this selection's start and other's end.
Returns a nicely readable and concise string representation of this object, typically intended for users.
This method should usually not be overridden. It delegates to
#to_s(IO) which can be overridden for custom implementations.
Also see #inspect.
Returns a read-only view of the underlying bytes of this selection. The view is a subview of the trunk string's bytes.