License | GPL-2 |
---|---|
Maintainer | yi-devel@googlegroups.com |
Stability | experimental |
Portability | portable |
Safe Haskell | Safe |
Language | Haskell2010 |
Extensions |
|
This module defines a rope
data structure for use in Yi. This
specific implementation uses a fingertree over Text.
In contrast to our old implementation, we can now reap all the benefits of Text: automatic unicode handling and blazing fast implementation on underlying strings. This frees us from a lot of book-keeping. We don't lose out on not using ByteString directly because the old implementation encoded it into UTF8 anyway, making it unsuitable for storing anything but text.
- data YiString
- fromString :: String -> YiString
- fromText :: Text -> YiString
- fromString' :: Int -> String -> YiString
- fromText' :: Int -> Text -> YiString
- toString :: YiString -> String
- toReverseString :: YiString -> String
- toText :: YiString -> Text
- toReverseText :: YiString -> Text
- null :: YiString -> Bool
- empty :: YiString
- take :: Int -> YiString -> YiString
- drop :: Int -> YiString -> YiString
- length :: YiString -> Int
- reverse :: YiString -> YiString
- countNewLines :: YiString -> Int
- lines :: YiString -> [YiString]
- lines' :: YiString -> [YiString]
- unlines :: [YiString] -> YiString
- splitAt :: Int -> YiString -> (YiString, YiString)
- splitAtLine :: Int -> YiString -> (YiString, YiString)
- cons :: Char -> YiString -> YiString
- snoc :: YiString -> Char -> YiString
- singleton :: Char -> YiString
- head :: YiString -> Maybe Char
- last :: YiString -> Maybe Char
- append :: YiString -> YiString -> YiString
- concat :: [YiString] -> YiString
- any :: (Char -> Bool) -> YiString -> Bool
- all :: (Char -> Bool) -> YiString -> Bool
- dropWhile :: (Char -> Bool) -> YiString -> YiString
- takeWhile :: (Char -> Bool) -> YiString -> YiString
- dropWhileEnd :: (Char -> Bool) -> YiString -> YiString
- takeWhileEnd :: (Char -> Bool) -> YiString -> YiString
- intercalate :: YiString -> [YiString] -> YiString
- intersperse :: Char -> [YiString] -> YiString
- filter :: (Char -> Bool) -> YiString -> YiString
- map :: (Char -> Char) -> YiString -> YiString
- words :: YiString -> [YiString]
- unwords :: [YiString] -> YiString
- split :: (Char -> Bool) -> YiString -> [YiString]
- init :: YiString -> Maybe YiString
- tail :: YiString -> Maybe YiString
- span :: (Char -> Bool) -> YiString -> (YiString, YiString)
- break :: (Char -> Bool) -> YiString -> (YiString, YiString)
- foldl' :: (a -> Char -> a) -> a -> YiString -> a
- replicate :: Int -> YiString -> YiString
- replicateChar :: Int -> Char -> YiString
- readFile :: FilePath -> IO (Either Text YiString)
- writeFile :: FilePath -> YiString -> IO ()
- fromRope :: YiString -> FingerTree Size YiChunk
- withText :: (Text -> Text) -> YiString -> YiString
- unsafeWithText :: (Text -> Text) -> YiString -> YiString
Documentation
A YiString
is a FingerTree
with cached char and line counts
over chunks of Text
.
Eq YiString Source # | Two Implementation note: This just uses The derived Eq implementation for the underlying tree only passes the equality check if the chunks are the same too which is not what we want. |
Ord YiString Source # | |
Show YiString Source # | |
IsString YiString Source # | |
Monoid YiString Source # | |
Binary YiString Source # | To serialise a |
NFData YiString Source # | |
Conversions to YiString
fromText :: Text -> YiString Source #
Converts a Text
into a YiString
using
defaultChunkSize
-sized chunks for the underlying tree.
fromText' :: Int -> Text -> YiString Source #
This is like fromText
but it allows the user to specify the
chunk size to be used. Uses defaultChunkSize
if the given
size is <= 0.
Conversions from YiString
toReverseString :: YiString -> String Source #
See toReverseText
.
Note that it is actually ~4.5 times faster to use toReverseText
and unpack the result than to convert to String
and use
reverse
.
toReverseText :: YiString -> Text Source #
Spits out the underlying string, reversed.
Note that this is actually slightly faster than manually unrolling
the tree from the end, reverse
ing each chunk and
concat
ing, at least with -O2 which you really need to be using
with Text
anyway.
Functions over content
reverse :: YiString -> YiString Source #
Reverse the whole underlying string.
This involves reversing the order of the chunks as well as content of the chunks. We use a little optimisation here that re-uses the content of each chunk but this exposes a potential problem: after many transformations, our chunks size might become quite varied (but never more than the default size), perhaps we should periodically rechunk the tree to recover nice sizes?
countNewLines :: YiString -> Int Source #
Count the number of newlines in the underlying string. This is actually amortized constant time as we cache this information in the underlying tree.
lines :: YiString -> [YiString] Source #
This is like lines'
but it does *not* preserve newlines.
Specifically, we just strip the newlines from the result of
lines'
.
This behaves slightly differently than the old split: the number of
resulting strings here is equal to the number of newline characters
in the underlying string. This is much more consistent than the old
behaviour which blindly used ByteString
s split and stitched the
result back together which was inconsistent with the rest of the
interface which worked with number of newlines.
lines' :: YiString -> [YiString] Source #
Splits the YiString
into a list of YiString
each containing a
line.
Note that in old implementation this allowed an arbitrary character
to split on. If you want to do that, manually convert toText
and
use splitOn
to suit your needs. This case is optimised for
newlines only which seems to have been the only use of the original
function.
The newlines are preserved so this should hold:
'toText' . 'concat' . 'lines'' ≡ 'toText'
but the underlying structure might change: notably, chunks will most likely change sizes.
splitAt :: Int -> YiString -> (YiString, YiString) Source #
Splits the string at given character position.
If position <= 0
then the left string is empty and the right string
contains everything else.
If position >= length of the string
then the left string contains
everything and the right string is empty.
Implementation note: the way this works is by splitting the
underlying finger at a closest chunk that goes *over* the given
position (see split
). This either results in a perfect split at
which point we're done or more commonly, it leaves as few
characters short and we need to take few characters from the first
chunk of the right side of the split. We do precisely that.
All together, this split is only as expensive as underlying
split
, the cost of splitting a chunk into two, the cost of one
cons and one cons of a chunk and lastly the cost of splitAt
of
the underlying Text
. It turns out to be fairly fast all
together.
splitAtLine :: Int -> YiString -> (YiString, YiString) Source #
Splits the underlying string before the given line number. Zero-indexed lines.
Splitting at line <= 0 gives you an empty string. Splitting at
n > 0
gives you the first n lines.
Also see splitAtLine'
.
head :: YiString -> Maybe Char Source #
Take the first character of the underlying string if possible.
last :: YiString -> Maybe Char Source #
Take the last character of the underlying string if possible.
append :: YiString -> YiString -> YiString Source #
Append two YiString
s.
We take the extra time to optimise this append for many small
insertions. With naive append of the inner fingertree with ><
,
it is often the case that we end up with a large collection of tiny
chunks. This function instead tries to join the underlying trees at
outermost chunks up to defaultChunkSize
which while slower,
should improve memory usage.
I suspect that this pays for itself as we'd spend more time computing over all the little chunks than few large ones anyway.
any :: (Char -> Bool) -> YiString -> Bool Source #
YiString
specialised any
.
Implementation note: this currently just does any by doing ‘TX.Text’ conversions upon consecutive chunks. We should be able to speed it up by running it in parallel over multiple chunks.
dropWhileEnd :: (Char -> Bool) -> YiString -> YiString Source #
As dropWhile
but drops from the end instead.
takeWhileEnd :: (Char -> Bool) -> YiString -> YiString Source #
Like takeWhile
but takes from the end instead.
intercalate :: YiString -> [YiString] -> YiString Source #
Concatenates the list of YiString
s after inserting the
user-provided YiString
between the elements.
Empty YiString
s are not ignored and will end up as strings of
length 1. If you don't want this, it's up to you to pre-process the
list. Just as with intersperse
, it is up to the user to
pre-process the list.
intersperse :: Char -> [YiString] -> YiString Source #
Intersperses the given character between the YiString
s. This is
useful when you have a bunch of strings you just want to separate
something with, comma or a dash. Note that it only inserts the
character between the elements.
What's more, the result is a single YiString
. You can easily
achieve a version that blindly inserts elements to the back by
mapping over the list instead of using this function.
You can think of it as a specialised version of
intercalate
. Note that what this does not do is
intersperse characters into the underlying text, you should convert
and use intersperse
for that instead.
filter :: (Char -> Bool) -> YiString -> YiString Source #
Filters the characters from the underlying string.
>>>
filter (/= 'a') "bac"
"bc"
map :: (Char -> Char) -> YiString -> YiString Source #
Maps the characters over the underlying string.
unwords :: [YiString] -> YiString Source #
Join given YiString
s with a space. Empty lines will be filtered
out first.
split :: (Char -> Bool) -> YiString -> [YiString] Source #
Splits the YiString
on characters matching the predicate, like
split
.
For splitting on newlines use lines
or lines'
instead.
Implementation note: GHC actually makes this naive implementation about as fast and in cases with lots of splits, faster, as a hand-rolled version on chunks with appends which is quite amazing in itself.
init :: YiString -> Maybe YiString Source #
Takes every character but the last one: returns Nothing on empty string.
tail :: YiString -> Maybe YiString Source #
Takes the tail of the underlying string. If the string is empty to begin with, returns Nothing.
span :: (Char -> Bool) -> YiString -> (YiString, YiString) Source #
Returns a pair whose first element is the longest prefix
(possibly empty) of t of elements that satisfy p, and whose second
is the remainder of the string. See also span
.
This implementation uses splitAt
which actually is just
as fast as hand-unrolling the tree. GHC sure is great!
break :: (Char -> Bool) -> YiString -> (YiString, YiString) Source #
Just like span
but with the predicate negated.
foldl' :: (a -> Char -> a) -> a -> YiString -> a Source #
Left fold.
Benchmarks show that folding is actually Pretty Damn Slow™: consider whether folding is really the best thing to use in your scenario.
replicate :: Int -> YiString -> YiString Source #
Replicate the given YiString set number of times, concatenating
the results. Also see replicateChar
.
replicateChar :: Int -> Char -> YiString Source #
Replicate the given character set number of times and pack the
result into a YiString
.
>>>
replicateChar 4 ' '
" "
IO
readFile :: FilePath -> IO (Either Text YiString) Source #
Reads file into the rope, also returning the ConverterName
that
was used for decoding. You should resupply this to writeFile
if
you're aiming to preserve the original encoding.
If we fail to guess the encoding used, error message is given instead.
It is up to the user to handle exceptions not directly related to character decoding.
writeFile :: FilePath -> YiString -> IO () Source #
Write a YiString
into the given file.
It's up to the user to handle exceptions.
Escape latches to underlying content. Note that these are safe
withText :: (Text -> Text) -> YiString -> YiString Source #
Helper function doing conversions of to and from underlying
Text
. You should aim to implement everything in terms of
YiString
instead.
Please note that this maps over each chunk so this can only be used with layout-agnostic functions. For example
>>>
let t = 'fromString' "abc" <> 'fromString' "def"
>>>
'toString' $ 'withText' 'TX.reverse' t
"cbafed"
Probably doesn't do what you wanted, but toUpper
would.
Specifically, for any f :
, Text
→ Text
withText
will
only do the ‘expected’ thing iff
f x <> f y ≡ f (x <> y)
which should look very familiar.
unsafeWithText :: (Text -> Text) -> YiString -> YiString Source #
Maps over each chunk which means this function is UNSAFE! If
you use this with functions which don't preserve Size
, that is
the chunk length and number of newlines, things will break really,
really badly. You should not need to use this.
Also see unsafeFmap