string-interpolate
Haskell having 5 different textual types in common use (String, strict and lazy
Text, strict and lazy ByteString) means that doing any kind of string
manipulation becomes a complicated game of type tetris with constant conversion
back and forth. What if string handling was as simple and easy as it is in
literally any other language?
Behold:
showWelcomeMessage :: Text -> Integer -> Text
showWelcomeMessage username visits =
  [i|Welcome to my website, #{username}! You are visitor #{visits}!|]
No more needing to mconcat, mappend, and (<>) to glue strings together.
No more having to remember a gajillion different functions for converting
between strict and lazy versions of Text, or having to worry about encoding
between Text <=> ByteString. No more getting bitten by trying to work with
Unicode ByteStrings. It just works!
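For contrast, here is roughly what the same welcome message looks like when glued together by hand. This is only a sketch: showWelcomeMessageManual is a hypothetical name, and it assumes the text package plus -XOverloadedStrings.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import           Data.Text ( Text )
import qualified Data.Text as T

-- Hypothetical hand-written counterpart of showWelcomeMessage.
-- The Integer has to be shown and packed into Text explicitly.
showWelcomeMessageManual :: Text -> Integer -> Text
showWelcomeMessageManual username visits =
  "Welcome to my website, " <> username
    <> "! You are visitor " <> T.pack (show visits) <> "!"
```

Every non-Text value needs its own conversion (here T.pack and show), which is the noise the quasiquoter is meant to remove.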
string-interpolate provides a quasiquoter, i, that allows you to interpolate
expressions directly into your string. It can produce anything that is an
instance of IsString, and can interpolate anything which is an instance of
Show.
Unicode handling
string-interpolate handles converting to/from Unicode when converting
String/Text to ByteString and vice versa. Lots of libraries use ByteString to
represent human-readable text, even though this is not safe. There are lots of
useful libraries in the ecosystem that are unfortunately annoying to work with
because of the need to generate ByteStrings containing application-specific info.
Insisting on explicitly converting to/from UTF-8 in these cases and handling
decoding failures adds lots of syntactic noise, when often you can reasonably
assume that a given ByteString will, 95% of the time, contain Unicode text.
So string-interpolate aims to provide reasonable defaults around conversion
between ByteString and real textual types so that developers don't need to
constantly be aware of text encodings.
When converting a String/Text to a ByteString, string-interpolate will
automatically encode it as a sequence of UTF-8 bytes. When converting a
ByteString to String/Text, string-interpolate will assume that the ByteString
contains a UTF-8 string, and convert the characters accordingly. Any invalid
characters in the ByteString will be converted to the Unicode replacement
character � (U+FFFD).
Remember: string-interpolate is not designed for 100% correctness around text
encodings, just for convenience in the most common case. If you absolutely need
to be aware of text encodings and to handle decode failures, take a look at
text-conversions.
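The conversion rules described above can be approximated with functions from the text package. This is a sketch of the behaviour, not string-interpolate's actual implementation; toBytes, fromBytes, brokenInput, and roundTrip are hypothetical names chosen for illustration.

```haskell
import           Data.ByteString ( ByteString )
import qualified Data.ByteString as B
import           Data.Text ( Text )
import qualified Data.Text as T
import           Data.Text.Encoding ( decodeUtf8With, encodeUtf8 )
import           Data.Text.Encoding.Error ( lenientDecode )

-- Text -> ByteString: encode as a sequence of UTF-8 bytes.
toBytes :: Text -> ByteString
toBytes = encodeUtf8

-- ByteString -> Text: assume UTF-8, and replace invalid input
-- with the Unicode replacement character U+FFFD instead of failing.
fromBytes :: ByteString -> Text
fromBytes = decodeUtf8With lenientDecode

-- "hi" followed by the byte 0xFF, which is never valid in UTF-8.
brokenInput :: ByteString
brokenInput = B.pack [0x68, 0x69, 0xFF]

-- Valid text survives a trip through ByteString unchanged.
roundTrip :: Text
roundTrip = fromBytes (toBytes (T.pack "héllo"))
```

With these definitions, fromBytes brokenInput is "hi" followed by a single U+FFFD, and roundTrip equals the original text. If you need to detect the failure rather than paper over it, this lenient approach is exactly what you should avoid.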
Usage
First things first: add string-interpolate to your dependencies:
dependencies:
- string-interpolate
and import the quasiquoter and enable -XQuasiQuotes:
{-# LANGUAGE QuasiQuotes #-}
import Data.String.Interpolate ( i )
Wrap anything you want to be interpolated with #{}:
λ> name = "William"
λ> [i|Hello, #{name}!|] :: String
>>> "Hello, William!"
You can interpolate in anything which implements Show:
λ> import Data.Time
λ> now <- getCurrentTime
λ> [i|The current time is #{now}.|] :: String
>>> "The current time is 2019-03-10 18:58:40.573892546 UTC."
...and interpolate into anything which implements IsString.
string-interpolate must know what concrete type it's producing; it cannot be
used to generate an IsString a => a. If you're using string-interpolate from
GHCi, make sure to add type signatures to toplevel usages!
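In a source file, the usual way to pin down the output type is a type signature on the binding. A minimal sketch, assuming the text and bytestring packages are available; name and the three greeting bindings are hypothetical:

```haskell
{-# LANGUAGE QuasiQuotes #-}
import Data.ByteString ( ByteString )
import Data.String.Interpolate ( i )
import Data.Text ( Text )

name :: String
name = "William"

-- The same quasiquote body, with the output type fixed by each signature.
greetingString :: String
greetingString = [i|Hello, #{name}!|]

greetingText :: Text
greetingText = [i|Hello, #{name}!|]

greetingBytes :: ByteString
greetingBytes = [i|Hello, #{name}!|]
```

Note that the interpolated value stays a String in all three cases; only the signature on the result changes.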
string-interpolate also needs to know what concrete type it's interpolating.
For instance, the following code won't work:
showIt :: Show a => a -> String
showIt it = [i|The value: #{it}|]
You would need to convert it to a String using show first.
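A version of showIt that follows this advice might look like the sketch below. Because show it is always a String, the quasiquoter has a concrete type to work with:

```haskell
{-# LANGUAGE QuasiQuotes #-}
import Data.String.Interpolate ( i )

-- Calling show inside the interpolation turns the polymorphic
-- value into a concrete String before it is spliced in.
showIt :: Show a => a -> String
showIt it = [i|The value: #{show it}|]
```

For example, showIt (42 :: Int) should give "The value: 42". Keep in mind that show adds quotes of its own to strings and characters, so showIt "hi" includes the quotation marks in the output.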
Strings and characters are always interpolated without surrounding quotes.
λ> verb = 'c'
λ> noun = "sea"
λ> [i|We went to go #{verb} the #{noun}.|] :: String
>>> "We went to go c the sea."
You can interpolate arbitrary expressions:
λ> [i|Tomorrow's date is #{addDays 1 $ utctDay now}.|] :: String
>>> "Tomorrow's date is 2019-03-11."
string-interpolate, by default, handles multiline strings by copying the
newline verbatim into the output.
λ> :{
| [i|
| a
| b
| c
| |] :: String
| :}
>>> "\n a\n b\n c\n"
A second quasiquoter, iii, is provided that handles multiline strings/whitespace
in a different way, by collapsing any whitespace into a single space. The
intention is to use it when you want to split something across multiple
lines in source for readability but want it emitted like a normal sentence.
iii is otherwise identical to i.
λ> :{
| [iii|
| Lorum
| ipsum
| dolor
| sit
| amet.
| |] :: String
| :}
>>> "Lorum ipsum dolor sit amet."
A mnemonic for remembering what iii does is to look at the i's as individual
lines which have been collapsed into a single line.
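To get a feel for the collapsing, the effect on plain literal text can be imitated with words and unwords from the Prelude. This is only an approximation for illustration: collapseWhitespace is a hypothetical helper, and the real iii may treat some cases (interpolated values, for instance) differently.

```haskell
-- Hypothetical helper, not part of string-interpolate.
-- Splits on any run of whitespace (including newlines), drops the
-- leading and trailing whitespace, and rejoins with single spaces.
collapseWhitespace :: String -> String
collapseWhitespace = unwords . words
```

Applied to "\n Lorum\n ipsum\n dolor\n sit\n amet.\n", this returns "Lorum ipsum dolor sit amet.", matching the iii example above.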
Backslashes are handled exactly the same way they are in normal Haskell strings.
If you need to put a literal #{ into your string, prefix the pound symbol with
a backslash:
λ> [i|\#{ some inner text }#|] :: String
>>> "#{ some inner text }#"
Comparison to other interpolation libraries
Other interpolation libraries are available: Text.Printf, neat-interpolation,
interpolate, formatting, Interpolation, and interpolatedstring-perl6.
Of these, Text.Printf isn't exception-safe, and neat-interpolation can only
produce Text values. interpolate, formatting, Interpolation, and
interpolatedstring-perl6 offer different solutions to the general problem of
interpolating any value into any kind of text.
Features
| | string-interpolate | interpolate | formatting | Interpolation | interpolatedstring-perl6 | neat-interpolation |
|---|---|---|---|---|---|---|
| String/Text support | ✅ | ✅ | ✅ | ⚠️ | ✅ | ⚠️ |
| ByteString support | ✅ | ✅ | ❌ | ⚠️ | ✅ | ❌ |
| Can interpolate arbitrary Show instances | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
| Unicode-aware | ✅ | ❌ | ⚠️ | ❌ | ❌ | ⚠️ |
| Multiline strings | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Indentation handling | ❌ | ❌ | ❌ | ✅ | ❌ | ✅ |
| Whitespace/newline chomping | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
⚠ Since formatting doesn't support ByteStrings, it technically supports
Unicode.
⚠ Interpolation supports all five textual formats, but doesn't allow you
to mix and match; that is, you can't interpolate a String into an output
string of type Text, and vice versa.
⚠ neat-interpolation only supports Text. Because of that, it technically
supports Unicode.
Performance
Overall: string-interpolate is competitive with the fastest interpolation
libraries, only getting outperformed by Interpolation and
interpolatedstring-perl6, and even then mostly on ByteStrings. Since
these two libraries don't handle Unicode and string-interpolate converts
things to UTF-8, some slowdown is to be expected here.
We run three benchmarks: small string interpolation (<100 chars) with a single
interpolation parameter; small strings with multiple interpolation parameters;
and large string (~100KB) interpolation. Each of these benchmarks is then
run against String, strict Text, and strict ByteString. Numbers are runtime
in relation to string-interpolate; smaller is better.
| | string-interpolate | interpolate | formatting | Interpolation | interpolatedstring-perl6 | neat-interpolation |
|---|---|---|---|---|---|---|
| small String | 1x | 1x | 2x | 1x | 1x | N/A |
| multi interp, String | 1x | 7x | 2.3x | 0.63x | 0.63x | N/A |
| small Text | 1x | 28x | 1.5x | 2.2x | 2.2x | 3.3x |
| multi interp, Text | 1x | 22x | 1.6x | 2.9x | 2.9x | 3.0x |
| large Text | 1x | 30,000x | 1x | 80x | 80x | 102x |
| small ByteString | 1x | 15x | N/A | 0.35x | 0.35x | N/A |
| multi interp, ByteString | 1x | 10x | N/A | 0.5x | 0.5x | N/A |
| large ByteString | 1x | 100,000x | N/A | 1.6x | 1.6x | N/A |
(We don't bother running tests on large Strings, because no one is working
with data that large using String anyways.)
In particular, notice that Interpolation and interpolatedstring-perl6
blow up on large Text; string-interpolate and formatting have
consistent performance across all benchmarks, with string-interpolate leading
the pack in Text cases.
All results were tested on my local machine. If you'd like to attempt to replicate
the results, the benchmarks are in bench/ and can be run with a simple
stack bench.
(NB: If you're attempting to reproduce these benchmarks, note that the
benchmarks for Interpolation and interpolatedstring-perl6 are commented
out by default, due to those packages not being in latest Stackage.)