http2-client
An native-Haskell HTTP2 client library based on http2
and tls
packages.
Hackage: https://hackage.haskell.org/package/http2-client .
General design
HTTP2 is a heavy protocol. HTTP2 features pipelining, query and responses
interleaving, server-pushes, pings, stateful compression and flow-control,
priorization etc. This library aims at exposing these features so that library
users can integrate http2-client
in a variety of applications. In short, we'd
like to expose as many HTTP2 features as possible. Hence, the http2-client
programming interface can feel low-level for users with expectations to get an
API as simple as in HTTP1.x.
Exposing most HTTP2 primitives as a drawback: the library allows a client to
behave abnormally with-respect to the HTTP2 spec. That said, we try to prevent
notoriously-difficult errors such as concurrency bugs by coercing users with
the programming API following Haskell's philosophy to factor-out errors at
compile time. For instance, a client can send DATA frames on a stream after
closing it with a RST (easy to spot). However, a multi-threaded client will not
be able to interleave DATA frames with HEADERS and their CONTINUATIONs and the
locking required to achieve this invariant is hidden (hard to implement).
Following this philosophy, we prefer to offer a somewhat low-level API in
Network.HTTP2.Client
and higher-level APIs (with a different performance
trade-off) in Network.HTTP2.Client.Helpers
. For instance,
Network.HTTP2.Client.Helpers.waitStream
will consume a whole stream in memory
before returning whereas Network.HTTP2.Client
users will have to take chunks
one at a time. We look forward to the linear arrows extension for improving the
library design.
Versioning and GHC support
We try to follow https://pvp.haskell.org/ as a.b.c.d
with the caveat that if
a=0
then we are still slightly unhappy with some APIs and we'll break things
arbitrarily.
We aim at supporting GHC-8.x, contributions to support GHC-7.x are welcome.
Installation
This package is a standard Stack
project, please also refer to Stack's
documentation if you have trouble installing or using this package. Please
also have a look at the Hackage Matrix CI:
https://matrix.hackage.haskell.org/package/http2-client .
Usage
First, make sure you are somewhat familiar with HTTP and HTTP2 standards by
reading RFCs or Wikipedia pages. If you use the library, feel free to shoot me
an e-mail (cf. commits) or a tweet @lucasdicioccio .
Help and examples
Please see some literate Haskell documents in the examples/
directory. For
a more involved usage, we currently provide a command-line example client:
http2-client-exe
which I use as a test client and you could use to test
various flow-control parameters. This binary lives in a separate package
at https://github.com/lucasdicioccio/http2-client-exe .
The Haddocks, at https://hackage.haskell.org/package/http2-client, should have
plenty implementation details, so please have a look. Otherwise, you can ask
help by creating an Issue on the bug-tracker.
Opening a stream
First, you open a (TLS-protected) connection to a server and configure the
initial SETTINGS to advertise. Then you can open and consume streams. Opening
streams takes a stream-definition and expresses two sequential parts. First,
sending the HTTP headers, which reserves an increasing stream-ID with the
server. Second, you consume a stream by sending DATA chunk or receiving DATA
chunks. One thing that can prevent concurrency is if you have too many opened
streams for the server. The http2-client
library tracks server's max
concurrency preference and will prevent you from opening too many streams.
Sending chunked data
Sent data must be chunked according to server's preferences. A function named
sendData
performs the chunking but this chunking could have some suboptimal
overhead if you want to repeatedly call sendData with a buffer size that is not
a multiple of the server's preferred chunk size.
Flow control
HTTP2 mandates a flow-control system that cannot be disabled. DATA chunks
consume credit from the flow-control system. The standard defines a
flow-control context per stream plus one global per-connection.
** Received DATA flow control ** In order to keep receiving data you need to
periodically transfer credit to the server. One transfers credit to server by
calling _updateWindow
, which transfers locally-accumulated credit (you
accumulate credit with addCredit
). The current implementation already follows
a "zero-sum" credit where received DATA is immediately consumed and
re-credited. That is, if you only keep calling _updateWindow
at some
frequency the stream will progress. You can also _addCredit
to permit
receiving more DATA on a stream/connection (e.g., if you want to implement
something like TCP slow-start).
** Sent DATA flow control ** A server following the HTTP2 specification
strictly will kick you for sending too much data. The http2-client
library
allows you to be more aggressive than the server allows and you have to care
for your streams. We provide an incoming flow-control context that will allow
you to call _withdrawCredit
to wait until some credit is available. At the
time of this writing, the sendData
function does not call _withdrawCredit
and we provide no equivalent. Note that the chunking and flow-control
mechanisms have interesting interactions in HTTP2 in a multi-threaded context.
Pay attention to always take credit in the per-stream flow-control context
before taking it from the global per-connection flow-control context.
Otherwise, you risk starving the global per-connection flow-control with no
guarantee that you'll be allowed to send a DATA frame.
Settings changes
The HTTP2 RFC acknowledges the inherent race conditions that may occur when
changing SETTINGS. The http2-client
library should be rather permissive and
accept rather than reject frames caught violating inconsistent settings once
client settings are made stricter. Conversely, the http2-client
library tries
to enforce server-SETTINGS strictly before ACKnowledging the setting changes.
This configuration can lead to problems if the server send more-permissive
SETTINGS (e.g., allowing a large default window size -> which recredits all
streams) but if the server applies this change locally only after receiving the
client ACK. One way to be double-sure the http2-client
library is always
strict would be to apply settings changes in two steps: settings that move in
the "stricter direction" (e.g., fewer concurrency, smaller initial window)
should be applied before ACK-ing the SETTING frame. Meanwhile settings that
move in the "looser direction" (e.g., more concurrency) should be applied
after ACK-ing the SETTINGS frame.
The current design apply SETTINGS:
- (client prefs) after receiving a ACK for sent SETTINGS, you get the choice to
wait for an ACK or wait in a thread, but you must wait for an ACK to apply
changed settings (the
_settings
function will return an IO to wait for the
ACK and apply settings). Note that the initial SETTINGS change frame is
waited for in a thread without library's user intervention (if you feel
strongly against this choice, please open a bug).
- (server prefs) immediately after receiving and hence before sending ACK-SETTINGS
Fortunately, changing settings mid-stream is probably a rare behavior and the
default SETTINGS are large enough to avoid creating fatal errors before
sending/receiving the initial SETTINGS frames.
Things that are hardcoded
A number of HTTP2 features are currently hardcoded:
- PINGs are replied-to immediately (i.e., a server could hog a connection with PINGs)
- the initial SETTINGS frame sent to the server is waited-for in a separate
thread, settings are applied to the connection when the server ACKs the frame
- flow-control from DATA frames is decremented immediately when received (in a
separate thread) rather than when consumed from the client
- similarly, flow-control re-increment every DATA received as soon as it is
received
Contributing
Contributions are welcome. As I start integrating external contribution I plan
to follow the following procedure:
- stop pushing directly into master
- develop any patch in a new branch, branched from master
- merge requests target master
Please pay attention to the following:
- avoid introducing external dependencies, especially if dependencies are not in stackage
- avoid reformatting-only merge requests
- please verify that you can
stack clean
and stack build --pedantic
General mindset to have during code-reviews:
- be kind
- be patient
- surpass egos and bring data if there is a disagreement
Bugtracker
Most of the following points have their own issues on the issue tracker at
GitHub: https://github.com/lucasdicioccio/http2-client/issues .
Things that will likely change the API
I think the fundamentals are right but the following needs tweaking:
- function to reset a stream will likely be blocking until a RST/EndStream is
received so that all DATA frames are accounted for in the flow-control system
- need a way to hook custom flow-control algorithms
Support of the HTTP2 standard
The current implementation follows the HTTP2 standard except for the following:
- does not handle
PRIORITY
- does not expose padding
- does not handle
SETTINGS_MAX_HEADER_LIST_SIZE
- it's unclear to me whether this limitation is applied per frame or in total
- the accounting is done before compression with 32 extra bytes per header
- does not implement most of the checks that should trigger protocol errors