A high-performance Protocol Buffers implementation for Haskell.
.
Features:
.
* .proto IDL parser (proto2 + proto3) and pure-text Haskell code generator
* Annotation-driven Template Haskell deriver (Proto.TH.Derive) that emits
wire codecs directly from a hand-written Haskell record
* loadProto Template Haskell splice that pairs the IDL parser with the
deriver for in-place type and instance generation
* Allocation-disciplined hot path: unboxed-sum decoder, two-pass sized
encoder, pre-computed tags, packed repeated fields, lazy submessage
decoding
* protoc plugin (protoc-gen-wireform), Cabal Setup.hs hook
(Proto.Setup), and an inline [proto|...|] quasi-quoter
* Proto3 canonical JSON mapping, well-known types
(Timestamp, Duration, Any, FieldMask, Struct, Wrappers, ...),
.pbtxt text format, proto2 typed extensions, dynamic / untyped
messages, and a runtime TypeRegistry
* gRPC service-method codegen (the wire framing lives in
wireform-grpc)
* Conformance suite driver that runs the upstream
conformance_test_runner end-to-end
.
See the umbrella package wireform for the multi-format facade and the
wireform-gen codegen CLI.
.
Performance tip: compiling with ghc-options: -fllvm alongside
-O2 can yield throughput gains of up to roughly 27% on the encode/decode hot
paths, with the largest gains on repeated fields. LLVM produces better instruction scheduling and vectorisation for
the unboxed-sum decoder and the sized-builder arithmetic. The default
native code generator works correctly; LLVM is strictly optional.
Optional build flag, disabled by default: builds the
wireform-proto-python-interop executable, which round-trips
messages with the official Python google-protobuf library.
Requires python3 and pip install protobuf in the Python environment.
[!CAUTION]
wireform is in heavy development and has not been published to Hackage yet. APIs may change.
A fully conformant, extremely high-performance Protocol Buffer implementation
for Haskell. Supports proto2 and proto3 with its own IDL parser, so
no protoc binary is needed.
Encode and decode performance is roughly as fast as the official
C++ implementation.
The usual workflow is: point Template Haskell at a .proto file,
get a data type plus wire and JSON-related instances. The splice
runs wireform's own parser (no protoc).
For the schema above, you get something along these lines:
data Person = Person
  { personName :: !Text
  , personAge  :: {-# UNPACK #-} !Int32
  } deriving stock (Show, Eq, Generic)
Use normal record syntax and pattern matching on the generated type;
encodeMessage / decodeMessage are the straightforward binary
path.
let alice = Person { personName = "Alice", personAge = 30 }
let bytes = encodeMessage alice
case decodeMessage bytes of
  Right p  -> print (personName p)
  Left err -> print err
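To make the binary path less abstract, here is a minimal sketch of the bytes the protobuf wire format prescribes for such a message. It is not wireform's encoder: it uses only base, works on lists of bytes, and assumes a hypothetical schema with `string name = 1` and `int32 age = 2` (the field numbers are an assumption, as are all function names).

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8, Word64)

-- Base-128 varint: seven payload bits per byte, least-significant group
-- first; the high bit of each byte means "more bytes follow".
varint :: Word64 -> [Word8]
varint n
  | n < 0x80  = [fromIntegral n]
  | otherwise = (fromIntegral (n .&. 0x7F) .|. 0x80) : varint (n `shiftR` 7)

-- A field key is the field number shifted left by three bits, OR-ed with
-- the wire type (0 = varint, 2 = length-delimited).
fieldKey :: Word64 -> Word64 -> [Word8]
fieldKey fieldNumber wireType = varint ((fieldNumber `shiftL` 3) .|. wireType)

-- Assumed schema: string name = 1; int32 age = 2.
-- ASCII-only names keep the sketch free of a UTF-8 encoder, and the age is
-- taken as a non-negative number.
encodePersonSketch :: String -> Word64 -> [Word8]
encodePersonSketch name age =
  fieldKey 1 2
    ++ varint (fromIntegral (length name))
    ++ map (fromIntegral . ord) name
    ++ fieldKey 2 0
    ++ varint age

main :: IO ()
main = print (encodePersonSketch "Alice" 30)
```

Running it prints `[10,5,65,108,105,99,101,16,30]`: key 10 (field 1, length-delimited), length 5, the five bytes of "Alice", key 16 (field 2, varint), then 30. Unlike a real proto3 encoder, the sketch always emits both fields, even when they hold default values.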
The other entry points are the inline Proto.TH.QQ quasi-quoter,
Haskell-first Proto.TH.Derive, on-disk output via Proto.Setup, the
protoc plugin, and Proto.CodeGen used directly.
Ways to use it
There are six entry points into the same codegen machinery, depending on
your development style. All produce identical wire-format instances; they
differ only in where and when code generation happens.
loadProto: TH splice from a .proto file
Simplest path. Point it at a file, get types and instances.
{-# LANGUAGE TemplateHaskell #-}
import Proto.TH (loadProto)
$(loadProto "proto/messages.proto")
Messages and enums land in scope. No build system setup, no
generated files to commit. wireform's own parser handles the
.proto; protoc is not involved.
SearchRequest is now a regular Haskell type with
encode/decode/JSON instances.
Proto.TH.Derive: annotation-driven, no .proto file
Define your Haskell types first, derive the wire format from
annotations:
{-# LANGUAGE TemplateHaskell #-}
import Proto.TH.Derive (deriveProto, tag)
data Measurement = Measurement
  { sensorId    :: !Text
  , temperature :: !Double
  , timestamp   :: {-# UNPACK #-} !Int64
  } deriving stock (Show, Eq, Generic)
{-# ANN type Measurement ("Measurement" :: String) #-}
{-# ANN sensorId (tag 1) #-}
{-# ANN temperature (tag 2) #-}
{-# ANN timestamp (tag 3) #-}
deriveProto ''Measurement
Useful when the Haskell types are the source of truth and protobuf
is just the serialisation format. You get MessageEncode / MessageDecode instances like every other path.
Proto.Setup: Cabal pre-build hook
For projects that prefer generated .hs files on disk (reviewable,
committable, visible to HLS without a TH rebuild):
-- in your .cabal file
build-type: Custom

custom-setup
  setup-depends: base, wireform-proto, Cabal, directory, filepath, text

library
  hs-source-dirs: src, gen
Incremental: only regenerates when a .proto file is newer than
its .hs output.
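The incremental rule can be sketched as a pure staleness check. This is an illustration of the rule as stated, not the code in Proto.Setup; the names `isStale` and `staleSources` are hypothetical, and timestamps are any ordered type (a real hook would read modification times from the file system).

```haskell
-- Regenerate when the output is missing or older than its source.
isStale :: Ord t => t -> Maybe t -> Bool
isStale _ Nothing = True
isStale srcTime (Just outTime) = srcTime > outTime

-- Select the sources that need regeneration from
-- (path, source time, output time if the output exists) triples.
staleSources :: Ord t => [(FilePath, t, Maybe t)] -> [FilePath]
staleSources entries = [path | (path, src, out) <- entries, isStale src out]

main :: IO ()
main =
  print
    ( staleSources
        [ ("a.proto", 10 :: Int, Just 12) -- output newer: skipped
        , ("b.proto", 15, Just 12)        -- source newer: regenerated
        , ("c.proto", 5, Nothing)         -- no output yet: regenerated
        ]
    )
```

This prints `["b.proto","c.proto"]`. A source whose timestamp equals its output's is treated as up to date, matching the "newer than" wording.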
protoc-gen-wireform: protoc plugin
If your build system already runs protoc (Bazel, Nix, Make,
polyglot monorepo):
Reads CodeGeneratorRequest from stdin, writes Haskell source via
the same codegen machinery.
Proto.CodeGen: pure-text code generator
Lowest-level entry point. generateModuleText takes a parsed
ProtoFile AST and returns the Haskell module source as Text.
No TH, no IO, just a pure function:
import Proto.Parser (parseProtoFile)
import Proto.CodeGen (generateModuleText, defaultGenerateOpts)
import qualified Data.Text.IO as TIO
main :: IO ()
main = do
  src <- TIO.readFile "message.proto"
  case parseProtoFile "message.proto" src of
    Left err -> print err
    Right pf -> do
      let code = generateModuleText
            defaultGenerateOpts { genModulePrefix = "MyApp.Proto" }
            mempty "message.proto" pf
      TIO.writeFile "gen/MyApp/Proto/Message.hs" code
This backs Proto.Setup, protoc-gen-wireform, and loadProto.
Useful for custom CLI tools, non-Cabal build systems, or generation
as part of a larger pipeline.
Which one should I use?
| Method | When to use it |
| --- | --- |
| loadProto | Most projects. Simple, no build setup. |
| Proto.TH.QQ | Quick prototyping, one-off messages, tests. |
| Proto.TH.Derive | Haskell types are the source of truth. |
| Proto.Setup | You want generated .hs files on disk. |
| protoc-gen-wireform | Your build system already runs protoc. |
| Proto.CodeGen | Custom tooling, full pipeline control. |
All six produce identical wire-format instances.
Custom field representations
String, bytes, repeated, and map fields can be overridden to use
different Haskell types. Overrides apply per-field, per-message,
or globally.
BlobMsg gets a lazy ByteString data field (large payloads you
might not fully consume). IdMsg gets a ShortByteString
identifier (unpinned, GC-friendly for small IDs). ConfigEntry
gets [Text] instead of Vector Text (small collections where
list overhead doesn't matter).
.proto field options
Overrides can also be specified directly in the schema, so the intent
is visible to anyone reading the .proto:
Each adapter bundles TH splices for encoding, decoding, sizing, and
empty/null checks. You can define custom adapters for newtypes,
unboxed vectors, or other containers.
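As a rough mental model, an adapter can be pictured as a bundle of four operations for one field representation. The sketch below uses plain functions rather than Template Haskell splices, and every name in it (`Adapter`, `adEncode`, `UserId`, and so on) is hypothetical; it does not show wireform's actual adapter type.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Hypothetical adapter: the four operations listed above, as plain functions.
data Adapter a = Adapter
  { adEncode  :: a -> [Word8]       -- payload bytes for the field
  , adDecode  :: [Word8] -> Maybe a -- Nothing on malformed input
  , adSize    :: a -> Int           -- payload size without encoding
  , adIsEmpty :: a -> Bool          -- lets the encoder skip default values
  }

-- A newtype standing in for a custom field representation.
newtype UserId = UserId String
  deriving (Show, Eq)

-- ASCII-only adapter, to keep the sketch free of a UTF-8 codec.
userIdAdapter :: Adapter UserId
userIdAdapter = Adapter
  { adEncode  = \(UserId s) -> map (fromIntegral . ord) s
  , adDecode  = \bs ->
      if all (< 0x80) bs
        then Just (UserId (map (chr . fromIntegral) bs))
        else Nothing
  , adSize    = \(UserId s) -> length s
  , adIsEmpty = \(UserId s) -> null s
  }

main :: IO ()
main = do
  let uid = UserId "u-42"
  print (adSize userIdAdapter uid)
  print (adDecode userIdAdapter (adEncode userIdAdapter uid))
```

The program prints `4` and then `Just (UserId "u-42")`. Keeping sizing separate from encoding is what lets an encoder know a field's length before writing any of its bytes.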
See examples/CustomReprExample.hs
for a working example covering all adapter types, including
map<K, bytes> value overrides.
Multi-format
Because wireform-proto generates plain records, the same type
participates in the broader wireform annotation system. A single
{-# ANN ... #-} pragma on a record can drive instance generation
for protobuf, CBOR, MessagePack, and JSON simultaneously. Details
in wireform-derive.
Performance
Numbers from cabal bench compare-bench, encoding and decoding
identical messages through wireform-proto and proto-lens. Four
message shapes: a 3-field scalar, an 8-field mixed, a nested
submessage, and a repeated message with 50 packed ints, 20 strings,
and 10 nested items.
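Encode

| Message | wireform | wireform (LLVM) | proto-lens | speedup (LLVM vs proto-lens) |
| --- | --- | --- | --- | --- |
| Small | 26 ns | 23 ns | 145 ns | 6.3x |
| Medium | 54 ns | 52 ns | 280 ns | 5.4x |
| Nested | 45 ns | 42 ns | 320 ns | 7.6x |
| Repeated | 657 ns | 500 ns | 2,646 ns | 5.3x |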
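Decode

| Message | wireform | wireform (LLVM) | proto-lens | speedup (LLVM vs proto-lens) |
| --- | --- | --- | --- | --- |
| Small | 21 ns | 20 ns | 77 ns | 3.9x |
| Medium | 57 ns | 61 ns | 201 ns | 3.3x |
| Nested | 49 ns | 50 ns | 144 ns | 2.9x |
| Repeated | 694 ns | 623 ns | 2,067 ns | 3.3x |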
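Roundtrip

| Message | wireform | wireform (LLVM) | proto-lens | speedup (LLVM vs proto-lens) |
| --- | --- | --- | --- | --- |
| Small | 76 ns | 75 ns | 218 ns | 2.9x |
| Medium | 201 ns | 191 ns | 472 ns | 2.5x |
| Nested | 156 ns | 140 ns | 450 ns | 3.2x |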
Criterion, GHC 9.8.4, -O2, Apple Silicon (M-series). Schema and
runner in compare-bench/. Run with
cabal bench compare-bench. LLVM column uses -fllvm on wireform
packages; proto-lens stays NCG. LLVM helps most on repeated fields
(up to 27%).
| Operation | wireform-proto | proto-lens | ratio |
| --- | --- | --- | --- |
| Small | 30.9 ns | 153 ns | 4.94x |
| Medium | 67.5 ns | 297 ns | 4.40x |
| Nested | 54.6 ns | 335 ns | 6.13x |
| Repeated | 786 ns | 2728 ns | 3.47x |
Last run 2026-05-15 00:00:00 UTC. ghc-9.8.4 on darwin-aarch64, criterion 1.6.5.
| Operation | wireform-proto | proto-lens | ratio |
| --- | --- | --- | --- |
| Small | 45.9 ns | 81.9 ns | 1.78x |
| Medium | 111 ns | 204 ns | 1.85x |
| Nested | 76.7 ns | 148 ns | 1.93x |
| Repeated | 1070 ns | 2166 ns | 2.02x |
Last run 2026-05-15 00:00:00 UTC. ghc-9.8.4 on darwin-aarch64, criterion 1.6.5.
Encode and decode cost about the same. A 3-field message encodes
in ~23 ns and decodes in ~20 ns with LLVM. A 50-element
packed-repeated field with nested submessages round-trips in about
1 us. Builder output can be streamed directly to a Handle without
materialising a ByteString.
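The feature list credits part of this to a two-pass sized encoder and packed repeated fields. The general idea can be sketched with base alone: a first pass computes the payload size without producing bytes, so the second pass can write the length prefix up front instead of buffering the payload. This is a list-based illustration under that assumption, not wireform's Builder code; all names are hypothetical.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Word (Word8, Word64)

-- Base-128 varint, least-significant group first.
varint :: Word64 -> [Word8]
varint n
  | n < 0x80  = [fromIntegral n]
  | otherwise = (fromIntegral (n .&. 0x7F) .|. 0x80) : varint (n `shiftR` 7)

-- Pass 1: how many bytes a varint will occupy, without producing them.
varintSize :: Word64 -> Int
varintSize n
  | n < 0x80  = 1
  | otherwise = 1 + varintSize (n `shiftR` 7)

-- Payload size of a packed repeated varint field.
packedSize :: [Word64] -> Int
packedSize = sum . map varintSize

-- Pass 2: the size is already known, so the length prefix goes out first.
encodePacked :: Word64 -> [Word64] -> [Word8]
encodePacked fieldNumber xs =
  varint (fieldNumber * 8 + 2)               -- key: wire type 2 (length-delimited)
    ++ varint (fromIntegral (packedSize xs)) -- length from the sizing pass
    ++ concatMap varint xs                   -- elements back to back, no per-element key

main :: IO ()
main = print (encodePacked 6 [3, 270, 86942])
```

This prints `[50,6,3,142,2,158,167,5]`: one key and one length for the whole field, followed by three varints of one, two, and three bytes.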
Enabling LLVM
Add -fllvm to both this package and wireform-core in your
cabal.project.local (or the consuming package's ghc-options):
LLVM must be installed and on $PATH (llc, opt). The improvement is largest on repeated
fields (~27%) and nested message encode (~12%) where LLVM's loop
optimiser and instruction scheduler outperform the native code
generator.
Also included
Proto3 canonical JSON
Generated types get ToJSON / FromJSON instances that follow the
proto3 JSON mapping.
json_name overrides, base64-encoded bytes, string-encoded 64-bit
integers, and NaN/Infinity sentinels are handled automatically.
import Data.Aeson (encode, eitherDecode)
let json = encode alice            -- proto3 JSON
case eitherDecode json of
  Right p  -> print (p :: Person)
  Left err -> putStrLn err
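Two of the mapping rules mentioned above, string-encoded 64-bit integers and the NaN/Infinity sentinels, are easy to show in isolation. The sketch below renders single scalar values as JSON text using only base; it illustrates the proto3 JSON rules, not the generated ToJSON instances, and the function names are hypothetical.

```haskell
import Data.Int (Int64)

-- 64-bit integers are written as JSON strings, because many JSON parsers
-- read numbers as IEEE doubles and lose precision above 2^53.
int64Json :: Int64 -> String
int64Json n = show (show n)

-- Non-finite doubles have no JSON number form, so they become string sentinels.
doubleJson :: Double -> String
doubleJson d
  | isNaN d      = "\"NaN\""
  | isInfinite d = if d > 0 then "\"Infinity\"" else "\"-Infinity\""
  | otherwise    = show d

main :: IO ()
main =
  mapM_
    putStrLn
    [ int64Json 9007199254740993 -- 2^53 + 1, not representable as a double
    , doubleJson (0 / 0)
    , doubleJson (-1 / 0)
    , doubleJson 1.5
    ]
```

The four lines printed are `"9007199254740993"`, `"NaN"`, `"-Infinity"`, and `1.5`; the first three include the quotes, the last is a bare number.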
Well-known types
Timestamp, Duration, Any, FieldMask, Struct, Value,
ListValue, NullValue, all Wrappers, Empty, and
SourceContext ship with supplementary utilities:
import Proto.Google.Protobuf.Timestamp.Util (fromUTCTime, toUTCTime)
import Proto.Google.Protobuf.Duration.Util (fromNominalDiffTime)
import Proto.Google.Protobuf.Any.Util (packAny, unpackAny)
import Proto.Google.Protobuf.FieldMask.Util (intersect, merge)
let ts = fromUTCTime now -- UTCTime -> Timestamp
let dur = fromNominalDiffTime 3.5 -- NominalDiffTime -> Duration
let any_ = packAny registry alice -- pack into Any
case unpackAny registry any_ of
  Just (p :: Person) -> print p
  Nothing            -> putStrLn "unknown type"
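For intuition about what a UTCTime-to-Timestamp conversion has to do: a protobuf Timestamp stores whole seconds since the Unix epoch plus a nanosecond remainder that is never negative, even for times before 1970. The sketch below shows that split on a plain Rational number of seconds. It is not the code behind fromUTCTime, and both function names are hypothetical.

```haskell
import Data.Int (Int32, Int64)

-- Split fractional epoch seconds into (seconds, nanos) with nanos in
-- [0, 999999999]. Flooring the seconds keeps the remainder non-negative.
toSecondsNanos :: Rational -> (Int64, Int32)
toSecondsNanos t = (secs, nanos)
  where
    secs  = floor t
    nanos = floor ((t - fromIntegral secs) * 1000000000) -- drops sub-nanosecond digits

-- Inverse direction, exact for any valid pair.
fromSecondsNanos :: (Int64, Int32) -> Rational
fromSecondsNanos (s, n) = fromIntegral s + fromIntegral n / 1000000000

main :: IO ()
main = do
  print (toSecondsNanos 1.5)
  print (toSecondsNanos (-0.25))
  print (fromSecondsNanos (toSecondsNanos (-0.25)) == -0.25)
```

This prints `(1,500000000)`, `(-1,750000000)`, and `True`. The second result shows the rule for negative times: a quarter second before the epoch is second -1 plus 750 million nanoseconds counting forward.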
Streaming and incremental decoders
For length-delimited message streams (gRPC, Kafka, log files):
import Proto.Decode.Stream (decodeStream)
import Proto.Decode.Streaming (streamDecode, StreamStep(..))
-- Strict: decode all messages from a ByteString
let msgs = decodeStream @LogEntry bytes
-- Incremental: decode one message at a time
case streamDecode @LogEntry of
  StreamNeedMore feed -> feed chunk >>= \case
    StreamYield entry k -> process entry >> continue k
    StreamDone          -> pure ()
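The framing step underneath such decoders can be sketched independently of the library. The version below assumes each message is preceded by a varint length prefix (one common convention; other transports frame differently) and splits a buffer into complete frames plus an unconsumed tail, which is the part an incremental decoder would hold until more input arrives. It uses only base and hypothetical names.

```haskell
import Data.Bits (shiftL, testBit, (.&.), (.|.))
import Data.Word (Word8)

-- Read one varint length; Nothing means the input ended mid-number
-- or the prefix is wider than five bytes.
readLength :: [Word8] -> Maybe (Int, [Word8])
readLength = go 0 0
  where
    go :: Int -> Int -> [Word8] -> Maybe (Int, [Word8])
    go _ _ [] = Nothing
    go shift acc (b : rest)
      | shift > 28  = Nothing
      | testBit b 7 = go (shift + 7) acc' rest
      | otherwise   = Just (acc', rest)
      where
        acc' = acc .|. (fromIntegral (b .&. 0x7F) `shiftL` shift)

-- Split a buffer into complete frames and the leftover bytes.
splitFrames :: [Word8] -> ([[Word8]], [Word8])
splitFrames input = case readLength input of
  Just (len, rest)
    | length rest >= len ->
        let (frame, rest') = splitAt len rest
            (frames, leftover) = splitFrames rest'
         in (frame : frames, leftover)
  _ -> ([], input)

main :: IO ()
main = print (splitFrames [2, 8, 1, 3, 10, 1])
```

This prints `([[8,1]],[3,10,1])`: the first frame is complete, while the second announces three bytes but only two have arrived, so its prefix and partial payload stay in the leftover.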
Proto2 extensions and dynamic messages
import Proto.Extension (getExtension, setExtension)
-- Typed extensions (proto2)
let deadline = getExtension deadlineField request
-- Dynamic messages (schema not known at compile time)
import Proto.Dynamic (decodeDynamic, encodeDynamic)
let dyn = decodeDynamic registry "my.package.Person" bytes
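Decoding without compile-time types is possible because every field on the wire carries its own number and wire type. The sketch below walks a buffer and returns untyped fields; it illustrates the wire format only, does not consult a registry or schema, and is not how Proto.Dynamic is implemented. All names are hypothetical, and deprecated group wire types are rejected.

```haskell
import Data.Bits (shiftL, shiftR, testBit, (.&.), (.|.))
import Data.Word (Word8, Word64)

-- One untyped value per wire type.
data RawValue
  = RawVarint Word64
  | RawFixed64 [Word8]
  | RawBytes [Word8]
  | RawFixed32 [Word8]
  deriving (Show, Eq)

readVarint :: [Word8] -> Maybe (Word64, [Word8])
readVarint = go 0 0
  where
    go :: Int -> Word64 -> [Word8] -> Maybe (Word64, [Word8])
    go _ _ [] = Nothing
    go shift acc (b : rest)
      | shift > 63  = Nothing
      | testBit b 7 = go (shift + 7) acc' rest
      | otherwise   = Just (acc', rest)
      where
        acc' = acc .|. (fromIntegral (b .&. 0x7F) `shiftL` shift)

-- Take exactly n bytes, or fail if the buffer is too short.
takeExactly :: Int -> [Word8] -> Maybe ([Word8], [Word8])
takeExactly n bs
  | n >= 0 && length bs >= n = Just (splitAt n bs)
  | otherwise                = Nothing

-- Decode a whole buffer into (field number, raw value) pairs.
decodeRaw :: [Word8] -> Maybe [(Word64, RawValue)]
decodeRaw [] = Just []
decodeRaw bs = do
  (key, afterKey) <- readVarint bs
  (value, rest) <- case key .&. 7 of
    0 -> do
      (v, r) <- readVarint afterKey
      Just (RawVarint v, r)
    1 -> do
      (v, r) <- takeExactly 8 afterKey
      Just (RawFixed64 v, r)
    2 -> do
      (len, r) <- readVarint afterKey
      (v, r') <- takeExactly (fromIntegral len) r
      Just (RawBytes v, r')
    5 -> do
      (v, r) <- takeExactly 4 afterKey
      Just (RawFixed32 v, r)
    _ -> Nothing
  fields <- decodeRaw rest
  Just ((key `shiftR` 3, value) : fields)

main :: IO ()
main = print (decodeRaw [10, 5, 65, 108, 105, 99, 101, 16, 30])
```

For a buffer holding a five-byte string in field 1 and the varint 30 in field 2, this prints `Just [(1,RawBytes [65,108,105,99,101]),(2,RawVarint 30)]`. Turning raw values into typed ones (is field 1 a string or a nested message?) is the part that needs a schema.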
Lens access
Proto.Lens provides optional van Laarhoven lenses for generated
message fields. It depends on neither lens nor microlens; the lenses
are plain functions in the van Laarhoven encoding:
import Proto.Lens (field)
view (field @"name") person -- get
set (field @"name") "Bob" person -- set
gRPC codegen
loadProto (and the other code-generation entry points) generates typed
service and method descriptor values alongside the message types. The
ServiceDef / MethodDef types and all wire framing live in
wireform-grpc; wireform-proto generates the
metadata that plugs into those types.
-- Generated by loadProto from a service definition:
-- grpcGreeterService :: Network.GRPC.Common.ServiceDef
-- grpcSayHelloMethod :: Network.GRPC.Common.MethodDef SayHelloRequest SayHelloResponse
Conformance
2675 / 2675 tests pass against the official upstream protobuf
conformance suite (protocolbuffers/protobuf@v28.2),
covering proto3 and proto2 binary and JSON.
Comparison to proto-lens
proto-lens has been around since 2016 and covers the
full proto2/proto3 surface.
| | wireform-proto | proto-lens |
| --- | --- | --- |
| Record style | Plain records, direct field access | Opaque constructors, lens-only access |
| Construction | Record syntax; missing fields are compile errors | defMessage & field .~ val; missing fields silent |
| Pattern matching | Yes | No (lens getters only) |
| Type inference | Concrete field types | Lens chains often need annotations |
| Schema evolution | New fields break call sites (good) | New fields get silent defaults |
| Encode speed | 5-8x faster | Baseline |
| Decode speed | 3-4x faster | Baseline |
| Field representation | Configurable per-field | Fixed |
Optics integration: wireform-proto generates plain records, so
OverloadedRecordDot and pattern matching work out of the box.
For lens-style access, Proto.Lens provides van Laarhoven lenses
via a field @"name" combinator, which are compatible with both lens and
microlens with no dependency on either:
import Proto.Lens (field)
view (field @"seconds") timestamp
set (field @"seconds") 42 timestamp
over (field @"seconds") (+1) timestamp
-- Compose into nested messages:
view (field @"inner" . field @"name") nested
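The reason no dependency is needed is that a van Laarhoven lens is only a function polymorphic in a Functor, so any library that shares the encoding can consume it. A minimal base-only sketch follows; the records and hand-written lenses are hypothetical stand-ins for generated messages, and the local `view`, `set`, and `over` mimic what lens or microlens would supply.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- The whole "interface": a function polymorphic in the Functor.
type Lens s a = forall f. Functor f => (a -> f a) -> s -> f s

-- Reading instantiates f to Const, which carries the focused value out.
view :: Lens s a -> s -> a
view l = getConst . l Const

-- Updating instantiates f to Identity.
over :: Lens s a -> (a -> a) -> s -> s
over l f = runIdentity . l (Identity . f)

set :: Lens s a -> a -> s -> s
set l = over l . const

-- Hypothetical records standing in for generated messages.
data Stamp = Stamp {seconds :: Integer, nanos :: Int}
  deriving (Show, Eq)

data Event = Event {label :: String, at :: Stamp}
  deriving (Show, Eq)

secondsL :: Lens Stamp Integer
secondsL f s = (\x -> s {seconds = x}) <$> f (seconds s)

atL :: Lens Event Stamp
atL f e = (\x -> e {at = x}) <$> f (at e)

main :: IO ()
main = do
  let ev = Event "boot" (Stamp 41 0)
  print (view (atL . secondsL) ev)        -- lenses compose with ordinary (.)
  print (over (atL . secondsL) (+ 1) ev)
```

This prints `41` and then `Event {label = "boot", at = Stamp {seconds = 42, nanos = 0}}`. Because `secondsL` and `atL` mention nothing beyond Functor, the same definitions would also work with the combinators exported by lens or microlens.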
License
BSD-3-Clause. See LICENSE for the full text and
third-party attributions.