pdf-toolbox-content-0.0.3.0: A collection of tools for processing PDF files

Safe Haskell: None

Pdf.Toolbox.Content.UnicodeCMap

Description

A Unicode CMap defines the mapping from glyph codes to text.

Documentation

data UnicodeCMap

Unicode character map

A font dictionary can contain a "ToUnicode" key, which is a reference to a stream holding a Unicode CMap.

parseUnicodeCMap :: ByteString -> Either String UnicodeCMap

Parse the content of a Unicode CMap.

unicodeCMapNextGlyph :: UnicodeCMap -> ByteString -> Maybe (Int, ByteString)

Take the next glyph code from the string. Also returns the rest of the string.
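
To illustrate the shape of the result, here is a minimal sketch of a function with the same return type. It is not the library's implementation: it assumes every glyph code has a fixed width in bytes and is stored most significant byte first, whereas a real CMap can declare codes of varying length. The name takeFixedWidthCode is hypothetical.

```haskell
import Data.Bits (shiftL, (.|.))
import Data.ByteString (ByteString)
import qualified Data.ByteString as BS

-- Hypothetical helper, not part of pdf-toolbox: read a glyph code that is
-- exactly `width` bytes long (big-endian) and return it with the remainder.
takeFixedWidthCode :: Int -> ByteString -> Maybe (Int, ByteString)
takeFixedWidthCode width str
  | width <= 0 || BS.length str < width = Nothing  -- not enough input left
  | otherwise =
      let (codeBytes, rest) = BS.splitAt width str
          -- Fold the bytes into one integer, most significant byte first.
          code = BS.foldl' (\acc b -> (acc `shiftL` 8) .|. fromIntegral b) 0 codeBytes
      in Just (code, rest)
```

Returning the unconsumed remainder alongside the code lets the caller feed it straight back into the next call.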

unicodeCMapDecodeGlyph :: UnicodeCMap -> Int -> Maybe Text

Convert a glyph code to text

Note: one glyph can represent more than one character, e.g. for ligatures.
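
The two functions above are naturally used together in a loop: take a glyph code, convert it, and continue with the rest of the string. The sketch below shows one way to write that loop. The helper decodeGlyphs is hypothetical and takes the two steps as arguments so that it compiles without the library. Substituting U+FFFD for a glyph with no mapping is an assumption of this sketch, not library behaviour.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString as BS
import Data.Maybe (fromMaybe)
import Data.Text (Text)
import qualified Data.Text as T

-- Hypothetical helper, not part of pdf-toolbox: decode a string glyph by
-- glyph, producing one Text chunk per glyph.
decodeGlyphs
  :: (ByteString -> Maybe (Int, ByteString))  -- e.g. unicodeCMapNextGlyph cmap
  -> (Int -> Maybe Text)                      -- e.g. unicodeCMapDecodeGlyph cmap
  -> ByteString
  -> [Text]
decodeGlyphs nextGlyph decodeGlyph = go
  where
    go str
      | BS.null str = []
      | otherwise = case nextGlyph str of
          Nothing -> []  -- no valid glyph code at this position: stop
          Just (code, rest) ->
            -- Assumption: an unmapped glyph becomes U+FFFD.
            fromMaybe (T.singleton '\xFFFD') (decodeGlyph code) : go rest
```

Given a parsed cmap, a call would look like decodeGlyphs (unicodeCMapNextGlyph cmap) (unicodeCMapDecodeGlyph cmap) str. Because a ligature glyph decodes to several characters, the result is kept as a list of chunks. Apply T.concat to get a single Text.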