pdf-toolbox-document-0.1.2: A collection of tools for processing PDF files.
Safe Haskell: None
Language: Haskell2010

Pdf.Document.Page

Description

PDF document page

Documentation

data Page

PDF document page

pageParentNode :: Page -> IO PageNode

Page's parent node

pageContents :: Page -> IO [Ref]

List of references to the page's content streams

pageMediaBox :: Page -> IO (Rectangle Double)

Media box, inheritable
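
In PDF, an inheritable attribute that is missing from a page is looked up in the page's ancestor nodes. The following is a minimal sketch of that lookup over a simplified model. `Rect`, `Node` and `inheritedMediaBox` are hypothetical names used for illustration only; they are not this package's types, and this is not necessarily how `pageMediaBox` is implemented.

```haskell
-- Hypothetical, simplified rectangle: llx, lly, urx, ury.
data Rect = Rect Double Double Double Double
  deriving (Eq, Show)

-- Hypothetical page tree node: it may define its own media box
-- and may have a parent node (the root has none).
data Node = Node
  { nodeMediaBox :: Maybe Rect
  , nodeParent   :: Maybe Node
  }

-- Walk from the node towards the root and return the first
-- media box that is defined along the way, if any.
inheritedMediaBox :: Node -> Maybe Rect
inheritedMediaBox node =
  case nodeMediaBox node of
    Just box -> Just box
    Nothing  -> nodeParent node >>= inheritedMediaBox
```

In this model a page without its own media box takes the box of its nearest ancestor that defines one, while a page with its own box overrides any inherited value.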

pageFontDicts :: Page -> IO [(Name, FontDict)]

Font dictionaries for the page

pageExtractText :: Page -> IO Text

Extract text from the page

It tries to add spaces between characters when they are not present as actual characters in the content stream.

glyphsToText :: [Span] -> Text

Convert glyphs to text, trying to add spaces and newlines

It takes a list of spans. Each span is a list of glyphs that are output in one shot, so spaces need to be added only between spans, never inside them.
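
The idea of separating spans but not the glyphs inside them can be sketched as follows. This is an illustrative model only: the `Glyph` record, the local `Span` synonym, the thresholds and the `String` result are all assumptions made for the example and do not reflect the package's actual types or heuristics.

```haskell
-- Hypothetical glyph: left edge, baseline, width, and decoded text.
data Glyph = Glyph
  { glyphX     :: Double
  , glyphY     :: Double
  , glyphWidth :: Double
  , glyphText  :: String
  }

-- Hypothetical span: glyphs emitted by one text-showing operation.
type Span = [Glyph]

-- Glyphs inside a span are concatenated without any separator.
spanText :: Span -> String
spanText = concatMap glyphText

-- Separator between two consecutive spans: a newline when the
-- baseline changes, a space when there is a visible horizontal
-- gap, nothing otherwise. Both thresholds are arbitrary values.
separator :: Span -> Span -> String
separator prev next =
  case (reverse prev, next) of
    (lastG : _, firstG : _)
      | abs (glyphY lastG - glyphY firstG) > lineTolerance -> "\n"
      | glyphX firstG - (glyphX lastG + glyphWidth lastG) > gapThreshold -> " "
    _ -> ""
  where
    lineTolerance = 1.0
    gapThreshold  = 1.0

-- Join all spans, inserting separators only between spans.
spansToString :: [Span] -> String
spansToString [] = ""
spansToString spans@(s : rest) =
  spanText s ++ concat (zipWith (\p n -> separator p n ++ spanText n) spans rest)
```

In this sketch, two spans on the same baseline with a gap wider than the threshold are joined with a space, and a span on a different baseline starts a new line.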