fast-tagsoup: Fast parsing and extracting information from (possibly malformed) HTML/XML documents
Fast TagSoup parser. Speeds of 20-200MB/sec were observed.
Works only with strict bytestrings.
This library is intended to be used in conjunction with the original tagsoup package:
    import Text.HTML.TagSoup hiding (parseTags, renderTags)
    import Text.HTML.TagSoup.Fast
Besides speed, fast-tagsoup correctly handles HTML <script> and <style> tags, converts tag names to lower case, and can decode non-UTF-8 XML for you.
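The import pattern above can be sketched in a small program. This is a minimal illustration, not the author's own example: it assumes `parseTags` from `Text.HTML.TagSoup.Fast` takes a strict `ByteString` and returns `[Tag ByteString]`, and the helper `extractLinks` and the sample input are hypothetical names introduced here.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import qualified Data.ByteString.Char8 as B
-- Tag type and helpers come from the original tagsoup package;
-- its parser and renderer are hidden so the fast ones can be used instead.
import Text.HTML.TagSoup hiding (parseTags, renderTags)
import Text.HTML.TagSoup.Fast (parseTags)

-- | Hypothetical helper: collect the href attribute of every <a> tag.
-- The pattern matches the lower-case name "a" because, per the
-- description above, fast-tagsoup converts tags to lower case.
extractLinks :: B.ByteString -> [B.ByteString]
extractLinks html =
    [ href
    | TagOpen "a" attrs <- parseTags html
    , Just href <- [lookup "href" attrs]
    ]

main :: IO ()
main = do
    -- Sample input (an assumption for illustration): one upper-case tag,
    -- one lower-case tag, and an unclosed paragraph.
    let html = "<p><A href=\"https://example.com\">one</A> <a href=\"/about\">two</a>"
    mapM_ B.putStrLn (extractLinks html)
```

Because the parser works only with strict bytestrings, input read lazily would need to be converted first (for example with `Data.ByteString.Lazy.toStrict`).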
This parser is used in production in the BazQux Reader feed and comment crawler.
Downloads
- fast-tagsoup-1.0.14.tar.gz (Cabal source package)
- Package description (as included in the package)
| Property | Value |
|---|---|
| Versions | 1.0.0, 1.0.1, 1.0.2, 1.0.3, 1.0.4, 1.0.5, 1.0.6, 1.0.7, 1.0.8, 1.0.9, 1.0.10, 1.0.11, 1.0.12, 1.0.13, 1.0.14 |
| Dependencies | base (>=4 && <5), bytestring, containers, tagsoup (>=0.13.10), text, text-icu |
| License | BSD-3-Clause |
| Copyright | Vladimir Shabanov 2011-2017 |
| Author | Vladimir Shabanov <vshabanoff@gmail.com> |
| Maintainer | Vladimir Shabanov <vshabanoff@gmail.com> |
| Category | XML |
| Home page | https://github.com/vshabanov/fast-tagsoup |
| Source repo | head: git clone https://github.com/vshabanov/fast-tagsoup |
| Uploaded | by VladimirShabanov at 2017-07-04T17:36:00Z |
| Distributions | NixOS:1.0.14 |
| Reverse Dependencies | 3 direct, 0 indirect |
| Downloads | 11916 total (48 in the last 30 days) |
| Status | Docs available; last successful build reported on 2017-07-04 |