# readability

Give `readability` an HTML document and it will detect and extract the text of the article while removing everything unnecessary, such as menus, advertisements or sidebars.

It is more or less a reimplementation of [python-readability](https://github.com/buriy/python-readability).

The package contains both a library and a simple executable.

## Example of using the `readability` executable

Given an article that looks like the following image:

![Original HTML](https://hg.sr.ht/~geyaeb/haskell-readability/raw/doc/orig.png "Original HTML")

we can extract the text by calling:

```shell
$> readability https://mises.org/wire/why-central-banks-are-threat-our-savings
```

and we get the following HTML:

![Extracted text](https://hg.sr.ht/~geyaeb/haskell-readability/raw/doc/readable.png "Extracted text")

If we are interested in plain text, we can further use `pandoc`:

```text
$> readability https://mises.org/wire/why-central-banks-are-threat-our-savings | pandoc -f html -t plain
The US personal savings rate jumped to 33 percent in April from 12.7
percent in March and 8 percent in April last year. An increase in
savings is regarded by popular economics as less expenditure on
consumption. Since consumption expenditure is considered as the main
driving force of the economy, obviously a rebound in savings, which
implies less consumption, cannot be good for economic activity, so it is
held.

Saving and wealth—what is the relation?

To maintain their life and well-being, individuals require access to
consumer goods. An increase in various consumer goods permits an
increase in individuals’ living standards. What allows an increase in
the production of consumer goods is the maintenance and the enhancement
of the infrastructure of an economy. With better infrastructure, a
greater quantity and better quality of consumer goods could be generated
and more real wealth can be produced.

The enhancement and the maintenance of the infrastructure becomes
possible because of the availability of final consumer goods that
sustain the various individuals who are busy expanding and maintaining
the infrastructure. It is the producers of final consumer goods who pay
the various individuals engaged in maintenaning and enhancing the
infrastructure. The producers of final consumer goods pay these
individuals (i.e., the intermediary producers) out of the saved or
unconsumed production of final consumer goods.

Note that when a producer of final consumer goods decides to save more,
i.e., to consume less, the fall in his consumption is offset by the
increase in the consumption of individuals who are engaged in the
intermediary stages of production. This means that overall consumption
is not declining because of an increase in saving—as popular thinking
has it.
```

Had we not processed the article through `readability`, we would have gotten:

```text
$> curl https://mises.org/wire/why-central-banks-are-threat-our-savings | pandoc -f html -t plain
Skip to main content

[Home]

Toggle navigation

- Blog
- Mises Wire
- Books
- Podcast
- Video
- Events
- Store
- Graduate Program
- Ver en Español

Stay Connected

GO

SUPPORT MISES JOIN OR RENEW TODAY

SUPPORT MISES JOIN OR RENEW TODAY

Mises Wire

GET NEWS AND ARTICLES IN YOUR INBOXPrint

A

A

Home | Wire | Why Central Banks Are a Threat to Our Savings

Why Central Banks Are a Threat to Our Savings

- [dollars]

0 Views

Tags Money and Banking

06/25/2020Frank Shostak

The US personal savings rate jumped to 33 percent in April from 12.7
percent in March and 8 percent in April last year. An increase in
savings is regarded by popular economics as less expenditure on
consumption. Since consumption expenditure is considered as the main
driving force of the economy, obviously a rebound in savings, which
implies less consumption, cannot be good for economic activity, so it is
held.

Saving and wealth—what is the relation?
```

We can also print only the title or the short title of the article:

```shell
$> readability --extract shortTitle https://mises.org/wire/why-central-banks-are-threat-our-savings
Why Central Banks Are a Threat to Our Savings
```

Or we can print all available information as JSON for further processing:

```shell
$> readability --extract all https://mises.org/wire/why-central-banks-are-threat-our-savings | jq '.'
{
  "article": "…",
  "shortTitle": "Why Central Banks Are a Threat to Our Savings",
  "title": "Why Central Banks Are a Threat to Our Savings | Mises Wire"
}
```

Raw HTML can also be provided on standard input when SOURCE is omitted:

```shell
$> curl -s https://mises.org/wire/why-central-banks-are-threat-our-savings | readability -e title
Why Central Banks Are a Threat to Our Savings | Mises Wire
```

## Contribute

The project is hosted at <https://sr.ht/~geyaeb/haskell-readability/>. The homepage provides links to the [Mercurial repository](https://hg.sr.ht/~geyaeb/haskell-readability), the [mailing list](https://lists.sr.ht/~geyaeb/haskell-readability) and the [ticket tracker](https://todo.sr.ht/~geyaeb/haskell-readability).

Patches, suggestions, questions and general discussions can be sent to the [mailing list](https://lists.sr.ht/~geyaeb/haskell-readability). Detailed information about sending patches by email can be found at [https://man.sr.ht/hg.sr.ht/email.md](https://man.sr.ht/hg.sr.ht/email.md).