Safe Haskell | None |
---|---|
Language | Haskell2010 |
Intro
This module allows you to run a loop which visits web feeds (RSS, Atom, etc.) periodically and reports new items using a function you provide. You can also insert control commands into the loop, such as adding and removing a feed, so that stopping and restarting it isn't required.
Both http and https URLs are supported.
The original use case which motivated the creation of this library is FunBot.
Running
The run
function runs the loop, and it takes a collector callback function
(as one of its parameters) to be called when new feed items are found.
Here is a simple usage example.
import Data.Time.Units import Web.Feed.Collect collect :: Label -> URL -> Feed -> Item -> IO () collect label url feed item = do putStrLn $ label ++ " : " ++ url putStrLn "Got a new feed item!" putStrLn $ showFeed feed putStrLn $ showItem item logError :: Error -> IO () logError = print feeds :: [(Label, Url)] feeds = [("democ-now", "http://www.democracynow.org/democracynow.rss")] main :: IO () main = run collect Nothing logError Nothing (1 :: Minute) 3 feeds
For quick testing, you can use one of the predefined collector functions.
For example, there is collectorNull
which discards the feed items and does
nothing (you can use it e.g. when testing success of HTTP requests or
debugging the library itself), and collectorPretty
which writes a short
nicely formatted entry to stdout for each new feed item. And there are more.
If new items aren't being detected correctly, you can try using runDebug
in place of run
. It takes and same arguments and runs the same loop, but
also writes a detailed log of the detection process into a file
"debug.log". You can use tail -f debug.log
from your terminal to watch
new log entries get appended to the file.
Using Control Commands
Now let's see how to push control commands into the loop. The 4th parameter
run
takes is an optional command queue. In the example above we just
passed Nothing
. Now let's provide an actual queue and use it.
Suppose we are writing a program with a command-line interface. It watches news feeds in the background, and can take commands from the user at the same time on the command-line. Feeds can be added, removed, etc. without restarting the program.
Assume we've written a function named parseCommand
, which takes a line of
user input and returns a command ready to push into the queue. The parsed
commands are of type Command
and are created using functions like
addFeed
, removeFeed
and so on, which this module provides. Using our
parseCommand
we can write the program like this:
import Control.Concurrent (forkIO) import Data.Time.Units import System.IO import Web.Feed.Collect collect :: Label -> URL -> Feed -> Item -> IO () collect = collectPretty logE :: Error -> IO () logE = print feeds :: [(Label, Url)] feeds = [("fsf-news", "https://www.fsf.org/static/fsforg/rss/news.xml")] parseCommand :: String -> Maybe Command parseCommand line = {- ... - } main :: IO () main = do cqueue <- newCommandQueue forkIO $ run collect Nothing logE (Just cqueue) (1 :: Minute) 3 feeds let loop = do line <- getLine if line == "quit" then putStrLn "Bye!" else do case parseCommand line of Just cmd -> sendCommand cqueue cmd Nothing -> putStrLn "Invalid input" loop loop
- type Label = String
- type Url = String
- data Command
- data CommandQueue
- data Error
- collectorNull :: Label -> Url -> Feed -> Item -> IO ()
- collectorPrint :: Label -> Url -> Feed -> Item -> IO ()
- collectorPretty :: Label -> Url -> Feed -> Item -> IO ()
- collectorLog :: (Label -> FilePath) -> Label -> Url -> Feed -> Item -> IO ()
- run :: (TimeUnit t, MonadIO m) => (Label -> Url -> Feed -> Item -> m ()) -> Maybe (Label -> Url -> Feed -> [Item] -> m ()) -> (Error -> m ()) -> Maybe CommandQueue -> t -> Int -> [(Label, Url)] -> m ()
- runDebug :: (TimeUnit t, MonadIO m) => (Label -> Url -> Feed -> Item -> m ()) -> Maybe (Label -> Url -> Feed -> [Item] -> m ()) -> (Error -> m ()) -> Maybe CommandQueue -> t -> Int -> [(Label, Url)] -> m ()
- newCommandQueue :: IO CommandQueue
- sendCommand :: CommandQueue -> Command -> IO ()
- sendCommands :: CommandQueue -> [Command] -> IO ()
- addFeed :: Label -> Url -> Command
- removeFeed :: Label -> Command
- feedActive :: Label -> Bool -> Command
- setInterval :: TimeUnit t => t -> Command
- setMaxPerVisit :: Int -> Command
- showFeed :: Feed -> String
- showItem :: Item -> String
Types
data CommandQueue Source
An error occuring while reading from a news feed.
HttpError HttpException | Error while creating an HTTP request or receiving a response. |
FeedParsingFailed Label Url | Error while parsing the HTTP response body into feed content |
Collectors
collectorNull :: Label -> Url -> Feed -> Item -> IO () Source
An item collector which discards the item, i.e. does nothing.
collectorPrint :: Label -> Url -> Feed -> Item -> IO () Source
An item collector which prints feed and item fields to stdout
.
collectorPretty :: Label -> Url -> Feed -> Item -> IO () Source
An item collector which prints short friendly feed and item descriptions
to stdout
.
collectorLog :: (Label -> FilePath) -> Label -> Url -> Feed -> Item -> IO () Source
An item collector which writes friendly descriptions into a log file, determining the log file's name using the function given as the first argument (the feed label, i.e. second argument, is also used as the argument for that function).
Running
:: (TimeUnit t, MonadIO m) | |
=> (Label -> Url -> Feed -> Item -> m ()) | Collector, i.e. action to perform when receiving a feed item. |
-> Maybe (Label -> Url -> Feed -> [Item] -> m ()) | Action to perform when receiving feed items. This is just a chance to
provide an efficient shortcut instead of repeated use of the previous
parameter. If there is no such shortcut, simply pass |
-> (Error -> m ()) | Error logging action. It will be called when an error occurs while trying to download and parse a feed. Such an error doesn't cause anything to stop. Execution simply goes on to read the next feed, and will try the erronous feed again in the next round. |
-> Maybe CommandQueue | A command queue you can use to change settings while the program runs.
For example, you can add a new feed or change the interval between polls
without relaunching |
-> t | Time interval between visits of a watched feed. |
-> Int | Maximal number of items to collect per visit (if more are available, they will be collected in the next visit) |
-> [(Label, Url)] | List of short feed labels (for easy reference and logging), and their URLs to watch. |
-> m () |
Watch feeds and perform the given action on received feed items.
runDebug :: (TimeUnit t, MonadIO m) => (Label -> Url -> Feed -> Item -> m ()) -> Maybe (Label -> Url -> Feed -> [Item] -> m ()) -> (Error -> m ()) -> Maybe CommandQueue -> t -> Int -> [(Label, Url)] -> m () Source
Like run
, but writes detailed reports into a file named "debug.log".
This should help you understand which feeds items are being received and why
new items aren't being detected in the way you expect.
For some feeds, the new item detection method used by this library may fail. In that case, please report the issue, and then the method can be updated and extended to support more cases.
Control commands
newCommandQueue :: IO CommandQueue Source
Create a new empty command queue.
sendCommand :: CommandQueue -> Command -> IO () Source
Send a command into the queue.
sendCommands :: CommandQueue -> [Command] -> IO () Source
Send a series of commands into the queue.
removeFeed :: Label -> Command Source
Remove a previously added feed.
feedActive :: Label -> Bool -> Command Source
Set whether a given feed should be watched (active) or not (inactive).
setInterval :: TimeUnit t => t -> Command Source
Set the interval, in microseconds, between feed scans.
setMaxPerVisit :: Int -> Command Source
Set the maximal number of feed items to be collected per feed per scan. If more new items are found, they wait for the next scan.