This module provides all the settable options in shpider.
- stayOnDomain :: Bool -> Shpider ()
- setTimeOut :: Long -> Shpider ()
- setStartPage :: String -> Shpider ()
- getStartPage :: Shpider String
- onlyDownloadHtml :: Bool -> Shpider ()
- setCurrentPage :: Page -> Shpider ()
- getCurrentPage :: Shpider Page
- keepTrack :: Shpider ()
Documentation
stayOnDomain :: Bool -> Shpider ()
Setting this to True prevents download and sendForm from contacting any site outside the domain of the URL given to setStartPage.
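As a rough illustration of what a same-domain restriction involves, the sketch below compares the hosts of two absolute URLs using only base. The names hostOf and sameDomain are hypothetical and not part of shpider, and the library's own comparison may differ (for example in how it treats ports, subdomains, or user info).

```haskell
import Data.Char (toLower)
import Data.List (stripPrefix)
import Data.Maybe (listToMaybe, mapMaybe)

-- | Hypothetical helper (not part of shpider): the lower-cased host of an
-- absolute http(s) URL, ignoring any port, path, query, or fragment.
-- Returns Nothing for relative URLs and other schemes.
hostOf :: String -> Maybe String
hostOf url = do
  rest <- listToMaybe (mapMaybe (`stripPrefix` lowered) ["http://", "https://"])
  let host = takeWhile (`notElem` "/:?#") rest
  if null host then Nothing else Just host
  where
    lowered = map toLower url

-- | Hypothetical check: True only when both URLs are absolute and share a host.
sameDomain :: String -> String -> Bool
sameDomain start target =
  case (hostOf start, hostOf target) of
    (Just a, Just b) -> a == b
    _                -> False
```

With these definitions, sameDomain treats host names case-insensitively and rejects any URL it cannot parse as absolute.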
setTimeOut :: Long -> Shpider ()
Set the CurlTimeout option. Requests will time out after this number of seconds.
setStartPage :: String -> Shpider ()
Set the start page of your shpidering antics. The start page must be an absolute URL; otherwise, this will raise an error.
getStartPage :: Shpider String
Return the starting URL, as set by setStartPage.
onlyDownloadHtml :: Bool -> Shpider ()
If onlyDownloadHtml is True, then during download, shpider will first make a HEAD request to check whether the content type is text/html or application/xhtml+xml. Only if it is will shpider make the GET request.
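The content-type test described above can be sketched as a small pure function. The name isHtmlContentType is hypothetical and not part of shpider, and the handling of case and of parameters such as a charset is an assumption about how such a check is usually written, not a description of the library's internals.

```haskell
import Data.Char (isSpace, toLower)
import Data.List (dropWhileEnd)

-- | Hypothetical helper (not part of shpider): True when a Content-Type
-- header value names text/html or application/xhtml+xml. Case and any
-- trailing parameters (e.g. "; charset=utf-8") are ignored.
isHtmlContentType :: String -> Bool
isHtmlContentType header =
  mediaType `elem` ["text/html", "application/xhtml+xml"]
  where
    -- Keep only the part before the first ';', then trim and lower-case it.
    mediaType = map toLower (trim (takeWhile (/= ';') header))
    trim = dropWhileEnd isSpace . dropWhile isSpace
```

A crawler following the HEAD-then-GET pattern would apply a check like this to the HEAD response and issue the GET only when it returns True.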
setCurrentPage :: Page -> Shpider ()
Set the given page as the currentPage.
getCurrentPage :: Shpider Page
Return the current page.
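
A minimal usage sketch combining the options above. It assumes that the top-level Network.Shpider module re-exports these functions along with runShpider and download, and that runShpider runs a Shpider action in IO; the URL is a placeholder and the timeout value is arbitrary. Treat it as an outline to adapt, not a verified program.

```haskell
import Network.Shpider  -- assumed to re-export the options documented here

main :: IO ()
main = do
  _page <- runShpider $ do
    setStartPage "http://example.com/"  -- placeholder; must be an absolute URL
    stayOnDomain True                   -- refuse requests that leave the start page's domain
    setTimeOut 30                       -- give up on a request after 30 seconds
    onlyDownloadHtml True               -- HEAD first, GET only for HTML or XHTML
    _ <- download "http://example.com/" -- result ignored in this sketch
    getCurrentPage                      -- read back whatever page is now current
  putStrLn "finished"
```

The options are set before the first download so that every request in the session is subject to them.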