Safe Haskell | Safe |
---|---|
Language | Haskell2010 |
Introduction
This micro library consists of a single cross-platform function, downloadFile, which downloads a file from the Web to your filesystem. It is very light on dependencies and configurability, and is ultimately just a wrapper around a PowerShell script on Windows and curl on Linux and macOS. Both PowerShell and curl should be available out of the box.

To set expectations, downloadFile is lo-fi and deliberately under-engineered. The download request blocks until it is done, all errors are thrown as unrecoverable IO exceptions, and any errors that occur at the curl or PowerShell level are bubbled up to the user as is. If you don't care about low dependencies and a small API, or need recoverable errors and socket pooling, http-client is a much nicer package with many more options.

I wrote this because I needed an easy, low-dependency way to download files off the Internet across platforms at build time. I have a demo project which shows how to use it in your Setup.hs Cabal build script. It could also work pretty well for throwaway scripts.
downloadFile
:: HasCallStack | |
=> String | URL from which to download a file (or web page) |
-> Maybe (String, Maybe ProxyAuth) | Proxy authentication, eg. |
-> FilePath | Directory in which to save the file (it must exist) |
-> FilePath | File name into which to save the downloaded data |
-> Overwrite | Optionally overwrite the file if it already exists |
-> IO FilePath |
Downloads a file from the given URL via a GET request to the specified location on the filesystem and returns the _absolute_ and canonicalized path to that location.
On Windows the download delegates to a PowerShell script; on all other platforms it wraps curl. The user agent for the request is "downloader/<downloader-version>(<os>;<arch>)", eg. when version 0.1.0.0 of this package is run on 64-bit Linux the user agent is "downloader/0.1.0.0(linux;x86_64)".
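The user agent format described above can be sketched as a small pure function. This is an illustration only: the names userAgent and currentUserAgent are hypothetical, and reading the platform from System.Info is an assumption about how the <os> and <arch> parts might be obtained, not a statement about the library's internals.

```haskell
import System.Info (arch, os)

-- Hypothetical helper: builds a string in the documented format
-- "downloader/<downloader-version>(<os>;<arch>)".
userAgent :: String -> String -> String -> String
userAgent version osName archName =
  "downloader/" ++ version ++ "(" ++ osName ++ ";" ++ archName ++ ")"

-- Assumption: the platform parts come from System.Info, which reports
-- e.g. "linux" and "x86_64" on 64-bit Linux.
currentUserAgent :: String -> String
currentUserAgent version = userAgent version os arch
```

For example, userAgent "0.1.0.0" "linux" "x86_64" evaluates to "downloader/0.1.0.0(linux;x86_64)", matching the example in the text.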
Only HTTP and HTTPS transport protocols are supported. If a URL does not specify a protocol it is prefixed with "https://", eg. given URL string "www.google.com" this function will make a request to "https://www.google.com".
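The protocol rule above might look roughly like the following sketch. The names splitScheme and normalizeUrl are hypothetical, and the parsing is deliberately naive: it treats everything before the first "://" as the protocol, which the library itself may well handle differently.

```haskell
import Data.Char (toLower)
import Data.List (isPrefixOf)

-- Hypothetical helper: split a URL at the first "://", if there is one.
splitScheme :: String -> Maybe (String, String)
splitScheme = go ""
  where
    go acc rest
      | "://" `isPrefixOf` rest = Just (reverse acc, drop 3 rest)
      | otherwise =
          case rest of
            []       -> Nothing
            (c : cs) -> go (c : acc) cs

-- Default to https when no protocol is given; reject anything that is
-- not http or https.
normalizeUrl :: String -> Either String String
normalizeUrl url =
  case splitScheme url of
    Nothing -> Right ("https://" ++ url)
    Just (scheme, _)
      | map toLower scheme `elem` ["http", "https"] -> Right url
      | otherwise -> Left ("unsupported protocol: " ++ scheme)
```

Here normalizeUrl "www.google.com" gives Right "https://www.google.com", while normalizeUrl "ftp://example.com" gives Left "unsupported protocol: ftp". The real function throws an IO exception rather than returning an Either; the Either is used here only to keep the sketch pure.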
The output directory may be relative but must exist.
The output filename must be just a valid, unqualified filename, eg. "file.txt" is fine but "../../a/b/c/file.txt" is rejected.
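As a rough illustration of the filename rule, the check below accepts only names with no directory component. The function name isUnqualifiedFileName is hypothetical and the test is an assumption about what "unqualified" means here; the library's own validation may be stricter, for instance about characters that are invalid on a given platform.

```haskell
-- Hypothetical check: a name is unqualified if it is non-empty, is not
-- "." or "..", and contains no path separator of either style.
isUnqualifiedFileName :: FilePath -> Bool
isUnqualifiedFileName name =
  not (null name)
    && name `notElem` [".", ".."]
    && all (`notElem` "/\\") name
```

With this sketch, "file.txt" passes, while both "../../a/b/c/file.txt" and "a/b/c/file.txt" are rejected.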
This function will throw an IO exception in the following cases:
- A badly formed URL
- A URL that specifies a protocol that is not http or https, eg. "ftp" will be rejected
- A badly formed proxy URL
- A nonexistent directory or one that isn't writeable
- An invalid output filename
- A filename that includes parent directories, eg. "a/b/c/file.txt"
- An HTTP status that is not 200 is returned by the request
- Any other error returned by curl or PowerShell
Overwrite is a Bool wrapper that is passed to downloadFile and which, if set to (Overwrite True), allows downloadFile to overwrite an existing file.
ProxyAuth is used for proxy authentication: Basic ("user", "pass") indicates that the proxy needs basic authentication, where the username is "user" and the password is "pass", whereas with Digest ("user", "pass") digest authentication is used instead.
In a nutshell, with basic auth your password is sent over the network effectively in clear text (it is only Base64-encoded), so anyone monitoring traffic can see it. With digest auth each request generates two calls: the first gets a unique challenge value (a nonce) from the proxy, and the second sends the actual request with the password hashed together with that value, so anyone monitoring web traffic sees only a hash rather than the password itself.
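To make the two authentication modes concrete, here is a sketch of how a proxy argument of the shape taken by downloadFile could be turned into curl command-line flags. The ProxyAuth type below is a local stand-in mirroring the constructors described above, the function name proxyArgs and the proxy URL in the example are hypothetical, and the mapping is an assumption: the curl flags themselves (--proxy, --proxy-user, --proxy-basic, --proxy-digest) are real, but the library may invoke curl differently.

```haskell
-- Local stand-in for the library's type, for illustration only.
data ProxyAuth
  = Basic (String, String)
  | Digest (String, String)

-- Hypothetical translation of the proxy argument into curl flags.
proxyArgs :: Maybe (String, Maybe ProxyAuth) -> [String]
proxyArgs Nothing = []
proxyArgs (Just (proxyUrl, auth)) =
  ["--proxy", proxyUrl] ++ maybe [] authArgs auth
  where
    authArgs (Basic (user, pass)) =
      ["--proxy-basic", "--proxy-user", user ++ ":" ++ pass]
    authArgs (Digest (user, pass)) =
      ["--proxy-digest", "--proxy-user", user ++ ":" ++ pass]
```

For instance, proxyArgs (Just ("http://proxy.example.com:8080", Just (Digest ("user", "pass")))) yields ["--proxy", "http://proxy.example.com:8080", "--proxy-digest", "--proxy-user", "user:pass"], and a proxy without credentials yields only the --proxy pair.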