rawfilepath: Use RawFilePath instead of FilePath

[ apache, library, system ]



Versions: 0.1.1, 0.2.0, 0.2.1, 0.2.2, 0.2.3, 0.2.4, 1.0.0, 1.0.1, 1.1.0
Dependencies: base (>=4.7 && <5), bytestring, unix
License: Apache-2.0
Copyright: (C) 2016-2023 XT et al.
Author: XT et al.
Maintainer: git@xtendo.org
Category: System
Home page: https://github.com/xtendo-org/rawfilepath#readme
Bug tracker: https://github.com/xtendo-org/rawfilepath/issues
Source repo: head: git clone https://github.com/xtendo-org/rawfilepath
Uploaded: by XT at 2023-08-29T04:47:22Z
Distributions: LTSHaskell:1.1.0, NixOS:1.1.0, Stackage:1.1.0
Reverse dependencies: 3 direct, 0 indirect
Downloads: 5933 total (46 in the last 30 days)
Status: Docs available
Last success reported on 2023-08-29

Readme for rawfilepath-1.1.0



Version: 1.1.0


Use all "process" and "directory" features, like callProcess or getDirectoryFiles, without worrying about FilePath encoding issues or performance penalties.


The unix package provides RawFilePath, which is a type synonym for ByteString. Unlike FilePath (which is String), it has no performance issues because it is a ByteString. It has no encoding issues either, because a ByteString is a sequence of bytes rather than characters.

That's all good. With RawFilePath, we can properly separate the "sequence of bytes" and the "sequence of Unicode characters." The control is yours. Properly encode or decode them with UTF-8 or UTF-16 or any codec of your choice.
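As a minimal sketch of what "the control is yours" can look like, the snippet below converts between Text and RawFilePath with UTF-8. It assumes the text package is available (it is not a dependency of rawfilepath), and the names toRawPath, fromRawPath, pathLabel, and latin1Name are hypothetical helpers invented for this illustration.

```haskell
import qualified Data.ByteString as B
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import Data.Text.Encoding.Error (UnicodeException)
import System.Posix.ByteString.FilePath (RawFilePath)

-- | Encode human-readable text into path bytes.
-- UTF-8 is our choice here; nothing in RawFilePath forces it.
toRawPath :: Text -> RawFilePath
toRawPath = TE.encodeUtf8

-- | Decode path bytes into text, failing explicitly on invalid UTF-8.
fromRawPath :: RawFilePath -> Either UnicodeException Text
fromRawPath = TE.decodeUtf8'

-- | Hypothetical helper: a printable label that never throws.
pathLabel :: RawFilePath -> Text
pathLabel = either (const (T.pack "<non-UTF-8 path>")) id . fromRawPath

-- | "café" encoded as Latin-1: four bytes that are not valid UTF-8.
latin1Name :: RawFilePath
latin1Name = B.pack [0x63, 0x61, 0x66, 0xe9]
```

Decoding is the only step that can fail, and the failure is visible in the type. The raw bytes themselves are never altered, so a path such as latin1Name can still be passed to the system unchanged even though it cannot be shown as UTF-8 text.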


  • The functions in unix are low-level.
  • The higher-level packages such as process and directory are strictly tied to FilePath.

This library provides the higher-level interface with RawFilePath.
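To give a feel for what "low-level" means here, the following sketch lists a directory using only the ByteString API of unix. The helper name listDirRaw is hypothetical and is not claimed to be how rawfilepath implements its own directory functions; it only shows the kind of stream handling a higher-level interface spares you.

```haskell
import Control.Exception (bracket)
import qualified Data.ByteString.Char8 as BC
import System.Posix.ByteString.FilePath (RawFilePath)
import System.Posix.Directory.ByteString
  (closeDirStream, openDirStream, readDirStream)

-- | Hypothetical helper: list the entries of a directory as raw bytes,
-- skipping the "." and ".." pseudo-entries.
listDirRaw :: RawFilePath -> IO [RawFilePath]
listDirRaw dir = bracket (openDirStream dir) closeDirStream go
  where
    -- readDirStream returns an empty path once the stream is exhausted.
    go stream = do
      entry <- readDirStream stream
      if BC.null entry
        then pure []
        else do
          rest <- go stream
          pure (if isSpecial entry then rest else entry : rest)
    isSpecial e = e == BC.pack "." || e == BC.pack ".."
```

Opening, reading until an empty result, and closing the stream (even when an exception occurs) all have to be handled by hand. No String is involved at any point, so entry names come back exactly as the system reported them.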


rawfilepath is easy to use.

{-# language OverloadedStrings #-}

import RawFilePath
import System.IO
import qualified Data.ByteString as B

main :: IO ()
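main = do
  -- Start sed with pipes attached to its stdin and stdout.
  p <- startProcess $ proc "sed" ["-e", "s/\\>/!/g"]
    `setStdin` CreatePipe
    `setStdout` CreatePipe
  -- Feed the input, then close stdin so that sed sees end-of-file.
  B.hPut (processStdin p) "Lorem ipsum dolor sit amet"
  hClose (processStdin p)
  -- Read everything sed wrote to its stdout.
  result <- B.hGetContents (processStdout p)
  print result
  -- "Lorem! ipsum! dolor! sit! amet!"

  • High performance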
  • No round-trip encoding issue
  • Minimal dependencies (three packages: bytestring, unix, and base)
  • Lightweight library (under 400 total lines of code)
  • Type safety (inspired by typed-process)
  • Available now



Traditional String is notorious:

  • 24 bytes (three machine words on a 64-bit system) required for one character: the List constructor, the actual Char value, and the pointer to the next List constructor. That is 24x the memory of a one-byte character.
  • Heap fragmentation causing malloc/free overhead
  • A lot of pointer chasing for reading: Devastates the cache hit rate
  • A lot of pointer chasing plus a lot of heap object allocation for manipulation (appending, slicing, etc.)
  • Completely unnecessary but mandatory conversions and memory allocation when the data is sent to or received from the outside world

This already makes us unhappy enough to avoid String. FilePath is a type synonym of String. Use RawFilePath instead. It's faster and occupies less memory.


FilePath is a type synonym of String. This is a bigger problem than the cost String already carries, because it is no longer just a performance issue; it is a correctness issue, as a FilePath carries no encoding information.

A syscall gives you (or expects from you) a series of bytes, but String is a series of characters. So how do you know the system's encoding? NTFS uses UTF-16, and FAT32 uses the OEM character set. On Linux, there is no filesystem-level encoding. Would Haskell somehow magically figure out the system's encoding and encode or decode accordingly? Well, there is no magic. FilePath offers no guarantee of correct behavior, especially when non-ASCII characters are involved.
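The risk of treating path bytes as characters can be sketched with a pure example. The code below does not model how base handles FilePath internally; it only assumes the text package and shows that a file name which is not valid UTF-8 does not survive a lenient decode followed by an encode, while the ByteString itself is untouched. The names rawName, viaCharacters, and survivesRoundTrip are hypothetical.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text.Encoding as TE
import Data.Text.Encoding.Error (lenientDecode)

-- | The bytes "fo" followed by 0xFF. The byte 0xFF never occurs in valid
-- UTF-8, yet a Linux filesystem can accept such a name.
rawName :: B.ByteString
rawName = B.pack [0x66, 0x6f, 0xff]

-- | Decode to characters (replacing invalid input with U+FFFD), then
-- encode back to bytes.
viaCharacters :: B.ByteString -> B.ByteString
viaCharacters = TE.encodeUtf8 . TE.decodeUtf8With lenientDecode

-- | True when the bytes come back unchanged after the detour.
survivesRoundTrip :: B.ByteString -> Bool
survivesRoundTrip bytes = viaCharacters bytes == bytes
```

Here survivesRoundTrip rawName is False: the invalid byte is replaced by the three-byte encoding of U+FFFD, so the result names a different file. Keeping the path as bytes from end to end avoids the detour entirely.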


In June 2015, three bright Haskell programmers came up with an elegant solution called the Abstract FilePath Proposal, which was met with immediate thunderous applause. Inspired by this enthusiasm, they further pursued the career of professional Haskell programming and focused on more interesting things. (sigh)

This library provides a stable and high-performance API that is available now.


API documentation of rawfilepath on Stackage.

To do

rawfilepath is stable. We don't expect any backward-incompatible changes. But we do want to port more system functions that are present in process or directory. We'll need to be a bit careful about their API for stability, though.

Patches will be highly appreciated.