cpython-3.3.0: Bindings for libpython

Safe Haskell: None

CPython.Types.Unicode

Unicode objects

fromEncodedObject :: Object obj => obj -> Encoding -> ErrorHandling -> IO Unicode

Coerce an encoded object obj to a Unicode object.

Bytes and other char buffer compatible objects are decoded according to the given encoding and error handling mode.

All other objects, including Unicode objects, cause a TypeError to be thrown.

fromObject :: Object obj => obj -> IO Unicode

Shortcut for fromEncodedObject obj "utf-8" Strict.

encode :: Unicode -> Encoding -> ErrorHandling -> IO Bytes

Encode a Unicode object and return the result as a Bytes object. The encoding and error mode have the same meaning as the parameters of the str.encode() method. The codec to be used is looked up using the Python codec registry.

decode :: Bytes -> Encoding -> ErrorHandling -> IO Unicode

Create a Unicode object by decoding a Bytes object. The encoding and error mode have the same meaning as the parameters of the bytes.decode() method. The codec to be used is looked up using the Python codec registry.
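As a rough sketch of how encode and decode fit together, the following round-trips a string through UTF-8. It assumes that the package also provides CPython.initialize and the conversions toUnicode and fromUnicode (to and from Data.Text), and that Encoding accepts a Text literal. None of these are documented in this section, so treat them as assumptions rather than confirmed API.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified CPython as Py
import qualified CPython.Types.Unicode as U
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- | Encode a Text value to UTF-8 Bytes inside Python, then decode it back.
-- The interpreter must already be initialized when this is called.
roundTrip :: T.Text -> IO T.Text
roundTrip input = do
  str     <- U.toUnicode input                 -- assumed helper: Text -> IO Unicode
  encoded <- U.encode str "utf-8" U.Strict     -- Unicode -> Bytes
  decoded <- U.decode encoded "utf-8" U.Strict -- Bytes -> Unicode
  U.fromUnicode decoded                        -- assumed helper: Unicode -> IO Text

main :: IO ()
main = do
  Py.initialize                                -- assumed entry point of the CPython module
  TIO.putStrLn =<< roundTrip "hello, world"
```

With the Strict error mode, a string that cannot be represented in the chosen encoding is expected to raise a Python exception instead of being silently altered.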

Methods and slot functions

split

Arguments

:: Unicode
-> Maybe Unicode     Separator
-> Maybe Integer     Maximum splits
-> IO List

Split a string giving a List of Unicode objects. If the separator is Nothing, splitting will be done at all whitespace substrings. Otherwise, splits occur at the given separator. Separators are not included in the resulting list.
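The two separator modes can be illustrated with a small wrapper that works on Data.Text values. This is only a sketch: toUnicode, fromUnicode, CPython.initialize, CPython.Types.List.fromList, and CPython.Protocols.Object.cast are assumed to exist in the package with the types noted in the comments, and splitOn and listToTexts are hypothetical helper names.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified CPython as Py
import qualified CPython.Protocols.Object as Obj
import qualified CPython.Types.List as PyList
import qualified CPython.Types.Unicode as U
import qualified Data.Text as T

-- | Convert a Python List of Unicode objects into Haskell Text values.
-- An element that is not a Unicode object raises a userError.
listToTexts :: PyList.List -> IO [T.Text]
listToTexts list = do
  items <- PyList.fromList list          -- assumed helper: List -> IO [SomeObject]
  mapM toText items
 where
  toText item = do
    casted <- Obj.cast item              -- assumed helper: checked downcast
    case casted of
      Just u  -> U.fromUnicode (u :: U.Unicode)
      Nothing -> ioError (userError "expected a Unicode object")

-- | Split on an optional separator with an optional maximum number of splits.
splitOn :: Maybe T.Text -> Maybe Integer -> T.Text -> IO [T.Text]
splitOn sep limit input = do
  str    <- U.toUnicode input            -- assumed helper: Text -> IO Unicode
  sepObj <- traverse U.toUnicode sep     -- Nothing selects whitespace splitting
  parts  <- U.split str sepObj limit
  listToTexts parts

main :: IO ()
main = do
  Py.initialize                          -- assumed entry point of the CPython module
  print =<< splitOn Nothing Nothing "alpha  beta\tgamma"
  print =<< splitOn (Just ",") (Just 1) "a,b,c"
```

The first call splits on runs of whitespace; the second splits on the comma at most once, so the remainder of the string stays in the final element.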

splitLines :: Unicode -> Bool -> IO List

Split a Unicode string at line breaks, returning a list of Unicode strings. CRLF is considered to be one line break. If the second parameter is False, the line break characters are not included in the resulting strings.

translate :: Object table => Unicode -> table -> ErrorHandling -> IO Unicode

Translate a string by applying a character mapping table to it.

The mapping table must map Unicode ordinal integers to Unicode ordinal integers or None (causing deletion of the character).

Mapping tables need only provide the __getitem__() interface; dictionaries and sequences work well. Unmapped character ordinals (ones which cause a LookupError) are left untouched and are copied as-is.

The error mode has the usual meaning for codecs.

join :: Sequence seq => Unicode -> seq -> IO Unicode

Join a sequence of strings using the given separator.
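A possible way to call join from Haskell values is sketched below. It assumes that the package provides toUnicode and fromUnicode, CPython.Types.List.toList for building a Python list, toObject for upcasting to SomeObject, and that List is an instance of Sequence. None of these appear in this section, and joinWith is a hypothetical helper name.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified CPython as Py
import qualified CPython.Protocols.Object as Obj
import qualified CPython.Types.List as PyList
import qualified CPython.Types.Unicode as U
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- | Join Text values with a separator by delegating to Python.
joinWith :: T.Text -> [T.Text] -> IO T.Text
joinWith sep parts = do
  sepObj   <- U.toUnicode sep                           -- assumed helper: Text -> IO Unicode
  partObjs <- mapM U.toUnicode parts
  list     <- PyList.toList (map Obj.toObject partObjs) -- assumed helper: [SomeObject] -> IO List
  joined   <- U.join sepObj list                        -- separator first, then the sequence
  U.fromUnicode joined                                  -- assumed helper: Unicode -> IO Text

main :: IO ()
main = do
  Py.initialize                                         -- assumed entry point of the CPython module
  TIO.putStrLn =<< joinWith ", " ["red", "green", "blue"]
```

Note the argument order: the separator comes first, mirroring Python's separator.join(sequence).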

tailMatch

Arguments

:: Unicode           String
-> Unicode           Substring
-> Integer           Start
-> Integer           End
-> MatchDirection
-> IO Bool

Return True if the substring matches string[start:end] at the given tail end (either a Prefix or Suffix match), False otherwise.
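The sketch below wraps tailMatch as a prefix/suffix test over the whole string. It assumes toUnicode and CPython.initialize exist in the package (they are not shown in this section) and that indices count code points, so Data.Text's length can serve as the end index for ASCII input. hasAffix is a hypothetical helper name.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified CPython as Py
import qualified CPython.Types.Unicode as U
import qualified Data.Text as T

-- | Test whether the input starts with (Prefix) or ends with (Suffix) the affix.
hasAffix :: U.MatchDirection -> T.Text -> T.Text -> IO Bool
hasAffix dir affix input = do
  str <- U.toUnicode input                      -- assumed helper: Text -> IO Unicode
  sub <- U.toUnicode affix
  -- Match against the full range string[0:length].
  U.tailMatch str sub 0 (toInteger (T.length input)) dir

main :: IO ()
main = do
  Py.initialize                                 -- assumed entry point of the CPython module
  print =<< hasAffix U.Prefix "Main" "Main.hs"
  print =<< hasAffix U.Suffix ".hs" "Main.hs"
  print =<< hasAffix U.Suffix ".py" "Main.hs"
```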

find

Arguments

:: Unicode           String
-> Unicode           Substring
-> Integer           Start
-> Integer           End
-> FindDirection
-> IO (Maybe Integer)

Return the first position of the substring in string[start:end] using the given direction. The return value is the index of the first match; a value of Nothing indicates that no match was found.
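To show how the direction affects the result, here is a minimal sketch that searches the whole string. The constructors Forwards and Backwards for FindDirection, as well as toUnicode and CPython.initialize, are assumptions about the rest of the package; findIn is a hypothetical helper name.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified CPython as Py
import qualified CPython.Types.Unicode as U
import qualified Data.Text as T

-- | Search the whole input for a substring in the given direction.
findIn :: U.FindDirection -> T.Text -> T.Text -> IO (Maybe Integer)
findIn dir needle haystack = do
  str <- U.toUnicode haystack                   -- assumed helper: Text -> IO Unicode
  sub <- U.toUnicode needle
  U.find str sub 0 (toInteger (T.length haystack)) dir

main :: IO ()
main = do
  Py.initialize                                 -- assumed entry point of the CPython module
  print =<< findIn U.Forwards  "hello" "hello world hello"  -- assumed constructor name
  print =<< findIn U.Backwards "hello" "hello world hello"  -- assumed constructor name
  print =<< findIn U.Forwards  "bye"   "hello world hello"
```

A forward search should report the leftmost match and a backward search the rightmost one, while a missing substring yields Nothing instead of a sentinel index.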

count

Arguments

:: Unicode           String
-> Unicode           Substring
-> Integer           Start
-> Integer           End
-> IO Integer

Return the number of non-overlapping occurrences of the substring in string[start:end].
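The non-overlapping rule is easiest to see with a substring that could overlap itself. The following sketch assumes toUnicode and CPython.initialize are available from the package, which this section does not confirm; countIn is a hypothetical helper name.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified CPython as Py
import qualified CPython.Types.Unicode as U
import qualified Data.Text as T

-- | Count non-overlapping occurrences of a substring in the whole input.
countIn :: T.Text -> T.Text -> IO Integer
countIn needle haystack = do
  str <- U.toUnicode haystack                   -- assumed helper: Text -> IO Unicode
  sub <- U.toUnicode needle
  U.count str sub 0 (toInteger (T.length haystack))

main :: IO ()
main = do
  Py.initialize                                 -- assumed entry point of the CPython module
  -- "ana" fits at offsets 1 and 3 of "banana", but those matches overlap.
  print =<< countIn "ana" "banana"
  print =<< countIn "aa" "aaaa"
```

Because matches may not overlap, "ana" is counted once in "banana" and "aa" twice in "aaaa", matching the behaviour of Python's str.count().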

replace

Arguments

:: Unicode           String
-> Unicode           Substring
-> Unicode           Replacement
-> Maybe Integer     Maximum count
-> IO Unicode

Replace occurrences of the substring with a given replacement. If the maximum count is Nothing, replace all occurrences.
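A short sketch of the limited and unlimited forms follows. It assumes toUnicode, fromUnicode, and CPython.initialize are provided by the package, none of which are documented in this section; replaceIn is a hypothetical helper name.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified CPython as Py
import qualified CPython.Types.Unicode as U
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- | Replace a substring, optionally limiting the number of replacements.
replaceIn :: T.Text -> T.Text -> Maybe Integer -> T.Text -> IO T.Text
replaceIn old new limit input = do
  str    <- U.toUnicode input                   -- assumed helper: Text -> IO Unicode
  oldObj <- U.toUnicode old
  newObj <- U.toUnicode new
  result <- U.replace str oldObj newObj limit   -- Nothing replaces every occurrence
  U.fromUnicode result                          -- assumed helper: Unicode -> IO Text

main :: IO ()
main = do
  Py.initialize                                 -- assumed entry point of the CPython module
  TIO.putStrLn =<< replaceIn "-" "+" Nothing  "a-b-c"
  TIO.putStrLn =<< replaceIn "-" "+" (Just 1) "a-b-c"
```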

format :: Unicode -> Tuple -> IO Unicode

Return a new Unicode object from the given format and args; this is analogous to format % args.

contains :: Object element => Unicode -> element -> IO Bool

Check whether element is contained in a string.

The element must be coercible to a one-element string.