ReplaceUmlaut: converting text to properly encoded German umlauts

[ library, program, text ]

Converts the convenient ae, oe and ue substitutes for German umlauts into the proper UTF-8 encoded umlauts, while leaving ae, oe and ue untouched in words where they must remain, based on an extensible list. Processes a whole file at a time.


Versions: 0.1.5.3
Change log: ChangeLog.md
Dependencies: base (>=4.7 && <5), dir-traverse, optparse-applicative, ReplaceUmlaut, text, transformers, uniform-cmdLineArgs, uniform-fileio, uniform-json (>=0.1.5), uniform-pandoc, uniformBase (>=0.1.5)
License: LicenseRef-GPL
Copyright: 2021 Andrew U. Frank
Author: Andrew Frank
Maintainer: Andrew U. Frank <andrewufrank@gmail.com>
Category: Text
Home page: https://github.com/andrewufrank/u4blog.git#readme
Bug tracker: https://github.com/andrewufrank/u4blog.git/issues
Source repo: head: git clone https://github.com/andrewufrank/u4blog.git (uniform-cmdLineArgs)
Uploaded: by andrewufrank at 2023-04-06T20:15:00Z
Reverse dependencies: 1 direct, 0 indirect
Executables: replaceUmlaut
Downloads: 76 total (2 in the last 30 days)
Status: Docs available
Last success reported on 2023-04-06

Readme for ReplaceUmlaut-0.1.5.3


Replacement of ae, oe, ue by the UTF-8 umlaut glyphs ä, ö, ü

ReplaceUmlaut processes a txt or md file (or all md files in a directory) and replaces each ae, oe and ue with the proper umlaut (capitalized where the original is). It uses an extensible list of exceptions to avoid replacements that are not appropriate.

The file nichtUmlaut.txt lists exceptions

nichtUmlaut.txt is an extensible list of word parts containing character combinations that should not be replaced, as in Koeffizient. The list only needs to contain parts of words (e.g. koeff), not a complete list of all words with non-replaceable combinations.
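The idea of matching word parts can be illustrated with a minimal sketch. It is not the package's implementation: the names exceptions, replaceWord and replaceText and the three sample fragments are hypothetical, and the sketch assumes that a word is left alone whenever its lower-cased form contains any fragment from the list. It also only handles the capitalized forms Ae, Oe and Ue, not all-caps words.

```haskell
module Main where

import Data.Char (isAlpha, toLower)
import Data.Function (on)
import Data.List (groupBy, isInfixOf)
import System.IO (hSetEncoding, stdout, utf8)

-- | Hypothetical sample fragments, lower-cased, of the kind the exception
-- list is described as holding. These are NOT the package's real entries.
exceptions :: [String]
exceptions = ["koeff", "aktuell", "poet"]

-- | Replace ae/oe/ue in a single word, unless the word contains one of the
-- exception fragments (compared case-insensitively).
replaceWord :: [String] -> String -> String
replaceWord exc w
  | any (`isInfixOf` map toLower w) exc = w
  | otherwise = go w
  where
    go ('a':'e':rest) = 'ä' : go rest
    go ('o':'e':rest) = 'ö' : go rest
    go ('u':'e':rest) = 'ü' : go rest
    go ('A':'e':rest) = 'Ä' : go rest
    go ('O':'e':rest) = 'Ö' : go rest
    go ('U':'e':rest) = 'Ü' : go rest
    go (c:rest)       = c : go rest
    go []             = []

-- | Split the text into runs of letters and runs of non-letters, convert
-- each run, and glue the runs back together. Non-letter runs contain no
-- digraphs, so they pass through unchanged.
replaceText :: [String] -> String -> String
replaceText exc = concatMap (replaceWord exc) . groupBy ((==) `on` isAlpha)

main :: IO ()
main = do
  hSetEncoding stdout utf8  -- make sure umlauts can be printed
  putStrLn (replaceText exceptions "Der Koeffizient ist schoen fuer die Uebung.")
```

Because only fragments are matched, a single entry such as koeff protects Koeffizient, Koeffizienten and any compound containing it. A real list needs many more entries than this sketch; words like Quelle or Mauer, for instance, would be mangled here without a matching fragment.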

Command Line Use

The command replaceUmlaut filepath processes just that file and writes the changed file in its place (the original is renamed with extension bak). Switches:

- -m directory processes all md files in the directory.
- -d is a debug option; files are not changed, but the changed version is written with extension new.
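The difference between a normal run and a debug run can be summarized as a small pure function. This is only a sketch of the behaviour described above, not code from the package: the Step type and stepsFor are hypothetical, and it assumes the extension is appended to the full file name (post.md becomes post.md.bak), whereas the tool may replace the extension instead.

```haskell
module Main where

-- | One file-system action for an input file (hypothetical type, used
-- only for illustration).
data Step
  = Rename FilePath FilePath  -- ^ move the original aside
  | Write FilePath            -- ^ write the converted text to this path
  deriving (Eq, Show)

-- | Plan the actions for one file.
-- Assumption: extensions are appended, not substituted.
stepsFor :: Bool      -- ^ True when the debug switch (-d) is given
         -> FilePath  -- ^ input file
         -> [Step]
stepsFor debug path
  | debug     = [Write (path ++ ".new")]                       -- original untouched
  | otherwise = [Rename path (path ++ ".bak"), Write path]     -- keep a backup

main :: IO ()
main = do
  print (stepsFor False "post.md")
  print (stepsFor True "post.md")
```

In debug mode the original file is never touched, so the result in the new file can be compared with the input before running the tool for real.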

Function

The function procMd1 processes a single md file from within a program. The list of permitted combinations (the word parts in which ae, oe and ue must remain) has to be passed as an argument. It returns True if the file was changed. Typical use:

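    changed <- if header
        then do
            -- read the exception list from its file
            erl1 <- readErlaubt fnErlaubt
            let addErl = dyDoNotReplace . meta1 $ doc1
                -- allow additions to the list in the YAML header
                erl2 = addErl -- add erl1
            -- returns True if the file was changed
            applyReplace debugReplace erl2 fnin
        else return False -- assumed: no replacement requested, file unchanged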