GF Project Ideas
Resource Grammars, Web Applications, etc
contact: Aarne Ranta (aarne at chalmers dot se)
%!Encoding : iso-8859-1
%!target:html
%!postproc(html): #BECE
%!postproc(html): #ENCE
%!postproc(html): #GRAY
%!postproc(html): #EGRAY
%!postproc(html): #RED
%!postproc(html): #YELLOW
%!postproc(html): #ERED
%!postproc(html): #EYELLOW
#BECE
[Logos/gf0.png]
#ENCE
==Resource Grammar Implementations==
The GF Resource Grammar Library is an open-source computational grammar resource
that currently covers 12 languages.
The Library is a collaborative effort to which programmers from many countries
have contributed. The next goal is to extend the library
to all 23 official EU languages. Other languages
are also welcome at any time. The following diagram shows the current status of the
library. Each of the red and orange languages is a potential project.
#BECE
[school-langs.png]
#ENCE
//red=wanted, green=exists, orange=in-progress, solid=official-eu, dotted=non-eu//
The linguistic coverage of the library includes the inflectional morphology
and basic syntax of each language. It can be used in GF applications
and also ported to other formats. It can also be used for building other
linguistic resources, such as morphological lexica and parsers.
The library is licensed under LGPL.
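
To give a flavour of what "inflectional morphology" means in code, the following is a minimal sketch in Python (not GF, and not taken from the library) of a paradigm function that builds the inflection table of a regular English noun. The function name ``regular_noun`` and the simplified spelling rules are assumptions made for illustration only; the actual resource grammars are written in GF and cover far more cases.

```
def regular_noun(singular):
    """Return a small inflection table for a regular English noun.

    Simplified rules (an illustrative assumption, not the library's):
    - sibilant endings (s, x, z, ch, sh) take -es
    - consonant + y becomes -ies
    - everything else takes -s
    """
    vowels = "aeiou"
    if singular.endswith(("s", "x", "z", "ch", "sh")):
        plural = singular + "es"
    elif (singular.endswith("y") and len(singular) > 1
          and singular[-2] not in vowels):
        plural = singular[:-1] + "ies"
    else:
        plural = singular + "s"
    return {"Sg": singular, "Pl": plural}


if __name__ == "__main__":
    # Print the table for a few sample nouns.
    for word in ["house", "bus", "fly", "boy"]:
        print(regular_noun(word))
```

A resource grammar implementation consists largely of such paradigms, for every part of speech, together with the syntactic rules that combine the inflected forms.
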
===Tasks===
Writing a grammar for a language is usually easier if other languages
from the same family already have grammars. The colours have the same
meaning as in the diagram above; in addition, we use boldface for the
red, still unimplemented languages and italics for the
orange languages in progress. In particular, each of the languages
coloured red below is a possible programming project.
Baltic:
- #RED Latvian #ERED
- #RED Lithuanian #ERED
Celtic:
- #RED Irish #ERED
Fenno-Ugric:
- #RED Estonian #ERED
- #GRAY Finnish #EGRAY
- #RED Hungarian #ERED
Germanic:
- #GRAY Danish #EGRAY
- #RED Dutch #ERED
- #GRAY English #EGRAY
- #GRAY German #EGRAY
- #GRAY Norwegian #EGRAY
- #GRAY Swedish #EGRAY
Hellenic:
- #RED Greek #ERED
Indo-Iranian:
- #YELLOW Hindi #EYELLOW
- #YELLOW Urdu #EYELLOW
Romance:
- #GRAY Catalan #EGRAY
- #GRAY French #EGRAY
- #GRAY Italian #EGRAY
- #RED Portuguese #ERED
- #YELLOW Romanian #EYELLOW
- #GRAY Spanish #EGRAY
Semitic:
- #YELLOW Arabic #EYELLOW
- #RED Maltese #ERED
Slavonic:
- #GRAY Bulgarian #EGRAY
- #RED Czech #ERED
- #YELLOW Polish #EYELLOW
- #GRAY Russian #EGRAY
- #RED Slovak #ERED
- #RED Slovenian #ERED
Tai:
- #YELLOW Thai #EYELLOW
Turkic:
- #YELLOW Turkish #EYELLOW
===Who is qualified===
Writing a resource grammar implementation requires good general programming
skills and good explicit knowledge of the grammar of the target language.
A typical participant could be
- a native or fluent speaker of the target language
- interested in languages at a theoretical level, and preferably familiar
with many languages (to be able to think about them at an abstract level)
- familiar with functional programming languages such as ML or Haskell
(GF itself is a language similar to these)
- at Master's or PhD level in linguistics, computer science, or mathematics
However, it is the quality of the assignment that is assessed, not any formal
requirements. The "typical participant" is described only to give an idea of
who is likely to succeed in this.
===The Summer School===
A Summer School on resource grammars and applications will
be organized at the campus of Chalmers University of Technology in Gothenburg,
Sweden, on 17-28 August 2009. It can be seen as a natural checkpoint in
a resource grammar project; the participants are assumed to learn GF before
the Summer School, but how far they have come in their projects may vary.
More information is available on the Summer School web page:
[``http://www.cs.chalmers.se/Cs/Research/Language-technology/GF/doc/gf-summerschool.html`` http://www.cs.chalmers.se/Cs/Research/Language-technology/GF/doc/gf-summerschool.html]
==Other project ideas==
===GF interpreter in Java===
The idea is to write a run-time system for GF grammars in Java. This enables
the use of **embedded grammars** in Java applications. This project is
an update of [earlier work http://www.cs.chalmers.se/~bringert/gf/gf-java.html],
now using the new run-time format PGF and addressing a new parsing algorithm.
Requirements: Java, Haskell, basics of compilers and parsing algorithms.
===GF interpreter in C#===
The idea is to write a run-time system for GF grammars in C#. This enables
the use of **embedded grammars** in C# applications. This project is
similar to [earlier work http://www.cs.chalmers.se/~bringert/gf/gf-java.html]
on Java, now addressing C# and using the new run-time format PGF.
Requirements: C#, Haskell, basics of compilers and parsing algorithms.
===GF localization library===
This is an idea for a software localization library using GF grammars.
The library should replace strings by grammar rules, which can be thought of
as very smart templates that always guarantee grammatically correct output.
The library should be based on the
[GF Resource Grammar Library http://www.cs.chalmers.se/Cs/Research/Language-technology/GF/lib/resource/doc/synopsis.html], which currently provides
infrastructure for 12 languages.
Requirements: GF, some natural languages, some localization platform
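
The difference between a plain string template and a grammar rule can be sketched as follows. This is a minimal Python illustration, not GF code and not the API of any existing library: the functions ``naive_template`` and ``new_messages`` and the tiny hand-written lexicon are hypothetical stand-ins for rules that the proposed library would obtain from the resource grammars.

```
# Hypothetical mini-lexicon: each language lists the forms needed for
# "new message(s)". A real GF grammar would compute these forms from
# the resource grammar instead of listing them by hand.
LEXICON = {
    "Eng": {"prefix": "You have",
            "Sg": "new message", "Pl": "new messages"},
    "Swe": {"prefix": "Du har",
            "Sg": "nytt meddelande", "Pl": "nya meddelanden"},
}


def naive_template(count):
    """Plain string template: English only, and wrong or ugly for count 1."""
    return "You have %d new message(s)" % count


def new_messages(lang, count):
    """Grammar-style rule: choose the number form that agrees with count."""
    entry = LEXICON[lang]
    number = "Sg" if count == 1 else "Pl"
    return "%s %d %s" % (entry["prefix"], count, entry[number])


if __name__ == "__main__":
    print(naive_template(1))
    for lang in ["Eng", "Swe"]:
        for count in [1, 2]:
            print(new_messages(lang, count))
```

The application code calls one function per message and never concatenates strings itself; agreement (here, number, and in Swedish also the adjective form) is decided inside the rule.
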
===Multilingual grammar applications for mobile phones===
GF grammars can be compiled into programs that can be run on different
platforms, such as web browsers and mobile phones. An example is a
[numeral translator http://www.cs.chalmers.se/Cs/Research/Language-technology/GF/demos/index-numbers.html] running on both these platforms.
The proposed project is rather open: find some cool applications of
the technology that are useful or entertaining for mobile phone users. A
part of the project is to investigate implementation issues such as making
the best use of the phone's resources. Possible applications have
something to do with translation; one suggestion is an SMS editor/translator.
Requirements: GF, JavaScript, some phone application development tools
===Multilingual grammar applications for the web===
This project is rather open: find some cool applications of
the technology that are useful or entertaining on the web. Examples include
- translators: see [demo http://129.16.250.57:41296/translate]
- multilingual wikis: see [demo http://csmisc14.cs.chalmers.se/~meza/restWiki/wiki.cgi]
- fridge magnets: see [demo http://129.16.250.57:41296/fridge]
Requirements: GF, JavaScript or Java and Google Web Toolkit, CGI
===GMail gadget for GF===
It is possible to add custom gadgets to GMail. If you are going to write
e-mail in a foreign language, you will probably need help from a
dictionary, or you may want to check something in the grammar. GF provides
all the resources that you may need, but you have to think about how to
design a gadget that fits well in the GMail environment and what
functionality from GF you want to expose.
Requirements: GF, Google Web Toolkit
==Dissemination and intellectual property==
All code suggested here will be released under the LGPL just like
the current resource grammars and run-time GF libraries,
with the copyright held by the respective authors.
As a rule, the code will be distributed via the GF web site.