Bitext Maps and Alignment

Proteus Project

Department of Computer Science
New York University

A "bitext" consists of two texts that are mutual translations. A bitext map is a fine-grained description of the correspondence relation between elements of the two halves of a bitext. Finding such a map is the first step in building translation models. It is also the first step in applications such as automatic detection of omissions in translations.

Alignments are "watered-down" bitext maps that we can derive from general bitext maps. They are mainly useful for backward compatibility with bitext applications that were developed before we published our methods for producing high-quality general bitext maps.
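
To make the relationship between the two concrete, here is a minimal sketch of how a coarse segment alignment could be derived from a finer-grained bitext map. It is an illustration only, not the GMA algorithm. It assumes that a bitext map is given as a list of (source offset, target offset) character positions, that each half of the bitext has already been split into segments identified by their start offsets, and that the map is roughly monotonic. The names segment_index and map_to_alignment are hypothetical.

```python
from bisect import bisect_right


def segment_index(starts, pos):
    """Return the index of the segment that contains character offset pos.

    starts is a sorted list of segment start offsets, beginning with 0.
    """
    return bisect_right(starts, pos) - 1


def map_to_alignment(points, src_starts, tgt_starts):
    """Collapse a point-level bitext map into aligned segment blocks.

    points: iterable of (source_offset, target_offset) correspondence points
            (an assumed representation of a bitext map).
    Returns a list of (source_segment_indices, target_segment_indices).
    """
    # Project every correspondence point onto the pair of segments it falls in.
    links = sorted(
        {(segment_index(src_starts, x), segment_index(tgt_starts, y))
         for x, y in points}
    )

    blocks = []  # each block is a pair of sets: (source segments, target segments)
    for i, j in links:
        if blocks and (i <= max(blocks[-1][0]) or j <= max(blocks[-1][1])):
            # The link touches or crosses the previous block, so merge into it.
            blocks[-1][0].add(i)
            blocks[-1][1].add(j)
        else:
            # The link lies strictly after the previous block: start a new one.
            blocks.append(({i}, {j}))

    return [(sorted(s), sorted(t)) for s, t in blocks]


# Toy data (invented for illustration): three source segments, two target segments.
src_starts = [0, 20, 45]
tgt_starts = [0, 30]
points = [(5, 4), (18, 22), (25, 33), (50, 58)]

alignment = map_to_alignment(points, src_starts, tgt_starts)
print(alignment)
```

On the toy data, the first source segment is paired with the first target segment, and the second and third source segments are grouped together with the second target segment. The sketch also shows why an alignment is the weaker representation: the individual correspondence points are discarded and only segment-level groupings remain.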

Available Software

Geometric Mapping and Alignment (GMA) of parallel texts

Key Publications