Bitext Alignment (Synthesis Lectures on Human Language Technologies) by Jörg Tiedemann

By Jörg Tiedemann

This book provides an overview of various techniques for the alignment of bitexts. It describes general concepts and strategies that can be applied to map corresponding parts in parallel documents on various levels of granularity. Bitexts are valuable linguistic resources for many different research fields and practical applications. The most prominent application is machine translation, in particular statistical machine translation. However, there are various other threads that can be supported by the rich linguistic knowledge implicitly stored in parallel resources. Bitexts have been explored in lexicography, word sense disambiguation, terminology extraction, computer-aided language learning and translation studies, to name just a few. The book covers the essential tasks that have to be carried out when building parallel corpora, starting from the collection of translated documents up to sub-sentential alignments. In particular, it describes various approaches to document alignment, sentence alignment, word alignment and tree structure alignment. It also includes a list of resources and a comprehensive review of the literature on alignment techniques.

Table of Contents: Introduction / Basic Concepts and Terminology / Building Parallel Corpora / Sentence Alignment / Word Alignment / Phrase and Tree Alignment / Concluding Remarks

Similar AI & machine learning books

Computer Vision: A Unified, Biologically-Inspired Approach

This volume provides comprehensive, self-consistent coverage of one approach to computer vision, with many direct or implied links to human vision. The book is the result of many years of research into the limits of human visual performance and the interactions between the observer and his environment.

Mobile Wireless Middleware: Operating Systems and Applications. Second International Conference, Mobilware 2009, Berlin, Germany, April 28-29, 2009

This book constitutes the thoroughly refereed proceedings of the Second International Conference on Mobile Wireless MiddleWARE, Mobilware 2009, held in Berlin, Germany, in April 2009. The 29 revised full papers presented were carefully reviewed and selected from 63 contributions. The papers are organized in topical sections on location and tracking supports and services; location-aware and context-aware mobile support and services.

Language Engineering of Lesser-Studied Languages (Nato Science Series, Series III : Computer and Systems Science-Vol 188)

The subject matter of this book falls into the general area of natural language processing. Special emphasis is given to languages that, for various reasons, have not been the subject of study in this discipline. This book will be of interest to both computer scientists who would like to build language processing systems and linguists interested in learning about natural language processing.

Building Natural Language Generation Systems (Studies in Natural Language Processing)

This book explains how to build Natural Language Generation (NLG) systems: computer software systems that automatically generate understandable texts in English or other human languages. NLG systems use knowledge about language and the application domain to automatically produce documents, reports, explanations, help messages, and other kinds of texts.

Additional resources for Bitext Alignment (Synthesis Lectures on Human Language Technologies)

Example text

However, this is not always the case, and alignment quality as such can be interesting because aligned bitexts are often useful for various applications. For intrinsic alignment evaluation, it is now common to concentrate on the evaluation of individual links in a bitext map using measures derived from information retrieval: precision and recall. Assuming that we have a gold standard of links $L_{\mathrm{gold}}$, we can measure precision and recall for a set of proposed links $L$:

$$\mathrm{Precision} = \frac{|L \cap L_{\mathrm{gold}}|}{|L|} \qquad \mathrm{Recall} = \frac{|L \cap L_{\mathrm{gold}}|}{|L_{\mathrm{gold}}|}$$
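The two measures above can be illustrated with a minimal Python sketch. It assumes that each link is represented as a hashable pair of indices (source unit, target unit); that representation, the function name `precision_recall`, and the toy data are illustrative assumptions, not taken from the book.

```python
def precision_recall(proposed, gold):
    """Return (precision, recall) of proposed links against gold links.

    Both arguments are iterables of hashable links, here assumed to be
    (source_index, target_index) pairs.
    """
    proposed, gold = set(proposed), set(gold)
    correct = len(proposed & gold)  # |L ∩ L_gold|
    # Guard against empty sets; returning 0.0 there is a convention
    # chosen for this sketch, not something the text prescribes.
    precision = correct / len(proposed) if proposed else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Toy example with made-up links.
gold_links = {(0, 0), (1, 1), (2, 3), (3, 2)}
proposed_links = {(0, 0), (1, 1), (2, 2)}
p, r = precision_recall(proposed_links, gold_links)
print(f"precision={p:.3f} recall={r:.3f}")
```

Here two of the three proposed links are in the gold standard, so precision is 2/3, and two of the four gold links are found, so recall is 1/2.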

For this, Munteanu et al. [2004] apply a maximum entropy classifier, much in the spirit of the document alignment approach by Patry and Langlais [2005]. This classifier models a binary decision whether or not two given sentences are parallel according to features extracted from the sentence pair $sp$. The decision is based on a conditional probability distribution $P(c \mid sp)$, which is parametrized as a log-linear combination of weighted real-valued feature functions $f_i(c, sp)$:

$$P(c \mid sp) = \frac{1}{Z(sp)} \exp\left(\sum_i \lambda_i f_i(c, sp)\right)$$

$Z(sp)$ is a normalizing factor, and $\lambda_i$ are the feature weights that need to be optimized in training.
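As a rough illustration of the log-linear form above, the following Python sketch computes $P(c \mid sp)$ for a binary class. The two feature functions (`bias` and `length_ratio`), the weights, and the token-list representation of a sentence pair are hypothetical choices made for this example; they are not the features used by Munteanu et al., and the weights are set by hand rather than trained.

```python
import math


def loglinear_prob(c, sp, feature_functions, weights, classes=(True, False)):
    """P(c | sp) = exp(sum_i lambda_i * f_i(c, sp)) / Z(sp).

    Z(sp) sums the exponentiated scores over all classes, so the
    probabilities of the classes add up to one.
    """
    scores = {
        label: sum(w * f(label, sp) for f, w in zip(feature_functions, weights))
        for label in classes
    }
    # Subtracting the maximum score avoids overflow and leaves the
    # resulting probability unchanged.
    top = max(scores.values())
    z = sum(math.exp(s - top) for s in scores.values())
    return math.exp(scores[c] - top) / z


# Hypothetical feature functions; each fires only for the "parallel" class.
def bias(c, sp):
    return 1.0 if c else 0.0


def length_ratio(c, sp):
    src, tgt = sp
    ratio = min(len(src), len(tgt)) / max(len(src), len(tgt), 1)
    return ratio if c else 0.0


features = [bias, length_ratio]
weights = [-1.0, 2.0]  # made-up values; in practice these are learned
pair = ("das ist ein Test".split(), "this is a test".split())
p_parallel = loglinear_prob(True, pair, features, weights)
p_not_parallel = loglinear_prob(False, pair, features, weights)
print(f"P(parallel | sp) = {p_parallel:.3f}")
```

With all weights set to zero every class receives the same score, so the model falls back to a uniform distribution over the two classes.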
