By Jian-Yun Nie, Graeme Hirst
Search for information is no longer exclusively limited to the native language of the user, but is increasingly extended to other languages. This gives rise to the problem of cross-language information retrieval (CLIR), whose goal is to find relevant information written in a language different from that of the query. In addition to the problems of monolingual information retrieval (IR), translation is the key problem in CLIR: one should translate either the query or the documents from one language to another. However, this translation problem is not identical to full-text machine translation (MT): the goal is not to produce a human-readable translation, but a translation suitable for finding relevant documents. Specific translation methods are thus required. The goal of this book is to provide a comprehensive description of the specific problems arising in CLIR, the solutions proposed in this area, and the remaining problems. The book starts with a general description of the monolingual IR and CLIR problems. Different classes of approaches to translation are then presented: approaches using an MT system, dictionary-based translation, and approaches based on parallel and comparable corpora. In addition, the typical retrieval effectiveness of the different approaches is compared. It will be shown that translation approaches specifically designed for CLIR can rival and outperform high-quality MT systems. Finally, the book offers a look into the future that draws a strong parallel between query expansion in monolingual IR and query translation in CLIR, suggesting that many approaches developed in monolingual IR can be adapted to CLIR. The book can be used as an introduction to CLIR. Advanced readers can also find more technical details and discussions about the remaining research challenges. It is suitable for new researchers who intend to carry out research on CLIR.
Cross-language Information Retrieval (Synthesis Lectures on Human Language Technologies)
Similar AI & machine learning books
This volume provides comprehensive, self-consistent coverage of one approach to computer vision, with many direct or implied links to human vision. The book is the result of many years of research into the limits of human visual performance and the interactions between the observer and his environment.
This book constitutes the thoroughly refereed proceedings of the Second International Conference on Mobile Wireless MiddleWARE, Mobilware 2009, held in Berlin, Germany, in April 2009. The 29 revised full papers presented were carefully reviewed and selected from 63 contributions. The papers are organized in topical sections on location and tracking supports and services; location-aware and context-aware mobile support and services.
The subject matter of this book falls into the general area of natural language processing. Special emphasis is given to languages that, for various reasons, have not been the subject of study in this discipline. This book will be of interest both to computer scientists who would like to build language processing systems and to linguists interested in learning about natural language processing.
This book explains how to build Natural Language Generation (NLG) systems: computer software systems that automatically generate understandable texts in English or other human languages. NLG systems use knowledge about language and the application domain to automatically produce documents, reports, explanations, help messages, and other kinds of texts.
Additional resources for Cross-language Information Retrieval (Synthesis Lectures on Human Language Technologies)
For example, the term “computer game” can be written as two words, 컴퓨터 게임, or as a single word, 컴퓨터게임. Therefore, some segmentation or decompounding is still necessary (Tomlinson, 2004). The three languages have much in common for IR and CLIR. […], 1999; Ogawa and Matsuda, 1999).
3 Other Languages
In Arabic, a letter can change its form according to its position within a word. A root word can be extended by prefixes and postfixes to form other words (of different categories). Vowels are often omitted in writing.
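The two Arabic properties mentioned above (optional vowel marks and attached prefixes) are often handled in IR by normalizing words before indexing. The following is only a minimal sketch under stated assumptions, not the method described in the book: the function names are hypothetical, only the definite article "al-" is treated as a prefix, and the minimum-length threshold is an arbitrary choice.

```python
import re

# Arabic vowel and related marks, fathatan (U+064B) through sukun (U+0652),
# plus the tatweel elongation character (U+0640), which carries no meaning.
_MARKS = re.compile("[\u064B-\u0652\u0640]")

# The definite article "al-" (alif + lam), written attached to the word.
_DEFINITE_ARTICLE = "\u0627\u0644"


def strip_marks(word: str) -> str:
    """Remove optional vowel marks so vowelized and bare spellings match."""
    return _MARKS.sub("", word)


def light_normalize(word: str) -> str:
    """Strip vowel marks, then drop a leading definite article.

    Assumption: the article is removed only if at least three letters
    remain, a crude guard against cutting into the word itself.
    """
    bare = strip_marks(word)
    remainder = bare[len(_DEFINITE_ARTICLE):]
    if bare.startswith(_DEFINITE_ARTICLE) and len(remainder) >= 3:
        return remainder
    return bare


if __name__ == "__main__":
    vowelized = "\u0643\u0650\u062A\u064E\u0627\u0628"      # "kitab" with vowel marks
    with_article = "\u0627\u0644\u0643\u062A\u0627\u0628"   # "al-kitab"
    # Both spellings reduce to the same index term.
    print(light_normalize(vowelized) == light_normalize(with_article))
```

Prefix stripping of this kind is a heuristic: a word whose own letters begin with alif and lam would be wrongly shortened, which is one reason real systems use more careful stemming rules.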
As we can see, most of the operations are at the lexical and syntactic levels, and very limited semantic information is used. This gives rise to the translation ambiguity problem: if the ambiguity cannot be resolved by syntactic information and is not covered by phrase or idiom dictionaries, then there is a high chance that the word will be translated by its default translation. This may lead to wrong translations. We will see several examples in the next section, where we discuss the potential problems of using MT systems for CLIR.
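The default-translation behavior described above can be illustrated with a toy example. This is a sketch under stated assumptions, not the internals of any real MT system: the English-to-French lexicons below are invented for illustration, and `translate_with_defaults` is a hypothetical name.

```python
# Hypothetical toy lexicons, invented for this illustration.
# Each word lists its candidate translations; the first one is the "default".
WORD_DICT = {
    "river": ["rivière", "fleuve"],
    "bank": ["banque", "rive"],
    "erosion": ["érosion"],
}

# Phrase dictionary: multi-word entries that resolve ambiguity when they apply.
PHRASE_DICT = {
    ("river", "bank"): ["rive"],
}


def translate_with_defaults(tokens):
    """Translate tokens, preferring phrase entries, else the default word sense."""
    output = []
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in PHRASE_DICT:
            # The phrase dictionary covers this case: ambiguity is resolved.
            output.extend(PHRASE_DICT[pair])
            i += 2
            continue
        # No phrase entry: fall back to the first (default) translation,
        # or keep the word unchanged if it is not in the dictionary.
        candidates = WORD_DICT.get(tokens[i], [tokens[i]])
        output.append(candidates[0])
        i += 1
    return output


if __name__ == "__main__":
    print(translate_with_defaults(["river", "bank"]))    # phrase entry applies
    print(translate_with_defaults(["bank", "erosion"]))  # default sense is used
```

In the second call no phrase entry applies, so "bank" receives its default financial sense even though the query is about a riverbank. This is the kind of wrong translation the paragraph warns about.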
Until the reports on the earthquake in 2008, many would not have known that 汶川 is the name of a place. Even if one could guess this from the context in which it is used, one could not expect it to be included in a bilingual dictionary or know how to translate it into other languages. Many proper names are in the same situation. In addition, new terms can be created more easily in Chinese because each Chinese character carries some meaning, and a new combination of characters can often be meaningful, too.