purl.org/peter.turney

Word Sense Disambiguation - Applications

Definition of Word Sense Disambiguation
Many words have multiple senses. In the context of a given document, typically only one or two of the possible senses of a word are appropriate. Automatic word sense disambiguation is the task of determining the intended sense of a word from its surrounding context.
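One simple way to pick a sense from context is to compare the context with a short definition (gloss) of each candidate sense and choose the sense with the largest word overlap, in the spirit of the simplified Lesk approach. The sketch below is only an illustration under stated assumptions: the sense inventory, the glosses, the stopword list, and the function names are all invented here and do not come from any real dictionary or from this page's author.

```python
import re

# Hypothetical mini sense inventory: each sense of a word has a short gloss.
# The glosses are written for illustration only, not taken from a real dictionary.
SENSES = {
    "spirit": {
        "soul": "the mind will or soul of a person as opposed to the body or flesh",
        "alcohol": "a strong distilled alcoholic drink such as whisky or rum",
    },
}

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "an", "of", "or", "as", "to", "is", "but", "such", "and", "in"}


def tokens(text):
    """Lowercase the text and return its set of non-stopword word tokens."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def disambiguate(word, context):
    """Return the sense of `word` whose gloss shares the most tokens with `context`."""
    context_tokens = tokens(context) - {word}
    overlaps = {
        sense: len(tokens(gloss) & context_tokens)
        for sense, gloss in SENSES[word].items()
    }
    # Ties are broken by the order in which senses are listed.
    return max(overlaps, key=overlaps.get)


if __name__ == "__main__":
    print(disambiguate("spirit", "The spirit is willing, but the flesh is weak"))
    print(disambiguate("spirit", "He poured a strong drink, a spirit distilled from grain"))
```

With these toy glosses, the first context matches the "soul" sense through the shared word "flesh", and the second matches the "alcohol" sense. Practical systems use far larger sense inventories and richer context features than raw word overlap.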
Machine Translation
Ambiguity is a major cause of errors in machine translation. A classic illustration of the problem is the machine translation of the phrase "The spirit is willing, but the flesh is weak" from English to another language and then back to English:
  • German: "The spirit is ready, but the flesh is weak."
  • French: "The spirit is laid out, but the flesh is weak."
  • Portuguese: "The spirit is made use, but the meat is weak."
  • Italian: "The spirit is arranged, but the meat is weak person."
  • Spanish: "The alcohol is arranged, but the meat is weak."
The translation errors stem from the ambiguous words "spirit", "willing", and "flesh".
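Information Retrieval
Words with multiple senses are problematic for information retrieval. A query that contains an ambiguous word also retrieves documents that use the word in senses the user did not intend. Word sense disambiguation could make search engines more precise by filtering out those documents.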
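The effect on precision can be sketched with a toy example. Everything here is hypothetical: the four documents, their sense labels, and the function names are invented for illustration, and the sense labels are assumed to be already known (in practice they would have to come from a disambiguation step).

```python
# Hypothetical toy collection: (document text, sense of "spirit" used in it).
DOCS = [
    ("the spirit is willing but the flesh is weak", "soul"),
    ("a spirit distilled from grain and aged in oak", "alcohol"),
    ("team spirit kept the players going", "soul"),
    ("duty on each bottle of spirit rose again", "alcohol"),
]


def search(term, wanted_sense=None):
    """Return documents containing `term`; optionally keep only one sense."""
    hits = [(text, sense) for text, sense in DOCS if term in text.split()]
    if wanted_sense is not None:
        hits = [hit for hit in hits if hit[1] == wanted_sense]
    return hits


def precision(hits, wanted_sense):
    """Fraction of returned documents that use the sense the user wanted."""
    if not hits:
        return 0.0
    relevant = sum(1 for _, sense in hits if sense == wanted_sense)
    return relevant / len(hits)


if __name__ == "__main__":
    # Plain keyword search mixes both senses of "spirit".
    print(precision(search("spirit"), "alcohol"))
    # Sense-aware search keeps only the intended sense.
    print(precision(search("spirit", "alcohol"), "alcohol"))
```

In this toy collection, keyword search for "spirit" returns all four documents, only two of which concern alcohol, so precision is 0.5. Restricting results to the intended sense raises precision to 1.0.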
Information Extraction
The task of information extraction is to find specific information in a given document, such as the name of a commercial product. Ambiguous words can confuse information extraction algorithms.

Updated: February 3, 2007.