purl.org/peter.turney

Lexical Cohesion - Applications

Definition of Lexical Cohesion
A group of words is lexically cohesive when all of the words are semantically related; for example, when they all concern the same topic.
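One way to make this definition concrete is to score a group of words by the average relatedness of every pair in the group. The sketch below is only an illustration under stated assumptions: the `TOY_VECTORS` table, the `cosine` helper, and the `cohesion` function are hypothetical names, and the two-dimensional vectors are made up for the example. A real measure would derive relatedness from a corpus or a thesaurus.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy word vectors, invented for illustration only.
TOY_VECTORS = {
    "bank": (1.0, 0.0),
    "loan": (0.9, 0.1),
    "interest": (0.8, 0.2),
    "banana": (0.0, 1.0),
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cohesion(words, vectors=TOY_VECTORS):
    """Mean pairwise similarity of the known words; 0.0 with fewer than two."""
    known = [w for w in words if w in vectors]
    pairs = list(combinations(known, 2))
    if not pairs:
        return 0.0
    return sum(cosine(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)


on_topic = cohesion(["bank", "loan", "interest"])
mixed = cohesion(["bank", "loan", "banana"])
```

With these toy vectors, the group about banking scores higher than the group that includes "banana", which matches the intuition that the second group does not concern a single topic.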
Cohesive Extractive Summarization
A rudimentary summary of a document can be created by extracting the most important sentences from the document, where the importance of a sentence is measured by the presence of keyphrases. Such summaries often contain outliers: sentences that do not fit with the other extracted sentences. A measure of lexical cohesion can be used to detect and remove these outliers, thereby improving the quality of the summary.
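The outlier-removal step can be sketched as follows. This is a minimal illustration, not a description of any particular system: it assumes that word overlap (a Jaccard score between one sentence and the rest of the summary) is an acceptable stand-in for semantic relatedness, and the names `content_words`, `remove_outliers`, the stopword list, and the `threshold` value are all hypothetical.

```python
import re

# A tiny illustrative stopword list (an assumption, not a standard list).
STOPWORDS = frozenset({"the", "a", "of", "and", "is", "in", "to"})


def content_words(sentence):
    """Lowercased alphabetic tokens of a sentence, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", sentence.lower())
            if w not in STOPWORDS}


def remove_outliers(sentences, threshold=0.1):
    """Keep sentences whose word overlap with the others reaches threshold."""
    bags = [content_words(s) for s in sentences]
    kept = []
    for i, sentence in enumerate(sentences):
        # Pool the words of every other extracted sentence.
        rest = set().union(*(b for j, b in enumerate(bags) if j != i))
        union = bags[i] | rest
        score = len(bags[i] & rest) / len(union) if union else 0.0
        if score >= threshold:
            kept.append(sentence)
    return kept


summary = [
    "The bank raised interest rates.",
    "Higher interest rates slow bank lending.",
    "Bananas grow in tropical climates.",
]
cleaned = remove_outliers(summary)
```

In this example the sentence about bananas shares no content words with the other two and is dropped. Plain word overlap misses related words that are not identical, so a measure based on semantic relatedness would be a better fit for the task described above.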
Anaphora Resolution
A typical document first introduces an entity, such as a company, by giving its full name. Later in the document, the entity will be mentioned more briefly, using phrases like "the company" or simply "it". Anaphora resolution is the task of recognizing that these shorter phrases refer to the same entity as the full name. One approach to anaphora resolution involves building a chain of lexically cohesive terms, connecting sentences in the given document that discuss the same entity.
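The chain-building idea can be sketched with a greedy pass over the sentences. The code below is an assumption-laden illustration: the `RELATED` table of related word pairs, the `related` and `build_chain` functions, and the example sentences are all hypothetical, and a real system would obtain relatedness from a thesaurus or corpus statistics. Pronouns such as "it" are not handled here.

```python
import re

# Hypothetical hand-written table of related word pairs, for illustration.
RELATED = {
    frozenset({"acme", "company"}),
    frozenset({"company", "firm"}),
}


def related(a, b):
    """True if the two words are identical or listed as a related pair."""
    return a == b or frozenset({a, b}) in RELATED


def build_chain(sentences, seed):
    """Greedily link sentences containing a word related to the chain so far.

    Returns the indices of the linked sentences and the terms in the chain.
    """
    chain_terms = {seed}
    linked = []
    for idx, sentence in enumerate(sentences):
        words = re.findall(r"[a-z]+", sentence.lower())
        hits = {w for w in words
                if any(related(w, t) for t in chain_terms)}
        if hits:
            chain_terms |= hits
            linked.append(idx)
    return linked, chain_terms


document = [
    "Acme reported record profits.",
    "The weather was mild.",
    "The company plans to expand.",
    "Analysts praised the firm.",
]
linked, terms = build_chain(document, "acme")
```

Starting from the seed term "acme", the chain picks up "company" and then "firm", linking the first, third, and fourth sentences while skipping the unrelated sentence about the weather.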
Improved Speech Recognition
A measure of lexical cohesion can be used to recognize when speech recognition software has made errors. The incorrect words usually do not cohere with the rest of the text.
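A minimal sketch of this kind of error detection is to flag each word that is related to too few of the other words in the recognized text. Everything named here is an assumption made for illustration: the `TOPICS` table that assigns hand-picked topic labels to words, the `flag_incoherent` function, and the `threshold` value. A practical system would use a learned or corpus-based relatedness measure instead.

```python
# Hypothetical topic labels for a handful of words, for illustration only.
TOPICS = {
    "speech": {"audio"},
    "recognition": {"audio"},
    "microphone": {"audio"},
    "audio": {"audio"},
    "beach": {"seaside"},
}


def flag_incoherent(words, topics=TOPICS, threshold=0.5):
    """Return words that share a topic with under `threshold` of the others."""
    flagged = []
    for i, word in enumerate(words):
        others = [o for j, o in enumerate(words) if j != i]
        if not others:
            continue
        # Count the other words that share at least one topic label.
        shared = sum(
            1 for o in others
            if topics.get(word, set()) & topics.get(o, set())
        )
        if shared / len(others) < threshold:
            flagged.append(word)
    return flagged


transcript = ["speech", "recognition", "beach", "microphone", "audio"]
suspects = flag_incoherent(transcript)
```

Here "beach" shares no topic with the other recognized words and is flagged as a likely error, while the remaining words support one another. The same check applies to any text where errors appear as words that do not fit their context.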
Improved Optical Character Recognition
Errors in optical character recognition can also be detected by their lack of lexical cohesion.
Improved Machine Translation
Errors in machine translation also tend to lack cohesion with the surrounding text, although they may be more cohesive than errors in speech recognition and optical character recognition.

Updated: February 3, 2007.