purl.org/peter.turney
Analogies and Relational Similarity - Applications
- Definition of Relational Similarity
- When two words have a high degree of attributional similarity, we call them
synonyms. When two pairs of words have a high degree of relational similarity,
we say that their relations are analogous. For example, the word pair mason:stone
is analogous to the pair carpenter:wood.
- Recognizing Word Analogies
- Given a measure of relational similarity, it is possible to automatically
recognize word analogies, and thus solve multiple-choice word analogy problems.
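As a minimal sketch of this idea, the code below answers a multiple-choice analogy question by picking the choice pair whose relation is most similar to the stem pair. The vectors in `PAIR_VECTORS`, the three joining patterns they stand for, and the function names are all assumptions invented for illustration. Cosine over these toy vectors is only a stand-in for a real measure of relational similarity.

```python
import math

# Hypothetical pattern-frequency vectors: invented counts of how often each
# word pair appears with three made-up joining patterns
# ("X cuts Y", "X works with Y", "X eats Y"). Not real corpus data.
PAIR_VECTORS = {
    ("mason", "stone"): [8.0, 9.0, 0.0],
    ("carpenter", "wood"): [9.0, 8.0, 0.0],
    ("teacher", "chalk"): [0.0, 6.0, 0.0],
    ("bird", "seed"): [0.0, 0.0, 9.0],
}


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def relational_similarity(pair_a, pair_b):
    """Toy stand-in for a relational similarity measure."""
    return cosine(PAIR_VECTORS[pair_a], PAIR_VECTORS[pair_b])


def solve_analogy(stem, choices):
    """Return the choice pair most relationally similar to the stem pair."""
    return max(choices, key=lambda choice: relational_similarity(stem, choice))


choices = [("teacher", "chalk"), ("carpenter", "wood"), ("bird", "seed")]
print(solve_analogy(("mason", "stone"), choices))  # ('carpenter', 'wood')
```

Any other relational similarity measure could be substituted for `relational_similarity` without changing `solve_analogy`.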
- Classifying Semantic Relations
- A measure of relational similarity can be used to classify noun-modifier pairs.
The problem is to classify a noun-modifier pair, such as "laser
printer", according to the semantic relation between the head noun (printer) and the
modifier (laser). Classifying semantic relations in noun-modifier pairs can be viewed as a
supervised learning problem. If labeled training data is available (noun-modifier pairs
that have been manually assigned various classes, such as "instrument" and "cause"),
then an unknown pair can be classified according to the labels of its
nearest neighbours in the training set, with a measure of relational similarity
used to identify those neighbours. There are many
interesting relations, such as antonymy, that do not occur in noun-modifier pairs,
but noun-modifier pairs are an interesting application, since they are
very common in English (WordNet 2.0 contains more than 26,000 noun-modifier pairs).
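The nearest-neighbour approach described above can be sketched roughly as follows. The training pairs, their two-dimensional feature vectors, and the names `TRAINING` and `classify` are hypothetical and chosen only for illustration. Cosine similarity stands in for a real measure of relational similarity, and the class is decided by majority vote among the k nearest pairs.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical labeled training data: (noun-modifier pair, vector, class).
# The vectors are invented; in practice they would come from whatever
# representation the relational similarity measure uses.
TRAINING = [
    (("steam", "engine"), [7.0, 1.0], "instrument"),
    (("gas", "stove"), [6.0, 2.0], "instrument"),
    (("flu", "virus"), [1.0, 8.0], "cause"),
    (("storm", "damage"), [2.0, 7.0], "cause"),
]


def classify(vector, training, k=3):
    """Label a pair by majority vote among its k most similar training pairs."""
    ranked = sorted(training, key=lambda item: cosine(vector, item[1]), reverse=True)
    votes = Counter(label for _, _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Invented vector for the unknown pair "laser printer".
print(classify([8.0, 1.0], TRAINING))  # instrument
```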
- Machine Translation
- Noun-modifier pairs are difficult to translate. Machine
translation cannot rely primarily on manually constructed translation dictionaries for
translating noun-modifier pairs, since such dictionaries are necessarily very incomplete.
It should be easier to automatically translate noun-modifier pairs when they are first
classified by their semantic relations. Consider the pair "electron microscope". Is the
semantic relation purpose (a microscope for viewing electrons), instrument (a
microscope that uses electrons), or material (a microscope made of electrons)? The
answer to this question should facilitate translation of the individual words, "microscope"
and "electron", and may also help to determine how the individual words are to be
combined in the target language (what order to put them in, what suffixes to add,
what prepositions to add).
- Word Sense Disambiguation
- Noun-modifier pairs are almost always monosemous. The implicit semantic relation between the
two words in the pair narrowly constrains the possible senses of the words.
The intended sense of a word is determined by its semantic
relations with the other words in the surrounding text. If we can identify the semantic
relations between the given word and its context, then we can disambiguate the given
word. Consider the noun-modifier pair "plant food". In isolation, "plant" could refer to an
industrial plant or a living organism. Once we have determined that the implicit semantic
relation in "plant food" is beneficiary (the plant benefits from the food), as opposed to,
say, location at (the food is located at the plant), the sense of "plant" is constrained to
"living organism".
- Information Extraction
- The standard information extraction task is to identify a specific type
of information, such as the name of a person or a company, and extract that
information from a given document. With a measure of relational similarity, we
can take this one step further, and identify the relations between the
extracted terms. For example, we can automatically recognize that the relation
between the extracted person's name and the extracted company's name is that
the person is the CEO of the company.
- Automatic Thesaurus Generation
- A thesaurus, such as WordNet, links words by relations of synonymy ("big" and "large"),
hyponymy ("oak" and "tree"), antonymy ("black" and "white"), and meronymy ("wheel" and "car").
A measure of relational similarity can be used as a component in a system
for automatically generating a thesaurus. Given examples of any semantic relation,
the measure of relational similarity can be used to extend those examples to new cases.
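One way to picture "extending examples to new cases" is to score each candidate word pair by its average similarity to a few seed pairs and keep the candidates above a threshold. The sketch below assumes a hypothetical table `PATTERNS` of joining patterns per pair and uses Jaccard overlap as a toy stand-in for relational similarity. The pattern sets, the threshold, and the function names are illustrative assumptions.

```python
# Hypothetical sets of joining patterns observed for each word pair (X, Y).
# These are invented for illustration, not drawn from a corpus.
PATTERNS = {
    ("oak", "tree"): {"X is a kind of Y", "Y such as X", "X and other Y"},
    ("sparrow", "bird"): {"X is a kind of Y", "Y such as X"},
    ("rose", "flower"): {"X is a kind of Y", "Y such as X", "X and other Y"},
    ("wheel", "car"): {"X of the Y", "Y has a X"},
    ("black", "white"): {"X or Y", "X versus Y"},
}


def jaccard(a, b):
    """Overlap between two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def extend_relation(seeds, candidates, threshold=0.5):
    """Return (pair, score) for candidates similar enough to the seed pairs."""
    scored = []
    for cand in candidates:
        score = sum(jaccard(PATTERNS[cand], PATTERNS[s]) for s in seeds) / len(seeds)
        if score >= threshold:
            scored.append((cand, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


hyponymy_seeds = [("oak", "tree"), ("sparrow", "bird")]
candidates = [("rose", "flower"), ("wheel", "car"), ("black", "white")]
for pair, score in extend_relation(hyponymy_seeds, candidates):
    print(pair, round(score, 3))
```

With these toy pattern sets, only the pair rose:flower is accepted as a new hyponymy example. The meronymy and antonymy pairs share no patterns with the seeds and are rejected.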
- Information Retrieval
- Current search engines are based on attributional similarity; the
similarity of a query to a document depends on correspondence between the attributes
of the query and the attributes of the documents. Typically the correspondence is
exact matching of words or root words. Latent Semantic Indexing (LSI) allows more
flexible matching, but it is still based on attributional similarity.
If we could reliably classify semantic relations, then we could ask new
kinds of search queries.
- Existing search engines cannot recognize the implicit instrument relation in "laser
printer", so the query "instrument and printing" will miss many relevant documents. A
measure of relational similarity could be used as a component in a supervised learning
system that learns to identify semantic relations between words in documents. These
semantic relations could then be added to the index of a conventional (attributional)
search engine. Alternatively, a search engine could compare a query to a document
using a similarity measure that takes into account both relational similarity and
attributional similarity. A query might be phrased as a word analogy problem.
- Processing Metaphorical Text
- Metaphorical language is very common in our daily life; so common that we
are usually unaware of it. Even technical dialogue, such as computer users asking for
help, is often metaphorical.
Human-computer dialogue systems are currently limited to very simple, literal language.
We believe that the task of mapping metaphorical language to more literal language can
be approached as a kind of word analogy problem.
A measure of relational similarity can be used to solve these kinds of
word analogy problems, and thus facilitate computer processing of metaphorical text.
- Identifying Semantic Roles
- A semantic frame for an event such as judgement contains
semantic roles such as judge, evaluee, and reason, whereas the frame for an event
such as statement contains roles such as speaker, addressee, and message.
The task of identifying semantic roles is to label the parts of a sentence
according to their semantic roles. A measure of relational similarity
can help to identify semantic roles.
- Analogy-Making
- Structure Mapping Theory (SMT), together with its implementation in the Structure
Mapping Engine (SME), is the most influential work on modeling analogy-making. The goal of
computational modeling of analogy-making is to understand how people form complex,
structured analogies. SME takes representations of a source domain and a target
domain, and produces an analogical mapping between the source and target. The
domains are given structured propositional representations, using predicate logic. These
descriptions include attributes, relations, and higher-order relations (expressing relations
between relations). The analogical mapping connects source domain relations to target
domain relations. Each individual connection in an analogical mapping implies that the
connected relations are similar; thus, SMT requires a measure of relational similarity in
order to form mappings. Early versions of SME only mapped identical relations, but later
versions of SME allowed similar, non-identical relations to match.
However, the focus of research in analogy-making has been on the mapping process as
a whole, rather than on measuring the similarity between any two particular relations.
As a result, the similarity measures used in SME at the level of individual connections
are somewhat rudimentary. A more sophisticated measure of relational similarity, such
as Latent Relational Analysis (LRA), may enhance the performance of SME.
Conversely, LRA focuses on the similarity between particular relations
and ignores systematic mapping between sets of relations, so LRA may also be
enhanced by integration with SME.
Updated: February 3, 2007.