Robert L. Goldstone, Ying Feng, and Brian J. Rogosky

Consider two individuals, John and Mary, who each possess a number of concepts. How can we determine that John and Mary both have a concept of, say, Horse? John and Mary may not have exactly the same knowledge of horses, but it is important to be able to place their horse concepts into correspondence with one another, if only so that we can say things like, "Mary's concept of horse is much more sophisticated than John's." Concepts should be public in the sense that they can be possessed by more than one person (Fodor, 1998; Fodor & Lepore, 1992), and for this to be possible, we must be able to determine correspondences, or translations, between two individuals' concepts.

There have been two major approaches in cognitive science to conceptual meaning that could potentially provide a solution to finding translations between conceptual systems. According to an "external grounding" account, concepts' meanings depend on their connection to the external world (this account is more thoroughly defined in the next section). By this account, the concept Horse means what it does because our perceptual apparatus can identify features that characterize horses. According to what we will call a "conceptual web" account, concepts' meanings depend on their connections to each other. By this account, Horse's meaning depends on Gallop, Domesticated, and Quadruped, and in turn, these concepts depend on other concepts, including Horse (Quine & Ullian, 1970).

In this chapter, we will first present a brief tour of some of the main proponents of conceptual web and external grounding accounts of conceptual meaning. Then, we will describe a computer algorithm that translates between conceptual systems. The initial goal of this computational work is to show how translating across systems is possible using only within-system relations, as is predicted by a conceptual web account. However, the subsequent goal is to show how the synthesis of external and internal information can dramatically improve translation. This work suggests that the external grounding and conceptual web accounts should not be viewed as competitors, but rather, that these two sources of information strengthen one another. In the final section of the chapter, we will present applications of the developed ABSURDIST algorithm to object recognition, large corpora translation, analogical reasoning, and statistical scaling.

In this chapter, we will be primarily interested in translating between conceptual systems, but many of our examples of translation will involve words. Concepts are not always equivalent to word meanings. For one thing, we can have concepts of things for which we do not have words, such as the cylindrical plastic sheath at the tips of shoelaces. Although there may not be a word for every concept we possess, behind every word there is a conceptual structure. Accordingly, when we talk about a concept of Horse, we are referring to the conceptual structure that supports people's use of the word "Horse" as well as their ability to recognize horses, predict the behavior of horses, and interact appropriately with horses.

grounded concepts

For a concept to be externally grounded means that, in one way or another, its meaning is based on its connection to the world. There are several ways for this to occur. First, aspects of the concept may come to us via a perceptual system. Our concepts of Red, Fast, and Loud all have clear perceptual components, but most, if not all (Barsalou, 1999), other concepts do as well. Second, a concept may be tied to objects in the world by deixis - by linguistically or physically pointing in a context. When a parent teaches a child the concept Horse by pointing out examples, this provides contextualized grounding for the child's emerging concept. A third way, which we will not address in our work, is that meaning may be tied directly to the external world without being mediated through the senses. Putnam's (1973) famous "twin earth" thought experiment is designed to show how the same internal, mental content can be associated with two different external referents. Putnam has us imagine a world, twin earth, that is exactly like our earth except that the compound we call water (H2O) has a different chemical composition (XYZ), while still looking, feeling, and acting like water as we on real earth know it. Two molecule-for-molecule identical individuals, one on earth and one on twin earth, would presumably have the same internal mental state when thinking "water is wet," and yet, Putnam argues, they mean something different. One means stuff that is actually, whether they know it or not, made up of H2O, while the other means stuff that is made up of XYZ. Putnam concludes that what is meant by a term is not determined solely by mental states, but rather depends upon the external world as well.

The rest of the chapters in this book give excellent grounds for believing that our concepts are not amodal and abstract symbolic representations, but rather are grounded in the external world via our perceptual systems. Lawrence Barsalou has presented a particularly influential and well-developed version of this account in the form of Perceptual Symbols Theory (Barsalou, 1999). By this account, conceptual knowledge involves activating brain areas dedicated to perceptual processing. When a concept is brought to mind, sensory-motor areas are reactivated to implement perceptual symbols. Even abstract concepts, such as truth and negation, are grounded in complex perceptual simulations of combined physical and introspective events. Several lines of empirical evidence are consistent with a perceptually grounded conceptual system. Detailed perceptual information is represented in concepts and this information is used when reasoning about those concepts (Barsalou et al., 2003). Concepts that are similar to one another give rise to similar patterns of brain activity, and a considerable amount of this activity is found in regions associated with perceptual processing (Simmons & Barsalou, 2003). When words are heard or seen, they spontaneously give rise to eye movements and perceptual images that would normally be evoked by the physical event designated by the words (Richardson, Spivey, Barsalou, & McRae, 2003; Stanfield & Zwaan, 2001). Switching from one modality to another during perceptual processing incurs a processing cost. The same cost is exhibited during the verification of concept properties, consistent with the notion that perceptual simulation underlies even verbal conceptual processing (Pecher, Zeelenberg, & Barsalou, 2003).

Much of the recent work on perceptual and embodied accounts of concepts has involved verbal stimuli such as words, sentences, and stories. The success of grounded accounts of language is noteworthy and surprising because of its opposition to the standard conception of language as purely arbitrary and symbolic. An acknowledgment of the perceptual grounding of language has led to empirically validated computational models of language (Regier, 1996; Regier & Carlson, 2001). It has also provided insightful accounts of metaphors for understanding abstract notions such as time (Boroditsky, 2000; Boroditsky & Ramscar, 2002) and mathematics (Lakoff & Nunez, 2000). There has been a recent torrent of empirical results that are inconsistent with the idea that language comprehension is based on concepts that are symbols connected only to each other. Instead, the data support an embodied theory of meaning that relates the meaning of sentences to human perception and action (Glenberg & Kaschak, 2002; Glenberg & Robertson, 2000; Zwaan, Stanfield, & Yaxley, 2002).

Consistent with Barsalou's Perceptual Symbols Theory, other research has tried to unify the typically disconnected literatures on low-level perceptual processing and high-level cognition. Goldstone and Barsalou (1998) argue for strong parallels between processes traditionally considered to be perceptual on the one hand and conceptual on the other, and that perceptual processes are co-opted by abstract conceptual thought. Other research indicates bidirectional influences between our concepts and perceptions (Goldstone, 2003; Goldstone, Lippa, & Shiffrin, 2001; Schyns, Goldstone, & Thibaut, 1998). Like a good pair of Birkenstock sandals that provides support by flexibly conforming to the foot, perception supports our concepts by conforming to them. Perceptual learning results in perceptual and conceptual systems that are highly related. Taken together, this work suggests that apparently high-level conceptual knowledge and low-level perception may be more closely related than traditionally thought.

The case for grounding conceptual understanding in perception has a long philosophical history. As part of the British empiricist movement, David Hume (1740/1973) argued that our conceptual ideas originate in recombinations of sensory impressions. John Locke (1690) believed that our concepts ("ideas") have their origin either in our sense organs or in an internal sense of reflection. He argued further that our original ideas are derived from sensations (e.g., yellow, white, heat, cold, soft, and hard), and that the remaining ideas are derived from or depend upon these original ideas. The philosophical empiricist movement has been reinvigorated by Jesse Prinz (2002), who argues for sensory information as the ultimate ground for our concepts and beliefs. Stevan Harnad has similarly argued that concepts must be somehow connected to the external world, and that this external connection establishes at least part of the meaning of the concept. In his article "The symbol grounding problem," Harnad (1990) considers the following thought experiment: "Suppose you had to learn Chinese as a first language and the only source of information you had was a Chinese/Chinese dictionary. [...] How can you ever get off the symbol/symbol merry-go-round? How is symbol meaning to be grounded in something other than just more meaningless symbols? This is the symbol grounding problem" (pp. 339-340).

conceptual webs

In stark antithesis to Harnad's thought experiment, artificial intelligence researchers have argued that conceptual meaning can come from dense patterns of relations between symbols even if the symbols have no causal connections to the external world. Lenat and Feigenbaum (1991) claim that "The problem of 'genuine semantics'... gets easier, not harder, as the K[nowledge] B[ase] grows. In the case of an enormous KB, such as CYC's, for example, we could rename all of the frames and predicates as G001, G002,..., and - using our knowledge of the world - reconstruct what each of their names must be" (p. 236). This claim is in direct opposition to Harnad's image of the symbol-symbol merry-go-round, and may seem ungrounded in several senses of the term. Still, depending on the power of intrasystem relations has been a mainstay of artificial intelligence, linguistics, and psychology for decades.

In semantic networks, concepts are represented by nodes in a network, and gain their functionality through their links to other concept nodes (Collins & Loftus, 1975; Quillian, 1967). Oftentimes, these links are labeled, in which case different links refer to different kinds of relations between nodes. Dog would be connected to Animal by an Is-a link, to Bone by an Eats link, and to Paw by a Has-a link. These networks assume a conceptual web account of meaning because the networks' nodes are typically only connected to each other, rather than to an external world or perceptual systems.
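As a minimal illustration of this kind of representation (our own sketch, not code from the chapter, with an invented miniature network), a labeled semantic network can be captured by a simple adjacency structure in which a node's functionality comes entirely from its labeled links to other nodes:

```python
# Toy labeled semantic network: concepts are nodes, and meaning is carried
# entirely by labeled links to other concept nodes (no external grounding).
semantic_network = {
    "Dog": [("Is-a", "Animal"), ("Eats", "Bone"), ("Has-a", "Paw")],
    "Animal": [("Is-a", "Living thing")],
}

def related(concept, relation, network=semantic_network):
    """Return the concepts linked to `concept` by the given relation label."""
    return [target for label, target in network.get(concept, []) if label == relation]

print(related("Dog", "Is-a"))  # ['Animal']
```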

A computational approach to word meaning that has received considerable recent attention has been to base word meanings solely on the patterns of co-occurrence between a large number of words in an extremely large text corpus (Burgess, Livesay, & Lund, 1998; Burgess & Lund, 2000; Landauer & Dumais, 1997). Mathematical techniques are used to create vector encodings of words that efficiently capture their co-occurrences. If two words, such as "cocoon" and "butterfly," frequently co-occur in an encyclopedia or enter into similar patterns of co-occurrence with other words, then their vector representations will be highly similar. The meaning of a word, its vector in a high-dimensional space, is completely based on the contextual similarity of words to other words.
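To make the co-occurrence idea concrete, the sketch below (our own toy example, not the actual HAL or LSA machinery, using an invented miniature corpus) builds word vectors from counts of nearby words and compares them with cosine similarity:

```python
import numpy as np

# Toy corpus; real systems use millions of words of text.
corpus = ("the caterpillar spins a cocoon and later the butterfly "
          "emerges from the cocoon").split()
vocab = sorted(set(corpus))
index = {w: i for i, w in enumerate(vocab)}
window = 2  # treat words within two positions of each other as co-occurring

counts = np.zeros((len(vocab), len(vocab)))
for i, word in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if i != j:
            counts[index[word], index[corpus[j]]] += 1

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

# Words with similar co-occurrence patterns receive similar vectors.
print(cosine(counts[index["cocoon"]], counts[index["butterfly"]]))
```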

The traditional notion of concepts in linguistic theories is based upon conceptual webs. Ferdinand de Saussure (1915/1959) argued that all concepts are completely "negatively defined," that is, defined solely in terms of other concepts. He contended that "language is a system of interdependent terms in which the value of each term results solely from the simultaneous presence of the others" (p. 114) and that "concepts are purely differential and defined not in terms of their positive content but negatively by their relations with other terms in the system" (p. 117). By this account, the meaning of Mutton is defined in terms of other neighboring concepts. Mutton's use does not extend to cover sheep that are living because there is another lexicalized concept to cover living sheep (Sheep), and Mutton does not extend to cover cooked pig because of the presence of Pork. Under this notion of interrelated concepts, concepts compete for the right to control particular regions of a conceptual space (see also Goldstone, 1996; Goldstone, Steyvers, & Rogosky, 2003). If the word Mutton did not exist, then "all its content would go to its competitors" (Saussure, 1915/1959, p. 116).

According to the conceptual role semantics theory in philosophy, the meaning of a concept is given by its role within its containing system (Block, 1986, 1999; Field, 1977; Rapaport, 2002). A conceptual belief, for example, that dogs bark, is identified by its unique causal role in the mental economy of the organism in which it is contained. A system containing only a single concept is not possible (Stich, 1983). A common inference from this view is that concepts that belong to substantially different systems must have different meanings. This inference, called "translation holism" by Fodor and Lepore (1992), entails that a person cannot have the same concept as another person unless the rest of their conceptual systems are at least highly similar. This view has had perhaps the most impact in the philosophy of science, where Kuhn's incommensurability thesis states that there can be no translation between the concepts of scientists who are committed to fundamentally different ontologies (Kuhn, 1962). A chemist indoctrinated into Lavoisier's theory of oxygen cannot translate any of their concepts into earlier chemists' concept of phlogiston. A more recent chemist can only entertain the earlier phlogiston concept by absorbing the entire pre-Lavoisier theory, not by trying to insert the single phlogiston concept into their more recent theory or by finding an equivalent concept in their theory. A concept can only be understood if an entire system of interrelated concepts is also acquired.

translating between conceptual systems

We will not directly tackle the general question of whether concepts gain their meaning from their connections to each other, or from their connection to the external world. In fact, our eventual claim will be that this is a false dichotomy, and that concepts gain their meaning from both sources. Our destination will be a synergistic integration of conceptual web and external grounding accounts of conceptual meaning. On the road to this destination, we will first argue for the sufficiency of the conceptual web account for conceptual translation. Then, we will show how the conceptual web account can be supplemented by external grounding to establish meanings more successfully than either method could by itself.

Our point of departure for exploring conceptual meaning will be a highly idealized and purposefully simplified version of a conceptual translation task. The existence of translation across different people's conceptual systems, for example between John and Mary's Horse concepts, has been taken as a challenge to conceptual web accounts of meaning. Fodor and Lepore (1992) have argued that if a concept's meaning depends on its role within its larger conceptual system, and if there are some differences between Mary's and John's systems, then the meanings of Mary's and John's concepts would necessarily be different. A natural way to try to salvage the conceptual web account is to argue that determining corresponding concepts across systems does not require the systems to be identical, but only similar. However, Fodor (1998) insists that the notion of similarity is not adequate to establish that Mary and John both possess a concept of Horse. Fodor argues that "saying what it is for concepts to have similar, but not identical contents presupposes a prior notion of beliefs with similar but not identical concepts" (p. 32). In opposition to this, we will argue that conceptual translation can proceed using only the notion of similarity, not identity, between concepts. Furthermore, the similarities between Mary's and John's concepts can be determined using only relations between concepts within each person's head.

We will present a simple neural network called ABSURDIST (Aligning Between Systems Using Relations Derived Inside Systems Themselves) that finds conceptual correspondences across two systems (two people, two time slices of one person, two scientific theories, two cultures, two developmental age groups, two language communities, etc.) using only interconceptual similarities, not conceptual identities, as input. Laakso and Cottrell (1998, 2000) describe another neural network model that uses similarity relations within two systems to compare the similarity of the systems, and Larkey and Love (2003) describe a closely related connectionist algorithm for aligning between graphs. ABSURDIST belongs to the general class of computer algorithms that solve graph matching problems. It takes as input two systems of concepts in which every concept of a system is defined exclusively in terms of its dissimilarities to other concepts in the same system. ABSURDIST produces as output a set of correspondences indicating which concepts from System A correspond to which concepts from System B. These correspondences serve as the basis for understanding how the systems can communicate with each other without the assumption made by Fodor (1998) that the two systems have exactly the same concepts. Fodor argues that any account of concepts should explain their "publicity" - the notion that the same concept can be possessed by more than one person. Instead, we will advocate a notion of "correspondence." An account of concepts should explain how concepts possessed by different people can correspond to one another, even if the concepts do not have exactly the same content. The notion of corresponding concepts is less restrictive than the notion of identical concepts, but is still sufficient to explain how people can share a conversational ground, and how a single person's concepts can persist across time despite changes in the person's knowledge. While less restrictive than the notion of concept identity, the notion of correspondence is stronger than the notion of concept similarity. John's Horse concept may be similar to Mary's Donkey concept, but the two do not correspond, because John's Horse concept is even more similar to Mary's Horse concept in terms of its role within its conceptual system. Two concepts correspond to each other if they play equivalent roles within their systems, and ABSURDIST provides a formal method for determining equivalence of roles.

A few disclaimers are in order before we describe the algorithm. First, ABSURDIST finds corresponding concepts across individuals, but does not connect these concepts to the external world. The algorithm can reveal that Mary's Horse concept corresponds to John's Horse concept, but the basic algorithm does not reveal what in the external world corresponds to these concepts. However, an interesting extension of ABSURDIST would be to find correspondences between concepts within an internal system and physically measurable elements of an external system. Still, as it stands, ABSURDIST falls significantly short of an account of conceptual meanings. The intention of the model is simply to show how one task related to conceptual meaning, finding corresponding concepts across two systems, can be solved using only within-system similarities between concepts. It is relevant to the general issue of conceptual meaning given the arguments in the literature (e.g., Fodor, 1998) that this kind of within-system similarity is insufficient to identify cross-system matching concepts.

Second, our initial intention is not to create a rich or realistic model of translation across systems. In fact, our intention is to explore the simplest, most impoverished representation of concepts and their interrelations that is possible. If such a representation suffices to determine cross-system translations, then richer representations would presumably fare even better. To this end, we will not represent concepts as structured lists of dimension values, features, or attribute/value frames, and we will not consider different kinds of relations between concepts such as Is-a, Has-a, Part-of, Used-for, or Causes. Concepts are simply elements that are related to other concepts within their system by a single, generic similarity relation. The specific input that ABSURDIST takes will be two two-dimensional proximity matrices, one for each system. Each matrix indicates the similarity of every concept within a system to every other concept in the system. While an individual's concepts certainly relate to each other in many ways (Medin, Goldstone, & Gentner, 1993), our present point is that even if the only relation between concepts in a system were generic similarity, this would suffice to find translations of concepts across different systems. In the final section, we will describe an extension of ABSURDIST to more structured conceptual systems.
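To illustrate the kind of input just described (a sketch under our own assumptions, with made-up coordinates), each system can be reduced to a matrix of within-system Euclidean distances between its concepts:

```python
import numpy as np

def distance_matrix(points):
    """Pairwise Euclidean distances between the rows of `points`."""
    diffs = points[:, None, :] - points[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=-1))

# Hypothetical concept coordinates for System A; System B is a slightly
# perturbed copy, mimicking two people whose concepts are similar but not identical.
system_a = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 0.0]])
system_b = system_a + np.random.normal(0.0, 0.1, system_a.shape)

dist_a = distance_matrix(system_a)  # ABSURDIST only ever sees these two
dist_b = distance_matrix(system_b)  # proximity matrices, not the coordinates.
```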

A third disclaimer is that ABSURDIST is not primarily being put forward as a model of how people actually communicate and understand one another. ABSURDIST finds correspondences between concepts across systems, and would not typically be housed in any one of the systems. Unless Mary knows all of the distances between John's concepts, she could not apply ABSURDIST to find translations between John's concepts and her own. If the primary interpretation of ABSURDIST is not as a computational model of a single human's cognition, then what is it? It is an algorithm that demonstrates the information that is available for finding translations between systems. It is an example of a hitherto underrepresented class of algorithms in cognitive science - computational ecological psychology. The ecological movement in perception (Gibson, 1979) is concerned with identifying external properties of things in the world that are available to be picked up by people. Although it is an approach in psychology, it is just as concerned with examining physical properties as it is with minds. Similarly, ABSURDIST is concerned with the sufficiency of the information that is available across systems for translating between the systems. Traditional ecological psychology proceeds by expressing mathematical relations between physical properties. However, in the present case, a computational algorithm is necessary to determine the information that is available in the observed systems. Thus, the argument will be that even systems with strictly internal relations among their parts possess the information necessary for an observer to translate between them. However, unlike a standard interpretation of claims for "direct perception," an observer using ABSURDIST would perform a time-extended computation in order to successfully recover these translations.

absurdist

ABSURDIST is a constraint satisfaction neural network for translating between conceptual systems. Unlike many neural networks, it does not learn, but rather only passes activation between units. Each of the units in the network represents a hypothesis that two concepts from different systems correspond to one another. With processing, a single set of units will tend to become highly active and all other units will become completely deactivated. The set of units that eventually becomes active will typically represent a consistent translation from one system to the other.

Elements $A_1, \ldots, A_m$ belong to System A, while elements $B_1, \ldots, B_n$ belong to System B. $C_t(A_q, B_x)$ is the activation, at time $t$, of the unit that represents the correspondence between the $q$th element of A and the $x$th element of B. There will be $m \times n$ correspondence units, one for each possible pair of corresponding elements between A and B. In the current example, every element represents one concept in a system. The activation of a correspondence unit is bounded between 0 and 1, with a value of 1 indicating a strong correspondence between the associated elements, and a value of 0 indicating strong evidence that the elements do not correspond. Correspondence units dynamically evolve over time by the equation

$$C_{t+1}(A_q, B_x) = C_t(A_q, B_x) + \begin{cases} N_t(A_q, B_x)\,\bigl(1 - C_t(A_q, B_x)\bigr) & \text{if } N_t(A_q, B_x) \geq 0 \\ N_t(A_q, B_x)\,C_t(A_q, B_x) & \text{otherwise.} \end{cases} \tag{1}$$

If $N_t(A_q, B_x)$, the net input to the unit that links the $q$th element of A and the $x$th element of B, is positive, then the unit's activation will increase as a function of the net input, passed through a squashing function that limits activation to an upper bound of 1. If the net input is negative, then activations are limited by a lower bound of 0. The net input is defined as

$$N_t(A_q, B_x) = \alpha E_t(A_q, B_x) + \beta R(A_q, B_x) - (1 - \alpha - \beta)\, I_t(A_q, B_x), \tag{2}$$

where the $E$ term is the external similarity between $A_q$ and $B_x$, $R$ is their internal similarity, and $I$ is the inhibition to placing $A_q$ and $B_x$ into correspondence that is supplied by other developing correspondence units.

Figure 12.1. An example of the input to ABSURDIST. Two systems, A and B, are each represented solely in terms of the distances/dissimilarities between elements within a system. The correct output from ABSURDIST would be a cross-system translation in which element q was placed in correspondence with x, r with y, and s with z. Arcs are labeled with the distances between the elements connected by the arcs.

When $\alpha = 0$, correspondences between A and B will be based solely on the similarities among the elements within a system, as proposed by a conceptual web account.
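In code, the net input and the squashing update of Equations (1) and (2) can be sketched as follows (our own rendering; the default parameter values are placeholders, and the E, R, and I terms are assumed to have been computed elsewhere):

```python
def net_input(E, R, I, alpha=0.0, beta=0.5):
    """Equation (2): weighted mix of external similarity, internal similarity,
    and inhibition from competing correspondence units."""
    return alpha * E + beta * R - (1 - alpha - beta) * I

def update_activation(C, N):
    """Equation (1): positive net input pushes the activation toward 1,
    negative net input pushes it toward 0."""
    if N >= 0:
        return C + N * (1 - C)
    return C + N * C
```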

The amount of excitation to a unit based on within-domain relations is given by

$$R(A_q, B_x) = \frac{\displaystyle\sum_{\substack{r=1 \\ r \neq q}}^{m} \; \sum_{\substack{y=1 \\ y \neq x}}^{n} S\bigl(D(A_q, A_r), D(B_x, B_y)\bigr)\, C_t(A_r, B_y)}{\min(m - 1,\; n - 1)}, \tag{3}$$

where $D(A_q, A_r)$ is the psychological distance between elements $A_q$ and $A_r$ in System A, and $S(F, G)$ is the similarity between distances $F$ and $G$, defined as $S(F, G) = e^{-|F-G|}$. The amount of inhibition is given by

$$I_t(A_q, B_x) = \sum_{\substack{y=1 \\ y \neq x}}^{n} C_t(A_q, B_y) + \sum_{\substack{r=1 \\ r \neq q}}^{m} C_t(A_r, B_x). \tag{4}$$

These equations instantiate a fairly standard constraint satisfaction network, with one twist. According to the equation for $R$, elements $A_q$ and $B_x$ will tend to be placed into correspondence to the extent that they enter into similar similarity relations with other elements. For example, in Figure 12.1, $A_q$ has a distance of 7 to one element ($A_r$) and a distance of 9 to another element ($A_s$) within its System A. These are similar to the distances that $B_x$ has to the other elements in System B, and accordingly there should be a tendency to place $A_q$ in correspondence with $B_x$. Some similarity relations should count much more than others. The similarity between $D(A_q, A_r)$ and $D(B_x, B_y)$ should matter more than the similarity between $D(A_q, A_r)$ and $D(B_x, B_z)$ in terms of strengthening the correspondence between $A_q$ and $B_x$, because $A_r$ corresponds to $B_y$, not to $B_z$. This is achieved by weighting the similarity between two distances by the strength of the unit that aligns the elements placed in correspondence by those distances. As the network begins to place $A_r$ into correspondence with $B_y$, the similarity between $D(A_q, A_r)$ and $D(B_x, B_y)$ becomes emphasized as a basis for placing $A_q$ into correspondence with $B_x$. As such, the equation for $R$ represents the sum of the supporting evidence (the consistent correspondences), with each piece of support weighted by its relevance (given by the similarity term). This sum is normalized by dividing it by the minimum of $(m - 1)$ and $(n - 1)$. This minimum is the number of terms that will contribute to the $R$ term if only one-to-one correspondences exist between systems.
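The following sketch (our reading of the R equation, with indexing conventions of our own) computes the within-system excitation for the unit linking A_q and B_x, weighting each distance-similarity by the activation of the correspondence unit it is consistent with:

```python
import numpy as np

def excitation(q, x, dist_a, dist_b, C):
    """R term: evidence that A_q and B_x play similar roles in their systems."""
    m, n = dist_a.shape[0], dist_b.shape[0]
    total = 0.0
    for r in range(m):
        if r == q:
            continue
        for y in range(n):
            if y == x:
                continue
            sim = np.exp(-abs(dist_a[q, r] - dist_b[x, y]))  # S(F, G) = e^(-|F - G|)
            total += sim * C[r, y]  # weight by the strength of the (A_r, B_y) unit
    return total / (min(m, n) - 1)  # normalize by min(m - 1, n - 1)
```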

The inhibitory term $I$ is based on a one-to-one mapping constraint (Falkenhainer et al., 1989; Holyoak & Thagard, 1989). The unit that places $A_q$ into correspondence with $B_x$ will tend to become deactivated if other strongly activated units place $A_q$ into correspondence with other elements from B, or $B_x$ into correspondence with other elements from A.
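A corresponding sketch of the inhibition term appears below. The text does not spell out how (or whether) this term is normalized, so dividing by the number of competing units is our own assumption, made only to keep I on a scale comparable to R:

```python
def inhibition(q, x, C):
    """I term: competition from units that map A_q or B_x to anything else."""
    m, n = C.shape
    row_competitors = C[q, :].sum() - C[q, x]  # A_q mapped to other B elements
    col_competitors = C[:, x].sum() - C[q, x]  # other A elements mapped to B_x
    return (row_competitors + col_competitors) / ((m - 1) + (n - 1))  # assumed normalization
```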

One problem with the original ABSURDIST algorithm described by Goldstone and Rogosky (2002) is that many iterations of activation passing between correspondence units are required before a single set of units converges. An analysis of the network dynamics often reveals that all correspondence units initially decrease their activation value, and then very gradually a set of consistent correspondence units becomes more activated. One strategy that has proven helpful in both speeding convergence in ABSURDIST and improving alignment accuracy has been to define a measure, $T$, of the total amount of activation across all correspondence units,

$$T = \sum_{q=1}^{m} \sum_{x=1}^{n} C_t(A_q, B_x).$$

Next, if $T$ is less than the sum that would result from a complete set of one-to-one mappings, namely $\min(m, n)$, then each correspondence unit is adjusted so that it is more active. The adjustment is the difference between the ideal sum and the actual sum of activations, weighted by the ratio of the unit's current activation to the total activation. Hence, the boost in activation for a correspondence unit increases as the activation of the unit relative to the total activation of the network increases. These requirements are met by the following equation for dynamically adjusting correspondence units:

$$\text{if } T < \min(m, n)\text{:} \quad C'_{t+1}(A_q, B_x) = C_{t+1}(A_q, B_x) + \frac{C_{t+1}(A_q, B_x)}{T}\,\bigl(\min(m, n) - T\bigr),$$

which would be applied after Equation (1).
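Read as code, the adjustment tops up each unit in proportion to its share of the network's total activation whenever T falls short of min(m, n). This sketch is our interpretation of the equation above:

```python
import numpy as np

def adjust_activations(C):
    """Boost all correspondence units when total activation T is below min(m, n)."""
    m, n = C.shape
    T = C.sum()
    if 0 < T < min(m, n):
        C = C + (C / T) * (min(m, n) - T)  # each unit's boost is proportional to C / T
    return np.clip(C, 0.0, 1.0)  # out-of-range activations are clipped to [0, 1]
```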

Activations that would fall outside of the 0-1 range are assigned the closest value in this range. Correspondence unit activations are initialized to random values selected from a normal distribution with a mean of 0.5 and a standard deviation of 0.05. In our simulations, Equation (1) is iterated for a fixed number of cycles. It is assumed that ABSURDIST places two elements into correspondence if the activation of their correspondence unit is greater than 0.55 after a fixed number of iterations have been completed. Thus, the network gives as output a complete set of proposed correspondences/translations between Systems A and B.
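Putting the pieces together, a minimal end-to-end sketch of the procedure described in this section might look as follows. It reuses the helper functions from the earlier sketches (net_input, update_activation, excitation, inhibition, adjust_activations), sets the external term E to zero so that alignment rests only on within-system relations, and uses placeholder values for the weights and the number of iterations:

```python
import numpy as np

def absurdist(dist_a, dist_b, alpha=0.0, beta=0.5, iterations=200, threshold=0.55):
    """Return (q, x) pairs whose correspondence units end up above threshold."""
    m, n = dist_a.shape[0], dist_b.shape[0]
    C = np.random.normal(0.5, 0.05, size=(m, n))  # initial activations ~ N(0.5, 0.05)
    for _ in range(iterations):
        new_C = np.empty_like(C)
        for q in range(m):
            for x in range(n):
                R = excitation(q, x, dist_a, dist_b, C)
                I = inhibition(q, x, C)
                N = net_input(0.0, R, I, alpha, beta)  # E = 0: no external grounding
                new_C[q, x] = update_activation(C[q, x], N)
        C = adjust_activations(np.clip(new_C, 0.0, 1.0))  # clip, then boost if T < min(m, n)
    return [(q, x) for q in range(m) for x in range(n) if C[q, x] > threshold]
```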

assessing absurdist

Our general method for evaluating ABSURDIST will be to generate a number of elements in an N-dimensional space, with each element identified by its value on each of the N dimensions. These will be the elements of System A, and each is represented as a point in space. Then, System B's elements are created by copying the points from System A and adding Gaussian noise with a mean of 0 to each of the dimension values of each of the points. The motivation for distorting A's points to generate B's points is to model the common phenomenon that people's concepts are not identical, and are not identically related to one another. The Euclidean distance between every pair of elements within a system is calculated. The correspondences computed by ABSURDIST after Equation (1) is iterated are then compared to the correct correspondences. Two elements correctly correspond to each other if the element in System B was originally copied from the element in System A.
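The evaluation just described can be sketched as the following harness (our own code, reusing the absurdist and distance_matrix sketches from earlier; the numbers of elements, dimensions, noise levels, and trials are placeholders):

```python
import numpy as np

def evaluate(num_elements=6, dims=2, noise_sd=0.05, trials=20):
    """Proportion of elements correctly aligned, averaged over random trials."""
    accuracy = 0.0
    for _ in range(trials):
        a_points = np.random.rand(num_elements, dims)          # System A: random points
        b_points = a_points + np.random.normal(0.0, noise_sd,  # System B: noisy copy
                                               a_points.shape)
        matches = absurdist(distance_matrix(a_points), distance_matrix(b_points))
        correct = sum(1 for q, x in matches if q == x)          # B_i was copied from A_i
        accuracy += correct / num_elements
    return accuracy / trials

print(evaluate())
```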
