Large-Scale System Translation

The small-scale simulations conducted leave open the promise of applying ABSURDIST to much larger translation tasks. Although logistical and technical problems will certainly arise when scaling the algorithm up to large databases, the presented approach should theoretically be applicable to systems such as dictionaries, thesauri, encyclopedias, and social organizational structures. For example, ABSURDIST could provide automatic translations between dictionaries of two different languages using only co-occurrence relations between words within each dictionary. The input to the network would be the full matrix of co-occurrences between every word in English and every other word in English, and the same kind of matrix for a second language. The output would be a set of correspondences across the two languages. If such a project were successful, it would provide a striking argument for the power of within-system relations. Even if it were unsuccessful, the approach could still be practically valuable if supplemented by a small number of external hints (e.g., that French "chat" and English "cat" might correspond to each other because of their phonological similarity).
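To make the proposed input concrete, the following is a minimal sketch of how a within-language co-occurrence matrix might be built. It assumes a toy "dictionary" represented as a list of word lists, one per entry; the function name `cooccurrence_matrix` and the data are hypothetical illustrations, not part of ABSURDIST.

```python
from itertools import combinations


def cooccurrence_matrix(entries):
    """Count how often each pair of words appears in the same entry.

    `entries` is a hypothetical stand-in for a dictionary: a list of
    word lists, one list per entry. Returns the sorted vocabulary and
    a symmetric matrix of pair counts (diagonal left at zero).
    """
    vocab = sorted({word for entry in entries for word in entry})
    index = {word: i for i, word in enumerate(vocab)}
    n = len(vocab)
    matrix = [[0] * n for _ in range(n)]
    for entry in entries:
        # Count each unordered pair once per entry, ignoring repeats.
        for a, b in combinations(sorted(set(entry)), 2):
            i, j = index[a], index[b]
            matrix[i][j] += 1
            matrix[j][i] += 1
    return vocab, matrix


# Toy data, invented for illustration only.
english = [["cat", "animal", "fur"], ["dog", "animal", "fur"], ["cat", "dog"]]
vocab, matrix = cooccurrence_matrix(english)
```

One such matrix would be computed for each language, with no information shared between them; only the pattern of relations inside each matrix would be available for finding correspondences.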

We are not optimistic that a completely unseeded version of ABSURDIST would recover a very accurate translation between two dictionaries. We have collected a set of subjective similarities among a set of 134 animal words from two groups of subjects. The two groups were obtained by randomly assigning each of 120 Indiana University students to one of the groups. We used ABSURDIST to try to correctly align the animal words across the two groups of subjects using only each group's matrix of similarity assessments. ABSURDIST's performance rate of 34% correctly aligned animals was encouraging, but not nearly good enough to be practically useful. Furthermore, performance was higher than might generally be expected because of the high similarity between the groups, and because the large number of subjects reduced extraneous noise. If we had tried to align a single pair of randomly selected subjects, ABSURDIST's performance would have been much worse. Although the unseeded ABSURDIST's performance was lackluster, we again found dramatic improvements when even a few animal terms were correctly seeded. An automatic dictionary translator could well be useful even if it needed to be seeded with 5% of the correct matches.
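The value of seeding can be illustrated with a much simpler stand-in than ABSURDIST itself. The sketch below is not the ABSURDIST network; it is a greedy matcher, written under the assumption that both systems have the same number of items, that pairs each unseeded item with the item in the other system whose similarities to the seeded items look most alike. The names `align_with_seeds`, `sim_a`, `sim_b`, and `truth`, and all of the numbers, are hypothetical.

```python
def align_with_seeds(sim_a, sim_b, seeds):
    """Greedily align two systems using known seed correspondences.

    sim_a, sim_b: square similarity matrices of equal size (lists of lists).
    seeds: dict mapping an index in A to its known counterpart in B.
    Returns a dict mapping every index in A to a distinct index in B.
    Illustrative stand-in only; this is not the ABSURDIST algorithm.
    """
    n = len(sim_a)
    mapping = dict(seeds)
    free_a = [i for i in range(n) if i not in mapping]
    free_b = set(range(n)) - set(mapping.values())
    seed_pairs = list(seeds.items())

    def mismatch(i, j):
        # Compare how item i relates to the seeds in A with how
        # item j relates to the corresponding seeds in B.
        return sum((sim_a[i][sa] - sim_b[j][sb]) ** 2 for sa, sb in seed_pairs)

    # Score every candidate pair, then accept pairs best-first.
    candidates = sorted((mismatch(i, j), i, j) for i in free_a for j in free_b)
    for _, i, j in candidates:
        if i not in mapping and j in free_b:
            mapping[i] = j
            free_b.remove(j)
    return mapping


# Toy similarity matrix for system A (values invented for illustration).
sim_a = [
    [1.0, 0.8, 0.1, 0.3],
    [0.8, 1.0, 0.2, 0.6],
    [0.1, 0.2, 1.0, 0.7],
    [0.3, 0.6, 0.7, 1.0],
]
# System B is the same structure under a hidden relabeling.
truth = {0: 2, 1: 0, 2: 3, 3: 1}
sim_b = [[0.0] * 4 for _ in range(4)]
for i in range(4):
    for j in range(4):
        sim_b[truth[i]][truth[j]] = sim_a[i][j]

# A single seeded correspondence is enough in this noise-free toy case.
mapping = align_with_seeds(sim_a, sim_b, seeds={0: 2})
```

In this toy case the two matrices are exact relabelings of each other, so one seed suffices. With noisy data such as the subjective similarity ratings described above, more seeds would presumably be needed, and a greedy matcher of this kind would likely fall well short of a method that uses all within-system relations.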
