Explorations in the derivation of word co-occurrence statistics

Published online by Cambridge University Press: 05 May 2015

Joseph P. Levy
Affiliation:
Birkbeck College, London, U.K.
John A. Bullinaria
Affiliation:
University of Reading, U.K.
Malti Patel
Affiliation:
Macquarie University, Australia

Abstract

Recent work has demonstrated that counts of which other words co-occur with a word of interest can reflect interesting properties of that word. We have studied aspects of this kind of methodology by systematically examining the effects of different combinations of parameters used in the preparation of co-occurrence statistics. Several psychologically relevant evaluation measures are used. We have found that successful performance on the evaluation tasks depends on the correct selection of parameters such as window size and distance metric.
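
To make the two parameters named in the abstract concrete, the following is a minimal sketch of how window size and distance metric enter a co-occurrence computation. It is not the authors' procedure: the function names, the symmetric window, the raw (unweighted) counts, and the choice of cosine and Euclidean distance are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(tokens, window=2):
    """Count, for each word, the words found within `window` positions
    on either side of it (hypothetical helper, symmetric window assumed)."""
    vectors = defaultdict(Counter)
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not counted as its own context
                vectors[target][tokens[j]] += 1
    return vectors


def cosine_distance(u, v):
    """1 minus the cosine of the angle between two count vectors."""
    dot = sum(count * v[word] for word, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # treat an empty vector as maximally distant
    return 1.0 - dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Straight-line distance between two count vectors."""
    words = set(u) | set(v)
    return math.sqrt(sum((u[w] - v[w]) ** 2 for w in words))


# Toy corpus, assumed for illustration only.
tokens = "the cat chased the mouse the dog chased the cat".split()

for window in (1, 2):
    vecs = cooccurrence_vectors(tokens, window=window)
    print(
        f"window={window}",
        f"cosine(cat, dog)={cosine_distance(vecs['cat'], vecs['dog']):.3f}",
        f"euclidean(cat, dog)={euclidean_distance(vecs['cat'], vecs['dog']):.3f}",
    )
```

Changing `window` alters the counts that make up each vector, and swapping the distance function alters how two vectors are compared, so the same pair of words can appear more or less similar depending on both choices.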

Type
Part III. Psycholinguistics
Copyright
Copyright © University of Papua New Guinea and the Centre for Southeast Asian Studies, Northern Territory University, Australia 1999
