CIS Colloquium, Sep 19, 2007, 03:30PM - 04:30PM, TECH Center 111
Inferring Contextual Semantic Information from Text Using a Model of Human Episodic and Semantic Memory
Dr. Shane Mueller, Applied Research Associates Inc.
Over the past twenty years, a number of techniques have been developed to infer meaningful representations from naturalistic text data based on how words co-occur in common contexts. The most prominent of these is LSA (Landauer & Dumais, 1997), but a number of other "contextual semantic" models have been proposed as well. Although these models typically do a good job of inferring synonymy (because words that appear in the same contexts tend to have similar meanings), they often do poorly at representing polysemy (because they assume each word's representation is the same across all contexts). In this talk, I will introduce REM-II, a model of human episodic and semantic memory that infers both synonymy and polysemy from the contexts in which words appear. Although this model was designed to account for human memory phenomena, it has been adapted as a tool that can process text corpora and develop useful semantic representations for words. I will demonstrate its application to the Mindpixel project's 80,000-statement GAC corpus, showing how both aspects of contextual semantics (polysemy and synonymy) are important for rich semantic representations.
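As a rough illustration of the co-occurrence idea described above (this sketch is not part of the talk and is not the REM-II model), an LSA-style approach builds a word-by-context count matrix, reduces it with SVD, and compares the resulting word vectors by cosine similarity. The mini-corpus below is invented for demonstration; note that "bank" receives a single vector that blends its river and money senses, which is exactly the polysemy limitation the abstract points out.

```python
# Toy sketch of an LSA-style contextual semantic model (illustrative only;
# the corpus and dimensionality are arbitrary assumptions).
import numpy as np

# Hypothetical mini-corpus: each "context" is a bag of words.
contexts = [
    ["doctor", "nurse", "hospital"],
    ["physician", "nurse", "hospital"],
    ["river", "bank", "water"],
    ["money", "bank", "loan"],
]

vocab = sorted({w for c in contexts for w in c})
index = {w: i for i, w in enumerate(vocab)}

# Word-by-context co-occurrence matrix.
M = np.zeros((len(vocab), len(contexts)))
for j, c in enumerate(contexts):
    for w in c:
        M[index[w], j] += 1

# Reduce with SVD (the core of LSA); keep k latent dimensions.
U, s, _ = np.linalg.svd(M, full_matrices=False)
k = 2
vectors = U[:, :k] * s[:k]

def sim(a, b):
    """Cosine similarity between two word vectors."""
    va, vb = vectors[index[a]], vectors[index[b]]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))

# "doctor" and "physician" share contexts, so they come out similar,
# while "doctor" and "money" do not; "bank" has one vector covering
# both of its senses, so the model cannot distinguish them.
print(sim("doctor", "physician"))
print(sim("doctor", "money"))
```

Because every word gets exactly one vector, this family of models captures synonymy through shared contexts but conflates distinct senses of the same word, motivating models such as REM-II that represent polysemy as well.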