Chinese Academy of Sciences Institutional Repositories Grid
Cognitive Mechanisms of Semantic Processing and Learning of Chinese Novel Compound Words (中文新异复合词的语义加工和学习的认知机制)

Document type: Thesis

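Author: 王静文 (Wang Jingwen)
Defense date: 2023-06
Document subtype: Doctoral
Degree-granting institution: University of Chinese Academy of Sciences
Place of conferral: Institute of Psychology, Chinese Academy of Sciences
Other contributor: 李兴珊 (Li Xingshan)
Keywords: novel compound words; semantic recognition; semantic composition; reading; lexicalization
Degree: Doctor of Science
Major: Basic Psychology
Alternative title: Mechanism of Chinese Novel Compound Word Semantic Processing and Learning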
English abstract: Semantic processing of compound words is an essential topic in both psycholinguistics and artificial intelligence. Compound words are formed by combining two morphemes and, according to how familiar individuals are with them, can be classified into two types: lexicalized compound words and novel compound words. Novel compound words are commonly used in daily life, such as on websites, on social media, and in conversation. The present dissertation addresses several questions: How do Chinese readers process novel compound words during reading? How can novel compound words be represented using natural language processing techniques? Which type of explanation do Chinese readers prefer for novel compound words, and what factors influence the explanation type? And how can novel compound words be lexicalized through learning? The answers can provide insight into the cognitive processes involved in novel compound word identification, semantic composition, and lexicalization, with practical implications for natural language processing techniques and for developing effective strategies for teaching Chinese words. Study 1 investigated how individuals combine the meanings of constituents to construct the meaning of novel compound words presented in isolation. We used natural language processing techniques to represent the compositional semantics of compound words. Our goal was to determine whether the semantics of compound words are constructed by directly adding the meanings of the two constituents or by using compositional rules to combine them. To this end, we developed an additive model and a weighted additive model and simulated results of previous studies to compare the two. The weighted additive model outperformed the additive model.
In a corpus study, we further investigated how individuals combine the meanings of constituents and what factors influence the composition process. We collected participants' compositionality ratings and explanations for 400 novel compound words, along with linguistic and sensorimotor properties of the constituents of these compound words. After coding all the explanations, we found that Chinese readers tended to explain the meaning of novel compound words in a relation-based way. We also found that linguistic information influenced both shallow compositionality ratings and deep explanations, whereas visual information influenced only deep explanations. This finding is consistent with the Embodied Conceptual Combination theory, which posits that sensorimotor information plays a critical role in conceptual combination. Study 2 investigated whether novel compound words are processed holistically or decompositionally during sentence reading. Experiment 1 examined the processing route of novel compound words and the influence of word length. It was divided into two sub-experiments, using two-character and four-character compound words respectively. Each pair of novel and lexicalized compound words was embedded in the same sentence frame, and the temporary plausibility of the first constituent of the compound word was manipulated through the preceding verb. Participants read the sentences while their eye movements were recorded. Results of Experiment 1a showed that fixation durations were significantly longer in the novel compound word condition than in the lexicalized compound word condition. No plausibility effect was found in the first-constituent region. However, a significant reversed plausibility effect was found in the second-constituent region in the novel word condition but not in the lexicalized word condition.
Results of Experiment 1b replicated the pattern of Experiment 1a, except that a significant plausibility effect was observed in the first-constituent region in the novel compound word condition. The results of Experiment 1 indicated that, whereas lexicalized compound words are identified through a holistic route, novel compound words are processed via a decompositional route. Experiment 2 further investigated whether two-character novel compound words require semantic composition. The experiment manipulated the compositionality of novel compound words and included high-compositionality lexicalized compound words as a control condition. Fixation durations were significantly longer in the low-compositionality novel compound word condition than in the high-compositionality novel compound word and lexicalized word conditions. However, no significant difference was found between the high-compositionality novel compound word and lexicalized compound word conditions. Study 3 investigated the time course of novel compound word lexicalization and the role of picture information in this process. Experiment 1 investigated how novel compound words are lexicalized under different learning methods. In the explanation learning condition, explanations of the novel compound words were presented on the screen. In the picture learning condition, pictures indicating the meaning of the novel compound words were presented on the screen. We also included a no-semantics condition and a control condition. Participants learned all the novel compound words on the first day. To trace the lexicalization process, tests were administered on Day 1, Day 2, and Day 8, including memory tests (free recall and recognition) and lexicalization tests (segmentation and semantic priming). On Day 8, participants also completed a natural reading test with their eye movements recorded. We found that participants recalled more words and recognized words more accurately in the memory tests after picture learning than after explanation learning.
However, compared with explanation learning, picture learning showed no significant benefit in the lexicalization tests. Furthermore, novel compound words were lexicalized on Day 2 in both the picture learning and explanation learning conditions, whereas words in the control condition were not lexicalized by Day 8. These results indicate that both picture learning and explanation learning help readers lexicalize novel compound words, but picture learning better helps learners remember them. In summary, Chinese readers identify novel compound words through decomposition and composition during reading. During semantic composition, participants preferred relation-based explanations. Whereas linguistic information influenced both compositionality ratings and the production of composed meanings, visual information influenced only the production of composed meanings. Visual information also facilitated memory for novel compound words but not their lexicalization. Novel compound words can be lexicalized by the day after they are learned. This study contributes to the understanding of general semantic processing, composition, and lexicalization of compound words and has implications for Chinese word teaching.
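Chinese abstract (in translation): The semantic processing and representation of compound words is a key issue in psycholinguistics and natural language processing. Compound words are lexical forms composed of two or more morphemes; according to how familiar people are with them, they can be divided into lexicalized compound words and novel compound words. Novel compound words are compound forms that are not listed in dictionaries but are regarded as words, and that are absent from or rare in corpora. Nevertheless, they may be used online, in media reports, and in everyday communication. This research addresses the following questions: Do Chinese readers process novel compound words during reading via a holistic route or a decompositional route? How is the meaning of novel compound words composed, and what factors influence it? How can learning bring readers to lexicalize the meaning of novel compound words?
Study 1 examined how readers compose whole-word meaning from constituent meanings when novel compound words are presented in isolation. It first asked whether compound meaning is composed by directly adding the meanings of the two constituents or by combining them according to certain rules. We used computational techniques to represent the compositional semantics of compound words, built an additive model and a weighted model, and used both to simulate and compare results on compound word processing from previous studies. The weighted model outperformed the additive model: it correlated more highly with human ratings and predicted reaction times better, indicating that compound meaning is composed according to certain rules rather than by simple addition of constituent meanings. In a subsequent corpus analysis, we further examined how the meaning of novel compound words is composed and which factors influence it. We collected participants' explanations and compositionality ratings for 400 two-character novel compound words, together with linguistic information (character frequency and orthography) and sensorimotor information (imageability and motor relatedness) about their constituents. After coding all explanations, we found that Chinese readers tend to explain the meaning of novel compound words in a relation-based way. Linguistic information about the first character influenced both the shallow compositionality ratings and the composed semantic explanations, whereas visual information influenced only the deep composed semantic explanations; no significant effect of motor relatedness was found. These results are consistent with the Embodied Conceptual Combination theory: during semantic composition, readers compose at the linguistic level and also simulate at the visual level.
Study 2 embedded novel compound words in sentence contexts to examine how they are processed during reading. Experiment 1 examined the processing route of novel compound words in reading and the influence of word length. It comprised two sub-experiments, using two-character and four-character words respectively. Novel and lexicalized compound words were embedded in sentence contexts, and the plausibility of the collocation between the two-character verb preceding the compound and the compound's first constituent was manipulated. Participants read the sentences while their eye movements were recorded. Two-character novel compound words received longer fixation durations than two-character lexicalized compound words, and a reversed plausibility effect appeared in the second-constituent region: fixation durations were significantly shorter in the implausible condition than in the plausible condition. Four-character novel compound words received longer fixation durations than four-character lexicalized compound words, and a plausibility effect appeared in the first-constituent region. These results indicate that lexicalized compound words are processed holistically, whereas novel compound words are processed by decomposition followed by composition, with the first constituent integrated with the preceding context as soon as it is identified. Experiment 2 further tested whether semantic access to two-character novel compound words requires semantic composition. The compositionality of novel compound words (high versus low) was manipulated, and high-compositionality lexicalized compound words were included as a control condition. Fixation durations were significantly longer for low-compositionality than for high-compositionality novel compound words, whereas high-compositionality novel compound words and lexicalized compound words did not differ significantly. These results further confirm that novel compound words require decomposition and semantic composition during reading.
Study 3 examined the lexicalization process of novel compound words and the influence of pictorial visual information on that process. Experiment 1 used picture learning and explanation learning to trace the time course over which readers acquire the meaning of novel compound words and reach lexicalization. In explanation learning, participants saw a written explanation of each novel compound word's meaning; in picture learning, participants saw each novel compound word together with a picture of the corresponding concept. The experiment also included a no-semantics learning condition, in which only the new word was presented, and a control condition with no learning at all. To trace the lexicalization process, tests were administered on Day 1, Day 2, and Day 8, comprising memory tests probing episodic memory (free recall and new-word recognition) and tasks probing lexicalization (word segmentation, semantic relatedness judgment, and sentence reading). Picture learning was better at helping learners remember novel compound words.
In summary, readers access the meaning of novel compound words by decomposition followed by composition. During composition, readers prefer relation-based explanations; linguistic information about the first constituent influences both shallow compositionality ratings and the deep production of composed meanings, whereas visual information influences only the deep production of composed meanings. As for learning, the meaning of novel compound words can be lexicalized as early as the second day; visual information benefits only memory for novel compound words and gives no additional help to the lexicalization process. This research contributes to the understanding of general semantic processing, composition, and lexicalization of words, and has implications for Chinese vocabulary teaching.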
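As a rough illustration of the two composition schemes contrasted in Study 1, the sketch below builds a compound vector either by plain addition of the two constituent vectors or by a weighted sum, then compares each to a target vector by cosine similarity. The abstract does not specify the embeddings, the weights, or how they were fitted, so the three-dimensional vectors, the weights 0.3 and 0.7, and all function names here are assumptions made for illustration only; the toy numbers are chosen by hand and do not reproduce the thesis's results.

```python
import math


def additive(v1, v2):
    """Plain additive composition: element-wise sum of two constituent vectors."""
    return [a + b for a, b in zip(v1, v2)]


def weighted_additive(v1, v2, w1, w2):
    """Weighted additive composition: each constituent is scaled before summing."""
    return [w1 * a + w2 * b for a, b in zip(v1, v2)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy vectors, not taken from the thesis.
first_constituent = [1.0, 0.0, 0.0]
second_constituent = [0.0, 1.0, 0.0]
# Hypothetical whole-word vector that lies closer to the second constituent.
target = [0.2, 0.9, 0.0]

plain = additive(first_constituent, second_constituent)
# The weights 0.3 and 0.7 are assumed values, picked by hand for this example.
weighted = weighted_additive(first_constituent, second_constituent, 0.3, 0.7)

print("additive similarity:", cosine(plain, target))
print("weighted similarity:", cosine(weighted, target))
```

With weights of 1 and 1 the weighted scheme reduces to plain addition, so the additive model can be seen as a special case; the weighted variant can only do better when the constituents contribute unequally to the whole-word meaning, as in this hand-built example.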
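Language: Chinese
Source URL: http://ir.psych.ac.cn/handle/311026/46186
Collection: Institute of Psychology_Research Laboratory of Cognitive and Developmental Psychology
Recommended citation
GB/T 7714
王静文 (Wang Jingwen). Cognitive Mechanisms of Semantic Processing and Learning of Chinese Novel Compound Words (中文新异复合词的语义加工和学习的认知机制)[D]. Institute of Psychology, Chinese Academy of Sciences. University of Chinese Academy of Sciences. 2023.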

Ingest method: OAI harvesting

Source: Institute of Psychology


Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.