Parsing costs as predictors of reading difficulty: An evaluation using the Potsdam Sentence Corpus

Authors

  • Marisa Ferrara Boston, Cornell University
  • John Hale, Cornell University
  • Reinhold Kliegl, University of Potsdam
  • Umesh Patil, University of Potsdam
  • Shravan Vasishth, University of Potsdam

DOI:

https://doi.org/10.16910/jemr.2.1.1

Keywords:

surprisal, parsing costs, Potsdam Sentence Corpus, parsing difficulty

Abstract

The surprisal of a word on a probabilistic grammar constitutes a promising complexity metric for human sentence comprehension difficulty. Using two different grammar types, surprisal is shown to have an effect on fixation durations and regression probabilities in a sample of German readers’ eye movements, the Potsdam Sentence Corpus. A linear mixed-effects model was used to quantify the effect of surprisal while taking into account unigram frequency, bigram frequency (transitional probability), word length, and empirically derived word predictability. The so-called “early” and “late” measures of processing difficulty both showed an effect of surprisal. Surprisal is also shown to have a small but statistically non-significant effect on empirically derived predictability itself. This work thus demonstrates the importance of including parsing costs as a predictor of comprehension difficulty in models of reading, and suggests that a simple identification of syntactic parsing costs with early measures, and of late measures with durations of post-syntactic events, may be difficult to uphold.
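
As an illustration of the metric named in the abstract, the sketch below computes word-by-word surprisal in bits under the standard definition: the log ratio of the prefix probability before a word to the prefix probability after it. It is a minimal sketch, not the authors' implementation. The function names and the toy prefix probabilities are hypothetical, and in practice the prefix probabilities would come from a probabilistic parser rather than being supplied by hand.

```python
import math
from typing import List, Sequence


def surprisal(prefix_prob_before: float, prefix_prob_after: float) -> float:
    """Surprisal (in bits) of one word, given the prefix probability
    before and after that word is read.

    Equivalent to -log2 P(word | preceding words), because the
    conditional probability is prefix_prob_after / prefix_prob_before.
    """
    if not (0.0 < prefix_prob_after <= prefix_prob_before <= 1.0):
        raise ValueError("expected 0 < after <= before <= 1")
    return math.log2(prefix_prob_before / prefix_prob_after)


def surprisals(prefix_probs: Sequence[float]) -> List[float]:
    """Per-word surprisal for a sentence.

    prefix_probs[0] is the probability of the empty prefix (1.0);
    prefix_probs[i] is the probability of the first i words.
    """
    return [
        surprisal(before, after)
        for before, after in zip(prefix_probs, prefix_probs[1:])
    ]


# Hypothetical prefix probabilities for a three-word sentence.
toy_prefix_probs = [1.0, 0.5, 0.25, 0.03125]
print(surprisals(toy_prefix_probs))  # [1.0, 1.0, 3.0]
```

In this toy example the third word reduces the prefix probability by a factor of eight, so it carries three bits of surprisal, which a surprisal-based account would associate with greater reading difficulty at that word.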

Published

2008-09-08

Section

Articles

How to Cite

Boston, M. F., Hale, J., Kliegl, R., Patil, U., & Vasishth, S. (2008). Parsing costs as predictors of reading difficulty: An evaluation using the Potsdam Sentence Corpus. Journal of Eye Movement Research, 2(1). https://doi.org/10.16910/jemr.2.1.1