Supervised and unsupervised neural approaches to text readability

  • [b] Look up: Nadeem and Ostendorf 2018; Filighera, Steuer, and Rensing 2019
  • 142 | Traditionally, readability was measured with formulas built on surface lexical/syntactic features
    • Reduction != Simplification
    • No transferability, e.g. to other cultures (or other languages, for that matter)
  • 143 | Supervised ML models still offer no real generalizability; they often depend heavily on features found in the specific context they were trained in
  • 144 | Usually two approaches: a classification task (supervised learning to predict the readability of new texts), OR another formulation such as ranking (SVM ranking) or regression
  • 144 | Traditional readability features: Gunning fog index, Flesch-Kincaid readability tests, etc., focused on sentence and word length, diversity/complexity
  • 145 | Discourse features: Cohesion and coherence, references, entity mentions
  • 146 | Lexico-semantic features: type-token ratio (TTR), word/character n-gram distributions, POS tags, word list comparison
  • 146 | Syntactic features: grammatical relations, subordination
  • 147 | Language model perplexity could be an indicator of a document's reading level
  • 149:152 | Corpora – extremely diverse in length, domains and target audiences!
  • 157 | LSTMs are used for the unsupervised neural approach; BERT uses both left and right context, which is useful for masked language modelling
  • 158 | Training a neural language model means minimizing the negative log-likelihood by backpropagation
  • 159 | Tries to answer whether the unsupervised approach can be used on its own to predict whole-document readability scores
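
To make the notes on pp. 147 and 158 concrete: perplexity is the exponential of the mean negative log-likelihood (NLL) per token, so a text the language model finds "surprising" gets a higher score. The sketch below is only an illustration. It swaps the neural language model for an add-one-smoothed unigram model, and the names `train_unigram`, `mean_nll` and `perplexity` are made up for this example, not taken from the paper.

```python
import math
from collections import Counter


def train_unigram(tokens):
    """Return a probability function for an add-one-smoothed unigram model.

    Assumption: a toy stand-in for a neural language model. One extra
    vocabulary slot is reserved for all unseen tokens.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # +1 bucket for unknown tokens

    def prob(token):
        return (counts.get(token, 0) + 1) / (total + vocab_size)

    return prob


def mean_nll(prob, tokens):
    """Mean negative log-likelihood per token (the quantity minimized in training)."""
    return -sum(math.log(prob(t)) for t in tokens) / len(tokens)


def perplexity(prob, tokens):
    """Perplexity = exp(mean NLL); higher means the text is less expected."""
    return math.exp(mean_nll(prob, tokens))


# Usage: train on "easy" text, then compare a familiar and an unfamiliar input.
prob = train_unigram("the cat sat on the mat".split())
print(perplexity(prob, "the cat sat".split()))
print(perplexity(prob, "quantum chromodynamics".split()))
```

With a real neural model, `prob` would be the model's predicted probability of each token given its context; the perplexity computation itself stays the same.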

Metadata (PDF)

  • Summary:: Introduces neural approaches to readability scoring in English and Slovenian, plus a new Ranked Sentence Readability Score (RSRS)
  • Motivation:: Provide a cross-lingual alternative with high transferability
  • Results::

Highlights

Imported: 2023-03-03 14:27

⭐ Main ideas

  • “systematic comparison of several neural architectures on a number of benchmark and new labeled readability data sets in two languages” (p. 141)
  • “most newer approaches consider readability as being a classification, regression, or a ranking task” (p. 142)
  • “We demonstrate that the proposed approach is capable of contextualizing the readability because of the trainable nature of neural networks and that it is transferable across different languages. In this scope, we propose a new measure of readability, RSRS (ranked sentence readability score), with good correlation with true readability scores.” (p. 142)
  • “Traditionally, readability in texts was measured by statistical readability formulas, which try to construct a simple human-comprehensible formula with a good correlation to what humans perceive as the degree of readability.” (p. 143)
  • “average sentence length (ASL)” (p. 143)
  • “word length and word difficulty” (p. 143)
  • “The WeeBit corpus” (p. 150)
  • “Three classes targeting younger audiences consist of articles from WeeklyReader, an educational newspaper that covers a wide range of nonfiction topics, from science to current affairs. Two classes targeting older audiences consist of material from the BBC-Bitesize Web site, containing educational material categorized into topics that roughly match school subjects in the UK.” (p. 150)
  • “The OneStopEnglish corpus (Vajjala and Lučić 2018) contains aligned texts of three distinct reading levels (beginner, intermediate, and advanced) that were written specifically for English as Second Language (ESL) learners.” (p. 150)
  • “The Newsela corpus (Xu, Callison-Burch, and Napoles 2015)” (p. 151)
  • “The corpus contains 1,911 original English news articles and up to four simplified versions for every original article” (p. 151)
  • “Corpus of English Wikipedia and Corpus of Simple Wikipedia” (p. 153)
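
A rough sketch of how a rank-weighted sentence score like the RSRS quoted above (p. 142) could be computed from per-sentence language-model scores. Assumptions to verify against the paper's own definition: sentences are sorted by ascending NLL, each is weighted by the square root of its rank, and the sum is divided by the number of sentences. The function name `rsrs` and the plain-number inputs are hypothetical; in practice the per-sentence NLLs would come from a trained neural language model.

```python
import math


def rsrs(sentence_nlls):
    """Rank-weighted mean of per-sentence NLL scores.

    Assumption: weighting by sqrt(rank) after an ascending sort, so the
    hardest (least predictable) sentences contribute the most.
    """
    if not sentence_nlls:
        raise ValueError("need at least one sentence score")
    ranked = sorted(sentence_nlls)  # ascending: easiest sentence gets rank 1
    weighted = sum(math.sqrt(rank) * nll for rank, nll in enumerate(ranked, start=1))
    return weighted / len(ranked)


# Usage: the score does not depend on the order of sentences in the document.
print(rsrs([4.0, 1.0]))
print(rsrs([1.0, 4.0]))
```

The design intent, as far as the quotes above indicate, is a single whole-document score that needs no readability labels; the rank weighting is what lets a few very difficult sentences raise the score of an otherwise easy text.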

✅ Useful

  • “Readability is concerned with the relation between a given text and the cognitive load of a reader to comprehend it. This complex relation is influenced by many factors, such as a degree of lexical and syntactic sophistication, discourse cohesion, and background knowledge (Crossley et al. 2017).” (p. 141)
  • “The findings of the related research suggest that a separate language model should be trained for each readability class in order to extract features for successful readability prediction (Petersen and Ostendorf 2009; Xia, Kochmar, and Briscoe 2016).” (p. 158)

📚 Investigate

  • “Gunning fog index (Gunning 1952) (GFI)” (p. 143)
  • “Flesch reading ease (Kincaid et al. 1975) (FRE)” (p. 144)
  • “Flesch-Kincaid grade level (Kincaid et al. 1975) (FKGL)” (p. 144)
  • “returns values corresponding to the years of education required to understand the text is the Automated Readability Index (Smith and Senter 1967) (ARI)” (p. 144)
  • “Dale-Chall readability formula (Dale and Chall 1948) (DCRF) requires a list of 3,000 words that fourth-grade US students could reliably understand” (p. 144)
  • “SMOG grade (Simple Measure of Gobbledygook) (McLaughlin 1969)” (p. 144)
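
For reference while investigating these, a minimal sketch of the formulas with their commonly published coefficients. The functions take raw counts as arguments; how words, sentences, syllables and "complex" (three-or-more-syllable) words are counted is left to the caller and is itself a heuristic choice. Function and parameter names are made up for this example. Dale-Chall is omitted because it needs the 3,000-word list.

```python
import math


def flesch_reading_ease(words, sentences, syllables):
    """FRE: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """FKGL: approximate US school grade level."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def automated_readability_index(characters, words, sentences):
    """ARI: uses characters per word instead of syllables."""
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43


def gunning_fog(words, sentences, complex_words):
    """GFI: complex_words = words with three or more syllables."""
    return 0.4 * ((words / sentences) + 100 * (complex_words / words))


def smog_grade(polysyllables, sentences):
    """SMOG: polysyllable count normalized to a 30-sentence sample."""
    return 1.0430 * math.sqrt(polysyllables * 30 / sentences) + 3.1291


# Usage with made-up counts: 100 words, 5 sentences, 150 syllables, 500 characters.
print(flesch_reading_ease(100, 5, 150))
print(flesch_kincaid_grade(100, 5, 150))
print(automated_readability_index(500, 100, 5))
```

All five depend only on length-style counts, which is the limitation noted at p. 142 above: shortening a text changes the score without necessarily making it simpler.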