Optimizing Statistical Machine Translation for Text Simplification

Source:: @xu2016

Added on 2023-02-05

Text simplification has applications for reducing input complexity for natural language processing (Siddharthan et al., 2004; Miwa et al., 2010; Chen et al., 2012b) and providing reading aids for people with limited language skills (Petersen and Ostendorf, 2007; Watanabe et al., 2009; Allen, 2009; De Belder and Moens, 2010; Siddharthan and Katsos, 2010) or language impairments such as dyslexia (Rello et al., 2013), autism (Evans et al., 2014), and aphasia (Carroll et al., 1999). (p. 1)

splitting, deletion and paraphrasing (Feng, 2008). The splitting operation decomposes a long sentence into a sequence of shorter sentences. Deletion removes less important parts of a sentence. The paraphrasing operation includes reordering, lexical substitutions and syntactic transformations. (p. 1)

1) two novel simplification-specific tunable metrics; 2) large-scale paraphrase rules automatically derived from bilingual parallel corpora (p. 2)

3) rich rule-level simplification features; and 4) multiple reference simplifications collected via crowdsourcing for tuning and evaluation (p. 2)

the parallel Wikipedia simplification corpus contains a large proportion of inadequate (not much simpler) or inaccurate (not aligned or only partially aligned) simplifications (p. 2)

We propose two new light-weight metrics instead: FKBLEU that explicitly measures readability and SARI that implicitly measures it by comparing against the input and references. (p. 3)

we include the Flesch-Kincaid Index (FK) which estimates the readability of text using cognitively motivated features (Kincaid et al., 1975): $FK = 0.39 \times \left(\frac{\#\text{words}}{\#\text{sentences}}\right) + 11.8 \times \left(\frac{\#\text{syllables}}{\#\text{words}}\right) - 15.59$ (Eq. 2), with a lower value indicating higher readability. (p. 3)
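
To make the FK formula concrete, here is a minimal Python sketch of my own (not code from the paper). The coefficients come from the quoted equation; the sentence splitter, the tokenizer, and especially the vowel-group syllable counter `count_syllables` are rough assumptions chosen for illustration, so the scores are only approximate.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not from the paper):
    count runs of consecutive vowels, with a minimum of one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid(text: str) -> float:
    """FK = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59."""
    # Naive sentence split on terminal punctuation (assumption).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Naive word tokenizer: alphabetic runs, apostrophes allowed (assumption).
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * (len(words) / len(sentences))
        + 11.8 * (syllables / len(words))
        - 15.59
    )


simple = flesch_kincaid("The cat sat. The dog ran.")
hard = flesch_kincaid(
    "Comprehensive institutional transformations necessitate extraordinary coordination."
)
print(simple < hard)  # lower FK means more readable
```

Short sentences made of one-syllable words can yield a negative FK, which is consistent with the formula: only the relative ordering of scores matters here.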

FK measures readability assuming that the text is well-formed (p. 3)

we use all the 33 features that were distributed with PPDB 1.0 (p. 6)

9 new features for simplification purposes: length in characters, length in words, number of syllables, language model scores, and fraction of common English words in each rule (p. 6)

Like with machine translation, where there are many equally good translations, in simplification there may be several ways of simplifying a sentence. (p. 6)

we collect multiple human reference simplifications that focus on simplification by paraphrasing rather than deletion or splitting (p. 6)

BLEU was designed to evaluate bilingual translation systems. It measures the n-gram precision of a system’s output against one or more references. BLEU ignores recall (and compensates for this with its brevity penalty). BLEU prefers an output that is not too short and contains only n-grams that appear in any reference. The role of multiple references in BLEU is to capture allowable variations in translation quality. When applied to monolingual tasks like simplification, BLEU does not take into account anything about the differences between the input and the references. (p. 10)
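
As a note to self, the n-gram precision at the core of BLEU can be sketched as follows. This is a simplified illustration of my own, not the paper's implementation: it computes the clipped ("modified") n-gram precision against multiple references for a single n, and leaves out the brevity penalty and the geometric mean over n-gram orders. The example sentences are made up.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against one or more references."""
    cand = ngrams(candidate, n)
    if not cand:
        return 0.0
    # For each n-gram, the maximum count observed in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    # Candidate counts are clipped so that repeating an n-gram is not rewarded.
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand.items())
    return clipped / sum(cand.values())


# Hypothetical example data.
source = "the feline sat on the mat".split()
references = [
    "the cat sat on the mat".split(),
    "the cat sat on the rug".split(),
]

# Copying the unsimplified input still earns a high precision, because the
# input itself never enters the computation.
print(modified_precision(source, references, 1))
```

The function only ever sees the candidate and the references, which mirrors the point in the quote: an unchanged input that happens to overlap heavily with the references is scored well even though nothing was simplified.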