@bevendorff2019: “given a text of unknown authorship and texts from known candidate authors, attribute the unknown text to its true author among the candidates” (1098f.)
One of today’s most effective and robust verification approaches is unmasking by Koppel and Schler (2004). It decomposes to-be-compared texts into two chunk sets, and iteratively trains a linear classifier to discriminate between them while removing the most significant features in each iteration to measure the increased reconstruction error. This error increases faster for same-author cases since those share more function words than do differentauthors cases. Fooling unmasking verification provides us with evidence that our obfuscation technique works at a deeper level than just the few most superficial text features. Unmasking further produces curve plots of the declining classification accuracy, which render the effects of obfuscation accessible to human inspection and interpretation.
– @bevendorff2019, p. 1104
- Keywords: authorship
@halvani2017’s Compression Models