Takeaway
Summary:: This paper addresses authorship obfuscation through paraphrasing operations that target character trigrams. Bevendorff et al. develop prototypes of both a greedy and a heuristic obfuscation method and evaluate them against common attribution techniques such as unmasking.
Motivation:: Achieve flexibility and maintain readability.
Results:: Unmasking performance decreased by ~3-9%. Compression performance decreased by 15%. Path cost decreased by up to 75%. However, the obfuscated text is still barely readable and contains some nonsensical phrases.
Code: https://github.com/webis-de/acl19-heuristic-authorship-obfuscation
Keywords: authorship
Index
- 1098 | Uses Jensen-Shannon divergence to compute the distance between the original and the paraphrased "forgery"
- 1099 | Literature review
- authorship-attribution: Abbasi and Chen (2008) Writeprints, Teahan et al. Compression, Koppel and Schler (2004) Unmasking
- authorship-obfuscation: mono- and multilingual machine translation (lack of data), reverse unmasking
- 1100 | Stylistic distance is measured between character trigram frequencies as the Jensen-Shannon (JS) distance
- 1101 | Strategy 1: Greedy authorship-obfuscation-by-reducing-idiosyncratic-text-features
- 1102 | Strategy 2: Heuristic obfuscation as a search problem, using a cost function to optimise the process → Generating solutions
- 1103 | cost-intensity-of-author-obfuscation → obfuscation-operators
- 1105 | Longer texts decrease performance; most of the text is left unobfuscated
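Sketch for the stylistic distance noted at 1100: a minimal Python example of a JS distance between the character trigram distributions of two texts. It is my own illustration, not taken from the paper or its repository. The function names (`trigram_distribution`, `js_divergence`, `js_distance`) are hypothetical, and the normalisation (square root of the base-2 JS divergence, giving values in [0, 1]) is an assumption that may differ from the scaling used in the paper.

```python
import math
from collections import Counter


def trigram_distribution(text):
    """Relative frequencies of overlapping character trigrams (hypothetical helper)."""
    if len(text) < 3:
        raise ValueError("text must contain at least three characters")
    counts = Counter(text[i:i + 3] for i in range(len(text) - 2))
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence with base-2 logarithm, so the result lies in [0, 1]."""
    jsd = 0.0
    for gram in set(p) | set(q):
        pi = p.get(gram, 0.0)
        qi = q.get(gram, 0.0)
        mi = 0.5 * (pi + qi)  # mixture distribution M = (P + Q) / 2
        if pi > 0:
            jsd += 0.5 * pi * math.log2(pi / mi)
        if qi > 0:
            jsd += 0.5 * qi * math.log2(qi / mi)
    return jsd


def js_distance(p, q):
    """Square root of the JS divergence (assumed normalisation, see note above)."""
    return math.sqrt(max(js_divergence(p, q), 0.0))


if __name__ == "__main__":
    original = trigram_distribution("The quick brown fox jumps over the lazy dog.")
    forgery = trigram_distribution("A fast brown fox leaps over the idle dog.")
    print(js_distance(original, forgery))
```

Under this assumed normalisation, identical texts have distance 0 and texts with no shared trigrams have distance 1, so an obfuscator would try to push the value for the "forgery" upwards.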
Metadata
- Highlights:: highlights-zu-bevendorff2019
- Read on:: 2022-10-31
Bevendorff, Janek, Martin Potthast, Matthias Hagen & Benno Stein. 2019. Heuristic Authorship Obfuscation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 1098–1108. Florence, Italy: Association for Computational Linguistics. https://doi.org/10.18653/v1/P19-1104.