Takeaway
Summary:: Evert et al. seek to formally define and evaluate the Burrows's Delta measure in order to improve authorship attribution results.
Motivation:: Progress in attribution performance had stagnated, and there was still no real understanding of why Burrows's Delta is relatively reliable … and why it fails when it does.
Results:: Using the most frequent words as features and standardizing them yields better results. Both Burrows's Delta and Cosine Delta improve significantly with vector normalization.
Keywords: authorship
- 79 | Definition Authorship Attribution
- Idiosyncrasies/habitual tendencies in a person's language use
- Clustering and classification tasks
- Burrows's Delta (@burrows2005) is a distance measure; it is very robust but is outperformed by Cosine Delta (Smith and Aldridge 2011)
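- 80 | z-score: $z_i(D) = \frac{f_i(D) - \mu_i}{\sigma_i}$, where $f_i(D)$ is the relative frequency of word $w_i$ in document $D$, $\mu_i$ is the mean of the distribution of $f_i$ across the collection, and $\sigma_i$ is the stdev of that word's relative frequency
- 80 | Burrows's Delta: $\Delta_B(D, D') = \lVert \mathbf{z}(D) - \mathbf{z}(D') \rVert_1 = \sum_{i=1}^{n} \lvert z_i(D) - z_i(D') \rvert$ (Burrows 2002)
- 80 | Cosine Delta: $\Delta_\angle(D, D') = \alpha$, with $\cos \alpha = \frac{\mathbf{z}(D) \cdot \mathbf{z}(D')}{\lVert \mathbf{z}(D) \rVert_2 \, \lVert \mathbf{z}(D') \rVert_2}$ (Smith and Aldridge 2011)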
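
To make the two measures noted above concrete, here is a minimal sketch in plain Python. It assumes that Burrows's Delta is the Manhattan distance between z-score vectors and that Cosine Delta is the angle between them. The frequency table, the function names, and the use of the sample standard deviation are my own assumptions for illustration, not taken from the paper.

```python
import math
from statistics import mean, stdev


def z_scores(freqs):
    """Standardize each word (column): z_i(D) = (f_i(D) - mu_i) / sigma_i."""
    columns = list(zip(*freqs))
    mus = [mean(col) for col in columns]
    # Assumption: sample standard deviation; a population stdev would also be defensible.
    sigmas = [stdev(col) for col in columns]
    return [
        [(f - mu) / sigma for f, mu, sigma in zip(row, mus, sigmas)]
        for row in freqs
    ]


def burrows_delta(z1, z2):
    """Manhattan (L1) distance between two z-score vectors."""
    return sum(abs(a - b) for a, b in zip(z1, z2))


def cosine_delta(z1, z2):
    """Angle in radians between two z-score vectors."""
    dot = sum(a * b for a, b in zip(z1, z2))
    norm1 = math.sqrt(sum(a * a for a in z1))
    norm2 = math.sqrt(sum(b * b for b in z2))
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    cos = max(-1.0, min(1.0, dot / (norm1 * norm2)))
    return math.acos(cos)


# Hypothetical relative frequencies of three frequent words in three documents.
freqs = [
    [0.050, 0.030, 0.020],  # document 0
    [0.048, 0.031, 0.022],  # document 1, similar profile to document 0
    [0.030, 0.045, 0.010],  # document 2, different profile
]
z = z_scores(freqs)

print("Burrows:", burrows_delta(z[0], z[1]), burrows_delta(z[0], z[2]))
print("Cosine: ", cosine_delta(z[0], z[1]), cosine_delta(z[0], z[2]))
```

In this toy table, documents 0 and 1 should come out closer to each other than documents 0 and 2 under both measures. Some implementations report `1 - cos` instead of the angle; both give the same ranking of candidates.
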
Todo
- [b] What is vector normalization?
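
A tentative answer to the question above, as a sketch: vector normalization rescales a document's feature vector (here, its z-scores) to length 1 under some norm, so that only the direction of the vector matters and not its magnitude. The function name `normalize` and the example vector are hypothetical. Which norm the paper pairs with which Delta variant still needs to be checked against the text.

```python
import math


def normalize(vec, p=2):
    """Scale vec to unit length under the L1 (p=1) or L2 (p=2) norm."""
    if p == 1:
        length = sum(abs(x) for x in vec)
    elif p == 2:
        length = math.sqrt(sum(x * x for x in vec))
    else:
        raise ValueError("only p=1 and p=2 are supported in this sketch")
    return [x / length for x in vec]


# Hypothetical z-score vector for one document.
z_doc = [3.0, -4.0]

l2_unit = normalize(z_doc, p=2)  # Euclidean length becomes 1
l1_unit = normalize(z_doc, p=1)  # sum of absolute values becomes 1

print(l2_unit, l1_unit)
```

After normalization, two documents whose z-score profiles point in the same direction but differ in overall amplitude end up as the same vector.
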
Metadata
- Highlights:: highlights-zu-evert2015
- Read on:: 2022-10-31
Evert, Stefan, Thomas Proisl, Thorsten Vitt, Christof Schöch, Fotis Jannidis & Steffen Pielström. 2015. Towards a better understanding of Burrows's Delta in literary authorship attribution. In Proceedings of the Fourth Workshop on Computational Linguistics for Literature, 79–88. Denver, Colorado, USA: Association for Computational Linguistics. https://doi.org/10.3115/v1/W15-0709.