Takeaway 🏃

Summary:: Comprehensive overview of computational authorship attribution methods.

Motivation:: Focus on the computational side rather than pure literary studies, and give an overview of the authorship attribution methods developed up to this point.

Results:: The field has grown considerably in recent years, moving away from pure literary studies towards interdisciplinary and automated approaches. Objective evaluation criteria have been defined, but test corpora are still few. Accuracy depends on the number of candidate authors. Open-set classification (the true author may not be among the candidates) is the setting best suited to forensics.

Keywords: authorship

  • 538 | History of the field, notably @mosteller1964, who introduced “nontraditional” authorship attribution (AA)
  • 539 | Types of authorship analysis tasks
  • 539 | Applications for authorship attribution
  • ???
  • 547 | Profile-based approaches
    • Compression: concatenate all texts by a given author into one file per author, compress each file with and without the unseen text appended, then compare the resulting sizes in bits (GZIP, RAR, etc.); the author whose file grows least is the most likely one
  • 548 | Instance-based approaches: Classifier is trained on individual document vectors
  • 549 | Similarity-based approaches: Burrows’ Delta
  • 553 | Frequency is not always key; sometimes it is more important to give certain features more weight
  • 553 | Lack of training data
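
The compression idea from the profile-based bullet (p. 547) can be sketched roughly as follows. This is a minimal illustration, not the survey's own procedure: `zlib` stands in for GZIP/RAR, the scoring (growth in compressed size when the unknown text is appended to an author's profile) is one common variant assumed here, and the names `compressed_size` and `attribute_by_compression` are made up for this note.

```python
import zlib


def compressed_size(text: str) -> int:
    """Size in bytes of the zlib-compressed UTF-8 encoding of `text`."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def attribute_by_compression(known: dict[str, list[str]], unknown: str) -> str:
    """Return the candidate whose profile absorbs the unknown text most cheaply.

    `known` maps each candidate author to a list of their texts
    (hypothetical input format, assumed for this sketch).
    """
    costs = {}
    for author, texts in known.items():
        # Profile = all known texts of this author concatenated into one string.
        profile = "\n".join(texts)
        # Extra bytes needed to encode the unknown text after the profile.
        costs[author] = compressed_size(profile + "\n" + unknown) - compressed_size(profile)
    return min(costs, key=costs.get)


known = {
    "a": ["the quick brown fox jumps over the lazy dog near the quiet river bank"],
    "b": ["stock markets rallied sharply after the central bank cut interest rates today"],
}
print(attribute_by_compression(known, "the quick brown fox jumps over the lazy dog"))
```

On real corpora the profiles would be far longer; with very short texts the compressor's fixed overhead dominates and the scores become unreliable.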


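Burrows’ Delta (p. 549) can be sketched in the same spirit: z-score the relative frequencies of the most frequent words, then take the mean absolute difference between the z-scores of the unknown text and those of each candidate. The sketch below makes simplifying assumptions that are not taken from the survey: tokenisation is whitespace splitting, the means and standard deviations come from the candidate texts only rather than from a larger reference corpus, and `n_words` is far smaller than the word lists normally used. The function names are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def rel_freqs(text: str, vocab: list[str]) -> list[float]:
    """Relative frequency of each vocabulary word in `text`."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[w] / total for w in vocab]


def burrows_delta(candidates: dict[str, str], unknown: str, n_words: int = 30) -> dict[str, float]:
    """Delta score per candidate author; lower means stylistically closer."""
    # Vocabulary: the most frequent words across all candidate texts.
    pooled = Counter(" ".join(candidates.values()).lower().split())
    vocab = [w for w, _ in pooled.most_common(n_words)]

    freqs = {a: rel_freqs(t, vocab) for a, t in candidates.items()}

    # Per-word mean and standard deviation across the candidate texts.
    columns = list(zip(*freqs.values()))
    mus = [mean(c) for c in columns]
    sigmas = [pstdev(c) or 1.0 for c in columns]  # guard against zero variance

    def z(vec: list[float]) -> list[float]:
        return [(f - m) / s for f, m, s in zip(vec, mus, sigmas)]

    z_unknown = z(rel_freqs(unknown, vocab))
    return {
        a: mean(abs(x - y) for x, y in zip(z(v), z_unknown))
        for a, v in freqs.items()
    }


scores = burrows_delta(
    {
        "a": "the cat and the dog and the bird sat on the mat and the rug",
        "b": "a cat or a dog or a bird sat upon a mat or a rug",
    },
    "the fox and the hen and the owl sat on the log",
)
print(min(scores, key=scores.get))
```

The candidate with the smallest score is taken as the most likely author; the toy texts only illustrate the mechanics and are much too short for a meaningful result.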

Stamatatos, Efstathios. 2009. A survey of modern authorship attribution methods. Journal of the American Society for Information Science and Technology 60(3). 538–556. https://doi.org/10.1002/asi.21001.