Luke, rambling…



Jurafsky & Martin (2009) Speech and language processing: An introduction to natural language processing, computational linguistics, and speech recognition

Jan 04, 2023, 1 min read

  • #book

Summary::

Motivation::
Results::
Keywords: machine-learning, nlp


Ch2: Regular Expressions, Text Normalisation and Edit Distance

  • 2 – text data needs to be preprocessed (normalised) before it can serve as a basis for analysis
    • Tokenization (we don't really know how to define what a word is)
    • Lemmatization
    • Stemming
    • Segmentation
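
To make the four steps above concrete, here is a rough Python sketch using only the standard library. Nothing in it comes from the book: the function names, the regexes, the suffix list and the tiny lemma table are all made up for illustration, and real tools handle far more edge cases (abbreviations, clitics, irregular forms).

```python
import re

def segment_sentences(text):
    """Naive sentence segmentation: split on whitespace that follows ., ! or ?.
    Assumption: abbreviations such as "Dr." are not handled."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def tokenize(sentence):
    """Naive tokenization: runs of word characters (keeping internal
    apostrophes, so "don't" stays one token) or single punctuation marks."""
    return re.findall(r"\w+(?:'\w+)*|[^\w\s]", sentence)

def crude_stem(token):
    """Crude stemming: chop a few common English suffixes.
    This is a toy rule set, not the Porter stemmer."""
    for suffix in ("ing", "ed", "s"):
        # Keep at least three characters so short words like "is" survive.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

# Hypothetical lookup table; a real lemmatizer needs a full dictionary
# and usually part-of-speech information.
TOY_LEMMAS = {"is": "be", "were": "be", "mice": "mouse"}

def lemmatize(token):
    """Toy lemmatization: dictionary lookup, falling back to the lowercased token."""
    lowered = token.lower()
    return TOY_LEMMAS.get(lowered, lowered)

if __name__ == "__main__":
    text = "We don't know. What is a word?"
    for sentence in segment_sentences(text):
        tokens = tokenize(sentence)
        print(tokens, [lemmatize(t) for t in tokens])
```

The difference between the last two steps shows up on irregular forms: `crude_stem` only strips characters, so it leaves "mice" unchanged, while `lemmatize` maps it to its dictionary form "mouse", but only because that entry was put into the table by hand.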

Todo

Metadata

  • Highlights:: highlights-zu-jurafsky2009



©2023. Unless stated otherwise, content on this site is licensed under CC BY-SA 4.0.
