Text mining with R
Mentor
Gábor Kismihók

Description

This learning path introduces text mining concepts and their practical application in R. It covers text preprocessing (lowercase conversion, punctuation and stopword removal, tokenization), stemming and lemmatization, feature extraction (Bag of Words, TF-IDF), and more advanced methods such as Word2Vec, Doc2Vec, sentiment analysis, Latent Semantic Analysis, and Latent Dirichlet Allocation, each with a focus on implementation in R.

Learning objectives

  • Understand fundamental text mining concepts.

  • Apply text preprocessing techniques in R (lower case conversion, punctuation and stopword removal, tokenization).

  • Implement stemming and lemmatization in R.

  • Utilize feature extraction methods like Bag of Words and TF-IDF in R.

  • Work with advanced text mining models such as Word2Vec and Doc2Vec in R.

  • Perform sentiment analysis using R.

  • Apply Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) in R for topic modeling.
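As a rough illustration of the preprocessing and feature extraction objectives above, the following sketch uses base R only. The toy documents, the tiny stopword list, and the helper name `preprocess` are assumptions made for this example and are not taken from the course material; the course modules may well use dedicated packages instead. The TF-IDF variant shown (term frequency normalized by document length, natural-log inverse document frequency) is one common choice among several.

```r
# Toy corpus (hypothetical example documents)
docs <- c(
  d1 = "The cat sat on the mat.",
  d2 = "The dog chased the cat!",
  d3 = "Dogs and cats are popular pets."
)

# Small illustrative stopword list (an assumption; real lists are much longer)
stopwords <- c("the", "on", "and", "are")

# Preprocessing: lowercase, strip punctuation, tokenize on whitespace,
# then drop empty strings and stopwords
preprocess <- function(text) {
  text <- tolower(text)
  text <- gsub("[[:punct:]]", "", text)
  tokens <- strsplit(text, "[[:space:]]+")[[1]]
  tokens[nzchar(tokens) & !(tokens %in% stopwords)]
}

tokens <- lapply(docs, preprocess)

# Bag of Words: document-term matrix of raw counts (rows = documents)
vocab <- sort(unique(unlist(tokens)))
bow <- t(vapply(
  tokens,
  function(tok) tabulate(match(tok, vocab), nbins = length(vocab)),
  integer(length(vocab))
))
colnames(bow) <- vocab

# TF-IDF: term frequency per document times log inverse document frequency
tf <- bow / rowSums(bow)
idf <- log(nrow(bow) / colSums(bow > 0))
tfidf <- sweep(tf, 2, idf, `*`)

print(bow)
print(round(tfidf, 3))
```

Because no stemming or lemmatization is applied in this sketch, "cat" and "cats" end up as separate vocabulary entries, which is the kind of redundancy those techniques are meant to reduce.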

Modules included: 18

Updated: 22.03.2026

Time required (hours): -

1. Overview of text mining
2. Lowercase conversion, punctuation and stopword removal, and text tokenization in R
3. Stemming and lemmatization
4. Stemming and lemmatization in R
5. Bag of words
6. Bag of words in R
7. TF-IDF
8. TF-IDF in R
9. Part of speech tagging
10. Part of speech tagging in R
11. Word2Vec
12. Word2Vec in R
13. Doc2Vec
14. Sentiment analysis in R
15. Latent semantic analysis
16. Latent semantic analysis in R
17. Latent Dirichlet allocation
18. Latent Dirichlet allocation in R
Learn "Text mining with R" | Technische Informationsbibliothek (TIB)