Learn "Text mining with R" | Technische Informationsbibliothek (TIB)

Text mining with R

Mentor

Gábor Kismihók

Description

This learning path introduces text mining concepts and their practical application in R. It covers text preprocessing (lowercase conversion, punctuation and stopword removal, tokenization), stemming, lemmatization, and feature extraction (Bag of Words, TF-IDF), then moves on to Word2Vec, Doc2Vec, sentiment analysis, Latent Semantic Analysis, and Latent Dirichlet Allocation, with a focus throughout on implementation in R.

Learning objectives

Understand fundamental text mining concepts.
Apply text preprocessing techniques in R (lower case conversion, punctuation and stopword removal, tokenization).
Implement stemming and lemmatization in R.
Utilize feature extraction methods like Bag of Words and TF-IDF in R.
Work with advanced text mining models such as Word2Vec and Doc2Vec in R.
Perform sentiment analysis using R.
Apply Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) in R for topic modeling.
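
The preprocessing and feature extraction objectives above can be sketched in a few lines of base R. This is a minimal illustration rather than the course's own material: the toy corpus, the hand-picked stopword list, the helper name `tokenize`, and the particular TF-IDF weighting (term count divided by document length, times log(N / document frequency)) are all assumptions chosen to keep the example free of package dependencies.

```r
# Toy corpus (made up for illustration).
docs <- c(
  "The cat sat on the mat.",
  "The dog chased the cat!",
  "Dogs and cats are pets."
)

# A tiny hand-picked stopword list (an assumption, not a standard list).
stopwords <- c("the", "on", "and", "are")

# Preprocessing: lower case, strip punctuation, split on whitespace,
# then drop stopwords.
tokenize <- function(text) {
  text <- tolower(text)
  text <- gsub("[[:punct:]]", "", text)
  tokens <- strsplit(trimws(text), "[[:space:]]+")[[1]]
  tokens[!tokens %in% stopwords]
}

tokens <- lapply(docs, tokenize)

# Bag of Words: a document-term matrix of raw counts
# (one row per document, one column per vocabulary term).
vocab <- sort(unique(unlist(tokens)))
dtm <- t(vapply(
  tokens,
  function(tk) as.integer(table(factor(tk, levels = vocab))),
  integer(length(vocab))
))
rownames(dtm) <- paste0("doc", seq_along(docs))
colnames(dtm) <- vocab

# TF-IDF: tf = count / document length, idf = log(N / document frequency).
tf <- dtm / rowSums(dtm)
idf <- log(nrow(dtm) / colSums(dtm > 0))
tfidf <- sweep(tf, 2, idf, `*`)

print(dtm)
print(round(tfidf, 3))
```

In this sketch "cat" and "cats" end up as separate columns, which is the kind of redundancy that stemming and lemmatization are meant to reduce. In practice, packages such as tm or tidytext provide stopword lists and TF-IDF weighting, possibly with slightly different formulas.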

18 modules included

Updated: 22.03.2026

Time required (hours): -

1. Overview of text mining

2. Lowercase conversion, punctuation and stopword removal, and text tokenization in R

3. Stemming and lemmatization

4. Stemming and lemmatization in R

5. Bag of words

6. Bag of words in R

7. TF-IDF

8. TF-IDF in R

9. Part-of-speech tagging

10. Part-of-speech tagging in R

11. Word2Vec

12. Word2Vec in R

13. Doc2Vec

14. Sentiment analysis in R

15. Latent semantic analysis

16. Latent semantic analysis in R

17. Latent Dirichlet allocation

18. Latent Dirichlet allocation in R