Learn "Text mining with Python" | Technische Informationsbibliothek (TIB)

Text mining with Python

Mentor

Gábor Kismihók

Description

This learning path introduces text mining techniques with a focus on practical implementation in Python. It starts with text preprocessing (lowercasing, punctuation removal, stopword elimination, tokenization, stemming, and lemmatization) and moves on to more advanced topics: Bag-of-Words, TF-IDF, Part-of-Speech tagging, Word2Vec, Doc2Vec, sentiment analysis, Latent Semantic Analysis, and Latent Dirichlet Allocation. Learners gain hands-on experience applying each of these concepts in Python.
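The first preprocessing steps named above can be sketched with the Python standard library alone. This is a minimal illustration, not the course's own code: the `preprocess` function and the tiny `STOPWORDS` set are assumptions made for this example, and stemming and lemmatization are left out because they typically rely on a library such as NLTK.

```python
import string

# Tiny illustrative stopword list (an assumption for this sketch;
# real stopword lists are much longer).
STOPWORDS = {"a", "an", "and", "the", "is", "are", "of", "to", "in", "on"}


def preprocess(text):
    """Lowercase, strip ASCII punctuation, split on whitespace, drop stopwords."""
    lowered = text.lower()
    # Delete every character listed in string.punctuation.
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    # Whitespace splitting is the simplest possible tokenizer.
    tokens = no_punct.split()
    return [tok for tok in tokens if tok not in STOPWORDS]


print(preprocess("The cat sat on the mat, and the dog barked!"))
# ['cat', 'sat', 'mat', 'dog', 'barked']
```

Whitespace tokenization is deliberately naive here; a dedicated tokenizer handles contractions, hyphenation, and non-ASCII punctuation more carefully.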

Learning objectives

Understand the fundamental concepts and applications of text mining.
Apply various text preprocessing techniques, including lowercasing, punctuation removal, stopword elimination, tokenization, stemming, and lemmatization, using Python.
Implement and utilize Bag-of-Words and TF-IDF models for text representation.
Perform Part-of-Speech tagging on text data with NLTK.
Understand and apply word embedding techniques like Word2Vec and Doc2Vec.
Conduct sentiment analysis on text data using Python.
Explore and implement topic modeling techniques such as Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA).
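To make the Bag-of-Words and TF-IDF objective concrete, the following sketch builds both representations in plain Python. It assumes documents are already tokenized, and it uses one common textbook weighting (term frequency times log(N / document frequency)); libraries often apply smoothed variants, so their numbers can differ. The function names `bag_of_words` and `tf_idf` are chosen for this example only.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Return the sorted vocabulary and one term-count vector per document."""
    vocab = sorted({tok for doc in docs for tok in doc})
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Weight each count as (count / doc length) * log(N / document frequency)."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each vocabulary term.
    df = [sum(1 for row in counts if row[i] > 0) for i in range(len(vocab))]
    weighted = []
    for row, doc in zip(counts, docs):
        length = len(doc) or 1  # avoid division by zero for empty documents
        weighted.append(
            [(c / length) * math.log(n_docs / df[i]) for i, c in enumerate(row)]
        )
    return vocab, weighted


docs = [["cat", "sat", "mat"], ["dog", "sat", "log"]]
vocab, counts = bag_of_words(docs)
_, weights = tf_idf(docs)
print(vocab)   # ['cat', 'dog', 'log', 'mat', 'sat']
print(counts)  # [[1, 0, 0, 1, 1], [0, 1, 1, 0, 1]]
```

In this toy corpus "sat" occurs in both documents, so its TF-IDF weight is zero, while terms unique to one document receive a positive weight.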

18 modules included

Updated: 21.03.2026

Time required (hours): -

1. Overview of text mining

2. Lowercase conversion, punctuation and stopword removal, and text tokenization in Python

3. Stemming and lemmatization

4. Stemming and lemmatization in Python

5. Bag of words

6. Bag of words in Python

7. TF-IDF

8. TF-IDF in Python

9. Part of speech tagging

10. Part of speech tagging with NLTK

11. Word2Vec

12. Word2Vec in Python

13. Doc2Vec

14. Sentiment analysis in Python

15. Latent semantic analysis

16. Latent semantic analysis in Python

17. Latent Dirichlet allocation

18. Latent Dirichlet allocation in Python