
This project applies Text Mining techniques using Python (NLTK, spaCy, TextBlob) to analyze a book. It includes text cleaning, tokenization, sentiment analysis, and keyword extraction to uncover insights.


CaritoRamos/text-mining-project-in-python


This project applies text mining to the analysis of a single book. Using natural language processing (NLP) techniques and tools such as NLTK and spaCy, it examines the book from several angles: word frequency, sentiment, and named entities. The goal is to use computational language analysis to extract meaningful insights from literary works.

A large text corpus is collected first and serves as the raw material for the analysis. The text is accessed through the NLTK library and prepared by converting all words to lowercase, removing punctuation, and splitting the text into word tokens.
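The preparation steps can be sketched as follows. This is a minimal illustration, not the project's actual code: the `preprocess` helper and the sample sentence are hypothetical, and the default tokenizer is a plain whitespace split so that the example runs without any downloaded NLTK data.

```python
import string


def preprocess(text, tokenize=str.split):
    """Lowercase the text, strip ASCII punctuation, then tokenize it.

    `tokenize` is any callable that maps a string to a list of tokens.
    The default (whitespace split) is an assumption made for this sketch.
    """
    lowered = text.lower()
    # Delete every character listed in string.punctuation.
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    return tokenize(no_punct)


# Hypothetical sample input standing in for the book's text.
sample = "It was the best of times, it was the worst of times."
tokens = preprocess(sample)
print(tokens)
```

With NLTK and its tokenizer data installed, `nltk.word_tokenize` can be passed as the `tokenize` argument instead. Note that stripping punctuation before tokenizing also removes apostrophes, so a contraction such as "don't" becomes "dont".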

Next, word frequency is analyzed using NLTK, identifying the most common words in the text. A word cloud visualizes these frequencies, highlighting thematic trends and recurring motifs in the book.
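A frequency count of this kind might look like the sketch below. It uses `collections.Counter` from the standard library; NLTK's `FreqDist` is a subclass of `Counter`, so it exposes the same `most_common` method. The `top_words` helper, its `stopwords` parameter, and the sample tokens are assumptions made for illustration.

```python
from collections import Counter


def top_words(tokens, n=5, stopwords=frozenset()):
    """Return the n most frequent tokens, ignoring any listed stopwords."""
    counts = Counter(t for t in tokens if t not in stopwords)
    return counts.most_common(n)


# Hypothetical, already-cleaned tokens standing in for the book's text.
tokens = "it was the best of times it was the worst of times".split()
print(top_words(tokens, n=3))
print(top_words(tokens, n=1, stopwords={"it", "was", "the", "of"}))
```

The source does not say which library draws the word cloud. One option is the `wordcloud` package, whose `WordCloud.generate_from_frequencies` method accepts a word-to-count mapping such as `dict(top_words(tokens, n=100))`.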

The analysis is then deepened with lemmatization and named entity recognition using spaCy, revealing the base forms of words and identifying key entities within the literary universe of the book.
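A rough sketch of this step is shown below. The helper names `summarize` and `analyze` are hypothetical, and the model name `en_core_web_sm` is an assumption, since the source does not say which spaCy pipeline it loads. spaCy is imported inside `analyze` so that `summarize` can be read and tested on its own.

```python
from collections import Counter


def summarize(doc):
    """Collect lemma counts and (text, label) entity pairs from a spaCy Doc."""
    lemmas = Counter(
        tok.lemma_.lower()
        for tok in doc
        if tok.is_alpha and not tok.is_stop  # skip punctuation, numbers, stopwords
    )
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    return lemmas, entities


def analyze(text, model="en_core_web_sm"):
    """Run a spaCy pipeline over `text` and summarize the result.

    The default model name is an assumption; it must be installed separately.
    """
    import spacy  # imported lazily; only needed when a real pipeline is run

    nlp = spacy.load(model)
    return summarize(nlp(text))
```

Calling `analyze(book_text)` returns the lemma counts and the entity list, where `book_text` is a placeholder for the loaded text. spaCy rejects inputs longer than `nlp.max_length` (1,000,000 characters by default), so a full book may need to be processed chapter by chapter.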

Finally, sentiment analysis is performed with TextBlob, which builds on NLTK, to calculate the polarity and subjectivity of the text. These scores indicate the emotions and tone conveyed throughout the work. They help in understanding character feelings, plot evolution, and the emotional impact on readers, and they support a deeper interpretation of underlying themes and author intent.
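Polarity and subjectivity are the pair of scores returned by TextBlob's `sentiment` property, so the sketch below assumes TextBlob as the scorer. To follow how tone changes across the book, it scores fixed-size chunks of words rather than the whole text at once. The chunk size of 200 and the helper names are assumptions, not the project's actual code.

```python
def chunk(words, size):
    """Join consecutive runs of `size` words back into text pieces."""
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def textblob_scores(text):
    """Return (polarity, subjectivity) for `text` using TextBlob."""
    from textblob import TextBlob  # imported lazily; requires textblob installed

    sentiment = TextBlob(text).sentiment
    return sentiment.polarity, sentiment.subjectivity


def sentiment_profile(text, size=200, scorer=textblob_scores):
    """Score each chunk of `size` words; returns one score tuple per chunk."""
    return [scorer(piece) for piece in chunk(text.split(), size)]
```

TextBlob reports polarity in the range -1.0 (negative) to 1.0 (positive) and subjectivity from 0.0 (objective) to 1.0 (subjective). Plotting the polarity values from `sentiment_profile(book_text)` in order gives a rough view of the emotional arc, where `book_text` is a placeholder for the loaded text.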
