Mahabharata: An Exploratory Text Analysis

Welcome 🦚

This project explores the Mahabharata, one of the world’s greatest epics, whose events one proposed chronology dates to 5561 BCE, using data science and natural language processing.

Using the public-domain English translation by Kisari Mohan Ganguli, I have analyzed all 18 Parvas (books) of the epic, uncovering patterns in sentiment, emotion, character relationships, and thematic structure.

Key Features

Interactive visualizations of sentiment and emotion across the epic’s timeline
Topic modeling and word embeddings to reveal hidden themes
Quantitative evidence supporting traditional interpretations of the Mahabharata’s characters and events

Analytical Approach

The analysis follows the Standard Text Analytic Data Model:

F0: Raw text (from sacred-texts.com, Ganguli translation)
F2: Parsing into structured tables (books, chapters, tokens)
F3: Annotation with linguistic features (POS, term frequency)
F4: Vectorization (TF-IDF, Bag-of-Words)
F5: Modeling (PCA, LDA, Word2Vec)
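
As a rough illustration of how the early stages fit together, here is a minimal sketch that goes from raw text to a token table and then to TF-IDF weights per chapter. It is not the project's actual code: the tiny sample text, the "BOOK n" / "SECTION n" heading patterns, and the function names parse_tokens and tfidf_by_chapter are all assumptions made for this example, and it uses only the Python standard library instead of a dedicated NLP toolkit.

```python
import math
import re
from collections import Counter

# Hypothetical miniature corpus standing in for the raw text (F0).
# The "BOOK n" / "SECTION n" heading patterns are assumptions for this sketch.
RAW_TEXT = """BOOK 1
SECTION 1
The king spoke of dharma and the forest.
SECTION 2
The sage spoke of the war and the king.
BOOK 2
SECTION 1
The dice game began in the hall.
"""


def parse_tokens(raw):
    """Parse raw text into rows of (book, chapter, token)."""
    rows = []
    book = chapter = None
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        book_match = re.fullmatch(r"BOOK (\d+)", line)
        chapter_match = re.fullmatch(r"SECTION (\d+)", line)
        if book_match:
            book = int(book_match.group(1))
            chapter = None  # a new book resets the chapter counter
        elif chapter_match:
            chapter = int(chapter_match.group(1))
        else:
            # Lowercase and keep alphabetic tokens only.
            for token in re.findall(r"[a-z']+", line.lower()):
                rows.append((book, chapter, token))
    return rows


def tfidf_by_chapter(rows):
    """Return {(book, chapter): {term: weight}}.

    Uses tf = count / chapter length and idf = log(N / df),
    treating each chapter as one document.
    """
    counts = {}
    for book, chapter, token in rows:
        counts.setdefault((book, chapter), Counter())[token] += 1

    n_docs = len(counts)
    doc_freq = Counter()
    for bag in counts.values():
        doc_freq.update(bag.keys())  # each term counted once per chapter

    weights = {}
    for doc_id, bag in counts.items():
        total = sum(bag.values())
        weights[doc_id] = {
            term: (n / total) * math.log(n_docs / doc_freq[term])
            for term, n in bag.items()
        }
    return weights


TOKENS = parse_tokens(RAW_TEXT)
TFIDF = tfidf_by_chapter(TOKENS)
```

In this toy corpus a word that occurs in every chapter, such as "the", receives a weight of zero, while a word confined to one chapter, such as "dice", receives a positive weight. That is the property that makes TF-IDF useful for surfacing chapter-specific vocabulary before modeling.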

Begin your journey on the Analysis page, or learn more about this project.