Saka284/imdb-dataset-analysis

IMDB Dataset Analysis Project

This project analyzes the IMDb Top 250 Movies dataset using Python in Google Colab. The analysis covers data exploration, data cleaning, and deriving insights from the dataset.

Introduction

The IMDb Top 250 Movies dataset lists the top-rated movies on IMDb along with their ratings, release details, budgets, and earnings. This project aims to explore the dataset, clean the data, perform exploratory data analysis (EDA), and visualize the findings.

Dataset

The dataset used in this project is the IMDB Top 250 Movies Excel file, which includes:

  • Movie Names: The names of the top-rated movies on IMDb.
  • Ratings: IMDb ratings for each movie.
  • Count of Ratings: The total number of ratings submitted by users for each movie.
  • Release Date: The date when each movie was officially released.
  • Country: The country of origin for each movie.
  • Budget: The estimated budget for producing each movie.
  • Domestic Gross: Earnings within the country of origin.
  • Domestic Weekend Gross: Opening weekend earnings within the country of origin.
  • Worldwide Gross: Total global earnings.

Data Cleaning

Key steps in data cleaning included:

  1. Converting dates to datetime format.
  2. Standardizing currency values to USD.
  3. Handling missing values and ensuring correct data types.
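
The steps above could be sketched roughly as follows. This is a minimal illustration with pandas, not the notebook's actual code: the column names (`release_date`, `rating`, `budget`, `worldwide_gross`), the helper names, and the `rates_to_usd` argument are all assumptions, and any exchange rates would have to be supplied by the caller.

```python
import pandas as pd


def parse_money(value, rates_to_usd=None):
    """Convert a string such as '$25,000,000' to a float amount in USD.

    rates_to_usd is a hypothetical mapping from a currency prefix
    (for example 'EUR') to a multiplier chosen by the caller.
    """
    rates = {"$": 1.0, **(rates_to_usd or {})}
    if pd.isna(value):
        return float("nan")
    text = str(value).strip()
    for prefix, rate in rates.items():
        if text.startswith(prefix):
            number = text[len(prefix):].replace(",", "").strip()
            try:
                return float(number) * rate
            except ValueError:
                return float("nan")
    # Unknown currency: leave the value missing rather than guess a rate.
    return float("nan")


def clean_movies(df, rates_to_usd=None):
    """Return a cleaned copy of df (column names are assumed, see above)."""
    out = df.copy()
    # Step 1: dates to datetime; unparseable entries become NaT.
    out["release_date"] = pd.to_datetime(out["release_date"], errors="coerce")
    # Step 2: money columns to numeric USD values.
    for col in ["budget", "worldwide_gross"]:
        out[col] = out[col].apply(lambda v: parse_money(v, rates_to_usd))
    # Step 3: enforce numeric ratings and drop rows missing key fields.
    out["rating"] = pd.to_numeric(out["rating"], errors="coerce")
    return out.dropna(subset=["rating", "release_date"])
```

Returning missing values for unrecognized currencies, instead of guessing a conversion, keeps the later budget and earnings comparisons from mixing units silently.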

Exploratory Data Analysis (EDA)

EDA included:

  • Visualization of rating distributions.
  • Analysis of release year trends.
  • Comparison of budgets and worldwide gross earnings.
  • Identification of top-grossing movies.

Visualizations

  1. Rating Distribution: Histogram of movie ratings.
  2. Release Year Trend: Line chart of ratings over years.
  3. Top 5 Movies: Bar chart of the highest worldwide grossing movies.
  4. Budget vs. Earnings: Scatter plot of budget against worldwide earnings.
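
One way to produce the four charts listed above is sketched below. It assumes an already cleaned DataFrame with hypothetical columns `name`, `rating`, `release_date` (datetime), `budget`, and `worldwide_gross`; the function names are illustrative and not taken from the notebook.

```python
import pandas as pd


def ratings_by_year(df):
    """Mean rating per release year, sorted by year."""
    years = df["release_date"].dt.year
    return df.groupby(years)["rating"].mean().sort_index()


def top_grossing(df, n=5):
    """The n movies with the highest worldwide gross."""
    return df.nlargest(n, "worldwide_gross")[["name", "worldwide_gross"]]


def plot_overview(df):
    """Draw the four charts on one figure and return it."""
    # Imported here so the aggregation helpers work without matplotlib.
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(2, 2, figsize=(12, 8))

    axes[0, 0].hist(df["rating"].dropna(), bins=10)
    axes[0, 0].set_title("Rating distribution")

    trend = ratings_by_year(df)
    axes[0, 1].plot(trend.index, trend.values)
    axes[0, 1].set_title("Mean rating by release year")

    top = top_grossing(df)
    axes[1, 0].barh(top["name"], top["worldwide_gross"])
    axes[1, 0].set_title("Top 5 movies by worldwide gross")

    axes[1, 1].scatter(df["budget"], df["worldwide_gross"])
    axes[1, 1].set_xlabel("Budget (USD)")
    axes[1, 1].set_ylabel("Worldwide gross (USD)")
    axes[1, 1].set_title("Budget vs. earnings")

    fig.tight_layout()
    return fig
```

In Colab, calling `plot_overview(df)` in a cell should display the figure inline.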

Insights

  • A high budget does not guarantee high earnings; other factors, such as storyline and cast, also matter.
  • Top-grossing movies have varied budgets, showing that financial success depends on multiple factors.
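
A simple way to check these observations numerically is to look at the correlation between budget and worldwide gross, plus the gross-to-budget ratio per movie. The sketch below assumes hypothetical numeric columns `budget` and `worldwide_gross`, both in USD; it is an illustration, not the analysis used in the notebook.

```python
import pandas as pd


def budget_vs_gross(df):
    """Return (Pearson correlation, per-movie gross-to-budget ratio)."""
    valid = df.dropna(subset=["budget", "worldwide_gross"])
    valid = valid[valid["budget"] > 0]  # avoid division by zero
    corr = valid["budget"].corr(valid["worldwide_gross"])
    ratio = (valid["worldwide_gross"] / valid["budget"]).rename("gross_to_budget")
    return corr, ratio
```

A correlation well below 1, or a wide spread in the ratio, would be consistent with the point that budget alone does not determine earnings.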

Conclusion

This analysis of the IMDB Top 250 Movies dataset reveals key insights into movie ratings and earnings. It highlights the importance of factors beyond budget in determining a movie's success, providing valuable information for movie enthusiasts, researchers, and industry professionals.
