Exploratory Data Analysis Projects

Project 1

Top 100 IMDb Movie Analysis

This project explores IMDb movie data through exploratory data analysis (EDA). It involves cleaning, processing, and visualizing key movie attributes such as ratings, genres, and box office performance. The analysis looks at trends in movie success and the factors that influence audience ratings, and draws insights about the film industry. Built in Python with Pandas, Matplotlib, and Seaborn, the project provides data-driven conclusions about the dynamics of popular movies.

Python, Pandas, Matplotlib, Seaborn, Data Cleaning
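
The cleaning and genre-level comparison described above could look roughly like the sketch below. It is illustrative only: the column names (`title`, `genre`, `rating`), the comma-separated genre format, and the toy rows are assumptions, not the project's actual dataset or code.

```python
import pandas as pd

# Toy rows with hypothetical column names; not the real IMDb data.
movies = pd.DataFrame({
    "title": ["Movie A", "Movie B", "Movie C", "Movie C", "Movie D"],
    "genre": ["Drama, Crime", "Drama", "Action, Crime", "Action, Crime", "Comedy"],
    "rating": ["9.0", "8.0", "8.0", "8.0", None],
})

# Cleaning: drop exact duplicate rows, convert ratings from text to numbers,
# and drop rows that have no usable rating.
cleaned = movies.drop_duplicates().copy()
cleaned["rating"] = pd.to_numeric(cleaned["rating"], errors="coerce")
cleaned = cleaned.dropna(subset=["rating"])

# Processing: split multi-genre strings so each (movie, genre) pair gets its own row.
cleaned["genre"] = cleaned["genre"].str.split(", ")
by_genre = cleaned.explode("genre")

# Summary: average rating per genre, highest first.
mean_rating_by_genre = (
    by_genre.groupby("genre")["rating"].mean().sort_values(ascending=False)
)
print(mean_rating_by_genre)
```

A Series like `mean_rating_by_genre` can be passed directly to a Matplotlib or Seaborn bar plot for the visualization step.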
Project 2

Credit Approval/Disapproval Analysis

This credit approval project uses Exploratory Data Analysis (EDA) to identify the key factors that influence loan defaults for a consumer finance company. Through data cleaning, outlier detection, and feature engineering, I uncovered critical risk indicators, including income-credit mismatches, employment status anomalies, and external credit score thresholds. The analysis showed that the highest default risks came from applicants whose credit amount exceeded four times their annual income, unemployed applicants, and applicants with low external credit scores. These insights can help the company optimize its loan approval process, minimize financial losses, and maintain a healthy portfolio while still approving creditworthy applicants.

Data Cleaning, Outlier Detection, Feature Engineering, Risk Analysis, Statistical Analysis
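
As a minimal sketch of the income-credit mismatch indicator described above, the snippet below derives a credit-to-income ratio and flags applications above the 4x threshold. The column names (`annual_income`, `credit_amount`, `defaulted`) and the toy values are assumptions for illustration; the project's real schema and results are not reproduced here.

```python
import pandas as pd

# Toy rows with hypothetical column names; not the company's real data.
applications = pd.DataFrame({
    "annual_income": [50_000, 30_000, 80_000, 20_000],
    "credit_amount": [100_000, 150_000, 400_000, 60_000],
    "defaulted": [0, 1, 1, 0],
})

# Feature engineering: ratio of requested credit to annual income.
applications["credit_income_ratio"] = (
    applications["credit_amount"] / applications["annual_income"]
)

# Flag applications whose credit amount exceeds 4x annual income.
applications["high_ratio"] = applications["credit_income_ratio"] > 4

# Compare observed default rates between unflagged and flagged applications.
default_rate_by_flag = applications.groupby("high_ratio")["defaulted"].mean()
print(default_rate_by_flag)
```

The same pattern (derive a feature, bucket it, compare default rates per bucket) would apply to the employment status and external credit score indicators.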
Project 3

Titanic Dataset Analysis

The Titanic Dataset Analysis explores the key factors that influenced passenger survival, using Exploratory Data Analysis (EDA) and preparing the data for machine learning. Through data cleaning, feature engineering, and statistical tests, the project identifies Sex, Pclass, and Embarked as the most significant predictors of survival. Females, first-class passengers, and those who embarked at Cherbourg had higher survival rates, while males in third class had the highest fatality rate. The processed dataset is ready for predictive modeling and provides a solid foundation for classification tasks.

Data Cleaning, Feature Engineering, Statistical Tests, Data Visualization, Survival Analysis
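
Survival-rate comparisons of the kind described above can be sketched with a Pandas group-by. The rows below are invented toy data, and the `Survived` column name is an assumption based on the commonly distributed version of the dataset; `Sex`, `Pclass`, and `Embarked` are the predictors named in the description.

```python
import pandas as pd

# Invented toy rows for illustration; not the real Titanic data.
passengers = pd.DataFrame({
    "Survived": [1, 1, 0, 0, 1, 0, 0, 1],
    "Sex": ["female", "female", "male", "male", "female", "male", "male", "male"],
    "Pclass": [1, 2, 3, 3, 3, 1, 3, 1],
    "Embarked": ["C", "S", "S", "Q", "C", "S", "S", "C"],
})


def survival_rate_by(df: pd.DataFrame, column: str) -> pd.Series:
    """Return the share of survivors within each category of `column`."""
    return df.groupby(column)["Survived"].mean()


rate_by_sex = survival_rate_by(passengers, "Sex")
rate_by_class = survival_rate_by(passengers, "Pclass")
rate_by_port = survival_rate_by(passengers, "Embarked")

# Survival rate for each Sex x Pclass combination (NaN where a cell has no rows).
rate_by_sex_and_class = passengers.pivot_table(
    index="Sex", columns="Pclass", values="Survived", aggfunc="mean"
)
print(rate_by_sex_and_class)
```

Because `Survived` is coded 0/1, the group mean is the survival rate, which keeps the helper to a single line.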
Project 4

World Population Analysis

This project explores global population trends using the World Bank's dataset, which covers 266 countries and regions from 1960 to 2023. The analysis identifies key growth patterns, demographic shifts, and regional variations, and assesses the impact of historical events such as pandemics and conflicts. Key insights include India surpassing China as the most populous country, rapid growth in Africa, and population decline in Europe linked to an aging population. Urbanization trends, density variations, and migration effects are also analyzed. The dataset was cleaned, processed, and scaled for readability, making it a useful resource for population forecasting and policy planning.

Time Series Analysis, Data Cleaning, Demographic Analysis, Geospatial Analysis, Data Scaling
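
The processing and scaling steps mentioned above might look something like the following sketch, which reshapes a wide table (one column per year) into a long one, rescales population to millions, and computes year-over-year growth. The country names and figures are placeholders, and the column names are assumptions about the layout rather than the project's actual code.

```python
import pandas as pd

# Placeholder names and made-up numbers in a wide, one-column-per-year layout.
wide = pd.DataFrame({
    "Country Name": ["Country A", "Country B"],
    "1960": [1_000_000, 5_000_000],
    "1961": [1_100_000, 5_050_000],
})

# Reshape to one row per (country, year) so time-series operations are simple.
tidy = wide.melt(id_vars="Country Name", var_name="Year", value_name="Population")
tidy["Year"] = tidy["Year"].astype(int)
tidy = tidy.sort_values(["Country Name", "Year"]).reset_index(drop=True)

# Scale raw head counts to millions for readability.
tidy["Population (millions)"] = tidy["Population"] / 1_000_000

# Year-over-year growth in percent, computed separately for each country.
tidy["Growth (%)"] = tidy.groupby("Country Name")["Population"].pct_change() * 100
print(tidy)
```

The first year of each country has no prior value, so its growth is left as NaN rather than filled in.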