Investigating Changes in Global Happiness Following the COVID-19 Pandemic

Cynthia Fonderson | Jul 15, 2023

Following my data analysis project on the global impact of the pandemic, I was interested in seeing whether global happiness was also affected by the virus.

The dataset used in this project contains updated data from the World Happiness Report, curated by Kaggle user mathurinache. The data spans 8 years (2015 to 2022).

Project Outline

  1. Data Ingestion & Cleaning
  2. Exploratory Data Analysis & Visualization
  3. Happiness Score Prediction using Regression
  4. Clustering Analysis Based on Factors Affecting Happiness
  5. Summary

Project Summary & Conclusions

Data Curation

Following data ingestion and cleaning, the dataset contained 10 columns and ~1,300 records:

| Variable | Description |
| --- | --- |
| Year | Reporting year |
| Country | Country name |
| Happiness Score | Happiness score |
| Economy (GDP per Capita) | Score based on the country's gross domestic product per capita |
| Family | Score based on the country's social support systems |
| Health (Life Expectancy) | Score based on the country's life expectancy |
| Freedom | Score based on citizens' freedom to make life choices |
| Trust (Government Corruption) | Score based on citizens' perception of corruption in the government |
| Generosity | Score based on citizens' perception of generosity |
| Happiness Rank | The country's overall performance relative to other nations |
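
The cleaning steps themselves are not listed in this summary. As a rough sketch only, combining yearly files into one table with a shared schema might look like the following. The raw column names, the `RENAME` mapping, and all values are made up for illustration and are not taken from the Kaggle files.

```python
import pandas as pd

# Hypothetical raw frames standing in for two yearly files; the raw column
# names and values are invented for illustration.
raw_2015 = pd.DataFrame({
    "Country": ["A", "B"],
    "Happiness Score": [7.5, 4.2],
    "Economy (GDP per Capita)": [1.40, 0.35],
})
raw_2022 = pd.DataFrame({
    "Country": ["A", "B"],
    "Happiness score": [7.4, None],  # different capitalisation, one missing value
    "Explained by: GDP per capita": [1.45, 0.40],
})

# Assumed mapping from year-specific headers to one shared schema.
RENAME = {
    "Happiness score": "Happiness Score",
    "Explained by: GDP per capita": "Economy (GDP per Capita)",
}

frames = []
for year, frame in {2015: raw_2015, 2022: raw_2022}.items():
    tidy = frame.rename(columns=RENAME)  # unknown keys are simply ignored
    tidy.insert(0, "Year", year)         # record the reporting year
    frames.append(tidy)

combined = pd.concat(frames, ignore_index=True)

# Drop records that have no happiness score.
cleaned = combined.dropna(subset=["Happiness Score"]).reset_index(drop=True)
print(cleaned)
```

Renaming before concatenating keeps `pd.concat` from creating duplicate, half-empty columns for the same variable.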



Data Exploration and Analysis

To investigate differences in global happiness before and after the pandemic, I divided the dataset into two periods (pre: 2015 to 2018; post: 2019 to 2022). Overall, no significant changes were observed in happiness rankings between the two periods.

[Figure: ranks]
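
The period split described above can be sketched with pandas. This is a minimal illustration, assuming a cleaned frame with `Year`, `Country` and `Happiness Rank` columns; the `label_period` helper and the toy values are hypothetical and not taken from the World Happiness Report.

```python
import pandas as pd

# Tiny made-up frame standing in for the cleaned dataset.
df = pd.DataFrame({
    "Year": [2015, 2018, 2019, 2022, 2015, 2018, 2019, 2022],
    "Country": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "Happiness Rank": [1, 2, 1, 2, 10, 9, 11, 10],
})


def label_period(year: int) -> str:
    """Return 'pre' for 2015-2018 and 'post' for 2019-2022."""
    return "pre" if year <= 2018 else "post"


df["Period"] = df["Year"].map(label_period)

# Mean rank per country and period, then the post-minus-pre change.
mean_rank = df.pivot_table(
    index="Country", columns="Period", values="Happiness Rank", aggfunc="mean"
)
mean_rank["change"] = mean_rank["post"] - mean_rank["pre"]
print(mean_rank)
```

A `change` close to zero for most countries would be consistent with the observation that rankings barely moved between the two periods.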

[Figure: correlation matrix]
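
The figure name suggests a correlation matrix of the dataset's numeric columns. A minimal sketch of how one could be computed, assuming Pearson correlation via `DataFrame.corr()`, is shown here. The data is synthetic and the relationships between columns are invented for the example.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 200

# Synthetic factor scores; the dependencies below are assumptions made
# purely so the example has some structure to show.
gdp = rng.uniform(0.0, 1.5, n)
health = 0.6 * gdp + rng.normal(0.0, 0.1, n)
generosity = rng.uniform(0.0, 0.5, n)
score = 3.0 + 2.0 * gdp + 1.0 * health + rng.normal(0.0, 0.3, n)

df = pd.DataFrame({
    "Happiness Score": score,
    "Economy (GDP per Capita)": gdp,
    "Health (Life Expectancy)": health,
    "Generosity": generosity,
})

# Pairwise Pearson correlations between all numeric columns.
corr_mat = df.corr()
print(corr_mat.round(2))
```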

Generally, developed countries, including the US, Canada, Australia and Nordic countries (Norway, Finland), achieved the highest happiness scores across the globe. On the other hand, developing countries, particularly African nations, are reportedly the least happy in the world.

[Figure: happiness]

Regression and Clustering Analysis

Following EDA, I evaluated the performance of five regression models in predicting a nation's happiness score from its economy and the other factors listed above.

Results:

| Model | Accuracy (%) |
| --- | --- |
| LinearRegression | 77.1 |
| SVR | 79.5 |
| DecisionTreeRegressor | 54.6 |
| RandomForestRegressor | 78.4 |
| MLPRegressor | 72.2 |

Overall, the SVR and Random Forest models outperformed the other regression models, with prediction accuracies of ~80%. The Decision Tree Regressor performed worst, with an accuracy of ~55%.
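
A comparison like this can be sketched with scikit-learn. The summary does not define the accuracy metric, the train/test split, or the hyperparameters, so everything here is an assumption: the example uses each model's default `.score()` (the R² coefficient of determination), a 75/25 split, mostly default settings, and synthetic data. Its numbers are not comparable to the table above.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for six factor scores and a happiness score.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.5, size=(300, 6))
weights = np.array([1.0, 0.8, 0.9, 0.6, 0.3, 0.2])  # invented coefficients
y = 2.0 + X @ weights + rng.normal(0.0, 0.1, 300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "LinearRegression": LinearRegression(),
    "SVR": SVR(),
    "DecisionTreeRegressor": DecisionTreeRegressor(random_state=0),
    "RandomForestRegressor": RandomForestRegressor(random_state=0),
    "MLPRegressor": MLPRegressor(max_iter=2000, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # For regressors, .score() returns R^2 on the held-out data.
    scores[name] = model.score(X_test, y_test)

for name, r2 in scores.items():
    print(f"{name}: {100 * r2:.1f}")
```

Scoring every model on the same held-out split keeps the comparison fair; cross-validation would give more stable estimates on a dataset of this size.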

I also wanted to tease out the characteristics that separate "happy" countries from other nations of the world. Consequently, I conducted a clustering analysis on the dataset using the KMeans algorithm. Interestingly, I found that global happiness is highly correlated with a country's GDP, which in turn influences the other factors considered, including life expectancy and generosity. Therefore, a bias likely exists in the way we as humans determine happiness on the global scale.

[Figure: clusters]
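
A KMeans analysis of this kind could be sketched as follows. The number of clusters, the feature subset, and the scaling step are assumptions, since the summary does not state them, and the two synthetic groups of "countries" are invented to make the clusters easy to see.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Two synthetic groups: columns are GDP score, health score, happiness score.
low = rng.normal(loc=[0.4, 0.4, 4.0], scale=0.05, size=(20, 3))
high = rng.normal(loc=[1.4, 0.9, 7.0], scale=0.05, size=(20, 3))

df = pd.DataFrame(
    np.vstack([low, high]),
    columns=["Economy (GDP per Capita)", "Health (Life Expectancy)", "Happiness Score"],
)

# Cluster on the factors only, after putting them on a common scale.
features = ["Economy (GDP per Capita)", "Health (Life Expectancy)"]
X = StandardScaler().fit_transform(df[features])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
df["Cluster"] = kmeans.fit_predict(X)

# Average factor values and happiness score within each cluster.
profile = df.groupby("Cluster")[features + ["Happiness Score"]].mean()
print(profile.round(2))
```

Comparing the per-cluster means is one way to see whether the cluster with the higher GDP score is also the one with the higher happiness score.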

Full Project