Coronavirus disease (COVID-19) is an infectious disease caused by a novel coronavirus. In early March 2020, COVID-19 was declared a pandemic by the World Health Organization.1 Since its emergence and global spread, the pandemic has been one of the greatest crises in decades, with over one hundred million confirmed cases, several million deaths, and an array of socio-economic, political, and other impacts. Although COVID-19 has been a truly global problem, affecting all regions, the COVID-19 burden in Sub-Saharan Africa (SSA) has been substantially lower than was predicted. Countries in SSA reported first confirmed cases and experienced a surge of the virus much later than other parts of the world, while also consistently reporting a comparatively low number of cases and deaths.2,3 Additionally, the case-fatality ratio for many countries in SSA has generally been lower than many other parts of the world, suggesting that disease outcomes have been less severe among populations in the region.4,5

Figure 1. Cumulative confirmed COVID-19 deaths per million people, sub-Saharan Africa

At the same time, while SSA has had a generally lower overall COVID-19 burden than many other parts of the world, there has been great heterogeneity in burden and outcomes between individual countries within the region. For instance, some countries, such as South Africa and Ethiopia, have reported considerably higher prevalence and mortality than others. What accounts for the significant cross-country variability apparent in SSA, and why have some countries performed better than others? The present study examines factors potentially associated with COVID-19 burden in the region. Through a multiple regression analysis of a unique cross-sectional dataset comprising 48 countries from SSA, important country-level factors associated with COVID-19 deaths are investigated.

The present study is significant for several reasons. It strengthens and deepens understanding of prominent factors associated with COVID-19 burden, thus complementing the existing knowledge base and potentially informing policies or approaches to future pandemics. As well, whereas many previous studies have focused on biological and medical factors related to COVID-19, this study extends consideration to socio-economic, political, and other country-level factors which can also play a role in elucidating differences in COVID-19 outcomes across countries. Moreover, it investigates a considerably large number of countries and factors potentially associated with COVID-19, therefore offering a broader, wide-ranging, and more comprehensive investigation than many previous studies, which have been limited to examining single or only a few factors or countries. Last, the present study expands awareness and offers useful insights about COVID-19 in SSA, a massive region comprising 48 countries, which has received comparatively limited coverage and a paucity of research attention.

The paper is structured as follows. The methods are presented in the next section, featuring an outline of the design and description of the data. Subsequently, the results are reported, followed by the discussion. The final section presents conclusions and notes possible limitations.



The present study is a cross-sectional analysis of 48 countries from SSA, as classified by the World Bank’s regional designations.6 A unique dataset is constructed utilizing data that was extracted from various public open access databases. The variables selected for analysis cover socio-demographic, political, economic, and health-related dimensions of countries and are included on the basis of empirical findings reported in recent studies, the growing and evolving literature, relevant theoretical frameworks, and general data availability.7,8


COVID-19 burden: consistent with a large number of empirical studies, cumulative total deaths per million of the population is the outcome variable.9 The data is from the Our World in Data open access database and accurate up to 7 May 2021.10

Gross national income per capita: this variable presents a country’s gross national income (GNI), divided by its total population. Data are presented in current US dollars and taken from the World Bank’s World Development Indicators (WDI) open access database, which provides comprehensive cross-country comparable data on development.11

Global connectivity: the KOF Globalisation Index (KOFGI), which is the most widely used globalization index in the academic literature, is used as a proxy for global connectivity. A composite index based on 43 individual variables, the KOFGI measures globalization for every country in the world along economic, social, and political dimensions. Higher KOFGI values reflect greater degrees of globalization.12

Health system: data for traditional measures of national health system capacity (e.g., physicians per 1,000 population, intensive care unit capacity, or hospital beds per 1,000 population) is unavailable for many countries. Accordingly, national diphtheria, tetanus, and pertussis (DTP) immunization coverage is utilized. DTP coverage is regarded as a useful, standard measure of the strength and efficiency of national health systems, mainly because delivery requires three contacts with the health system at appropriate times and also because coverage is usually part of routine national vaccination programs rather than campaigns.13 Defined as the percentage of children ages 12-23 months who received vaccinations before 12 months or at any time before the survey, this variable is available from the WDI database.14

Population density: drawn from the WDI dataset, population density is measured as the total population of a country divided by its total land area in square kilometers.15

Median age of population: this variable, gathered from the World Population Prospects public database, presents the median age of the population of a country, which divides the population in two parts of equal size, so that there are as many persons with ages above the median as there are with ages below the median. It is expressed in years.16

Polity: the data for this variable are collected from the widely used Polity5 Project database, which codes the political regime and authority characteristics of states in the world system for purposes of comparative, quantitative analysis. For each country, a “Polity Score” is determined, ranging from +10 (strongly democratic) to -10 (strongly autocratic). Scores are based on an evaluation of democratic and autocratic characteristics and elements of regimes, including elections, the nature of political participation, and the extent of checks on executive authority.17

Women’s leadership and parliamentary representation: this variable is the number of seats held by women members in single or lower chambers of national parliaments, expressed as a percentage of all occupied seats. Retrieved from the WDI dataset, it is derived by dividing the total number of seats occupied by women by the total number of seats in parliament.18

Inequality: in accord with numerous other empirical studies, the Gini coefficient is used as a measure of income inequality.19 The Gini coefficient measures the extent to which the distribution of income among individuals or households within an economy deviates from a perfectly equal distribution. It runs from 0 to 100, with 0 reflecting complete equality and 100 reflecting complete inequality. It is available from the WDI.20
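To make the 0 to 100 scale concrete, the following is a minimal sketch of a rank-based Gini calculation. It is illustrative only: the income lists are hypothetical, the function name `gini_index` is an assumption, and the published WDI figures are estimated from household survey data rather than computed this way.

```python
def gini_index(incomes):
    """Return a Gini index on a 0-100 scale for a list of non-negative incomes."""
    values = sorted(incomes)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive income")
    # Rank-weighted formula: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    # where i = 1..n is the rank of each income after sorting.
    weighted = sum(rank * x for rank, x in enumerate(values, start=1))
    return 100 * (2 * weighted / (n * total) - (n + 1) / n)


# Hypothetical four-person economies, for illustration only.
equal = gini_index([10, 10, 10, 10])   # everyone earns the same
skewed = gini_index([0, 0, 0, 40])     # one person earns everything

print(equal, skewed)
```

With only four people, the most unequal case yields 75 rather than 100, because this finite-sample formula has a maximum of 100 × (n − 1)/n; the value approaches 100 as the population grows.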


For preliminary exploration of the relationships between variables, summary statistics and correlations are examined. Correlation analysis explores the association between variables, allowing for better understanding of the magnitude and direction of relationships. Correlation coefficients are measured on a standard scale that ranges between -1 and +1, with positive and negative values indicating the direction of relationship. Coefficients that are closer to -1 or +1 reflect stronger or larger correlations, with coefficients of -1 or +1 representing perfect correlation. Coefficients that are closer to 0 reflect a very weak or small association.21
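As an illustration of how a correlation coefficient of this kind is obtained, the sketch below computes Pearson's r directly from its definition. The data values are hypothetical and are not drawn from the study's dataset; the function name `pearson_r` is likewise an assumption made for this example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical country-level values, for illustration only.
median_age = [15.2, 18.0, 20.1, 25.3, 37.5]
deaths_per_million = [5.0, 20.0, 60.0, 150.0, 900.0]

r = pearson_r(median_age, deaths_per_million)
perfect = pearson_r([1, 2, 3], [2, 4, 6])   # exact increasing line
inverse = pearson_r([1, 2, 3], [6, 4, 2])   # exact decreasing line

print(round(r, 3), perfect, inverse)
```

The two exact lines recover the endpoints of the scale (+1 and -1), while the hypothetical country values fall strictly between 0 and +1.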

As well, regression analysis, one of the most widely used statistical techniques in scenarios where multiple variables affect a single outcome, is conducted. Regression analysis examines the relationship between two or more factors at the same time and analyzes the extent to which each predicts or explains variations in the outcome of interest while others are controlled.22 Notably, regression analysis has been used in a considerable amount of work conducted on COVID-19.23
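The idea of estimating each factor's contribution while the others are controlled can be sketched with ordinary least squares. The example below is not the SPSS procedure used in this study; it is a small, self-contained illustration that solves the normal equations directly. The function names and the toy data, constructed so that the true coefficients are known, are assumptions made for this example.

```python
def fit_ols(rows, y):
    """Ordinary least squares; returns [intercept, b1, ..., bk]."""
    # Design matrix with a leading column of ones for the intercept.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(A[idx][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        for idx in range(k):
            if idx != col:
                factor = A[idx][col] / A[col][col]
                A[idx] = [a - factor * b for a, b in zip(A[idx], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def r_squared(rows, y, coefs):
    """Share of the variance in y explained by the fitted model."""
    predicted = [coefs[0] + sum(b * v for b, v in zip(coefs[1:], row)) for row in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Toy data (hypothetical): y = 2 + 3*x1 - 1*x2 exactly, so the answer is known.
rows = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8)]
y = [2 + 3 * x1 - x2 for x1, x2 in rows]

coefs = fit_ols(rows, y)
r2 = r_squared(rows, y, coefs)
print([round(c, 6) for c in coefs], round(r2, 6))
```

Each slope is the expected change in the outcome for a one-unit change in that predictor with the other predictor held fixed, which is how the B coefficients reported in the results are read.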

All analyses were performed using the Statistical Package for the Social Sciences (SPSS), version 23.


Summary statistics

The summary statistics are shown in Table 1. The mean cumulative COVID-19 deaths per million is 93.39, mean GNI per capita is US$2443.56, and the mean global connectivity value is 50.15. For inequality, the mean is 43.81, while the means for DTP coverage and polity score are 80.31 and 2.81, respectively. The mean median age is 20.05 years, while population density has a mean of 108.84 people per square kilometer. Last, the mean percentage of parliamentary seats held by women is 21.70.

Table 1. Summary statistics of variables for sub-Saharan African countries

| Variable | Observations | Mean | Maximum | Minimum | Standard deviation |
|---|---|---|---|---|---|
| Deaths per million | 48 | 93.39 | 922.07 | 0.35 | 168.39 |
| Population density | 48 | 108.84 | 623.30 | 2.97 | 137.19 |
| Median age | 48 | 20.05 | 37.50 | 15.20 | 4.19 |
| GNI per capita | 45 | 2443.56 | 16900.00 | 280.00 | 3314.23 |
| Inequality | 45 | 43.81 | 63.00 | 32.60 | 7.81 |
| Globalization | 47 | 50.15 | 72.00 | 30.00 | 8.44 |
| DTP coverage | 48 | 80.31 | 99.00 | 42.00 | 16.34 |
| Polity | 48 | 2.81 | 10.00 | -9.0 | 4.91 |
| Women in parliament | 47 | 21.70 | 61.25 | 3.38 | 12.47 |

DTP – diphtheria, tetanus and pertussis, GNI – gross national income

Correlation coefficients

Pearson’s correlation coefficients, which help to reveal possible associations between variables and COVID-19 burden, are reported in Table 2. The results show a moderate, statistically significant positive correlation between COVID-19 deaths per million and GNI per capita (r = .398, P = .007), global connectivity (r = .378, P = .009), inequality (r = .531, P < .001), and median age (r = .471, P = .001). This indicates that higher GNI per capita and greater inequality are associated with more COVID-19 deaths per million. Furthermore, as the median age and globalization of a country increase, so do cumulative COVID-19 deaths per million.

Table 2. Pearson correlation coefficients

| Variable | Deaths per million |
|---|---|
| Population density | -.063 |
| Median age | .471*** |
| GNI per capita | .398*** |
| Inequality | .531*** |
| Globalization | .378*** |
| DTP coverage | .190 |
| Polity | .098 |
| Women in parliament | .156 |

P-values are in parentheses.
*** = Correlation is significant at the 0.01 level.
** = Correlation is significant at the 0.05 level.
* = Correlation is significant at the 0.10 level.
DTP – diphtheria, tetanus and pertussis, GNI – gross national income
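The significance of a Pearson correlation is conventionally assessed with a t statistic on n − 2 degrees of freedom. The sketch below applies that standard formula to the four significant coefficients reported above. It rests on two assumptions not stated in the text: that each pairwise sample size equals the smaller observation count given in Table 1, and that 2.704 (the two-tailed 1% critical value for 40 degrees of freedom in standard t tables) is an adequate, slightly conservative threshold for the 43 to 46 degrees of freedom involved here.

```python
import math


def t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# r values are those reported in the text; n values are ASSUMED from Table 1.
reported = {
    "GNI per capita": (0.398, 45),
    "Globalization": (0.378, 47),
    "Inequality": (0.531, 45),
    "Median age": (0.471, 48),
}

# Two-tailed 1% critical value for 40 df; larger df give a slightly lower
# threshold, so using this value errs on the conservative side.
CRITICAL_1PCT = 2.704

t_values = {name: t_statistic(r, n) for name, (r, n) in reported.items()}
for name, t in t_values.items():
    print(f"{name}: t = {t:.2f}, exceeds 1% threshold: {t > CRITICAL_1PCT}")
```

Under these assumptions, all four statistics clear the 1% threshold, which is consistent with the significance markers attached to these variables in Table 2.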

Regression analysis

In Table 3, the results of the regression analyses are presented. The different variables are included in several stages, according to socio-demographic, political, economic, and health-related dimensions, in order to better examine their possible association with COVID-19 deaths per million.

Panel I explores population density and median age. Results reveal that population density and median age together explain 29.4% of the variance (R2=0.294) and that Panel I is a significant predictor of deaths per million, F (2, 45)=9.371, P<0.001. Both population density (B=-0.357, P=0.038) and median age (B=23.444, P<0.001) contribute significantly.

In Panel II, population density and median age are retained, while GNI per capita and the measure of inequality are introduced. Panel II accounts for approximately 50.4% of the variability in deaths per million (R2=0.504) and is significantly useful in explaining deaths per million, F (4, 39)=9.896, P<0.001. The estimates show that median age (B=29.4, P<0.006) and inequality (B=10.173, P<0.001) are positively associated with deaths per million. That is, a one-year increase in median age is associated with 29.4 more cumulative COVID-19 deaths per million, and a one-point increase in the Gini coefficient is associated with 10.173 more deaths per million, holding the other variables constant. However, population density is negatively associated with deaths per million (B=-0.285, P=0.078) at the 10% level, while GNI per capita fails to demonstrate a statistically significant relationship with COVID-19 deaths per million.

Panel III builds on the previous panels by adding DTP coverage and global connectivity. About 51.2% of the variability in COVID-19 deaths per million can be accounted for with Panel III (R2=0.512), which is also significantly useful in explaining deaths per million, F (6, 37)=6.466, P=0.001. Once again, the estimates show that median age (B=26.572, P=0.020) and inequality (B=10.048, P=0.001) are positively associated with COVID-19 deaths, while population density (B=-0.303, P=0.085) remains negatively associated at the 10% level. The other variables included do not make a statistically significant contribution.

In Panel IV, all of the variables are included. Panel IV explains approximately 54.2% of the variation in COVID-19 deaths per million (R2=0.542) and is significantly useful in explaining COVID-19 deaths per million, F (8, 34)=5.033, P<0.001. As with the other regressions, median age (B=31.991, P=0.011) and inequality (B=9.442, P=0.002) demonstrate a positive association with deaths per million, while population density also contributes significantly, being negatively associated with deaths per million at the 5% level (B=-0.384, P=0.049). Thus, as median age and inequality increase, so do COVID-19 deaths per million. On the other hand, as population density increases, cumulative COVID-19 deaths per million decrease. The other variables included do not make a statistically significant contribution.
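The reported F statistics can be cross-checked against the reported R2 values using the standard identity F = (R2 / k) / ((1 − R2) / df), where k is the number of predictors and df the residual degrees of freedom. The sketch below applies it to the four panels; the inputs are the rounded figures quoted above, so small discrepancies from rounding are expected, and the function name `f_from_r2` is an assumption made for this example.

```python
def f_from_r2(r2, df_model, df_resid):
    """Overall F statistic implied by R-squared and the degrees of freedom."""
    return (r2 / df_model) / ((1 - r2) / df_resid)


# (R2, model df, residual df, reported F) as quoted in the text for each panel.
panels = {
    "I": (0.294, 2, 45, 9.371),
    "II": (0.504, 4, 39, 9.896),
    "III": (0.512, 6, 37, 6.466),
    "IV": (0.542, 8, 34, 5.033),
}

recomputed = {
    name: f_from_r2(r2, df_model, df_resid)
    for name, (r2, df_model, df_resid, _) in panels.items()
}

for name, (_, _, _, reported_f) in panels.items():
    print(f"Panel {name}: recomputed F = {recomputed[name]:.3f}, reported F = {reported_f}")
```

The recomputed values agree with the reported statistics to within rounding of the three-decimal R2 inputs, which suggests that the R2, F, and degrees of freedom quoted for each panel are internally consistent.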


The findings in the present study are noteworthy and provide several key insights. For instance, past scholarship has demonstrated an association between inequality and various health outcomes,24 while recent studies from various settings show an association between inequality and COVID-19 burden.25,26 Consistent with this body of work, the present study finds a clear positive association between inequality and COVID-19 deaths.

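Table 3. Regression analysis of factors associated with COVID-19 deaths in Africa

| Variable | Panel I | Panel II | Panel III | Panel IV |
|---|---|---|---|---|
| Population density | -.357** | -.285* | -.303* | -.384** |
| Median age | 23.444*** | 29.4*** | 26.572** | 31.991** |
| GNI per capita | | -.014 | | |
| Inequality | | 10.173*** | 10.048*** | 9.442*** |
| Globalization | | | 1.625 | |
| DTP coverage | | | .846 | |
| Polity | | | | -7.943 |
| Women in parliament | | | | .330 |
| R2 | .294 | .504 | .512 | .542 |
| F | 9.371 | 9.896 | 6.466 | 5.033 |
| P | (.000)*** | (.000)*** | (.000)*** | (.000)*** |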

Dependent variable is COVID-19 deaths per million.
P-values are in parentheses.
*** = significant at the 0.01 level.
** = significant at the 0.05 level.
* = significant at the 0.10 level.
DTP – diphtheria, tetanus and pertussis, GNI – gross national income

There are several different possible mechanisms that may help explain this finding. Societies with greater levels of inequality may have less social cohesion or lower public trust levels,27 which can undermine compliance with government and public health guidelines,28 and inequality is additionally associated with an array of health conditions and lifestyle factors, such as diabetes, obesity, and lack of exercise, which can leave individuals or communities vulnerable to infections, severe cases, and deaths.29,30 Inequality may also increase risk of infection, as the most disadvantaged individuals cannot work from home and must remain in high-risk employment,31 while inequality and exclusion are associated with differences in health literacy and information seeking behavior, which can impact health outcomes.32

Also, the findings related to population density are notable. In particular, previous work has illustrated that population density can be a major factor in the transmission of infectious diseases and that epidemics or outbreaks may be more frequent or severe when population density is high.33,34 More recently, a substantial body of work has found an association between population density and COVID-19.35 In stark contrast, this study finds that population density is negatively associated with COVID-19 deaths. While potentially somewhat surprising, this is in accord with recent work indicating that higher population density may be linked with fewer COVID-19 cases or deaths. Possible explanations include that areas of greater density generally have better and more concentrated healthcare infrastructure and facilities, more quality staff and specialized services, faster response times, and more intensive-care beds and other health resources, all of which can help reduce infections or deaths related to COVID-19.36

From early on in the COVID-19 pandemic, researchers have widely documented that older populations, who are more likely to have comorbidities, declining immune systems, and preexisting medical conditions, have more vulnerability to infection, dramatically greater risk of developing severe and more serious complications, and strikingly higher mortality rates compared to younger groups.37,38 Furthermore, studies have found a strong association between median age and COVID-19 burden.39 The present study confirms this existing body of empirical work, finding a clear association between median age and COVID-19 deaths.

Global connectivity and levels of international air traffic or travel facilitated the spread of disease during past crises, and some research suggests that they may have also played a role in the early importation of COVID-19.40,41 Still, the present study finds that global connectivity, as measured by a standard, extensively used index of globalization, the KOFGI, is not associated with COVID-19 deaths.

Generally, national wealth, income, or economic performance, and healthcare systems or capacity have consistently been shown to be major determinants of health and development outcomes.42 However, the fact that GNI per capita and DTP coverage do not demonstrate statistical significance in the present study parallels work suggesting that countries’ levels of wealth, development, and income are not the main criterion or sole determinant of health outcomes or COVID-19 burden. It is notable that, as the COVID-19 pandemic has unfolded around the world, many of the wealthiest, most highly developed countries, characterized by well-equipped healthcare systems and considerable, high-quality resources, have been ill-prepared and among those most severely hit. On the other hand, some less developed, low- and middle-income countries have reported much lower COVID-19 incidence and mortality rates. Despite facing genuine obstacles and various constraints, the latter may be able to plan and execute timely, innovative, and efficient responses, coordinate efforts of various sectors and mobilize civil society, implement and enforce effective policies or guidelines, and maximize limited assets or resources.43

A broad literature exists exploring the relationship, vulnerability, and response of different political systems or regime types to disasters, emergencies, and crises.44 Since the onset of the COVID-19 pandemic, as well, the questions of whether and how different political systems may impact national COVID-19 burden have been prominent and widely debated. While the scholarship continues to grow and develop, to date, there is no strong or unambiguous consensus; there are conflicting arguments and findings apparent and both forms of government seem to have specific strengths and weaknesses.45 This study does not find an association between political system and COVID-19 deaths. The pandemic has not distinguished between democracies and autocracies in SSA. Moreover, there has been wide variance in responses within democratic and autocratic regimes in the region, and strictly focusing on the type of political system risks neglecting other potentially significant factors. Rather than the particular type of political regime, findings suggest that several country-specific variables, such as income inequality, population density, and age, have more bearing on COVID-19-related outcomes.

Another frequent topic of commentary, debate, and comparison during the COVID-19 pandemic has been whether countries led by women have fared better than those led by men.46 Plausibly, gender may be significant in different ways. For example, some work has indicated that women tend to be more loss- and risk-averse than men,47 therefore possibly making them less willing to accept health risks and more likely to introduce restrictive measures earlier. Additionally, more women in legislature has also been associated with an increase in public health spending,48 while a lack of diverse representation in leadership can exclude those who offer unique perspectives or expertise.49 Johnson and Williams50 also suggest that women leaders may be able to draw on their traditional motherly role - for example, as the member of the household who traditionally cares for the sick - to display forms of feminine protectionism.

Notably, in a study of COVID-19 deaths in the United States, it was found that states with women governors had fewer COVID-19 deaths compared to states with men governors.51 Similarly, in a study of differences in the policy responses of male and female leaders from 194 countries, researchers found that COVID-19 outcomes, especially deaths, are better in countries led by women, who reacted more quickly and decisively in the face of potential fatalities.52

Nevertheless, the present study finds that the proportion of seats held by women in national parliaments is not statistically significant, thus generally aligning with other empirical studies that have found no statistical support for differences in reported fatalities between women-led and men-led countries.53 It might be the case that, rather than leadership gender, what matters most for COVID-19-related outcomes are the specific features and characteristics of the countries that leaders oversee and administer.


The COVID-19 pandemic has been one of the greatest global crises in modern human history. Interestingly, in SSA, COVID-19-related outcomes have been generally better than were initially feared or expected. At the same time, there has been great heterogeneity in COVID-19 outcomes between countries in the region, with some reporting particularly high burdens compared to others. Using a novel dataset comprising an array of socio-demographic, political, economic, and health-related variables gathered from publicly available sources, the present study investigates country-specific factors that may help to explain differences in COVID-19 outcomes across 48 countries in SSA. Findings from statistical analyses demonstrate that COVID-19 deaths per million is positively associated with inequality and median age, and negatively associated with population density. On the other hand, a number of other variables, including GNI per capita, global connectivity, DTP immunization coverage, the proportion of seats in parliament held by women, and political system or regime type, are not statistically significant.

A number of important limitations in the present study can be noted. For one, the findings are dependent on primary data sources, some of which may be of uncertain quality. In particular, it is possible that COVID-19 data for some countries in SSA are not wholly accurate, since infections and deaths may be underreported due to limited testing capacity and challenges in the attribution of the cause of death (e.g., weak or nonexistent national death registration systems). Thus, findings and conclusions, while informative and enlightening, must also be considered with care and caution.

Second, despite many countries around the world making considerable progress in the rollout of rapidly developed vaccines, SSA remains considerably far behind. The pandemic continues to evolve in the region, new and highly transmissible variants are circulating, and future dynamics remain unclear. As a result, some of the findings and conclusions presented here may change. Additionally, while the present study examined many country-specific variables, there are several others that may be included in future analyses, including the role of civil society, experience with past epidemics or crises, influence of media or information, and temperature or climate. Last, the present study is cross-sectional, so findings may differ for other timeframes or when time-series or longitudinal data are used.

Notwithstanding these limitations, however, the present study contributes to our general understanding of important factors associated with COVID-19 burden and helps to expand awareness of the regional situation in SSA, which has received comparatively limited coverage and a paucity of research attention. Moreover, rather than representing a final, definitive judgment or account of the situation in the region, the findings in the present study offer useful insights and indicate possible avenues for further debate, research, and investigation.


No protocol approval was required because no research involving human participants was conducted.


There are no sources of funding to declare.

Competing Interests

The authors completed the Unified Competing Interest form at (available upon request from the corresponding author) and declare no conflicts of interest.

Correspondence to:

Dr. Fikresus Amahazion

College of Arts and Social Sciences

Adi Keih, Eritrea
