Exploring national COVID-19 variability across sub-Saharan Africa


Background
In early March 2020, coronavirus disease (COVID-19), an infectious disease caused by a novel coronavirus, was declared a pandemic by the World Health Organization. Since its emergence and global spread, the pandemic has been one of the greatest global crises in modern human history. Notably, in Sub-Saharan Africa (SSA), COVID-19-related burden and outcomes have been generally lower than in many other parts of the world and substantially better than were initially feared. At the same time, there has been great heterogeneity in COVID-19 burden and outcomes between countries in the region, with some reporting particularly high incidence and death figures compared to others. What accounts for the significant cross-country variability apparent in SSA and why have some countries performed better than others? The present study investigates country-specific factors that may help to explain differences in COVID-19 outcomes across 48 countries in SSA.

Methods
A novel cross-sectional dataset, comprising a wide array of socio-demographic, political, economic, and health-related variables, is constructed from publicly available sources. Descriptive statistics, correlation analyses, and multiple regression analyses are performed to reveal important country-level factors associated with COVID-19 deaths in SSA.

Results
Findings from statistical analyses show that in SSA COVID-19 deaths per million are positively associated with income inequality and median age, and negatively associated with population density. In contrast, a number of other variables, including gross national income (GNI) per capita, global connectivity, diphtheria, tetanus and pertussis (DTP) immunization coverage, the proportion of seats in parliament held by women, and political system or regime type, are not statistically significantly associated with COVID-19 deaths.

Conclusions
Although findings from recent studies conducted in various settings around the world indicate that a range of socio-economic, demographic, political, and health-related factors may be linked with COVID-19 burden, the present investigation finds that COVID-19 deaths in SSA are associated with population density, median age, and income inequality.

Coronavirus disease (COVID-19) is an infectious disease caused by a novel coronavirus. In early March 2020, COVID-19 was declared a pandemic by the World Health Organization.1 Since its emergence and global spread, the pandemic has been one of the greatest crises in decades, with over one hundred million confirmed cases, several million deaths, and an array of socio-economic, political, and other impacts. Although COVID-19 has been a truly global problem, affecting all regions, the COVID-19 burden in Sub-Saharan Africa (SSA) has been substantially lower than was predicted. Countries in SSA reported first confirmed cases and experienced a surge of the virus much later than other parts of the world, while also consistently reporting a comparatively low number of cases and deaths.2,3 Additionally, the case-fatality ratio for many countries in SSA has generally been lower than in many other parts of the world, suggesting that disease outcomes have been less severe among populations in the region.4,5 At the same time, while SSA has had a generally lower overall COVID-19 burden than many other parts of the world, there has been great heterogeneity in burden and outcomes between individual countries within the region. For instance, some countries, such as South Africa and Ethiopia, have reported considerably higher prevalence and mortality compared to others. What accounts for the significant cross-country variability apparent in SSA and why have some countries performed better than others? The present study examines factors potentially associated with COVID-19 burden in the region. Through a multiple regression analysis of a unique cross-sectional dataset comprising 48 countries from SSA, important country-level factors associated with COVID-19 deaths are investigated.
The present study is significant for several reasons. It strengthens and deepens understanding of prominent factors associated with COVID-19 burden, thus complementing the existing knowledge base and potentially informing policies or approaches to future pandemics. As well, whereas many previous studies have focused on biological and medical factors related to COVID-19, this study extends consideration to socio-economic, political, and other country-level factors, which can also help explain differences in COVID-19 outcomes across countries. Moreover, it investigates a considerable number of countries and factors potentially associated with COVID-19, therefore offering a broader and more comprehensive investigation than many previous studies, which have been limited to examining single or only a few factors or countries. Last, the present study expands awareness and offers useful insights about COVID-19 in SSA, a massive region comprising 48 countries, which has received comparatively limited coverage and research attention.
The paper is structured as follows. The methods are presented in the next section, featuring an outline of the design and description of the data. Subsequently, the results are reported, followed by the discussion. The final section presents conclusions and notes possible limitations.

DATA
The present study is a cross-sectional analysis of 48 countries from SSA, as classified by the World Bank's regional designations.6 A unique dataset is constructed using data extracted from various public open access databases. The variables selected for analysis cover socio-demographic, political, economic, and health-related dimensions of countries and are included on the basis of empirical findings reported in recent studies, the growing and evolving literature, relevant theoretical frameworks, and general data availability.7,8

VARIABLES
COVID-19 burden: consistent with a large number of empirical studies, cumulative total deaths per million of the population is the outcome variable.9 The data are from the Our World in Data open access database and accurate up to 7 May 2021.10
Gross national income per capita: this variable presents a country's gross national income (GNI), divided by its total population. Data are presented in current US dollars and taken from the World Bank's World Development Indicators (WDI) open access database, which provides comprehensive cross-country comparable data on development.11
Global connectivity: the KOF Globalisation Index (KOFGI) is used. Higher KOFGI values reflect greater degrees of globalization.12
Health system: data for traditional measures of national health system capacity (e.g., physicians per 1,000 population, intensive care unit capacity, or hospital beds per 1,000 population) are unavailable for many countries. Accordingly, national diphtheria, tetanus, and pertussis (DTP) immunization coverage is utilized. DTP coverage is regarded as a useful, standard measure of the strength and efficiency of national health systems, mainly because delivery requires three contacts with the health system at appropriate times and also because coverage is usually part of routine national vaccination programs rather than campaigns.13 Defined as the percentage of children ages 12-23 months who received vaccinations before 12 months or at any time before the survey, this variable is available from the WDI database.14
Population density: drawn from the WDI dataset, population density is measured as the total population of a country divided by its total land area in square kilometers.15
Median age of population: this variable, gathered from the World Population Prospects public database, presents the median age of the population of a country, which divides the population in two parts of equal size, so that there are as many persons with ages above the median as there are with ages below the median. It is expressed in years.16
Polity: the data for this variable are collected from the widely used Polity5 Project database, which codes the political regime and authority characteristics of states in the world system for purposes of comparative, quantitative analysis. For each country, a "Polity Score" is determined, ranging from +10 (strongly democratic) to -10 (strongly autocratic). Scores are based on an evaluation of democratic and autocratic characteristics and elements of regimes, including elections, the nature of political participation, and the extent of checks on executive authority.17
Women's leadership and parliamentary representation: this variable is the number of seats held by women members in single or lower chambers of national parliaments, expressed as a percentage of all occupied seats. Retrieved from the WDI dataset, it is derived by dividing the total number of seats occupied by women by the total number of seats in parliament.18
Inequality: in accord with numerous other empirical studies, the Gini coefficient is used as a measure of income inequality.19 The Gini coefficient measures the extent to which the distribution of income among individuals or households within an economy deviates from a perfectly equal distribution. It runs from 0 to 100, with 0 reflecting complete equality and 100 complete inequality. It is available from the WDI.20
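To make the 0-to-100 scale of the Gini coefficient concrete, the following minimal Python sketch computes it from a list of individual incomes using the mean-absolute-difference formula. It is an illustration only: the function name `gini_index` and the example incomes are hypothetical, and the WDI figures used in this study are estimated from household survey data rather than computed this way.

```python
def gini_index(incomes):
    """Return the Gini index (0-100 scale) for a list of non-negative incomes.

    Hypothetical helper for illustration; not the WDI estimation procedure.
    """
    n = len(incomes)
    total = sum(incomes)
    if n == 0 or total <= 0:
        raise ValueError("need at least one income and a positive total")
    mean_income = total / n
    # Sum of absolute differences over all ordered pairs of individuals.
    abs_diff_sum = sum(abs(a - b) for a in incomes for b in incomes)
    # Relative mean absolute difference, halved and rescaled to 0-100.
    return 100.0 * abs_diff_sum / (2.0 * n * n * mean_income)


# Hypothetical incomes, chosen only to show the two ends of the scale.
equal_society = gini_index([5, 5, 5, 5])
unequal_society = gini_index([0, 0, 0, 10])
```

For four equal incomes the function returns 0. When one of four people holds all income it returns 75 rather than 100, because in a finite sample the maximum of this formula is 100 × (n − 1)/n.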

ANALYSIS
For preliminary exploration of the relationships between variables, summary statistics and correlations are examined. Correlation analysis explores the association between variables, allowing for better understanding of the magnitude and direction of relationships. Correlation coefficients are measured on a standard scale that ranges between -1 and +1, with positive and negative values indicating the direction of the relationship. Coefficients that are closer to -1 or +1 reflect stronger correlations, with coefficients of -1 or +1 representing perfect correlation. Coefficients that are closer to 0 reflect a very weak or small association.21
As well, regression analysis, one of the most widely used statistical techniques in scenarios where multiple variables affect a single outcome, is conducted. Regression analysis examines the relationship between two or more factors at the same time and analyzes the extent to which each predicts or explains variations in the outcome of interest while the others are controlled.22 Notably, regression analysis has been used in a considerable amount of work conducted on COVID-19.23 All analyses were performed utilizing the Statistical Package for the Social Sciences (23rd ed.) statistical software program.
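As a rough illustration of the correlation scale described above, the sketch below computes Pearson's r in plain Python. The function name `pearson_r` and the small input lists are hypothetical and are not drawn from the study's dataset; the study itself used SPSS.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of the two variables around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: a perfectly increasing and a perfectly decreasing pair.
r_positive = pearson_r([1, 2, 3, 4], [10, 20, 30, 40])
r_negative = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
```

The two hypothetical pairs sit at the extremes of the scale (+1 and -1); real country-level variables would be expected to fall somewhere in between.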

SUMMARY STATISTICS
The summary statistics are shown in Table 1. The mean cumulative COVID-19 deaths per million is 93.39, mean GNI per capita is 2443.56, and mean global connectivity value is 50.15. For inequality, the mean is 43.81, while the means for DTP coverage and polity score are 80.31 and 2.81, respectively. The mean median age is 20.05, while population density has a mean of about 108.84. Last, the mean percentage of parliamentary seats held by women is 21.70.

CORRELATION COEFFICIENTS
Pearson's correlation coefficients, which help to reveal possible associations between variables and COVID-19 burden, are reported in Table 2. The results illustrate that there is a moderate, statistically significant positive correlation between COVID-19 deaths per million and GNI per capita (r = .398, P = .007), global connectivity (r = .378, P = .009), inequality (r = .531, P < .001), and median age (r = .471, P = .001). This indicates that higher GNI per capita and greater inequality are associated with more COVID-19 deaths per million. Furthermore, as the median age and globalization of a country increase, so do cumulative COVID-19 deaths per million.

REGRESSION ANALYSIS
In Table 3, the results of the regression analyses are presented. The different variables are included in several stages, according to socio-demographic, political, economic, and health-related dimensions. Panel III builds on the previous panels by adding DTP coverage and global connectivity. About 51.2% of the variability in COVID-19 deaths per million can be accounted for with Panel III (R²=0.512), and the model is statistically significant, F(6, 37)=6.466, P=0.001. Once again, the estimates show that median age (B=26.572, P=0.020) and inequality (B=10.048, P=0.001) are positively associated with COVID-19 deaths, while population density (B=-0.303, P=0.085) remains negatively associated at the 10% level. However, the various other variables included do not make a statistically significant contribution.
In Panel IV all of the variables are included. Panel IV explains approximately 54.2% of the variation in COVID-19 deaths per million (R²=0.542) and the model is statistically significant, F(8, 34)=5.033, P<0.001. As with the other regressions, median age (B=31.991, P=0.011) and inequality (B=9.442, P=0.002) demonstrate a positive association with deaths per million, while population density also contributes significantly, being negatively associated with deaths per million at the 5% level (B=-0.384, P=0.049). Thus, as median age and inequality increase, so do COVID-19 deaths per million. On the other hand, as population density increases, cumulative COVID-19 deaths per million decrease. The various other variables included do not make a statistically significant contribution.
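The reported F statistics can be cross-checked against the reported R² values using the standard identity for an ordinary least squares model with an intercept, F = (R²/k) / ((1 − R²)/df), where k is the number of predictors and df is the residual degrees of freedom. The sketch below applies it to Panels III and IV. The function name is hypothetical, and small discrepancies are expected because the published R² values are rounded to three decimals.

```python
def f_from_r_squared(r_squared, n_predictors, residual_df):
    """Overall F statistic implied by R-squared for an OLS model with intercept."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("R-squared must lie in [0, 1)")
    explained_per_predictor = r_squared / n_predictors
    unexplained_per_df = (1.0 - r_squared) / residual_df
    return explained_per_predictor / unexplained_per_df


# Values as reported in the text: (R-squared, predictors, residual df).
panel_iii_f = f_from_r_squared(0.512, 6, 37)
panel_iv_f = f_from_r_squared(0.542, 8, 34)

print(round(panel_iii_f, 2), round(panel_iv_f, 2))
```

Both values come out at roughly 6.47 and 5.03, within rounding distance of the reported 6.466 and 5.033. Under the same assumption of a model with an intercept, the degrees of freedom would imply 44 and 43 countries with complete data in Panels III and IV, respectively; this is an inference from the reported figures, not something the text states.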

DISCUSSION
The findings in the present study are noteworthy and provide several key insights. For instance, past scholarship has demonstrated an association between inequality and various health outcomes,24 while recent studies from various settings show an association between inequality and COVID-19 burden.25,26 Consistent with this body of work, the present study finds a clear positive association between inequality and COVID-19 deaths.
There are several possible mechanisms that may help explain this finding. Societies with greater levels of inequality may have less social cohesion or lower public trust levels,27 which can undermine compliance with government and public health guidelines,28 and inequality is additionally associated with an array of health conditions and lifestyle factors, such as diabetes, obesity, and lack of exercise, which can leave individuals or communities vulnerable to infections, severe cases, and deaths.29,30 Inequality may also increase risk of infection, as the most disadvantaged individuals cannot work from home and must remain in high-risk employment,31 while inequality and exclusion are associated with differences in health literacy and information seeking behavior, which can impact health outcomes.32
Also, the findings related to population density are notable. In particular, previous work has illustrated that population density can be a major factor in the transmission of infectious diseases and that epidemics or outbreaks may be more frequent or severe when population density is high.33,34 More recently, a substantial body of work has found an association between population density and COVID-19.35 In stark contrast, this study finds that population density is negatively associated with COVID-19 deaths. While potentially somewhat surprising, this is in accord with recent work indicating that higher population density may be linked with fewer COVID-19 cases or deaths. Possible explanations include that areas of greater density generally have better and more concentrated healthcare infrastructure and facilities, more quality staff and specialized services, faster response times, and more intensive-care beds and other health resources, all of which can help reduce infections or deaths related to COVID-19.36
From early on in the COVID-19 pandemic, researchers have widely documented that older populations, who are more likely to have comorbidities, declining immune systems, and preexisting medical conditions, have more vulnerability to infection, dramatically greater risk of developing severe and more serious complications, and strikingly higher mortality rates compared to younger groups.37,38 Furthermore, studies have found a strong association between median age and COVID-19 burden.39 The present study confirms this existing body of empirical work, finding a clear association between median age and COVID-19 deaths.
Global connectivity and levels of international air traffic or travel facilitated the spread of disease during past crises, and some research suggests that they may have also played a role in the early importation of COVID-19.40,41 Still, the present study finds that global connectivity, as measured by a standard, extensively used index of globalization, the KOFGI, is not associated with COVID-19 deaths.
Generally, national wealth, income, or economic performance, and healthcare systems or capacity have consistently been shown to be major determinants of health and development outcomes.42 However, the fact that GNI per capita and DTP coverage do not demonstrate statistical significance in the present study parallels work suggesting that countries' level of wealth, development, and income is not the main criterion or sole determinant of health outcomes or COVID-19 burden. It is quite notable that as the COVID-19 pandemic has unfolded around the world, many of the wealthiest, highly developed countries, characterized by well-equipped healthcare systems and considerable, high-quality resources, have been ill-prepared and among those most severely hit. On the other hand, some less developed, low- and middle-income countries have reported much lower COVID-19 incidence and mortality rates. Despite facing genuine obstacles and various constraints, the latter may be able to plan and execute timely, innovative, and efficient responses, coordinate efforts of various sectors and mobilize civil society, implement and enforce effective policies or guidelines, and maximize limited assets or resources.43
A broad literature exists exploring the relationship, vulnerability, and response of different political systems or regime types to disasters, emergencies, and crises.44 Since the onset of the COVID-19 pandemic, as well, the questions of whether and how different political systems may impact national COVID-19 burden have been prominent and widely debated. While the scholarship continues to grow and develop, to date, there is no strong or unambiguous consensus; there are conflicting arguments and findings, and both forms of government seem to have specific strengths and weaknesses.45 This study does not find an association between political system and COVID-19 deaths. The pandemic has not distinguished between democracies and autocracies in SSA. Moreover, there has been wide variance in responses within democratic and autocratic regimes in the region, and strictly focusing on the type of political system risks neglecting other potentially significant factors. Rather than the particular type of political regime, findings suggest that several country-specific variables, such as income inequality, population density, and age, have more bearing on COVID-19-related outcomes.
Another frequent topic of commentary, debate, and comparison during the COVID-19 pandemic has been whether countries led by women have fared better than those led by men.46 Plausibly, gender may be significant in different ways. For example, some work has indicated that women tend to be more loss- and risk-averse than men,47 therefore possibly making them less willing to accept health risks and more likely to introduce restrictive measures earlier. Additionally, having more women in the legislature has also been associated with an increase in public health spending,48 while a lack of diverse representation in leadership can exclude those who offer unique perspectives or expertise.49 Johnson and Williams50 also suggest that women leaders may be able to draw on their traditional motherly role, for example as the member of the household who traditionally cares for the sick, to display forms of feminine protectionism.
Notably, in a study of COVID-19 deaths in the United States, it was found that states with women governors had fewer COVID-19 deaths compared to states with men governors.51 Similarly, in a study of differences in the policy responses of male and female leaders from 194 countries, researchers found that COVID-19 outcomes, especially deaths, are better in countries led by women, who reacted more quickly and decisively in the face of potential fatalities.52 Nevertheless, the present study finds that the proportion of seats held by women in national parliaments is not statistically significant, thus generally aligning with other empirical studies that have found no statistical support for differences in reported fatalities between women-led and men-led countries.53 It might be the case that rather than leadership gender, what matters most for COVID-19-related outcomes are the specific features and characteristics of the countries that leaders oversee and administer.

CONCLUSION
The COVID-19 pandemic has been one of the greatest global crises in modern human history. Interestingly, in SSA, COVID-19-related outcomes have been generally better than were initially feared or expected. At the same time, there has been great heterogeneity in COVID-19 outcomes between countries in the region, with some reporting particularly high burdens compared to others. Using a novel dataset comprising an array of socio-demographic, political, economic, and health-related variables gathered from publicly available sources, the present study investigates country-specific factors that may help to explain differences in COVID-19 outcomes across 48 countries in SSA. Findings from statistical analyses demonstrate that COVID-19 deaths per million are positively associated with inequality and median age, and negatively associated with population density. On the other hand, a number of other variables, including GNI per capita, global connectivity, DTP immunization coverage, the proportion of seats in parliament held by women, and political system or regime type, are not statistically significant.
A number of important limitations in the present study can be noted. For one, the findings are dependent on primary data sources, some of which may be of uncertain quality. In particular, it is possible that COVID-19 data for some countries in SSA are not wholly accurate, since infections and deaths may be underreported due to limited testing capacity and challenges in the attribution of the cause of death (e.g., weak or nonexistent national death registration systems). Thus, findings and conclusions, while informative, must also be considered with care and caution.
Second, despite many countries around the world making considerable progress in the rollout of rapidly developed vaccines, SSA remains considerably far behind. The pandemic continues to evolve in the region, new and highly transmissible variants are circulating, and future dynamics remain unclear. As a result, some of the findings and conclusions presented here may change. Additionally, while the present study examined many country-specific variables, there are still several others that may be included in future analyses, including the role of civil society, experience with past epidemics or crises, influence of media or information, and temperature or climate. Last, the present study is cross-sectional; thus, findings may be different for other timeframes or when utilizing time-series or longitudinal data.
Notwithstanding these limitations, the present study contributes to our general understanding of important factors associated with COVID-19 burden and helps to expand awareness of the regional situation in SSA, which has received comparatively limited coverage and research attention. Moreover, rather than representing a final, definitive judgment or account of the situation in the region, the findings in the present study offer useful insights and indicate possible avenues for further debate, research, and investigation.

Submitted: May 23, 2021 GMT, Accepted: June 05, 2021 GMT
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CCBY-4.0). View this license's legal deed at http://creativecommons.org/licenses/by/4.0 and legal code at http://creativecommons.org/licenses/by/4.0/legalcode for more information.
Exploring national COVID-19 variability across sub-Saharan Africa. Journal of Global Health Reports.

Table 1. Summary statistics of variables for sub-Saharan African countries

The variables are entered into the regression in stages in order to better examine their possible association with COVID-19 deaths per million. Panel I explores population density and median age. Results reveal that population density and median age together explain 29.4% of the variance (R²=0.294) and that Panel I is a significant predictor of deaths per million, F(2, 45)=9.371, P<0.001. Both population density (B=-0.357, P=0.038) and median age (B=23.444, P<0.001) contribute significantly. In Panel II, population density and median age are retained, while GNI per capita and the measure of inequality are introduced. Panel II accounts for approximately 50.4% of the variability in deaths per million (R²=0.504) and the model is statistically significant, F(4, 39)=9.896, P<0.001. The estimates show that median age (B=29.4, P<0.006) and inequality (B=10.173, P<0.001) are positively associated with deaths per million. That is, as median age increases by one year, cumulative COVID-19 deaths per million increase by 29.4, while as inequality increases by one Gini point, COVID-19 deaths per million increase by 10.173, in each case holding the other variables in the panel constant. However, population density is negatively associated with deaths per million (B=-0.285, P=0.078) at the 10% level, while GNI per capita fails to demonstrate a statistically significant relationship with COVID-19 deaths per million.
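The per-unit reading of the unstandardized coefficients (B) can be illustrated with a small ordinary least squares fit. The sketch below is a minimal plain-Python version that solves the normal equations directly; it is not the SPSS procedure used in the study, the function name `ols_fit` is hypothetical, and the data are invented and noise-free so that the known coefficients are recovered exactly.

```python
def ols_fit(rows, y):
    """Fit y = b0 + b1*x1 + ... by solving the normal equations.

    Returns (coefficients with intercept first, R-squared).
    """
    design = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(design[0])
    # Augmented matrix [X'X | X'y].
    aug = []
    for i in range(p):
        lhs = [sum(x[i] * x[j] for x in design) for j in range(p)]
        rhs = sum(x[i] * yi for x, yi in zip(design, y))
        aug.append(lhs + [rhs])
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique solution")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    coefs = [aug[i][p] / aug[i][i] for i in range(p)]
    fitted = [sum(c * v for c, v in zip(coefs, x)) for x in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return coefs, 1.0 - ss_res / ss_tot


# Invented (median age, Gini) pairs; NOT values from the study's dataset.
rows = [[18, 35], [19, 50], [20, 40], [22, 45], [25, 60], [21, 38]]
# Outcome built from known coefficients: +20 per year of age, +5 per Gini point.
y = [-500 + 20 * age + 5 * gini for age, gini in rows]

coefs, r_squared = ols_fit(rows, y)
```

Here `coefs[1]` is the expected change in the outcome for a one-year increase in median age with the Gini value held fixed, which is the same reading applied to the B values reported above. With real, noisy data the fit would not be exact and R² would fall below 1.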

Table 2. Pearson correlation coefficients
P-values are in parentheses. *** = Correlation is significant at the 0.01 level. ** = Correlation is significant at the 0.05 level. * = Correlation is significant at the 0.10 level.

Table 3. Regression analysis of factors associated with COVID-19 deaths in Africa
Dependent variable is COVID-19 deaths per million. P-values are in parentheses. *** = significant at the 0.01 level. ** = significant at the 0.05 level. * = significant at the 0.10 level.