This paper uses data from the Demographic and Health Surveys (DHS) program in 29 countries to examine how the growing youth and young adult population (aged 15-24 years) differs from older adults in terms of prevalence of human immunodeficiency virus (HIV), knowledge and attitudes, risk taking, and economic vulnerability to the epidemic. The countries in this study are primarily African countries, along with the Dominican Republic and Haiti, all of which were supported by the US President’s Emergency Plan for AIDS Relief (PEPFAR). All have conducted a DHS version VI or VII survey since 2010.

The youth and young adult population segment is worrisome. In 2018 there were approximately 1.7 million new HIV infections globally among adults, with roughly 37% of those new infections in the 15-24 age group.1–3 Among this group of youth and young adults, women are disproportionately being infected with HIV; in 2016, women represented 59% of new infections.4 Among sub-Saharan African (SSA) adolescents (15-19 years)5,6 four of five new infections were in females, and prevalence was six times greater for sexually active women than for men. Recent studies have identified the role of underlying risk behaviors between young women and men such as earlier sexual debut, early marriage, disability, less condom use, transactional sex, number of sexual partners, and age gaps between partners.2,7,8

There is some literature on economic status and its influence on HIV risk in young adults. A South African study found that higher prevalence in young females than in young males is associated with lower education levels and lower incomes. Fox (2012), in an SSA study of economic factors,9 reported that wealth (or poverty) is a far less important risk factor than is living in an area that has high-wealth inequality. Paradoxically, in high-wealth areas those with the lowest incomes were more likely to be infected, and in low-wealth areas the richest persons were most at risk. Extending Fox’s (2012) work,9 a recent paper by Gaumer et al. (2021),10 which adjusts for sample bias in the DHS observational sample, reports that wealth inequality is a consistent risk factor in low- and middle-income countries (LMICs) across levels of wealth, though more pronounced among the poor, and particularly so for young adults (age 15-24).

In response to such evidence that young, poor women are a high-risk group, in 2014 PEPFAR introduced the DREAMS program11 in selected locations within 10 countries, a unique, multi-sector approach aimed at reducing the HIV risk of adolescent girls and young women. Preliminary research, though largely anecdotal, was encouraging about the counseling services.12 More recent research on the DREAMS program in Uganda by Nakawooya (2020)13 demonstrates risk reduction behaviors and delay in sexual debut of teenage girls. However, the randomized impact evaluation of the DREAMS program is not yet complete.14


We obtained individual-level data from Demographic and Health Surveys (DHS) of 29 countries between 2010 and 2016 (DHS phases VI and VII) through the DHS data portal.15,16 Our study countries have varying rates of HIV prevalence, with the highest prevalence in Lesotho (with over 1 in 3 of the older adults being HIV positive, and nearly 10% in the age <25 young adults). Zimbabwe, Zambia, Mozambique, and Namibia are also very high-prevalence countries. The sample also includes other countries where adult prevalence is less than 1 in 50 (Burkina Faso, Dominican Republic, Ethiopia, Niger, Senegal).

Since the HIV epidemic began, the DHS survey of demographic and self-reported behavior has also included a biomarker component, wherein blood samples were drawn from consenting respondents to test for HIV. The resulting DHS public use data files link the survey responses with the HIV test results. A total of 403,293 individual adults (defined by DHS as persons 15 years or older), including 157,935 youth and young adults 15-24 years of age, were pooled (using blood sample weights) and studied across the 29 countries, which comprise 27 sub-Saharan countries plus the Dominican Republic and Haiti. We used 15-24 as the age group of interest because the DHS data include adults 15 years and older, and this age range adheres to the WHO and UN definitions of youth.17,18

The Household Wealth Index was taken from DHS, which enumerated assets in the household (type of plumbing, type of transport vehicles, appliances, water source, etc.). Weights for various household assets are estimated by DHS within each country and an index value for each household is based on those weights. The level of wealth for each household is assigned to a tertile: high-, medium-, or low-wealth levels within each country. Using the household wealth index, inequality in the distribution of wealth is measured as a Gini coefficient. Each respondent was assigned to a wealth inequality tertile reflecting the high, medium, or low inequality of wealth across the sub-country regions in our sample of 29 countries. The low threshold values for tertiles 1 and 2 are 0.005 and 0.361 respectively.
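As an illustration of the inequality measure described above, the following minimal sketch computes a Gini coefficient from household wealth scores within a region and then sorts regions into inequality tertiles. The function names, the requirement that scores be non-negative, and the toy data are assumptions made for illustration; the exact formula and software used in this study are not specified here.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = perfect equality).

    Assumption: wealth scores have been shifted to be non-negative.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Formula on sorted data: G = 2*sum(i*x_i) / (n*sum(x)) - (n + 1)/n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def tertile_labels(scores):
    """Label each key 'low', 'medium', or 'high' by the rank of its score."""
    ranked = sorted(scores, key=scores.get)
    n = len(ranked)
    names = ("low", "medium", "high")
    return {key: names[min(3 * pos // n, 2)] for pos, key in enumerate(ranked)}


# Hypothetical regions with made-up household wealth scores.
regions = {
    "region_a": [5, 5, 5, 5],
    "region_b": [1, 2, 3, 10],
    "region_c": [0, 0, 0, 20],
}
region_gini = {name: gini(scores) for name, scores in regions.items()}
inequality_tertile = tertile_labels(region_gini)
```

In this sketch each respondent would inherit the tertile label of the region in which the household is located.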

We used a collection of available questions from DHS surveys to report on the knowledge of HIV transmission and prevention methods, attitude towards people living with HIV/AIDS and indicators of risky sex behavior. Individual questions were aggregated into composite variables for knowledge, attitudes, and three variables for risky behaviors: condom use last intercourse, sexually transmitted disease (STD) in last year, and ever multiple concurrent partners. A companion paper by Sherafat-Kazemzadeh provides more details.19

Absent a cohort design, as is the case with the DHS prevalence data, we observe that wealthy people who are HIV-positive tend to live longer than HIV-positive persons who are poor. Indeed, wealthy people, to the extent that wealth is “protective”, may consequently be “oversampled” in DHS. To rebalance the DHS sample, this paper uses an adjustment technique of inverse probability weighting (IPW).20,21 This reweighting technique creates an “instrumental variable” for expected wealth level using a model to predict a person’s wealth level. The predicted values for each observation are used to create an inverse probability weight (e.g. w = 1/P). These weights are used to estimate prevalence rates and in the hypothesis testing models. More details of this procedure and the implications of reweighting on the relationship between wealth, wealth inequality, and HIV prevalence are found in a recent paper by Gaumer et al.10
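The reweighting step can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure used in the study: the predicted probabilities are taken as given (in practice they would come from a fitted model of wealth level), the probability floor is an added safeguard that is not described in the text, and all names and values are hypothetical.

```python
def ipw_weights(p_observed, floor=0.01):
    """Inverse probability weights, w = 1/P.

    p_observed: predicted probability of each respondent's own observed
    wealth level. The floor (an assumption) caps extreme weights.
    """
    return [1.0 / max(p, floor) for p in p_observed]


def weighted_prevalence(hiv_positive, weights):
    """Weighted share of respondents who tested HIV-positive."""
    total = sum(weights)
    return sum(w for y, w in zip(hiv_positive, weights) if y) / total


# Hypothetical respondents: predicted probability of their observed wealth
# level, and HIV test result (1 = positive, 0 = negative).
p_observed = [0.8, 0.8, 0.5, 0.2, 0.2]
hiv_positive = [0, 0, 1, 1, 0]

weights = ipw_weights(p_observed)
raw_prevalence = sum(hiv_positive) / len(hiv_positive)
adjusted_prevalence = weighted_prevalence(hiv_positive, weights)
```

Respondents whose wealth level was unlikely given their characteristics receive larger weights, so the adjusted estimate leans toward groups that are underrepresented in the observed sample.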

Regression models were fit on prevalence rates and risk factors (logistic regressions) and knowledge and attitudes (ordinary least squares, OLS). Odds ratios and regression coefficients are reported along with confidence intervals. We examined the issue of collinearity across the covariates in our multivariate analyses. As part of this work we computed Variance Inflation Factor (VIF) statistics,22 a statistic defined on least squares regression to measure the extent of collinearity for each covariate with respect to other covariates. The key variables remain significant, and there does not appear to be a significant collinearity issue here.
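For readers unfamiliar with the VIF, the sketch below shows the usual definition: each covariate is regressed on the remaining covariates and VIF = 1 / (1 - R²). It is a small standard-library illustration with hypothetical data, not the code used for this analysis, and it assumes the covariates are not perfectly collinear.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(y, columns):
    """R^2 from an OLS regression of y on the given columns plus an intercept."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * rj for bj, rj in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def vif(columns):
    """VIF of each column: 1 / (1 - R^2) from regressing it on the others."""
    out = []
    for j, col in enumerate(columns):
        others = columns[:j] + columns[j + 1:]
        out.append(1.0 / (1.0 - r_squared(col, others)))
    return out


# Hypothetical covariates that move closely together, so both VIFs are large.
example_vif = vif([[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0]])
```

A VIF near 1 indicates that a covariate is nearly uncorrelated with the others; larger values indicate that its coefficient is estimated less precisely because of collinearity.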


Table 1 describes some of the sexual risk factors for young men and young women in study countries. Background data on country prevalence rates and demographic patterns are available in the Online Supplementary Document, Tables S1 and S2. The pattern in Table 1 reveals that young women are sexually active at an earlier age than young men, and they engaged earlier in risky sexual activities, such as contracting STDs and having multiple partners. The age trajectory of infection for young women starts earlier and is steeper than for young men.

Table 1. Breakdown by age and sex of youth and young adults' sexual activity (percent)

| | Age 15-18, Male | Age 15-18, Female | Age 19-21, Male | Age 19-21, Female | Age 22-24, Male | Age 22-24, Female |
|---|---|---|---|---|---|---|
| Ever sexually active | 34.4 | 41.0 | 70.9 | 80.5 | 84.7 | 91.0 |
| HIV prevalence | 0.8 | 1.5 | 1.4 | 3.4 | 2.2 | 4.7 |
| Ever multiple partners | 17.3 | 20.1 | 22.5 | 29.1 | 29.4 | 30.1 |
| STD reported | 0.9 | 2 | 3.1 | 5 | 4.8 | 6.5 |
| Condom use | 49.2 | 24.9 | 50.3 | 18.6 | 40.9 | 15.4 |

Notes: HIV – human immunodeficiency virus; STD – sexually transmitted disease.

Logistic regression models were fit on prevalence rates (Table 2) and risk factors (Table 3); odds ratios and confidence intervals are reported here. Ordinary least squares models for knowledge and attitudes about HIV can be found in the Online Supplementary Document, Table S3. These models show the differences between younger and older adults for these variables in two ways. One modeling approach (model 1) pools the two age-based populations and includes a variable that estimates the difference between the younger and older adults (holding constant gender, urban/rural, literacy, marital status, wealth level, wealth inequality level, and country). The second approach estimates separate models for each population age segment (models 2 and 3). For the prevalence regressions we also show models for young males and young females separately, which utilize interaction terms for family wealth and wealth inequality (models 4 and 5).

Results in Table 2 show that, all else held constant, the likelihood of being HIV positive is about 11% lower for young adults than for adults age 25 or older. The younger age group is distinctive by having larger gender differentials in prevalence (females have higher prevalence), and literacy turns out to be a risk factor for older adults. Marriage is slightly more protective in the younger adults. Wealth is also more protective in the younger group and not significant in the older adults. Living in areas with high-wealth inequality is a risk factor for all adults, and a larger risk for the young adults, particularly for young adults who are poor (being in the lowest tertile of wealth). The added risk of being poor and living in a high-inequality area is actually shown to be higher for young men than young women. Table 2 also shows a steep age gradient for the youth and young adult population age 15-24. Other things the same, the change in prevalence (per 3-year age category) is steeper for the young adults and is more pronounced for women than for men. The added HIV risk of age for the over 25-year old populations is negligible, holding other things in the model constant.

Table 3 examines sexual risk behaviors. To avoid the potential bias caused by receiving a positive HIV test result (such as a change in knowledge), these models are fit on the data for the large HIV-negative population only. The ‘safe-sex’ practices being measured were from self-reported data on whether a condom was used in the last intercourse, whether in the past year the person was free from STDs, and whether the respondent ever had concurrent sexual partners. The covariates are the same as the models in Table 2.

Consistently, these models show that being under age 25 is associated with less risky behaviors. Young adults 15-24 are 54% more likely than older persons to use condoms, 14% less likely to have had an STD last year, and 43% less likely to have had a concurrent partner. Two of the three risk variables are measured over long recall periods: STD in the past year, and concurrent partner status measured over a lifetime. These long recall periods might be expected to show more risky behaviors for older adults, who generally have a longer exposure period. But for the “condom use” behavior, where the recall period is the “last intercourse”, the risk behavior is better in the younger age segment. And among the young age segment 15-24, the females have riskier behavior than the young men on two of the three risky behaviors (52% less likely to have used condoms, and 52% more likely to have had an STD last year).
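The percentages quoted above are read from the odds ratios in Table 3. A minimal sketch of that arithmetic follows; strictly, an odds ratio describes a change in odds rather than in probability, and the baseline probability used in the second helper is a hypothetical value chosen only to show how the two differ.

```python
def percent_change_in_odds(odds_ratio):
    """Percent change in odds implied by an odds ratio (OR 1.5 -> +50%)."""
    return (odds_ratio - 1.0) * 100.0


def probability_after(odds_ratio, baseline_probability):
    """Probability implied by applying an odds ratio to a baseline probability."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Odds ratios taken from Table 3 (condom use at last intercourse).
or_young_vs_old = 1.542      # age 15-24 relative to age 25+
or_female_young = 0.476      # females relative to males, age 15-24

change_young = percent_change_in_odds(or_young_vs_old)   # about +54%
change_female = percent_change_in_odds(or_female_young)  # about -52%

# Hypothetical baseline: 20% condom use in the reference group.
implied_probability = probability_after(or_young_vs_old, 0.20)
```

With a 20% baseline, 54% higher odds correspond to a probability of roughly 28%, which is a smaller relative increase than 54%; the gap between the two readings widens as the baseline outcome becomes more common.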

In these risk behavior models the demographic factors do not show a consistent pattern across the models. In Table 3 marriage and gender are the only consistent demographic factors; marriage and being female, other things the same, are consistently associated with more risky behaviors (less use of condoms, more STDs), though marriage is associated with fewer concurrent partners. Rural residents, other things the same, have less condom use than urban counterparts, but fewer STDs and fewer concurrent partners. Literate persons (relative to their counterparts who cannot read and write) are much more likely to use condoms and less likely to have had STDs, but literacy seems unrelated to concurrent partner frequency.

The patterns of risk behavior and the wealth and wealth inequality indicators are not consistent. The poor (low wealth) are 25-35% less likely to use condoms, and about 10% more likely to report STDs. Wealth inequality shows no consistent pattern for risky behaviors.

Table 2. Odds ratios and 95% confidence intervals for HIV prevalence by age group
Model 1 Model 2 Model 3 Model 4 Model 5
All adults Adults Adults Males Females
age 15-24 age 25+ age 15-24 age 15-24
HIV prevalence 5.22% 2.25% 7.17% 1.30% 2.96%
Age 15-24   0.0891***
Age 15-18 (ref) (ref) (ref)
Age 19-21 3.550*** 1.807 3.256***
[1.912,6.592] [0.994,3.283] [1.738,6.099]
Age 22-24 6.057*** 3.357* 5.837***
[3.359,10.92] [1.285,8.769] [3.147,10.83]
Age 25-27 (ref)
Age 28-30 0.923
Age 31-33 1.863
Age 34-43 0.945
Age 44+ 1.283
Female 1.400 2.737** 1.418
[0.856,2.290] [1.480,5.060] [0.884,2.276]
Literate 2.283** 1.377 2.503*** 2.515 1.411
[1.312,3.973] [0.795,2.383] [1.476,4.242] [0.786,8.047] [0.739,2.692]
Married 0.413*** 0.288*** 0.360*** 0.969 0.387***
[0.248,0.686] [0.147,0.563] [0.231,0.561] [0.453,2.071] [0.235,0.637]
Rural 1.435 0.640 1.605 1.603 0.487*
[0.603,3.414] [0.341,1.201] [0.789,3.265] [0.643,3.999] [0.272,0.871]
Wealth low (poor) (ref) (ref) (ref)
Wealth medium 0.682 0.457* 0.752
[0.403,1.154] [0.222,0.942] [0.442,1.282]
Wealth high 0.925 0.395** 1.086
[0.366,2.336] [0.216,0.724] [0.482,2.449]
Wealth Ineq low (ref) (ref) (ref)
Wealth Ineq medium 2.857* 2.010 2.822*
[1.148,7.114] [0.847,4.770] [1.153,6.910]
Wealth Ineq high 3.610** 5.912*** 2.885**
[1.584,8.225] [2.257,15.48] [1.318,6.316]
Poor+low ineq (ref) (ref)
Poor+high ineq 6.277** 6.619***
[1.846,21.35] [2.363,18.54]
Rich+low ineq 0.621 1.367
[0.101,3.828] [0.377,4.953]
Rich+high ineq 2.169 2.762
[0.678,6.932] [0.961,7.940]
Countries not shown
N 403,293 157,935 245,358 69,075 87,844

Notes: * P < 0.05, ** P < 0.01, *** P < 0.001; HIV – human immunodeficiency virus; Ineq – inequality; Ref – category omitted in the regression. Additional information on country coefficients is available from the corresponding author’s website. 95% confidence intervals are shown in brackets.
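As a reading aid for the bracketed intervals in Table 2, the sketch below recovers an approximate standard error and two-sided P value from a reported odds ratio and its 95% confidence interval. It assumes Wald-type intervals that are symmetric on the log-odds scale, which is an assumption for illustration; the interval method is not stated in the table notes. The function name is hypothetical.

```python
import math


def wald_from_ci(odds_ratio, lower, upper, z_crit=1.96):
    """Approximate SE of log(OR), z statistic, and two-sided P value.

    Assumes the interval is exp(log(OR) +/- z_crit * SE).
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return se, z, p_value


# "Poor+high ineq" rows of Table 2 (models 4 and 5).
se_m, z_m, p_males = wald_from_ci(6.277, 1.846, 21.35)
se_f, z_f, p_females = wald_from_ci(6.619, 2.363, 18.54)
```

Under this assumption the recovered P values fall in the same significance bands as the asterisks shown in the table for these two rows (P < 0.01 for young males, P < 0.001 for young females).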

Table 3. Odds ratios and 95% confidence intervals on sexual behavior for the HIV-negative population
Used condoms last intercourse No STDs in last year No concurrent partner, ever
Age All Adults Adults 15-24 Adults 25+ All Adults Adults 15-24 Adults 25+ All Adults Adults 15-24 Adults 25+
Dependent variable average 23.35% 48.56% 20.52% 83.48% 77.35% 85.85% 54.90% 24.99% 52.87%
Age 15-24 1.542*** 1.140*** 0.577***
[1.499,1.586] [1.109,1.171] [0.542,0.615]
Age 15-18 (ref) (ref) (ref)
Age 19-21 1.064** 0.475*** 1.265***
[1.016,1.114] [0.452,0.499] [1.112,1.440]
Age 22-24 0.991 0.419*** 1.653***
[0.944,1.041] [0.398,0.442] [1.451,1.883]
Age 25-27 (ref) (ref) (ref)
Age 28-30 0.919** 1.017 1.134*
[0.872,0.969] [0.972,1.065] [1.022,1.257]
Age 31-33 0.801*** 1.079** 1.426***
[0.756,0.849] [1.026,1.134] [1.279,1.589]
Age 34-43 0.638*** 1.227*** 1.742***
[0.609,0.669] [1.179,1.277] [1.592,1.906]
Age 44+ 0.408*** 1.725*** 2.768***
[0.385,0.432] [1.644,1.809] [2.513,3.048]
Female 0.478*** 0.476*** 0.420*** 0.392*** 0.482*** 0.385*** 0.772*** 1.086 0.751***
[0.466,0.490] [0.458,0.495] [0.405,0.435] [0.383,0.402] [0.461,0.504] [0.373,0.397] [0.716,0.832] [0.964,1.223] [0.680,0.829]
Literate 1.851*** 1.997*** 1.726*** 0.916*** 0.904*** 0.928*** 0.946 1.002 0.992
[1.783,1.922] [1.882,2.120] [1.644,1.812] [0.889,0.944] [0.857,0.953] [0.894,0.963] [0.885,1.012] [0.860,1.167] [0.918,1.071]
Married 0.113*** 0.135*** 0.117*** 0.672*** 0.595*** 0.924*** 2.763*** 1.617*** 2.587***
[0.110,0.116] [0.129,0.142] [0.113,0.122] [0.654,0.692] [0.568,0.623] [0.890,0.958] [2.603,2.934] [1.436,1.820] [2.393,2.796]
Rural 0.717*** 0.709*** 0.754*** 1.149*** 1.183*** 1.063** 1.207*** 1.004 1.221***
[0.693,0.742] [0.673,0.746] [0.720,0.790] [1.112,1.187] [1.120,1.249] [1.020,1.107] [1.129,1.290] [0.880,1.144] [1.127,1.322]
Wealth low (poor) (ref) (ref) (ref) (ref) (ref) (ref) (ref) (ref) (ref)
Wealth medium 1.258*** 1.356*** 1.205*** 0.932*** 0.928** 0.919*** 1.011 1.015 0.971
[1.216,1.302] [1.289,1.427] [1.150,1.262] [0.906,0.959] [0.883,0.976] [0.887,0.952] [0.952,1.074] [0.890,1.158] [0.906,1.042]
Wealth high 1.835*** 2.130*** 1.704*** 0.947** 0.963 0.923*** 1.047 1.033 0.991
[1.764,1.909] [2.007,2.261] [1.615,1.798] [0.914,0.982] [0.905,1.025] [0.883,0.966] [0.972,1.128] [0.885,1.205] [0.908,1.081]
Wealth Ineq low (ref) (ref) (ref) (ref) (ref) (ref) (ref) (ref) (ref)
Wealth Ineq medium 1.103*** 1.131*** 1.083** 0.974 1.014 0.943** 0.981 0.973 0.972
[1.061,1.145] [1.070,1.196] [1.027,1.142] [0.943,1.006] [0.959,1.071] [0.906,0.982] [0.913,1.055] [0.835,1.134] [0.894,1.058]
Wealth Ineq high 1.028 1.034 1.033 0.944** 0.957 0.915*** 0.996 0.965 0.992
[0.987,1.071] [0.973,1.100] [0.976,1.092] [0.909,0.980] [0.898,1.020] [0.873,0.959] [0.919,1.080] [0.822,1.134] [0.901,1.092]
Countries not shown
N 280,578 80,606 199,972 380,310 153,520 226,790 34,086 9,916 24,170

Notes: * P < 0.05, ** P < 0.01, *** P < 0.001; STD – sexually transmitted disease; Ineq – inequality; Ref – category omitted in the regression. Additional information on country coefficients is available from the corresponding author’s website. 95% confidence intervals are shown in brackets.

Online Supplementary Document, Table S3, uses OLS models to show the factors driving differences in knowledge about HIV (on a scale of eight items) and attitudes about having HIV (on a scale of four items) using only the HIV-negative population. For both dependent variables, higher values indicate more knowledge or better attitudes. Models show that there is no knowledge difference between the older and younger adults, but attitudes are worse among the younger group, other things the same. In the younger population (15-24), females have more knowledge and better attitudes. With respect to the covariates, the patterns show that urban residents, literate persons, and persons who are not poor tend to have more knowledge and better attitudes than their counterparts. Married persons have inconsistent patterns. Areas of high inequality in wealth tend, other things the same, to have better attitudes, but inconsistent patterns for knowledge.


The prevalence models tell an interesting story about the HIV prevalence differences between the younger and older adults in the study countries. From the prevalence models there are four major differences between the younger and older segments of the population: (i) a very steep age trajectory for prevalence in the young adults (and no systematic age trajectory for the over 25 population), (ii) being female has a much larger prevalence differential in the young group, (iii) poverty is a risk factor in only the younger group, and (iv) wealth inequality is risky for both age segments, but the risk is much larger for the younger adults. Additional information on HIV prevalence by age group, wealth and region, as well as risk profiles by sex and age group can be found in the Online Supplementary Document Figures S1 – S5.

Digging deeper into these patterns in our models for the young adult group, we see that the age trajectory of prevalence is much steeper for young women than young men – largely due to young women’s physical development leading to their earlier sexual debut (Table 1). We also show that being unmarried and living in urban areas is riskier for young females than young males, and that being poor and living in a place that has high-wealth inequality is a very large risk factor for the young age segment. What is it about being young, female, poor, and living in urban places that puts young women at more HIV risk than young men or older women? Unfortunately, our models do not provide a direct answer, but one possibility is proximity to older, wealthier men who engage sexually with young women. A second factor seems evident from our data, which suggest that younger women are being driven to engage in risky sexual practices (not using condoms, suffering STDs, and having multiple partners). This could potentially be related to transactional sex, which is common in age-disparate relationships in Eastern, Southern, Western and Central Africa, resulting in a cycle of transmission from older men to younger women.8,23,24 Younger women in these circumstances are also often unable to negotiate condom use.25–27

One of the striking and possibly unexpected findings shown in Table 3 is that the group of young adults (age 15-24) are less likely to engage in risky sexual practices than their older counterparts. Indeed, young adults, other things the same, are much more likely to use condoms, not as likely to report STDs, and report fewer concurrent partners than older adults. In spite of this study’s findings that young adults have poorer attitudes about HIV/AIDS, the issue with young adults appears not to be their relative sexual recklessness, but their sheer numbers. The large cohort of 15-24-year olds (about 40% of the population in these 29 countries) is an epidemic challenge. Within that population we also find here that the variance in prevalence rates across these young adults seems mainly due to three factors: (1) young women have higher prevalence than young men by about 45%, (2) young women, in spite of having more HIV knowledge and better attitudes than young men, have more risky sexual practices in terms of lack of condom use and more STDs, and (3) among both genders, prevalence is about 23% higher when the young person’s family lives in a high-wealth inequality place. These findings are consistent with other HIV research conducted in SSA, as well as other analyses of DHS data.28–31


This study has limitations. Self-reported measures of risk behavior on DHS surveys and the high frequency of missing values on some items are a problem, and often stem from systematic instrument differences across countries. The DHS survey also provides little information on sexual behavior and partners that would help understand risk behavior differences between young women and young men, such as if young people ever engage in transactional sex.


The high HIV prevalence among young and poor women living in places with high-wealth inequality suggests patterns of economically motivated risk behavior, and possibly transactional sex. While we cannot confirm this through analysis of DHS data, if true, it indicates that economic motivations for risky behaviors by young women may be a dominant concern. And it suggests that PEPFAR programs like DREAMS, which are targeted to locations and populations most at risk, may reduce the incidence of transactional sex among poor young women. The added risk for young, poor men is shown here to be even higher than for comparable young women in these same places. This suggests the need for continued targeting of both sexes separately and for concentrating on those high-wealth inequality areas.


Authors thank Tymon Słoczyński from the International Business School at Brandeis for assistance with the inverse probability weighting, and Clare L. Hurley of Brandeis University for editorial assistance.


This paper was produced with funding from Centers for Disease Control and Prevention (CDC), Division of Global HIV/AIDS and TB (DGHT) under Cooperative Agreement Number U2GGH001531. Its contents are solely the responsibility of Cardno and Brandeis University and do not necessarily represent the official views of CDC.

Authorship contributions

Author contributions to the manuscript were design of study and review (GG, AN), computing and weighting (RS, DH), literature review (RS, MJ, VB), and writing (GG, RS, MJ).

Competing interests

The authors have completed the Unified Competing Interest form and declare no conflicts of interest.

Correspondence to:

Gary Gaumer, PhD, Institute for Global Health and Development, The Heller School for Social Policy and Management, Brandeis University, 415 South Street, MS035, Waltham, MA 02453, USA, [email protected]