Prematurity is the leading cause of death in children younger than five years. The majority of these deaths occur in low- and middle-income countries (LMICs).1 Early identification of a newborn as premature can be lifesaving because it may trigger specialized care or referral to a higher-level facility. Across LMICs, there are regional, country-level, and local differences in access to antenatal and neonatal care. In sub-Saharan Africa (including Malawi), only 25% of women receive early antenatal care within the first 3 months of pregnancy, whereas in Southern Asia (including India), approximately 50% of women have early antenatal care coverage.2 Country-level differences also exist in the availability of physicians. India has a considerably larger physician workforce, with 9.28 physicians per 10 000 population, whereas Malawi has only 0.36 physicians per 10 000 population.3 Despite these differences, the burden of prematurity is significant in both India and Malawi. India has the greatest number of preterm births in the world, with more than 3.5 million infants born prematurely each year, and Malawi has the highest rate of preterm births, with 18.1 preterm births per 100 live births.4 The strategic objectives of the World Health Organization (WHO) for the care of small and sick newborns include the expansion of specialized care for small newborns and improved data and metrics for these infants.5

In many LMICs, an accurate estimate of gestational age may be lacking at the time of birth. The WHO recommends that pregnant women receive an ultrasound before 24 weeks’ gestation to accurately estimate gestational age.6 Ultrasound in the third trimester is a poor determinant of an estimated date of delivery because of variation in size for gestational age.7 However, early ultrasound is often not available because of late prenatal care or limited availability of ultrasound equipment and trained technicians.8,9 In the absence of ultrasound data, last menstrual period (LMP) is an accepted method of gestational age assignment, but its accuracy is hampered by low literacy rates, recall bias, and late prenatal care.10 Symphysis-fundal height is also used in certain settings to estimate the gestational age at the time a mother presents in labor. This method has been shown to predict gestational age to within ±5-6 weeks for 95% of women.11,12 This high degree of inaccuracy limits the practical use of symphysis-fundal height to determine gestational age. Prenatal gestational age determination is often inaccurate or lacking completely.

In the absence of accurate antenatal dating, several methods of assigning gestational age using variables collected during standardized physical examinations at birth have been developed. Most postnatal dating methods combine the infant’s physical appearance with components of a neuromuscular examination. Two of the most common are the Dubowitz and the Ballard examinations; many other methods have been proposed but not widely adopted.13 The Dubowitz examination combines 21 signs (11 physical and 10 neuromuscular) into a composite score that is inserted into a linear equation to produce an estimated gestational age between 26 and 44 weeks.14 A meta-analysis comparing 18 different methods for the determination of gestational age found the Dubowitz examination to be the most accurate, dating 95% of pregnancies within ±2.6 weeks of ultrasound dating.15 The Ballard examination is a simplified version of the Dubowitz examination that uses 12 physical and neuromuscular signs to assign gestational age.16 Meta-analysis of studies validating the Ballard method found that 95% of pregnancies were dated within ±3.8 weeks of ultrasound dating.15 A direct comparison of the two examinations found that the gestational age estimated by the Dubowitz examination correlates more closely with LMP dating than that estimated by the Ballard examination.17
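The conversion of a composite score into an estimated gestational age can be illustrated with a short sketch. The slope and intercept below are the commonly cited regression coefficients for the Dubowitz score; they are an assumption here, not values reported in this article, and should be checked against the original scoring reference before any use. The function name `dubowitz_weeks` is hypothetical.

```python
def dubowitz_weeks(total_score, slope=0.2642, intercept=24.595):
    """Convert a composite Dubowitz score to estimated gestational age (weeks).

    The default slope and intercept are assumed, commonly cited values;
    they are exposed as parameters so a verified equation can be substituted.
    """
    if total_score < 0:
        raise ValueError("total_score must be non-negative")
    return slope * total_score + intercept


# Illustrative composite scores only (not study data).
for score in (10, 40, 60):
    print(score, round(dubowitz_weeks(score), 1))
```

Because the relationship is linear, each additional score point shifts the estimate by the same fraction of a week, so a one-point disagreement on a single sign has only a small effect on the estimated gestational age.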

Despite the documented potential to improve the identification of premature infants, the Dubowitz examination is not widely used in LMIC settings. Several factors may account for this. First, the examination components are complex, and preparation of examiners requires trainers with sufficient expertise, particularly in the neuromuscular/neurological maneuvers. Trainers with this expertise may not be available in some settings.18 Second, the examination can be seen as time-consuming, which may inhibit adoption in busy health centers or hospitals with a shortage of workers and overwhelming workloads. Finally, certain signs require infant positioning that may appear uncomfortable to parents. While all of these factors could potentially impede routine use of the Dubowitz examination, there is little documentation of the feasibility and acceptability of training and administering the examination.

The objective of this cross-sectional study was to explore factors related to the training and perception of the Dubowitz examination that may influence its implementation in India and Malawi.


Study design

This cross-sectional study included (1) pretraining and posttraining surveys administered during a Dubowitz examination training workshop and (2) an analysis of trainer-trainee agreement for Dubowitz examination results. This exercise was conducted as part of the training for a larger prospective cohort study: the Low-Birthweight Infant Feeding Exploration (LIFE) study (ClinicalTrials.gov registry: NCT04002908; Clinical Trials Registry-India: CTRI/2019/02/017475). The secondary data analysis was exempt from ethical review by the Harvard T.H. Chan School of Public Health Institutional Review Board.

Training description

One of the investigators (KN) led training workshops to teach the Dubowitz examination to clinicians in India and Malawi as a component of the overall training for the LIFE study. The site in Malawi was a research center affiliated with a government university hospital in Lilongwe. The site in India was a research center affiliated with a university hospital in Belgaum, Karnataka State, although some trainees traveled from Odisha State. The training was conducted over two half-day sessions in each location. The first half-day session was classroom based: individual signs of the Dubowitz examination were explained with the use of pictures and manikins. The second session focused on hands-on skills: trainees practiced the Dubowitz components with live late-preterm or term neonates in small groups, both under the trainer’s supervision and independently.

Pretraining/posttraining surveys

Before the training, participants completed a survey regarding their clinical background and history of using various gestational age dating methods. Immediately after the training, participants completed a second survey to evaluate the components of the workshop and the feasibility of implementing the Dubowitz examination within their local context. Responses to the surveys were anonymous. For data analysis of the pretraining and posttraining surveys, we generated basic descriptive statistics on demographics and multiple-choice responses. We collated the responses to open-ended questions and examined them for themes.
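The descriptive statistics for multiple-choice items can be sketched as a simple count and percentage per option. This is a minimal illustration, not the study's analysis code (which used Stata and Excel); the function name `tabulate_multiselect` and the example responses are hypothetical.

```python
from collections import Counter


def tabulate_multiselect(responses, denominator=None):
    """Return {option: (n, percent)} for a select-all-that-apply item.

    responses: one list of selected options per respondent.
    denominator: number of respondents used for percentages
                 (defaults to the number of response lists).
    """
    n_total = denominator if denominator is not None else len(responses)
    if n_total <= 0:
        raise ValueError("denominator must be positive")
    # set() ensures a respondent is counted at most once per option.
    counts = Counter(opt for selected in responses for opt in set(selected))
    # Percentages are rounded half up to whole numbers.
    return {opt: (n, int(100 * n / n_total + 0.5)) for opt, n in counts.items()}


# Hypothetical responses from four respondents (not study data).
example_responses = [
    ["last menstrual period", "ultrasound"],
    ["last menstrual period"],
    ["postnatal examination"],
    [],
]
example_table = tabulate_multiselect(example_responses)
for option, (n, pct) in sorted(example_table.items()):
    print(f"{option}: {n} ({pct})")
```

Because respondents could select more than one answer, the percentages for a single item can sum to more than 100.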

Standardization exercise

At the end of the skills session, trainees completed a standardization exercise in which each independently performed a Dubowitz examination on one to three infants. The trainer independently examined the same infants and served as the reference for assessing trainer-trainee agreement with a Bland-Altman analysis. The results are shown on a scatterplot in which each circle plots the difference between the two estimates of a trainer-trainee pair against their average. Circle size is weighted to indicate the number of trainer-trainee pairs with an identical difference and average estimated gestational age. We also conducted Bland-Altman analyses for physician-only and nurse-only subgroups.
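The quantities behind a Bland-Altman analysis can be sketched as follows. This is a minimal illustration under stated assumptions: differences are taken as trainee minus trainer, limits of agreement are the mean difference ± 1.96 standard deviations, and the function name `bland_altman` and the example values are hypothetical rather than taken from the study data.

```python
from collections import Counter
from statistics import mean, stdev


def bland_altman(trainee_weeks, trainer_weeks):
    """Compute bias, 95% limits of agreement, and plotting weights.

    Both inputs are estimated gestational ages in weeks, paired by infant.
    """
    if len(trainee_weeks) != len(trainer_weeks) or len(trainee_weeks) < 2:
        raise ValueError("need at least two paired estimates")
    pairs = list(zip(trainee_weeks, trainer_weeks))
    diffs = [trainee - trainer for trainee, trainer in pairs]   # y-axis
    avgs = [(trainee + trainer) / 2 for trainee, trainer in pairs]  # x-axis
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    # Identical (average, difference) points are counted so that the
    # plotted circle can be enlarged for repeated pairs.
    point_weights = Counter(
        (round(a, 1), round(d, 1)) for a, d in zip(avgs, diffs)
    )
    return {
        "bias": bias,
        "lower": bias - 1.96 * sd,
        "upper": bias + 1.96 * sd,
        "point_weights": point_weights,
    }


# Hypothetical paired estimates in weeks (not study data).
example_trainee = [36.0, 38.0, 37.0, 39.0, 38.0]
example_trainer = [37.0, 38.0, 36.0, 39.0, 38.0]
example_result = bland_altman(example_trainee, example_trainer)
print(example_result["bias"], example_result["lower"], example_result["upper"])
```

The same function can be applied to physician-only or nurse-only pairs to obtain subgroup limits of agreement.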

For each individual examination component, we calculated the percentage of trainee results that were within one point of the trainer. We also calculated the average time trainees took to complete the full examination. Examinations that were incomplete were not included in the analysis of individual signs or composite scores. We conducted data analyses using Stata 15 (StataCorp, College Station, Texas) and Microsoft Excel.
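The per-sign agreement measure described above can be sketched in a few lines. The function name `within_one_point`, the sign names, and the example scores are hypothetical. As a simplification, this sketch skips a sign-level pair when either score is missing, whereas the study excluded incomplete examinations entirely.

```python
def within_one_point(scores_by_sign, tolerance=1):
    """Return {sign: percent of trainee scores within `tolerance` of trainer}.

    scores_by_sign maps each sign to a list of (trainee, trainer) score pairs.
    """
    agreement = {}
    for sign, pairs in scores_by_sign.items():
        # Keep only pairs in which both examiners recorded a score.
        complete = [(t, r) for t, r in pairs if t is not None and r is not None]
        if not complete:
            continue  # no usable pairs for this sign
        hits = sum(1 for t, r in complete if abs(t - r) <= tolerance)
        agreement[sign] = 100 * hits / len(complete)
    return agreement


# Hypothetical scores for two signs (not study data).
example_scores = {
    "edema": [(1, 1), (2, 0), (0, 1), (None, 1)],
    "scarf sign": [(2, 2), (3, 2)],
}
example_agreement = within_one_point(example_scores)
for sign, pct in example_agreement.items():
    print(f"{sign}: {pct:.0f}%")
```

Setting `tolerance=0` would give exact agreement instead, which is a stricter criterion than the one used here.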


Trainee characteristics

Twenty trainees participated in the Dubowitz workshop in June and July 2019, seven from India and 13 from Malawi (Table 1). All participants were members of the research team for the LIFE study, and the majority (90%) had a clinical background: 50% of the participants were nurses, and 35% were physicians. Most trainees in Malawi were nurses (10 of 13), whereas most trainees in India were physicians (5 of 7). More than half (55%) had more than 10 years of experience in their role, and 45% had prior experience with either the Dubowitz or Ballard examination.

Table 1. Trainee demographics (N=20).
Characteristic n (%)
Country
India 7 (35)
Malawi 13 (65)
Role
Nurse 10 (50)
Physician 7 (35)
Nonclinical researcher 2 (10)
Clinical officer* 1 (5)
Years of experience in role
0-5 6 (30)
6-10 3 (15)
11-15 3 (15)
>15 8 (40)
Prior experience with Ballard or Dubowitz
Yes 9 (45)
No 9 (45)
Not known 2 (10)

*Clinical officers are nonphysician practitioners.

Pretraining survey

In the pretraining survey (Table 2), we asked participants to identify methods of both prenatal and postnatal gestational age assignment that they had previously used. Most participants (85%) reported using LMP to estimate gestational age. More than half of the participants (55%) reported ever using any type of postnatal examination. Forty-five percent had used symphysis-fundal height to estimate gestational age, all in Malawi.

Table 2. Pretraining and posttraining survey results (N=20 unless otherwise specified).
Pretraining n (%)
  Methods used to estimate gestational age in clinical practice*
  Last menstrual period 17 (85)
  Examination of infant after birth 11 (55)
  Symphysis-fundal height 9 (45)
  Ultrasound, first trimester 9 (45)
  Ultrasound, second trimester 8 (40)
  Ultrasound, third trimester 9 (45)
  Birthweight 5 (25)
  Past challenges encountered with use of Dubowitz or Ballard examination (among trainees with previous experience)* (n/N)
  Discomfort by mothers 5/9 (56)
  Baby cries too much 4/9 (44)
  Provider lack of confidence in conducting the examination 3/9 (33)
  Provider concerns related to hurting the baby 3/9 (33)
  Baby becoming too cold 1/9 (11)
  Insufficient provider training 1/9 (11)
Posttraining n (%)
  Confidence performing the examination
  Not at all confident 0 (0)
  Slightly confident 0 (0)
  Moderately confident 11 (55)
  Very confident 9 (45)
  Length of training  
  Too short 6 (30)
  Just right 14 (70)
  Too long 0 (0)
  Perceived challenges related to examination/training*  
  Takes too much time to learn 6 (30)
  Takes too much time to perform 5 (25)
  Too complex 3 (15)
  Afraid of hurting the baby 2 (10)
  Other 1 (5)
  Perceived challenges related to parental acceptance of the examination* 
  Baby cries too much 14 (70)
  Pain caused to baby 13 (65)
  Harm to baby 8 (40)
  Long duration of examination 8 (40)
  Other 3 (15)

*Respondents could select more than one answer.

When asked for participants’ opinions of which methods most accurately assessed gestational age in the absence of optimal prenatal dating, most trainees identified the examination of an infant after birth (60%). Less common answers included third-trimester ultrasound (20%), LMP recall (20%), and birthweight (10%). No trainees identified symphysis-fundal height as an accurate dating method.

The nine trainees who had previously used the Ballard or Dubowitz examinations identified challenges they had encountered. More than half (56%) had encountered parental discomfort during the administration of the examination. Nearly half (44%) also reported excessive crying by the babies. The same proportion (44%) reported feeling that they were either insufficiently trained or lacked confidence in their ability to accurately perform the examination.

Posttraining survey

After training, all participants reported confidence in their ability to complete the examination (Table 2). The majority (70%) reported that the length of the training was appropriate; however, when asked for recommendations to improve the training, 70% suggested adding components to the training to make it longer and more in-depth. Most recommended more hands-on practice, particularly with preterm infants.

Nearly all trainees (89%) believed that they would use the Dubowitz examination in their clinical practice in the future and that it would not cause disruptions in workflow. Most trainees (58%) did not think that the Dubowitz examination would be limited because of the physical space in their work environment. Some concerns were expressed about the time required to learn and complete the examination. Thirty percent of participants reported that the training took too much time, and 25% reported that it took too long to perform the examination. Eighty-five percent of trainees anticipated that parents may have a negative perception of the examination, including concerns that the examination would cause distress or pain to their child.

Dubowitz standardization findings

Trainer-trainee agreement: Individual signs

Trainees independently completed the Dubowitz examination on 23 infants for trainer-trainee comparisons. For each individual sign, we calculated the trainer-trainee agreement as the percentage of trainee scores falling within one score point of the trainer (Figure 1). We found greater than 80% agreement for 16 of the 22 signs (male/female genitalia considered separately). For five signs, 100% of trainee values were within a single point of the trainer. These signs were female labia, ear recoil, breast size, scarf sign, and skin texture. The four signs with the lowest rate of agreement were popliteal angle (74%), head lag (73%), male testes (70%), and edema (61%).

Figure 1. Trainer-trainee agreement on individual Dubowitz score signs: percentage of trainee scores within one point of trainer by individual sign (n=23).

Trainer-trainee agreement: Composite score

Bland-Altman analysis comparing trainee to trainer gestational age estimates demonstrated 95% limits of agreement of -3.7 to 2.7 weeks with no significant bias (Figure 2). On subgroup analysis, physicians performed similarly to nurses (physicians: 95% limits of agreement -3.9 to 2.1 weeks; nurses: 95% limits of agreement -3.4 to 3.0 weeks). Most infants included in this analysis were late preterm (34-36 weeks’ gestation) or term (≥37 weeks’ gestation), with paired average values ranging from 34.4 to 39.4 weeks’ gestation. The average time for a trainee to complete the full examination was 14 minutes.

Figure 2. Bland-Altman plot of trainer-trainee agreement between the estimated gestational ages of trainer and trainee.

Each dot represents a paired measurement that is averaged and plotted against the difference in values. Larger circles are weighted to indicate repeated pairs of trainee/trainer responses (occurred three times in this dataset). The 95% limits of agreement were -3.7 to 2.7 weeks.


Principal findings

The Dubowitz examination is a tool that can improve the classification of infants as preterm in the absence of prenatal care. We found that clinicians could be trained to perform the Dubowitz examination with a reasonable level of competency and that trainees did not perceive the time or complexity of the training to pose a significant barrier to implementation. After the training, all trainees reported confidence in their ability to complete the examination. Most reported that the training was of appropriate length. The predominant potential barrier to the administration of the Dubowitz examination by trainees was the perception of negative parental reactions to the examination. Although we expected that trainees would identify the complexity of the examination as a potential barrier to implementation, only a minority of trainees reported that the examination was too complex (15%) or time-consuming (30%) to teach health workers.

At the end of the training, trainer-trainee agreement for individual signs was high, with more than 80% of trainee scores falling within one point of the trainer for most signs. Almost all (95%) of the trainee estimates of gestational age were within ±3 weeks of the reference pediatrician’s estimate. It is possible that the distribution of composite scores would improve further if future trainings targeted the individual signs associated with the lowest levels of trainer-trainee concordance.

Strengths and limitations of the study

Strengths of this study include (1) the combination of a posttraining skills assessment with the survey exploring potential barriers to implementation of the Dubowitz examination and (2) the inclusion of participants in select study facilities in both India and Malawi. Other studies have focused primarily on the accuracy of the Dubowitz examination compared with other methods of gestational age assignment, not on questions related to training. Furthermore, we found no previously published qualitative studies related to the feasibility of implementing the Dubowitz examination in a clinical setting in LMICs. The information from our study contributes to these gaps in the literature.

Our study had several limitations. Because it was conducted in preparation for a separate study, the number of trainees was limited. This small sample size may have limited our ability to detect differences in results attributable to intrahospital or regional trends. Nearly half of the trainees had prior experience performing either the Dubowitz or the Ballard examination, although several reported that they had received only abbreviated training. The assessment of trainer-trainee agreement was conducted immediately after the training; ideally, it would also be conducted at a later time to assess skill retention. Infants examined in this workshop were late preterm or full term; therefore, the results may not be applicable to a population that is extremely or moderately preterm.

Differences in results from other studies

Trainees did not express concern about the complexity of the Dubowitz examination or the belief that it could be harmful to the examined infants. This is contrary to the opinion expressed elsewhere in the literature. Other authors have identified the complexity of the Dubowitz examination as the impetus for creating simplified postnatal assessments, such as the New Ballard.19–22 These simplified methods of gestational age calculation are typically shortened at the cost of examination validity.15 Only 25% of trainees in our study were concerned that the examination would take too long for health workers to perform in a clinical setting.

One hindrance to widespread implementation may be the time required to conduct the examination. On average, trainees required 14 minutes for completion, a time that could be prohibitive in a busy clinical setting. We anticipate the time required to perform the examination may decline with practice, which would be consistent with other studies of the Dubowitz examination in which the average time ranged from 5 to 8 minutes.23

Another reported barrier to Dubowitz implementation was the concern that the examination may harm or cause discomfort to the infant. The Dubowitz examination has been deemed by some researchers and clinicians to be excessively disturbing to an infant.24 Certain neuromuscular signs, such as ventral suspension, have been considered particularly concerning.25 In our survey, only a small percentage (10%) of trainees were concerned that they may harm the infant being examined.

Implications for clinicians and policymakers, unanswered questions, and future research

Our results suggest that examiners can be trained to an adequate level of proficiency after a brief workshop. Particular focus on the individual signs with the lowest trainer-trainee agreement in subsequent training sessions (edema, testes, head lag) may improve agreement on composite scores. Nearly all participants in this study were either physicians or nurses, so these results related to the reliability after training may not be applicable to all cadres of health workers.

Contrary to the literature justifying simplified algorithms of postnatal gestational age assignment, examination complexity may not be a significant barrier to implementation; other barriers such as time requirements or parental concern may need to be addressed. The majority of trainees believed that parents would find the examination to be excessively distressing to an infant. Future implementation efforts could include strategies to address parental concerns and minimize disturbance to infants. Our assessment of potential barriers to implementation included trainees only. Future studies could include other stakeholders such as traditional birth attendants, parents, or community health workers. Focus-group discussions and in-depth interviews could be fruitful methods of obtaining insight on feasibility and acceptability among clinicians and parents alike.


In settings in which prenatal care is limited or delayed, the Dubowitz examination is a valuable tool for gestational age assignment; however, implementation has not been widespread. We demonstrate that clinicians can be trained to a sufficient level of reliability in a short session. Subsequent training workshops could be strengthened by placing greater focus on certain signs with lower rates of trainee-trainer agreement. The time required to perform the examination may not be a barrier to implementation in certain settings. However, parental concerns may be a barrier. Strategies to better understand parental attitudes and overcome this potential barrier should be included in future studies. Future research should also target other stakeholders important to the adoption of the Dubowitz examination.


The authors would like to thank all participants in the Dubowitz examination training workshops and the members of the LIFE study team. We would like to thank Katelyn Fleming and Eliza Fishman, who played integral roles in preparing for the LIFE training workshops. Importantly, we would like to thank the mothers and infants who participated in the training workshops.

Data-sharing statement

Deidentified individual participant data that underlie the results reported in this article will be available immediately after publication to researchers who submit a methodologically sound proposal to the LIFE study leadership team; the data may be used to achieve the aims in the approved proposal. Requests may be directed by email to the corresponding author.


This work was supported by the Bill & Melinda Gates Foundation, grant number OPP1192260. The Bill & Melinda Gates Foundation reviewed the study design and sample-size calculations but was not involved in data collection, management, analysis, interpretation, writing of the manuscript, or the decision to submit manuscripts for publication.

Authorship contributions

ACCL, CB, KN, and KEAS conceptualized and designed the study. KN and VH led the training workshops. KN performed the data analysis and wrote the manuscript. All authors reviewed multiple drafts of the manuscript, provided critical feedback, and approved the final version.

Competing interests

Dr. Lee reports grants from the Bill & Melinda Gates Foundation, grants from the National Institutes of Health/Eunice Kennedy Shriver National Institute of Child Health and Human Development, and other support from the WHO Department of Maternal, Newborn, Child and Adolescent Health Research outside the submitted work. Dr. Semrau reports grants from the Bill & Melinda Gates Foundation during the conduct of the study.

Correspondence to:

Krysten North, MD, MPH

Department of Pediatric Newborn Medicine

Brigham and Women’s Hospital

75 Francis Street

Boston, MA 02115
