Early pubertal timing has consistently been associated with internalizing psychopathology in adolescent girls. Here, we aimed to examine whether the association between timing and mental health outcomes varies by measurement of pubertal timing and internalizing psychopathology, differs between adrenarcheal and gonadarcheal processes, and is stronger concurrently or prospectively. We assessed 174 female adolescents (age 10.0–13.0 at Time 1) twice, with an 18-month interval. Participants provided self-reported assessments of depression/anxiety symptoms and pubertal development, subjective pubertal timing, and date of menarche. Their parents/guardians also reported on the adolescent’s pubertal development and subjective pubertal timing. We assessed salivary dehydroepiandrosterone (DHEA), testosterone, and estradiol levels and conducted clinical interviews to determine the presence of case-level internalizing disorders. From these data, we computed 11 measures of pubertal timing at both time points, as well as seven measures of internalizing psychopathology, and entered these in a Specification Curve Analysis. Overall, earlier pubertal timing was associated with increased internalizing psychopathology. Associations were stronger prospectively than concurrently, suggesting that timing of early pubertal processes might be especially important for later risk of mental illness. Associations were strongest when pubertal timing was based on the Tanner Stage Line Drawings and when the outcome was case-level Diagnostic and Statistical Manual of Mental Disorders, 4th edition (DSM–IV) depression or Hierarchical Taxonomy of Psychopathology (HiTOP) distress disorders. Timing based on hormone levels was not associated with internalizing psychopathology, suggesting that psychosocial mechanisms, captured by timing measures of visible physical characteristics, might be more meaningful determinants of internalizing psychopathology than biological ones in adolescent girls.
Future research should precisely examine these psychosocial mechanisms.

Adolescence is a sensitive period of life for neurobiological development and risk for psychopathology (Ladouceur et al., 2012; Paus et al., 2008). Girls are two to three times more likely to experience depression than boys from puberty onward (Angold et al., 1998). The substantial changes in social, physical, and hormonal development that occur during pubertal development can be related to mental health outcomes (Patton & Viner, 2007). Pubertal timing, which is pubertal status or stage relative to same-age and same-sex peers, has repeatedly and independently been associated with risk for psychopathology (Angold & Costello, 2006; Kaltiala-Heino et al., 2003; Mendle et al., 2010). In particular, many studies show that early timing (i.e., developing ahead of peers) is associated with increased risk for internalizing disorders like depression and anxiety (Graber, 2013), although some studies have found that this effect is small (Stice et al., 2001) or not statistically significant (Angold et al., 1998). A recent meta-analysis (Ullsperger & Nikolas, 2017) of 101 studies found that, overall, early timing is associated with more internalizing psychopathology, although this was moderated by the measurement method of pubertal timing. Different methods may tap into two different groups of mechanisms proposed to drive the association between pubertal timing and mental health: psychosocial and biological mechanisms (Rudolph, 2014). Biological processes include, for example, sensitivity of the brain to pubertal hormones (Ge & Natsuaki, 2009). Meanwhile, psychosocial mechanisms might include negative self-perceptions of physical differences, or consequences of others overestimating an adolescent’s social or cognitive maturity.
Subjective timing (asking adolescents to rate their own pubertal timing) addresses psychosocial mechanisms more, whereas age at menarche or hormone levels relative to age both represent biological mechanisms, and physical maturation measures (e.g., the Pubertal Development Scale [PDS; Petersen et al., 1988] or the Tanner Staging Line Drawings [LD; Morris & Udry, 1980]) capture a combination of both because they are the direct result of hormonal changes but are also visible to the adolescent and people in their environment. In addition to distinctions between measures that capture biological versus psychosocial mechanisms, another meaningful distinction in measurement of pubertal timing lies in the different processes of puberty, adrenarche and gonadarche (Counts et al., 1987). Compared with gonadarche, there is much less research on adrenarcheal processes predicting psychopathology. Adrenarcheal processes include increases in hormones produced in the adrenal gland (dehydroepiandrosterone [DHEA] and, in girls, testosterone), whereas gonadarche is driven by gonadal hormones like estradiol. Animal and human studies have shown that adrenal hormones can influence brain function and development through a range of mechanisms, including antiglucocorticoid effects of DHEA (Campbell, 2011; Maninger et al., 2009), but the direct evidence for associations between timing of adrenarcheal processes of puberty and internalizing psychopathology is inconsistent (Byrne et al., 2017). Differences in measurement or definition of internalizing psychopathology may also contribute to inconsistencies in associations with pubertal timing (for review, see Negriff & Susman, 2011). The aforementioned meta-analysis found a significant association between pubertal timing and both “distress” and “fear” psychopathology (Ullsperger & Nikolas, 2017). Importantly, however, they did not distinguish between symptomatic and diagnostic measures of psychopathology.
Limiting outcomes to only case-level diagnoses may miss associations between pubertal timing and variation in subclinical symptoms or have reduced power compared with continuous symptom-level variables with greater sample variance. On the other hand, focusing only on symptoms may obfuscate clinically meaningful outcomes, and typically relies on self-report questionnaires, which can include subjective bias. It is also possible that discrete diagnostic categories alone may not fully capture the spectrum of mechanisms underlying developmental psychopathology. The categorical framework of the Diagnostic and Statistical Manual of Mental Disorders (DSM) may not fully capture the heterogeneity within disorders and the common co-occurrence between certain disorders. The Hierarchical Taxonomy of Psychopathology (HiTOP) is an example of a research-driven approach to classifying mental disorder, wherein the structure of psychopathology is conceptualized through higher order dimensions (e.g., internalizing) within which lower order subfactors (e.g., distress, fear) are embedded. Studies using both approaches have found associations between timing and internalizing disorders (Alloy et al., 2016; Graber et al., 1997; Platt et al., 2017). Critically, that same meta-analysis (Ullsperger & Nikolas, 2017) found that age of the sample did not moderate the association between early timing and psychopathology, but they only used cross-sectional data, and cannot show if pubertal timing can predict later mental health outcomes. Of the handful of prospective longitudinal studies available, some show that various measures of pubertal timing (Conley et al., 2012; Copeland et al., 2010; Graber et al., 2004; Marceau et al., 2011) have been prospectively associated with internalizing psychopathology in later adolescence and sometimes through young adulthood, although not always (Lee et al., 2017).
There is also conflicting evidence whether early timing is related to internalizing psychopathology when controlling for history of psychopathology (Crockett et al., 2013; Hamlat et al., 2020). This has implications for identifying the best time window for prevention and early intervention efforts that mitigate the risk of internalizing psychopathology.

Research Questions and Hypotheses

Previous research has established that early pubertal timing is a risk factor for internalizing psychopathology in adolescents. However, several substantive (i.e., mechanistic) and methodological questions remain. These are related to (a) the measurement of pubertal timing, (b) the relevance of adrenarcheal versus gonadarcheal processes, (c) the measurement of internalizing psychopathology, and (d) the strength of concurrent versus prospective associations between timing and psychopathology. The aim of the current longitudinal study was to determine the ways in which pubertal timing is cross-sectionally and prospectively associated with internalizing psychopathology in a sample of mostly White adolescent girls (Barendse et al., 2020). We focused on female adolescents because an important part of the analyses includes pubertal processes, which differ vastly between the sexes, and because girls become increasingly at risk for internalizing mental health problems during puberty (Angold et al., 1998; Patton et al., 1996). To address the open questions discussed above, we applied specification curve analysis (SCA; also called multiverse analysis), a technique that allows researchers to examine and report all nonredundant, reasonable, and justifiable measurement and analytic specifications, and to identify the consequences of specification decisions (Simonsohn et al., 2020). SCA thereby prevents selective reporting and p-hacking in the context of multiple, justifiable measurement and analytic approaches, and it has built-in (usually bootstrapping) approaches to handle multiple comparisons problems.
The choices in the SCA included (detailed in the Method section):

1. Different types of measurement methods of pubertal timing
2. Within those types, measures of adrenarcheal vs. gonadarcheal processes
3. Different types of measurement methods of internalizing psychopathology
4. Cross-sectional and prospective associations between pubertal timing and internalizing psychopathology
5. Inclusion of control variables (the covariates we considered were body mass index [BMI], threat-related early life stress, and preexisting internalizing psychopathology)
6. Whether missing data were imputed or deleted listwise

Based on the findings from the previous meta-analysis (Ullsperger & Nikolas, 2017), we predicted that the largest effect sizes for the association between pubertal timing and internalizing problems would be for age of menarche and timing measured through self-reported Tanner scores. We did not make hypotheses about differences between timing of adrenarcheal and gonadarcheal processes, including adrenal and gonadal hormones, because no previous studies have compared these. We further expected to see associations with all forms of internalizing psychopathology. Finally, based on the literature to date, we expected both cross-sectional and prospective associations, but had no predictions about the relative strength of each compared with the other.

Method

Participants

We recruited 174 female adolescents for this longitudinal study, primarily from schools. Inclusion criteria at enrollment included age 10.0–13.0 years; fluent in English; no developmental disability or autism, psychotic disorder, or behavioral disorder (ODD/CD); and no current use of psychotropic medication other than stimulants. We used data from the first two time points (Time 1 and Time 2), which were 18 months apart (M age at Time 1 = 11.63, SD = .82; M age at Time 2 = 13.20, SD = .84; M timespan = 1.57 years, SD = .12 years). We administered all measures below at both time points.
Sixty-six percent of participants were White, and of the remaining participants most were multiracial or Latina/Hispanic. Full inclusion and exclusion criteria, racial and socioeconomic status distribution of the sample, and further details on the procedure can be found in the protocol paper (Barendse et al., 2020). We received ethics approval from the Institutional Review Board of the University of Oregon (protocol 03232015.027). Parents provided informed consent and adolescents assented to participate.

Measures of Pubertal Timing

Subjective Timing

We used the question in the Pubertal Development Scale (Petersen et al., 1988) that asks about subjective impression of pubertal timing as a measure of subjective timing, both adolescent- and parent-reported: “Do you think your/your child’s development is any earlier or later than most other girls your/her age?” This question was not used in the creation of the PDS score described below. This question is answered on a 5-point scale, ranging from much earlier to much later.

Age at Menarche

We asked adolescents at every time point whether they had ever had their period and, if yes, to report the date of menarche. To obtain age at menarche from as many participants as possible, we also included data beyond Time 2 (for n = 46, the first report of date of menarche was after Time 2; Time 3 data collection is ongoing and occurs 18+ months after Time 2). If participants reported date of menarche during multiple study time points and dates were inconsistent, we used the first reported date (closest to the actual event). If participants did not remember the exact date, we imputed the middle of the range they reported (e.g., June 2018 became June 15, 2018). Age at menarche was available for 81% of participants; 14% were premenarcheal at their latest participation date, and the remaining 5% were postmenarcheal but did not remember or report the date.
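The date-handling rules for age at menarche can be illustrated with a short Python sketch. It is not the authors' code (their analyses were run in R); the function names, the tuple layout of a report, and the example birth date are all hypothetical, and the mid-month rule is shown only for the case where a participant reported a month and year.

```python
from datetime import date

def impute_mid_month(year, month):
    """Replace a month-only report with the 15th of that month."""
    return date(year, month, 15)

def first_reported(reports):
    """Pick the menarche date from the earliest study visit.

    `reports` is a list of (visit_date, reported_menarche_date) pairs;
    the earliest visit is assumed to be closest to the actual event.
    """
    return min(reports, key=lambda r: r[0])[1]

def age_in_years(birth_date, event_date):
    """Age at an event in decimal years (365.25-day years, an assumption)."""
    return (event_date - birth_date).days / 365.25

# Hypothetical participant: two inconsistent reports, the first month-only.
reports = [
    (date(2018, 9, 1), impute_mid_month(2018, 6)),  # visit 1: "June 2018"
    (date(2020, 3, 1), date(2018, 8, 2)),           # visit 2: later recall
]
menarche = first_reported(reports)
age_at_menarche = age_in_years(date(2006, 3, 1), menarche)
```

Keeping the earliest report follows the rule stated above; the choice of 365.25 days per year is a convention of this sketch, not something the source specifies.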
Residual-Based Timing Variables

We additionally used the following measures of pubertal development: PDS, Tanner Stage LD, physical maturation composite scores, and hormone levels. These measures are described in detail below. We created timing variables from these by regressing the pubertal development variable linearly on age within each time point (i.e., two separate linear models, as a single linear model across age did not fit the data) and outputting the residuals. In the remainder of the paper, we refer to these timing variables as “residualized [name of pubertal development measure],” for example, residualized self-report PDS stage.

Pubertal Development Scale

Participants and parents completed the PDS. This questionnaire consists of five questions regarding the adolescent’s secondary sexual characteristics. We converted answers on the self-reported and parent-reported PDS to Tanner stages (Morris & Udry, 1980) using validated conversion methods (Shirtcliff et al., 2009).

Tanner Stage Line Drawings

The Tanner stage LD (Morris & Udry, 1980), female version, consist of two sets of five drawings depicting breasts and pubic hair. For both sets, adolescents choose the image that most closely reflects their current stage of development. Scores range from 1 (prepubertal) to 5 (postpubertal).

Gonadal and Adrenal Composites

For the gonadal and adrenal composites, we first calculated gonadal and adrenal scores on the PDS. The average of the adrenal PDS score and the lower body LD stage formed the adrenal composite, and the average of the gonadal PDS score and the upper body LD stage formed the gonadal composite.

Hormone Assessment

We asked participants to collect four saliva samples of 2 ml at waking, with 1 week in between samples. This allowed us to obtain a more stable estimate of the hormone level, considering momentary, diurnal, and monthly fluctuations. We instructed participants not to eat or brush their teeth before collecting the sample.
Families stored the samples in their home freezer until bringing them to their lab session on ice in a cooler bag. At the lab, we stored samples in a −80 °C freezer until they were shipped overnight on dry ice to the Stress Physiology Investigative Team at Iowa State University. There they were assayed in duplicate for DHEA, testosterone, and estradiol using Salimetrics Enzyme-Linked Immunosorbent Assay (ELISA) kits. Samples were rerun if the optical density coefficient of variation (CV) was greater than 7% and enough sample was left over to do so. The intra-assay coefficients of variation (CVs) at Time 1 were 10.48% for DHEA, 1.80% for testosterone (T), and 7.76% for estradiol (E2). The intra-assay CVs at Time 2 were 2.07% for DHEA, 2.89% for T, and 1.84% for E2. We processed the samples in two batches per time point. The inter-assay CVs at Time 1 for Batch 1 (13 plates) were 20.62% for DHEA, 10.23% for T, and 11.53% for E2, and for Batch 2 (7 plates) were 21.43% for DHEA, 8.34% for T, and 15.55% for E2. The inter-assay CVs at Time 2 for Batch 1 (15 plates) were 11.9% for DHEA, 7.11% for T, and 17.7% for E2, and for Batch 2 (2 plates) were 5.85% for DHEA, 19.6% for T, and 15.4% for E2. All CVs reported are for the optical density wavelengths. See Barendse et al. (2020) for our procedures for handling outliers and undetectable hormone levels. Hormone levels were log-transformed, and they were adjusted for confounds by running mixed effects models predicting the levels of each sample from the time difference between waking and starting collection, whether the sample was collected on a weekday or weekend day, whether the participant felt sick, and use of glucocorticoid sprays/inhalers, contraceptives, and antibiotics/antifungals. We selected these confounds because they predicted the levels of at least one hormone significantly.
We fit separate models for both time points and extracted random intercepts for each participant correcting for these confounds, to obtain one basal hormone level per person per time point.

Measures of Internalizing Psychopathology

Depressive Symptoms

We measured depressive symptoms with the Center for Epidemiologic Studies Depression Scale for Children (CES-DC; Faulstich et al., 1986; Weissman et al., 1980). The CES-DC is a 20-item self-report measure of depression symptoms over the past week with responses ranging from 0 (Not at all) to 3 (A lot), and a total maximum score of 60. The CES-DC has demonstrated excellent internal consistency and concurrent validity with the Children’s Depression Inventory (Faulstich et al., 1986) and DSM diagnoses, as well as good discriminant validity (Fendrich et al., 1990).

Anxiety Symptoms

Participants filled out the short form of the revised Screen for Child Anxiety Related Disorders (SCARED-R) as a measure of anxiety symptoms. The brief version of the SCARED-R screens for DSM–IV anxiety-related symptomatology through a 5-item multidimensional anxiety scale (Birmaher et al., 1999). Answer options range from 0 (Not True or Hardly Ever True) to 2 (Very True or Often True). The measure has good internal consistency and concurrent validity (Birmaher et al., 1999).

Diagnoses

Trained interviewers conducted clinical interviews at Time 1 and 2 with participants using the Schedule for Affective Disorders and Schizophrenia for School Aged Children (6–18 Years), Present and Lifetime Version Interview (K-SADS-PL; Kaufman et al., 1997). Approximately 20% of the interviews were double scored by a second rater, and we calculated interrater reliability at the item level, including all screening symptoms and supplemental symptoms if applicable, using the kappa (κ) statistic (Cohen, 1960; Fleiss, 1971). At Time 1, the average κ was .806, and at Time 2 it was .782, which are considered to be in the “excellent” range (Kaufman et al., 1997).
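For readers unfamiliar with the kappa statistic, the following Python sketch computes Cohen's two-rater kappa for a single item. It is an illustration only: the ratings are invented, the function name is hypothetical, and the source does not say exactly how item-level kappas were computed or averaged.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of exact agreement.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / n**2
    if p_expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (p_observed - p_expected) / (1 - p_expected)

# Invented ratings of one symptom (1 = present, 0 = absent) for ten interviews.
first_rater = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
second_rater = [1, 1, 0, 0, 1, 0, 0, 0, 0, 1]
kappa_item = cohens_kappa(first_rater, second_rater)
```

Here the raters agree on 8 of 10 interviews, but because agreement of .52 is expected by chance, kappa is noticeably lower than the raw agreement rate.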
Diagnoses were determined by the trained interviewers in consultation with a clinician-researcher, and the whole process was monitored by a professor in developmental clinical psychology (Nicholas B. Allen). At Time 1, the interviewers inquired about current symptoms and lifetime history, and at Time 2, about current symptoms and those occurring after Time 1. Current and past diagnoses of major depressive disorder, dysthymia, adjustment disorder with depressed mood, and depression not otherwise specified based on the DSM–IV were combined in a binary “depressive disorder” variable. We combined current and past diagnoses of generalized anxiety disorder (GAD), social anxiety disorder, separation anxiety disorder, panic disorder, agoraphobia, specific phobia, obsessive–compulsive disorder, posttraumatic stress disorder (PTSD), and anxiety disorder not otherwise specified based on the DSM–IV in a binary “anxiety disorder” variable. Further, we created an “internalizing disorder” variable, counting everyone with either a depressive disorder, an anxiety disorder, or both as having an internalizing disorder. Finally, diagnoses were also categorized using the Hierarchical Taxonomy of Psychopathology (HiTOP) method (Kotov et al., 2017), which produced additional “distress disorder” (including depressive disorders, GAD, and PTSD) and “fear disorder” (the remaining anxiety disorders) variables.

Control Variables

We considered three control variables: Time 1 internalizing psychopathology, early life stress (ELS), and Time 1 BMI. The Time 1 psychopathology measure always matched the outcome variable (e.g., if CES-DC at Time 2 was the outcome variable, CES-DC at Time 1 was considered as a control variable). As a measure of ELS, participants filled out the Childhood Trauma Questionnaire (Bernstein et al., 2003) at Time 1. Previous research has demonstrated that the association between ELS and pubertal timing is limited to threat-related ELS (Colich et al., 2020).
Therefore, we excluded physical and emotional neglect from the total ELS score. To limit the ELS score to early life and before puberty, we only included items endorsed as having occurred before age 7. We included BMI as another control variable because it likely leads to earlier timing of puberty (Chen et al., 2019) and BMI is positively associated with risk for internalizing psychopathology (Ames et al., 2015). BMI was calculated from experimenter-measured height and weight (for detailed procedures, see Barendse et al., 2020) and converted to age-and-sex-specific z-scores based on the 2000 CDC growth charts.

Analyses

Data were analyzed in R (v3.6.3). Scripts for analysis can be found on GitHub (DOI: 10.5281/zenodo.4269697).

Imputation

We imputed missing pubertal stage variables, subjective timing variables, psychopathology outcome variables, and control variables using multiple imputation (MI) with Amelia II in R (Honaker et al., 2011), because we considered these variables to be missing at random. For a table of percentages of missing data and details on imputation, see the online supplemental material.

Specifications Considered

We considered 11 measures of pubertal timing and 7 measures of internalizing psychopathology, as described in the sections above. Additionally, we considered prospective and cross-sectional associations by including the Time 1 or Time 2 pubertal timing measure, respectively. The exception to this is age at menarche, since this occurs only once. Further, we fit models both with multiply imputed data and as complete-case analyses, as a sensitivity analysis, because the parent-reported PDS at Time 1 was assumed to be missing not at random. Finally, we considered all possible combinations of the control variables. This led to a total of 2,352 specifications.

Specification Curve

We multiplied residual-based timing variables by −1 to align all pubertal timing variables in the same direction, that is, higher values represent later timing.
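The residual-based timing variables, including the sign flip so that higher values mean later timing, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' R code: the ages and stages are invented, the helper names are hypothetical, and a single time point is shown (the study fit one model per time point).

```python
def ols_fit(x, y):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def residualized_timing(stage, age):
    """Regress pubertal stage on age and return sign-flipped residuals.

    A girl who is more developed than predicted for her age has a positive
    residual; flipping the sign makes higher values mean LATER timing.
    """
    intercept, slope = ols_fit(age, stage)
    return [-(s - (intercept + slope * a)) for s, a in zip(stage, age)]

# Invented data for one time point: age in years and Tanner stage (1-5).
age = [10.5, 11.0, 11.5, 12.0, 12.5]
stage = [1, 2, 4, 3, 4]
timing = residualized_timing(stage, age)
```

In this toy sample the third girl is at stage 4 when the model predicts 2.8 for her age, so her timing score is negative, that is, earlier than her peers.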
We fit linear regression models for continuous outcomes (depressive and anxiety symptoms) and logistic regression models for binary outcomes (diagnoses). All continuous variables were standardized before fitting the regression model. After running all specified models, we ranked them by their regression coefficient and plotted them in a specification curve. The bottom part of the specification curve visualizes how results differ depending on predictor, outcome, and analytical decisions.

Bootstrapping and Inferential Statistics

We performed bootstrapping to examine whether the associations across specifications were significant (Simonsohn et al., 2020). To this end, we created data sets in which we knew the null hypothesis was true, using the method suggested by Simonsohn et al. (2020) for continuous outcomes: extract the regression coefficient of the predictor, multiply it by the predictor, and subtract it from the outcome. For binary outcomes, we first calculated probabilities of the outcome with the effect of the pubertal timing predictor set to zero. Subsequently, we generated a binary variable with this probability at every bootstrap sample. We then used the resulting variable as the outcome and ran 1,000 bootstrapped (with replication) specification-curve analyses with this null-hypothesis data. To obtain a p value, we divided the number of bootstraps with more significant specifications in the dominant direction or more extreme median point estimates than the original dataset by the overall number of bootstraps. In the results, a p value < .05 would indicate that less than 5% of the null-hypothesis data sets had more specifications in the dominant direction, more significant specifications in the dominant direction, or more extreme median point estimates than the original dataset. We adapted code for our analyses from code by Orben and Przybylski (2019) and code to plot the specification curve from the specr package in R (v3.6.3).
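The null-data construction for continuous outcomes can be sketched as follows. This Python example is a deliberate simplification and not the authors' procedure: it uses a single specification with one predictor and compares bootstrapped slopes with the observed slope, whereas the study resampled whole specification curves and compared median estimates and counts of significant specifications. The data and function names are invented, and binary outcomes are not covered.

```python
import random

def ols_fit(x, y):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def null_outcome(x, y):
    """Subtract slope * predictor from the outcome so the null is true."""
    _, slope = ols_fit(x, y)
    return [yi - slope * xi for xi, yi in zip(x, y)]

def bootstrap_p(x, y, n_boot=1000, seed=0):
    """Share of null bootstrap samples with a more extreme slope."""
    rng = random.Random(seed)
    _, observed = ols_fit(x, y)
    y_null = null_outcome(x, y)
    n = len(x)
    more_extreme = 0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample rows
        xb = [x[i] for i in idx]
        yb = [y_null[i] for i in idx]
        if len(set(xb)) < 2:
            continue  # slope undefined when every resampled x is identical
        _, slope_b = ols_fit(xb, yb)
        if abs(slope_b) > abs(observed):
            more_extreme += 1
    return more_extreme / n_boot

# Invented standardized data: later timing (x) with fewer symptoms (y).
x = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
y = [1.2, 0.9, 0.4, 0.3, -0.2, -0.1, -0.8, -1.0]
p_value = bootstrap_p(x, y)
```

Refitting the model on the output of `null_outcome` gives a slope of zero by construction, which is what makes the resampled data sets a valid reference distribution for the null hypothesis.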
Results

Descriptives and Correlations Between Measures of Pubertal Timing

See Table 1 for descriptive information of the sample and the distribution of pubertal development and internalizing psychopathology at Time 1 and Time 2. There was substantial comorbidity of internalizing disorders: At Time 1, 21% of participants with an internalizing disorder had both an anxiety and a depressive disorder, and at Time 2 this was 42%. The overlap between distress and fear disorders was 17% at Time 1 and 32% at Time 2. Total CES-DC scores ranged from 0 to 49 at Time 1 and 0 to 57 at Time 2 (the screening cutoff is 15; Fendrich et al., 1990). Mean SCARED-R scores ranged from 0 to 2 at both time points (the screening cutoff is a mean score of .6; Birmaher et al., 1999). Figure 1 shows that correlations between the various measures of pubertal timing varied from weak to very strong and were similar at both time points.

Specification Curve and Overall Effects

Figure 2 shows that the associations were overwhelmingly in the negative direction. The strength and significance appeared to vary depending on how pubertal timing is defined. Comparing the observed associations to bootstrapped null models demonstrated that early pubertal timing was significantly associated with more internalizing psychopathology (median point estimate [that is, regression coefficient] = −.10, 95% CI [−.11, −.09], p < .001; share of results in the negative direction = 1827/2352, p < .001; share of significant results in the negative direction = 193/2352, p < .001). Tables 2 and 3 show these three inferential statistics and their p values, split by either the predictor or outcome variable. As shown in Table 2, the strongest associations were found for residualized LD Tanner stage and residualized gonadal composite scores (i.e., LD and PDS combined). Table 3 demonstrates that pubertal timing has the strongest association with risk for depressive disorders.
Prospective Versus Cross-Sectional Associations

All pubertal timing measures except age at menarche were acquired at both Time 1 and Time 2, which allowed us to examine prospective (Time 1 predictor) and cross-sectional (Time 2 predictor) associations. Bootstrapping the pairwise difference between cross-sectional and prospective models showed that the median point estimate of the association between pubertal timing and internalizing psychopathology was equally strong prospectively as cross-sectionally (both −.10, bootstrapped p = .94; note that age at menarche was excluded from this comparison). Table 2 presents the inferential statistics for cross-sectional and prospective models separately. The bottom row shows the share of significant results in the negative direction for prospective models and for cross-sectional models. This share was significantly higher for prospective models (120/1120 vs. 73/1120; bootstrapped p < .001).

Relevance of Imputing Missing Values and Including Control Variables

Bootstrapping the pairwise difference between models using imputed and complete-case data demonstrated that imputation did not change the point estimate of the association between pubertal timing and internalizing psychopathology (bootstrapped p = .61; models with imputed data had median point estimate = −.10 and median SE = .19; for models with complete data it was −.10 and .21, respectively). The median point estimate was somewhat weakened by including BMI as a control variable, but not by the other controls (bootstrapped p = .01 for specifications with BMI compared with those without; bootstrapped p = .52 for with vs. without Time 1 psychopathology; bootstrapped p = .47 for with vs. without ELS). The share of significant models in the negative direction was 30/294 for no controls and 25/294 for all controls.
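The specification counts used as denominators in the Results follow directly from the decision points listed in the Method, and a short Python enumeration makes the arithmetic explicit: 2,352 specifications in total, 1,120 each for prospective and cross-sectional models, and 294 for any single set of control variables. The variable labels below are illustrative assumptions; in particular, the breakdown of the 11 timing measures into named items is inferred from the Method section rather than quoted from it.

```python
from itertools import combinations, product

# Ten timing measures available at both time points (labels are illustrative).
repeated_measures = [
    "subjective_self", "subjective_parent",
    "resid_pds_self", "resid_pds_parent", "resid_tanner_ld",
    "resid_adrenal_composite", "resid_gonadal_composite",
    "resid_dhea", "resid_testosterone", "resid_estradiol",
]
# T1 predictor = prospective, T2 predictor = cross-sectional.
predictors = [(m, t) for m in repeated_measures for t in ("T1", "T2")]
# Age at menarche occurs once, so it enters without a time point.
predictors.append(("age_at_menarche", None))

outcomes = [
    "cesdc_symptoms", "scared_symptoms", "depressive_disorder",
    "anxiety_disorder", "internalizing_disorder",
    "distress_disorder", "fear_disorder",
]
controls = ["bmi", "early_life_stress", "t1_psychopathology"]
# Every subset of the three controls, from none to all: 2**3 = 8 sets.
control_sets = [
    subset
    for k in range(len(controls) + 1)
    for subset in combinations(controls, k)
]
data_handling = ["multiple_imputation", "complete_case"]

specs = list(product(predictors, outcomes, control_sets, data_handling))

n_total = len(specs)                                              # 21 * 7 * 8 * 2
n_prospective = sum(1 for p, _, _, _ in specs if p[1] == "T1")    # 10 * 7 * 8 * 2
n_cross_sectional = sum(1 for p, _, _, _ in specs if p[1] == "T2")
n_no_controls = sum(1 for _, _, c, _ in specs if c == ())         # 21 * 7 * 2
n_all_controls = sum(1 for _, _, c, _ in specs if len(c) == 3)
```

The 112 specifications that are neither prospective nor cross-sectional are the age-at-menarche models, which is why that measure is excluded from the prospective versus cross-sectional comparison.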
Discussion

This study applied a novel approach, specification curve analysis, to determine how pubertal timing is cross-sectionally and prospectively associated with internalizing psychopathology across different measurements of pubertal timing and internalizing psychopathology. The association was strongest when pubertal timing was based on the Tanner Stage Line Drawings and the outcome was case-level DSM–IV depression or HiTOP distress disorders. Prospective associations were significantly more often significant than cross-sectional ones. Importantly, the current study is one of the few that has allowed comparative examination of these relationships within the same sample, allowing us both to draw an overall conclusion about the association of timing with internalizing psychopathology regardless of measure and to identify which measurement and analysis decisions were impactful.

Cross-Sectional Versus Prospective Associations

One of the most valuable contributions of both this dataset and statistical approach is the revelation that associations with psychopathology were more often significant prospectively than cross-sectionally. This could suggest that effects simply take time to emerge. Or, it could be that the timing of the initial steps in the pubertal process is particularly salient; therefore, the period of age 10–12 might be a sensitive window for capturing the aspects of pubertal timing that occur early in pubertal development and are relevant to internalizing mental health. However, our prospective associations were measured over a time span of 18 months during early-to-mid adolescence, so we cannot draw any conclusions about associations with mental health at later ages. Furthermore, we were unable to examine the effect of pubertal timing on nonlinear trajectories of internalizing symptoms because we only had two time points. Moreover, including the equivalent Time 1 psychopathology measure did not eliminate or weaken the results. Few previous studies have controlled for the history of psychopathology when examining how early timing is related to internalizing psychopathology, and the two studies that did this showed conflicting findings (Crockett et al., 2013; Hamlat et al., 2020). Our specification curve analysis included a decision point of whether or not to control for Time 1 psychopathology and thus showed that associations between pubertal timing and internalizing psychopathology remained after controlling for this variable.
It is possible, therefore, that the direction of effect is pubertal timing predicting later internalizing psychopathology, but our methods do not allow us to infer causality. Future studies should consider examining whether psychopathology earlier in life may predict pubertal timing.

Age at Menarche

In contrast with several studies looking at age at menarche in relation to depression and anxiety, we did not find significant associations with age at menarche (Patton et al., 1996; Platt et al., 2017; Rierdan & Koff, 1991; Stice et al., 2001). The majority of our participants reached menarche within the course of our study, therefore allowing us to limit recall bias as much as possible. In contrast to the majority of previous studies, we examined age at menarche as a continuous variable instead of creating categories of early, normal, and late timing. We did this to avoid choosing arbitrary cutoffs for these categories. We conducted additional exploratory analyses to test a nonlinear association between age at menarche and internalizing problems (see the online supplemental material), but these did not change the pattern of results. Age at menarche is a rough estimate of pubertal timing based on one milestone, the onset of menstruation. The process of puberty is not a singular event; onset of multiple processes can occur early or late compared with peers. Further, menarche is a late-occurring event, typically occurring years after pubertal onset. So, another reason age at menarche may be different from the other metrics is that it mixes pubertal onset and pubertal duration. In contrast, a subjective measure of timing or an assessment of body changes can be measured at any or multiple points during the process of puberty. If you consider pubertal timing as where an adolescent is at any point in the process relative to peers, it could, for example, be “early” compared with peers at one stage, and “on time” compared with peers at another stage later.
Therefore, measures of timing that capture individual variation in timing of earlier maturational milestones, prior to menarche, may matter most for the etiology of mental health disorders.

Adrenarche Versus Gonadarche

Associations were similar for timing of adrenal and gonadal maturation. First, both the residualized adrenal composite and the residualized gonadal composite from self-report measures were prospectively associated with internalizing psychopathology, even though these composites were only moderately correlated with each other (see Figure 1). Therefore, in girls aged approximately 10 to 14.5, both adrenarcheal and gonadarcheal processes may contribute to internalizing mental health problems. Second, timing based on adrenal hormone (DHEA and testosterone) levels or gonadal hormone (estradiol) levels was not related to any measure of symptoms or disorder, despite using best-practice methods for assessing hormone levels. Therefore, calculating pubertal timing from hormones may not be a useful method of associating timing with internalizing problems in early-to-mid adolescent girls. Hormone levels relative to age may represent aspects of pubertal timing that do not contribute to the mechanisms that are most relevant to the association between timing and internalizing psychopathology.

Measure of Internalizing Psychopathology

The results also varied by outcome measure of internalizing psychopathology, with the strongest associations for depressive disorders and “distress” disorders (i.e., depression, generalized anxiety, and PTSD). These categorical variables were based on a clinical diagnostic interview, demonstrating that associations between early pubertal timing and depressive/distress psychopathology also exist when not solely based on the adolescent’s perception. This significantly improves upon what could be inferred from the meta-analysis (Ullsperger & Nikolas, 2017), where 80% of the studies included only measured symptoms as the outcome.
The median effect sizes of the association between pubertal timing and depressive disorders, as well as distress disorders, were much stronger than between pubertal timing and self-rated depressive symptoms. Since case-level disorders were based on diagnostic interviews, these findings suggest the association is not simply a result of perceptual bias or self-report bias. They might even suggest that self-report bias obfuscates the association with pubertal timing, or alternatively, that pubertal timing might be most relevant in distinguishing more severe, case-level depression from moderate and low depressive symptoms. However, our results are still inconsistent with other studies that have found associations between pubertal timing and subclinical depressive symptoms. The bootstrapped inferential statistics point to no significant association between pubertal timing and anxiety disorders outside of those captured in the HiTOP distress category. This may be attributable, in part, to the heterogeneity of the anxiety disorder category in the DSM–IV. It is further possible that anxiety disorders and HiTOP fear disorders (phobias, SAD, panic, OCD) are less impacted by pubertal timing, as they tend to develop earlier than depression (Lijster et al., 2017).

Testing Potential Mechanisms

The lack of effects of purely biological hormonal measures of timing, combined with the significant results for early timing based on self-reported bodily changes, offers indirect support for hypotheses linking early pubertal timing to internalizing psychopathology through social processes. Adolescents with early timing may be perceived as physically different from their peers by other people and/or themselves, and therefore others may treat them differently and/or they may feel negatively about themselves.
We acknowledge that we have not directly tested mechanisms, and therefore recommend that future research test mediation models that include specific psychosocial measures, such as self-perception and treatment by others (i.e., adultification). As an example, in line with our proposed psychosocial mechanisms, the amount of sexual harassment experienced has been shown to mediate the link between early pubertal timing and depressive symptoms in girls (Skoog et al., 2016). Furthermore, these mechanisms may have an effect on the development of new social and romantic relationships that, in turn, influence mental health according to a recently proposed model (Pfeifer & Allen, 2021), so changes in relationship functioning should also be measured. Longitudinal measurements are especially important for determining mechanisms of the association between psychopathology and subjective timing. The measurement of subjective timing does not tell us whether the early-maturing adolescent is simply noticing that they are physically different and not being cognitively, affectively, or socially ready for the physical changes, or whether the adolescent has a negative bias about themselves already, owing to their risk for depression that will emerge later.

Limitations and Future Directions

Our findings have to be considered in light of several limitations. First, 14 of our participants were still premenarcheal and had to be set as missing on our age at menarche variable. However, as mentioned, we conducted additional post hoc analyses of complete data that did not change the pattern of results. Nevertheless, analysis from longitudinal studies where all participants have completed menarche may uncover additional findings if there is substantial variance in later timing measured this way. Also, our sample was 66% White, which is more diverse than the local population, but still did not give us enough power to test whether the examined associations hold for specific racial/ethnic groups.
Pubertal development occurs, on average, earlier for Black/African American adolescents (Chumlea et al., 2003; Herting et al., 2020), and there are inconsistent findings as to whether the link between early timing and internalizing psychopathology is similar for Black/African American adolescents compared with White adolescents or those of other races (Carter et al., 2013; Deardorff et al., 2021). Therefore, examining associations between pubertal timing and mental health within each racial/ethnic group is an important future direction that can be addressed using large, representative samples such as the ABCD study for the United States. Further, we found that including threat-related early life stress (ELS) before age 7 as a control variable had no impact on the results. However, our sample had low levels of ELS. Future studies with more variability on this measure should continue to test this as a control variable. Finally, genetically informed studies should be conducted to examine whether pubertal timing and psychopathology have overlapping genetic etiology, since both are partly heritable (Meyer et al., 1991; Mustanski et al., 2004).

Conclusions

The current study is one of the first to comprehensively examine the relationships between a wide range of measures of pubertal timing and internalizing psychopathology within the same sample. Overall, this study of adolescent girls showed that self-reported measures of timing were associated with internalizing problems, but age-adjusted hormone levels were not. Furthermore, associations between timing and mental health were strongest for depressive and distress disorders and were more often significant prospectively than cross-sectionally. Future studies should examine mechanisms explaining the link between pubertal timing and internalizing psychopathology that can be targeted in prevention and intervention efforts.
For these studies, we suggest that researchers carefully choose the methods of measurement for both pubertal timing and mental health. Ultimately, this research will assist clinicians in treating internalizing disorders in adolescent girls by highlighting biological and psychosocial risk factors.
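
For readers unfamiliar with the approach, the logic of a specification curve analysis with residualized timing measures can be illustrated with a minimal sketch. This is not the authors' analysis code. The data are simulated, all function and variable names are hypothetical, timing is assumed to be the residual of a development measure regressed on age, and a simple linear model stands in for whatever models the study actually used. The sketch fits one model per combination of timing measure, outcome, and the decision of whether to control for the Time 1 outcome, then summarizes the sorted effects by their median.

```python
import itertools
import random
import statistics


def simple_fit(y, x):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def residualize(y, x):
    """Residuals of y after regressing it on x (e.g., development on age)."""
    a, b = simple_fit(y, x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def partial_slope(y, x, control=None):
    """Slope of y on x, optionally adjusted for one control variable using
    the Frisch-Waugh-Lovell residual-on-residual regression."""
    if control is not None:
        y, x = residualize(y, control), residualize(x, control)
    return simple_fit(y, x)[1]


def specification_curve(timing_measures, outcomes_t2, outcomes_t1):
    """Fit one model per (timing measure, outcome, adjust-for-T1?) choice
    and return (effect, timing name, outcome name, adjusted) sorted by effect."""
    results = []
    for (t_name, t), o_name, adjust in itertools.product(
        timing_measures.items(), outcomes_t2, (False, True)
    ):
        control = outcomes_t1[o_name] if adjust else None
        effect = partial_slope(outcomes_t2[o_name], t, control)
        results.append((effect, t_name, o_name, adjust))
    return sorted(results)


# Simulated data only (assumption): none of these values come from the study.
random.seed(0)
n = 200
age = [random.uniform(10, 13) for _ in range(n)]
stage = [0.8 * a + random.gauss(0, 1) for a in age]
hormone = [0.5 * a + random.gauss(0, 1) for a in age]
timing = {
    "stage": residualize(stage, age),      # development relative to age
    "hormone": residualize(hormone, age),  # hormone level relative to age
}
t1 = {"symptoms": [random.gauss(0, 1) for _ in range(n)]}
t2 = {
    "symptoms": [
        0.5 * s + 0.3 * ti + random.gauss(0, 1)
        for s, ti in zip(t1["symptoms"], timing["stage"])
    ]
}

curve = specification_curve(timing, t2, t1)
median_effect = statistics.median(effect for effect, *_ in curve)
```

In this sketch each entry of `curve` is one specification, so plotting the sorted effects would give the curve itself. An inferential test would additionally compare `median_effect` against a null distribution, for example from bootstrapped or permuted data, which is omitted here.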