In recent years, mindfulness-based interventions (MBIs) have shown promising effects in improving youth well-being in clinical (e.g., Reangsing et al., 2021; Shah et al., 2022), school (e.g., Felver et al., 2016; Pickerell et al., 2023), and family settings (e.g., Burgdorf et al., 2022; Lyu & Lu, 2023). Mindfulness refers to paying attention to one’s experience purposefully, in the present moment, and without judgment (Kabat-Zinn, 2023). It can be seen as a trait (i.e., the predisposition to be mindful in daily life) and a state (i.e., the ability to experience a temporary state of awareness during a mindfulness practice) (Felver et al., 2016). Despite the growing interest in the relation between mindfulness and youth well-being, using valid measurements to operationalize the complex concept of mindfulness across cultures remains a challenge for mindfulness research among youth (Goodman et al., 2017; Pallozzi et al., 2017).
Two existing mindfulness measures are most commonly used (Pallozzi et al., 2017). One is the Child and Adolescent Mindfulness Measure (CAMM; Greco et al., 2011), which was developed specifically for youth. The other is the Mindful Attention Awareness Scale for Children (MAAS-C; Brown & Ryan, 2003; Lawlor et al., 2014), which was adapted from the adult version of the MAAS. Although both measures have shown good reliability and validity, both use a one-dimensional structure. To date, limited research has explored a reliable and valid multifaceted structure for youth mindfulness measures (Pallozzi et al., 2017).
Research has highlighted the importance of using multifaceted mindfulness measurements for several reasons. First, multifaceted measures allow researchers to examine how different aspects of mindfulness naturally evolve in children and youth across developmental stages and how these changes impact their developmental trajectories (Roeser & Eccles, 2015). Second, multifaceted measures would help MBI researchers delineate mechanisms of change by examining the mediating effects of mindfulness’s different aspects between interventions and treatment outcomes in children and adolescents (Goodman et al., 2017). Third, measuring mindfulness at the facet level is critical to understanding the effects of mindfulness practices on specific aspects of mindfulness, offering insights for developing MBIs targeting different outcomes (Hölzel et al., 2011).
As the recent literature increasingly conceptualizes mindfulness as a multifaceted construct, the Five Facet Mindfulness Questionnaire (FFMQ) has become a widely used measure of mindfulness in adults across cultures (e.g., Christopher et al., 2012; Correa et al., 2023; Hou et al., 2014; Okafor et al., 2023; Tran et al., 2013). The FFMQ originated from 112 items derived from five existing mindfulness measures and was initially examined among young adults (Baer et al., 2006). The final FFMQ included 39 items that measure five facets: Observe (noticing or attending to one’s experiences), Describe (describing inner experiences with words), Actaware (attending to the present moment and stepping out of automatic pilot mode), Nonjudge (accepting feelings and thoughts without judgment), and Nonreact (letting feelings, thoughts, and emotions come and go) (Baer et al., 2008).
Although the FFMQ has been used in youth MBI studies, relatively few studies have examined its reliability and validity in youth. Among the few studies that validated the FFMQ in youth populations (e.g., Abujaradeh et al., 2020; Cortazar et al., 2020), participants ranged from middle to late adolescence. Their results indicated that a multifaceted structure seemed applicable and that a shorter form may be an effective tool for youth. However, to our knowledge, no study has examined the psychometric properties of the FFMQ in early adolescents.
Beyond the lack of multifaceted measures, mindfulness measurement must also account for contextual differences in youths’ understanding of emotions, feelings, and behaviors; their interpretation of measurement items; and their perceived purpose of mindfulness practices, as these perceptions are largely shaped by their social interactions in specific cultural environments (Goodman et al., 2017). Previous research (Karl et al., 2020) suggested that the current individualistic conceptualization of mindfulness may be more suitable for cultures with higher individualism, which may bias its application to collectivistic cultural contexts. For example, individualism in Western countries focuses more on one’s emotional expression, whereas collectivism in East Asian cultures emphasizes interdependence that encourages suppression of emotions to achieve interpersonal harmony and bonding (Oyserman et al., 2002; Teuber et al., 2023). A study of cultural differences in emotion regulation between Chinese and German youth found that Chinese youth were more likely to use cognitive reappraisal and expressive suppression strategies to regulate their emotions (Teuber et al., 2023).
Chinese youth may also perceive the concepts, values, and practices of mindfulness differently from Western youth (Schmidt, 2011). The concept of mindfulness originated from Eastern Buddhist philosophy nearly 2500 years ago and was incorporated into Western psychology in the past few decades (Siegel et al., 2009). Some common mindfulness practices (e.g., sitting meditation) are similar to activities rooted in Eastern philosophy, such as the traditional seated meditation practice promoted by Confucianism, which encourages people to reflect on themselves through quiet sitting (“jing-zuo”). Confucian teaching also holds that by knowing their strengths and weaknesses through mindfulness, individuals improve themselves and thereby fulfill their responsibilities to family and society (Tan, 2019). Despite the cultural roots of mindfulness in Eastern philosophy, mindfulness research lags significantly in non-Western populations (Karl et al., 2020; Xie et al., 2021). With the burgeoning development of mindfulness research and practices in the West, the mindfulness concept was subsequently re-introduced to the East; however, this intercultural exchange may have shifted the meanings of mindfulness through translation, interpretation, and differences in cultural context (Schmidt, 2011). Therefore, it is necessary to explore the validity of youth mindfulness measurements in non-Western cultures.
Our study focused on Chinese early adolescents, representing youth from one of the traditionally collectivist East Asian cultures. An increasing number of studies have evaluated the effects of MBIs on aspects of Chinese children and youth’s well-being, such as emotional regulation and behavioral problems (e.g., Lu et al., 2018, 2022). Most of these studies used the MAAS or CAMM, but none examined the validity of multifaceted mindfulness measures. In this study, we examined the psychometric properties of a Chinese version of the short-form FFMQ in a sample of Chinese early adolescents in school settings, including tests of factor structure, dimensionality, internal consistency, convergent validity, and measurement invariance across gender and grade. Given that short-form measures may be more appropriate for children and adolescents (Mellor & Moore, 2014), we used a 20-item FFMQ that was validated in an Austrian nonclinical adult sample and cross-validated in a group of Austrian university students (Tran et al., 2013), and that was also validated in an adolescent sample in the United States (Abujaradeh et al., 2020). We selected this version of the FFMQ because it has been validated in nonclinical adolescent and student samples with satisfactory validity and reliability across cultures.
Method
Participants
The initial recruitment notice was sent to 608 children in three primary schools in Shenzhen, a metropolitan city in southeast China. Because a school-wide sample was used, all students were eligible to participate. After excluding 25 participants who did not consent to join and 32 who showed invalid response patterns (e.g., the same answer to every item across all scales, or repetitive answer patterns such as “123,454,321”), the final sample comprised 551 participants. Table 1 summarizes participants’ demographic characteristics. The sample included 228 girls (41.38%) and 323 boys (58.62%), with a mean age of 10.38 years (SD = 0.80, range 9–12.50); 181 children were in 4th grade (32.85%) and 370 were in 5th grade (67.15%). Participants were approximately equally distributed among the three schools, with 4–5 classes per school and 36–52 students per class (Online Resource 1, Table S1 in Supplementary Information). In terms of socioeconomic status, 30.85% (n = 170) of participants had a monthly household income of CNY 5000 (approximately USD 695) or less; 34.12% (n = 188) had CNY 5001–10,000; 17.60% (n = 97) had CNY 10,001–15,000; 7.99% (n = 44) had CNY 15,001–20,000; 5.26% (n = 29) had over CNY 20,000; and 4.17% (n = 23) did not report income information. Written informed consent was obtained from all participants and their guardians before the study.
Table 1
Demographic characteristics of participants
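Characteristic | Total sample (n = 551) | Subsample 1 (n = 276) | Subsample 2 (n = 275) |
Age in years, M (SD) | 10.38 (0.80) | 10.38 (0.80) | 10.37 (0.80) |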
Gender |
Girls | 228 (41.38) | 109 (39.49) | 119 (43.27) |
Boys | 323 (58.62) | 167 (60.51) | 156 (56.73) |
School |
School 1 | 191 (34.66) | 105 (38.04) | 86 (31.27) |
School 2 | 192 (34.85) | 102 (36.96) | 90 (32.73) |
School 3 | 168 (30.49) | 69 (25.00) | 99 (36.00) |
Grade |
Grade 4 | 181 (32.85) | 92 (33.33) | 89 (32.36) |
Grade 5 | 370 (67.15) | 184 (66.67) | 186 (67.64) |
Monthly household income |
CNY 5000 or less | 170 (30.85) | 85 (30.80) | 85 (30.91) |
CNY 5001–10,000 | 188 (34.12) | 101 (36.59) | 87 (31.64) |
CNY 10,001–15,000 | 97 (17.60) | 46 (16.67) | 51 (18.55) |
CNY 15,001–20,000 | 44 (7.99) | 17 (6.16) | 27 (9.82) |
Over CNY 20,000 | 29 (5.26) | 15 (5.43) | 14 (5.09) |
Not reported | 23 (4.17) | 12 (4.35) | 11 (4.00) |
Procedure
Trained research assistants distributed survey questionnaires in classrooms during regular school hours. Participants were briefly introduced to the survey and were then invited to self-report their levels of mindfulness, resilience, anxiety, depression, and positive and negative affect. The survey lasted around 30 min. Research assistants and the teacher-in-charge were present to answer participants’ questions. Parents were also invited to rate their children’s emotional and behavioral difficulties and prosocial behaviors through a take-home questionnaire. A total of 511 parents completed the questionnaire and returned it to school. No survey incentives were provided to students or parents.
Data Analyses
First, we conducted descriptive analysis to examine the FFMQ-20 items’ psychometric properties. The data normality assumption was examined using the criteria that absolute skewness values above 3 and absolute kurtosis values above 10 indicate violations of normality (Kline, 2005). We used Little’s Missing Completely At Random (MCAR) test to assess whether missing data were randomly distributed across observations. The item difficulty index based on the total score was used to assess whether the items were too difficult or too easy for the participants to understand, while the item discrimination index was used to assess the extent to which the items differentiated respondents on the mindfulness construct. By convention, item difficulty values between 0.20 and 0.80 and item discrimination values above 0.20 are considered acceptable (Lüdecke, 2024).
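The two item indices can be illustrated with a minimal Python sketch. It assumes that item difficulty is the item mean divided by the maximum possible score (which is consistent with Table 2, e.g., 2.39 / 5 ≈ 0.48 for Item 1) and that item discrimination is the corrected item-total correlation; the text does not spell out either formula, so both are assumptions, and the response matrix below is hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_difficulty(item_scores, max_score=5):
    """Assumed definition: item mean divided by the maximum possible score."""
    return mean(item_scores) / max_score


def item_discrimination(responses, item):
    """Assumed definition: correlation of the item with the sum of the other items."""
    item_scores = [row[item] for row in responses]
    rest_scores = [sum(row) - row[item] for row in responses]
    return pearson(item_scores, rest_scores)


def is_acceptable(difficulty, discrimination):
    """Conventional cutoffs cited in the text: 0.20-0.80 and above 0.20."""
    return 0.20 <= difficulty <= 0.80 and discrimination > 0.20


# Hypothetical data: 6 respondents x 3 items on a 1-5 Likert scale.
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 3, 2],
    [3, 4, 4],
    [4, 4, 5],
    [5, 5, 4],
]

for i in range(3):
    diff = item_difficulty([row[i] for row in responses])
    disc = item_discrimination(responses, i)
    print(f"item {i + 1}: difficulty={diff:.2f}, discrimination={disc:.2f}, "
          f"acceptable={is_acceptable(diff, disc)}")
```

Statistical packages may define these indices slightly differently, so the sketch is meant only to make the cutoffs concrete.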
We then conducted confirmatory factor analysis (CFA) using the entire sample (n = 551) to examine the FFMQ-20’s factor structure. The robust maximum likelihood (MLR) estimation method was used because it relies less on the multivariate normal distribution assumption and is more suitable for analyzing smaller sample sizes (Li, 2016). We evaluated structural validity using four model fit indices: the Comparative Fit Index (CFI), the Tucker-Lewis Index (TLI), the Standardized Root Mean Square Residual (SRMR), and the Root Mean Square Error of Approximation (RMSEA). Acceptable and good model fit, respectively, are indicated by CFI values of at least 0.90 and 0.95, TLI values of at least 0.90 and 0.95, RMSEA values of at most 0.08 and 0.06, and SRMR values of at most 0.10 and 0.08 (Kline, 2016). To improve the model fit, we allowed error variances to covary between items according to the modification indices, given the expected overlap between items.
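As a small illustration of how these cutoffs can be applied, the following Python sketch labels each index separately. The function name and the example values are hypothetical; only the thresholds come from the text.

```python
# (direction, acceptable cutoff, good cutoff) for each index, as stated in the text.
CUTOFFS = {
    "CFI": ("higher", 0.90, 0.95),
    "TLI": ("higher", 0.90, 0.95),
    "RMSEA": ("lower", 0.08, 0.06),
    "SRMR": ("lower", 0.10, 0.08),
}


def classify_fit(index, value):
    """Return 'good', 'acceptable', or 'inadequate' for one fit index."""
    direction, acceptable, good = CUTOFFS[index]
    if direction == "higher":
        if value >= good:
            return "good"
        if value >= acceptable:
            return "acceptable"
        return "inadequate"
    # For RMSEA and SRMR, smaller values indicate better fit.
    if value <= good:
        return "good"
    if value <= acceptable:
        return "acceptable"
    return "inadequate"


# Hypothetical fit indices for illustration only.
example = {"CFI": 0.93, "TLI": 0.88, "RMSEA": 0.05, "SRMR": 0.09}
for name, value in example.items():
    print(name, value, classify_fit(name, value))
```

Labeling each index on its own avoids committing to a single rule for combining indices into one overall verdict, which the text does not specify.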
As the original FFMQ-20’s five-factor structure did not fit well with our data, we conducted exploratory factor analysis (EFA) to further explore the factor structures. To ensure that the factor structure derived from EFA could be confirmed through CFA of an independent dataset and to avoid overfitting, participants were divided into two equivalent subsamples using the Solomon technique (Lorenzo-Seva, 2021). The sample size met the minimum requirement of at least 10 participants for each item in factor analysis (Boateng et al., 2018). Prior to conducting EFA, Bartlett’s test of sphericity and Kaiser–Meyer–Olkin’s (KMO) measure of sampling adequacy were used to assess the dataset’s appropriateness for factor analysis, which is indicated by a significant Bartlett’s test result (p < 0.05) and a KMO value above 0.70 (Shrestha, 2021). In addition, parallel analysis (Horn, 1965) was used to determine the number of factors to retain. The polychoric correlation matrix was used as it is appropriate for the FFMQ’s 5-point Likert scale structure. We then conducted EFA with subsample 1 (n = 276) using principal axis factoring as the extraction method. Oblique rotation (i.e., Promax) was applied to allow the extracted factors to correlate with each other. We used an iterative process to remove items one by one, starting with items with factor loadings below 0.30, followed by items showing the smallest difference in their cross-loadings across factors (Güvendir & Özkan, 2022). We then conducted CFA with subsample 2 (n = 275) to test the structural validity of the factor model derived from EFA.
Upon achieving an acceptable model fit with CFA, we applied an exploratory structural equation modeling (ESEM) model to the entire sample (n = 551) to confirm the factor structure with maximized statistical power. ESEM integrates the advantages of EFA and CFA: it tests an a priori factor structure while allowing items to cross-load on other factors, which represents the data structure more realistically and may mitigate cultural differences related to item interpretation (Marsh et al., 2014). A target rotation method was employed in the ESEM models.
In addition, we tested measurement invariance to evaluate the factor structure’s stability across gender (boys vs. girls) and grade (4th vs. 5th graders). The following hierarchical invariance testing strategy was used: (1) configural invariance; (2) weak (or metric) invariance; (3) strong (or scalar) invariance; (4) strict (or residual) invariance. As recommended by the literature (Chen, 2007; Cheung & Rensvold, 2002), changes in CFI and TLI not exceeding 0.01 and changes in RMSEA not exceeding 0.015 (i.e., |ΔCFI| ≤ 0.01, |ΔTLI| ≤ 0.01, |ΔRMSEA| ≤ 0.015) indicate evidence of acceptable measurement invariance across groups.
The internal consistency of the FFMQ and its subscales was assessed with Cronbach’s α and McDonald’s ω. Convergent validity was examined through the correlations between the FFMQ and related measures (i.e., CD-RISC, PANAS, MAAS, SDQ). We also conducted independent-samples t-tests to examine whether students at risk for anxiety and depression (as evaluated by the RCADS) scored differently on the FFMQ and its subscales. We used IBM SPSS Statistics (Version 26) for data cleaning, R Statistical Software (v4.3.1; R Core Team, 2023) for main analyses, and Mplus (Version 8; Muthén & Muthén, 2017) for ESEM analysis.
Results
Item Descriptive Results
Table 2 summarizes the FFMQ-20’s descriptive statistics. The items’ skewness ranged from − 0.18 to 0.72, and kurtosis values ranged from − 1.18 to − 0.14, indicating that the distributions did not significantly deviate from normality. The item difficulty indices in this study ranged from 0.44 to 0.64, suggesting that all items demonstrated medium difficulty for our sample. The item discrimination indices ranged from 0.32 to 0.51, indicating acceptable performance in differentiating between respondents with high and low levels of mindfulness. The percentage of missing cases across all variables ranged from 0 to 2.72%, with missing values representing 1.04% of the total data. Specifically, the percentage of missing data for the FFMQ-20 items ranged from 0.18 to 1.63%. Little’s MCAR test results indicated that the data were not MCAR (χ² = 12,812.26, df = 11,520, p < 0.001). Therefore, multiple imputation was used to handle missing values.
Table 2
Descriptive statistics of FFMQ-20
Item | n | Missing (%) | M | SD | Skewness | Kurtosis | Difficulty | Discrimination |
1 | 550 | 0.18 | 2.39 | 1.10 | 0.56 | − 0.14 | 0.48 | 0.32 |
2 | 546 | 0.91 | 2.22 | 1.16 | 0.72 | − 0.32 | 0.44 | 0.32 |
3 | 546 | 0.91 | 2.77 | 1.36 | 0.23 | − 1.10 | 0.55 | 0.43 |
4 | 542 | 1.63 | 2.51 | 1.24 | 0.49 | − 0.69 | 0.50 | 0.33 |
5 | 545 | 1.09 | 2.54 | 1.34 | 0.48 | − 0.91 | 0.51 | 0.43 |
6 | 545 | 1.09 | 3.16 | 1.37 | − 0.16 | − 1.13 | 0.63 | 0.51 |
7 | 547 | 0.73 | 2.51 | 1.27 | 0.39 | − 0.84 | 0.50 | 0.33 |
8 | 547 | 0.73 | 2.27 | 1.22 | 0.70 | − 0.43 | 0.45 | 0.38 |
9 | 549 | 0.36 | 2.94 | 1.35 | 0.07 | − 1.15 | 0.59 | 0.43 |
10 | 547 | 0.73 | 3.20 | 1.38 | − 0.18 | − 1.18 | 0.64 | 0.46 |
11 | 546 | 0.91 | 3.10 | 1.30 | 0.00 | − 1.08 | 0.62 | 0.41 |
12 | 544 | 1.27 | 2.63 | 1.26 | 0.37 | − 0.79 | 0.53 | 0.33 |
13 | 547 | 0.73 | 3.05 | 1.31 | 0.02 | − 1.06 | 0.61 | 0.36 |
14 | 545 | 1.09 | 2.82 | 1.34 | 0.26 | − 1.04 | 0.56 | 0.41 |
15 | 546 | 0.91 | 3.20 | 1.34 | − 0.11 | − 1.15 | 0.64 | 0.45 |
16 | 545 | 1.09 | 2.83 | 1.33 | 0.21 | − 1.07 | 0.57 | 0.46 |
17 | 546 | 0.91 | 3.21 | 1.38 | − 0.12 | − 1.18 | 0.64 | 0.43 |
18 | 546 | 0.91 | 2.88 | 1.27 | 0.18 | − 0.92 | 0.58 | 0.41 |
19 | 547 | 0.73 | 2.81 | 1.33 | 0.18 | − 1.07 | 0.56 | 0.48 |
20 | 549 | 0.36 | 2.86 | 1.30 | 0.22 | − 0.94 | 0.57 | 0.39 |
Evaluating the Original FFMQ-20 Factor Structure
Based on the FFMQ’s five dimensions as developed by Baer et al. (2006), we tested the FFMQ-20’s factor structures through CFA and ESEM (see Table 3 for model summary). The models tested are as follows: (1) Model 1: a 20-item CFA model with five correlated factors, where all items load on their respective factors; (2) Model 2: a 20-item CFA model with one second-order factor and five first-order factors, where the five first-order factors load on the second-order factor and all items load on their respective first-order factors; (3) Model 3: a 16-item CFA model with four correlated factors that excluded the 4-item Describe subscale, where the remaining 16 items load onto their respective factors; and (4) Model 4: a 16-item ESEM model with four factors that excluded the 4-item Describe subscale, where the remaining 16 items load onto their respective factors and cross-loadings between items are allowed.
Table 3
Model comparison of FFMQ-20 (Models 1–4)
 | Model 1 | Model 2 | Model 3 | Model 4 |
Model fit |
χ² | 463.21*** | 521.76*** | 180.53*** | 82.45* |
df | 160 | 165 | 98 | 62 |
CFI | 0.84 | 0.81 | 0.94 | 0.99 |
TLI | 0.81 | 0.78 | 0.93 | 0.97 |
RMSEA | 0.06 | 0.06 | 0.04 | 0.02 |
SRMR | 0.09 | 0.09 | 0.06 | 0.02 |
Number of items | 20 | 20 | 16 | 16 |
Items loaded on FFMQ-20 |
Observe | Yes | Yes | Yes | Yes |
Describe | Yes | Yes | No | No |
Actaware | Yes | Yes | Yes | Yes |
Nonjudge | Yes | Yes | Yes | Yes |
Nonreact | Yes | Yes | Yes | Yes |
Second-order factor | No | Yes | No | No |
Data analysis | CFA | CFA | CFA | ESEM |
n | 551 | 551 | 551 | 551 |
As shown in Table 3, the correlated five-factor CFA model (Model 1) and the hierarchical five-factor CFA model (Model 2) both showed inadequate model fit, although Model 1 fit better, χ² = 463.21, df = 160, p < 0.001; CFI = 0.84, TLI = 0.81, RMSEA = 0.06, SRMR = 0.09. In Model 1, Item 7 (standardized factor loading = 0.09, SE = 0.14, p = 0.55) and Item 12 (standardized factor loading = 0.22, SE = 0.12, p = 0.06) showed low loadings on the Describe factor. Because dropping these two items would leave the four-item Describe subscale with only two items, and researchers recommend that each factor contain at least 3 items (Raubenheimer, 2004), we removed the Describe subscale and reran the CFA. The resulting correlated four-factor model (Model 3) showed good model fit, χ² = 180.53, df = 98, p < 0.001; CFI = 0.94, TLI = 0.93, RMSEA = 0.04, SRMR = 0.06. We then conducted ESEM using the four-factor model (Model 4) to confirm this factor structure. In this model, Item 13 on Nonreact (standardized factor loading = 0.23, SE = 0.15, p = 0.12) and Item 16 on Nonjudge (standardized factor loading = 0.29, SE = 0.09, p < 0.01) had loadings below 0.30, indicating that Nonreact and Nonjudge were not well-defined in the four-factor structure.
Exploring a Three-Factor FFMQ-15
Given that the original five-factor FFMQ-20 showed inadequate goodness-of-fit in the CFA models, and that two items in the four-factor structure (after excluding Describe) had low loadings in the ESEM model, we conducted EFA to reexamine the underlying factor structure that best fits our sample. Our parallel analysis and EFA of subsample 1 (n = 276) suggested a three-factor model. The KMO value (KMO = 0.85) and significant Bartlett’s test (χ² = 1432.73, p < 0.001) suggested suitable sampling and correlation for factor analysis. The EFA results showed that the three-factor model explained 45% of the variance.
As shown in Table 4, the 15 items loaded on three factors. Five items (Items 1, 2, 4, 7, and 8) loaded on the first factor. Among them, Item 1 (Mind wanders off), Item 2 (Don’t pay attention), Item 4 (Easily distracted), and Item 8 (Difficult to stay focused) belonged to the original FFMQ Actaware facet. Item 7 (Have trouble thinking of right words) belonged to the original FFMQ’s Describe facet. As these items focus on one’s inability to direct attention and express feelings in the present experience, we renamed this factor as Attention.
Table 4
Factor loadings of exploratory factor analysis
Item | Factor 1 (Attention) | Factor 2 (Observe) | Factor 3 (Internal Awareness) |
ffmq2_R. Don’t pay attention | 0.76 | 0.08 | − 0.11 |
ffmq1_R. Mind wanders off | 0.75 | 0.15 | − 0.26 |
ffmq4_R. Easily distracted | 0.74 | − 0.11 | 0.09 |
ffmq8_R. Difficult to stay focused | 0.70 | − 0.10 | 0.13 |
ffmq7_R. Have trouble thinking of right words | 0.63 | − 0.03 | 0.05 |
ffmq10. Pay attention to sounds | − 0.08 | 0.84 | − 0.13 |
ffmq6. Pay attention to sensations | 0.05 | 0.59 | 0.06 |
ffmq17. Notice visual elements | − 0.15 | 0.57 | 0.23 |
ffmq15. Notice smells and aromas | − 0.01 | 0.57 | 0.12 |
ffmq3. Watch feelings | 0.18 | 0.40 | 0.01 |
ffmq20. Describe feelings at the moment | − 0.20 | 0.01 | 0.72 |
ffmq18. Put experiences into words | − 0.18 | 0.12 | 0.66 |
ffmq14_R. Shouldn’t think that way | 0.19 | − 0.10 | 0.53 |
ffmq19_R. Judge distressing thoughts | 0.22 | 0.07 | 0.49 |
ffmq16_R. Bad or inappropriate emotions | 0.08 | 0.18 | 0.43 |
Another 5 items (Items 3, 6, 10, 15, and 17) loaded on the second factor, including four original Observe items: Item 6 (Pay attention to sensations), Item 10 (Pay attention to sounds), Item 15 (Notice smells and aromas), and Item 17 (Notice visual elements). Item 3 (Watch feelings) belonged to the original Nonreact facet. As Item 3 relates to the observation of feelings, we retained Observe as the factor name.
The remaining 5 items (Items 14, 16, 18, 19, and 20) loaded on the third factor, including three Nonjudge items, Item 14 (Shouldn’t think that way), Item 16 (Bad or inappropriate emotions), and Item 19 (Judge distressing thoughts), and two Describe items, Item 18 (Put experiences into words) and Item 20 (Describe feelings at the moment). As these items focus on awareness of emotions and thoughts, we referred to this factor as Internal Awareness.
Validating the New Three-Factor Model
We then used CFA and ESEM to confirm the FFMQ-15’s three-factor structure obtained from the EFA results. As shown in Table 5, the models tested include (1) Model 5: a 15-item CFA model with three correlated factors, where all items load on their respective factors; (2) Model 6: a 15-item CFA model with three correlated factors, where all items load on their respective factors and residual covariance between Item 7 and Item 8 is allowed as suggested by the Modification Index; and (3) Model 7: a 15-item ESEM model with three factors, where all items load onto their respective factors and cross-loadings between items are allowed.
Table 5
Model comparison of FFMQ-15 (Models 5–7)
 | Model 5 | Model 6 | Model 7 |
Model fit |
χ² | 147.28*** | 139.96*** | 97.26*** |
df | 87 | 86 | 63 |
CFI | 0.91 | 0.92 | 0.98 |
TLI | 0.89 | 0.90 | 0.96 |
RMSEA | 0.05 | 0.05 | 0.03 |
SRMR | 0.08 | 0.07 | 0.03 |
Data analysis | CFA | CFA adjusted | ESEM |
n | 275 | 275 | 551 |
The correlated three-factor model (Model 5) showed acceptable fit, χ² = 147.28, df = 87, p < 0.001; CFI = 0.91, TLI = 0.89, RMSEA = 0.05, SRMR = 0.08. As suggested by the Modification Index, we allowed residual covariance between Item 7 and Item 8. This adjustment yielded an improved fit for the correlated three-factor model (Model 6), χ² = 139.96, df = 86, p < 0.001; CFI = 0.92, TLI = 0.90, RMSEA = 0.05, SRMR = 0.07 (Online Resource 1, Table S2). This three-factor structure was confirmed through ESEM (Model 7), which showed excellent model fit, χ² = 97.26, df = 63, p < 0.001; CFI = 0.98, TLI = 0.96, RMSEA = 0.03, SRMR = 0.03. According to the ESEM model’s factor loadings, all three factors were well defined, and no significant cross-loadings were observed (Online Resource 1, Table S3).
Measurement Invariance
Online Resource 1 (Table S4) presents the measurement invariance tests across gender and grade. The results suggested equivalence between girls and boys, as indicated by changes in CFI, TLI, and RMSEA when comparing weak invariance with configural invariance (ΔCFI = 0.003, ΔTLI = 0.004, ΔRMSEA = 0.001), strong invariance with weak invariance (ΔCFI = 0.006, ΔTLI = 0.001, ΔRMSEA = − 0.001), and strict invariance with strong invariance (ΔCFI = 0.001, ΔTLI = − 0.006, ΔRMSEA = 0.002). These results indicate gender invariance in factor loadings, item intercepts, and residual variances.
For the measurement invariance test between 4th and 5th graders, changes in CFI, TLI, and RMSEA when comparing weak invariance with configural invariance (ΔCFI = 0.001, ΔTLI = − 0.005, ΔRMSEA = 0.001) and strong invariance with weak invariance (ΔCFI = 0.000, ΔTLI = − 0.005, ΔRMSEA = 0.002) were within the acceptable range, indicating that the factor loadings and item intercepts were equivalent between 4th and 5th graders. In addition, when comparing the strict invariance model with the strong invariance model, the changes in TLI (ΔTLI = 0.007) and RMSEA (ΔRMSEA = − 0.002) were within the recommended values, whereas the change in CFI was marginally above 0.01 (ΔCFI = 0.013).
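The decision rule applied in these comparisons can be sketched in a few lines of Python. The function and dictionary names are hypothetical; the thresholds are those stated in the Data Analyses section, and the Δ values are the ones reported above.

```python
def invariance_check(d_cfi, d_tli, d_rmsea):
    """Check each change in fit against |dCFI| <= 0.01, |dTLI| <= 0.01, |dRMSEA| <= 0.015."""
    return {
        "CFI": abs(d_cfi) <= 0.01,
        "TLI": abs(d_tli) <= 0.01,
        "RMSEA": abs(d_rmsea) <= 0.015,
    }


# Reported changes in fit (dCFI, dTLI, dRMSEA) for each nested comparison.
reported = {
    "gender: weak vs. configural": (0.003, 0.004, 0.001),
    "gender: strong vs. weak": (0.006, 0.001, -0.001),
    "gender: strict vs. strong": (0.001, -0.006, 0.002),
    "grade: weak vs. configural": (0.001, -0.005, 0.001),
    "grade: strong vs. weak": (0.000, -0.005, 0.002),
    "grade: strict vs. strong": (0.013, 0.007, -0.002),
}

for step, deltas in reported.items():
    result = invariance_check(*deltas)
    failed = [name for name, ok in result.items() if not ok]
    print(step, "->", "within cutoffs" if not failed else f"exceeds cutoff on {failed}")
```

Under these cutoffs, only the strict-versus-strong comparison across grade exceeds a threshold, and only on CFI.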
Convergent Validity
We assessed convergent validity through the correlations between FFMQ-15 and related measures. Pearson’s correlation coefficients of 0.5–1.0, 0.3–0.5, and below 0.3 indicate strong, moderate, and small effect sizes, respectively (Cohen, 1992). As shown in Table 6, the FFMQ-15 total score was negatively correlated with Total Difficulty (r = − 0.27) and Negative Affect (r = − 0.30) but was positively correlated with Prosocial Behavior (r = 0.12), Resilience (r = 0.33), Positive Affect (r = 0.21), and trait mindfulness as measured by MAAS (r = 0.47).
Table 6
Correlation analysis of FFMQ-15 and subscales with criterion measures
Scale | Total Difficulty | Prosocial Behavior | Resilience | Positive Affect | Negative Affect | MAAS |
FFMQ-15 | − 0.27*** | 0.12** | 0.33*** | 0.21*** | − 0.30*** | 0.47*** |
Attention | − 0.22*** | 0.05 | 0.21*** | 0.16*** | − 0.38*** | 0.56*** |
Observe | − 0.13** | 0.09 | 0.21*** | 0.10* | 0.01 | 0.56*** |
Internal Awareness | − 0.11* | 0.08 | 0.10* | 0.10* | − 0.17*** | 0.26*** |
Among the FFMQ-15 subscales, Attention was negatively correlated with Total Difficulty (r = − 0.22) and Negative Affect (r = − 0.38), while being positively correlated with Resilience (r = 0.21), Positive Affect (r = 0.16), and MAAS trait mindfulness (r = 0.56). The Observe subscale was negatively correlated with Total Difficulty (r = − 0.13) while being positively correlated with Resilience (r = 0.21), Positive Affect (r = 0.10), and MAAS trait mindfulness (r = 0.56). However, Observe was not correlated with Negative Affect. Last, the Internal Awareness subscale was positively correlated with Resilience (r = 0.10), Positive Affect (r = 0.10), and trait mindfulness as measured by MAAS (r = 0.26), while being negatively correlated with Total Difficulty (r = − 0.11) and Negative Affect (r = − 0.17).
In addition, as research suggests that mindfulness is closely related to depression and anxiety (e.g., Barcaccia et al., 2022), we used independent-samples t-tests to compare the FFMQ-15 scores between participants with at-risk (n = 46) and normal-range (n = 505) symptom levels as categorized by RCADS scores. We converted raw RCADS scores to T-scores adjusted for gender and grade based on the scoring manual. T-scores of 70 or higher indicate being at risk for depression and/or anxiety (Chorpita et al., 2015). As shown in Table 7, participants at risk for depression and/or anxiety scored significantly lower on the FFMQ-15 total score, Attention, and Internal Awareness (FFMQ-15: t = 4.19, p < 0.001, d = 0.65; Attention: t = 7.57, p < 0.001, d = 1.17; Internal Awareness: t = 2.98, p = 0.003, d = 0.46). However, participants at risk for depression and/or anxiety scored significantly higher on the Observe subscale (t = − 2.53, p = 0.01, d = − 0.39).
Table 7
Independent t-test of samples with depressive and anxiety symptom levels in both normal and at-risk ranges
Scale | Group | n | M (SD) | t | p | d | 95% CI |
FFMQ-15 | Normal | 505 | 49.25 (6.81) | 4.19 | < 0.001 | 0.65 | [0.34, 0.95] |
 | At-risk | 46 | 44.93 (5.21) | | | | |
Attention | Normal | 505 | 18.48 (4.17) | 7.57 | < 0.001 | 1.17 | [0.86, 1.48] |
 | At-risk | 46 | 13.59 (4.46) | | | | |
Observe | Normal | 505 | 15.39 (4.67) | − 2.53 | 0.01 | − 0.39 | [− 0.69, − 0.09] |
 | At-risk | 46 | 17.20 (4.37) | | | | |
Internal Awareness | Normal | 505 | 15.37 (2.72) | 2.98 | 0.003 | 0.46 | [0.34, 0.95] |
 | At-risk | 46 | 14.14 (2.35) | | | | |
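The effect sizes in Table 7 can be approximated from the reported group means, standard deviations, and sample sizes. The Python sketch below assumes that Cohen’s d was computed with the pooled standard deviation and that the t statistic comes from an equal-variance (Student’s) test; neither choice is stated in the text, and the function names are illustrative.

```python
from math import sqrt


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference (group 1 minus group 2), pooled-SD version."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)


def student_t(m1, sd1, n1, m2, sd2, n2):
    """Equal-variance independent-samples t statistic from summary statistics."""
    return cohens_d(m1, sd1, n1, m2, sd2, n2) / sqrt(1 / n1 + 1 / n2)


# FFMQ-15 total score from Table 7: normal-range group vs. at-risk group.
d = cohens_d(49.25, 6.81, 505, 44.93, 5.21, 46)
t = student_t(49.25, 6.81, 505, 44.93, 5.21, 46)
print(f"d = {d:.2f}, t = {t:.2f}")
```

With these inputs the sketch gives d ≈ 0.65 and t ≈ 4.19, in line with the reported values; small discrepancies for other rows can arise because the table reports rounded means and standard deviations.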
Internal Consistency
The FFMQ-15 showed acceptable internal consistency, Cronbach’s α = 0.82, McDonald’s ω = 0.86. For the subscales, Attention α = 0.79, ω = 0.82; Observe α = 0.71, ω = 0.74; Internal Awareness α = 0.66, ω = 0.71. As shown in Online Resource 1 (Table S5), the three subscales all showed significant positive correlations with the FFMQ-15 total score (Attention: r = 0.62, p < 0.001; Observe: r = 0.80, p < 0.001; Internal Awareness: r = 0.76, p < 0.001).
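For readers less familiar with Cronbach’s α, the following Python sketch shows the standard formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The response matrix is hypothetical and the function name is illustrative; McDonald’s ω requires a fitted factor model and is not covered here.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents x items matrix of scores."""
    k = len(responses[0])
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: 6 respondents x 3 items on a 1-5 Likert scale.
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 3, 2],
    [3, 4, 4],
    [4, 4, 5],
    [5, 5, 4],
]
print(f"alpha = {cronbach_alpha(responses):.2f}")
```

Because α depends on the number of items as well as their intercorrelations, a five-item subscale will tend to show a lower α than the 15-item total scale even when its items are similarly related.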
Discussion
This study validated a short-form FFMQ among Chinese early adolescents. The results showed satisfactory internal consistency, convergent validity, and measurement invariance across gender and grade of the Chinese version of a 15-item scale. Overall, the FFMQ-15 appeared to be an appropriate measure for Chinese early adolescents. Our analyses supported a three-factor structure that includes Attention, Observe, and Internal Awareness, each consisting of five items.
Previous research (Lecuona et al., 2021) suggested that the Actaware facet in FFMQ may be split into two subfacets: Distractibility (the tendency of distraction) and Mindless actions (the tendency to perform actions without consciousness of doing them). In our study, all the Actaware items in the Attention subscale were reverse-coded items from the Distractibility subfacet. The correlation coefficients indicated that Attention had a small correlation with Observe and Internal Awareness, whereas Observe and Internal Awareness had a moderate correlation with each other. In addition, Attention showed the highest internal consistency among the three subscales.
These results suggest that Attention measures a unique aspect of mindfulness within our scale that is somewhat distinct from Observe and Internal Awareness. In Buddhist teaching, the basis of mindfulness is “steady perception” or “the establishing of mindfulness of the body” (Gethin,
2011, p. 272). In other words, attention is an essential function of consciousness that is the seed of mindfulness (Gethin,
2011). Therefore, we speculate that a sustained presence of mind (i.e., non-distractibility as measured by the Attention subscale) may be a foundational element of mindfulness, which then fosters an awareness of internal and external experiences (i.e., items that are measured by the Observe and Internal Awareness subscales), whereas additional qualities such as Observe (measuring how people extend their attention to external environments) may not always simply come with Attention.
The importance of attention may also explain why, among all FFMQ subscales, adolescents at risk for depression and/or anxiety differed most from adolescents not at risk on Attention. Given that adolescents with depression and anxiety may experience more difficulties with concentration and increased distractibility (American Psychiatric Association,
2013; Hallion et al.,
2018), our findings highlighted the potential importance of alleviating depression and anxiety symptoms in early adolescents by improving their attentional ability.
It is noteworthy that adolescents at risk for depression and/or anxiety symptoms showed higher Observe scores than those not at risk, which aligns with the argument that the Observe facet often shows unexpected relationships with psychological symptoms (e.g., Bergomi et al.,
2013). This issue has been noted since initial studies of FFMQ by Baer et al. (
2006,
2008) and supported by subsequent research, including FFMQ validation studies in adult (e.g., Brady et al.,
2019; Tran et al.,
2013) and adolescent samples (e.g., Abujaradeh et al.,
2020), as well as studies of the relationships between mindfulness facets and psychological well-being (e.g., Brown et al.,
2015; Royuela-Colomer & Calvete,
2016). Previous researchers have proposed various explanations. Some suggested that the inconsistent relationships between the Observe facet and psychological symptoms might be related to participants’ meditation experience (Baer et al.,
2011). For non-meditators who lack the ability or skills to relate to external and internal experiences with an open and nonjudgmental attitude, the Observe facet may result in maladaptive attention to their experiences (Bergomi et al.,
2013), which may lead to more psychological symptoms such as depression and rumination (Royuela-Colomer & Calvete,
2016).
Moreover, some Observe items use the wording “I notice” and “I pay attention to,” which may contribute to self-criticism (Brady et al., 2019). In the Chinese version, these phrases are translated as “zhu-yi,” which has a connotation of “be careful.” For example, a common Chinese expression of regret and intention to avoid similar future mistakes is “I will be careful (zhu-yi) next time.” Future research could employ qualitative methods to further explore the meaning of these phrases to Chinese youth.
Furthermore, research has suggested that the original Observe facet primarily includes observation of the external environment, and that adding items related to emotional observation would benefit this facet’s psychometric properties (Reffi et al.,
2021; Rudkin et al.,
2018). Thus, Rudkin et al. (
2018) developed a new Observe measure that included three subfacets — Body Observing, Emotion Awareness, and External Perception — which has been validated in a Chinese college student population (Sun & Chen,
2023). Consistently, our findings showed that Observe encompassed four items from the original Observe facet and one item from the Nonreact facet (i.e., I watch my feelings without getting lost in them), thereby involving observation of both external and internal experiences. Future research should replicate our findings in other age groups and explore additional culturally relevant experiences that may be added to the Observe facet.
Our validated FFMQ-15 retained only 3 items from the original Describe facet, which were regrouped into two distinct subscales (2 items in Internal Awareness, 1 item in Attention). This may be because Chinese children often experience difficulties in articulating emotions, as research has consistently shown that children from cultures that emphasize communal interdependence, such as Chinese culture, often exhibit more restrained emotional expressions than those from individualistic cultures (Camras et al.,
2006; Louie et al.,
2015). This is particularly evident in the context of negative emotions. Within Chinese culture’s communal paradigm, suppressing negative emotions is valued for its contribution to social harmony and collective solidarity (Chen,
2000). Additionally, in contrast to the direct and explicit emotional expression in low-context cultures (Hall,
1976), high-context cultures, such as Chinese culture, often rely on contextual cues to convey and interpret emotional states (Yang et al.,
2021). Corroborating this cultural pattern, Chinese parents are often reluctant to discuss emotions with their children (Doan & Wang,
2010). Therefore, the lack of conversation and expression of emotions may have limited the Describe items’ applicability in our sampled Chinese youth.
The FFMQ-15 also retained only 3 items from the original Nonjudge facet, which now form part of the Internal Awareness subscale, the subscale with the lowest internal consistency of the three factors. This may be because youths from different cultural backgrounds interpret the Nonjudge items differently. The Nonjudge subscale consists of reverse-scored items such as “I criticize myself for having irrational or inappropriate emotions.” In the Chinese context, these items may be perceived as self-reflection that benefits one’s emotional development rather than harsh self-blame. Deeply influenced by Confucianism, Chinese culture encourages self-reflection as an important path to personal growth (Cheng,
2004; Li & Wegerif,
2014). Widely known traditional teachings that embody this philosophy include “On seeing a man of virtue, try to become his equal; on seeing a man without virtue, examine yourself to not have the same defects” (Confucius, ca. 500 B.C.E., The Analects); “A man studies broadly and examines himself daily, then he becomes wise and acts without fault” (Xunzi, ca. 316 B.C.E.–237 B.C.E., Exhortation to Learning). Supporting this argument, research conducted in Eastern contexts has found negative correlations between Nonjudge and other FFMQ subscales. For example, Meng et al. (
2020) validated FFMQ among Chinese adults and found that Nonjudge was negatively correlated with Nonreact, Observe, and Describe, but it was positively correlated with Actaware. Therefore, our findings suggested that Chinese adolescents may perceive and engage in self-judgment as a way of self-reflection rather than a way to respond to their internal experiences. These cultural differences may lead Chinese youth to respond to Nonjudge items differently from Western youth, which may explain the lower internal consistency of Internal Awareness that includes the three Nonjudge items.
Empirical evidence also suggested that Chinese participants may adopt a dialectical approach in understanding nonjudgment. For example, a recent focus-group study (Zhao et al.,
2021) found that Chinese young adults reported both positive and negative perspectives of nonjudgment and self-judgment. While some participants considered being nonjudgmental to oneself could “soothe negative emotions after failure,” others perceived nonjudgment as “making excuses for failure” and impeding self-improvement. Similarly, half of the participants in that study considered self-judgment undermining, whereas the other half perceived self-judgment as an adaptive strategy to “reflect on your problems” and “avoid similar mistakes in the future.” These findings call for qualitative studies to further explore Chinese youths’ understanding of nonjudgment and its impact on their well-being.
Notably, the FFMQ-15 retained only one item from the original Nonreact facet (i.e., I watch my feelings without getting lost in them). We speculate this may be a result of children’s difficulty comprehending some of the items. According to Piaget’s (
1971) cognitive development theory, children develop abstract reasoning at age 11–14. Therefore, it might be difficult for children younger than 11 years to understand abstract terms such as “step back” when having distressing thoughts (Item 9) and “pause” in difficult situations (Item 11), whereas items that include specific examples such as those in the Observe subscale (e.g., “I pay attention to sensations, such as the wind in my hair or the sun on my face”) might be easier to understand. Similarly, Tran et al. (
2013) found Nonreact a weak indicator of its intended construct and proposed revising this facet to improve item comprehensibility and item discrimination. Van Dam et al. (
2009,
2012) also suggested that individuals’ understanding of the FFMQ items would influence their response. Future studies may consider clarifying these items’ meaning or providing specific examples when measuring mindfulness among children and early adolescents.
Limitations and Future Research
Our study has several limitations. First, we used self-report measures to assess participants’ mindfulness. Self-report measures have been widely used in mindfulness studies because mindfulness often involves internal experiences that are hardly observable by others. However, self-reporting may compromise measurement validity and reliability, especially when participants’ limited abstract reasoning ability hinders their understanding of the items’ meaning, which may bias their responses. Future research aiming to develop and validate child mindfulness measures could explore how parent- or teacher-reported measures, neurobiological tests, and behavioral observation may be used to triangulate with children’s self-reported data. For example, heart rate variability, a physiological marker associated with stress, may supplement the assessment of children’s stress response (Christodoulou et al., 2020). In addition, behavioral observation, such as the Electronically Activated Recorder method (Kaplan et al., 2018), may be used to assess youth mindfulness behaviors in daily life.
Second, given that our participants were early adolescents, we chose a relatively short measure that is more suitable for screening younger populations. We made this compromise to avoid cognitive overload and to accommodate the limited testing time, considering early adolescents’ ability to use Likert-type scales (Mellor & Moore,
2014) and their attention spans. While we established the validity of the short version proposed by Tran et al. (
2013), several other short versions exist (Medvedev et al.,
2018), although their samples were adults aged around 40 years. Bender et al. (
2023) and Goodman et al. (
2017) reviewed mindfulness measures for children and youth, highlighting the importance of using measures validated in youth samples. Therefore, we considered the version by Tran et al. (
2013), which has been validated in youth samples, more appropriate for our sample’s developmental stage. However, future research should validate other FFMQ versions, including the original full scale, in children and adolescents if possible.
Third, although the three-factor structure showed adequate validity and acceptable internal consistency, the Internal Awareness subscale’s relatively low internal consistency suggests room for improvement in its content. This may be a result of early adolescents’ developmental stage and sociocultural differences between East Asian and Western societies. More research is needed to further validate the FFMQ subscales across age groups and cultural contexts. Qualitative interviews may help researchers understand how early adolescents interpret the items, and the translation–back translation approach (the measurement is first translated into the target language and then translated back into the original language, such as that used in Bozkurt et al.,
2024) may help identify discrepancies in item description.
Fourth, our sample included only students from three participating schools in a nonclinical setting in one city. In addition, 65% of our participants came from families with below-local-average income (defined as monthly household income below CNY 10,000, approximately USD 1,450, based on local dual-income household wage; Shenzhen Statistics Bureau, 2022). Moreover, our sample was drawn from a school-wide population and all consenting students were eligible for inclusion. Including all students provided a more comprehensive understanding of student populations in school settings. However, students’ levels of depressive and anxiety symptoms spanned normal and at-risk ranges, which introduced heterogeneity to our sample. Due to these data and sample limitations, our findings cannot be generalized to other geographic regions, clinical populations, or higher-income populations. Future studies should replicate our results with larger and more diverse samples of Chinese adolescents from varied regions, socioeconomic backgrounds, and clinical settings.
Last, our data were from a single measurement point, which precludes the evaluation of test–retest reliability and the longitudinal associations between mindfulness and other psychosocial constructs. Future research should use longitudinal designs to test FFMQ’s structural stability over time. Such findings will also significantly enhance the understanding of the predictive role of mindfulness facets in Chinese early adolescents’ well-being, thereby offering valuable insights for developing more targeted MBIs.
Despite these limitations, to our knowledge, this is the first study to validate the commonly used multifaceted mindfulness measure, FFMQ, among Chinese early adolescents. The increasing interest in MBIs among children and adolescents requires the continuous refinement of reliable and valid mindfulness measurements for these populations. A short-form Chinese-version FFMQ may be used to further identify the underlying mechanisms of the various aspects of mindfulness among young participants.
The discrepancies between our findings and those of previous studies also warrant further exploration, such as the potential cultural difference in children’s experiences of the Describe items, the meaning of Nonjudge items, and the limited relevance of the Nonreact items among Chinese early adolescents. Notably, a recent cross-cultural comparison suggested that FFMQ may capture the conceptualization of mindfulness prevalent in Western and individualistic cultures more so than that in collectivistic cultures. Furthermore, previous research suggested a substantially better fit of the FFMQ in individualistic cultures and a lack of metric equivalence of FFMQ items across cultures (Karl et al.,
2020). Therefore, more research is needed to explore the conceptualization and operationalization of the varied facets of mindfulness in non-Western cultures.