Applied Behavior Analysis in Children and Youth with Autism Spectrum Disorders: A Scoping Review

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Abstract

This manuscript provides a comprehensive overview of the impact of applied behavior analysis (ABA) on children and youth with autism spectrum disorders (ASD). Seven online databases and identified systematic reviews were searched for published, peer-reviewed, English-language studies examining the impact of ABA on health outcomes. Measured outcomes were classified into eight categories: cognitive, language, social/communication, problem behavior, adaptive behavior, emotional, autism symptoms, and quality of life (QoL) outcomes. Improvements were observed across seven of the eight outcome measures. There were no included studies that measured subject QoL. Moreover, of 770 included study records, only 32 (4%) assessed ABA impact, had a comparison to a control or other intervention, and did not rely on mastery of specific skills to mark improvement. Results reinforce the need for large-scale prospective studies that compare ABA with other non-ABA interventions and include measurements of subject QoL to provide policy makers with valuable information on the impacts of ABA and other existing and emerging interventions.

Supplementary Information

The online version contains supplementary material available at 10.1007/s40614-022-00338-x.

Keywords: children and youth, neurodevelopmental disabilities and disorders, applied behavior analysis, autism spectrum disorder

Introduction

Neurodevelopmental Disorders and Disabilities (NDD/D)

NDD/D consist of a range of diagnoses and functional impairments of a neurological origin that can present as functional deficits in developmental milestones such as language, communication, social skills, intellect, executive functioning, and motor development (American Psychiatric Association, 2013; Miller et al., 2013; World Health Organization [WHO], 2001, 2020). The prevalence of NDD/D across developed countries in children and youth 18 years of age and younger ranges from 8% to 15% (Arim et al., 2017; Boyle et al., 2011; Olusanya et al., 2018). Many different conditions and functional limitations are included within the scope of NDD/D, including autism spectrum disorders (ASD), attention deficit/hyperactivity disorder (ADHD), Down syndrome, and intellectual disabilities (ID). In particular, ASD has garnered much attention worldwide due to its high prevalence and associated socioeconomic and familial costs (Reichow et al., 2018).

ASD is a spectrum of diagnosable neurodevelopmental disorders that include pervasive developmental disorders (PDD), Asperger’s syndrome (AS), and autism. ASD typically presents during the developmental period and includes social communication and interaction difficulties, along with restricted and repetitive behaviors, interests, or activities (WHO, 2020). The prevalence of these disorders has increased over the past 20 years owing to a combination of factors. The global estimated prevalence in children and youth 18 years of age or younger is 0.62%–0.70% but could be as high as 1%–2% (Elsabbagh et al., 2012; Fombonne, 2009; Idring et al., 2012; Russell et al., 2014). The lifetime cost for families with a member diagnosed with ASD can range from approximately US$1.4 million in the United States and the United Kingdom, when diagnosed without an additional ID, to US$2.4 million in the United States and US$2.2 million in the United Kingdom if diagnosed concurrently with an ID (Buescher et al., 2014). Due to its increasing prevalence, the need for effective, evidence-based interventions for ASD has grown exponentially. Applied behavior analysis (ABA) and the interventions that are developed from its principles are some of the most often cited evidence-based interventions developed for the treatment of those diagnosed with ASD. As such, ASD will be the primary diagnosis of consideration within the current scoping review.

Applied Behavior Analysis

At its core, ABA is the practice of utilizing the psychological principles of learning theory to enact change on the behaviors seen commonly in individuals diagnosed with ASD (Lovaas et al., 1974). Ole Ivar Lovaas produced a method based on the principles of B. F. Skinner’s theory of operant conditioning in the 1970s to help treat children diagnosed with ASD (or “autism” at the time) with the goal of altering their behaviors to improve their social interactions (Lovaas et al., 1973; Skinner, 1953; Smith & Eikeseth, 2011). To evaluate this method, the University of California at Los Angeles (UCLA) Young Autism Project model was developed and empirically tested by measuring the effects of the intervention when administered one-to-one to children diagnosed with ASD for 40 hr per week over the span of 2–3 years (Lovaas, 1987). The remarkable findings revealed that 47% of the children who participated in this treatment reached normal intellectual and educational functioning compared to only 2% of a control group (Lovaas, 1987).

ABA has evolved over the past 60 years from the core principles established in the early Lovaas model and subsequent UCLA Young Autism Project into many comprehensive treatment models and focused intervention practices, methods, and teaching strategies, all of which aim to address deficits for children and youth with ASD across all levels of functioning, including cognition, language, social skills, problem behavior, and daily living skills (Reichow et al., 2018). One notable and often cited foundational model is “antecedents, behavior, and consequences,” otherwise known as the ABC model, in which manipulating either or both the antecedents and consequences of behavior is intended to increase, decrease, or modify the behavior, thus resulting in a transferrable tool to target behaviors of interest effectively (Bijou et al., 1968; Dyer, 2013). There are also a number of techniques commonly associated with ABA that are worth noting, including reinforcement, extinction, prompting, video modeling, as well as the Picture Exchange Communication System (PECS), though many of these are widely used in other intervention and education settings (Granpeesheh et al., 2009; Sandbank et al., 2020; Stahmer et al., 2005).

Some specific comprehensive ABA-based treatment models that are investigated in this review include early intensive behavioral intervention (EIBI), Early Start Denver Model (ESDM), and Learning Experiences: An Alternative Program for Preschoolers and Their Parents (LEAP). EIBI is an intensive, comprehensive ABA-based treatment model for young children diagnosed with ASD. EIBI targets children under the age of 5 and is often administered 20–40 hr per week for multiple consecutive years (Matson & Smith, 2008; Reichow et al., 2018). It is conducted one-to-one in a structured setting such as in the home or school, and often utilizes the discrete trial training (DTT) method (Cohen et al., 2006; Smith, 2001) in conjunction with other, less structured teaching methods such as natural environment training (Granpeesheh et al., 2009). Because this is a comprehensive treatment model, the target of the intervention is across all aspects of functioning such as independent living skills, social skills, motor skills, pre-academic and academic skills, and language (Granpeesheh et al., 2009). Another comprehensive ABA-based treatment model is ESDM. This model was developed for children with ASD that fall within the age range of 12–60 months. This intervention builds upon the naturalistic teaching methods within ABA to provide a comprehensive, developmental, and relationship-based behavioral intervention targeted at children early in development (Dawson et al., 2010). More recently, some comprehensive ABA treatment models have further shifted away from intensive, operant conditioning based one-to-one models into more naturalistic and generalizable programming. LEAP is one such model for children with ASD because it takes place in public school settings (Strain & Bovey, 2011). 
LEAP was developed from fundamental principles of ABA and includes a variety of methods commonly used in ABA, such as Pivotal Response Training (PRT), time delay, and incidental teaching, in addition to utilizing peer-mediated interventions and the PECS (Strain & Bovey, 2011). Notably, LEAP strongly emphasizes parental and peer involvement in teaching behavioral strategies and relies on naturally occurring, incidental teaching arrangements, in contrast to the directional, adult-driven instruction used in most other segregated ABA intervention strategies (Hoyson et al., 1984; Strain & Bovey, 2011).

Within these comprehensive treatment models, focused intervention practices that are often utilized and independently investigated can include, but are not limited to, DTT and naturalistic teaching strategies such as PRT and functional communication training (FCT). DTT is one of the most fundamental focused intervention practices of ABA and utilizes sequences of instruction and repetition in a distraction free, one-to-one setting (Smith, 2001). The primary focus of DTT is to teach children new behaviors and discriminations. These new behaviors encompass any behavior that was not previously performed by the child knowingly or unknowingly (Smith, 2001). Naturalistic teaching forms of ABA have sought to improve the ability to generalize and maintain the positive effects of behavioral interventions while upholding many of the fundamental principles and behaviorism of ABA (Schreibman et al., 2015). One such method of naturalistic teaching is through the focused intervention practice of PRT, developed by Koegel and Koegel (2006), which is focused on improving the self-initiative and motivation of a child to communicate effectively in common real-life settings (Mohammadzaheri et al., 2015). Of note, most of these treatments can involve a professional, though many of the more recent studies and iterations of these treatments seek to involve peers, siblings and family members to encourage generalization to real-world settings and people in the child’s personal life (Mohammadzaheri et al., 2015; Steiner et al., 2012). Another focused intervention practice and naturalistic teaching method is FCT, a differential reinforcement-based procedure developed by Carr and Durand (1985) that reduces problem behaviors by replacing them with more appropriate communicative responses. This training is commonly used in conjunction with other ABA methods.

Given the history and range of interventions, there is a degree of variability and confusion in the definition of ABA as a system. Definitions range from rigid protocols for some ABA-based programs to collections of specific techniques associated with ABA, to ABA as a system to evaluate practices rather than as an intervention itself. Granpeesheh et al. (2009) define ABA as “the application of principles of learning and motivation to the solution of problems of social significance” (p. 163). This definition of ABA as a research strategy echoes that of Baer et al. (1968) through the later 20th century, in particular in terms of behavior study being: (1) applied, (2) behavioral, (3) analytic, (4) technological, (5) conceptually systematic, (6) effective, and (7) capable of generalized outcomes. Agency definitions tend to define it as a therapy, likewise noted by Schreibman et al. (2015), with different approaches listed as types. For instance, the Centers for Disease Control and Prevention (CDC) defines ABA as a treatment approach, with examples such as DTT, EIBI, ESDM, PRT, and verbal behavior intervention (VBI; CDC & National Center on Birth Defects & Developmental Disabilities, 2019). The National Institute of Child Health and Human Development (NIH) lists positive behavioral support (PBS), PRT, EIBI, and DTT as types of ABA (Eunice Kennedy Shriver National Institute of Child Health & Human Development, 2021). The Autism Society (n.d.) follows the same definition as Baer et al., whereas other intervention types such as PRT and extinction are described as ABA procedures or as sharing principles of ABA. Many ABA-derived programs define certain expectations of their practices specifically, such as EIBI setting, intensity, duration, and personnel, although their methods list a variety of techniques deemed ABA-based, such as DTT, precision teaching, and incidental teaching.
As combined approaches become more common, it is becoming more difficult to differentiate interventions considered to be ABA-derived from other non-ABA labeled interventions (Smith, 2012).

All of the research into these methods, programs, and comprehensive models, combined with the continued investigations into the traditional applications of the ABA-based interventions, results in a wealth of research about the impact of ABA on children and youth with ASD, in particular with respect to improvements in cognitive measures, language skills, and adaptive skills (Eldevik et al., 2009; Virués-Ortega, 2010). The ensuing amount of scientific evidence has resulted in ABA being considered a “best practice” and thus endorsed by the governments of Canada and the United States for the treatment of children and youth with ASD (Government of Canada, 2018; U.S. Department of Health & Human Services, 1999).

Rationale for Current Scoping Review

As ABA is a broad intervention which includes many different methods and programs, reviews of the entire scope of the current research are uncommon. To our knowledge, a comprehensive review of the current ABA literature that spans all ABA methods and outcomes for children and youth with ASD, and that includes randomized controlled trials (RCT), clinical controlled trials (CCT), and single-case experimental design (SCED) studies, has not been completed. The current literature consists primarily of systematic reviews and meta-analyses that have investigated the quantifiable and qualitative outcomes of ABA on children with ASD, but few of these studies include SCED, and the results across the reviews inconsistently show significant improvement with ABA interventions.

For example, in a meta-analysis by Virués-Ortega (2010), the effectiveness of ABA was investigated across 22 included studies with respect to as many outcomes as possible, including language development, social functioning, intellectual functioning, and daily living skills, for those diagnosed with ASD. The results of this meta-analysis suggested that ABA interventions that were implemented in early childhood and were long-term and comprehensive in design did result in a positive medium to large effect in the areas of language development (pooled effect size of 1.48 for receptive language, 1.47 for expressive language), intellectual functioning (pooled effect size 1.19), acquisition of daily living skills (pooled effect size 0.62), and social functioning (pooled effect size 0.95), when compared to a control group that did not receive ABA intervention. This mirrors the meta-analysis of 29 articles conducted by Makrygianni et al. (2018), where it was found that ABA programs for children with ASD resulted in moderate to very effective improvements in expressive and receptive language skills, communication skills, nonverbal IQ scores, total adaptive behavior, and socialization, but lesser improvements in daily living skills. In a meta-analysis by Reichow et al. (2018), the changes in autism severity, functional behaviors and skills, intelligence, and communication skills were investigated across five articles that included one RCT and four CCTs for EIBI. After conducting meta-analyses of these studies, it was found that the evidence for EIBI improving adaptive behavior compared to treatment as usual comparison groups was positive but weak (mean difference [MD] = 9.58; 95% confidence interval [CI] 5.57 to 13.60), whereas there was no evidence that EIBI improved autism symptom severity (standardized mean difference [SMD] = −0.34; 95% CI −0.79 to 0.11; Reichow et al., 2018).
Therefore, the current literature appears to indicate inconsistent results with respect to the magnitude of improvements seen as a result of ABA interventions for children and youth with ASD.
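As a reading aid for the two interval estimates quoted above, the following minimal sketch backs out the standard error implied by each 95% CI and checks whether the interval excludes zero. It assumes the intervals are symmetric normal-approximation intervals, which the text does not state, and the function name `summarize_ci` is hypothetical. It is not the procedure used by Reichow et al. (2018).

```python
from statistics import NormalDist


def summarize_ci(estimate, lower, upper, level=0.95):
    """Summarize a reported point estimate and its confidence interval.

    Assumption (not from the source): the interval is symmetric and based
    on a normal approximation, so its half-width equals z_crit * SE.
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # roughly 1.96 for 95%
    se = (upper - lower) / (2 * z_crit)
    return {
        "midpoint": (lower + upper) / 2,   # should sit close to the estimate
        "se": se,                          # implied standard error
        "z": estimate / se,                # implied z statistic
        "excludes_zero": lower > 0 or upper < 0,
    }


# Values as quoted in the text from Reichow et al. (2018)
adaptive = summarize_ci(9.58, 5.57, 13.60)    # MD, adaptive behavior
severity = summarize_ci(-0.34, -0.79, 0.11)   # SMD, autism symptom severity

for label, result in (("adaptive behavior", adaptive), ("symptom severity", severity)):
    print(label, {k: round(v, 3) if isinstance(v, float) else v for k, v in result.items()})
```

Under these assumptions, the adaptive behavior interval excludes zero and the symptom severity interval does not, which matches the way the two findings are described.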

With respect to the wealth of SCEDs included throughout the ABA literature, Wong et al. (2013) have noted that existing reviews rarely capture these types of studies, with two notable exceptions conducted by the National Autism Center (2009) and the National Professional Development Center on ASD (NPDC; Odom et al., 2010). These reports still had some key exclusions: the National Autism Center report excluded articles that (1) did not have statistical analyses, (2) did not include linear graphical presentation of the data for SCEDs, or (3) used qualitative methods, whereas the NPDC report searched for studies on behavioral strategies that fulfilled the requirements of being an evidence-based practice, as defined by the authors (National Autism Center, 2009, 2015; Odom et al., 2010). Neither of these reports evaluated the entire scope of the available ABA research with respect to children and youth with ASD, potentially missing the value of the studies that were excluded.

The purpose of the current review therefore is to evaluate the available literature on ABA as an intervention approach in the treatment of ASD in children and youth in an effort to help instruct the scientific community on the most beneficial directions for future research. Moreover, as ABA is commonly recognized at a governmental level as evidence-based, a review of the current ABA literature will help inform other existing and emerging therapies and interventions, researchers, policy makers, and the public of the standard to which established, evidence-based interventions are held. This is accomplished by collecting, compiling, and discussing the available data on the most common outcomes and methods. This includes the most common journals of publication, population metrics, and the transferability of this prominent therapy approach to the real world. As such, the objectives of this scoping review are to examine the extent, range, and nature of research activities regarding the impact of ABA on children and youth with ASD and to identify any gaps in the existing literature regarding ABA outcomes and research designs.

Methods

A scoping review study design was selected for the current investigation. According to Colquhoun et al. (2014), “a scoping review is a form of knowledge synthesis that addresses an exploratory research question aimed at mapping key concepts, types of evidence, and gaps in research related to a defined area or field by systematically searching, selecting, and synthesizing existing knowledge” (p. 1293). Scoping reviews differ from systematic reviews in that they provide an overview of existing evidence regardless of the quality (Tricco et al., 2016), and may not formally assess study rigor (Arksey & O’Malley, 2005).

The current scoping review was conducted to gather an understanding of the scope of available research regarding the use of ABA as an intervention for children and youth living with NDD/D, and in particular ASD. For the purposes of the current review, ABA will be defined as an intervention informed and developed from behavioral analytic approaches for the treatment of children and youth with ASD. The effect of ABA is defined as the measurable changes in a participant's various outcomes as a result of receiving ABA intervention. These outcomes were not predefined, so as to avoid missing any possible impact. The review comprised a database search, as well as a reference search of selected reviews. A second phase of the literature search was conducted to update the sample to reflect more recent literature. A guiding document by Tricco et al. (2016) was used for direction and as a reference for conducting this review.

Search Strategy

An initial search was conducted across PubMed, MEDLINE (EBSCOHost), Cumulative Index to Nursing and Allied Health Literature (CINAHL), PsycINFO, Educational Resources Information Center (ERIC), Cochrane Central Register of Controlled Trials (CENTRAL), and Cochrane Database of Systematic Reviews (CDSR) utilizing medical subject heading (MeSH) search terms and limitations to describe the relevant population in the initial search (children and youth with NDD/D) and intervention (ABA) (see Appendix 1 for a full list of search terms for each database). Additional limitations of the search were English language publications, subject age range of 0–18 years, and publication date range. The search was conducted in two phases: January 1, 1997 through December 31, 2017, and January 1, 2018 through December 31, 2020.

Several reviews were selected for a further text search. Data were not extracted directly from eligible reviews. Instead, their selected articles were screened and added to the sample if they were not already included in the initial search. This process was repeated for any secondary reviews that were identified. These additions were exempt from the publication date limitation, resulting in the inclusion of a number of studies outside of the initial search date range. Review and meta-analysis results were not coded.

Selection Criteria

A PICO (population, intervention, comparison, outcome) framework was used to guide the selection of articles. Population and intervention were used as eligibility criteria. Although the intervention was restricted to ABA, the population was originally defined broadly as NDD/D in an effort to capture as much of the applicable literature as possible, and later revised to focus on ASD and mixed diagnoses (ASD and other). This included populations where some subjects had other non-ASD diagnoses, such as ADHD, Down syndrome, or ID, whether they co-occurred with ASD within subjects or presented across subjects. Non-ASD diagnoses observed in the mixed-diagnoses category of the current review are described in the results (“Results: Description of Included Studies”) and in Appendix 2. Outcome was not considered because one objective of the current scoping review was to identify the measured outcomes. Comparison was not used so as not to limit the scope of the review. Study design was not limited in the initial search.

Inclusion criteria for article selection during the initial search comprised (1) English language articles that are (2) about the effects of ABA on (3) children and youth (birth to 18 years) with NDD/D, within (4) the timeframe of January 1, 1997 through December 31, 2020. As described above, screened articles included from selected reviews and secondary reviews were exempt from the date range limitations.

Exclusion criteria comprised (1) hospital-based (inpatient) settings and mixed-setting studies (i.e., those including some inpatient subjects); (2) use of qualitative research methods; (3) publications that are not “research-based” (e.g., newsletters, books); (4) populations exceeding 18 years of age; and (5) combined interventions if not looking specifically at the effectiveness of ABA intervention. In cases of mixed age (i.e., including subjects over 18 years of age) or mixed population (i.e., including typically developing subjects), studies were excluded if it was not possible to extract results for the target population separately. Inpatient settings were excluded because the focus of the current scoping review was on community offerings, not hospital services. A small number of studies were excluded when the methods did not align with typical ABA outcome measures, such as those training response hierarchies or attempting to condition new reinforcers. A library search was conducted for studies that could not be accessed in full online, and any that could not be found were subsequently excluded.

When the diagnostic criteria were narrowed to focus primarily on ASD, articles that contained only non-ASD diagnoses were excluded.

Screen Process and Study Selection

Articles from the original search of online databases were exported to Mendeley® Desktop versions 1.19–2.62.0, a reference management software, where most duplicate studies were automatically identified and removed. Any remaining duplicates from both the database and review search were removed manually. Titles and abstracts of all retrieved articles were then independently reviewed by two researchers following the outlined inclusion and exclusion criteria. Studies were included if the independent reviewers reached agreement, or after further discussion with a third reviewer. Retained articles then underwent full text review for inclusion, following the same steps.
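The two-reviewer screening rule described above can be sketched as a small decision function. This is an illustration only: the function name `screening_decision` is hypothetical, and modeling the discussion with a third reviewer as a single deciding vote is a simplifying assumption rather than a description of the actual procedure.

```python
from typing import Optional


def screening_decision(reviewer_a: bool, reviewer_b: bool,
                       reviewer_c: Optional[bool] = None) -> bool:
    """Return True if a record is retained at a screening stage.

    reviewer_a / reviewer_b: independent include (True) or exclude (False)
    decisions. reviewer_c: outcome of the discussion with a third reviewer,
    consulted only when the first two disagree (assumed to be decisive).
    """
    if reviewer_a == reviewer_b:
        # Agreement between the two independent reviewers settles the record.
        return reviewer_a
    if reviewer_c is None:
        raise ValueError("reviewers disagree: a third reviewer's decision is needed")
    return reviewer_c


# The same rule would apply at title/abstract screening and at full-text review.
print(screening_decision(True, True))          # both include
print(screening_decision(True, False, False))  # disagreement resolved by third reviewer
```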

Data Extraction

Articles included following the full text review then underwent data extraction. Extracted data comprised first author, title, year of publication, origin of study, funding sources, study aim, study design, duration of intervention, duration of study, population size, population description, setting, measurement outcomes, measurement tools, and key findings. In cases where results were reported individually for each subject, they were extracted as such. In larger scale studies where only group results were reported, group results were extracted, so long as the group included only the target population.

Data Coding and Synthesis

Coding

In general, the entire sample of records included for coding and synthesis was subdivided into three sections concerned with: (1) general ABA Impact, (2) Comparisons of ABA Techniques, and (3) Between-Groups Comparisons of ABA to control or other interventions. These divisions are visually summarized in Figure 1 and are described below. All records underwent general data coding of basic study information, as well as specific outcome coding, also described below. (Details about coding definitions can be found in Appendix 2.) Simplified extraction tables for these three subdivisions are available in Appendix 3 (Tables S1, S2, and S3).

Figure 1. Flowchart Describing the Process of the Current Scoping Review Search, Screening, Data Extraction, and Coding. Note. From an initial search comprising 2,948 records, after screening studies and subdividing multipart studies, a total of 770 study records remained. These were coded in three categories: Comparisons of ABA Techniques, ABA Impact, and Between-Groups Comparisons. Designed with reference to Tricco et al. (2016) and created using diagrams.net™/draw.io® from JGraph Ltd. Note that three study records were included in both the ABA Impact section and the Comparisons of ABA Techniques section (Mello et al., 2018; Rad et al., 2019; Vietze & Lax, 2020), and three study records were included in all three coding sections (Dugan, 2006; Kalgotra et al., 2019; Kovshoff et al., 2011).

During the process of coding, articles containing multiple concurrent or consecutive studies were separated into discrete rows, and will hereafter be treated as self-contained studies in this review. In all figures and further text, all coded rows are referred to as “study records.” Once separated, researchers identified and excluded (1) functional analyses or studies focused on their use, (2) preference assessments or studies focused on their use, and (3) predictive studies. Study records were coded independently by two researchers and then discussed to obtain agreement, or referred to a third researcher to obtain agreement. During coding, any further study records found to satisfy the exclusion criteria were excluded.

Items selected for general data coding included publication details, population metrics, and several specific study methods. The population metrics were age, sex, and diagnosis of participants. (Detail on the population coding values can be found in Appendix 2). Study records were additionally coded and compared by two independent researchers to identify inclusion of the following methods: (1) follow-up or maintenance, (2) mastery or criterion measures, (3) generalization. Studies including comparison groups were further coded by one researcher to identify the presence of (1) a control group (typically consisting of “eclectic” or treatment as usual), (2) comparisons to other non-ABA intervention/s, or (3) a mix of these.

After general data coding, the sample was separated into two groups for outcome coding: ABA Impact and Comparisons of ABA Techniques. The majority of study records fell into the ABA Impact section, in which study records measured the change in outcomes (e.g., amount improved) as a result of exposure to ABA intervention. In contrast, study records that were primarily concerned with comparing multiple techniques or intensities of ABA were reserved for the Comparisons of ABA Techniques section, because general ABA impact could not easily be determined for the entire study population in these studies. Finally, a select number of study records from the ABA Impact section where ABA interventions were also compared to a control or different intervention were coded a second time to describe these comparisons in the Between-Groups Comparisons section. As noted in Fig. 1, some studies from the ABA Impact section also fell into the Comparisons of ABA Techniques section, or into all three sections.

Although the search was not restricted, the observed outcome measures were classified into eight categories: cognitive, language, social/communication, problem behavior, adaptive behavior, emotional, autism symptoms, and quality of life (QoL) outcomes. At first, QoL was included to help describe the generalizability and real-life utility of ABA interventions, following the example of Reichow et al. (2018). However, as no instances of subject QoL measures occurred in this search, this outcome is not included in the subsequent synthesis. Within each category, outcomes were generally classified as improvement, regression, mix, or no change, as can be seen in the extraction tables (Tables S1, S2, and S3 in Appendix 3).

When more than two variables or interventions were compared, which sometimes occurred in the Comparisons of ABA Techniques and Between-Groups Comparison sections, study records were discussed and split into discrete rows by two researchers to represent simplified or single-variable comparisons in each row. These are termed “comparison records” for the purpose of coding and synthesis. As seen in Tables S2 and S3 in Appendix 3, further detail was extracted regarding the category of techniques or interventions compared and the relative effectiveness of each.

Prior to coding, researchers categorized outcome measures, measurement scales or strategies, and intervention categories observed during the extraction process into tables in an effort to mitigate potential inconsistencies in coding. For example, in the Comparisons of ABA Techniques section, categories were broadly defined as Teaching, Stimulus Characteristics, Reinforcement, Subject/Setting Characteristics, and Comparisons of ABA Interventions. Further descriptions of these and other categories can be found in Appendix 2.

Further details on general data coding, as well as outcome coding for ABA Impact, Comparisons of ABA Techniques, and Between-Groups Comparisons can be found in Appendix 2. Extractions for all three sections can be found in Tables S1, S2, and S3, respectively, in Appendix 3.

Synthesis

All statistical analyses, compilations, and tabulations were completed using Microsoft® Excel® versions 1805–2111. Descriptive analyses (means, medians, etc.) were calculated using native Excel® functions. Pivot tables were utilized to tabulate frequencies. Figures were generated using Microsoft® Excel® version 2016 MSO, Microsoft® Word® versions 2011–2111, and diagrams.net™/draw.io® by JGraph Ltd.
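
The pivot-table tabulation described above can be illustrated with a minimal sketch. The review used Excel®, not Python, and the rows below are hypothetical placeholders rather than data from the extraction tables; the names `coded_rows`, `tabulate`, and `percentages` are assumptions introduced only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical coded rows: (outcome category, result code).
# Illustrative only; these are NOT data from the review.
coded_rows = [
    ("language", "improvement"),
    ("language", "mix"),
    ("language", "improvement"),
    ("cognitive", "improvement"),
    ("cognitive", "no change"),
    ("problem behavior", "improvement"),
]


def tabulate(rows):
    """Count result codes within each outcome category (pivot-table style)."""
    table = defaultdict(Counter)
    for outcome, result in rows:
        table[outcome][result] += 1
    return table


def percentages(counter):
    """Convert one category's counts to whole-number percentages."""
    total = sum(counter.values())
    return {code: round(100 * n / total) for code, n in counter.items()}


table = tabulate(coded_rows)
language_pct = percentages(table["language"])
print(language_pct)
```

Each outcome category is tabulated against its own denominator, which is why percentages reported per category need not share a common total.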

Some qualitative characteristics were also explored, such as the types of methods used in the interventions encountered, the degree of mastery and generalization measures, and how targeted the interventions and measurement tools were.

Results

Identified Studies

As shown in Fig. 1, the record selection process differed slightly between the two searches spanning 1997–2017 and 2018–2020. This is because the diagnostic criteria for the current manuscript were updated to exclude populations that only contained non-ASD diagnoses, and the removal of records satisfying the new criteria took place at different points for each search.

The database searches yielded a total of 2,074 entries after import to Mendeley®, and 874 entries from selected reviews and secondary reviews. Ten systematic reviews were identified and investigated for the literature search (Brunner & Seung, 2009; Dawson & Bernier, 2013; Makrygianni et al., 2018; Mohammadzaheri et al., 2015; Reichow et al., 2014, 2018; Rodgers et al., 2020; Shabani & Lam, 2013; Spreckley & Boyd, 2009; Virués-Ortega, 2010). After pulling references from the first five (Brunner & Seung, 2009; Dawson & Bernier, 2013; Makrygianni et al., 2018; Rodgers et al., 2020; Shabani & Lam, 2013), it was found that the references in the remaining five reviews were duplicates of previously identified references. Secondary reviews from Seida et al. (2009) and Dawson and Burner (2011), both cited by Dawson and Bernier (2013), were also investigated for references (Bassett et al., 2000; Bellini & Akullian, 2007; Delano, 2007; Diggle et al., 2002; Horner et al., 2002; Hwang & Hughes, 2000; Lee et al., 2007; McConachie & Diggle, 2007; Odom et al., 2003; Reichow & Volkmar, 2010; Smith, 1999). Records from Brunner and Seung (2009) that were categorized into treatment models that did not fulfill the definition of ABA as per the current review were not considered. In addition, the secondary review by Vismara and Rogers (2010) was not considered because it was a narrative review. After removing duplicates or entries already existing in the database search, 1,577 entries remained from the database search and 525 from reviews, for a total of 2,102 records.

A total of 1,337 records were removed during title, abstract, and full-text screening because they met the exclusion criteria, were duplicate records, were reviews, or contained only non-ASD diagnoses. Multipart studies were separated into discrete records, yielding a total of 849 study records. A further 34 were excluded at this stage as they were preference assessments, functional analyses, or were concerned with training response hierarchies or conditioning reinforcers, leaving 815 study records. When the diagnostic inclusion criteria were revised, any remaining records containing only non-ASD diagnoses were excluded.

Thus, the total sample included in the quantitative and qualitative synthesis comprised 770 study records. This entire sample was analyzed for general data metrics (see Fig. 1). References for the 709 included articles can be found in Appendix 4.
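
The record counts reported in this section can be checked with simple arithmetic. The sketch below uses only figures stated in the text; the two values marked as derived are implied by subtraction and are not stated in the source. Because the selection process differed between the two searches, the derived values are arithmetic implications only, not reported stage totals. All variable names are hypothetical.

```python
# Counts as reported in the text.
from_databases = 1577            # entries remaining from the database search
from_reviews = 525               # entries remaining from reviews
removed_in_screening = 1337      # removed at title, abstract, full-text screening
study_records_after_split = 849  # after separating multipart studies
excluded_assessment_type = 34    # preference assessments, functional analyses, etc.
final_sample = 770               # study records in the synthesis

pooled = from_databases + from_reviews
assert pooled == 2102  # matches the reported total of 2,102 records

# Derived (not stated in the source): records left after screening.
retained_after_screening = pooled - removed_in_screening

remaining = study_records_after_split - excluded_assessment_type
assert remaining == 815  # matches the reported 815 study records

# Derived (not stated in the source): records removed when the diagnostic
# inclusion criteria were revised.
removed_at_revision = remaining - final_sample

print(retained_after_screening, removed_at_revision)
```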

Description of Included Studies

Overall, agreement between raters was approximately 80% across all coding categories. The range of included outcome categories was selected in order not to limit the scope of the literature search and synthesis for this review so that a comprehensive review of the application of ABA for ASD and mixed-diagnosis populations across the entire time span and age range of the search could be conducted. Frequently occurring other diagnoses in the mixed-diagnoses category included ADHD; ID; global developmental delay (GDD) or other developmental delays; oppositional defiant disorder (ODD); Down syndrome; cerebral palsy (CP); fetal alcohol spectrum disorders (FASD); Angelman syndrome; Fragile X; obsessive-compulsive disorder (OCD); Tourette syndrome; traumatic brain injury (TBI); epilepsy or seizure disorders; sensory integration or processing disorders; speech/language delays; learning disabilities; and behavior, emotional, or mood disorders.

The most frequently occurring publication year was 2020. The earliest publication reviewed was from 1977 and the most recent from 2020. Thirty percent of publications were from 2000–2009 and 61% were from 2010–2020; the remaining years accounted for 9%.

The 5-year impact factor (IF) characteristics were determined by removing duplicate journals prior to calculation. IFs were accessed from Journal Citation Reports, via Clarivate™. The unique median IF was 2.56. The lowest impact journal had an IF of 0.71 and the highest had an IF of 9.92. Most of the reviewed study records were from the Journal of Applied Behavior Analysis (55%). The next most frequent journal was the Journal of Autism and Developmental Disorders, representing 4% of the journal cohort. Dissertations accounted for 4% of the cohort. Analysis of Verbal Behavior and Behavioral Interventions each made up 3% of our journal cohort, and the remaining journals contributed 1%–2% each. Journals contributing less than 1% were grouped as “Other,” making up 16% of the total cohort. Within the cohort of study records, 48% of records had participants that were solely male, 45% were of mixed sex, and 4% of the publications had solely female participants. Seventy-six percent of study records had participants with only ASD, and 24% had participants in the mixed-diagnoses category.

In the study records reviewed, 33% had one or two participants, 31% had three participants, and 13% had four. Study records with 5 to 9 participants accounted for 11% of the total, and 13% had more than 10 participants. The median number of participants was 3, whereas the mean was 8.12.
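
The gap between the median (3) and the mean (8.12) is what one would expect when most study records are very small and a few are much larger. The sketch below shows the effect with a hypothetical list of sample sizes; the values are invented for illustration and are not drawn from the review.

```python
import statistics

# Hypothetical participant counts for ten study records (NOT review data):
# nine small single-case studies and one larger group study.
sample_sizes = [1, 2, 2, 3, 3, 3, 4, 4, 6, 53]

median_n = statistics.median(sample_sizes)
mean_n = statistics.mean(sample_sizes)

# A single large study pulls the mean well above the median,
# while the median still reflects the typical small study.
print(median_n, mean_n)
```

For a distribution of this shape, the median is the better description of a typical study record.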

Overall, it was found that study records that included a smaller sample size (e.g., N ≤ 4) often investigated specific skills, tasks, or responses that varied based on each specific child (Gongola, 2009; Plavnick & Ferreri, 2011; Sullivan et al., 2020). Many studies modified the intervention or the definition of mastery dependent on the child or task given (Charlop-Christy & Daneshvar, 2003; Charlop et al., 1985; Ezzeddine et al., 2020; Lyons et al., 2007; Romaniuk et al., 2002).

Within the cohort of study records, 41% had some follow-up measure, 40% had some criterion or mastery measure, and 31% of publications had some generalization measure.

Study Outcomes and Findings

ABA Impact

After the general data coding stage, any study records from the total sample (N = 770) looking only at ABA Impact were coded for outcomes (N = 551), i.e., improvement, regression, mix, or no change in the eight outlined outcome categories. Any study records comparing different ABA techniques (N = 225) were designated for the next section (see “Comparisons of ABA Techniques,” below). The eight outcomes considered were cognitive, language, social/communication, problem behavior, adaptive behavior, emotional, autism symptoms, and QoL outcomes. Subject QoL is not reported in any tables, as there were no instances of this outcome being measured in the current cohort of study records.

The majority of study records reported improvement across all outcome categories, with 63%–88% of study records reporting improvement across the various outcome measures. In contrast, 0%–2% reported regression, 13%–36% reported mixed results, and 0%–13% reported no change (Fig. 2).

Distribution of Improved, Regressed, Mixed, and Unchanged Results in the ABA Impact Section across the Measured Outcomes (N = 551 study records)

When observing outcome measures by age group (see Appendix 5, Table S4), among study records conducted with participants between ages 0–5 years, cognitive, language, and social/communication were the most commonly studied outcomes, at 22%, 23%, and 23% respectively. Of these, 66%, 68%, and 57% reported an improvement, respectively. Meanwhile, for ages 6–12, problem behavior and language were the most commonly studied outcomes at 25% each. Among these respective outcomes, 86% and 71% reported improvement. For ages 13–18, the most commonly studied outcome was cognitive (26%), followed by adaptive behavior (20%). Of these, 83% and 86% reported improvement, respectively. Finally, in the mixed-age groups, the most commonly studied outcome was language (28%), followed by social/communication (20%) and cognitive (20%). Of these three most studied outcomes, improvement was reported at 61%, 65%, and 62%, respectively. Detailed findings are available in Table S4 of Appendix 5.

Outcome measures were also divided by sex. Among the study records that only observed females, the most commonly studied outcome was problem behavior at 33%, with social/communication following at 23%. Improvement was recorded 85% and 67% of the time, respectively, for these outcomes. Among records looking at only males, language was the most studied outcome at 26%, followed by cognitive and social/communication at 21% each. These improved at 62%, 66%, and 59%, respectively. Among publications with mixed sexes, the most studied outcome measures were language (25%), cognitive (22%), and social/communication (21%). Of these, 65%, 71%, and 67% showed improvement, respectively.

Outcome measures were then divided by diagnosis (Tables S5 and S6). Among study records solely studying ASD, the most commonly studied outcomes were language, cognitive, and social/communication, making up 25%, 22%, and 22% respectively. Among these respective outcome measures, 68%, 68%, and 63% reported improvement. In the mixed-diagnoses category, the most studied outcomes were problem behavior (31%) and language (22%), with 70% and 58% reporting improvements, respectively. Detailed findings are available in Tables S5 and S6 in Appendix 5.

Next, secondary measures were classified. These included the presence of follow-up, whether interventions assessed mastery or criterion, and whether interventions assessed generalization. Out of the ABA Impact cohort, 41% had some follow-up, 40% had some measure of mastery/criterion, and 31% had some measure of generalization.

Among study records that showed improvement within the various outcome measures, use of follow-up measures varied. Records reporting improvements in cognitive, language, social/communication, and problem behavior outcomes had follow-up measures 47%–59% of the time. Records reporting improvement in adaptive behavior and emotional outcomes had follow-up measures 67% and 64% of the time, respectively. Studies reporting improvement in autism symptoms had follow-up measures 100% of the time (see Appendix 5, Table S7).

The frequency of mastery/criterion measures also varied among study records reporting some improvement. Mastery/criterion measures were present in 0% and 14% of records reporting improvement in autism symptoms and problem behavior, respectively; in 25% and 29% for adaptive behavior and social/communication, respectively; and in 43%–49% for cognitive, language, and emotional outcomes (Table S7).

With regard to generalization, no study records showing improvements in autism symptoms assessed any measure of generalization. Among the other outcomes, generalization measures were present in 14% of records reporting improvement in emotional outcomes; 24%–29% for problem behavior, adaptive behavior, and cognitive outcomes; and 39% and 46% for language and social/communication outcomes, respectively (Table S7).

Comparisons of ABA Techniques

Many records from the current search investigated the effectiveness of different ABA methods or variables in delivery. This section of study records was further divided into discrete records wherever more than two variables were compared, for a total of 307 comparison records, which were then coded for outcomes. In this case, coding included which category of comparison was studied, and indicated whether one ABA method performed better, or if the results were mixed or had no change.

Five categories of variables were defined: Teaching, Stimulus Characteristics, Reinforcement, Subject/Setting Characteristics, and Comparing ABA Interventions. These are further described in Appendix 2. Within these categories, most comparison records were unique in the methods examined and thus could not be easily compared across this selection of records. That said, some trends were identified. First, many different teaching procedures were compared, such as how instructions were provided, tact versus listener training, or serial versus concurrent training (Arntzen & Almås, 2002; Delfs et al., 2014; Lee & Singer-Dudek, 2012). Several comparison records investigated the quality of the teaching procedures, commonly with respect to the integrity of reinforcement or teaching techniques (Carroll et al., 2013; Odluyurt et al., 2012). Others investigated the differences in personnel delivering the ABA interventions, such as a parent or clinician (Hayward et al., 2009; Lindgren et al., 2016), or differences in program delivery, such as via specific modeling, reinforcing, or prompting techniques (Campanaro et al., 2020; Jessel et al., 2020; Quigley et al., 2018). A number of comparison records compared time characteristics, such as reinforcement schedules or delays (Majdalany et al., 2016; Sy & Vollmer, 2012). Factors related to reinforcement in general were commonly compared and diverse in nature, spanning the quality, preference, presentation, and other aspects of reinforcement (Allison et al., 2012; Carroll et al., 2016; Fisher et al., 2000; Groskreutz et al., 2011). 
A few comparison records examined subject characteristics, such as the effectiveness of an ABA intervention based on the age of participant entry into the program or their diagnosis (Luiselli et al., 2000; Schreck et al., 2000), but slightly more commonly measured was the effectiveness of interventions administered in different settings such as at school, at a clinic, or at home (Hayward et al., 2009; Sallows & Graupner, 2005; Schreck et al., 2000). Some comparison records compared specific ABA intervention techniques, such as PRT, the Lovaas/UCLA model, or response interruption and redirection (RIRD), to one another (Dwiggins, 2009; Fernell et al., 2011; Lydon et al., 2011; Mohammadzaheri et al., 2014; Saini et al., 2015).

Table S8 (located in Appendix 5) displays the Comparisons of ABA Techniques group analysis of various intervention categories compared in the outcome measures. Teaching was the most commonly compared intervention category across six outcome measures, ranging from 38% to 64%, except for emotional (25%) and autism symptoms (10%). Comparing ABA interventions was the most commonly studied comparison in the emotional outcome (50%; 2 out of 4 comparison records), and subject/setting characteristics was the most commonly studied comparison in the autism symptom outcome (70%; 7 out of 10 comparison records). One method did not always clearly outperform the other (Fig. 3). Within the cognitive, language, and social/communication outcomes, 37%–40% of comparison records found that one method exhibited greater improvement than the other, whereas 47%–56% had mixed outcomes. The pattern was similar for adaptive behavior, where 52% found that one method exhibited greater improvement and 39% were mixed. On the other hand, outcome measures for problem behavior and autism symptoms more clearly showed that one method exhibited greater improvement, at 65% and 70% (7 out of 10 records), respectively.