Anthony E. Kelly, Office of Educational Technology, US Department of Education, and George Mason University
This post builds on analyses of two use cases by Barbara Means of SRI. One case describes the testing of a multimedia intervention in a college-level course. The other case describes course redesign using longitudinal data analysis.
Drawing on these two cases, I propose some new approaches to design-based research, including increased collaboration through shared design and data repositories. I seek feedback on these nascent analyses, and welcome similar cross-case analyses using these and other use cases.
A Within-Case Principle: In the Koedinger use case, we learn of an initial design of a multimedia intervention informed by Mayer’s design principles (see related readings). The intervention sought to examine the “effects of adding diagrams to instruction on chemical equilibrium.” The first study, which supplemented the existing text-based content with diagrams, found no significant performance differences between the diagram-supplemented and text-only conditions.
The revised diagram design was informed by think-aloud protocol analyses of students’ understanding of chemical interactions. As the authors noted, “This qualitative work found that many students did not fully understand the temporal aspects of chemical change over time. They thought of chemistry equations as machines running from left to right, and did not appreciate the fact that the same variable would have different values before, during, and after the achievement of chemical equilibrium. This new understanding of typical student thinking helped to inform a more targeted application of the multimedia principle.” That is, the revision informed diagram design with cognitive science findings on students’ conceptual understanding of difficult content, especially among low-performing students.
Given the efficiency of the within-course testing seen in this use case, and also in the Kaplan use case, it seems plausible to conduct efficient, simultaneous experiments comparing multiple mental models of equilibrium expressed by different multimedia objects (e.g., using split-plot factorial designs).
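As a thought experiment, a split-plot layout for such simultaneous experiments might look like the sketch below, with intact course sections as whole plots randomized to one factor and students within sections randomized to a second. All section names, factor levels, and group sizes are invented for illustration; nothing here is drawn from the actual use cases.

```python
import random

# Hypothetical split-plot layout for simultaneous within-course experiments:
# intact course sections are the whole plots, randomized to a diagram
# condition; students within each section are the split plots, randomized to
# a practice-format condition. Names and factors are illustrative only.

random.seed(7)

sections = ["sec_A", "sec_B", "sec_C", "sec_D"]
whole_plot_levels = ["static_diagram", "animated_diagram"]
split_plot_levels = ["worked_example", "self_explanation"]

# Whole-plot randomization: shuffle the sections, then assign diagram
# conditions in a balanced way (two sections per condition).
order = sections[:]
random.shuffle(order)
section_condition = {s: whole_plot_levels[i % 2] for i, s in enumerate(order)}

design = []
for section in sections:
    students = [f"{section}_s{i}" for i in range(6)]
    random.shuffle(students)
    half = len(students) // 2
    # Split-plot randomization: half of each section per practice format.
    for student in students[:half]:
        design.append((student, section, section_condition[section], split_plot_levels[0]))
    for student in students[half:]:
        design.append((student, section, section_condition[section], split_plot_levels[1]))

# Every diagram-by-format cell is populated, so both main effects and the
# interaction can be estimated within a single course offering.
cells = {(d, f) for _, _, d, f in design}
print(sorted(cells))
```

The point of the layout is that the diagram factor varies only between sections (respecting how courses are actually organized), while the second factor varies within sections, so a single term's enrollment supports several comparisons at once.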
Distributed Design-Based Research
Co-Informing Use Cases: The Kaplan use case similarly involves manipulating learning variables. Thus, if one or more research groups shared a similar experimental goal (e.g., if Kaplan also asked, What is the impact of multimedia variability in learning?), it might be mutually advantageous to coordinate replications of the intervention in co-informing studies targeted at different learners in different contexts.
Such coordinated replication studies, with carefully mapped hypotheses, shared literature reviews, shared protocols, and shared outcome data, would provide a rich data resource for both individual projects’ analyses and secondary data analysis, including educational data mining. Further, coordinated studies would allow the scaling of results not simply by testing a hypothesis with a larger N to obtain statistical power within a study, but by purposively testing the hypothesis in different studies, in different contexts, and under different constraints—surely a more stringent standard for establishing external validity. Consider studies in dyslexia across different language orthographies and morphologies as an exemplar.
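To make the pooling step concrete, results from coordinated replications could be combined with a random-effects meta-analysis, which weights each context’s result while allowing for between-study heterogeneity. The sketch below uses the standard DerSimonian-Laird estimator; the study labels, effect sizes, and variances are invented for illustration.

```python
# Sketch of pooling coordinated replications with a DerSimonian-Laird
# random-effects meta-analysis. Each (effect size, variance) pair would come
# from one partner study run in a different context; the numbers here are
# invented for illustration.
studies = {
    "community_college": (0.42, 0.021),
    "research_university": (0.31, 0.015),
    "online_program": (0.18, 0.030),
}

effects = [e for e, _ in studies.values()]
variances = [v for _, v in studies.values()]

# Fixed-effect weights and the Q statistic for between-study heterogeneity.
w = [1.0 / v for v in variances]
fixed = sum(wi * ei for wi, ei in zip(w, effects)) / sum(w)
q = sum(wi * (ei - fixed) ** 2 for wi, ei in zip(w, effects))

# DerSimonian-Laird estimate of the between-study variance tau^2.
df = len(effects) - 1
c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
tau2 = max(0.0, (q - df) / c)

# Random-effects weights incorporate tau^2, so no single context dominates.
w_star = [1.0 / (v + tau2) for v in variances]
pooled = sum(wi * ei for wi, ei in zip(w_star, effects)) / sum(w_star)
print(f"pooled effect = {pooled:.3f}, tau^2 = {tau2:.4f}")
```

A nonzero tau² here is informative in its own right: it quantifies how much the intervention’s effect genuinely varies across contexts, which is precisely the external-validity question the coordinated design is meant to probe.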
The timely sharing of intermediate results during the various stages of research could inform hypothesis generation, outcome measure design, and hypothesis testing in partner studies. For example, the finding by Koedinger on chemistry may be valuable for parallel efforts in physics or biology education studies. Such cross-fertilization would mitigate the problem of “siloed” research in each content area. It could also dramatically shorten the time required for one research community to learn from another.
It may make sense, therefore, to establish a Design-Exchange for Scholars. The Design-Exchange could serve as a portal, a social space where researchers or commercial developers could seek advice from other researchers and developers, and share resources and best practices. The goal would be to create, as it were, an “eHarmony” for researchers and developers.
A successful Design-Exchange would create new, shared professional communities and lead to the aggregation of results, the validation of instruments, and a welcome growth in methodological sophistication. It would also amplify the return on investment for funding agencies.
Design-Exchange participants could also contribute to a shared data repository, such as the PSLC DataShop at Koedinger’s LearnLab. As data are added, alerts could be sent to networked primary and secondary data analysts.
With appropriate funding, a Design-Exchange could align data interoperability standards across projects through shareable protocols. Research projects could then extend their reach to the growing research in K-12 projects such as ASSISTments, and also connect to the growing digital student data records in schools (e.g., the Schools Interoperability Framework, or SIF).
As project leaders begin to understand the heuristics or design principles that informed their work (e.g., precede multimedia manipulation with think-aloud protocol analyses), these principles could be added to a Design-Principles Database. Such a database already exists for design-based research (see Yael Kali’s work).
As the Design-Exchange database grows, its protocols could be tied to databases of funders such as the repository of NSF’s Research and Evaluation of Education in Science and Engineering (REESE) program. Thus, evidence-informed principles grounded in individual and co-informed studies could be specified and tied to a growing matrix of research findings searchable by content area, grade level, and so forth.
The findings and principles could be applied to learning objects by academic and commercial developers of learning technologies and by designers of digital textbooks and other digital materials. With appropriate privacy protections, effective design principles in commercial apps and other consumer digital products, all of which are increasingly open to tracking and study, could further co-inform educational research in formal and informal settings.
Further, by emulating the practices of Koedinger and Kaplan, more complex research methods, such as complex A/B randomized testing, could be conducted in other studies (see Roschelle, Chung, and Popović’s discussion). A/B testing could be augmented by interrupted time series designs (e.g., Sloane & Kelly, 2008; Sloane, Helding & Kelly, 2008) or by ideas drawn from single case analyses, as in the What Works Clearinghouse Procedures and Standards Handbook and Shadish and Cook (2009).
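As one illustration of the augmentation, an interrupted time series can be analyzed with segmented regression: a level-shift term and a slope-change term are switched on at the interruption point. The sketch below fits such a model by ordinary least squares on synthetic, noiseless weekly scores, so the known level jump is recovered exactly; the interruption week and all outcome values are invented.

```python
# Minimal interrupted time series sketch: segmented regression with columns
# [intercept, time, level shift after the interruption, post-interruption
# slope change], fit by ordinary least squares via the normal equations.
# The weekly scores are synthetic and noiseless, so the fit is exact.

T0 = 10  # week the intervention (e.g., an A/B variant) is switched on

def design_row(t):
    post = 1.0 if t >= T0 else 0.0
    return [1.0, float(t), post, post * (t - T0)]

# Synthetic outcome: baseline trend 2 + 0.5t, plus a level jump of 3 at T0.
ts = list(range(20))
y = [2.0 + 0.5 * t + (3.0 if t >= T0 else 0.0) for t in ts]
X = [design_row(t) for t in ts]

# Normal equations (X'X) b = X'y, solved by Gaussian elimination.
k = 4
xtx = [[sum(X[i][r] * X[i][c] for i in range(len(X))) for c in range(k)] for r in range(k)]
xty = [sum(X[i][r] * y[i] for i in range(len(X))) for r in range(k)]

for col in range(k):
    pivot = max(range(col, k), key=lambda r: abs(xtx[r][col]))
    xtx[col], xtx[pivot] = xtx[pivot], xtx[col]
    xty[col], xty[pivot] = xty[pivot], xty[col]
    for r in range(col + 1, k):
        f = xtx[r][col] / xtx[col][col]
        for c in range(col, k):
            xtx[r][c] -= f * xtx[col][c]
        xty[r] -= f * xty[col]

b = [0.0] * k
for r in range(k - 1, -1, -1):
    b[r] = (xty[r] - sum(xtx[r][c] * b[c] for c in range(r + 1, k))) / xtx[r][r]

print(f"estimated level shift at the interruption: {b[2]:.2f}")
```

The appeal for course-embedded studies is that the pre-interruption trend serves as each course’s own control: the level-shift and slope-change coefficients, rather than a simple pre/post mean comparison, carry the evidence about the intervention.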
The growing shared digital data resources sketched here could also improve undergraduate research and graduate training. There is a pressing need for analysts who can design and analyze complex data resources in education. Regrettably, a recent review of psychology doctoral programs decries the declining statistical and methodological preparation of doctoral students (see Aiken, West & Millsap, 2008, and a 2008 APA Task Force Report).
The Kaplan use case points to how the number and kind of student outcome measures can be diversified. For example, at a recent National Academies’ Adaptive Educational Technologies Project meeting, Pritchard of MIT drew attention to the return on learning for each dollar of tuition paid by parents as a fair standard for judging university instruction. In fact, the data flow for course design and redesign could be applied to the lamentable picture of students dropping out of the STEM pipeline, as discussed by Stephens and Richey (2011). Ideas for more complex item formats, with features potentially sharable across studies, may be found in the pre-publication edition of WestEd’s Technology and Engineering Literacy Framework.
An additional advantage of the proposed Design-Exchanges is that the findings could be made available in a timely fashion to authors of digital textbooks. Thus, the findings from Koedinger’s use case about better designs for multimedia interventions could be ported to digital textbooks for immediate use. Presumably, digital textbook firms could track the benefit of such changes using A/B testing or another suitable methodology. Further, digital textbook writers could keep up with ongoing changes in science knowledge (e.g., the demotion of Pluto as a planet, and the number of genes in the human genome, as Monica Hesse discussed in a recent Washington Post article).
Lastly, researchers could share test items. As Steve Midgley pointed out, this would allow researchers to learn about the behavior of items (some working well, some poorly), and to describe and post better items to the Design-Exchange. Too often, item writing is done off-line, sometimes by intuition. Further, many items may not survive IRT screening but are otherwise informative (for example, with a more varied sample of learners, or for formative if not summative assessment). The behavior of items “in the wild” of instruction may reveal important item characteristics (exposed by educational data mining, for example) that are not picked up by more standard approaches. For a related discussion, see the NAEP Technology and Engineering Literacy Framework mentioned above.
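A minimal sketch of how shared items might be screened: classical difficulty (proportion correct) and a point-biserial discrimination index, computed over a small, invented response matrix. This is the simplest classical-test-theory pass, not an IRT calibration, but it is the kind of item behavior a shared repository could surface automatically.

```python
import statistics

# Sketch of shared item screening: classical difficulty (proportion correct)
# and point-biserial discrimination, computed from a tiny, invented response
# matrix (rows = learners, columns = items; 1 = correct, 0 = incorrect).
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

n_items = len(responses[0])
totals = [sum(row) for row in responses]

def difficulty(j):
    """Proportion of learners answering item j correctly."""
    return sum(row[j] for row in responses) / len(responses)

def point_biserial(j):
    """Correlation between item j and the rest-of-test score
    (item removed so the index is not inflated by itself)."""
    item = [row[j] for row in responses]
    rest = [t - i for t, i in zip(totals, item)]
    mi, mr = statistics.mean(item), statistics.mean(rest)
    cov = sum((a - mi) * (b - mr) for a, b in zip(item, rest)) / len(item)
    sd = statistics.pstdev(item) * statistics.pstdev(rest)
    return cov / sd if sd else 0.0

for j in range(n_items):
    print(f"item {j}: difficulty={difficulty(j):.2f}, discrimination={point_biserial(j):.2f}")
```

Even this crude screen separates items that are merely hard from items that fail to discriminate, which is the distinction the shared exchange would need before deciding whether an item belongs in a summative pool or a formative one.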
I welcome your comments and additions to these musings.
Aiken, L. S., West, S. G., & Millsap, R. E. (2008). Doctoral training in statistics, measurement, and methodology in psychology: Replication and extension of Aiken, West, Sechrest, and Reno’s (1990) survey of PhD programs in North America. American Psychologist, 63(1), 32–50.
Shadish, W. R., & Cook, T. D. (2009). The renaissance of field experimentation in evaluating interventions. Annual Review of Psychology, 60, 607–629.
Sloane, F., Helding, B., & Kelly, A. E. (2008). Longitudinal analysis and interrupted time series designs: Opportunities for the practice of design research. In A. E. Kelly, R. Lesh, and J. Baek (Eds.), Handbook of design research in education: Innovations in science, technology, mathematics and engineering. New York: Routledge.
Sloane, F., & Kelly, A. E. (2008). Design research and the study of change: Conceptualizing individual growth in designed settings. In A. E. Kelly, R. Lesh, and J. Baek (Eds.), Handbook of design research in education: Innovations in science, technology, mathematics and engineering. New York: Routledge.