As noted above, CLT revolves around the notion of a working memory that is limited in capacity and duration [1–3]. With cognitive load understood to be the sum of intrinsic and extraneous cognitive load, a sum that should not exceed the narrow limits of working memory if cognitive overload is to be prevented, researchers have been eager to get to grips with its essence and to find ways to measure it. The following sections spotlight these endeavours over the past decades.
Empirical evidence for CLT principles
Empirical findings supporting CLT principles come from four types of measures: (1) indirect measures of cognitive load through task performance accuracy [9–11] or time needed for task performance [12, 13]; (2) dual-task performance measures [14, 15]; (3) bio-measures such as functional magnetic resonance imaging (fMRI) [16] or specific electroencephalographic (EEG) [17, 18] or eye-tracking variables [19]; and (4) subjective rating scales [4, 20, 21]. For a more detailed overview of findings of each of these types of measures, the reader is referred to Leppink, Van Gog, Paas, and Sweller [22]; the current review briefly highlights findings that have had a profound impact on CLT’s evolution.
Indirect measures have provided support for the earlier assertion that less working memory capacity remains if more capacity is taken up by either extraneous or intrinsic cognitive load. First, students were found to demonstrate more accurate task performance [9, 10] and to need less time for task performance [12, 13] if the strategy employed demands less problem-solving search. Especially among novice learners, engaging in problem-solving search contributes to extraneous cognitive load, leaving less room for dealing with intrinsic cognitive load and resulting, in turn, in less accurate task performance or an increase in the time needed for task performance. Conversely, another study found that increased error rates in task performance could point to an elevated intrinsic complexity of information [11]. Excessive intrinsic cognitive load, even with extraneous cognitive load kept to a minimum, resulted in a very high overall cognitive load, and perhaps cognitive overload, with an increase in error rates as a logical consequence.
Further evidence in favour of a limited working memory resources model comes from dual-task studies, in which participants are instructed to simultaneously perform a primary task and a secondary task that is typically unrelated to the primary task. Learners who need to allocate fewer working memory resources to the primary task have more working memory resources available for more accurate or faster performance on the secondary task [14, 15].
Researchers using bio-measures have found, among other things, increased activity in the dorsolateral prefrontal cortex during information processing [16] and, in eye-tracking studies, negative correlations of saccade rate and amplitude with cognitive load [23, 24] and a positive correlation between fixation length and cognitive load [25].
One of the difficulties with particular bio-measures, for instance pupil dilation, is that their relation to cognitive load may vary with age [26]. Further, the approaches employed to date are not always practicable, in the sense that they are heavily task-related and mostly require special equipment and an even more careful study design and planning. In an educational setting this is not always feasible, which provides a logical explanation for their infrequent use so far. Subjective rating scales that measure cognitive load experienced by the learner are much easier to use and are frequently encountered in the literature. The first scale made its appearance in the early 1990s in the form of a 9-point one-dimensional mental effort rating scale [4]. A more inclusive variation of this scale is the NASA task load index (TLX) [27], which seeks to capture six dimensions: mental, physical and temporal demands, own performance, effort, and frustration.
Apart from practical challenges, some conceptual challenges appeared. The two-factor framework relied on thus far, that is, the division of cognitive load into intrinsic and extraneous cognitive load, did not appear to hold when considering that in some cases an increase in cognitive load could bolster learning. It therefore appeared plausible that a third type of cognitive load was involved, one that was in some way beneficial to learning. This concept came to be known as germane cognitive load [28]. Incorporating germane cognitive load into the framework, however, did not solve the riddle. On the contrary, it incited the desire to find a way to measure each type of cognitive load separately. So far, the existence of distinct types of cognitive load had been largely theoretical; although attempts to measure cognitive load had been plentiful, none of these had sought to measure each type of cognitive load separately.
Attempts to measure the distinct types of cognitive load
At this point, CLT rested on the assumptions that (1) extraneous cognitive load should be kept to a minimum; and (2) germane cognitive load could arise only if intrinsic cognitive load had reached a specific level [28]. In view of these beliefs, the question arose as to how a third type of cognitive load could be assimilated into a limited working memory and, more importantly, how each of the specified cognitive loads could be quantified.
This unresolved issue sparked several efforts to devise an instrument with which the various types of cognitive load could be measured [29–33]. Unfortunately, these studies used single items instead of multiple items for one or more types of cognitive load. The use of multiple indicators for the separate types of cognitive load might yield a more precise measurement and might enable researchers to separate types of cognitive load more clearly than the use of a single indicator for each. Besides, none of the studies could address the so-called expertise reversal effect [34, 35]. Succinctly put, instructional support in a learning task that is beneficial for novice learners loses its effectiveness or even becomes detrimental as learners become more proficient in that type of task.
Unfortunately, these conceptual and methodological issues were left largely unrecognized and, instead, a return to former principles was deemed imminent [36–38]. Germane cognitive load was reconceptualised as a subtype of intrinsic cognitive load, resulting in a plea for a move back to a two-factor intrinsic/extraneous cognitive load framework. To prevent any return to former principles from being rooted in methodological flaws in previous studies, one final attempt was made to develop a psychometric instrument that might distinguish between the three types of cognitive load [20] as defined in the late 1990s [28]. A set of four coherent studies [20] appeared to provide evidence in favour of the three-factor framework. Although this was the first time since the inception of CLT that such an instrument measuring different types of cognitive load received empirical support, two follow-up studies [21] failed to provide further evidence for the germane cognitive load factor. It was therefore suggested that the three factors in the instrument be interpreted to represent intrinsic cognitive load, extraneous cognitive load, and a subjective judgment of learning [21, 22]. Table 1 presents the eight items of the questionnaire that reflect intrinsic cognitive load (items 1–4) and extraneous cognitive load (items 5–8), respectively; the questionnaire can also be used without items 4 and 8 [20–22].
Table 1
A new psychometric instrument for the measurement of intrinsic cognitive load (i.e., items 1–4) and extraneous cognitive load (i.e., items 5–8)
All of the following eight questions refer to the activity that just finished. Please take your time to read each of the questions carefully and respond to each of the questions on the presented scale from 0 to 10, in which ‘0’ indicates not at all the case and ‘10’ indicates completely the case:
0 1 2 3 4 5 6 7 8 9 10
[1] The content of this activity was very complex
[2] The problem/s covered in this activity was/were very complex
[3] In this activity, very complex terms were mentioned
[4] I invested a very high mental effort in the complexity of this activity
[5] The explanations and instructions in this activity were very unclear
[6] The explanations and instructions in this activity were full of unclear language
[7] The explanations and instructions in this activity were, in terms of learning, very ineffective
[8] I invested a very high mental effort in unclear and ineffective explanations and instructions in this activity
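To make the structure of the instrument concrete, the following minimal Python sketch shows one way responses to Table 1 might be summarised per respondent. It is an illustration only: the text does not prescribe a scoring rule, so averaging the items within each subscale is an assumption, and the names `score_questionnaire`, `include_effort_items`, and `example` are hypothetical. The item-to-subscale mapping, the 0 to 10 scale, and the option of omitting items 4 and 8 follow the description above.

```python
from statistics import mean

# Item numbers follow Table 1. Averaging the items of a subscale is an
# assumption made for illustration, not a scoring rule given in the text.
INTRINSIC_ITEMS = (1, 2, 3, 4)
EXTRANEOUS_ITEMS = (5, 6, 7, 8)


def score_questionnaire(ratings, include_effort_items=True):
    """Return the mean intrinsic and extraneous rating for one respondent.

    ratings: dict mapping item number (1-8) to a rating from 0 to 10.
    include_effort_items: if False, items 4 and 8 are left out, mirroring
    the shortened form of the questionnaire.
    """
    dropped = set() if include_effort_items else {4, 8}

    # Reject ratings that fall outside the 0-10 response scale.
    for item, value in ratings.items():
        if not 0 <= value <= 10:
            raise ValueError(f"item {item}: rating {value} is outside 0-10")

    def subscale(items):
        return mean(ratings[i] for i in items if i not in dropped)

    return {
        "intrinsic": subscale(INTRINSIC_ITEMS),
        "extraneous": subscale(EXTRANEOUS_ITEMS),
    }


# Hypothetical responses from a single learner.
example = {1: 8, 2: 7, 3: 6, 4: 7, 5: 2, 6: 1, 7: 3, 8: 2}
print(score_questionnaire(example))
print(score_questionnaire(example, include_effort_items=False))
```

Keeping the two subscales separate, rather than summing them into one total, reflects the purpose of the instrument, which is to distinguish intrinsic from extraneous cognitive load.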
A case for a two-factor intrinsic/extraneous cognitive load framework
To rephrase the gist of CLT, learning in fact revolves around dealing with intrinsic cognitive load [22, 39]. For a given number of information elements in a task, a more proficient learner will experience a lower intrinsic cognitive load than a novice learner, because some of the information elements in the task are already part of the cognitive schema of the more proficient learner, leaving fewer new elements to be processed. Providing learners with a task that comprises more information elements that are not yet part of their schemas will impose a higher intrinsic cognitive load on their minds.
The contention that scaling up intrinsic cognitive load can bolster learning found support in a recent randomized controlled experiment using mixed methods [40]. In this study, Lafleur and colleagues contrasted a typical Objective Structured Clinical Examination (OSCE) with a so-called Hypothesis-Driven Physical Exam (HDPE [41]). Whereas the OSCE concerned a part-task examination, the latter required students to perform a whole-task physical examination. In this latter group, students first made a list of anticipated findings related to several competing diagnoses; they then selected and performed physical examination manoeuvres to elicit findings, interpreted these, and corrected initial manoeuvres, to finally arrive at a motivated diagnosis. Throughout the experiment, extraneous cognitive load was kept to a minimum in both groups. Intrinsic cognitive load, however, was somewhat higher in the whole-task HDPE condition. Notably, this group also showed better performance with regard to backward and forward diagnostic reasoning, indicating that a higher intrinsic cognitive load could actually be beneficial to learning.
Of course, this reasoning does not hold up to the point of cognitive overload and beyond. As we have seen before, excessive intrinsic cognitive load can do more harm than good. More precisely, when the sum of intrinsic and extraneous cognitive load exhausts working memory capacity, it cannot reasonably be expected that any integration of new information elements into existing knowledge will occur. The same holds true for a trivially low intrinsic cognitive load: when tasks are made very easy for learners in the light of the knowledge they already possess, boredom may prevail over learning [42]. By extension, we have seen that the assimilation of an additional germane cognitive load into a limited working memory was not so easy to envisage. A two-factor framework that meets all the conditions previously addressed seems much more plausible, especially if one considers that the concept of germane cognitive load has never really found support in empirical research. The two-factor framework is therefore a logical starting point from which to proceed. How the insights yielded so far can inform medical education design and research is the topic of the remainder of this paper.