Situatedness in educational research
Kai S. Cortina
British Journal of Educational Psychology, 95(S1), S337-S342. Published 2025-07-18. DOI: 10.1111/bjep.70010
https://bpspsychub.onlinelibrary.wiley.com/doi/10.1111/bjep.70010
Citation count: 0
Abstract
In educational psychology, emphasizing the situational context is clearly ‘du jour’, arguably most apparent in the renaming of Eccles' Expectancy-Value Model to ‘Situated Expectancy-Value Theory’ (SEVT), outlined in several papers she coauthored (Eccles & Wigfield, 2020, 2024; Gladstone et al., 2022). According to Eccles and Wigfield (2024), the programmatic shift was necessary to reflect the expansion of the theory from its beginnings as a framework to explain gender differences in students' learning motivation and educational choices to a now full-fledged socio-cognitive developmental theory. As such, the model is explicit about the recursive nature of the underlying processes and acknowledges the idiosyncratic circumstances of each behavioural moment, be it a student's decision about which classes to take or a teacher's decision about the feedback they give each student. While this makes a lot of sense conceptually, the new framing of the model comes with two challenges. One is epistemological in nature: the emphasis on ‘situatedness’ weakens the generalizability of empirical findings to other, even very similar, contexts. The second challenge lies in translating the expanded model into adequate empirical research strategies that reflect the new model complexity or, put more simply: How do we overcome the limitations of questionnaires as the most commonly used tool to collect data in this line of research? It now feels inadequate to pack the ‘situatedness’ into the item stem, for example, ‘When doing your math homework…’ or ‘In general, I love being a science teacher’. This might logically make the response somewhat context-specific, but not situation-specific enough in the sense of SEVT.
Overcoming this limitation is the common theme throughout the six papers which, each in its unique way, are pushing towards a more convincing empirical approach to illustrate and understand the relevance of the situational context and to identify aspects of it that allow us to carefully generalize findings to a similar class of situations. The latter is important as ‘situatedness’ in the SEVT model is not meant to be merely a new label for otherwise unexplained variance in an analysis that uses stable teacher and student characteristics as predictors. Instead, it suggests characterizing the context in order to integrate relevant features into a predictive model.
For example, Stark, Camburn and Kaler (this volume) demonstrate that teacher motivation varies across different but typical work activities. But instead of the ‘classic approach’ of relying on item construction for a cross-sectional study (‘When I teach in the classroom…’, ‘When I interact with colleagues…’, ‘When I grade papers…’, etc.), they use the ‘day reconstruction method’ (DRM) to get not only a more valid measurement of the motivational state of teachers in a given context but also a precise account of how often teachers encounter those qualitatively different, but nevertheless typical, professional situations. It is obvious that a teacher's motivational state during actual teaching is not predictive of their long-term experience of burnout, for example, if this context represents only a small fraction of the professional contexts a teacher navigates on a daily basis. They are able to demonstrate that roughly two thirds of the variance in teacher motivation lies between periods, that is, between distinct situations throughout a workday, that the variance between days (controlling for periods) is negligible, and that roughly a third of the variance resides (stably) between teachers.
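The three-way split of variance described above can be illustrated with a small sketch. It is not the authors' analysis, which presumably rests on a multilevel model; it is only a descriptive sum-of-squares decomposition of nested ratings. The function name `variance_shares`, the teacher and day labels, and all ratings are hypothetical.

```python
from statistics import mean


def variance_shares(data):
    """Split the total sum of squares of nested ratings into three levels.

    `data[teacher][day]` is a list of motivation ratings, one per period.
    Returns the share of the total sum of squares that lies between
    teachers, between days within teachers, and between periods within days.
    This is a descriptive decomposition, not a mixed-model estimate.
    """
    all_values = [x for days in data.values()
                  for periods in days.values() for x in periods]
    grand_mean = mean(all_values)

    ss_teacher = ss_day = ss_period = 0.0
    for days in data.values():
        teacher_values = [x for periods in days.values() for x in periods]
        teacher_mean = mean(teacher_values)
        ss_teacher += len(teacher_values) * (teacher_mean - grand_mean) ** 2
        for periods in days.values():
            day_mean = mean(periods)
            ss_day += len(periods) * (day_mean - teacher_mean) ** 2
            ss_period += sum((x - day_mean) ** 2 for x in periods)

    total = ss_teacher + ss_day + ss_period
    return {
        "teacher": ss_teacher / total,
        "day": ss_day / total,
        "period": ss_period / total,
    }


# Hypothetical ratings: two teachers, two days, three periods per day.
ratings = {
    "T1": {"Mon": [2, 5, 3], "Tue": [3, 5, 2]},
    "T2": {"Mon": [3, 6, 4], "Tue": [4, 6, 3]},
}
shares = variance_shares(ratings)
for level, share in shares.items():
    print(f"{level}: {share:.2f}")
```

In this toy data set the daily means of each teacher are identical, so the day-level share is zero and most of the variation sits between periods, which mirrors the qualitative pattern reported in the paragraph above.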
Wang, Thompson-Lee and Klassen (this volume) combine the emphasis on ‘situatedness’ with the advances made in classroom simulations as a tool for teacher training, which is steadily moving towards the use of virtual reality as a standard tool (see Huang et al., 2023). Wang et al. demonstrate that even in the reduced-complexity setting of a simulation delivered as online training, success in reacting adequately to a set of 15 situations a teacher typically encounters on a daily basis has a consistent impact on student teachers' self-efficacy beliefs and on their assessment of how well they see themselves aligned with the affordances of the job. Unintentionally, exposure to the scenarios tends to have a somewhat sobering effect, since self-efficacy and career intentions trended down on average. However, one could argue that this reflects a more accurate self-assessment by the students of their readiness to be a teacher. They can take this either as a call to intensify their learning efforts or as a critical appraisal of their decision to become a teacher. As long as the 15 scenarios authentically reflect the professional life of a teacher, this study implicitly reflects the situational variability of the profession, and one is invited to speculate how this might impact a teacher's motivation in the long run.
Similar to Stark, Camburn and Kaler (this volume), Bross, Frenzel and Nett (this volume) consider ‘day’ the key temporal unit of observation for a longitudinal study on teacher motivation or, in this case, the emotion regulation of teachers. Emotion regulation is strongly related to teacher motivation, as successful regulation of negative emotions is an important predictor of maintained teacher motivation (Wang et al., 2023). The interesting twist in their study is the use of latent profile analysis, which allows them, in addition to identifying coping patterns for two emotions in different situational settings, to reveal flexibility/consistency of teachers' emotion regulation across situations as a trait-like characteristic. Even if the authors do not discuss this explicitly, their approach introduces an interesting expansion of the SEVT model: While it is true that situations matter for the response of teachers, only some teachers actually vary in their response to negative emotions, while the majority of teachers show very similar emotion regulation patterns. This could be understood as a situation-by-person interaction: Only 17.4% of the teacher sample used different combinations of flexibility across situations. The approach also reveals that the remaining three patterns consist of teachers who differ in their coping profiles but not across situations. This opens the door for further investigation beyond emotion regulation research because it is conceivable that similar ‘meta patterns’, that is, stability of different patterns across situations for some teachers but not for others, exist for other motivational constructs as well.
Moving to the papers that focus on the instructional process, we again see the need to resort to more complex statistical tools if ‘situatedness’ is of particular interest. Oschwald, Moeller, Kracke, Viljaranta and Dietrich (this volume) present probably the most fine-grained analysis of ‘situatedness’ in the context of motivational research to date, analysing the effects of ‘micro-cycles’ of instructional quality on college students' motivation in 9-min intervals (combining three ratings of 3 min). The basic idea was to illustrate that change/variation in instructional clarity (detail, variation, consistency) has an immediate/short-term lag effect on student motivation. While the authors are very circumspect in considering methodological and conceptual shortcomings behind their null findings, I am more inclined to take the findings at face value: Motivational dispositions of students, as conceptualized in the SEVT context, are more inert than the study design implies. If this is true, it is good news for future research in the sense that it is not necessary to choose such a high-resolution (and hence expensive) research design. Most likely, a low-clarity teaching style simply does not dampen college students' motivation immediately and maybe not even from 1 day to the next. However, if a teacher teaches with low clarity consistently over days and weeks, students gradually become frustrated, start to question their own competence, etc.
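To make the design logic concrete, the following sketch shows how 3-min ratings could be averaged into 9-min blocks and how clarity in one block could be paired with motivation in the next block to examine a short-term lag. It is only an illustration under stated assumptions: the helper names `aggregate` and `lagged_pairs` and all ratings are hypothetical, and the authors' actual statistical model is not reproduced here.

```python
def aggregate(ratings, size):
    """Average consecutive, non-overlapping blocks of `size` ratings.

    An incomplete block at the end of the series is dropped.
    """
    return [
        sum(ratings[i:i + size]) / size
        for i in range(0, len(ratings) - size + 1, size)
    ]


def lagged_pairs(predictor, outcome, lag=1):
    """Pair the predictor in block t with the outcome in block t + lag."""
    return list(zip(predictor[:len(predictor) - lag], outcome[lag:]))


# Hypothetical 3-min ratings from one 27-min stretch of a lecture.
clarity_3min = [4, 4, 5, 2, 3, 2, 5, 4, 4]
motivation_3min = [3, 4, 4, 4, 3, 4, 3, 3, 4]

clarity_9min = aggregate(clarity_3min, 3)
motivation_9min = aggregate(motivation_3min, 3)

# Each pair is (clarity in block t, motivation in block t + 1).
for clarity, next_motivation in lagged_pairs(clarity_9min, motivation_9min):
    print(f"clarity={clarity:.2f} -> next motivation={next_motivation:.2f}")
```

Passing a larger `size` to the same `aggregate` helper (for example, all ratings of a day or a week) gives the coarser time units that the argument above favours.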
The idea of zooming out the time frame somewhat is corroborated by the Rubach and von Keyserlink paper (this volume), which used 5 weeks within the semester as the elapsed time to investigate longitudinal trends. The consistency of the student assessment of the quality of the instruction dominated observation specificity when the course was held constant. However, at a given time point, students rated different courses differently, suggesting that their assessment reflected substantial differences in their perception of the different courses. Also important is their finding that roughly 30% of the variance reflects stable differences between students, which adds substantial noise to any statistical analysis that aims at identifying causal impact over time. Accordingly, Rubach and von Keyserlink acknowledge that their study is limited as it is a single-source study, that is, students rated the instructional quality as well as their interest and expectations.
Evidence that the consistency of instructional quality throughout the semester is a limiting factor in demonstrating the ‘situatedness’ of student motivation comes from other research contexts as well, for example, the research on the often replicated ‘thin-slice-effect’ (Ambady & Rosenthal, 1993): Student evaluations at the end of the semester can be extremely well predicted by the assessment of the first 10 min of the first lecture of the semester. While this is often taken as proof of the importance of the first impression, our own (experimental) research suggests that this high correlation is mainly due to the consistency of teacher behaviour throughout the semester (Samudra et al., 2016). The first impression is a good indicator of the teacher's behaviour and teaching quality for the rest of the semester. A final course evaluation may well be a more or less accurate average of the experience throughout the semester and therefore a valid measure of instructional quality. With the caveat that student assessment and student motivation are different constructs, this observation would suggest for the Oschwald et al. study that the authors would find more robust effects if the time unit were not 9-min intervals, but daily or weekly aggregates of instructional quality.
For both the Oschwald et al. and the Rubach and von Keyserlink studies, the measurement of instructional quality becomes a critical issue when we want to avoid artefacts of common-source bias or too short-cycled causal models. Göllner, Lazarides and Stark (this volume) make a foray into new territory by exploring the validity of large language models (LLMs) for assessing teaching quality which, in the future, could eliminate the human factor in coding entirely. If a holistic semantic analysis were able to capture relevant aspects of teaching quality reliably, human coding through expert or student assessment would become obsolete. Quality could even be assessed in real time while the teaching is still happening or shortly thereafter, opening the opportunity to use it as immediate feedback in teacher training. In a more rudimentary fashion, we used the same idea for specific teacher training purposes a decade ago. A voice-recording device (LENA) that distinguished teachers' and students' speaking turns identified in-class discourse segments the teachers were learning to use more frequently in their mathematics classes. Teachers received feedback within 24 h, and for some (not all), it was helpful for improving their teaching (Wang et al., 2014).
Göllner et al.'s cutting-edge exploratory study shows that LLMs have potential in this regard, but we still have a way to go. The semantic representations are ‘sensitive enough’ to reflect variation between segments, lessons and teachers. They were also associated with human-coded quality assessment, but a ballpark 20% of shared variance is not even close to the level at which human–AI interrater reliability would match the human–human reliability achievable after efficient coder training. However, they used a zero-shot GPT model, which means that no additional information was provided to guide the semantic analysis, and the PCA-based dimensionality reduction is indicative of the exploratory nature of the approach, with its inherent difficulty in interpreting the dimensions and its open questions of replicability. Still, the prompted transcript analysis is a first step towards a use of LLMs that is more closely aligned with theoretical concepts and hence a promising step to the next level. After all, the LLM can identify the strength of instructional dialogue best when it can use samples of human-identified examples of dialogue that represent the quality dimension in question (multi-shot GPT). There is no doubt that LLMs will in the near future take over a lot of (if not all) coding tasks of texts and video footage. But what and how the AI codes material will always depend on theoretical considerations about student–teacher and student–student interactions and how they facilitate academic learning. The tool does not come with a guiding theory, and Göllner et al.'s contribution makes that clear.
In her reflections on the situative approach to research in educational psychology, Nolen (2024) points out that the situative view leads to an emphasis on understanding the processes that underlie change. This, in turn, leads to a reflection on what kind of change is to be analysed and what kind of change is considered desirable. Academic learning in educational psychology is, for the most part, conceptualized as a cumulative process, as relatively stable gains over an observed time period, adding to the prior knowledge level. Weeks or months as temporal units of analysis seem appropriate as standard in the learning context of curriculum-based schooling, unless the learning of smaller units is the focus, like learning the content of one particular mathematics lesson.
In contrast, the underlying idea in the Stark, Camburn and Kaler contribution on teacher motivation is that high teacher motivation is desirable and a potential goal for interventions. Alternatively, it can stimulate a teacher's self-directed action to minimize exposure to demotivating situations or to change the quality of the social interactions so as to avoid the demotivating impact. It is not a cumulative, but rather a protective change model. At least implicitly, the self-efficacy belief of student teachers in Wang, Thompson-Lee and Klassen is similarly a variable one would wish to be and remain high, based on the normative assumption that high self-efficacy beliefs are a characteristic of a good teacher. But unlike academic learning, self-efficacy beliefs have a logical ceiling. Therefore, it is not a cumulative change model, but an optimization model. The environment should lead teachers to, and keep them at, a ‘5 out of 5’ level of self-efficacy.
While those two papers have similar underlying change models, the theory of emotion regulation in Bross, Frenzel and Nett is based on a qualitatively different conceptualization of change: homeostasis. For a teacher, anger is arguably a dysfunctional state, and it is desirable to regulate it down quickly and effectively to an emotional set point if the situational trigger cannot be avoided. The logical temporal unit of analysis in this context is probably minutes, if the goal is to investigate the process as such. This, of course, is not the intention of the authors, as their focus lies on the coping patterns of teachers across situations encountered throughout the day. The assumption is, in fact, cumulative in the sense that exposure to a lot of anger-inducing situations paired with a suboptimal coping pattern will wear teachers down in the long run and reduce their professional motivation.
The intention of the Oschwald et al. study was to demonstrate that instructional clarity has an immediate positive impact on college students' learning motivation, again not as a cumulative model but with the normative goal of reaching and maintaining a high level of learning motivation. At least implicitly, the assumption is that a somewhat consistent lack of clarity over a longer period of time, that is, not 9 min but several weeks of low-clarity instruction, will wear a student's learning motivation down. Even if the short-term lag effect could not be shown, the long-term effect might still exist, and it is likely to.
The measurement used in the Rubach and von Keyserlink study is Likert-scale based, which means that it comes with a maximum value even though, at least theoretically, interest is unlimited and could therefore follow a cumulative model. If the quality of the instruction is extremely high every week I am in class, my interest might continuously grow until the end of the term. I might reach the scale's ceiling, but that would be an artefact of the measurement scale.
Why are these considerations important? They identify the epistemological challenge of an overly situation-focused perspective. While it might be relevant in some research contexts to understand features of the situation and not treat it as error variance (Nolen, 2024), we will still need to lift the insights gained from these analyses to a more general level if they are to be of educational relevance. At least for the run-of-the-mill K-12 schooling context, it would be difficult to drop the traditional positivistic rationale when we consider the practical relevance of our research: Once causal mechanisms are identified as tentative truths, they are of practical relevance only if they show long-term impact on academic learning and psychosocial development across a fairly broad class of situational contexts. The more specifically the context is defined in the research, the more limited the practical implications. For example, it might be of psychological interest to demonstrate that a student's academic self-concept dips after 20 instances of unclear instruction. But if the teacher simply was underprepared on that day and otherwise presented the material clearly and accessibly throughout the semester, treating this as a random ‘error’ is probably justified. When long-term development is the main focus of our research (here motivation), the minute-to-minute fluctuations in the clarity of instructions are unlikely to be important. The reason is that, in the back of our heads, we have a model of how motivation affects learning. A student who is more or less stably interested in the content of the class will be more likely to work happily on assignments, etc. in the evening and on weekends. As a general (non-situative) rule, research has shown that unclear instruction has a negative impact on self-concept and interest in the long run, and we assume that this is true for a broad array of situations, student characteristics, grade levels, etc.
This is why it is reasonable that teacher training works with student teachers on instructional clarity as a skill set. If done well, this skill set applies across the variety of situations and contexts a teacher will experience. Even if we recognize that every situation is different and mechanisms are ‘complex’, ‘situatedness’ cannot mean that educational psychology, as an applied science, loses sight of the long-term developmental goals that are the ultimate ‘dependent variables’ of educational processes. It means being more attentive to the conditions of the learning environment that are necessary for the assumed impact of certain independent variables on learning success. Situatedness means acknowledging the contextual embeddedness of teaching, but this becomes a non-trivial paradigm only if the goal is to identify situational characteristics that allow generalizations. The Stark, Camburn and Kaler paper provides a strong example of this idea because their diary method allowed the teachers themselves to identify situations, their similarities over time and how they felt in those contexts. While every meeting with other teachers might be different from the next, as a group such meetings share many features that contrast with other situations in the daily professional routine, for example, actual instruction in the classroom. Reflecting on the situations we navigate on a daily basis seems to be a good starting point to translate ‘situatedness’ into a research paradigm that does not lose sight of the core focus of our discipline: the process of learning in institutional settings.
About the journal:
The British Journal of Educational Psychology publishes original psychological research pertaining to education across all ages and educational levels, including cognition; learning; motivation; literacy; numeracy and language; behaviour; social-emotional development; and developmental difficulties linked to educational psychology or the psychology of education.