谁在购物中心迷路了？错误记忆计数和分类的挑战

IF 2.1 3区心理学 Q2 PSYCHOLOGY, EXPERIMENTAL

Applied Cognitive Psychology Pub Date : 2025-03-13 DOI:10.1002/acp.70044

Gillian Murphy, Ciara M. Greene

{"title":"谁在购物中心迷路了？错误记忆计数和分类的挑战","authors":"Gillian Murphy, Ciara M. Greene","doi":"10.1002/acp.70044","DOIUrl":null,"url":null,"abstract":"What is a memory? Can an outside observer really ascertain whether someone is remembering an event? How can they do so reliably? These are challenging questions that we face as memory researchers, particularly when we try to tease apart true and false memories and beliefs. In this issue, Andrews and Brewin (2024) reanalysed a portion of the data from our recent rich false memory study (Murphy, Dawson, et al. 2023) and developed a novel coding scheme based on counting reported details. They also applied further, more stringent criteria to classify the false memories reported in our study and concluded that this method yields a different false memory rate from the scheme used in our original paper, and that this rate is different again from participants' own self-reported memories. These findings do not surprise us—in our experience, different coding schemes will always yield different rates—but we do disagree with both the methods used and the conclusions that Andrews and Brewin drew from these findings.To provide some context, we first offer a brief overview of the replication study. This was conducted by a team of students as a collaborative project (Murphy and Greene 2023) and closely adhered to the methods of the classic Lost in the Mall study (Loftus and Pickrell 1995). Participants signed up for a study about how we remember our childhoods and their informant (usually their mother) completed an online survey telling us about some true childhood events as well as some information about shopping trips when the participant was a child. We then sent participants a survey in which they were shown three true memory descriptions (taken from their informant's account) and one false memory prompt that described the participant getting lost in a shopping mall as a child; this false event was created by slotting the informant-provided details into a pre-prepared narrative in which the participant was described as getting lost for a short period of time, becoming upset and then being found by an elderly woman before being reunited with their parent. Participants were then interviewed on two separate occasions, for 20–30 min, where they were encouraged to try to remember as much as they could about the event. The transcripts of these conversations were then coded for the presence of a memory using a pre-registered coding scheme. At the conclusion of the second interview, participants self-reported whether or not they remembered each of the events, before being debriefed. Participants and informants reported enjoying the study and largely did not object to the deception employed (Murphy, Maher, et al. 2023). In a follow-up study, we confirmed that our debriefing methods were effective at retracting these false memories (Greene et al. 2024).It is important to first note that we welcome scrutiny and discussion of our results. Our participants (and their parents) generously volunteered a lot of time to complete our study and, as a research team, we exerted significant effort to make our anonymised data open and accessible to other researchers. It is gratifying to see our hard-won data informing the work of other researchers. So, while we do not agree with the methods or conclusions put forward by Andrews and Brewin, we do wholeheartedly support their careful inspection of our work. Many of the questions raised by Andrews and Brewin relate to foundational principles in false memory research, principles we considered at length in running this study and which we feel would benefit from further discussion. Our methodological choices (and our pre-registered hypotheses) reflect our conviction that all coding schemes are imperfect and there is no absolute rate of false memory formation that we could or should expect to observe in any one study—we therefore set out to record multiple different measures of memory that could be considered in the round. Here, we take the opportunity to unpack some of these thorny issues further and offer our perspective, which aligns in many ways with the commentary by Wade et al. (2025). We make four arguments related to the coding of false memories: 1. There is no one perfect false memory coding scheme, 2. There is no absolute false memory rate, 3. Memory distortion is an active process, not a passive ‘hacking’ of one's memory, and 4. Interviews are a noisy means of assessing memories.We agree with Andrews and Brewin's (2024) basic finding that any given coding scheme is likely to give different results from another. As we reported in our original paper, at the conclusion of the study during the second interview, we observed a false memory rate of 35% when applying the Loftus and Pickrell (1995) coding scheme to the interview transcript, but we also observed a self-reported false memory rate of 14% (alongside an additional 52% of participants who self-reported believing that the event had occurred). Furthermore, when we showed excerpts from the transcripts to a mock jury and asked them whether the interviewee was remembering an event, they demonstrated only moderate agreement with the other methods of coding false memories (55% agreement with the coding scheme, 70% agreement with self-report), with mock jurors adopting more liberal thresholds for classifying memories.These findings, while noteworthy, were very much in line with prior published work that has established that memory rates can vary hugely when different schemes are applied. For example, Wade et al. (2018) reanalysed a rich false memory study conducted by Shaw and Porter (2015), reporting a false memory rate of just 30%, in contrast to the originally reported 70%. These discrepancies are interesting, both theoretically and practically, and speak to the enormous challenge inherent in trying to classify a memory. There have been extensive debates in the memory literature regarding how best to code transcripts for the presence of a false memory, including arguments that we should rely on participant self-reports rather than researcher-produced coding schemes. These arguments have provoked nuanced discussions of the nature of (false) memory and how we should define it (Shaw 2018).The selection of these ‘key details’ is arbitrary, and Andrews and Brewin's analysis suggests that some were quite poorly chosen. For example, 0% of those classed as having a full memory in our analysis noted their age when recalling the event. This is not surprising to us, given that the false event was the third event being discussed in that interview, and all of the events were from around that period of childhood. As we will later discuss, the natural rhythms of conversation are such that many people do not mention the age they were when an event took place, even if it was a detail we provided them with to help them imagine when it might have taken place. Others ‘key details’, such as the detail about the elderly woman, are actually multiple details bundled together, whereby the participant was only coded as explicitly recalling that detail if they mentioned the person was 1. old, 2. female, and 3. performed a helpful act. It is arbitrary to declare certain details to be so central to the memory that a failure to mention one specific detail results in a memory being downgraded or discounted entirely.At the start of the study, this participant stated that she did not remember this event at all, but by interview two here she gives a rich and detailed account of getting lost in a shopping mall, which was coded as a full memory using the Loftus and Pickrell scheme. When asked, she self-reported a clear memory for this event and said she would be extremely willing to testify that it happened (9/10 on a Likert scale of willingness). However, she does not mention her age, being upset, or the name of the shopping mall (though she does name the shop itself), so would only score a 3/6 on Andrews and Brewin's novel coding scheme. Indeed, they observed that not a single participant reported more than four of their six core details. Despite this, the participant offers rich sensory details and insists she has a very clear and trusted memory of the event. This example also highlights another problem with this counting approach, in that each detail is given equal weighting. Reporting the name of the shopping mall is equally important as remembering being lost.This is a memory of a true event (and an important one at that), and yet the participant only explicitly reports two out of the six details (visiting her mother in the hospital and, with uncertainty, receiving a Dora doll). She does not note her age or the name of the hospital, or mention that her mother hugged her when she came in. She does mention noticing how small the baby was but does not specifically recall commenting on his ears or fingers. Note too that this participant reports a slightly different version of the event (stating it was her cousin and not her grandmother who accompanied her) and though she says she only remembers the detail about the doll because it was contained in the original prompt, she has seemingly fleshed out that image so that she now notes the doll was wrapped in a blanket rather than wrapped with wrapping paper. She does report additional details that were not in the prompt (e.g., the silver Toyota, the physical location of everyone in the room), but these do not form part of Andrews and Brewin's scheme, which seems to assume that, in order to be considered a rich and detailed memory, the prompt should be repeated back verbatim. Yet this participant reports remembering this event, and we are confident that if you asked a layperson (or a jury member) whether this participant was remembering this event, they would say yes. Counting the presence of ‘key’ details from a prompt is one way to assess the richness of false memories, but it is not the only way—and in fact, we would argue it is one of the least valuable in terms of understanding memory (re)construction.Whether false memories occur in a given paradigm 5% of the time or 55% of the time does not change what these paradigms tell us about the nature of human memory, nor does it change the forensic implications. Even if we could settle on an agreed rate (a very difficult task given the variables involved), that would not tell an investigator or an expert witness whether a given memory is false (Smeets et al. 2017). The Lost in the Mall study is so well known because it established that false memories can happen, but neither the original Loftus and Pickrell paper nor our replication study made any claims about the absolute rate at which this should be expected to occur. Other work has also clearly demonstrated that though around a quarter of participants typically form a false memory in a given study (Scoboria et al. 2017), that does not mean that only a quarter of the population are susceptible to forming false memories (Murphy, Loftus, et al. 2023; Patihis 2018).For the self-report question, these participants were explicitly asked if they remembered being lost in a shopping mall, and they indicated that they did. To then remove those participants for not mentioning being lost earlier in the interview is clearly a highly restrictive way to classify memories.Andrews and Brewin also quite notably fail to mention the high rates of belief in these fabricated events. Altogether, the self-reported data suggested that 66% of participants remembered (14%) or believed (52%) that the event had occurred. Thus, the Loftus and Pickrell coding scheme provided higher estimates of false memories than self-report, but also failed to capture that the majority of participants came to believe the event happened and were willing to testify to that fact. As discussed by Scoboria and Mazzoni (2017), belief has been shown to be more than sufficient to cause changes in behaviour (Bernstein et al. 2015) and so false beliefs are an important outcome from rich false memory studies.Where we perhaps most strongly disagree with Andrews and Brewin is their assertion that ‘half the group described potentially true events’. The possibility that participants really did get lost in a shopping mall as children is of course a pertinent one in a study like this—hence why other studies have utilised less commonplace experiences (e.g., Hyman Jr. et al. 1995). While we identified three participants in our original study that we believed could have been reporting a true event (based on their persistent reporting of the event, from the initial survey through to the post-debrief follow-up), Andrews and Brewin declared half of the false memory reports to be ‘potentially true’, marking perhaps the most significant step on their journey from our 35% estimate to their 4% estimate.The rationale for these memories being potentially true raises an interesting theoretical point about the nature of memory. Andrews and Brewin note that these memories were likely real because, for example, the participant reported getting lost in a different shop from the one we prompted them with. However, mountains of evidence on the reconstructive nature of memory would predict exactly this, that participants would take our prompt and actively merge it with their own knowledge and experiences (Greene et al. 2022; Lindsay and Hyman Jr 2017; Loftus and Pickrell 1995; Murphy et al. 2019). The nature of the Lost in the Mall paradigm is particularly active—participants are explicitly encouraged to search their memories and have a discussion with the interviewer about what they can recall and what images they see in their mind. This is in contrast to the kind of process implied by Andrews and Brewin. The expectation that we would hand participants a prompt and they would then recite it back to us, verbatim, with no changes and all so-called ‘core details’ intact suggests a very passive process, almost a hacking of memory where a complete event is ‘uploaded’ to our participants' minds.Andrews and Brewin argue that the events that they have classified as potentially true were recalled with greater certainty and detail and less closely matched the details provided in the prompt (i.e., a different shopping mall was named). They suggest that these were true events that really happened and thus were reported with more certainty. However, it may also be that because the participant actively connected the fake story to other, real events from their lives, these participants built more detailed and convincing false memories—indeed, extant research clearly indicates that people do integrate real personal experiences into false memories in just this manner (Shaw and Porter 2015; Zaragoza et al. 2019). We do not have the data here to answer this question with any certainty, but we would welcome an experimental assessment of this point in the future. Regardless, we would not predict that participants would ever passively accept every detail supplied to them and note that the real-world harms that may arise from, say, suggestive therapy practices, are not contingent on wholesale adoption of every presented detail either.Perhaps our greatest lesson in carrying out this large-scale study was the fact that the interview transcripts are a product of natural conversation. When we devise coding schemes, we can sometimes fall into the logical trap of thinking we are applying the coding scheme to a participant's memory. As we cannot see inside their brain and scrutinise their recollections directly, we are in fact coding the way they speak about their memory. In a study like this, participants are not delivering a monologue; they are engaging in dialogue with an interviewer. Thus, their answers are contingent on the questions they are asked.This distinction was particularly pronounced in our study, as we had six student interviewers conducting this project and there was variation in their styles. Though they were well trained and all followed the same interview schedule, they had different personalities and also varying levels of rapport with the participants. We saw considerable variation in how rates of false memories changed between the coded booklet survey (before any contact with the researcher), the coded interview transcript, and the participants' own self-reported memory declaration. For example, the participants assigned to one researcher had a 10% false memory rate in the booklet survey, which rose to a 50% rate by the second interview, but returned to a 10% rate for self-report. Another interviewer's participants had an 18% false memory rate in the booklet survey that actually dropped to 10% during the interviews, then dropped again to 0% for self-report. This may have been due to differences between interviewers, as it seems they varied in the follow-up questions they asked and what kind of information they encouraged the participant to say ‘on the record’, as it were. We also note that the associative nature of the recall process would predict that slightly different details are recalled during different attempts (Odinot et al. 2013) – humans are not jukeboxes, and a similar prompt will not elicit an identical recollection on each occasion. In addition, participants' level of attention to the conversation and the recalled event is likely to wax and wane over the course of the conversation, and previous research suggests that the attentiveness of a listener can impact what details are recalled (Pasupathi and Oldroyd 2015).The role of the interviewer is particularly pertinent when applying a count-based scheme like that of Andrews and Brewin. They noted that very few of our participants recounted their age when discussing their false memory. Of course, this is not how conversation normally works. Imagine someone asking you about your first day of school, at the age of four. You would not typically begin your account by saying, ‘I was four years old when I started school’, unless that detail felt particularly pertinent to you (‘…so I was the youngest because everyone else was at least five’). Instead, you might talk about your memories of the classroom, the teacher, the other children etc. If age was to be considered an important detail a priori, it would be important to add a question to the interview schedule (‘and what age were you when this happened?’) to fairly judge whether participants recall that detail or not. Absence of evidence is not evidence of absence—we simply do not know whether participants came to remember that they were about five when this false event occurred, as we did not ask them and cannot draw conclusions from their failure to mention it. In our replication study, we saw the role of the interviewers as encouraging participants to talk and so they asked an array of open-ended questions (e.g., ‘and can you picture what it was like in the shop? Who would have been with you?’ etc.). They were facilitators of the conversation, not examiners of memory detail. Interviewers were also expected to maintain the study's ruse at this point; as participants were unaware we were studying false memories, it was important not to interrogate participants to the point where they might question if the event really happened.Many participants may not have actually stated that they got lost, because it was implied by the question. We note that media training often encourages interviewees to include the question in their answer, so that when extracted out of context in a soundbite the quote is more detailed (i.e., when asked when you will launch a product, rather than saying ‘December’, you might be encouraged to say ‘We will launch this product in December’). Training is required to learn to speak like this precisely because we do not naturally speak so repetitively in natural conversation. When coding memory, it is therefore important to remain cognisant of the specific prompts offered to an interviewee as clearly that gives context to what they do and do not say in their narration.It is useful for researchers to reflect on the role of interviewers in rich false memory studies and to consider in advance what their approach will be. Decisions about the interview style and coding scheme ought to be made in unison (and ideally, preregistered), as the interviewer has such an influence on what the participant is likely to speak about. As we have discussed, it is difficult to move the goalposts after the fact and employ a detail-based scheme when the interviews were not set up to assess the presence or absence of those specific details. In our study, we found it useful to combine the natural (imperfect) conversation between participant and interviewer with some standardised questions that they answered during and after the event (Do you remember this event? How vivid is your memory? etc.) and to consider the resulting data in a holistic manner.In our Lost in the Mall replication, we reported a top-line false memory rate of 35%, which is in line with the rates reported across a range of similar studies (see Scoboria et al. 2017 for a mega analysis of false memory implantation studies). Our position is certainly not to argue that any particular false memory rate is ‘correct’; as noted in our replication paper and in the above discussion, we advocate for the use of multiple coding methods, including self-report where appropriate. Just as importantly, we argue that memory reports should be evaluated holistically, with consideration of the context in which the reports were obtained (here, via a naturalistic conversation). We do not consider the use of reductive and over-simplistic count schemes to be a useful measure of memory (true or false) and reject the idea that memory prompts should be repeated back without alteration in order for a participant's recollection to be considered a memory.The clinical and forensic implications of the Lost in the Mall study (and our replication study) remain clear and important. We note that the implantation methods used in these studies are fairly light-touch. Though the studies are enormously burdensome to conduct, involving contact with parents and multiple interviews and online surveys per participant, the actual manipulation of memory is quite mild. Participants are presented with a very short summary of a supposed event from their childhood and are asked to reflect on whether they remember it. That is all. As Scoboria and Mazzoni (2017) noted, this pales in comparison to the kind of memory distortion that might occur over years of suggestive therapy. We therefore respectfully submit that to quibble over the precise rate of false memory in a given study is essentially to miss the point regarding the potential harms to therapeutic patients (Wade et al. 2025).Gillian Murphy: writing – original draft, conceptualization. Ciara M. Greene: conceptualization, writing – review and editing.The authors have nothing to report.The authors declare no conflicts of interest.","PeriodicalId":48281,"journal":{"name":"Applied Cognitive Psychology","volume":"39 2","pages":""},"PeriodicalIF":2.1000,"publicationDate":"2025-03-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/acp.70044","citationCount":"0","resultStr":"{\"title\":\"Who Got Lost in the Mall? Challenges in Counting and Classifying False Memories\",\"authors\":\"Gillian Murphy, Ciara M. Greene\",\"doi\":\"10.1002/acp.70044\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"What is a memory? Can an outside observer really ascertain whether someone is remembering an event? How can they do so reliably? These are challenging questions that we face as memory researchers, particularly when we try to tease apart true and false memories and beliefs. In this issue, Andrews and Brewin (2024) reanalysed a portion of the data from our recent rich false memory study (Murphy, Dawson, et al. 2023) and developed a novel coding scheme based on counting reported details. They also applied further, more stringent criteria to classify the false memories reported in our study and concluded that this method yields a different false memory rate from the scheme used in our original paper, and that this rate is different again from participants' own self-reported memories. These findings do not surprise us—in our experience, different coding schemes will always yield different rates—but we do disagree with both the methods used and the conclusions that Andrews and Brewin drew from these findings.To provide some context, we first offer a brief overview of the replication study. This was conducted by a team of students as a collaborative project (Murphy and Greene 2023) and closely adhered to the methods of the classic Lost in the Mall study (Loftus and Pickrell 1995). Participants signed up for a study about how we remember our childhoods and their informant (usually their mother) completed an online survey telling us about some true childhood events as well as some information about shopping trips when the participant was a child. We then sent participants a survey in which they were shown three true memory descriptions (taken from their informant's account) and one false memory prompt that described the participant getting lost in a shopping mall as a child; this false event was created by slotting the informant-provided details into a pre-prepared narrative in which the participant was described as getting lost for a short period of time, becoming upset and then being found by an elderly woman before being reunited with their parent. Participants were then interviewed on two separate occasions, for 20–30 min, where they were encouraged to try to remember as much as they could about the event. The transcripts of these conversations were then coded for the presence of a memory using a pre-registered coding scheme. At the conclusion of the second interview, participants self-reported whether or not they remembered each of the events, before being debriefed. Participants and informants reported enjoying the study and largely did not object to the deception employed (Murphy, Maher, et al. 2023). In a follow-up study, we confirmed that our debriefing methods were effective at retracting these false memories (Greene et al. 2024).It is important to first note that we welcome scrutiny and discussion of our results. Our participants (and their parents) generously volunteered a lot of time to complete our study and, as a research team, we exerted significant effort to make our anonymised data open and accessible to other researchers. It is gratifying to see our hard-won data informing the work of other researchers. So, while we do not agree with the methods or conclusions put forward by Andrews and Brewin, we do wholeheartedly support their careful inspection of our work. Many of the questions raised by Andrews and Brewin relate to foundational principles in false memory research, principles we considered at length in running this study and which we feel would benefit from further discussion. Our methodological choices (and our pre-registered hypotheses) reflect our conviction that all coding schemes are imperfect and there is no absolute rate of false memory formation that we could or should expect to observe in any one study—we therefore set out to record multiple different measures of memory that could be considered in the round. Here, we take the opportunity to unpack some of these thorny issues further and offer our perspective, which aligns in many ways with the commentary by Wade et al. (2025). We make four arguments related to the coding of false memories: 1. There is no one perfect false memory coding scheme, 2. There is no absolute false memory rate, 3. Memory distortion is an active process, not a passive ‘hacking’ of one's memory, and 4. Interviews are a noisy means of assessing memories.We agree with Andrews and Brewin's (2024) basic finding that any given coding scheme is likely to give different results from another. As we reported in our original paper, at the conclusion of the study during the second interview, we observed a false memory rate of 35% when applying the Loftus and Pickrell (1995) coding scheme to the interview transcript, but we also observed a self-reported false memory rate of 14% (alongside an additional 52% of participants who self-reported believing that the event had occurred). Furthermore, when we showed excerpts from the transcripts to a mock jury and asked them whether the interviewee was remembering an event, they demonstrated only moderate agreement with the other methods of coding false memories (55% agreement with the coding scheme, 70% agreement with self-report), with mock jurors adopting more liberal thresholds for classifying memories.These findings, while noteworthy, were very much in line with prior published work that has established that memory rates can vary hugely when different schemes are applied. For example, Wade et al. (2018) reanalysed a rich false memory study conducted by Shaw and Porter (2015), reporting a false memory rate of just 30%, in contrast to the originally reported 70%. These discrepancies are interesting, both theoretically and practically, and speak to the enormous challenge inherent in trying to classify a memory. There have been extensive debates in the memory literature regarding how best to code transcripts for the presence of a false memory, including arguments that we should rely on participant self-reports rather than researcher-produced coding schemes. These arguments have provoked nuanced discussions of the nature of (false) memory and how we should define it (Shaw 2018).The selection of these ‘key details’ is arbitrary, and Andrews and Brewin's analysis suggests that some were quite poorly chosen. For example, 0% of those classed as having a full memory in our analysis noted their age when recalling the event. This is not surprising to us, given that the false event was the third event being discussed in that interview, and all of the events were from around that period of childhood. As we will later discuss, the natural rhythms of conversation are such that many people do not mention the age they were when an event took place, even if it was a detail we provided them with to help them imagine when it might have taken place. Others ‘key details’, such as the detail about the elderly woman, are actually multiple details bundled together, whereby the participant was only coded as explicitly recalling that detail if they mentioned the person was 1. old, 2. female, and 3. performed a helpful act. It is arbitrary to declare certain details to be so central to the memory that a failure to mention one specific detail results in a memory being downgraded or discounted entirely.At the start of the study, this participant stated that she did not remember this event at all, but by interview two here she gives a rich and detailed account of getting lost in a shopping mall, which was coded as a full memory using the Loftus and Pickrell scheme. When asked, she self-reported a clear memory for this event and said she would be extremely willing to testify that it happened (9/10 on a Likert scale of willingness). However, she does not mention her age, being upset, or the name of the shopping mall (though she does name the shop itself), so would only score a 3/6 on Andrews and Brewin's novel coding scheme. Indeed, they observed that not a single participant reported more than four of their six core details. Despite this, the participant offers rich sensory details and insists she has a very clear and trusted memory of the event. This example also highlights another problem with this counting approach, in that each detail is given equal weighting. Reporting the name of the shopping mall is equally important as remembering being lost.This is a memory of a true event (and an important one at that), and yet the participant only explicitly reports two out of the six details (visiting her mother in the hospital and, with uncertainty, receiving a Dora doll). She does not note her age or the name of the hospital, or mention that her mother hugged her when she came in. She does mention noticing how small the baby was but does not specifically recall commenting on his ears or fingers. Note too that this participant reports a slightly different version of the event (stating it was her cousin and not her grandmother who accompanied her) and though she says she only remembers the detail about the doll because it was contained in the original prompt, she has seemingly fleshed out that image so that she now notes the doll was wrapped in a blanket rather than wrapped with wrapping paper. She does report additional details that were not in the prompt (e.g., the silver Toyota, the physical location of everyone in the room), but these do not form part of Andrews and Brewin's scheme, which seems to assume that, in order to be considered a rich and detailed memory, the prompt should be repeated back verbatim. Yet this participant reports remembering this event, and we are confident that if you asked a layperson (or a jury member) whether this participant was remembering this event, they would say yes. Counting the presence of ‘key’ details from a prompt is one way to assess the richness of false memories, but it is not the only way—and in fact, we would argue it is one of the least valuable in terms of understanding memory (re)construction.Whether false memories occur in a given paradigm 5% of the time or 55% of the time does not change what these paradigms tell us about the nature of human memory, nor does it change the forensic implications. Even if we could settle on an agreed rate (a very difficult task given the variables involved), that would not tell an investigator or an expert witness whether a given memory is false (Smeets et al. 2017). The Lost in the Mall study is so well known because it established that false memories can happen, but neither the original Loftus and Pickrell paper nor our replication study made any claims about the absolute rate at which this should be expected to occur. Other work has also clearly demonstrated that though around a quarter of participants typically form a false memory in a given study (Scoboria et al. 2017), that does not mean that only a quarter of the population are susceptible to forming false memories (Murphy, Loftus, et al. 2023; Patihis 2018).For the self-report question, these participants were explicitly asked if they remembered being lost in a shopping mall, and they indicated that they did. To then remove those participants for not mentioning being lost earlier in the interview is clearly a highly restrictive way to classify memories.Andrews and Brewin also quite notably fail to mention the high rates of belief in these fabricated events. Altogether, the self-reported data suggested that 66% of participants remembered (14%) or believed (52%) that the event had occurred. Thus, the Loftus and Pickrell coding scheme provided higher estimates of false memories than self-report, but also failed to capture that the majority of participants came to believe the event happened and were willing to testify to that fact. As discussed by Scoboria and Mazzoni (2017), belief has been shown to be more than sufficient to cause changes in behaviour (Bernstein et al. 2015) and so false beliefs are an important outcome from rich false memory studies.Where we perhaps most strongly disagree with Andrews and Brewin is their assertion that ‘half the group described potentially true events’. The possibility that participants really did get lost in a shopping mall as children is of course a pertinent one in a study like this—hence why other studies have utilised less commonplace experiences (e.g., Hyman Jr. et al. 1995). While we identified three participants in our original study that we believed could have been reporting a true event (based on their persistent reporting of the event, from the initial survey through to the post-debrief follow-up), Andrews and Brewin declared half of the false memory reports to be ‘potentially true’, marking perhaps the most significant step on their journey from our 35% estimate to their 4% estimate.The rationale for these memories being potentially true raises an interesting theoretical point about the nature of memory. Andrews and Brewin note that these memories were likely real because, for example, the participant reported getting lost in a different shop from the one we prompted them with. However, mountains of evidence on the reconstructive nature of memory would predict exactly this, that participants would take our prompt and actively merge it with their own knowledge and experiences (Greene et al. 2022; Lindsay and Hyman Jr 2017; Loftus and Pickrell 1995; Murphy et al. 2019). The nature of the Lost in the Mall paradigm is particularly active—participants are explicitly encouraged to search their memories and have a discussion with the interviewer about what they can recall and what images they see in their mind. This is in contrast to the kind of process implied by Andrews and Brewin. The expectation that we would hand participants a prompt and they would then recite it back to us, verbatim, with no changes and all so-called ‘core details’ intact suggests a very passive process, almost a hacking of memory where a complete event is ‘uploaded’ to our participants' minds.Andrews and Brewin argue that the events that they have classified as potentially true were recalled with greater certainty and detail and less closely matched the details provided in the prompt (i.e., a different shopping mall was named). They suggest that these were true events that really happened and thus were reported with more certainty. However, it may also be that because the participant actively connected the fake story to other, real events from their lives, these participants built more detailed and convincing false memories—indeed, extant research clearly indicates that people do integrate real personal experiences into false memories in just this manner (Shaw and Porter 2015; Zaragoza et al. 2019). We do not have the data here to answer this question with any certainty, but we would welcome an experimental assessment of this point in the future. Regardless, we would not predict that participants would ever passively accept every detail supplied to them and note that the real-world harms that may arise from, say, suggestive therapy practices, are not contingent on wholesale adoption of every presented detail either.Perhaps our greatest lesson in carrying out this large-scale study was the fact that the interview transcripts are a product of natural conversation. When we devise coding schemes, we can sometimes fall into the logical trap of thinking we are applying the coding scheme to a participant's memory. As we cannot see inside their brain and scrutinise their recollections directly, we are in fact coding the way they speak about their memory. In a study like this, participants are not delivering a monologue; they are engaging in dialogue with an interviewer. Thus, their answers are contingent on the questions they are asked.This distinction was particularly pronounced in our study, as we had six student interviewers conducting this project and there was variation in their styles. Though they were well trained and all followed the same interview schedule, they had different personalities and also varying levels of rapport with the participants. We saw considerable variation in how rates of false memories changed between the coded booklet survey (before any contact with the researcher), the coded interview transcript, and the participants' own self-reported memory declaration. For example, the participants assigned to one researcher had a 10% false memory rate in the booklet survey, which rose to a 50% rate by the second interview, but returned to a 10% rate for self-report. Another interviewer's participants had an 18% false memory rate in the booklet survey that actually dropped to 10% during the interviews, then dropped again to 0% for self-report. This may have been due to differences between interviewers, as it seems they varied in the follow-up questions they asked and what kind of information they encouraged the participant to say ‘on the record’, as it were. We also note that the associative nature of the recall process would predict that slightly different details are recalled during different attempts (Odinot et al. 2013) – humans are not jukeboxes, and a similar prompt will not elicit an identical recollection on each occasion. In addition, participants' level of attention to the conversation and the recalled event is likely to wax and wane over the course of the conversation, and previous research suggests that the attentiveness of a listener can impact what details are recalled (Pasupathi and Oldroyd 2015).The role of the interviewer is particularly pertinent when applying a count-based scheme like that of Andrews and Brewin. They noted that very few of our participants recounted their age when discussing their false memory. Of course, this is not how conversation normally works. Imagine someone asking you about your first day of school, at the age of four. You would not typically begin your account by saying, ‘I was four years old when I started school’, unless that detail felt particularly pertinent to you (‘…so I was the youngest because everyone else was at least five’). Instead, you might talk about your memories of the classroom, the teacher, the other children etc. If age was to be considered an important detail a priori, it would be important to add a question to the interview schedule (‘and what age were you when this happened?’) to fairly judge whether participants recall that detail or not. Absence of evidence is not evidence of absence—we simply do not know whether participants came to remember that they were about five when this false event occurred, as we did not ask them and cannot draw conclusions from their failure to mention it. In our replication study, we saw the role of the interviewers as encouraging participants to talk and so they asked an array of open-ended questions (e.g., ‘and can you picture what it was like in the shop? Who would have been with you?’ etc.). They were facilitators of the conversation, not examiners of memory detail. Interviewers were also expected to maintain the study's ruse at this point; as participants were unaware we were studying false memories, it was important not to interrogate participants to the point where they might question if the event really happened.Many participants may not have actually stated that they got lost, because it was implied by the question. We note that media training often encourages interviewees to include the question in their answer, so that when extracted out of context in a soundbite the quote is more detailed (i.e., when asked when you will launch a product, rather than saying ‘December’, you might be encouraged to say ‘We will launch this product in December’). Training is required to learn to speak like this precisely because we do not naturally speak so repetitively in natural conversation. When coding memory, it is therefore important to remain cognisant of the specific prompts offered to an interviewee as clearly that gives context to what they do and do not say in their narration.It is useful for researchers to reflect on the role of interviewers in rich false memory studies and to consider in advance what their approach will be. Decisions about the interview style and coding scheme ought to be made in unison (and ideally, preregistered), as the interviewer has such an influence on what the participant is likely to speak about. As we have discussed, it is difficult to move the goalposts after the fact and employ a detail-based scheme when the interviews were not set up to assess the presence or absence of those specific details. In our study, we found it useful to combine the natural (imperfect) conversation between participant and interviewer with some standardised questions that they answered during and after the event (Do you remember this event? How vivid is your memory? etc.) and to consider the resulting data in a holistic manner.In our Lost in the Mall replication, we reported a top-line false memory rate of 35%, which is in line with the rates reported across a range of similar studies (see Scoboria et al. 2017 for a mega analysis of false memory implantation studies). Our position is certainly not to argue that any particular false memory rate is ‘correct’; as noted in our replication paper and in the above discussion, we advocate for the use of multiple coding methods, including self-report where appropriate. Just as importantly, we argue that memory reports should be evaluated holistically, with consideration of the context in which the reports were obtained (here, via a naturalistic conversation). We do not consider the use of reductive and over-simplistic count schemes to be a useful measure of memory (true or false) and reject the idea that memory prompts should be repeated back without alteration in order for a participant's recollection to be considered a memory.The clinical and forensic implications of the Lost in the Mall study (and our replication study) remain clear and important. We note that the implantation methods used in these studies are fairly light-touch. Though the studies are enormously burdensome to conduct, involving contact with parents and multiple interviews and online surveys per participant, the actual manipulation of memory is quite mild. Participants are presented with a very short summary of a supposed event from their childhood and are asked to reflect on whether they remember it. That is all. As Scoboria and Mazzoni (2017) noted, this pales in comparison to the kind of memory distortion that might occur over years of suggestive therapy. We therefore respectfully submit that to quibble over the precise rate of false memory in a given study is essentially to miss the point regarding the potential harms to therapeutic patients (Wade et al. 2025).Gillian Murphy: writing – original draft, conceptualization. Ciara M. Greene: conceptualization, writing – review and editing.The authors have nothing to report.The authors declare no conflicts of interest.\",\"PeriodicalId\":48281,\"journal\":{\"name\":\"Applied Cognitive Psychology\",\"volume\":\"39 2\",\"pages\":\"\"},\"PeriodicalIF\":2.1000,\"publicationDate\":\"2025-03-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://onlinelibrary.wiley.com/doi/epdf/10.1002/acp.70044\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Applied Cognitive Psychology\",\"FirstCategoryId\":\"102\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1002/acp.70044\",\"RegionNum\":3,\"RegionCategory\":\"心理学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"PSYCHOLOGY, EXPERIMENTAL\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied Cognitive Psychology","FirstCategoryId":"102","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/acp.70044","RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"PSYCHOLOGY, EXPERIMENTAL","Score":null,"Total":0}

引用次数: 0

摘要

什么是记忆？外部观察者真的能确定某人是否在回忆一件事吗？他们是如何做到可靠的呢？这些都是我们作为记忆研究人员所面临的具有挑战性的问题，尤其是当我们试图区分真实和虚假的记忆和信念时。在这一期中，Andrews和Brewin（2024）重新分析了我们最近丰富的错误记忆研究的一部分数据（Murphy, Dawson, et al. 2023），并基于统计报告的细节开发了一种新的编码方案。他们还进一步应用了更严格的标准来对我们研究中报告的错误记忆进行分类，并得出结论，这种方法产生的错误记忆率与我们原始论文中使用的方案不同，而且这个比率也与参与者自己报告的记忆不同。这些发现并不让我们感到惊讶——根据我们的经验，不同的编码方案总是会产生不同的速率——但我们确实不同意Andrews和Brewin从这些发现中得出的方法和结论。为了提供一些背景，我们首先简要概述了复制研究。这是由一个学生团队作为一个合作项目进行的（Murphy and Greene 2023），并严格遵循经典的迷失在商场研究（Loftus and Pickrell 1995）的方法。参与者报名参加了一项关于我们如何记忆童年的研究，他们的线人（通常是他们的母亲）完成了一项在线调查，告诉我们一些真实的童年事件，以及参与者小时候购物旅行的一些信息。然后，我们给参与者发了一份调查，向他们展示了三个真实的记忆描述（取自他们的线人的描述）和一个错误的记忆提示，描述了参与者小时候在购物中心迷路；这个虚假的事件是通过将举报人提供的细节放入一个预先准备好的叙述中来制造的，在这个叙述中，参与者被描述为迷失了很短的一段时间，变得心烦意乱，然后被一位老妇人找到，然后与父母团聚。然后，参与者在两个不同的场合接受采访，每次20-30分钟，在此期间，他们被鼓励尽可能多地记住事件。然后使用预先注册的编码方案对这些对话的文本进行编码，以确定是否存在记忆。在第二次采访结束时，参与者在听取询问之前，自我报告他们是否记得每件事。参与者和告密者报告说他们很享受这项研究，并且大部分不反对所采用的欺骗（Murphy, Maher, et al. 2023）。在后续研究中，我们证实了我们的汇报方法在收回这些错误记忆方面是有效的（Greene et al. 2024）。首先必须指出，我们欢迎对我们的成果进行审查和讨论。我们的参与者（和他们的父母）慷慨地自愿花了很多时间来完成我们的研究，作为一个研究团队，我们付出了巨大的努力，使我们的匿名数据开放给其他研究人员。看到我们来之不易的数据为其他研究人员的工作提供信息是令人欣慰的。因此，虽然我们不同意Andrews和Brewin提出的方法或结论，但我们全心全意地支持他们对我们工作的仔细检查。Andrews和Brewin提出的许多问题都与错误记忆研究的基本原则有关，我们在进行这项研究时详细考虑了这些原则，我们认为这些原则将从进一步的讨论中受益。我们的方法选择（以及我们预先登记的假设）反映了我们的信念，即所有的编码方案都是不完美的，在任何一项研究中，我们都不可能或应该期望观察到错误记忆形成的绝对比率——因此，我们开始记录多种不同的记忆测量方法，这些方法可以被全面考虑。在这里，我们借此机会进一步解开这些棘手的问题，并提出我们的观点，这在许多方面与Wade等人（2025）的评论一致。关于错误记忆的编码，我们提出了四个论点：1。没有一个完美的错误记忆编码方案。没有绝对的错误记忆率。3 .记忆扭曲是一个主动的过程，而不是对一个人的记忆进行被动的“入侵”。访谈是评估记忆的一种嘈杂的方式。我们同意Andrews和Brewin（2024）的基本发现，即任何给定的编码方案都可能给出与另一个不同的结果。正如我们在原始论文中报道的那样，在第二次访谈的研究结束时，我们观察到在将Loftus和Pickrell（1995）编码方案应用于访谈记录时，错误记忆率为35%，但我们也观察到自我报告的错误记忆率为14%（另外还有52%的参与者自我报告认为事件发生过）。此外，当我们向模拟陪审团展示记录的节选，并询问他们受访者是否记得一个事件时，他们对其他编码错误记忆的方法只有适度的同意（55%同意编码方案，70%同意自我报告），模拟陪审员采用更自由的阈值来分类记忆。这些发现虽然值得注意，但与之前发表的研究结果非常一致，该研究已经确定，当应用不同的方案时，记忆速率会有很大差异。例如，Wade等人（2018）重新分析了Shaw和Porter（2015）进行的一项丰富的错误记忆研究，报告错误记忆率仅为30%，而不是最初报道的70%。这些差异在理论上和实践上都很有趣，说明了试图对记忆进行分类的巨大挑战。关于如何最好地为存在错误记忆的转录本编码，在记忆文献中存在广泛的争论，包括我们应该依赖参与者的自我报告而不是研究人员产生的编码方案的争论。这些争论引发了关于（错误）记忆的本质以及我们应该如何定义它的细致入微的讨论（Shaw 2018）。这些“关键细节”的选择是随意的，安德鲁斯和布鲁因的分析表明，有些细节的选择相当糟糕。例如，在我们的分析中，0%的被归类为记忆完整的人在回忆事件时记录了他们的年龄。这对我们来说并不奇怪，因为虚假事件是在那次采访中被讨论的第三件事，所有的事件都发生在那个童年时期。正如我们稍后将讨论的，谈话的自然节奏是这样的，许多人不会提到事件发生时他们的年龄，即使我们提供了一个细节来帮助他们想象事情可能发生的时间。其他的“关键细节”，比如关于老妇人的细节，实际上是多个细节捆绑在一起的，因此，只有当参与者提到那个人是1岁时，他们才会被编码为明确地回忆起那个细节。老了,2。女性，3岁。做了一件有益的事。将某些细节声明为内存的核心是武断的，以至于未能提及某个特定细节会导致内存被降级或完全打折扣。在研究开始时，这位参与者表示她根本不记得这件事，但在这里的第二次采访中，她给出了在购物中心迷路的丰富而详细的描述，使用lottus和Pickrell方案将其编码为完整的记忆。当被问及这件事时，她自我报告了对这件事的清晰记忆，并表示她非常愿意作证这件事的发生（李克特意愿量表为9/10）。然而，她没有提到她的年龄，心烦意乱，或者购物中心的名字（尽管她给商店起了名字），所以安德鲁斯和布鲁因的新颖编码方案只能打3/6分。事实上，他们观察到，没有一个参与者报告了六个核心细节中的四个以上。尽管如此，参与者提供了丰富的感官细节，并坚称她对事件有非常清晰和可信的记忆。这个例子还突出了这种计数方法的另一个问题，即每个细节都被赋予了相同的权重。报告购物中心的名字和记住迷路同样重要。这是一个真实事件的记忆（而且是一个重要的事件），然而参与者只明确地报告了六个细节中的两个（去医院看望她的母亲，以及不确定地收到一个朵拉娃娃）。她没有记下自己的年龄和医院的名字，也没有提到她的母亲在她进来时拥抱了她。她确实提到注意到婴儿有多小，但没有特别记得评论过他的耳朵或手指。还要注意的是，这个参与者报告的事件版本略有不同（她说陪伴她的是她的表妹，而不是她的祖母），尽管她说她只记得关于娃娃的细节，因为它包含在最初的提示中，她似乎已经充实了那个形象，所以她现在注意到娃娃是用毯子包裹的，而不是用包装纸包裹的。她确实报告了提示中没有的额外细节（例如，银色丰田，房间里每个人的实际位置），但这些并不构成Andrews和Brewin计划的一部分，这似乎是假设，为了被认为是丰富和详细的记忆，提示应该逐字重复。然而，这个参与者报告说他记得这个事件，我们相信，如果你问一个外行人（或陪审团成员）这个参与者是否记得这个事件，他们会说是的。从提示中计算“关键”细节的存在是评估错误记忆丰富程度的一种方法，但它不是唯一的方法——事实上，我们认为它是理解记忆（重建）构建方面最没有价值的方法之一。无论错误记忆在给定的范式中是5%还是55%都不会改变这些范式告诉我们的人类记忆的本质，也不会改变法医的含义。即使我们可以确定一个商定的比率（考虑到所涉及的变量，这是一项非常困难的任务），也无法告诉调查员或专家证人给定的记忆是否错误（Smeets et al. 2017）。《迷失在林间》的研究之所以广为人知，是因为它证实了错误记忆是可能发生的，但无论是洛夫特斯和皮克雷尔最初的论文，还是我们的重复研究，都没有对错误记忆发生的绝对几率做出任何声明。其他研究也清楚地表明，尽管在给定的研究中，大约四分之一的参与者通常会形成错误记忆（Scoboria等人，2017），但这并不意味着只有四分之一的人容易形成错误记忆(Murphy， Loftus等人，2023；Patihis 2018)。在自我报告的问题中，这些参与者被明确地询问他们是否记得在购物中心迷路，他们表示记得。然后，把那些在采访中没有提及失散的参与者排除在外，显然是一种高度限制性的记忆分类方法。安德鲁斯和布鲁因也很明显地没有提到对这些捏造事件的高信仰率。总的来说，自我报告的数据表明，66%的参与者记得（14%）或相信（52%）事件发生过。因此，lottus和Pickrell编码方案提供了比自我报告更高的错误记忆估计，但也未能捕捉到大多数参与者相信事件发生并愿意为这一事实作证。正如Scoboria和Mazzoni（2017）所讨论的那样，信念已被证明足以引起行为的变化（Bernstein等人，2015），因此错误信念是丰富的错误记忆研究的重要结果。我们可能最不同意安德鲁斯和布鲁因的观点是，“一半的人描述的事件可能是真实的”。当然，在这样的研究中，参与者小时候真的在购物中心迷路的可能性是相关的——因此，为什么其他研究使用了不那么常见的经历（例如，Hyman Jr. et al. 1995）。虽然我们在最初的研究中确定了三名参与者，我们认为他们可能报告了一个真实的事件（基于他们从最初的调查到汇报后的跟进，持续不断地报告事件），但安德鲁斯和布鲁因宣布一半的错误记忆报告“可能是真实的”，这可能是他们从35%的估计到4%的估计的旅程中最重要的一步。这些记忆可能是真实的理论基础提出了一个关于记忆本质的有趣的理论观点。安德鲁斯和布鲁温指出，这些记忆可能是真实的，因为，例如，参与者报告说，他们在一个不同的商店迷路了，而不是我们提示他们的商店。然而，大量关于记忆重建性质的证据可以准确地预测这一点，即参与者会迅速并积极地将我们的记忆与他们自己的知识和经验结合起来(Greene等人，2022；Lindsay and Hyman Jr 2017；lottus和Pickrell 1995；Murphy et al. 2019)。“迷失林荫道”范式的本质是特别积极的，参与者被明确鼓励去搜索他们的记忆，并与面试官讨论他们能回忆起的东西以及他们在脑海中看到的图像。这与Andrews和Brewin所暗示的过程形成了对比。我们给参与者一个提示，然后他们会一字不改地背诵给我们，所有所谓的“核心细节”都完好无损，这表明这是一个非常被动的过程，几乎是对记忆的黑客攻击，一个完整的事件被“上传”到参与者的脑海中。Andrews和Brewin认为，他们归类为潜在真实的事件被回忆起来更加确定和详细，并且与提示提供的细节不太吻合（例如，命名了一个不同的购物中心）。他们认为这些都是真实发生的事件，因此被报道得更加确定。然而，也有可能是因为参与者积极地将虚假故事与他们生活中的其他真实事件联系起来，这些参与者建立了更详细和令人信服的虚假记忆——事实上，现有的研究清楚地表明，人们确实以这种方式将真实的个人经历整合到虚假记忆中(Shaw and Porter 2015；Zaragoza et al. 2019)。学习这样说话需要训练，因为我们在自然对话中不会自然地重复说话。因此，在对记忆进行编码时，重要的是要保持对提供给受访者的特定提示的认识，因为这清楚地为他们在叙述中做了什么和不说了什么提供了背景。研究人员在丰富的错误记忆研究中反思采访者的角色，并提前考虑他们的方法是有用的。关于面试风格和编码方案的决定应该是一致的（最好是预先注册的），因为面试官对参与者可能会说什么有很大的影响。正如我们已经讨论过的那样，在没有安排面谈来评估是否存在这些具体细节的情况下，很难在事后移动门柱并采用基于细节的计划。在我们的研究中，我们发现将参与者和采访者之间的自然（不完美）对话与他们在活动期间和之后回答的一些标准化问题结合起来是很有用的(你还记得这件事吗？你的记忆力有多好？等)，并以整体的方式考虑所得数据。在我们的《迷失在商场》复制中，我们报告了35%的最高错误记忆率，这与一系列类似研究报告的错误记忆率一致（参见Scoboria等人2017年对错误记忆植入研究的大型分析）。我们的立场当然不是说任何特定的错误记忆率都是“正确的”；正如在我们的复制论文和上面的讨论中所指出的，我们提倡使用多种编码方法，包括适当的自我报告。同样重要的是，我们认为应该从整体上评估记忆报告，考虑到获得报告的上下文（这里，通过自然对话）。我们不认为使用简化和过于简单的计数方案是一种有用的记忆测量方法（正确或错误），并且拒绝这样的观点，即为了使参与者的回忆被认为是记忆，记忆提示应该在没有改变的情况下重复。迷失在林荫道的研究（以及我们的重复研究）的临床和法医意义仍然清晰而重要。我们注意到，在这些研究中使用的植入方法相当轻触。尽管这些研究的进行非常繁琐，涉及到与父母的接触、对每个参与者进行多次访谈和在线调查，但对记忆的实际操纵是相当温和的。研究人员向参与者简要介绍了他们童年时期的一件假想事件，并要求他们反思自己是否记得这件事。就这些。正如Scoboria和Mazzoni（2017）所指出的那样，与多年的暗示性治疗可能导致的记忆扭曲相比，这就相形见绌了。因此，我们恭敬地提出，在给定的研究中，对错误记忆的精确率吹毛求疵，本质上是忽略了对治疗患者的潜在危害（Wade et al. 2025）。吉莉安墨菲：写作-原稿，概念化。Ciara M. Greene：概念化，写作-评论和编辑。作者没有什么可报告的。作者声明无利益冲突。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Who Got Lost in the Mall? Challenges in Counting and Classifying False Memories

What is a memory? Can an outside observer really ascertain whether someone is remembering an event? How can they do so reliably? These are challenging questions that we face as memory researchers, particularly when we try to tease apart true and false memories and beliefs. In this issue, Andrews and Brewin (2024) reanalysed a portion of the data from our recent rich false memory study (Murphy, Dawson, et al. 2023) and developed a novel coding scheme based on counting reported details. They also applied further, more stringent criteria to classify the false memories reported in our study and concluded that this method yields a different false memory rate from the scheme used in our original paper, and that this rate is different again from participants' own self-reported memories. These findings do not surprise us—in our experience, different coding schemes will always yield different rates—but we do disagree with both the methods used and the conclusions that Andrews and Brewin drew from these findings.

To provide some context, we first offer a brief overview of the replication study. This was conducted by a team of students as a collaborative project (Murphy and Greene 2023) and closely adhered to the methods of the classic Lost in the Mall study (Loftus and Pickrell 1995). Participants signed up for a study about how we remember our childhoods and their informant (usually their mother) completed an online survey telling us about some true childhood events as well as some information about shopping trips when the participant was a child. We then sent participants a survey in which they were shown three true memory descriptions (taken from their informant's account) and one false memory prompt that described the participant getting lost in a shopping mall as a child; this false event was created by slotting the informant-provided details into a pre-prepared narrative in which the participant was described as getting lost for a short period of time, becoming upset and then being found by an elderly woman before being reunited with their parent. Participants were then interviewed on two separate occasions, for 20–30 min, where they were encouraged to try to remember as much as they could about the event. The transcripts of these conversations were then coded for the presence of a memory using a pre-registered coding scheme. At the conclusion of the second interview, participants self-reported whether or not they remembered each of the events, before being debriefed. Participants and informants reported enjoying the study and largely did not object to the deception employed (Murphy, Maher, et al. 2023). In a follow-up study, we confirmed that our debriefing methods were effective at retracting these false memories (Greene et al. 2024).

It is important to first note that we welcome scrutiny and discussion of our results. Our participants (and their parents) generously volunteered a lot of time to complete our study and, as a research team, we exerted significant effort to make our anonymised data open and accessible to other researchers. It is gratifying to see our hard-won data informing the work of other researchers. So, while we do not agree with the methods or conclusions put forward by Andrews and Brewin, we do wholeheartedly support their careful inspection of our work. Many of the questions raised by Andrews and Brewin relate to foundational principles in false memory research, principles we considered at length in running this study and which we feel would benefit from further discussion. Our methodological choices (and our pre-registered hypotheses) reflect our conviction that all coding schemes are imperfect and there is no absolute rate of false memory formation that we could or should expect to observe in any one study—we therefore set out to record multiple different measures of memory that could be considered in the round. Here, we take the opportunity to unpack some of these thorny issues further and offer our perspective, which aligns in many ways with the commentary by Wade et al. (2025). We make four arguments related to the coding of false memories: 1. There is no one perfect false memory coding scheme, 2. There is no absolute false memory rate, 3. Memory distortion is an active process, not a passive ‘hacking’ of one's memory, and 4. Interviews are a noisy means of assessing memories.

We agree with Andrews and Brewin's (2024) basic finding that any given coding scheme is likely to give different results from another. As we reported in our original paper, at the conclusion of the study during the second interview, we observed a false memory rate of 35% when applying the Loftus and Pickrell (1995) coding scheme to the interview transcript, but we also observed a self-reported false memory rate of 14% (alongside an additional 52% of participants who self-reported believing that the event had occurred). Furthermore, when we showed excerpts from the transcripts to a mock jury and asked them whether the interviewee was remembering an event, they demonstrated only moderate agreement with the other methods of coding false memories (55% agreement with the coding scheme, 70% agreement with self-report), with mock jurors adopting more liberal thresholds for classifying memories.

These findings, while noteworthy, were very much in line with prior published work that has established that memory rates can vary hugely when different schemes are applied. For example, Wade et al. (2018) reanalysed a rich false memory study conducted by Shaw and Porter (2015), reporting a false memory rate of just 30%, in contrast to the originally reported 70%. These discrepancies are interesting, both theoretically and practically, and speak to the enormous challenge inherent in trying to classify a memory. There have been extensive debates in the memory literature regarding how best to code transcripts for the presence of a false memory, including arguments that we should rely on participant self-reports rather than researcher-produced coding schemes. These arguments have provoked nuanced discussions of the nature of (false) memory and how we should define it (Shaw 2018).

The selection of these ‘key details’ is arbitrary, and Andrews and Brewin's analysis suggests that some were quite poorly chosen. For example, 0% of those classed as having a full memory in our analysis noted their age when recalling the event. This is not surprising to us, given that the false event was the third event being discussed in that interview, and all of the events were from around that period of childhood. As we will later discuss, the natural rhythms of conversation are such that many people do not mention the age they were when an event took place, even if it was a detail we provided them with to help them imagine when it might have taken place. Others ‘key details’, such as the detail about the elderly woman, are actually multiple details bundled together, whereby the participant was only coded as explicitly recalling that detail if they mentioned the person was 1. old, 2. female, and 3. performed a helpful act. It is arbitrary to declare certain details to be so central to the memory that a failure to mention one specific detail results in a memory being downgraded or discounted entirely.

At the start of the study, this participant stated that she did not remember this event at all, but by interview two here she gives a rich and detailed account of getting lost in a shopping mall, which was coded as a full memory using the Loftus and Pickrell scheme. When asked, she self-reported a clear memory for this event and said she would be extremely willing to testify that it happened (9/10 on a Likert scale of willingness). However, she does not mention her age, being upset, or the name of the shopping mall (though she does name the shop itself), so would only score a 3/6 on Andrews and Brewin's novel coding scheme. Indeed, they observed that not a single participant reported more than four of their six core details. Despite this, the participant offers rich sensory details and insists she has a very clear and trusted memory of the event. This example also highlights another problem with this counting approach, in that each detail is given equal weighting. Reporting the name of the shopping mall is equally important as remembering being lost.

This is a memory of a true event (and an important one at that), and yet the participant only explicitly reports two out of the six details (visiting her mother in the hospital and, with uncertainty, receiving a Dora doll). She does not note her age or the name of the hospital, or mention that her mother hugged her when she came in. She does mention noticing how small the baby was but does not specifically recall commenting on his ears or fingers. Note too that this participant reports a slightly different version of the event (stating it was her cousin and not her grandmother who accompanied her) and though she says she only remembers the detail about the doll because it was contained in the original prompt, she has seemingly fleshed out that image so that she now notes the doll was wrapped in a blanket rather than wrapped with wrapping paper. She does report additional details that were not in the prompt (e.g., the silver Toyota, the physical location of everyone in the room), but these do not form part of Andrews and Brewin's scheme, which seems to assume that, in order to be considered a rich and detailed memory, the prompt should be repeated back verbatim. Yet this participant reports remembering this event, and we are confident that if you asked a layperson (or a jury member) whether this participant was remembering this event, they would say yes. Counting the presence of ‘key’ details from a prompt is one way to assess the richness of false memories, but it is not the only way—and in fact, we would argue it is one of the least valuable in terms of understanding memory (re)construction.

Whether false memories occur in a given paradigm 5% of the time or 55% of the time does not change what these paradigms tell us about the nature of human memory, nor does it change the forensic implications. Even if we could settle on an agreed rate (a very difficult task given the variables involved), that would not tell an investigator or an expert witness whether a given memory is false (Smeets et al. 2017). The Lost in the Mall study is so well known because it established that false memories can happen, but neither the original Loftus and Pickrell paper nor our replication study made any claims about the absolute rate at which this should be expected to occur. Other work has also clearly demonstrated that though around a quarter of participants typically form a false memory in a given study (Scoboria et al. 2017), that does not mean that only a quarter of the population are susceptible to forming false memories (Murphy, Loftus, et al. 2023; Patihis 2018).

For the self-report question, these participants were explicitly asked if they remembered being lost in a shopping mall, and they indicated that they did. To then remove those participants for not mentioning being lost earlier in the interview is clearly a highly restrictive way to classify memories.

Andrews and Brewin also quite notably fail to mention the high rates of belief in these fabricated events. Altogether, the self-reported data suggested that 66% of participants remembered (14%) or believed (52%) that the event had occurred. Thus, the Loftus and Pickrell coding scheme provided higher estimates of false memories than self-report, but also failed to capture that the majority of participants came to believe the event happened and were willing to testify to that fact. As discussed by Scoboria and Mazzoni (2017), belief has been shown to be more than sufficient to cause changes in behaviour (Bernstein et al. 2015) and so false beliefs are an important outcome from rich false memory studies.

Where we perhaps most strongly disagree with Andrews and Brewin is their assertion that ‘half the group described potentially true events’. The possibility that participants really did get lost in a shopping mall as children is of course a pertinent one in a study like this—hence why other studies have utilised less commonplace experiences (e.g., Hyman Jr. et al. 1995). While we identified three participants in our original study that we believed could have been reporting a true event (based on their persistent reporting of the event, from the initial survey through to the post-debrief follow-up), Andrews and Brewin declared half of the false memory reports to be ‘potentially true’, marking perhaps the most significant step on their journey from our 35% estimate to their 4% estimate.

The rationale for these memories being potentially true raises an interesting theoretical point about the nature of memory. Andrews and Brewin note that these memories were likely real because, for example, the participant reported getting lost in a different shop from the one we prompted them with. However, mountains of evidence on the reconstructive nature of memory would predict exactly this, that participants would take our prompt and actively merge it with their own knowledge and experiences (Greene et al. 2022; Lindsay and Hyman Jr 2017; Loftus and Pickrell 1995; Murphy et al. 2019). The nature of the Lost in the Mall paradigm is particularly active—participants are explicitly encouraged to search their memories and have a discussion with the interviewer about what they can recall and what images they see in their mind. This is in contrast to the kind of process implied by Andrews and Brewin. The expectation that we would hand participants a prompt and they would then recite it back to us, verbatim, with no changes and all so-called ‘core details’ intact suggests a very passive process, almost a hacking of memory where a complete event is ‘uploaded’ to our participants' minds.

Andrews and Brewin argue that the events that they have classified as potentially true were recalled with greater certainty and detail and less closely matched the details provided in the prompt (i.e., a different shopping mall was named). They suggest that these were true events that really happened and thus were reported with more certainty. However, it may also be that because the participant actively connected the fake story to other, real events from their lives, these participants built more detailed and convincing false memories—indeed, extant research clearly indicates that people do integrate real personal experiences into false memories in just this manner (Shaw and Porter 2015; Zaragoza et al. 2019). We do not have the data here to answer this question with any certainty, but we would welcome an experimental assessment of this point in the future. Regardless, we would not predict that participants would ever passively accept every detail supplied to them and note that the real-world harms that may arise from, say, suggestive therapy practices, are not contingent on wholesale adoption of every presented detail either.

Perhaps our greatest lesson in carrying out this large-scale study was the fact that the interview transcripts are a product of natural conversation. When we devise coding schemes, we can sometimes fall into the logical trap of thinking we are applying the coding scheme to a participant's memory. As we cannot see inside their brain and scrutinise their recollections directly, we are in fact coding the way they speak about their memory. In a study like this, participants are not delivering a monologue; they are engaging in dialogue with an interviewer. Thus, their answers are contingent on the questions they are asked.

This distinction was particularly pronounced in our study, as we had six student interviewers conducting this project and there was variation in their styles. Though they were well trained and all followed the same interview schedule, they had different personalities and also varying levels of rapport with the participants. We saw considerable variation in how rates of false memories changed between the coded booklet survey (before any contact with the researcher), the coded interview transcript, and the participants' own self-reported memory declaration. For example, the participants assigned to one researcher had a 10% false memory rate in the booklet survey, which rose to a 50% rate by the second interview, but returned to a 10% rate for self-report. Another interviewer's participants had an 18% false memory rate in the booklet survey that actually dropped to 10% during the interviews, then dropped again to 0% for self-report. This may have been due to differences between interviewers, as it seems they varied in the follow-up questions they asked and what kind of information they encouraged the participant to say ‘on the record’, as it were. We also note that the associative nature of the recall process would predict that slightly different details are recalled during different attempts (Odinot et al. 2013) – humans are not jukeboxes, and a similar prompt will not elicit an identical recollection on each occasion. In addition, participants' level of attention to the conversation and the recalled event is likely to wax and wane over the course of the conversation, and previous research suggests that the attentiveness of a listener can impact what details are recalled (Pasupathi and Oldroyd 2015).

The role of the interviewer is particularly pertinent when applying a count-based scheme like that of Andrews and Brewin. They noted that very few of our participants recounted their age when discussing their false memory. Of course, this is not how conversation normally works. Imagine someone asking you about your first day of school, at the age of four. You would not typically begin your account by saying, ‘I was four years old when I started school’, unless that detail felt particularly pertinent to you (‘…so I was the youngest because everyone else was at least five’). Instead, you might talk about your memories of the classroom, the teacher, the other children etc. If age was to be considered an important detail a priori, it would be important to add a question to the interview schedule (‘and what age were you when this happened?’) to fairly judge whether participants recall that detail or not. Absence of evidence is not evidence of absence—we simply do not know whether participants came to remember that they were about five when this false event occurred, as we did not ask them and cannot draw conclusions from their failure to mention it. In our replication study, we saw the role of the interviewers as encouraging participants to talk and so they asked an array of open-ended questions (e.g., ‘and can you picture what it was like in the shop? Who would have been with you?’ etc.). They were facilitators of the conversation, not examiners of memory detail. Interviewers were also expected to maintain the study's ruse at this point; as participants were unaware we were studying false memories, it was important not to interrogate participants to the point where they might question if the event really happened.

Many participants may not have actually stated that they got lost, because it was implied by the question. We note that media training often encourages interviewees to include the question in their answer, so that when extracted out of context in a soundbite the quote is more detailed (i.e., when asked when you will launch a product, rather than saying ‘December’, you might be encouraged to say ‘We will launch this product in December’). Training is required to learn to speak like this precisely because we do not naturally speak so repetitively in natural conversation. When coding memory, it is therefore important to remain cognisant of the specific prompts offered to an interviewee as clearly that gives context to what they do and do not say in their narration.

It is useful for researchers to reflect on the role of interviewers in rich false memory studies and to consider in advance what their approach will be. Decisions about the interview style and coding scheme ought to be made in unison (and ideally, preregistered), as the interviewer has such an influence on what the participant is likely to speak about. As we have discussed, it is difficult to move the goalposts after the fact and employ a detail-based scheme when the interviews were not set up to assess the presence or absence of those specific details. In our study, we found it useful to combine the natural (imperfect) conversation between participant and interviewer with some standardised questions that they answered during and after the event (Do you remember this event? How vivid is your memory? etc.) and to consider the resulting data in a holistic manner.

In our Lost in the Mall replication, we reported a top-line false memory rate of 35%, which is in line with the rates reported across a range of similar studies (see Scoboria et al. 2017 for a mega analysis of false memory implantation studies). Our position is certainly not to argue that any particular false memory rate is ‘correct’; as noted in our replication paper and in the above discussion, we advocate for the use of multiple coding methods, including self-report where appropriate. Just as importantly, we argue that memory reports should be evaluated holistically, with consideration of the context in which the reports were obtained (here, via a naturalistic conversation). We do not consider the use of reductive and over-simplistic count schemes to be a useful measure of memory (true or false) and reject the idea that memory prompts should be repeated back without alteration in order for a participant's recollection to be considered a memory.

The clinical and forensic implications of the Lost in the Mall study (and our replication study) remain clear and important. We note that the implantation methods used in these studies are fairly light-touch. Though the studies are enormously burdensome to conduct, involving contact with parents and multiple interviews and online surveys per participant, the actual manipulation of memory is quite mild. Participants are presented with a very short summary of a supposed event from their childhood and are asked to reflect on whether they remember it. That is all. As Scoboria and Mazzoni (2017) noted, this pales in comparison to the kind of memory distortion that might occur over years of suggestive therapy. We therefore respectfully submit that to quibble over the precise rate of false memory in a given study is essentially to miss the point regarding the potential harms to therapeutic patients (Wade et al. 2025).

Gillian Murphy: writing – original draft, conceptualization. Ciara M. Greene: conceptualization, writing – review and editing.

The authors have nothing to report.

The authors declare no conflicts of interest.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Applied Cognitive Psychology PSYCHOLOGY, EXPERIMENTAL-

CiteScore

4.30

自引率

8.30%

发文量

111

期刊介绍： Applied Cognitive Psychology seeks to publish the best papers dealing with psychological analyses of memory, learning, thinking, problem solving, language, and consciousness as they occur in the real world. Applied Cognitive Psychology will publish papers on a wide variety of issues and from diverse theoretical perspectives. The journal focuses on studies of human performance and basic cognitive skills in everyday environments including, but not restricted to, studies of eyewitness memory, autobiographical memory, spatial cognition, skill training, expertise and skilled behaviour. Articles will normally combine realistic investigations of real world events with appropriate theoretical analyses and proper appraisal of practical implications.