{"title":"Speaking Stata: Replacing missing values: The easiest problems","authors":"Nicholas J. Cox","doi":"10.1177/1536867x231196519","DOIUrl":null,"url":null,"abstract":"Missing values are common in real datasets, and what to do about them is a large and challenging question. This column focuses on the easiest problems in which a researcher is clear, or at least highly confident, about what missing values should be instead, implying a deterministic replacement. The main tricks are copying values from observation to observation and using the ipolate command. Both may often be extended simply to panel or longitudinal datasets or to other datasets with a group structure, such as data on individuals within families or households. This column includes how to satisfy constraints that interpolation is confined to filling gaps between values known to be equal or to observations moderately close to a known value in time or in some other sequence or position variable.","PeriodicalId":51171,"journal":{"name":"Stata Journal","volume":null,"pages":null},"PeriodicalIF":3.2000,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Stata Journal","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1177/1536867x231196519","RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"SOCIAL SCIENCES, MATHEMATICAL METHODS","Score":null,"Total":0}
引用次数: 0
Abstract
Missing values are common in real datasets, and what to do about them is a large and challenging question. This column focuses on the easiest problems in which a researcher is clear, or at least highly confident, about what missing values should be instead, implying a deterministic replacement. The main tricks are copying values from observation to observation and using the ipolate command. Both may often be extended simply to panel or longitudinal datasets or to other datasets with a group structure, such as data on individuals within families or households. This column includes how to satisfy constraints that interpolation is confined to filling gaps between values known to be equal or to observations moderately close to a known value in time or in some other sequence or position variable.
期刊介绍:
The Stata Journal is a quarterly publication containing articles about statistics, data analysis, teaching methods, and effective use of Stata''s language. The Stata Journal publishes reviewed papers together with shorter notes and comments, regular columns, book reviews, and other material of interest to researchers applying statistics in a variety of disciplines.