Data cleaning

  • Assessment
  • Implementation
  • Monitoring & Evaluation
  • Data value chain or data life cycle

Definition

The process of improving the quality of data by detecting and correcting or removing inaccurate records in a record set. The term specifically refers to detecting, modifying, replacing, or deleting incomplete, incorrect, improperly formatted, duplicated, or irrelevant records, otherwise referred to as “dirty data,” within a database.
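
Example

The operations named in the definition can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and does not come from the cited source: the record layout (name, email, age), the function name clean_records, the simplified email pattern, and the 0 to 120 age range are all hypothetical. It shows one possible way to modify, replace, and delete dirty records, not a standard procedure.

```python
import re

# Simplified email pattern (an assumption for this sketch, not a full validator).
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def clean_records(records):
    """Return a cleaned copy of `records`, a list of dicts with
    hypothetical keys 'name', 'email', and 'age'."""
    cleaned = []
    seen_emails = set()
    for rec in records:
        # Modify: normalize whitespace and letter case.
        name = " ".join((rec.get("name") or "").split())
        email = (rec.get("email") or "").strip().lower()
        age = rec.get("age")

        # Delete incomplete records (missing name or email).
        if not name or not email:
            continue
        # Delete improperly formatted records.
        if not EMAIL_RE.match(email):
            continue
        # Replace incorrect values: an implausible age becomes None.
        if not isinstance(age, int) or not 0 <= age <= 120:
            age = None
        # Delete duplicates, keeping the first record seen per email.
        if email in seen_emails:
            continue
        seen_emails.add(email)

        cleaned.append({"name": name, "email": email, "age": age})
    return cleaned


# Made-up sample data covering each kind of dirty record.
raw = [
    {"name": "  Ada Lovelace ", "email": "ADA@example.org", "age": 36},
    {"name": "Ada Lovelace", "email": "ada@example.org", "age": 36},     # duplicate
    {"name": "", "email": "nobody@example.org", "age": 40},              # incomplete
    {"name": "Alan Turing", "email": "alan(at)example.org", "age": 41},  # bad format
    {"name": "Grace Hopper", "email": "grace@example.org", "age": -5},   # incorrect
]
cleaned = clean_records(raw)
```

In this sketch, incomplete, badly formatted, and duplicated records are dropped, while an implausible value is replaced with None so the rest of the record is retained. Whether to delete or repair a given record is a design choice that depends on the data set.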

References

Allen, The SAGE Encyclopedia of Communication Research Methods, 2017