Montreal, Québec, Canada

Roberto Rocha

Data storyteller and educator


ChatGPT

Using ChatGPT to clean data: an experiment

April 23, 2023 | Roberto | 4 Comments

One of the most annoying parts of data work is dealing with inconsistent entities: names of the same person spelled differently, or company names that were rebranded, merged, or carry varying suffixes like “Ltd.” and “Limited”. Standardizing data for accurate analysis can take days, sometimes weeks, even with powerful tools like OpenRefine and Dedupe, which were made […]

Posted in Data Journalism Tags AI, ChatGPT, data cleaning
