Wanna skip the blabla and get right to the code? Access the Colab notebook here. I recently took a short course on DeepLearning.AI called Pair Programming with LLMs, where you learn how to use Google’s PaLM 2 language model to help write, debug, and explain code within a Jupyter Notebook environment. Well, PaLM is old news, […]
How to use ChatGPT Vision to turn handwritten forms into data
Takeaways: ChatGPT can turn handwritten forms into data, even with sloppy handwriting. Defining a schema of the desired output helps. It makes mistakes. Output still needs to be validated and possibly fixed by hand. Can’t be automated with the API yet. Still need to manually upload images to the web application. Limit of four images per upload […]
Using ChatGPT to clean data: an experiment
One of the most annoying parts of data work is dealing with inconsistent entities: names of the same person spelled differently, or companies that rebranded, merged, or use varying suffixes like “Ltd.” and “Limited”. Standardizing data for accurate analysis can take days, sometimes weeks, even with powerful tools like OpenRefine and Dedupe, which were made […]
Getting tabular data from unstructured text with GPT-3: an ongoing experiment
One of the most exciting applications of AI in journalism is the creation of structured data from unstructured text. Government reports, legal documents, emails, memos… these are rich with content like names, organizations, dates, and prices. But getting them into a format that can be analyzed and counted, like a spreadsheet, usually involves days […]