How to extract entities from raw text with spaCy: 3 approaches using Canadian data

TL;DR: Use the en_core_web_trf transformer model with spaCy to get much more accurate named entity recognition with multilingual text. Entity recognition is one of the marvels of current technology, at least from a journalist’s perspective. There was a time when journalists had to read through hundreds, maybe thousands of documents, highlight names of people, companies and […]

Getting tabular data from unstructured text with GPT-3: an ongoing experiment

One of the most exciting applications of AI in journalism is the creation of structured data from unstructured text. Government reports, legal documents, emails, memos… these are rich with content like names, organizations, dates, and prices. But getting them into a format that can be analyzed and counted, like a spreadsheet, usually involves days […]

4 ways to make self-updating Datawrapper charts

Datawrapper is currently the best tool for creating quick and simple charts. It’s so useful and feature-rich that news organizations that had their own in-house charting tools are switching over. One of its best features is the ability to connect a CSV file hosted on the web as a data source. This enables users […]

Using NLP to analyze open-ended responses in surveys

One of the final frontiers of data analysis is making sense of unstructured text like reports and open-ended responses in surveys. Natural language processing (NLP), with the help of AI, is making this kind of analysis more accessible. Libraries like spaCy and Gensim, although still code-based, are simplifying the process of getting insights out of […]

It’s foolish to rank psychedelic drugs, but I’m going to try anyway

Erowid.org is a website with an overwhelming amount of information about drugs. It curates scientific literature, legal documents, cultural uses, and user experiences on all kinds of substances. It’s well known among those who use psychoactive drugs for self-exploration. I like to call it Encyclopaedia Psychonautica. One of the more interesting sections of the website is the […]

A look at Canada’s international trade through maps and charts

Canada has trade relations with 224 countries and territories, with which it trades more than 5,500 types of products and services. In 2014, Canada imported $511 billion in goods, and exported $525 billion, according to data from Statistics Canada. But who are our main trading partners, and what kind of goods flow back and forth […]