And other insights from 7 years of anonymous Wikipedia edits by government employees








Datawrapper is currently the best tool for creating quick, simple charts. It’s so useful and feature-rich that news organizations with their own in-house charting tools are switching over. One of its best features is the ability to connect a CSV file hosted on the web as a data source. This enables users […]
One of the final frontiers of data analysis is making sense of unstructured text like reports and open-ended responses in surveys. Natural language processing (NLP), with the help of AI, is making this kind of analysis more accessible. Libraries like spaCy and Gensim, although still code-based, are simplifying the process of getting insights out of […]
This is the text of the keynote speech I delivered at the 2018 Concordia Library Research Forum. It has been edited slightly. Thank you for this lovely opportunity to be among you this morning. I feel especially honoured to deliver this keynote because I believe librarians and journalists share a special kinship because our jobs […]
IMPORTANT UPDATE: This post is outdated now that AWS Lambda allows users to create and distribute layers with all sorts of plugins and packages, including Selenium and chromedriver. This simplifies a lot of the process. Here’s a post on how to make such a layer. And here’s a list of useful pre-packaged layers. This post […]
This post originally appeared on Medium. I was fortunate to take part in the Data Journalism Unconference hosted by Global Editors Network in New York this week. Attendees had the option of visiting two newsrooms for a “study tour” of their data teams. I chose the New York Times and ProPublica, two publications I admire. […]
Erowid.org is a website with an overwhelming amount of information about drugs. It curates scientific literature, legal documents, cultural uses, and user experiences on all kinds of substances. It’s well known among those who use psychoactive drugs for self-exploration. I like to call it Encyclopaedia Psychonautica. One of the more interesting sections of the website is the […]
Canada has trade relations with 224 countries and territories, with which it trades more than 5,500 types of products and services. In 2014, Canada imported $511 billion in goods, and exported $525 billion, according to data from Statistics Canada. But who are our main trading partners, and what kind of goods flow back and forth […]
Click here to see the map at Huffington Post Québec. First of all, a clarification. I did not really make that map. I adapted the code from Noah Veltman’s San Francisco history map, and made one for Montreal. Compare both maps, and you’ll see they are very similar in many ways. That said, the data sources […]
PyCon is the world’s biggest conference for Python programmers, with great talks for both veterans and newcomers. And every year, organizers publish videos of talks and workshops for free for all to enjoy. Here is my selection of videos from this year’s conference in Montreal that I believe are of value for journalists who use […]