One of the reasons data journalism is such an interesting field is that there's so much you can learn and so much you can do. Over 10 weeks of intensive training, Lede provides a foundation for whatever kind of data journalist you want to be:
or anything in-between!
You can take a peek at 2023's content schedule here. The 2024 schedule will be similar, but certainly not identical.
Lede's curriculum changes each year, but here's a rough overview of how things typically shape up. The program is usually split into three sections:
While skills from one section carry over directly to the others, it's an easy way to mentally break down what you'll pick up.
Each year the Lede curriculum shifts and changes, but we always start with Python: a flexible, widely used programming language suited to everything from data analysis and web scraping to building data visualizations and web sites.
After you gain some basic Python skills, we level you up into working with APIs, a coding-focused method for computers to share data with each other. It's how your code might fetch the weather, compute travel times, or pull updates from Spotify or Twitter.
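To make that a little more concrete, here's a minimal sketch of an API request in Python, using the requests library to ask the free Open-Meteo weather API for current conditions (the exact response fields are illustrative and can change over time):

```python
import requests

# Ask Open-Meteo for the current weather in New York City
response = requests.get(
    "https://api.open-meteo.com/v1/forecast",
    params={
        "latitude": 40.71,     # New York City
        "longitude": -74.01,
        "current_weather": "true",
    },
)
data = response.json()
print(data["current_weather"]["temperature"])
```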
We then transition into data analysis with pandas, the adorably named Excel equivalent for Python (programmers: don't yell at me for making that comparison!), and Jupyter Notebooks, a popular programming environment for creating easily readable documents that are ideal for sharing your work with coworkers or editors.
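A tiny taste of what that looks like, assuming a hypothetical spreadsheet of restaurant inspections with "borough" and "violations" columns:

```python
import pandas as pd

# "inspections.csv" and its columns are hypothetical — swap in your own data
df = pd.read_csv("inspections.csv")
violations_by_borough = df.groupby("borough")["violations"].sum()
print(violations_by_borough.sort_values(ascending=False))
```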
Not all data is easily available, so our next stop is scraping: converting data from the web into an easy-to-analyze format. That might mean downloading the top stories from a news organization, extracting tow truck licenses from a state website, or pulling medical malpractice records from a health organization. Scraping is an important tool for journalists, so we make sure to cover not just the easy cases, but the tough ones, too (BeautifulSoup vs. Selenium/Playwright, if you know the vocabulary).
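For the easy cases, a scraper can be remarkably short. Here's a bare-bones sketch with requests and BeautifulSoup; the URL and the CSS selector are placeholders, since every site's HTML is different (that's half the work of scraping):

```python
import requests
from bs4 import BeautifulSoup

# Placeholder URL and selector — real scrapers start with inspecting the page
html = requests.get("https://example.com/news").text
soup = BeautifulSoup(html, "html.parser")

for headline in soup.select("h2.headline"):
    print(headline.get_text(strip=True))
```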
Finally, we settle down into a bit of statistics and machine learning/AI. Tackle six or seven hundred definitions of "average" and learn to read a hundred thousand documents without an army of interns!
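Why so many definitions of "average"? Because they can tell very different stories. A quick made-up example with pandas:

```python
import pandas as pd

# Made-up salaries: one outlier drags the mean far above the median
salaries = pd.Series([32_000, 38_000, 41_000, 45_000, 1_200_000])
print("mean:  ", salaries.mean())
print("median:", salaries.median())
```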
While the data sessions are mostly focused on sets of specific tools and skills, the visualization sessions are more focused on ideas and concepts. Things like structuring narratives, chart anatomy, visualizing relationships: no matter what tools you'll use to create the final product, these are the topics lurking in the background.
But! Everyone loves a list of tools, so let's look at what we learned in 2023:
...and yes, you'll likely encounter some scrollytelling, too!
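If you want a sense of what "chart anatomy" means in practice, here's a minimal matplotlib sketch with made-up numbers: a title that makes the argument, labeled axes, and a source line, no matter which tool eventually draws the final version:

```python
import matplotlib.pyplot as plt

# Made-up numbers, purely for illustration
years = [2019, 2020, 2021, 2022, 2023]
complaints = [410, 385, 522, 610, 580]

plt.plot(years, complaints)
plt.title("Noise complaints keep climbing")   # the headline does the arguing
plt.xlabel("Year")
plt.ylabel("Complaints filed")
plt.figtext(0.01, 0.01, "Source: hypothetical city data")
plt.show()
```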
GIS – geographic information systems – is an entire discipline focused on doing analysis and visualization with maps. Because it's such a wide-ranging field, the mapping lineup tends to vary more from year to year than the data and visualization sections do.
You might cover web-based mapping with MapBox, geographic analysis with QGIS, Python-focused geographic data wrangling with geopandas, and a hundred and one specific concepts: geocoding, projections, choropleths, and more.
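As a flavor of the geopandas side of things, here's a minimal sketch that loads a hypothetical neighborhoods shapefile, reprojects it, and draws a quick choropleth from a made-up "population" column:

```python
import geopandas as gpd
import matplotlib.pyplot as plt

# "neighborhoods.shp" and the "population" column are hypothetical
neighborhoods = gpd.read_file("neighborhoods.shp")
neighborhoods = neighborhoods.to_crs(epsg=3857)        # reproject for web-style maps
neighborhoods.plot(column="population", legend=True)   # a quick choropleth
plt.show()
```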