ScraperWiki Tries to Turn Journalists Into Hackers
More and more, regular people are learning how to program: Codecademy is drawing hundreds of thousands of students; hackathons have become a seemingly everyday happening across tech-savvy cities like San Francisco; and now some are even calling for children to learn to program in elementary school.
So, are journalists setting themselves up to get left behind by technology again?
Not this weekend, at least. At the offices of the San Francisco Chronicle, a hackathon called NewsHack Day kicked off yesterday with a day-long session aimed at teaching reporters how to scrape data for their stories.
Led primarily by ScraperWiki’s Thomas Levine, attendees got a crash course in how to find and pull down data, clean it up, and apply it to projects.
“Computers can do anything that a team of interns can do,” Levine said.
My project? Instantly scraping Securities and Exchange Commission filings for the “compensation tables” that detail what company execs get each year in salary and perks. Digging those tables out of multiple documents by hand is slow, but they often contain interesting insights.
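For readers curious what that kind of scraper might look like, here is a minimal Python sketch. It is not the code written at the event. It assumes the filing has already been downloaded as HTML, and it assumes a compensation table can be spotted by a header row that mentions both "salary" and "total", which real filings will not always satisfy. The names TableCollector and find_compensation_tables are hypothetical, the sample figures are made up, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Collect each <table> as a list of rows, each row a list of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []     # finished tables
        self._table = None   # table currently being read
        self._row = None     # row currently being read
        self._cell = None    # text fragments of the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._table = []
        elif tag == "tr" and self._table is not None:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self._table.append(self._row)
            self._row = None
            self._cell = None
        elif tag == "table" and self._table is not None:
            self.tables.append(self._table)
            self._table = self._row = self._cell = None


def find_compensation_tables(html, keywords=("salary", "total")):
    """Return tables whose first row mentions every keyword (a rough heuristic)."""
    parser = TableCollector()
    parser.feed(html)
    parser.close()
    matches = []
    for table in parser.tables:
        header = " ".join(table[0]).lower() if table else ""
        if all(word in header for word in keywords):
            matches.append(table)
    return matches


# Made-up sample standing in for a downloaded filing.
SAMPLE = """
<table><tr><th>Item</th><th>Page</th></tr>
<tr><td>Risk Factors</td><td>12</td></tr></table>
<table>
<tr><th>Name</th><th>Year</th><th>Salary ($)</th><th>Total ($)</th></tr>
<tr><td>Jane Doe</td><td>2011</td><td>500,000</td><td>750,000</td></tr>
</table>
"""

for row in find_compensation_tables(SAMPLE)[0]:
    print(row)
```

The sketch skips the table of contents and keeps only the table with the salary columns. A working tool would also need to fetch the filings and cope with the messy, multi-row headers that real documents use.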
Even if some of the drudgery of document-reporting can be outsourced to a computer, though, technical know-how isn’t enough to make a strong story.
Levine used a Department of Labor Web page that linked to unions’ collective bargaining agreements as part of his introductory lesson. But, while looking at the site, he said he didn’t know the difference between “private sector” and “public sector” unions.
T. Christian Miller, a senior reporter for ProPublica who attended the event, said the union example shows the need for journalistic “hacks” and tech-savvy “hackers” to work together in the future.
“Every journalist needs to know a little about coding, but it’s hard to expect full knowledge,” Miller said.
To remedy that knowledge gap, the journalists in attendance Friday will be joined over the weekend by designers and programmers to either flesh out their existing projects or start new ones.
NewsHack weekend organizer Michael Coren, a contributor to FastCompany and The Economist, said projects can be either “story hacks” to bring data into in-progress stories, or more general “news tools” that reporters can use on any number of stories.
So, the SEC scraper would be a “news tool,” because — if it works, and that’s a big if — it could scrape any company’s filings from any year for compensation info.
But what’s the bigger point of all this? Does the reporter of the future have to be equally adept at both writing and coding?
No, Coren said — the most important skill reporters can gain from NewsHack Day is being able to talk like a programmer. That means journalists can know what’s possible with technology and communicate what they want the tech experts to do.
Coren criticized newsrooms that segregate techies from journalists as part of the problem, singling out CNN as it operated when he worked there. He said the real goal of NewsHack weekend is not the projects but a sense of journalism as a community of people with different skills.
“If we come out with great code, I’m going to be happy, but at the end of the day, it’s two days,” Coren said. “The community is what’s sustainable.”