Use third-party databases, but confirm findings and disclose the source

In my mail this morning: a note from a group called Good Jobs First. Lots of advocacy groups generate their own data. Often, it’s stuff you can’t easily get elsewhere, like violations of economic development agreements.

Use it, double-check what you come up with and always ID the source. Here, government development agencies might complain the data is biased. Disclosure is one answer to that.



April 13, 2017

To Our Journalist Friends:

We thought you would like to know about an excellent article by Stephen Koff in the Cleveland Plain Dealer that makes great use of the information in both of the Good Jobs First databases: Violation Tracker and Subsidy Tracker. Koff uses Violation Tracker to identify the most penalized companies in Ohio and also notes the subsidies each has received. The article can be found here.

If you’d like help using the Trackers to do a similar analysis for your state, contact Good Jobs First research director Phil Mattera at

Next week we will post an update of Violation Tracker containing data from two additional agencies: the Labor Department’s Wage and Hour Division (more than 30,000 cases going back to the beginning of 2010) and the Federal Communications Commission. The update will also include cases from all agencies during the first two months of the Trump Administration.

If you are working on a piece about the United Air Lines scandal, note that Violation Tracker contains data on cases brought by the Aviation Consumer Protection Division of the Transportation Department since 2010. United leads in total penalties from this agency.
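
If you want to try the kind of analysis the letter describes (rank the most penalized companies in one state, then note each one's subsidies), here is a minimal sketch. Everything specific in it is an assumption for illustration: the column names (`company`, `state`, `penalty`, `subsidy`), the company names and the dollar figures are made up, and the real Violation Tracker and Subsidy Tracker downloads may be laid out quite differently.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; a real export will have different columns and values.
VIOLATIONS_CSV = """company,state,penalty
Acme Corp,OH,1000
Acme Corp,OH,2500
Beta LLC,OH,500
Gamma Inc,PA,9000
"""

SUBSIDIES_CSV = """company,state,subsidy
Acme Corp,OH,4000
Gamma Inc,PA,100
"""


def totals_by_company(csv_text, state, amount_column):
    """Sum one dollar column per company, keeping only rows for the given state."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["state"] == state:
            totals[row["company"]] += float(row[amount_column])
    return dict(totals)


def most_penalized(violations_csv, subsidies_csv, state):
    """Return (company, total_penalty, total_subsidy) tuples, biggest penalties first."""
    penalties = totals_by_company(violations_csv, state, "penalty")
    subsidies = totals_by_company(subsidies_csv, state, "subsidy")
    ranked = sorted(penalties.items(), key=lambda item: item[1], reverse=True)
    # Companies with no matching subsidy row are reported with a subsidy of 0.
    return [(name, total, subsidies.get(name, 0.0)) for name, total in ranked]


if __name__ == "__main__":
    for name, penalty, subsidy in most_penalized(VIOLATIONS_CSV, SUBSIDIES_CSV, "OH"):
        print(f"{name}: penalties ${penalty:,.0f}, subsidies ${subsidy:,.0f}")
```

The match on company name is exact, which is where mistakes creep in: "Acme Corp" and "Acme Corporation" would be counted as two companies. That is one more reason to confirm the totals by hand before you publish.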


DIY abortions: Using Google search for data reporting

Sometimes, data journalists sneak over the line from using spreadsheets to find trends to sophisticated data analysis. On the simple-approach side, see this from the NYTimes Sunday Review story on Google searches and abortion:

In 2015, in the United States, there were about 119,000 searches for the exact phrase “how to have a miscarriage.” There were also searches for other variants — “how to self-abort” — and for particular methods. Over all, there were more than 700,000 Google searches looking into self-induced abortions in 2015.

For comparison, there were some 3.4 million searches for abortion clinics and, according to estimates by the Guttmacher Institute, there are around one million legal abortions a year.

The 700,000 searches included about 160,000 asking how to get abortion pills through unofficial channels — searches like “buy abortion pills online” and “free abortion pills.”

There were tens of thousands of searches looking into abortion by herbs like parsley or by vitamin C. There were some 4,000 searches looking for directions on coat hanger abortions, including about 1,300 for the exact phrase “how to do a coat hanger abortion.” There were also a few hundred looking into abortion through bleaching one’s uterus and punching one’s stomach.
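
The comparisons in that excerpt are simple division, the kind you can check in a spreadsheet or a few lines of code. Here is a rough sketch using only the figures quoted above. The Times gives approximations ("more than," "about," "some"), so treat the results as ballpark numbers; the `share` helper is just an illustrative name.

```python
# Figures quoted above from the NYT piece (all for 2015, United States).
SELF_INDUCED_SEARCHES = 700_000   # "more than 700,000", so treat this as a floor
CLINIC_SEARCHES = 3_400_000       # "some 3.4 million"
PILL_SEARCHES = 160_000           # "about 160,000", part of the 700,000


def share(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Self-induced abortion searches for every 100 searches for abortion clinics.
print(share(SELF_INDUCED_SEARCHES, CLINIC_SEARCHES))

# Share of the self-induced searches that asked about pills via unofficial channels.
print(share(PILL_SEARCHES, SELF_INDUCED_SEARCHES))
```

Keep in mind that these are counts of searches, not people. One person can search many times, so the ratios describe search volume and nothing more.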


#Jprofrant: Don’t quote from the newspaper! Use primary sources, not secondary sources.

When you quote another news source in your story, it means you lack the skills or drive as a reporter to get the material yourself. Many of my students quote statistics from The Boston Globe, or worse, marketing sites like BU Today or the Harvard Gazette.

Do this in the real world and your editor will roll her eyes, tell you to go to the source and mark it down in her notes about why you shouldn’t pass probation.

Part of the confusion here stems from the failure to understand the difference between blogging, where you collect and cite info from other sources, and reporting, where you collect info yourself.

Let me be a chair-throwing editor for a moment: IT’S JOURNALISM 101! GET THE DATA/QUOTE/INFO YOURSELF.

This is a problem even if you cite the source. If you don’t cite the source, then you’ve crossed into plagiarism.

Most of the time, it’s easy to find the source. Just figure out where the reporter got his or her data and get it there. Then learn where to look for primary sources for your next story. Examples of primary sources: academic journals, government reports, statistics from an independent source, databases. Here’s a great source of primary info. Also, try the library. Social media can also be a primary source, but it needs to be vetted. More on that later.

Examples of secondary sources: newspaper and magazine articles, marketing content, press releases and blog posts. Most of the stuff that comes up when you do a Google search. You want to read this stuff, but don’t quote it.

The only time you want to quote a newspaper article is when the article is the news. For example, anything Bill O’Reilly may have written about the Falklands War. That story is about a dispute over his description of his deeds there. Not a great idea to quote Brian Williams on Iraq either.

Which brings up another reason not to steal someone else’s reporting – they might have gotten it wrong, which means you can’t even pull numbers or data from a secondary source. Go to the original source. (Your correspondent once killed off a couple hundred people in NC by supplying a graphic artist with morbidity data and calling it mortality data.)

And, don’t even think about using someone else’s quotes. That reporter worked hard to get those quotes.  Mitch Albom got caught doing that and the joke was – Maybe it was Wednesdays with Morrie.

So, do your job, don’t lose it! Report, don’t copy.

Data journalism at the local level

CJR reports: 

Data has always played a role in local journalism, including on investigative desks and in computer-assisted reporting. But with the rise of data journalism as an increasingly influential and prominent subset of journalism, some regional news sites have developed local versions of FiveThirtyEight, exploring local issues through numbers, sometimes in a written story but often as some sort of graphical visualization. The Tennessean, The (Louisville) Courier-Journal, and The Denver Post are just some newspapers whose websites have a landing page that compiles the data behind some big projects. For the past two months, The (Cleveland) Plain Dealer’s Data Central has offered a mix of databases for readers to sift through as well as data analysis, such as comparing various taxes in the metropolitan area to other places. – See more at:

Data Visualization 101: How to use Google Sheets to make a simple pie chart

This is an update of a Knight Tutorial.

Go to Google Sheets


Click Blank to create a new spreadsheet. Note the following features:

  • The menu bar lets you select different commands to change your spreadsheet.
  • A cell is an individual square where you can double-click and type in information.
  • The cells are organized into rows (assigned numbers) and columns (assigned letters).
  • (This screenshot is from an older version of Google Sheets.)


Here’s what the 2017 edition looks like.



Make a pie chart

The pie chart is the most ubiquitous of charts. Here’s what it is and when to use it.

  • It is a circle divided into segments.
  • It should illustrate the relationship of the parts to a total.
  • The data may be numeric but it is usually displayed in percentages.
  • Never use more than five parts. If you have more than five subsets, consider a TreeMap.
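
Those rules are easy to check before you ever open the chart editor. Here is a minimal sketch that turns raw counts into the percentages a pie chart displays and refuses to go past five slices. The function name `pie_percentages` and the survey numbers are made up for illustration.

```python
def pie_percentages(parts, max_parts=5):
    """Convert raw counts to percentages of the total for a pie chart.

    parts: dict mapping a slice label to a non-negative number.
    Raises ValueError if there are more slices than max_parts
    or if the parts do not add up to a positive total.
    """
    if len(parts) > max_parts:
        raise ValueError(
            f"{len(parts)} slices is too many for a pie chart; consider a treemap"
        )
    total = sum(parts.values())
    if total <= 0:
        raise ValueError("the parts must add up to a positive total")
    return {label: round(100 * value / total, 1) for label, value in parts.items()}


# Hypothetical example: how 200 survey respondents say they get local news.
example = {"TV": 90, "Online": 70, "Print": 40}
print(pie_percentages(example))  # {'TV': 45.0, 'Online': 35.0, 'Print': 20.0}
```

Because each slice is rounded on its own, the percentages can occasionally add up to 99.9 or 100.1. If that happens, say so in a chart note and leave the numbers as they are.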

Save your spreadsheet and call it “Pie Chart.” You only have to do this once. The document will autosave as you make changes.

Fill in the data as shown below. Select cell A1 by clicking it once. Hold down the Shift key and click in cell B4 to select the range of data.
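
A selection that runs from cell A1 to cell B4 is usually written `A1:B4`. If the letters-and-numbers scheme is new to you, this small sketch works out how many rows and columns a range like that covers. The helper names `column_number` and `range_shape` are invented for this example and are not part of Google Sheets.

```python
import re


def column_number(letters):
    """Convert a column label such as 'A' or 'AB' to a 1-based column number."""
    number = 0
    for ch in letters.upper():
        number = number * 26 + (ord(ch) - ord("A") + 1)
    return number


def range_shape(a1_range):
    """Return (rows, columns) covered by a range written like 'A1:B4'."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+):([A-Za-z]+)(\d+)", a1_range)
    if match is None:
        raise ValueError(f"not a simple A1-style range: {a1_range!r}")
    start_col, start_row, end_col, end_row = match.groups()
    rows = abs(int(end_row) - int(start_row)) + 1
    columns = abs(column_number(end_col) - column_number(start_col)) + 1
    return rows, columns


print(range_shape("A1:B4"))  # (4, 2): four rows, two columns
```

So the selection in this tutorial is two columns wide and four rows deep, enough for a header row plus three rows of data or for four rows of data with no header.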


Click on “Insert” in the Google menu bar and select “Charts.” A new window will appear.

The “Chart editor” will recommend charts for your data. Under “chart type” choose pie chart. You can customize typeface, color, etc. Hit “Insert” and the chart will show up on your spreadsheet.

Click the menu (three little dots) on the upper right corner of the chart and choose “Publish chart.” (You also have the option to copy or save the chart as an image here.)

Get link or embed code.




Here’s what the pie chart looks like.