Disclaimer: The Parliamentary Library does not warrant or accept liability for the accuracy or usefulness of the transcripts. These are copied directly from the broadcaster's website.
Media Report

Richard Aedy: Hello. I’m Richard Aedy. Welcome to the Media Report on RN. Today we’re doing something different. This program is about numbers, about quantities, about data. The amount of data in the world is growing exponentially. Scientists and other researchers are finding ways of grappling with huge amounts of information to improve our lives, especially our health. Companies are using data to work out what we buy and why we buy it, how we use their products and services so that they can persuade us to use them more. And politicians and their staffs and consultants are gleaning more insights about all of us—the better able to know our concerns and craft their positions and messages accordingly.

Of course Google and Facebook are at the centre of all this. Google knows what we’re looking for, what we care about, what we want. Facebook is collecting data all the time on every single user. If you’re on it remember you’re the product that they’re selling to advertisers and others. And journalism is increasingly becoming aware of the power of big data—that ability to harness huge amounts of information and find the stories that lie within. This has led to the development of what’s called data journalism. Cynthia O’Murchu is an investigative reporter at The Financial Times in London and an expert user of data journalism techniques. So, what is its power?

Cynthia O’Murchu: It’s the fact that you can find information that otherwise would be very difficult to find. I think that data journalism, or the ability to manipulate data (and when I say manipulate I mean analyse), is really a key skill for any journalist nowadays to have in their tool kit. Because what you can do is you can look behind the veil of statistics and information that is provided to you by, say, press releases. I mean I never really trust 100 per cent what comes out because, for example, if someone writes a press release they’ll always try and spin it the best possible way. But also you can go and root out information that is not really at anybody’s fingertips and I’m not just talking about statistics. You know it could be information that’s contained in letters. It could be information that you can put together from several different data sets to give you insights into what really goes on in the world.

Steve Doig: It’s been going on since arguably as far back as the mid-1960s.

Richard Aedy: Steve Doig is Professor of Journalism at Arizona State University.

Steve Doig: Very occasional people in the United States had started doing things like public opinion surveys or analysis of criminal justice patterns, things like that. But it was very spotty, just one or two. And it was the sort of the creation of micro computers, the first PCs and Apples and so on that got into the hands of normal people like me that allowed us to start doing more interesting things. So it really probably started taking off in the early 1980s.

Richard Aedy: In 1992 Steve Doig was working at The Miami Herald when this happened.

Reporter: We’re on the MacArthur Causeway and there’s a tree that’s down blocking the road. Just listen to this wind. I’m going to hold the microphone right up to the open window.

Steve Doig: It basically involved looking at the aftermath of Hurricane Andrew which hit South Florida in 1992, destroyed or badly damaged about 80,000 homes in Florida. The project I was involved with was looking to see whether the extent of the damage was sort of an act of God or our own stupidity. And by analysing the damage patterns I was able to show it was really our own fault. The sort of smoking gun we found was that the newer the home, the more likely it was to be destroyed.

Richard Aedy: So to do that you would’ve essentially had to kind of map two data sets onto one another.

Steve Doig: Yes. We pulled together several data sets. We got a data set of the damages. I matched that up with the property tax roll which gave information about the type of construction and the cost and the location and so on, plus the year of construction of every home that was damaged. And so looking at that I was able to sort of see that pattern of kind of the opposite of what you would expect; newer homes more likely to be destroyed.
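
The matching Doig describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names (`parcel_id`, `destroyed`, `year_built`) and the grouping by decade are invented here and are not the Herald's actual data or method.

```python
from collections import defaultdict

# Invented toy records, for illustration only.
damage_reports = [
    {"parcel_id": "A1", "destroyed": True},
    {"parcel_id": "A2", "destroyed": True},
    {"parcel_id": "A3", "destroyed": False},
    {"parcel_id": "A4", "destroyed": False},
    {"parcel_id": "A5", "destroyed": True},
    {"parcel_id": "A6", "destroyed": False},
    {"parcel_id": "Z9", "destroyed": True},  # no matching tax record
]
tax_roll = [
    {"parcel_id": "A1", "year_built": 1988},
    {"parcel_id": "A2", "year_built": 1985},
    {"parcel_id": "A3", "year_built": 1982},
    {"parcel_id": "A4", "year_built": 1965},
    {"parcel_id": "A5", "year_built": 1961},
    {"parcel_id": "A6", "year_built": 1968},
]


def destroyed_share_by_decade(damage_reports, tax_roll):
    """Join the two data sets on parcel_id and return, for each decade of
    construction, the share of inspected homes that were destroyed."""
    year_built = {row["parcel_id"]: row["year_built"] for row in tax_roll}
    totals = defaultdict(int)
    destroyed = defaultdict(int)
    for report in damage_reports:
        year = year_built.get(report["parcel_id"])
        if year is None:
            continue  # unmatched record: skip it rather than guess a year
        decade = (year // 10) * 10
        totals[decade] += 1
        if report["destroyed"]:
            destroyed[decade] += 1
    return {decade: destroyed[decade] / totals[decade] for decade in sorted(totals)}


rates = destroyed_share_by_decade(damage_reports, tax_roll)
```

In a real project the unmatched records would deserve their own count, since a high mismatch rate can distort the pattern.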

Richard Aedy: And were you able to work out why that was? Was it a matter of less stringent building codes?

Steve Doig: Right. Florida’s a place where hurricanes routinely cross. And we’ve always had, we thought, a very strong building code. But over the years it had been chipped away to make homes less expensive to build, under pressure from the builders and so on, to the point where we were allowing homes that shouldn’t have been built to be built. And those were the ones that got damaged the most.

Richard Aedy: Steve Doig actually won a Pulitzer Prize for this work, the Pulitzer for public service. And work like this, data journalism manifestly in the interest of the public has been going on in Australia too. This time the topic is not linked to destruction or building codes but it is something that most of us care a lot about.

Montage of opinions about My School: Parents have voted with their fingertips and said that they want My School…This data is accurate…Ranking schools from best to worst…I meet individual teachers every day in state schools who aren’t afraid of data and aren’t afraid of…damaging to students, damaging to schools…the financial data that the government was planning on publishing is misleading…there is no educational benefit for publishing the financial data of schools other than trying to play the politics of envy…I don’t believe anyone should be fearful of transparency…

Richard Aedy: When the government launched the My School website two and a half years ago, it was very controversial and much discussed. What it wasn’t, was particularly clear. It wasn’t particularly helpful either if by that you mean what parents are actually interested in. Justine Ferrari the national education correspondent at The Australian realised that for her newspaper this was an opportunity.

Justine Ferrari: The idea was to take the data that was publicly available on schools and their performance in national literacy and numeracy tests and present it in a more accessible way for parents and indeed for schools. I just know from my own circle of friends that My School, the official government’s website My School which has all these figures on it, is quite difficult to understand. I’m often being called and asked to explain the graphs and so forth. And they also limit how you can compare information on it on purpose. They don’t want you to do certain things. So if as a parent you are wanting to look at how your school is doing compared to other schools in your area that’s very difficult, cumbersome and time consuming and not necessarily very clear on the My School website. So The Australian’s Your School website we set out to present that information in a more accessible user friendly way.

Richard Aedy: We’ll hear more from Justine Ferrari shortly.

But to give you an idea of the breadth of what data journalism can do, a couple more examples. The first is a big complicated investigation into what turned out to be massive health care fraud in California. Steve Doig.

Steve Doig: We started this project because we were being given tips from people on the inside that, you know, something bad was going on with a particular chain of hospitals in terms of fraud.

Richard Aedy: And what was going on?

Steve Doig: What they were doing was basically turning common conditions into exotic ones that…and the government pays more for treating. The funny example really is a malnutrition condition. There’s a thing called kwashiorkor which is a protein deficiency that typically is seen only in starving African infants, the little swollen-bellied children that you see, yet they were finding hundreds of cases among elderly California patients in this chain of hospitals, far above the rate of any other hospitals in California. It’s a thing called up-coding where you basically take something normal, you turn it into something exotic and the government pays several thousand dollars more for when they have a case of that. So if you have hundreds of cases that have been up-coded like that a hospital can make hundreds of thousands or millions of dollars more from the government.
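
One way to look for the pattern described here is to compare each hospital's rate of a diagnosis with the pooled rate at all the other hospitals. The sketch below is an assumption about how such a check might look, not the method used in the investigation. The hospital names, the counts and the five-fold threshold are all invented.

```python
# Invented counts, for illustration only.
hospitals = [
    {"name": "Hospital A", "cases": 2, "discharges": 20_000},
    {"name": "Hospital B", "cases": 1, "discharges": 15_000},
    {"name": "Hospital C", "cases": 3, "discharges": 25_000},
    {"name": "Hospital D", "cases": 120, "discharges": 18_000},
]


def flag_outliers(hospitals, multiple=5.0):
    """Return the names of hospitals whose diagnosis rate is more than
    `multiple` times the pooled rate of all the other hospitals.
    The threshold is an arbitrary, hypothetical choice."""
    flagged = []
    for hospital in hospitals:
        others = [h for h in hospitals if h is not hospital]
        other_cases = sum(h["cases"] for h in others)
        other_discharges = sum(h["discharges"] for h in others)
        if hospital["discharges"] == 0 or other_discharges == 0:
            continue  # no meaningful rate to compare
        own_rate = hospital["cases"] / hospital["discharges"]
        other_rate = other_cases / other_discharges
        if own_rate > multiple * other_rate:
            flagged.append(hospital["name"])
    return flagged


suspicious = flag_outliers(hospitals)
```

A flag of this kind is a lead to report out, not proof of fraud: a specialist unit or an unusual patient mix could also produce a high rate.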

Richard Aedy: But not every investigation is as massive or time consuming. Cynthia O’Murchu at The Financial Times.

Cynthia O’Murchu: One of my favourite stories that I did was looking at inspection records in the North Sea and this was in the wake of the BP Macondo well blowout and I was just curious what the record was in terms of inspections in the North Sea. And all it was, was about 50 to 100 letters by inspectors who were looking at various companies and citing them for violations of particular rules. And that wasn’t the big data set. That was maybe a hundred lines in an Excel spreadsheet. And all I did was categorise according to which rule was broken or which guideline was not kept to. And it showed that in two, I think it was two different years half of the inspections that were done of BP platforms showed that they hadn’t completely adhered to guidelines on oil spill training. And as I say that wasn’t difficult to do. It didn’t require a lot of technical expertise. It was about 100 lines in an Excel spreadsheet that you then filter and do percentages on. Very simple.
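
The same categorise, filter and percentage steps can be written outside a spreadsheet. A minimal sketch, assuming one row per inspection with a list of the issues cited. The operators, years and issue labels below are invented and do not come from the inspection letters.

```python
from collections import Counter

# Invented rows, for illustration only: one row per inspection.
inspections = [
    {"operator": "Operator A", "year": 2009, "issues": ["oil spill training"]},
    {"operator": "Operator A", "year": 2009, "issues": []},
    {"operator": "Operator A", "year": 2009,
     "issues": ["oil spill training", "maintenance backlog"]},
    {"operator": "Operator A", "year": 2009, "issues": ["maintenance backlog"]},
    {"operator": "Operator B", "year": 2009, "issues": ["maintenance backlog"]},
]


def percent_with_issue(rows, operator, year, issue):
    """Filter to one operator and year, then return the percentage of
    those inspections that cited the given issue (None if no rows match)."""
    relevant = [r for r in rows if r["operator"] == operator and r["year"] == year]
    if not relevant:
        return None
    hits = sum(1 for r in relevant if issue in r["issues"])
    return hits * 100 / len(relevant)


# How often each category of issue appears across all inspections.
issue_counts = Counter(issue for row in inspections for issue in row["issues"])

share = percent_with_issue(inspections, "Operator A", 2009, "oil spill training")
```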

Richard Aedy: Cynthia, as I mentioned, is an expert data journalist. The sort who’s invited to talk about her work at conferences. But when Justine Ferrari started Your School at The Australian she was very aware that she wasn’t an expert, she hadn’t even heard of data journalism and she was on her own.

Justine Ferrari: It started with me. When My School first came out I said to the paper we could do this. We’re the only national newspaper, and the other papers do various things but it’s all state based so we’re the paper that could give comprehensive national coverage of this and build a database for our readers. It’s taken a bit of time. We didn’t have the capability at first because our website was quite new a few years ago. It’s come on in leaps and bounds since. So it was me and then I worked…we had an expert consultant, Peter Knapp from Academic Assessment Services who works with schools on how to analyse data. He’s an expert in data measurement and he worked with us on designing the website so what was meaningful ways of comparing numbers, what were meaningful ways of averaging numbers, that sort of thing. And then we had a web designer, Jamie Ferguson who’s a sensational web designer, it looks so much better than My School. And a multimedia guy which he says is his official job title, Ryan O’Connell who built the actual website. And he did that quite quickly in the end because he was very busy and there’s only one of him on the paper because it’s all quite new in newspapers—this idea of data journalism.

Richard Aedy: Yes.

Justine Ferrari: And that was essentially the team and there were other…we also had an external company that actually mined the data for us from the My School website because they…

Richard Aedy: I was going to ask about that…

Justine Ferrari: …wouldn’t give it to us.

Richard Aedy: Justine Ferrari.

This is the Media Report on RN. I’m Richard Aedy and this week we’re looking at the still-emerging phenomenon of data journalism. As you’ve heard it can be effective even powerful. But it’s not without its problems. Steve Doig, Professor of Journalism at Arizona State University has been one of the pioneers and as he explains it’s the little things that can trip you up if you’re not careful.

Steve Doig: A lot of times data, we call it dirty data. But basically it’s you know… the data often has been gathered for bureaucratic reasons so they are not as you know, careful about things like spelling of names, or things like that. An example in the United States; we have campaign finance data where all politicians who take any money to run for office have to record all the money that’s coming in and how they spend it. But the names and occupations and so on of the givers, they’re misspelled often and so on. So a lot of clean up is necessary to be able to, you know, do any real analysis. In other words if attorney and lawyer and atty and a whole variety of other things are used for attorneys as the occupation, you need to standardise those.
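
The clean-up step Doig mentions, standardising variants of the same occupation, might look like the following. It is a sketch only: the alias table and the sample values are made up, and a real campaign finance file would need a much longer table built by inspecting the data.

```python
from collections import Counter

# Hypothetical alias table mapping known variants to one canonical label.
OCCUPATION_ALIASES = {
    "lawyer": "attorney",
    "atty": "attorney",
    "attorney at law": "attorney",
}


def standardise_occupation(raw):
    """Lowercase, trim and collapse whitespace, drop a trailing full stop,
    then map known variants to a single canonical label."""
    cleaned = " ".join(raw.lower().split())
    cleaned = cleaned.rstrip(".")
    return OCCUPATION_ALIASES.get(cleaned, cleaned)


# Invented sample values, for illustration only.
raw_occupations = ["Attorney", "lawyer", " ATTY. ", "Atty", "Teacher"]
counts = Counter(standardise_occupation(value) for value in raw_occupations)
```

Without the standardising step the four attorney entries would be counted as four separate occupations.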

Richard Aedy: At The Australian, Justine Ferrari was grappling with a similar problem. She wanted Your School to make comparisons easy but with each state and territory running schools its own way, it’s hard to compare apples with apples.

Justine Ferrari: So you know high school is different in some states. In some…in a lot of states it starts in year 7, in other states it starts in year 8. And in some schools…like there are schools in Melbourne that are very high-performing and academically selective that only start at year 9, so they don’t have data for those missing years. So you have to figure out how you communicate that to your readers.

Richard Aedy: And that’s not all. She found there were other problems and limitations too.

Justine Ferrari: The database is so big; there’s 10,000 schools in the country. And they each have at least two years of scores for two grades of school. In fact there’s four years and that was another problem we came to. So just actually managing the database was huge, and making sure that it’s accurate so the external company we used double-entered it. I think they employed cheap labour in China or somewhere to physically cut and paste the figures from the website into an Excel spreadsheet. So it’s very time consuming, very labour intensive. And then trying to sort through those…Excel’s not very reliable with big data sets so that was a problem. I mean what I was really hoping to do this year—because we also did a newspaper report accompanying the launch of the website analysing the data further in the old fashioned newspaper way—and one of the things we were hoping to do on the website in that was look at how much schools have improved since the tests began four years ago. But we weren’t confident of the accuracy of the data we had mined for those early years. There were some discrepancies. So we didn’t go ahead with that. We erred on the side of caution there.
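
Double entry only helps if the two passes are then compared. A minimal sketch of that comparison, assuming each pass is stored as a mapping from (school, year, grade) to a score. The schools and scores are invented, and this is not a description of what the external company actually did.

```python
def find_discrepancies(first_pass, second_pass):
    """Compare two independently entered copies of the same figures.
    Returns a list of (key, first_value, second_value) for every key
    where the values differ or one pass is missing the entry."""
    problems = []
    for key in sorted(set(first_pass) | set(second_pass)):
        a = first_pass.get(key)
        b = second_pass.get(key)
        if a != b:
            problems.append((key, a, b))
    return problems


# Invented figures, for illustration only. Keys are (school, year, grade).
first_pass = {
    ("School X", 2008, 3): 410,
    ("School X", 2008, 5): 480,
    ("School Y", 2008, 3): 395,
}
second_pass = {
    ("School X", 2008, 3): 410,
    ("School X", 2008, 5): 408,  # transposed digits
    ("School Z", 2008, 3): 402,  # school name keyed differently
}

problems = find_discrepancies(first_pass, second_pass)
```

Every flagged entry then has to be checked against the source by hand, which is where most of the labour goes.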

Richard Aedy: Justine Ferrari. There is another issue. It’s not one Justine was particularly affected by but many journalists are. Maths. Before she took on education Justine Ferrari worked as a medical reporter and in that capacity she became familiar with confidence intervals and some statistical measures but for too many journalists words are their thing whereas maths is something they stopped doing when they left school. Steve Doig who teaches data journalism sums up the numeracy of much of the craft in one word.

Steve Doig: Horrible. Innumeracy, you know the inability to add successfully is unfortunately a common trait. You know, my theory is sort of everybody has 12 chances to hit their bad math teacher and at some point or other you do and then you’re left with this feeling well I must be a word person not a numbers person. And I tell my students sorry but to be a good journalist you have to be able to do math. Their faces fall but I tell them it’s basically elementary school math. Add, subtract, multiply, divide. Though I do have to say I was called professor years before I became one because I could do percentage change. You know in my newsroom. You know I would have Pulitzer winners coming to say does the big number go on the top or the little number? So it’s scary how innumerate many journalists are. But to do decent data journalism you have to at least be comfortable with percentage change.
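
For the question of which number goes on top: the change (new minus old) goes on top and the old, starting value goes on the bottom. A short sketch, with made-up figures:

```python
def percentage_change(old, new):
    """Percentage change from old to new: (new - old) / old * 100.
    The starting value is always the denominator."""
    if old == 0:
        raise ValueError("percentage change from zero is undefined")
    return (new - old) * 100 / old


rise = percentage_change(80, 100)   # 25.0
fall = percentage_change(100, 80)   # -20.0
```

The two results differ in size because the starting value differs: a rise from 80 to 100 is 25 per cent, while a fall from 100 back to 80 is 20 per cent.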

Richard Aedy: Nicolas Kayser-Bril is an independent data journalist and consultant in Europe. He meets a lot of non-data journalists and agrees that the level of numeracy is poor. As an example he quotes the reporting in a German newspaper of research into multiple sclerosis. According to the paper the risk of contracting the illness doubles if you work at night. But that isn’t the whole story.

Nicolas Kayser-Bril: That’s a very common mistake that journalists do, is that they talk of relative risks instead of absolute risks. So in this case it’s true that the risk of multiple sclerosis doubles in the sense that the risk goes, when you work at night, the risk of having multiple sclerosis goes from 1 in 10,000 to 2 in 10,000 so it doubles. It goes from 1 up to 2 but it’s still 2 in 10,000 so not much. And that’s typical; that’s typical journalism. And it’s true that it makes for better headline to say the risks double than to say a risk is 2 in 10,000.
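
The distinction can be worked through with the figures quoted above, 1 in 10,000 and 2 in 10,000. The function name is an illustrative choice, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def compare_risks(baseline, exposed):
    """Return (relative risk, absolute risk increase) for two risks."""
    relative_risk = exposed / baseline
    absolute_increase = exposed - baseline
    return relative_risk, absolute_increase


baseline = Fraction(1, 10_000)  # risk quoted for people not working at night
exposed = Fraction(2, 10_000)   # risk quoted for people working at night

relative_risk, absolute_increase = compare_risks(baseline, exposed)
# relative_risk is 2: "the risk doubles"
# absolute_increase is 1/10000: one extra case per 10,000 people
```

Both numbers are true. A careful story reports the absolute figure alongside the relative one.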

Richard Aedy: Nicolas Kayser-Bril. Earlier you met Cynthia O’Murchu the investigative journalist at The Financial Times. Financial journalists are the most numerate in the craft. But she says that these days all journalists need to be competent when dealing with numbers.

Cynthia O’Murchu: Well they have to be. I mean the world revolves around numbers as well. I mean obviously it depends on what kind of journalism you do. But I think anything where you’re trying to get a little bit deeper down—and even just to understand how it works. I mean I’m not necessarily saying that you know the…okay, fine there might be people who just don’t really have a knack for it. If you look at basic statistics for newsrooms, you have to be able to quantify if you know there’s a percentage change and you cannot make mistakes there otherwise you look silly.

Richard Aedy: You do. Working out a percentage change is basic arithmetic. It’s year 7 stuff. All journalists in every newsroom ought to be able to do it. Cynthia O’Murchu has some other minimum requirements.

Cynthia O’Murchu: At least be aware of what can be done and on a particular beat know what kind of data is out there, what kind of information could be made into stories or what kind of questions one might ask and then work with someone who can do the analysis.

Richard Aedy: The problem has been something I alluded to earlier; most journalists stop maths when they finish school, many stop as soon as the school will let them. And just like everybody else whose job hasn’t required them to be mathematical, over the years they forget what they’ve learnt. That means to become a data journalist you need training.

Steve Doig: I have my students get comfortable with Excel because it gives you a chance to at least get the concept of taking data, sorting it, filtering it, counting it, doing simple math like averages and medians and so on. You know, there’s a whole range of tools beyond Excel, and for those that are ready to do more things I also teach them statistical concepts. There are organisations also in the US like Investigative Reporters and Editors that have very active training programs for journalists who are already out in the profession. So journalists who are interested in learning these skills, they have a variety of places to go to get it.
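
The spreadsheet operations listed here (sort, filter, count, average, median) have direct equivalents in Python's standard library. A minimal sketch on invented figures:

```python
import statistics

# Invented rows, for illustration only.
rows = [
    {"name": "A", "value": 40},
    {"name": "B", "value": 42},
    {"name": "C", "value": 45},
    {"name": "D", "value": 47},
    {"name": "E", "value": 300},
]

# Sort: largest value first.
ranked = sorted(rows, key=lambda row: row["value"], reverse=True)

# Filter and count: how many rows have a value of at least 45?
at_least_45 = [row for row in rows if row["value"] >= 45]
count_at_least_45 = len(at_least_45)

# Average and median of the same column.
values = [row["value"] for row in rows]
mean_value = statistics.mean(values)
median_value = statistics.median(values)
```

Here the single large value pulls the mean well above the median, which is why it helps to look at both before writing "the average".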

Richard Aedy: Unsurprisingly Cynthia O’Murchu is right up to speed on that. And she says you can learn enough to get started in quite a short time.

Cynthia O’Murchu: We’ve, through the Centre for Investigative Journalism here in London, we’ve done a lot of courses for people to learn for example Excel which I personally was very very surprised that most journalists don’t know how to use it. Because it’s such a basic tool; it really doesn’t take that much to learn. I mean we do these two-day courses for beginners and a lot of people you know, they walk away with very very good skills. You know beginner skills but they really can go far.

Richard Aedy: This is the Media Report on RN. I’m Richard Aedy and we’re doing a special edition of the show this week. It’s all about data journalism. Data means quantities. Quantities mean numbers and unless you get it right, numbers mean readers turning off. So part of data journalism is something called data visualisation, taking the numbers and presenting them in an interesting and easily understood way. When this is done well it’s simple, elegant and really powerful. Cynthia O’Murchu says the key is that it must be relevant.

Cynthia O’Murchu: I think you can go very far with it. I think with data visualisation again you made the very good point; it’s about telling a story. And I do see quite often data visualisations that are just there to look pretty or just there for the sake of it and I think for data visualisations there has to be a reason to tell the story. And there has to be a story to be told. And I think where it becomes very powerful is once you can make it interactive. So the user can manipulate and do their own queries on the data, or change the visual or click into particular segments of the data to find out more. And there’s some fantastic examples out there.

Richard Aedy: There are, but for Nicolas Kayser-Bril data visualisation is only the beginning. He argues that journalists need to be much more imaginative in their story telling.

Nicolas Kayser-Bril: My basic point is to say look at all the other media like movies and video games. I mean they have become a huge industry now. And look at what they’ve done with technology in the past 20 years.

Movie excerpt: Sam, are you okay? …What am I going to do? Listen. I’m going to get you out of there…I promise…

Nicolas Kayser-Bril: On the other hand look at what journalism has done. I mean we still have texts on screens and videos and images which is nice but it’s not that different from what was done on traditional media like television, radio and in print. And what I’m arguing is that we have to work more closely with developers and designers to invent more exciting ways of presenting the news.

Richard Aedy: I think that journalists haven’t really thought about that at all and yet the enormous rise of the game development industry in the last 10 or 20 years shows that you can put something out there that’s enormously involving for the person playing the game and if journalists could do something similar many more people would be way better informed.

Nicolas Kayser-Bril: Exactly. I think we as journalists have to think about how…how we feel when we play video games. I mean most of the journalists now grew up with video games anyway. We have to think about how exciting it can be and at the same time when we discuss current affairs and politics it can be also very exciting. It doesn’t have to be boring, and now that we as journalists again have to compete for attention we really have to make news consumption more exciting. And I think it’s a great challenge and it doesn’t have to lead to a dumbing down or whatever criticism journalists can find…

Richard Aedy: But that’s further away. If we come back to where data journalism is now, some of it, like the California example you heard earlier is old-fashioned investigative stuff harnessing new tools. And some of it like The Australian’s Your School uses these tools to make publicly available information much more useful. And for many newspapers that isn’t what they’ve traditionally done. Justine Ferrari.

Justine Ferrari: It’s challenging as a newspaper. I mean I’m a diehard print journalist. I’ve only ever worked in newspapers. So you grapple with the idea, is this journalism? Or aren’t we just sort of you know repeating what’s already available? But I think it is journalism because even though we’re not interpreting it, the information, ourselves because the readers select what information they want to look at, we’re presenting it in a more accessible way. I mean it’s almost like writing with pictures in a way. Because it’s graphical, it’s interactive. But readers get to select what they want to look at as opposed to me deciding what’s of the most interest to the general reader.

Richard Aedy: For Nicolas Kayser-Bril it’s very clear. Data journalism is journalism. It has enormous potential and it’s up to journalists to make it work. There’s a long way to go but he’s an optimist.

Nicolas Kayser-Bril: Yes I’m extremely excited and I think there are many many opportunities to be taken in this changing environment. One of the big problems though is that in journalism schools…we do a lot of training in journalism schools with my company and we realise that the awareness as you were saying, the awareness of this change is really not high, and journalists have to become more entrepreneurs to really take these opportunities.

Richard Aedy: So given that it’s early days what do readers think of it so far? Justine Ferrari has been thinking about this a lot.

Justine Ferrari: Generally it was very positive. People found it very easy to use. They found it a lot easier to understand than the government website. And they shared it, I mean we had, you know, sort of posts on and sharing on Facebook and Twitter and so forth. So in the main I think we got some very positive feedback, you know, emails, people taking the time to write quite extensive emails about their frustration over the years of the lack of data, the lack of information when it comes to schools and being able to have some basis on which to compare schools. I mean Julia Gillard has said this herself—and this was one of the reasons she said that motivated her to implement this reform—was that parents choose schools basically on the basis of word of mouth. And it’s almost sort of superstition or, you know, bones in the wind, or something. It’s…there’s not a lot of hard evidence available to parents. So that seemed to be appreciated and that it was so easy to use.

Richard Aedy: So what do you think was best for the reader? Was it your, I’m sure very well researched, interpretive articles or was it the actual I suppose visual representation of data?

Justine Ferrari: I got the feeling there were…I haven’t…we haven’t investigated this in detail but I have the feeling that there were possibly two types of readerships. There were the ones that just read the traditional newspaper interpretations and then the ones that maybe already read the paper online so saw the website separately. And I think that experience gives readers a more individual approach to their information which you can’t provide in a newspaper.

Richard Aedy: No.

Justine Ferrari: You know, and being national we have to give a very broad overview that even people in individual states might find too large. So to be able to actually look at the information that just concerns you. So it’s a very tailored news experience.

Richard Aedy: You can make it exactly what you wanted.

Justine Ferrari: Yes.

Richard Aedy: So would you and the paper…are you going to continue to do it?

Justine Ferrari: Yes. We’re committed to it. We intend to do it every year, to update the information every year as the new schools become available. It is a new area of journalism that we’re only just starting to explore.

Richard Aedy: Justine Ferrari is the National Education Correspondent at The Australian. You also heard from Steve Doig, Professor of Journalism at Arizona State University, Cynthia O’Murchu from The Financial Times and Nicolas Kayser-Bril, the independent data journalist in Europe. We spoke to Cynthia and Nicolas via Skype. You’ll find more information at our website.

Today’s show was produced by Kyla Slaven. Our sound engineers are Judy Rapley and John Diamond. I’m Richard Aedy. Thanks for joining me.