Senate Select Committee on Health
Health policy, administration and expenditure

BOYD, Associate Professor James Hutchison, Director, Centre for Data Linkage, Curtin University

FERRANTE, Associate Professor Anna Maria, Deputy Director, Centre for Data Linkage, Curtin University

KEARNEY, Professor Brendon, Chairman, Population Health Research Network

REDMAN, Professor Sally, Chief Executive Officer, Sax Institute

SMITH, Dr Merran Beckley, Chief Executive, Population Health Research Network

WELLS, Mr Robert William, Deputy Chief Executive Officer, Sax Institute

CHAIR: Thank you very much for joining us, and welcome to our second roundtable for today. Do you have anything to say about the capacity in which you appear before the committee?

Dr Smith : I am the chief executive of the Population Health Research Network, which is a research infrastructure network building data linkage infrastructure for Australia.

Prof. Kearney : I am the chair of the Population Health Research Network, and I also work as a clinician at the Royal Adelaide Hospital.

Prof. Boyd : I am with the Centre for Data Linkage in Curtin University, which is a node of the Population Health Research Network.

Prof. Ferrante : I am deputy director of the CDL.

Prof. Redman : I am the CEO of the Sax Institute, and we are the custodians for the Secure Unified Research Environment facility, which is funded through the PHRN.

Mr Wells : I am the deputy chief executive officer of the Sax Institute, and I oversee, specifically, the work of the Secure Unified Research Environment, which is funded through the PHRN, as part of my duties.

CHAIR: Thank you very much. I understand that you have provided us with some documentation. Senator Moore, would you move that we receive that.

Senator MOORE: Sure.

CHAIR: Thank you. We might commence with Professor Kearney for an opening statement.

Prof. Kearney : Thank you very much for the invitation, and thank you also for your interest in this area of data linkage. Briefly, the Population Health Research Network has been in existence for just over five years. It is an NCRIS funded project, so our funds come from the Commonwealth government through the Department of Education and Training, and our mission over the past five years has been to build a national network that provides for privacy-preserving, highly accurate data linkage. In that time, we have established networks or linkage facilities in each of the states and territories and three national facilities, two of which are represented here today—and I will talk more about it. They are the Centre for Data Linkage at Curtin University and the SURE facility, which is a remote-access facility at the Sax Institute enabling high-quality data to be accessed by researchers remotely. There is also the Australian Institute of Health and Welfare.

Having completed that task—the infrastructure has been built—we are now undertaking a strategic review as to how to improve and use the concept of privacy-preserving, highly accurate data linkage more widely within the community. Some of the things that we are looking at in that strategy are not only to further enhance the process for researchers to access these services—and there are still a lot of problems to overcome to enable more timely and ready access to those services—but also that governments have become aware that data linkage of this kind can be quite useful to them. Also, with industry, I heard a little bit about your interest in pharmaceuticals and post-market surveillance, and this kind of data linkage is ideally suited to support those kinds of developments.

The other issue in my opening statement is that we realise this facility has been built around health services, but in our consultation over the strategic plan people see its use as much broader than that, and in particular as relating to the social determinants of health—so linking health information with educational information, justice information and human and social services information. We think that the future lies in that direction.

We are also particularly keen to complete the strategic plan because of the innovation environment around data linkage that exists at the moment. With the recent innovation statement around 10-year funding and presumably more to come from the Clark review, it is important that, amongst all the bids for these kinds of scarce funds, data linkage for population health be amongst those bids in innovation.

The only other thing that I might say is that I also heard what was said around the issue of enduring linkages, and that is something that we need to pursue or would like you to pursue, in the sense that it is variable at the moment. At the moment, the Commonwealth Department of Health has a policy that does not support enduring linkages. But, in the Department of Veterans' Affairs, which is closely aligned to the Department of Health, their review of veterans and use of data linkage of the kind that I am talking about is an enduring linkage, and there are safeguards with regard to that. Some state and territory governments are concerned about enduring linkages as well. But we think that is an efficient and appropriate development to occur for better use of linked data.

I should mention the issue of the highly accurate data linkage that we are talking about versus the general availability of linked data that is aggregated, such as the MBS and the PBS, which is provided to the states and territories at the moment. That can have some good uses, but, in terms of influencing health policy, the detailed and accurate outcomes require this privacy-preserving linked data approach.

Finally, we could talk about some of the proof-of-concept projects that we have undertaken—one on immunisation and one on hospital care—where they have shown the ability to technically link this kind of data through CDL and also to have an influence on health policy or to reveal things that were previously not understood in our community. Thank you for listening to me. I will hand over to Merran.

CHAIR: Thank you, Professor Kearney.

Dr Smith : Perhaps if I digress for a moment to talk about linkage. We did meet with the committee in October last year—and I think that was a very useful meeting—and we explained the linkage process and how it preserves privacy. It does that by taking the identifying part of records—that is, the name, address and date of birth—separating them and giving them to the linkage unit, which maintains the links. When we talk about enduring linkage, we mean that a linkage is done, and then, when the next lot of data comes in, it is matched, or checked to see if it matches, with the existing data. Those links continue to be maintained. The importance of that is that some of these datasets are very large. If you go to a lot of effort to create a high-quality linkage and then, after the project is finished, you destroy the links and start doing the same thing again for the next project, first of all, it is highly inefficient; it is both expensive and time consuming. Second, it does not lead to improvements in linkage quality. When we talk about the importance of enduring linkage, that is why it is important.
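Editorial illustration: the enduring linkage described above can be sketched in a few lines of Python. This is a minimal toy under stated assumptions, not the PHRN's implementation. The `LinkageUnit` class, its `links` map and the exact-match rule are all hypothetical; real linkage units use probabilistic matching. The sketch shows only that a retained link map lets a later batch be matched against earlier work instead of starting again.

```python
from dataclasses import dataclass, field


@dataclass
class LinkageUnit:
    """Toy linkage unit that sees identifiers only, never content data."""

    # Enduring link map: normalised identifiers -> person id (hypothetical structure).
    links: dict = field(default_factory=dict)
    next_id: int = 1

    @staticmethod
    def normalise(name: str, address: str, dob: str) -> tuple:
        # Assumption: exact matching on cleaned fields; real linkage is probabilistic.
        return (name.strip().lower(), address.strip().lower(), dob.strip())

    def link_batch(self, records):
        """Match each (name, address, dob) against existing links, adding new ones."""
        person_ids = []
        for name, address, dob in records:
            key = self.normalise(name, address, dob)
            if key not in self.links:
                self.links[key] = self.next_id
                self.next_id += 1
            person_ids.append(self.links[key])
        return person_ids


unit = LinkageUnit()
first = unit.link_batch([("Ann Lee", "1 High St", "1970-01-01")])
second = unit.link_batch([
    ("ann lee ", "1 High St", "1970-01-01"),  # same person, later dataset
    ("Bob Ray", "2 Low Rd", "1980-02-02"),    # new person
])
```

Under a create-and-destroy model, `unit.links` would be discarded after each project, so the second batch could not reuse the match already made for the first.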

When we put together datasets for the researchers, we take the content data with a project-specific linkage key which allows us to join the data, and in fact with our infrastructure now we pass it through a SUFEX system—that is the name of the system—which is run by the Centre for Data Linkage, so the data can be securely transmitted across the country and it can be placed in the facility that the Sax Institute runs, called the Secure Unified Research Environment, or SURE for short. So it can also be very securely held. The privacy risk is virtually negligible. But I digress.
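Editorial illustration: one way to picture a project-specific linkage key is sketched below. The testimony does not say how the keys are generated, so the use of HMAC-SHA256, the function names and the sample data are assumptions made purely for illustration. The sketch shows content-only records from two datasets being joined on a key that differs from project to project for the same person.

```python
import hashlib
import hmac


def project_key(person_id: int, project_secret: bytes) -> str:
    """Derive a project-specific key; differs per project for the same person."""
    digest = hmac.new(project_secret, str(person_id).encode(), hashlib.sha256)
    return digest.hexdigest()[:16]


def join_for_project(datasets, person_ids, project_secret):
    """Join content-only rows from several datasets on the project key.

    datasets:   {dataset name: {record id: content dict}}
    person_ids: {(dataset name, record id): person id} held by the linkage unit
    """
    joined = {}
    for name, rows in datasets.items():
        for record_id, content in rows.items():
            key = project_key(person_ids[(name, record_id)], project_secret)
            joined.setdefault(key, {}).update(content)
    return joined


# Hypothetical content data with identifiers already removed.
hospital = {"h1": {"admissions": 2}}
pharmacy = {"p9": {"scripts": 5}}
ids = {("hospital", "h1"): 1, ("pharmacy", "p9"): 1}
sources = {"hospital": hospital, "pharmacy": pharmacy}

project_a = join_for_project(sources, ids, b"secret-a")
project_b = join_for_project(sources, ids, b"secret-b")
```

Because each project has its own secret in this sketch, the keys in `project_a` cannot be matched to those in `project_b`, which is the property a project-specific key is meant to provide.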

At the meeting in October last year, we highlighted the difficulties we experienced in accessing Commonwealth data. We think there are a number of factors contributing to that difficulty that include the approval process—the process used to approve applications for Commonwealth data; the lack of enduring linkage between the Commonwealth datasets; and then there is the problem of not being able to link the Commonwealth data systematically with the state data.

We have made some recommendations in our submission to you. There are a number of ways in which we need to approach this problem. It is one of those wicked problems which have multiple components and you need to deal with each of them; there is no silver bullet. In the course of the morning we would be very happy to talk some more about those.

CHAIR: Thank you, Dr Smith.

CHAIR: Thank you.

Prof. Boyd : Anna and I would also like to thank you for the opportunity to make the submission and appear here today. I will start by saying, obviously, you can tell I have a Scottish accent, so if that causes a problem please stop me. If I really get into deep Scots, Anna is going to translate.

Prof. Ferrante : My job is to translate.

Prof. Boyd : So please do stop me if I am not clear.

CHAIR: We were just discussing the joy of accents. I know that I can speak for Senator Moore and me that the Irish accent is a good transition point for Australians to understand the Scottish accent. Senator Dastyari, with his very Irish ancestry, I know he is a friend of the Irish, friends from all over.

Prof. Boyd : I thought I had better mention it because we stopped somewhere and somebody said, 'You're from Glasgow; I can never understand anybody from Glasgow.' That would not be a good start to the day. Anna and I have been involved in creating and analysing linked data for many, many years. In fact I had hair when I started working on linked data, so I think that says something about the frustrating environment that we work in sometimes. We are both keen to maximise the value of the administrative data that is collected in Australia and we are keen to spread the word about data linkage and how it can be used for research.

As Brendon said, the Centre for Data Linkage is one of the national components of the Population Health Research Network. We are located in Curtin University and we were established to do cross-jurisdictional linkage—basically, to take different state collections and join them together so that we can track patients when they appear in more than one state. Obviously, that is important to research because researchers want to know the complete history of a person so that they can make a proper assessment when they do their research.

Together with the Australian Institute of Health and Welfare, the Centre for Data Linkage provides the complete national data linkage services for the Population Health Research Network. In relation to that, and as Brendon and Merran have talked about it, there are a few additional functions. The secure file transfer through SUFEX is an important function that we provide, ensuring that data transfers from point to point in a very secure way. We are also the only university node within the Population Health Research Network; the rest are within government departments. We use that to make sure innovation, research and development are part of what we do in the network. We do a lot of work developing tools and software to solve problems and get rid of bottlenecks in data linkage.

We do training, evaluation and support for linkage units nationally and internationally and support the research community. Our aim is to support data linkage units to make sure we get accurate, efficient links that we can make available and then release this data linkage capacity across Australia. For a researcher, our biggest thing has been supporting them through the authorising environment. It is not always clear or easy for them to understand how they get access to the information.

Hopefully, the submission provides a constructive overview of the current state of data collections, data linkage and access to health datasets and why that is important to the Australian research community. Australia has a very strong track record in data linkage. We are probably the world leaders at the moment in that field. We have innovative, large scale technology that does national linkage probably the size of which is not able to be done elsewhere. And Merran is exactly right: we have to do that because we have a create-and-destroy model. So we have to create data for projects and then we have to destroy them afterwards because we only have approval for a project, and that is inefficient.

We also have to make sure that these systems that we have are flexible enough to leverage the new developments that are going to come from the data science and big data initiatives over the next five to 10 years. Those are going to develop technologies that are going to allow us to do things faster and more efficiently. The Commonwealth datasets remain an issue for a lot of researchers. We work in a university and we hear the moans—the time to get the data. The burden of the approval process gives them a challenge, and it is difficult to do efficient and timely research in the time frames that we are talking about to make that possible. We are hoping the submission gives you some recommendations and ideas that would improve that, and we want to unlock the potential of the data that we have in Australia so that we can benefit the whole research community.

CHAIR: Thank you very much. Do you want to add anything?

Prof. Ferrante : There may be some additional points. James spoke about training and support. I know in some of the other submissions the issue of training the researchers so that they can handle data has been raised. It is also important to train the data linkage workforce. It is a very niche area. We have just come through training a whole new group of people who have been recruited by the Victorian Data Linkages unit, who have no experience in data linkage. We have a fine reputation in this country of doing very high-quality data linkage, and we want to maintain that, and we have to do that through training. It does not come naturally. That is the first point.

James has also alluded to the difficulty in accessing Commonwealth data. That has been an issue for researchers, and the Commonwealth, sadly, does not have a good track record in allowing access. That also extends to linkage units, and you will probably hear some of the frustration from the linkage units outside of the Commonwealth about how difficult it is to access data for purposes of linkage. I think that is about it.

CHAIR: Thank you very much, Professor Ferrante. Dr Redman. Sorry—Professor. I cannot quite see your title there. You have earned it, so I will just correct the record.

Prof. Redman : Thank you very much for the invitation to come and talk to you today and also for your interest in this issue, which I think is really very important but has been, perhaps, a little bit niche. So it is really nice to see the interest.

Bob Wells is going to talk to you about SURE. Just before we do that, I would like to make a couple of more general comments. At the Sax Institute, our role is to make a bridge between research and policy, so we think a lot about the way that these kinds of data can be effectively used by policy agencies, in partnership with researchers or individually. Another issue that I think is perhaps worth the committee considering is that we see that policy agencies often struggle with the complexity of these data, and I think in terms of training, which my colleagues have alluded to, thinking about how we better support people who are in policy arenas to make best use of these data is perhaps worth considering.

The second issue is that we have been working on these kinds of data for quite some years—12 or 15 years—and I think that what we see is that the data themselves are very useful but, in addition to that, the ability to connect those data to other kinds of information that are collected—like registries and large cohort studies, for example, that provide more personalised information about the individual—can potentially be very valuable in enriching the data and making better use of them, particularly for policy. We have included, in the papers that we tabled, some examples of the added value that comes from that kind of additional collection.

Mr Wells : Thank you to the committee for inviting us to talk. I will talk just a little bit about SURE in a minute, but I just thought I would emphasise the point that Professor Redman made. It is not just the routinely collected data or the administrative data that is critical; it is being able to link the administrative data to other datasets such as the 45 and Up Study. We have given some more information about that in the material we tabled. The 45 and Up Study is of approximately 267,000 people aged 45 and over in New South Wales, so it is about 10 per cent of the New South Wales population over 45 years of age—those people who have given consent for their health records and their health data to be linked. So the process of access to Medicare data and hospital data for those people is much more straightforward, and a lot of researchers use that study to do work, including researchers from interstate, and not just researchers from Australia. We have given you one example where a researcher based in Western Australia used that data to get some significant findings around cancer care. So it is, I think, an important resource and important infrastructure for researchers. It is not just what the PHRN provides—which is magnificent, and we are part of that—but also the other data cohorts, such as the 45 and Up or the longitudinal study of women. There are a whole lot of other cohorts around to enrich the data that is available.

I will just say a little bit about the SURE system. It is a computer-based system. It is operated by the Sax Institute. It provides remote access for researchers, so it works for researchers because from their desk or from home—or from wherever they are in the world, actually—they can access data that has been approved for them to access. Effectively, our system takes over their computer for the period that they are working on it. A project will get a number of work spaces, and each researcher has their own unique work space within that project. With a team of, say, six researchers, one researcher might only have approval—ethical approval, for example—to access a certain part of the data, so that is the only part of the data those researchers will be able to access. There is also capacity in the system for researchers to work collaboratively in writing papers for publication et cetera so they do not have to work on their data and then go somewhere else to work up their papers and make it work collaboratively.

It is a very secure system. The key feature of the system is what we call the curated gateway; this is where we operate. The curated gateway is where all the data comes in. For instance, a custodian says, 'We'll give you dataset X.' We check that it is in fact dataset X and that the custodian has not accidentally left some identifiers on the data or whatever. What goes out is not the original data. The researcher does the work but they can only take out their analyses or the work they have done. They cannot take out the original data. While they are working on it, the SURE system will not allow them to copy. They do not have email access from the SURE system, so they cannot email the data to their best friend, the newspaper or whomever. So it is quite secure that way. We know who has used what data for what purpose. We keep that data. By the way, our systems are in secure data centres—two data centres in Sydney, which are used by government agencies, including federal government agencies such as Defence and other agencies like that. It is a very secure environment. The data transmitted is encrypted, but it is also mainly transmitted over the AARNet, the university network, which is a secure internet network. I want to emphasise that it is in fact very secure.

The other benefit for researchers is that they can access more than one set of data. For example, they can access health data through the linkage services that the states provide, and some of the study participants can access the MBS and PBS data without too much fuss. We have a number of clinical registries. Clinical registries are where clinicians in a particular hospital or network of hospitals keep data on a particular topic. For example, we have the Australian-New Zealand Intensive Care Society clinical register. It is quite a rich database about intensive care treatments and outcomes and that sort of thing. Obviously, there is the 45 and Up Study, which I have talked about. We also have some private health insurance data that is accessible through this system. We are negotiating with some other agencies in government about broadening the sources of data outside of the health sector, which could then provide a much more comprehensive access to data for linking, which would get to how we might prevent what are some of the other factors that influence poor health outcomes et cetera.

CHAIR: Mr Wells, just as you finish there, could I ask what the agencies are which you are referring to there that you are trying to—

Mr Wells : We have not reached agreement so I will just mention a couple where we are perhaps in advanced negotiations. There is the Department of Social Security—not for all their data; these are for some specific data sets—and the Australian Taxation Office. As I said, we have some private health insurance data, and we are hoping to get a large cohort of data available through that within the next couple of years. That is a private insurer, so I do not feel at this stage that I could say more than that. It is commercial-in-confidence.

CHAIR: That is sufficient indication to give us an idea of what you are talking about.

Senator MOORE: I asked the other group when they came to the round table whether they all knew each other. Everyone is nodding their head, which is good. I have got the impression from reading the submissions—and I do not have a background in this area at all—that this is a group within Australia made up of a number of organisations and people. It is not that large but it is a group that do know each other and work cooperatively together. Is that a fair statement?

Dr Smith : PHRN is a collaboration. The principle of the increased funding is collaboration. In our view, collaboration is the only way to solve this problem. There needs to be collaboration between government, academia and industry—and Bob referred to that in terms of some of the work he is doing. It was raised briefly this morning in terms of the pharmaceuticals. I think there are a number of stakeholder groups around the country, and we need to provide a framework in which they can all work collaboratively.

Perhaps some of the problems with the Commonwealth that we are facing at this point in time is that, in my experience over many years, the Commonwealth government agencies often do not look outside. They are busy doing what they need to do in Canberra. Perhaps they do not appreciate, or are not in a position to leverage, the value and benefit that can be obtained from working with other groups within the nation.

Senator MOORE: Can I ask what your academic backgrounds are? I was interested with the last group of witnesses, whose backgrounds were medical, mainly. In terms of working in this field, what was your pathway to being involved in this area? It is just interesting to know whether it is a medical background, a statistics background or an economics background. What is the background?

Dr Smith : Mine is in science, medicine, economics and management.

CHAIR: You have got the lot!

Prof. Kearney : Mine is in clinical medicine and also management. I have been a chief health officer, chief medical officer and CEO of a health system.

Prof. Boyd : Mathematics and statistics. I worked with the government as a statistician.

Prof. Ferrante : For something different, my background is in mathematics and then criminology. So I am a social scientist.

Prof. Redman : This is obviously well suited, because my background is also in social science. Social scientists do use really large datasets, as you would know. I am also interested in the interface between research and policy and practice. These data are so compelling and so useful in that particular interface.

Mr Wells : My background is political science and philosophy, which are quite useless in themselves!

CHAIR: You would never get a job with those qualifications!

Mr Wells : So I had to work for government most of my life! But, also, more recently I worked at the Australian National University, where I headed up a primary healthcare research centre.

Senator MOORE: Just in the grouping, there is a wide variation. I think that is an interesting dynamic within the area. Mr Wells, you were talking about the groups of data that you have. Is there a cost to your organisation to acquire that data, or is it given by the organisations, the governments and the private health people because they see that it is a benefit? What is the arrangement? Is it a financial one?

Mr Wells : It is not a financial one. In fact, this is a very good question. At the moment we—

Senator MOORE: My first, I think!

Mr Wells : They are all good questions. We do not pay them to get their data. They make their data available. We have to pay for the asset. They are housed in data centres. We are actually charged by the volume of transactions and the amount of energy we consume, actually. There is a cost to us of housing a dataset. We have to actually persuade the custodians that not only is it a good thing for them to provide their data but also it is a good thing for them to help fund the provision of their data. That is how the transaction works from the data custodian side. On the researcher side, we do charge researchers. But that charge does not reflect the full cost, because we have a subsidy through NCRIS funding. So we charge them a charge which is not the full cost of providing it. But, certainly, it is a very costly facility to run. I am not saying that is a bad thing, but it is costly. We are constantly looking at enhancing its efficiency et cetera, but there is very much a cost to operating and providing this service.

Senator MOORE: The financial process.

Prof. Redman : Senator Moore, I just want to clarify, too, that the SURE facility, at its core, is not really a data repository. It is more a way of facilitating access.

Senator MOORE: That is how I read the fact sheet that you put into our process.

Mr Wells : Data are put in. Then, when that project has finished, the data go back out.

Senator MOORE: So you do not maintain the data?

Mr Wells : We maintain records of the transactions of what is done with data while they are there. But we do not actually maintain them. However, there are some agents which are interested in us actually housing a dataset which they would update from time to time. The researchers could access that dataset, subject to ethical and other approvals. So it would be something that we would actually retain for a period of time—perhaps a very long period of time. At this stage, we have not finalised any agreement for that arrangement.

Senator MOORE: So the funding arrangement for all of you around the table is centred on NCRIS—is that right?

Mr Wells : Correct.

Senator MOORE: Professor Kearney, in your opening statement you mentioned 10-year funding. Is that 10-year funding for NCRIS?

Prof. Kearney : Yes.

Senator MOORE: What is the 10-year window for that?

Prof. Kearney : We have had annual funding through the NCRIS program for six years, and we have it provisionally for next year, but it has been one year at a time. But, with the innovation statement one or two weeks ago, the Prime Minister indicated that they are looking at longer term funding and at 10-year funding for the NCRIS program, so that is really quite an exciting development. Of course, the next step will be that we have to complete our strategic plan, bid against the other 27 capabilities within NCRIS and get our share of the resources that are available.

Senator MOORE: The expectation from the statement was that it was going to be forward—another 10 years into the future.

Prof. Kearney : Our impression is that the arrangements are much freer or less restrictive than they had been.

Senator MOORE: How long have you been on the one-year cycle?

Prof. Kearney : It has been six years.

Dr Smith : The first lot of money was for three years, the next lot was for two years, and after that it has been one year. This year it was one year. Next year, which is in the forward estimates, is another year, but we have not got our allocation yet, and obviously it is December now and we are trying to plan for the next financial year. But the new money, I think, will start from July 2017, so it will be 10 years from July 2017, which is a wonderful opportunity to do some forward planning.

Senator MOORE: Yes, it is the chance to prepare and all that. It will give you almost a full year to do that, which everyone says is the best way to go.

Dr Smith : Yes.

Prof. Kearney : So we are hoping to have our strategic plan finalised by early February, and we will then submit that to the Department of Education and Training to go into this 10-year budget.

Senator MOORE: So your minister is Minister Pyne?

Prof. Kearney : No, Minister Birmingham.

Senator MOORE: Minister Birmingham.

Mr Wells : I might add something to that. That is all true, but often those announcements of a year's funding actually come during the course of the year for which the funding is, so it has an implication not just for our service but for the whole NCRIS. We cannot guarantee people employment, and we are probably actually carrying some of those people for some months ourselves just to keep the operation running, without any guarantee that the money will be forthcoming. So it has created a lot of uncertainty and a lot of hesitancy among the workforce, and I know we have lost people who feel they would rather go somewhere where they can get at least a couple of years guaranteed employment.

Prof. Kearney : We were hopeful, because in the last five years there have been four programs—NCRIS, EIF, CRIS and NCRIS 2—all with complex requirements, and this 10-year NCRIS program just sounds wonderful to us. But of course we have to work through the issues and the detail now.

CHAIR: We have had a lot of evidence, around every single health funded program, about this loss of capacity and the churn of administrative submissions taking up money, and again the loss of expertise and cultural knowledge and capacity. That has been a big cost. We were very mindful earlier of the threat to the NCRIS with it being tied to the passage of legislation through the parliament, so we are pleased to see that you are still operating.

Prof. Kearney : We are competing with astronomy, agricultural innovation and a whole range of programs, so health is really one of many different programs within that basket.

Dr Smith : I think the need for that longer term funding is highlighted through the Clark review, and I think we are still waiting to see the government response to that review, but effectively it said you cannot run national research infrastructure of any sort on an annual basis; you need at least five- if not 10-year rolling plans, and in the case of the astronomers they are probably even longer. So we are very pleased to see that now funding will be available for 10 years. It will make a big difference.

Senator MOORE: The government made the innovation statement, and they also made, in the last couple of weeks, an Australian government public data policy statement. I am interested to see whether, in the development of that policy, any of you were involved in terms of feedback. Also, what I am asking people in this area is: now that the policy statement has been made, and it does seem to pick up many of the issues your submission has recommended should happen—public access, consideration of the need, balancing privacy and all of those things you talked about—what do you see as the way you can be involved in what happens next? We have the policy statement—it is about data. How do people working with data engage with government to ensure that the policy statement becomes enacted?

CHAIR: First, were you actually engaged in the development of it?

Prof. Kearney : No. I think PHRN has undertaken coordination of development in this area. I think we have influenced changes in Commonwealth policy and also state and territory policies. We have really tried to act as a broker between government, researchers and other groups. I think there is a key role for that, because, as Merran said, departments tend to want to own that information and control it and use it for their purposes as opposed to a public purpose. So I think we see our role as being a broker to try to bring groups together and facilitate advancements. Certainly, it is a long and slow process and a lot more needs to be done.

Senator MOORE: Government policy: is that all-of-government policy?

Prof. Kearney : Yes.

Senator MOORE: You have all indicated that health is not peculiar—that to look at what we want to find out we need to have an all-of-government response.

Prof. Kearney : As I suggested earlier, the Department of Health has changed significantly in the last year or two. It does release MBS/PBS data to the states and territories, but it is not identified data, so it has limited usefulness and cannot be used for the purposes we are talking about today. Also, they have not agreed to enduring datasets at this stage. I guess there are issues with the authorisation process and with custodial approvals and ethics approvals. We think there is a lot of work to be done that can streamline the process significantly.

Senator MOORE: How do you think that should be done? I am trying to find out how you think we can turn the policy statement into action. From the perspective of all of you, what would you hope the next step would be to make sure that these things happen?

Dr Smith : I think it is around getting a perspective on what is the devil in the detail of that policy. In principle we really welcome it because it is government saying, 'We appreciate the value of this data and we want to make it more widely available.' But, of course, health data is a special case because it is sensitive data. The question is: what data will be available through the open data process? It is highly likely that that data will be aggregated.

That is not to say that that data is not valuable; it is. Aggregated data is valuable and even linked aggregated data is valuable. But it probably cannot do the sorts of things we are talking about for the health/medical research that really needs the detail.

We have built a secure process to deal with that sensitive data, and it is quite likely that the open data policy will not require that sensitive process, because the data has been changed—confidentialised or perturbed—so that it can be made available. It is filling an important niche but it probably is not doing the sort of thing we are talking about.

CHAIR: I think that goes to some comments you made, Professor Kearney, because as I am reading this:

Australian Government entities will:

- make non-sensitive data open by default to contribute to greater innovation and productivity improvements across all sectors of the Australian economy;

Then there is this ambiguous term:

- make high-value data available for use by the public, industry and academia, in a manner that is enduring and frequently updated using high quality standards;

What does that mean?

Mr Wells : Might I make a stab at that! I suspect that is the sort of data we are talking about. The 'high value' is the data that actually is able to be used for industry or whatever purposes, but would also be useful for government to be able to share with other people who could do work. I mentioned the Australian Taxation Office earlier. I am not sure what specifically they want this done on; I am not familiar with the dataset, nor would I be. But they want to make this available, through us, to a group of researchers who they want to do a particular piece of work, so I suspect it is that sort of activity they are looking at.

We have been working with the Department of Health about the release of the data they have given to the states recently—I am pretty sure it has gone to the states now—and we have been helping in a project where the federal Department of Health trained the states, because the states are not familiar with these data—the MBS and PBS data and the various complications within different Medicare items, for example. They are not easily understood, and if you are not familiar with them you can get in all sorts of bother. So we have been running a training program for the state people who will be using these data. As part of that program, the federal Department of Health wants to establish a community of interest across the data users, with the federal people and the various state people. I think that is a positive development. I think that is an area where we could possibly work further—certainly with the health department—about implementing the open data policy in the health sector.

I am obviously not privy to how policies are developed in the federal government, but my understanding is that there has been a very senior-level interdepartmental committee working on this area of making data more accessible for some period of time—at deputy secretary level across a range of agencies. Presumably that committee would have some continuing role in how this open data policy might be progressed, so perhaps that is an avenue where this group and other data groups could work with that committee in terms of how it might best be implemented both to protect security and privacy and to make the data accessible in time frames that are useful for the purpose for which people would want to use it.

CHAIR: Have any of you engaged with that committee? Has any of your expertise been sought?

Mr Wells : No. I think it is a secret committee.

Prof. Kearney : It is a high-level committee that is involved in looking at taxation, human services, health and so on, so it has a remit that is quite different from, I guess, our role. I think what we were trying to say is that about half the data in health is held by the Commonwealth and the other half is held by the states and territories. Although there are AHMAC committees that deal with data, they do not actually engage the research community or other interested groups such as industry, and we think that another structure—something like the PHRN, or it might be different—is needed to bring all those groups together, because there has to be a different attitude and approach to, I guess, the release and use of data. All the controls are known and are there, but we are not really maximising what we could do with it.

CHAIR: So, in addition to data linkage, we have to get—

Prof. Kearney : We have to get a policy environment—

CHAIR: We have community linkage.

Prof. Kearney : that is community wide, yes.

CHAIR: And those three areas where you spoke about the need for collaboration in your opening statement were government, academia and industry.

Prof. Kearney : And industry, yes.

CHAIR: If I heard correctly this morning, government and academia are having more frequent conversations but are not as deeply engaged as might need to be the case going forward, but industry seems to sit even further outside the tent. Is that an accurate representation of where we are right now?

Prof. Kearney : I think that is fair. There has been a lot of change over five years, but I think the PHRN has really filled the space of developing researcher involvement in Commonwealth and state data. Industry is in its infancy in the use of data, but there are some important uses that are possible and hopefully could happen over the next few years. But they will need, I think, more than just department-to-industry negotiation, because of the issue of the states and Commonwealth needing to pool their data and to meet all these privacy-preserving rules. The issue with industry is the need for, I guess, an honest broker to ensure that their processes and procedures do not put at risk the whole system of data privacy.

Prof. Boyd : PHRN has done a great job in trying to negotiate through some of the state and Commonwealth barriers to information. We work in the research community and we have a lot of the university researchers coming to us, and there is a genuine frustration about how they get access to this information. They do not always feel it is transparent or equitable about who gets access and what they need to do to get access. We are at a different end of this. We work in a university. We work with researchers who all want to do things to benefit the public but have real trouble trying to get information.

CHAIR: So you are talking about inequitable access there—or claims of it.

Prof. Kearney : I think it is even an understanding of the whole system. We have funded several proof-of-concept projects to try and test the various systems. The first one took over four years to complete, and that was mainly getting ethical, custodial and a variety of contractual approvals. The last one has taken about 18 months to complete. So there is some improvement—

CHAIR: A significant productivity improvement!

Prof. Kearney : but 18 months is still too long to get all the approvals. They were important ones. The first one was a study of use of hospitals by Australians and looking at cross-border flows. They are important to understand for the provision of health care but also for the funding of health care, with the state and Commonwealth agreements. It showed that there are different types of people who cross borders: those who live near a border—so, if you live in northern New South Wales and you go to Queensland for care; those who travel for work or who are on holidays, where it is a semi-permanent arrangement where they get sick in another place from their normal residence; and those who shift for care. These were two key groups: people with renal disease and mental health patients. They were things that were unknown, and we were able to show that by data linkage. What we also showed is that, because deaths are recorded at a state level, hospitals were understating their standardised hospital mortality ratios—that is, they did not understand their true death rate, particularly for those hospitals that were near state borders, because the patient—

CHAIR: Because it is so porous.

Prof. Kearney : would maybe die, say, in a hospital in Queensland but the death was registered in the New South Wales registry of deaths, and so it would not be picked up. It proved that not only could this be done technically but there were policy implications from this work.

The other was a study on immunisation where we showed that you could link Commonwealth and state data. Again, we showed that, although it was thought that the programs were really successful, the re-immunisation rate was disastrously low, and as a result something is being done about it. It is powerful stuff and we can do it, but it should not take that long to do all of this. The ethical and custodial approvals are just multiple and complex at the moment, and we need to be able to streamline them.

Dr Smith : Certainly, these projects were able to quantify the time taken. That was one of the reasons we were doing them. There were lengthy delays in the Commonwealth. Part of the challenge with accessing the Commonwealth data is that there is not an explicit process. When you are accessing state data, most states have online an application form and an explanation for how you need to go about it. But even now when you go online and try to find how to access data, there is a little bit on the AIHW site, hardly anything useful on the Department of Health's site or on the Department of Human Services' site. To acknowledge the work of my colleagues in the Commonwealth, they certainly did as a result of this process go through some conversations and streamline processes, but it is still not really explicit in the way that a researcher who says, 'Look, I think I want to use this Commonwealth data set' needs. It is not really clear how they go about the process to get the approval. We have tried to assist that by developing an online application system. So PHRN has developed an online application system, which the Commonwealth and the other jurisdictions are using, and that is facilitating the process. But I still think there are opportunities for improvement.

CHAIR: It is still really sticky.

Prof. Boyd : It sounds time-consuming, doesn't it?

Prof. Boyd : As the Centre for Data Linkage and AIHW are quite close in terms of doing cross-jurisdictional and Commonwealth linkage, we try to keep abreast and make sure the researchers know what the doors are and what forms need to be filled in. But it does, I have got to say, change on a regular basis, so you have to be across it.

CHAIR: Is education going to fix it or does something more significant need to happen—or is it both?

Prof. Kearney : I think that if there were some top-down approach that would be very helpful to speed things up.

CHAIR: What might that look like, Professor Kearney?

Prof. Kearney : I think it should come from government to say that the systems for accessing data need to be secure and so on, but they need to be freer and they need to be supported.

CHAIR: You are talking about a driver of cultural change that would then lead to the practical responses that follow?

Prof. Kearney : If the government says it wants this to happen, then maybe it could happen.

Dr Smith : I have another perspective on it. There are issues within the underlying legislation and with the guidelines issued by the Privacy Commissioner for accessing MBS and PBS data. Legislation around the country varies. In some places it is fairly explicit about releasing data. In some cases it was not, but the jurisdictions changed the legislation to make it clearer. I think there is an opportunity to make it clearer within Commonwealth legislation and guidelines, and that might specify what the process is for approval, and in an ideal situation would specify the criteria for approval—that is, each project must meet certain criteria—and then there may be a time line for reviewing that application.

CHAIR: So reveal the hidden and then be accountable in terms of some sort of time line?

Dr Smith : Yes, that would be ideal.

CHAIR: I go to the statement that Senator Moore referred to, the 'Australian government public data policy statement', which states:

At a minimum, Australian Government entities will publish appropriately anonymised government data …

It then states:

Requests for access to public data can be made via data.gov.au or directly with the government entity that holds the data. If access to data is denied by an entity, users may appeal the decision using the public request functionality available through data.gov.au.

Dr Smith : Something similar would be ideal.

CHAIR: Do you think that is going to help?

Dr Smith : I do not think it will help our case, but if something similar were applied to the request to access unit record data for research purposes, that would make a big difference.

CHAIR: Because this is too broad scale—

Dr Smith : It is anonymised data.

CHAIR: Because it is anonymised, it does not meet the criteria. So, replicate this, make the whole process a lot more transparent with accountability structures built in?

Mr Wells : Yes, some performance standard of expected times and that sort of thing.

Prof. Kearney : I think you can always do the legislation, but my experience is that it takes a long time to do that and to harmonise it. But it is often not necessary. If there is a lead at political level then things can happen anyway. Although there are differences in the legislation, none of them are insurmountable or can prevent these changes happening.

Dr Smith : I suppose our challenge is that, over the years, the regimes in government agencies change. Some are more willing to interpret things liberally and others are not keen to do that at all. If something were in guidelines, it would be helpful. I think Professor Kearney is exactly right: legislation does take a long time. I think this would need bipartisan support and should have bipartisan support in the national interest.

CHAIR: The harmonisation of national legislation around this would be an efficiency measure that you would think would be worth investing in. So legislative reform plus guidelines, or maybe guidelines in the first instance to allow people to move—

Mr Wells : Guidelines at the start, yes.

Senator MOORE: We heard from previous witnesses that the guidelines were actually the killers. The last panel went into quite a degree of detail about the fact that, from their perspective, the guidelines currently as they are written tended to be where the blockages were.

Prof. Boyd : If we could have the guidelines in partnership between the government and the academic community, it would help. At the moment the guidelines allow certain integration or joining of the datasets, but only by Commonwealth agencies who are the ones that can be accredited. There is a problem as well to build capacity if there are only three that can do it. So there is a limitation and also limitation in access to the information.

CHAIR: And that is the logjam.

Senator MOORE: The last witnesses said there were two.

Prof. Boyd : I think there are actually three.

Senator MOORE: So there are stats in AIHW.

Prof. Redman : And the Australian Institute of Family Studies.

Prof. Kearney : ABS and Human Services.

Prof. Ferrante : These are integrating authorities but they all sit within the Commonwealth government.

Senator MOORE: Yes, and they are the only ones who are accredited.

Prof. Ferrante : For linkage units that are part of the PHRN and sit outside, there are still restrictions in the way those arrangements have been put together.

Dr Smith : One of the state linkage units has sought accreditation to be able to receive Commonwealth data and the view from the Commonwealth was that they could not accredit it because it was a state agency, so they did not have jurisdiction.

CHAIR: Okay. So it was an arbitrary decision based on historical separations.

Prof. Boyd : It is currently a Commonwealth organisation that does integration.

Prof. Ferrante : I want to go back to Merran's very first point. A lot of the time Commonwealth departments culturally will tend to create these guidelines with blinkers on. They will think about their own context so they are written in that context without really thinking about the broader impact. So I think it is important that those guidelines when they are drafted allow for some input by those who will be affected a lot by them so that when they do arrive they arrive in a way that is acceptable. In a sense I will not say 'negotiated'—that is probably too strong a word—but they are not imposed and imposed in a way that is kind of restrictive, which I think is what has happened with the current arrangements.

Senator MOORE: So looking at the current guidelines and seeing whether they work from a guideline perspective and seeing how it can be made better and more responsive?

CHAIR: The location and the jurisdiction of the PHRN nodes are the Centre for Data Linkage, the Centre for Health Record Linkage, Queensland linkage, SA-NT, Tasmania and Western Australia, which pulls it all together—and that is the work you have been doing over the last five years, Professor Kearney. So when you say the infrastructure is established that is exactly what you are referring to?

Prof. Kearney : Correct.

CHAIR: Good. I just wanted to make sure we got that on the record.

Prof. Kearney : I should say that Western Australia has had a long history of data linkage and New South Wales has too. Our main investments have been in the three national facilities I mentioned plus South Australia, Victoria, Northern Territory, Queensland and Tasmania. Now we do have a national network that is collaborative and we have a board and then a participant council, so we have a structure to really keep developing and I guess enhancing that network.

CHAIR: Building on the back of that network now established, I want to go to the evidence you gave, Dr Boyd, around new technologies. You said something about innovative data linkage and Australia's capacity. We have been hearing that we have a deficit here in our capacity to do it but you just said actually Australia is quite advanced in this. I think you linked it to the fact that we have temporary sets of data and then it has to disappear and we have to keep rebuilding. There is some kind of skill set I am assuming in the middle of all of that that is an asset. Can you explain that a little more?

Prof. Boyd : I think Anna pointed out the capacity in terms of people's knowledge, skills and the workforce capacity, so building the skills to do the work is one thing. The other thing is about the technology. Of course we are trying to do national linkage, which is challenging. We have a population of 20 million and we have to build systems that can link data for 20 million people. Our centre has been trying to leverage the newest technology to try to do things faster, quicker and more accurately and get value for money basically because we have this policy of create and destroy and we do not want services waiting years to get the linkage part when they have waited years to get the approvals part. So our aim is to do things quickly and accurately. We have invested a lot of time and effort into doing things that make it faster to do the actual joining of records together. We have also done other things to sort out bottlenecks in terms of how you separate the data and manage the data over time and hold the data securely. More recently we are trying to get across legal issues by trying to come up with technology that does not require names to do the linkage. We are always trying to innovate and look at what is happening elsewhere, leverage that and bring it to Australia.

CHAIR: So it is almost like we have an accidental skill set because of our—

Prof. Boyd : A lot of the time, yes. It is workarounds. It is a solutions focus thing. We try to get across the problems and move on to make the data available.

Prof. Kearney : I think there have been enormous changes and developments over that five- to six-year period, and Australia is as well placed as any nation in this privacy-preserving data linkage space. The number of research projects has grown annually, and we could provide you with information on that. The performance of the various nodes has increased over that time, so there is more need, but we have done a lot and it is growing and improving. We are probably as well placed as any nation to go forward in this area.

Mr Wells : One of the reasons we are so ahead on linking is that it is probabilistic linking. We do not have a unique identifier common to all our systems, even though there was some effort to do that I think in recent years. That would certainly reduce the cost and complexity et cetera of linking if we in fact had a unique identifier that crossed state and federal systems. A lot of the problem is getting it down to: is this the right person or not the right person?

Prof. Ferrante : Seriously, they may not be, but it goes back to the conversation about transparency earlier, about the processes that we set up—that open government thing and being transparent and knowing what the processes are. That is important, not just because they are good things to do but because what we are doing in data linkage—we do not have an Australia card—is something that we should not hide from the community. We should be very up-front and open about what we do. We have been, by having good systems and good governance that sits over the top of it. We might not have an Australia card now; we have to do these other things, but they are well managed. If we manage what we are doing now well, which is really hard, because we do not have an ID, I think that will set us up for being responsible to manage an Australia card if we are ever ready for it. That is a long way of making a point!

CHAIR: We had evidence at the end of the last roundtable that the Medicare card is a significant tool to enable that sort of fine-grained data about individuals to be followed, and you are saying that we are really proving our stripes with management of much more complex things than looking after your identity. Professor Ferrante and Professor Kearney, did you want to respond to that?

Prof. Kearney : We have designed our systems around the Federation, because that is the reality, and in the absence of the unique identifier. So the systems work. But I think there is an assumption that, if you had a unique identifier, everything would be simple. It may not be, because of Federation. The databases are held by different governments and in different places and there will always be the need to bring them all together through linkage systems.

Dr Smith : I think that is something that Australia is working toward solving. We should understand that this system that Australia now has is of international significance, and there are few other countries in the world that can do what we are doing. The UK cannot. Scotland, where James comes from, and Wales do have very good systems, but England does not really. You might have heard that some of the Canadian provinces also have very good systems, but not across the whole of Canada, and then there are the Scandinavian countries. We seem to have gone some of the way to cracking the nut of the Federation. Certainly Germany, which has 16 states—I feel sorry for Germany—is quite interested in how we have managed to do linkage across the Federation. I think you have had some conversations with colleagues through a separate process.

Prof. Kearney : Yes.

Dr Smith : So I think that is important. In terms of other innovations, if we could, what would we have? Most likely we would try to have some sort of repository of content data. In Wales, they have a repository of content data. Scotland may work slightly differently. In Manitoba—which you might have heard about this morning—that content data is put, in a de-identified form, in a repository, and all those different datasets are curated in a way that makes them much more easily available to researchers rather than having to pull them from all the different custodians around the jurisdiction.

CHAIR: I think that was called a clearing house.

Dr Smith : It could be called a clearing house or a repository. There has been some work done in Western Australia. In fact, we funded Western Australia to do a pilot for that. They have something called CARES, which is a custodian administered research extract server. The different custodians—in their case from health and health related areas including education, early childhood development, and justice—have put their content datasets in a single server managed by the health department. When the researcher has permission to access data from, say, two or three of those datasets, they are already together in one place and can be more readily provided. Something like that at the jurisdictional level would be really useful.

CHAIR: Can I go to Professor Redman and Mr Wells for the last question. You spoke about the data being useful but links to registries and cohort studies needing to be considered as well. I think you referenced the longitudinal women's study and access to the Australia-New Zealand intensive care research. There might have been another one that you mentioned—

Mr Wells : The 45 and Up Study.

CHAIR: The 45 and Up Study. Could you speak to where we are with that at the moment and make some suggestions about what we need to change? I also indicate I am going to ask each one of you to give us your No. 1 recommendation to take away from today's conversation.

Prof. Redman : Essentially, if we take each of those, the difference with the routinely collected data is that in general people have consented to being part of those special data collections. In recruiting, if we talk about the 45 and Up Study and about half of the women's longitudinal study, they consented to the linkage. The issue we have been talking about here this morning is much easier when you think about linking in those particular resources, because we usually have individual consent to do that.

If I take one of the examples that we have put in your folder, think about what impact obesity has in driving healthcare costs. We know a lot from the routinely collected data about hospital costs, but we know much less about the individual's obesity, their socioeconomic circumstances—all of that kind of information that you can collect with the enriched data sources. When you put that information together, you have a much more powerful tool for asking about how you could reduce hospitalisations within the Australian context, for example. Because they have that individual, personal data, the dataset is much richer than it might otherwise have been. So that is the value of it.

In terms of connecting that, in the 45 and Up Study we asked for permission and we recruited through Medicare. Therefore the links were already established from the outset, and we found that process to work fairly effectively, I think. Bob might want to comment. With the Australian women's longitudinal study, when they asked for individual consent the linkages worked pretty well, but the first people they recruited—it was about 20 years ago now—were not asked for consent. The process of getting access to that data through the Commonwealth was very slow. It took the same kind of time that my colleagues were discussing earlier.

CHAIR: Which one—the four years or the 18 months?

Prof. Redman : Much more at the four-year end. In fact, my recollection is it was longer than that. It took a long time to get them, but they do now have those data. My perception has been that the dialogue around that has dramatically improved over recent times and that the Commonwealth has been quite supportive of providing that information to them. It is obviously a changed process. The registry data are different again.

Mr Wells : I might just add about the 45 and Up that we get from the Health Insurance Commission or whatever it is now called—the Department of Human Services—the Medicare and PBS data for those people. So we do not have to link it; essentially, it is already linked. So we are 100 per cent certain that, if they want data on this group of people with these characteristics, we are giving them the correct data—the 45 and Up data and the MBS and PBS data for that group. That is not a probabilistic linkage; it is an actual linkage and it shows what could be done if there were a unique identifier. I am not saying it would solve all the problems, but I think it would speed up the process. The Commonwealth is quite happy to give us those data, because all those people have consented for the data to be provided. So we get it en bloc, as it were. Once a year they give us the latest set of those datasets for that group of people. What was the other thing?

Prof. Redman : The registries.

Mr Wells : The registries, yes. They are very small datasets. They might be only a few hundred, and some of them are even smaller. People access those for research. But people doing training for particular specialties, say, intensive care, are required usually to do a research project as part of their fellowship training. They use those data for that purpose so it is not just for research or interest; it actually helps those trainee fellows better understand what they are dealing with and how they can understand the craft that they are practising. It has a number of uses.

CHAIR: Our time is nearly over but as I just indicated, if you would like to give us what you think is the most significant recommendation going forward, that would be fantastic.

Dr Smith : My recommendation would be to review the guidelines for access to Commonwealth data in consultation with key stakeholders.

CHAIR: Which means us, government, academia and industry?

Dr Smith : Yes, and probably one other group we have not yet mentioned: consumers.

Prof. Kearney : I would like to see the PHRN as the leader in the development of data linkage systems in Australia for government, industry and researchers.

CHAIR: Having already claimed the ground for some time now, it seems that would be a pretty good fit really, wouldn't it?

Prof. Kearney : There is a lot to go, but we have done a little bit.

Senator MOORE: I am sure you would like to add secure funding into that list. I think that would be pretty big on your list.

Prof. Boyd : I think solving the data flow issue and removing restrictions to the Commonwealth data to accomplish things in partnership with those areas would be really good.

CHAIR: Yes, because the logjam we heard about earlier is a problem.

Prof. Boyd : Yes.

Prof. Ferrante : I am going to extend that to get to the maximum recommendations possible, which would be to make that data flow and also enable Commonwealth data to be kept through an enduring mechanism with those linkage units that sit outside the Commonwealth.

CHAIR: Rather than research and destroy?

Prof. Ferrante : Yes. Moving from project to enduring.

Prof. Redman : I would have to support Dr Smith's recommendation. I think that is the most important thing.

Mr Wells : I would recommend that the presumption be that data will be released unless there is a reason why it would not be released. I think at the moment there is actually a presumption that it will be released if it is possible or whatever to release it. So if the presumption is it will be released unless there is a legitimate reason—privacy, legislation or whatever—and that departments resource that activity as well, because at the moment they see it as a burden rather than something they need to do and as something that they are not funded to do.

CHAIR: So you are seeking a significant cultural change and the funding that would flow from such a change?

Mr Wells : Yes.

Senator MOORE: They seem reasonable, don't they?

CHAIR: Absolutely.

Mr Wells : We could go on; you only wanted one.

CHAIR: I thank you very much for your work in the field and also for making yourselves available to give evidence to the committee this morning. We genuinely appreciate your expertise and the contribution that you have made to this conversation. Enjoy the Christmas period as well.