Senate Select Committee on Health
Health policy, administration and expenditure

ANTONIUS, Dr Nicky, Acting Assistant Secretary, Information Knowledge Management, Department of Health

CRETTENDEN, Mr Ian, Assistant Secretary, Health Analytics Branch, Department of Health

FOSTER, Ms Alanna, First Assistant Secretary, Research, Data and Evaluation Branch, Department of Health

RICHTER, Mr Warren, Chief Information Officer, Australian Institute of Health and Welfare

von SANDEN, Dr Nick, Head, Statistical and Analytical Support Unit, Australian Institute of Health and Welfare

WILSON, Ms Michelle Louise, General Manager, Strategic Information Division, Department of Human Services

CHAIR: Welcome. I remind committee members and officers that the Senate has resolved that an officer of a department of the Commonwealth or of a state shall not be asked to give opinions on matters of policy and should be given reasonable opportunity to refer questions asked of the officer to superior officers or to a minister. This resolution prohibits only questions asking for opinions on matters of policy and does not preclude questions asking for explanations of policies or factual questions about when and how policies were adopted. Do witnesses have any comments to make on the capacity in which they appear?

Dr Antonius : I would add that information knowledge management is, essentially, about data management and adhering to some of the data access and release policies.

Mr Richter : I am responsible for data linkage, analytical support, IT and the business transformation program.

Dr von Sanden : The Statistical and Analytical Support Unit is responsible for rolling out our analytical services and development.

CHAIR: Thank you. We hope to be joined by Dr David Kalisch, from the Australian Bureau of Statistics, and Mr Sean Innis, from the Department of Social Services, by teleconference, from Canberra. They are not with us yet. Dr Antonius, do you have an opening statement you would like to make?

Dr Antonius : Ms Alanna Foster?

CHAIR: Ms Foster, thank you very much.

Ms Foster : I have an opening statement for Health.

CHAIR: I would appreciate that.

Ms Foster : Thank you for the opportunity to appear before the committee. Being able to use big data and data linkage holds the promise of being able to inform significant health system improvements. We have a substantial resource in the big datasets from Medicare and the Pharmaceutical Benefits Scheme and hospital data provided by states and territories as well as from the private sector. These datasets are further supplemented with important details about patients' health, service uses and outcomes by a broad range of smaller datasets, such as disease and screening registries, mortality data and survey data.

Due to the separate legislative requirements, it can be challenging to link these datasets while also adhering to strict privacy guidelines. Notwithstanding these challenges, it is clear that linked data could help better inform policy and the work of researchers to enable a better understanding of the characteristics of patient populations—for example, their risk factors and what health conditions they may have or their socioeconomic status. Linked data would also enable understanding of the full extent of patients' health-service usage—that is, it would be possible to follow patients' pathways through the system and answer questions about patient populations, such as: are the high users of primary care also high users of the hospital system? If we provide better access to chronic disease management in primary care are patients less likely to present to hospital? What interactions do patients have, with their GPs, when they leave hospital?

With big-data technologies, linking and advanced analytic capabilities, we could, for example, use pattern mining to quickly identify adverse events that may arise from medical devices or health services, and use cluster analysis to assign patients to like groups—for example, identifying groups with diabetes or cardiovascular conditions that may be amenable to policy intervention—and then model the impacts of those interventions, in terms of costs and patient outcomes. We could use pathways analysis to investigate how patients—for example, cancer patients—are moving through the health system and model the impact of policy interventions targeted at improving these pathways. These are just some of the tools that could be used in informing government decision making and the work of researchers.

The department is working on finding ways of improving our capacity to link data and improve big-data analysis techniques to the valuable health datasets. This is being done within the framework that maximises the use of the data while ensuring individuals' privacy is protected. It is being achieved within a broader government strategy aimed at unlocking the potential of public sector data in order to drive innovation, efficiency, productivity and economic growth.

As set out in our submission, we have been: working across government agencies to trial a de-identified dataset of Medicare, tax, social security and census data through the multi-agency data integration project to support policy development and research activity; building confidence and momentum for the PM&C public sector data management project by leading a high-value data driven project focused on mental health service provision; sharing patient-level MBS and PBS data with state and territory governments; and working to share MBS and PBS data with researchers through the AIHW, who will facilitate data linkage, and also, in another initiative, through the publication of a de-identified 10 per cent MBS-PBS sample.

We have been working on: publishing data at low levels of geography for Primary Health Networks to help them undertake their training activities; providing access via data releases to medical researchers and governments throughout 2015; and building our internal data storage and analysis technologies and the skills and capacity of our analysts. Health data is particularly sensitive and the department will ensure that data is only released in a manner which will protect privacy, particularly where large amounts of MBS and PBS data is being made available.

CHAIR: Thank you very much for that, Ms Foster. I think you have touched on many of the issues that have been raised. Mr Richter, do you have an opening statement?

Mr Richter : Yes, I do. Just picking up on what Alanna said, I think we can start on a positive note. We are very pleased to say that the arrangements to allow the AIHW to access and store MBS and PBS data have reached finality, and we will be acquiring that data over the next few weeks. That is going to mean a very significant reduction in the time taken to service data linkage requests that require MBS and PBS data. So this is a very pleasing result. We will go through some of the concerns that have been raised in previous submissions, though.

Another thing we would like to raise—and I heard people talking about it this morning—is the data dictionary. We have just called for a request for expressions of interest to redevelop the AIHW's metadata online registry. It is called 'METeOR'. We had the industry presentation yesterday. We will be seeking to develop a panel of potential suppliers and to go to tender. We are aiming for this to become a critical part of national infrastructure. It already is, but we want to expand its use so that we can establish much better meaning of data that is accessed in the health area.

The other thing we are doing which I think is very relevant to this is working with, initially, the state of New South Wales—NSW Health—to establish what we call enduring data linkage keys so that we can build up a spine of prelinked identified records without the identifiers in them. You can then link to health content data on an as-required basis. That will also substantially improve data linkage. It essentially means that you do not have to do it each time you need to do a linkage.

CHAIR: From scratch.

Mr Richter : Yes.

CHAIR: This is what we were talking about—detect and destroy, I think it was, or something along those lines. So this is about the enduring nature of that data and it becoming available for others to build on.

Mr Richter : Yes. There are two components to that. You can have an enduring linkage key, in which case you use the key to acquire the data when you need it. That has the advantage of not having lots of datasets lying around. So you can destroy it and then re-create it, if you need to, fairly cheaply and efficiently. The other approach is to have an enduring dataset, in which the data is retained and you access it under appropriate constraints.
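
The enduring linkage key approach described above can be sketched roughly as follows. This is a minimal illustration only, not the AIHW's actual system: the key values, record identifiers, field names and the `build_extract` function are all hypothetical. The spine holds keys and pointers but no names or dates of birth, and a linked extract is assembled from content data only when it is needed.

```python
# Hypothetical spine: one enduring linkage key per person, pointing at that
# person's record in each source dataset. No identifiers are stored here.
spine = {
    "K001": {"mbs": "M-17", "pbs": "P-42"},
    "K002": {"mbs": "M-23", "pbs": "P-08"},
}

# Hypothetical content data, held separately by each source.
mbs_content = {"M-17": {"services": 12}, "M-23": {"services": 3}}
pbs_content = {"P-42": {"scripts": 30}, "P-08": {"scripts": 2}}


def build_extract(spine, sources):
    """Assemble a linked extract on demand by following the spine's pointers."""
    extract = []
    for key, pointers in spine.items():
        row = {"linkage_key": key}
        for source_name, record_id in pointers.items():
            # Pull content fields only; a missing record contributes nothing.
            row.update(sources[source_name].get(record_id, {}))
        extract.append(row)
    return extract


extract = build_extract(spine, {"mbs": mbs_content, "pbs": pbs_content})
# The extract can be destroyed after use and rebuilt later from the spine.
```

Because the spine persists, the matching work is done once; rebuilding an extract is a lookup rather than a fresh linkage.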

CHAIR: One of the articulations of such an enduring dataset was maybe a 10 per cent sample of a particular group that was permanently there for people to access.

Mr Richter : Yes, that is true, but I would qualify that a little bit. I do not want to speak for Health, but a 10 per cent sample in this context is typically a very safe dataset. There is no risk, and it can be publicly available. When people talk about an enduring dataset—and the ABS will talk about that, I am sure—it is a sensitive dataset still and you need to control access to it; but it does pre-exist, which saves a lot of time in terms of having to create it every time you need it.

CHAIR: So that would be one with a key to it that might have been created and removed and then made available for a purpose later on.

Mr Richter : With an enduring dataset, you use the key to create the data and then you would remove it unless you were going to add more data to it. You can do that in a way where you separate the keys from the content, and only when you want to add to the content do you materialise the key and then draw the extra content in. But, essentially, with an enduring dataset you have created a resource that, under certain circumstances, can be made available to people.

With an enduring data linkage system, you are enabling the creation of data on an ad hoc basis, if you like, to be much more efficient and effective. Maybe I can go on about that a bit more?


Mr Richter : As an analogy, a researcher will approach us and say they have a cohort of people they are interested in undertaking further research on. They may have a particular disease. So a very precise data linkage is required. They might have a set of 1,000 people. We will undertake a linkage across the data resources we have and find records that match the ones they have. They might be in the National Death Index, the cancer database, the MBS or the PBS. So you create that linkage—you link those records together—de-identify it and then make it available to the researcher. For those applications, an enduring data linkage key is particularly valuable. It just saves you a lot of time.
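
The cohort linkage described above might look something like the sketch below. It is a simplification under stated assumptions: matching here is exact on a normalised name and date of birth, whereas the matching method actually used is not described in the evidence, and the dataset names, fields and the `link_cohort` function are hypothetical.

```python
IDENTIFIERS = {"name", "dob"}  # fields stripped before release (assumed)


def match_key(record):
    """Normalise the identifying fields used for matching."""
    return (record["name"].strip().lower(), record["dob"])


def link_cohort(cohort, datasets):
    """Find each cohort member in every dataset, then de-identify the result."""
    # Index each dataset once so lookups are cheap.
    indexes = {
        name: {match_key(r): r for r in records}
        for name, records in datasets.items()
    }
    linked = []
    for study_id, person in enumerate(cohort, start=1):
        # The researcher receives an arbitrary study ID, not the identifiers.
        row = {"study_id": study_id}
        for name, index in indexes.items():
            hit = index.get(match_key(person))
            row[name] = None if hit is None else {
                k: v for k, v in hit.items() if k not in IDENTIFIERS
            }
        linked.append(row)
    return linked


cohort = [
    {"name": "Alex Citizen", "dob": "1950-02-03"},
    {"name": "Sam Example", "dob": "1962-11-20"},
]
datasets = {
    "deaths": [{"name": "alex citizen ", "dob": "1950-02-03", "year": 2012}],
    "cancer": [{"name": "Sam Example", "dob": "1962-11-20", "site": "lung"}],
}
linked = link_cohort(cohort, datasets)
```

With an enduring linkage key in place, the matching step inside `link_cohort` would be replaced by a lookup against the prelinked spine.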

With an enduring dataset, you are usually creating an enduring resource, for research purposes or analytical and statistical purposes, which would be accessed by a number of people over a much longer period of time.

CHAIR: But accessible very, very quickly, because it would just be there and available.

Mr Richter : Yes, that is right—because it exists.

CHAIR: Yes. We will come back to you, Dr Antonius. Ms Wilson, did you want to say anything?

Ms Wilson : I do not have an opening statement, no.

CHAIR: Now, let us try again: have we got Mr Kalisch on the line yet? No.

Senator MOORE: Chair, can I suggest that we end the attempts to get Mr Innis and Mr Kalisch on the line? They are both Canberra based. We could keep trying for the next half-hour and I think we should—is the term '86 them'? I think that is the term.

CHAIR: Okay. We have made that decision, then. Dr Antonius, you wanted to make a comment at this point.

Dr Antonius : Thank you for providing this opportunity. I just highlight that the Department of Health is pursuing both opportunities that Mr Richter was talking about. We are working with AIHW to create enduring data linkage keys and we are also working with the ABS to create the enduring dataset. At this stage, we are not preferring one over the other; we are actually pursuing both methods to see the benefits that can be realised from both.

CHAIR: Thank you. Ms Foster and then Dr von Sanden.

Ms Foster : I was going to provide a little bit more information about the 10 per cent sample that was mentioned.

Senator MOORE: Ms Foster, can you clarify for me the 10 per cent sample and also in the submission where it says that the department is providing AIHW with five years of MBS and PBS data. In my mind I had five. Mr Richter, you talked about it being an advance—and it is—but can we work out that decision to go to 10 as opposed to what was in the submission, which was five.

Ms Foster : The 10 is more or less a departmental initiative that links unit record data for the MBS and the PBS as well as hospital data. It will be publicly available. It is de-identified. We have been working with the ABS to make sure that it is absolutely 'confidentialised'—or however you say the word. So no linkage keys will be able to be used to identify anyone.

Mr Crettenden : And there is perturbation of the data so that it is not possible to identify an individual, even if you found someone who looked relatively unique in the dataset.

Dr Antonius : At one stage we made it publicly available. Any Australian citizen researcher for whatever purpose could access that data.

Senator MOORE: It will be public from the time it is released?

Dr Antonius : Correct.

Mr Crettenden : In the submission we talked about the data we provide to AIHW, which is the entire MBS and PBS file for five years. So there is a public-use file—10 per cent of it—and then there is the data we are providing to AIHW, which is the entire dataset for five years.

Senator MOORE: And the date from which the 10 years is determined? There is also the 10 years, if you are providing 10 per cent. So is that 10 per cent from now?

Mr Crettenden : That 10 per cent is also on the most recent five years. So it is 10 per cent of each annual—

CHAIR: So it is on an annual updating cycle?

Mr Crettenden : Yes, that is the intention.
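
As a rough illustration of a 10 per cent sample drawn from each annual file, the sketch below samples records within each year. It assumes sampling is done per record; the evidence does not say whether the unit is the record or the person, and it says nothing about the method. The field names and the `sample_by_year` function are hypothetical, and the confidentialisation and perturbation steps mentioned earlier are not shown.

```python
import random
from collections import defaultdict


def sample_by_year(records, fraction=0.1, seed=0):
    """Draw the same fraction of records from each year's file."""
    by_year = defaultdict(list)
    for record in records:
        by_year[record["year"]].append(record)

    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    sample = []
    for year in sorted(by_year):
        year_records = by_year[year]
        k = round(len(year_records) * fraction)
        sample.extend(rng.sample(year_records, k))
    return sample


# Five hypothetical annual files of 200 records each.
records = [
    {"year": year, "record_id": i}
    for year in range(2011, 2016)
    for i in range(200)
]
public_sample = sample_by_year(records)
```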

CHAIR: Very good. Can I go to the MBS-PBS data issue. Can you give us the historical frame of why this has been such a massive problem. There is the transition point we are at now, and you have just indicated where you think we are heading in the next few weeks. Can I get a longitudinal view of that so that I can understand it more fully?

Ms Foster : Under legislation and under binding privacy guidelines on the use of the information, privacy is really at the heart of the perceived difficulties in accessing the data. In the guidelines and, I think, in the act, it specifies that the two datasets must be kept in different databases.

Mr Crettenden : That is right. The Health Act 1953 specifies that the two datasets have to be kept separate and then it also requires the Privacy Commissioner to make a number of binding guidelines. The issue of them being kept in separate databases is mentioned again. The Privacy Commissioner's guidelines go on to say that Medicare Australia, as it is described in there, and the department are only able to link the datasets under very specific circumstances and that the dataset that is used to create the linkage must be destroyed within one month of it being created. So it is quite tied down, I guess.

Ms Foster : It is very specific.

CHAIR: So the historical context in which that occurred is quite a different place from 2015. What was the context for that?

Ms Foster : I think it was in terms of concerns about privacy. There were very strong concerns about privacy. The Privacy Commissioner guidelines that we referred to were 2008—so, within the last seven years. It is relatively recent, I suppose, from a historical context compared to the 1950s act.

CHAIR: That brings us to the situation you have described, Mr Richter, of recent times. Can you catch me up on what has happened between 2008 and now?

Mr Richter : It would probably be better for the department to talk about that actual transition, because we are very much the recipient of this rather than a policy creator.

Ms Foster : I think what has happened fairly recently is that there has been a significant cultural shift in the way data is regarded. It is regarded as an asset; it is regarded as a key tool in informing policy development and research. I think we are shifting from a culture of protecting data at all costs to one of protecting data but also identifying ways we can use it. Are there ways we can use the data to make it available to inform policy and enable researchers to access it?

CHAIR: The third part of that equation, which has been put to us today, is how to improve outcomes for consumers of basic health services.

Ms Foster : Absolutely. The patient is the central driver for government policy and for research. I suppose the vision of our portfolio is to improve patient health outcomes. That is stated up front. Trying to improve access to the data is a key driver in trying to improve health outcomes. To go back, the very first principle in the department's policy is that nonsensitive data should be made available. It is becoming quite core to our department. When he first arrived in the department, our secretary said, 'I want to release data to the states and territories.' He got about 100 pages of why that was not possible, but then he asked the question: 'How can I provide that data to the states and territories?' He was given advice on how that could be done. It is more or less a case of looking at how much can be done within the current legislation, within the guidelines, in order to provide information. There is a paradigm shift—a major cultural change—occurring across the public sector, as well as in our department.

Mr Crettenden : I think we are also becoming more aware of the kind of techniques which Mr Richter was talking about that allow data linking in an environment that still protects the privacy of individuals, whereas what was being contemplated in the act and the guidelines was that any kind of data linkage was automatically a breach of privacy.

CHAIR: At which point of the transition are we now, Mr Richter?

Mr Richter : We have a schedule agreement that was signed last week with the department. We have a public interest certificate which is in the process of being signed within the department and we have arrangements in place with the Department of Human Services to receive the Medicare enrolments data, which we need to do the linkage—to identify the links. As soon as that public interest certificate has been signed—I do not know if you know, Michelle, but we have your people all on tap to hit the button—

Ms Wilson : We are looking at the piece of equipment that is holding the data to make sure it is ready to go. We will wait until the public interest certificate has been tied up with a bow and signed by everybody. It is going through the final processes, as is the multi-agency data integration project. The public interest certificate is also at that final stage of processing. Each of our departments has to go through its own legal processes.

Senator MOORE: I am interested to know whether the human services data includes only Medicare data or whether it has Centrelink and DVA data as well.

Ms Wilson : The multiagency data integration project is actually a project for the Bureau of Statistics but I can talk at quite a high level about—

Senator MOORE: At this stage I am just talking about the exchange that is happening between Human Services and AIHW at the moment.

Ms Wilson : Human Services is providing the Medicare enrolments database, which is really the linking data, if you like, that will enable the Institute of Health and Welfare to then link in the MBS and PBS data that is being supplied by the Department of Health.

Senator MOORE: So that is limited just to Medicare at this stage?

Ms Wilson : To the enrolments database, which is the information about each person who is enrolled in Medicare.

CHAIR: Just out of sheer personal curiosity, how many Australians are not enrolled in Medicare?

Ms Wilson : Australian citizens?


Ms Wilson : I would have to take that on notice. There are various ways people become enrolled in Medicare. People are enrolled in Medicare when they are born and their parents submit a newborn form that looks at their social security and family tax benefit, parenting payment and enrolment in Medicare as well. There is one single form for that. The other way people become enrolled in Medicare is when they enter the country—they are either eligible for Medicare or not eligible for Medicare depending on their circumstances.

CHAIR: But there might be a number of people to whom a form has not been submitted who might be floating around?

Ms Wilson : Potentially, yes. However, their capacity to use the medical system services might be an issue for them. I guess that is a matter for the Department of Health. We do not purposely go out and make sure that everybody is enrolled—

CHAIR: Until you mentioned it I had not thought of any gap that might exist. When you said 'enrolment data' I thought of course there would be some people sitting outside the system who might not be picked up until they engage.

Ms Wilson : We could take on notice for you—

CHAIR: What your estimate is?

Ms Wilson : whether we think there is a gap. There is actually a higher number of people enrolled in Medicare than there are in the population statistics though.

CHAIR: Okay. That is interesting.

Ms Wilson : Because there are people who were not in the country on census day or are living overseas but are still Australian citizens and still enrolled.

CHAIR: As soon as that happens you said that they are linked in then to social services?

Ms Wilson : Within the Department of Human Services we use the same process to enrol a newborn child in Medicare. We use the same single process to make sure that the families are receiving all of their appropriate payments for a newborn child and that we are ready to go when the parents apply for the payments. It is one form that collects that information. It might go into different systems but to make it convenient for parents we have one form.

CHAIR: And they go into different databases?

Ms Wilson : Of course. Our Social Security system is in the ISIS database and the Medicare system is a separate database, albeit those databases do talk to each other for things like immunisation and eligibility for particular supplements under the family tax benefit.

Dr Antonius : That reflects the arrangements that the Centrelink data is in the custodianship of DSS, the Department of Social Services. If Mr Innis were on the panel I think he would be able to explain that part of it. If DHS was asked to divulge data to AIHW that would depend on what dataset and who is the custodian of that dataset. For MBS, PBS and Medicare enrolments data it would be the Department of Health but we are the ones who will pursue the public interest certificates and then share that with the Department of Human Services to forward it to the requester. If the requester is asking for social services data then they would be directed to the Department of Social Services.

Senator MOORE: Why Social Services and not Human Services if it is Centrelink data?

Ms Wilson : I can speak to this. The Department of Social Services are the data custodians of that data, so while the Department of Human Services holds that data, we are not the custodians of it. It is the same as with the Medicare MBS and PBS data that we collect in the course of our business: we are not the custodians of that data; we are the holders of it.

Senator MOORE: Are you custodians of anything?

Ms Wilson : We are custodians of the data about how our customers behave in our service delivery systems.

Senator MOORE: Which is? I am genuinely struggling in terms of Human Services being the department that has Centrelink and DVA now and the Medicare process, when I would have thought, simplistically, that if you are looking at the custodians for anything to do with human services, it would be Human Services. That was not clear to me until that last question. Basically, Human Services are not custodians for any Centrelink data, even though you collect it all, and you are not the custodians for the Medicare data, the MBS data—that belongs to Health. What is your status with DVA?

Ms Wilson : I would have to take that on notice.

Senator MOORE: We have had significant evidence today to say that the DVA process has been different from other forms of data access. A number of people giving evidence have said that it has been much simpler to work within the DVA process than it has been in any other department. I do not want to verbal them, but I think they came very close to saying that DVA had been moving very rapidly down the process. So you do not own Centrelink, you are not sure about DVA and you do not own Medicare. What is the data for which you are the custodians? What constitutes the behaviours of people in your area?

Ms Wilson : For example, the data about how a customer uses our online systems to interact with those other systems is data that belongs to the Department of Human Services. I do want to clarify something though. While the Department of Human Services is not a data custodian, we are the stewards of that data.

Senator MOORE: Dear God!—there is going to be a difference between the terms 'custodian' and 'steward'.

Ms Wilson : A steward is somebody who sees the data through and ensures that it is well looked after and that the guidelines are adhered to.

Senator MOORE: Is that a technical term?

Ms Wilson : It is a technical term. It is embedded within our policies in the department.

Senator MOORE: So we have a steward of data now.

CHAIR: So it is a minder but not a holder. A custodian is a holder of it.

Ms Wilson : Actually, the enterprise data warehouse that holds that data is within the Department of Human Services servers, if you like. We keep it in the boxes and it is accessed by other departments. We supply data feeds under certain circumstances, under agreement with other departments. But we are the data stewards, and that is a technical term, as is custodian, in my understanding.

Senator MOORE: The term 'steward' has not been used in any of the submissions yet, so that is a new term for us.

Ms Wilson : I think it is a unique position.

Senator MOORE: DSS is the custodian of who is on a payment, but for interactions of people on a payment it is Human Services.

Ms Wilson : Yes, that is right.

Senator MOORE: So going on and off, changing payment type, adding people—that would be the kind of thing that the department would be the custodian for, but not for who is on the system in a Centrelink sense. The reason I am stressing is that a lot of the need for linkage is with data that I would have determined was Centrelink data: income levels, size of family, interaction with the system with different forms of payment—that kind of information.

Ms Wilson : The Department of Social Services is the department that has the appropriation for payments, like the disability support pension. The Department of Social Services is the policy owner and custodian of that data. The arrangements for the linking of that data under the multiagency data integration project, which is one of the things that has been announced under the Prime Minister's Public Sector Data—

Senator MOORE: 'Madip'.

Ms Wilson : That is right. We are trying to come up with a better name.

Senator MOORE: It works.

Ms Wilson : Now that you have said it in public we are stuck with it, I think.

Senator MOORE: In that submission it has 'multiagency data integration project' and 'Madip' in brackets, so I thought that was already a term.

Ms Wilson : Yes. The arrangements for the linking of that data are such that the Bureau of Statistics will act as the integrating authority. That is an accepted term under the integrating authority arrangements. There are supporting agreements between each of the departments to allow sharing of that data with the Bureau of Statistics. There are public interest certificates, from any of the policy agencies that are custodians of that data, to support the release of the data for linkage. Those arrangements ensure that we are releasing the data only under the circumstances that are allowed under the various pieces of legislation and under privacy legislation.

Senator MOORE: So any questions about how the multiagency data integration project will operate need to go to stats. Is that right?

Ms Wilson : To the ABS, I would suggest, yes.

CHAIR: A mud map, on notice, of who looks after what would be really helpful. If somebody could get that to us and clarify where the doors are on the ends of the corridors—I can hear the Get Smart music in the back of my head as doors are opening and closing around me, when I look through that. Unless you want to pursue that line, Senator Moore—

Senator MOORE: No.

CHAIR: I want to go to evidence we had today about the cost of accessing data. What are the costs, how does that work and what are the considerations for creating a new and innovative way of people accessing this data?

Mr Richter : Maybe I should speak to that. As you may be aware, the Australian Institute of Health and Welfare receives about 30 per cent of its funding from appropriations, so 70 per cent of our revenue comes from the provision of goods and services to others. We run our data-integrating authority and data-linkage services on a cost-recovery basis. That is, essentially, the cost of a salary plus the overheads associated with running buildings and things like that. When somebody wants us to do a day's work, it is the salary plus overheads. That is how the cost is generated. There are other organisations involved in the chain. If you have a state-linkage organisation, a Centre for Health Record Linkage in New South Wales, they are also running on a cost-recovery basis, so the researchers have to pay the costs that are incurred to deliver the service they are asking for.

Senator MOORE: At what level salary?

Mr Richter : It would be, typically, in the public sector, ASO6 or EL1 type level salaries.

Senator MOORE: Can you tell me numbers?

Mr Richter : Yes. An EL1 is about $95,000 a year or something like that.

Dr von Sanden : We have standard costs per linkage, so we could give you some idea of that, instead, which might be more indicative—

Senator MOORE: Can you get it from the website? I would be very pleased to get it from you but I was just wondering whether it was on the website, in terms of public disclosure.

Mr Richter : I would have to take that on notice; I do not know. I would say that the first thing we do when we talk to a researcher is talk about cost, so it is not as though we are trying to surprise them. Could we take that on notice?

Senator MOORE: It would be fine if you took that on notice. I thought there might be a standard rate that would be based on salary plus overhead but—

Mr Richter : Yes, I was using them as an indication. There are standard rates, which is an amalgam of all of the people, but that is the level of salary that we are talking about that goes in to make up the rate.

Senator MOORE: Are there any areas where there is no charge, in terms of determination of public interest or something like that? You can take that on notice as well, if you like.

Mr Richter : There are examples we could provide of where we have done that, yes.

CHAIR: Where you have enduring datasets available, are they going to become free for people to access? Once they are done, they will be there.

Mr Richter : We do not plan to create enduring datasets; essentially, that is the approach the ABS is taking for these very large projects, like the multiagency data integration project. We will be creating enduring data linkage keys, which will cheapen, over time, the cost of data linkage. It will be much more efficient. But until our budget situation changes, if it does change, we have to cost recover. So we would have to cost recover any marginal cost associated with any particular data linkage.

CHAIR: I want to, for a moment, be the champion of rural and regional Australia, for example—the bush is where the Labor Party was founded.

Unidentified speaker: Barcaldine.

CHAIR: Exactly. We have had some evidence this afternoon, and I did not get to all of it, about the inaccuracy of data collection and the distortion, which does not tend to things like an index of need: 'A number of organisations currently working on index of need would go some way to address the problem of need analysis that compares cities, rural and remote, but does not compare the GP numbers per 100,000 population.' The granularity is at the heart of the problems with the data around the regional and rural profiles that look like they exist within the data. There is some concern around that.

The submission also raised the issue of the reporting of rural health data by analysts with little contextual understanding of the rural health landscape. For example, PBS data has been repeatedly reported without considering the fact that there are proportionately more people in rural areas on a health card and, therefore, eligible for a PBS co-payment. So we are talking about degrees of granularity about information being made available. Are these part of your ongoing considerations with regard to the data?

Mr Richter : I will just talk about it from a technical perspective without getting too technical about this. With any of the datasets you have confidentiality considerations and you have to apply some kind of mechanism. In the case of providing data, for example to a rural health researcher, you have to avoid disclosure. You do that in various ways but, typically, you reduce the level of detail of the classification. In that particular case, you may not be able to supply the lowest level of the ABS's area classification, you may have to go to SA2, which is above that, or even above that. I think that is what they are talking about. They may be expressing some frustration about not being able to access data, because it is potentially going to disclose an individual. This is, particularly, in the context of a rural area, where there are not a lot of people, and you are doing the research and you can find some attribute about someone—they live in Kununurra or something—and then you are going to find another attribute, and then you are going to disclose a lot of information that you do not really intend to or do not want to or you can't, under the Privacy Act, so I think that is what they are getting at. But I would have to look at the questions they asked to really get to it.

CHAIR: Because of the capacity of the PHNs—and we had some evidence in that last session about the size of the PHNs being too big to respond to the uniqueness of particular populations in regions—they need data that is specific to particular contexts rather than the broad data. This is where the real critical issues about privacy, in a very different way from 1951, are emerging. Are you aware of these and how are you handling that?

Mr Richter : I might have to defer to the Department of Health because it is getting into the policy area.

Mr Crettenden : This week we released a facility that provides data at PHN level and, recognising that PHNs are of significantly different size to each other, we also have the data at SA3 level in the same facility. We are making available MBS, PBS, mental health and aged-care data, and that data is available on the website for PHN use and public use as well. We are trying to give as much easily available data out at as low a level as we can, but the point still remains that it is very easy to identify individuals once you get to small areas and, in particular, once you start linking data. It is very easy to find what are called 'population uniques' in a large dataset in a small geographical area. I am not aware of a clever technical way of getting around that at the moment.
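Editorial illustration: the 'population uniques' problem described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a description of any agency's actual disclosure-control method. The records, area codes and the helper `count_uniques` are all hypothetical, invented only to show that coarsening the geography tends to reduce the number of records that are unique on a set of attributes.

```python
from collections import Counter

# Hypothetical unit records: (fine_area, coarse_area, age_band, sex).
# All values are invented for illustration only.
records = [
    ("A1", "A", "30-39", "F"),
    ("A1", "A", "30-39", "F"),
    ("A2", "A", "30-39", "F"),
    ("A2", "A", "60-69", "M"),
    ("B1", "B", "60-69", "M"),
    ("B2", "B", "60-69", "M"),
]


def count_uniques(rows, key):
    """Count records whose combination of key attributes occurs exactly once."""
    combos = Counter(key(r) for r in rows)
    return sum(1 for r in rows if combos[key(r)] == 1)


# Fine geography: smallest area code plus age band and sex.
fine_uniques = count_uniques(records, lambda r: (r[0], r[2], r[3]))

# Coarser geography: the same attributes with the broader area code.
coarse_uniques = count_uniques(records, lambda r: (r[1], r[2], r[3]))

print(fine_uniques, coarse_uniques)
```

In this invented example, four of the six records are unique at the fine area level and only one remains unique once the broader area code is used, which is the trade-off between detail and confidentiality that the witnesses describe.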

Ms Wilson : The Department of Human Services does take applications from researchers for provision of information at a more detailed level and has a process where the Department of Health considers research proposals and the provision of data at a lower level. Some of those requests are granted and some are not. In some cases, if private details are likely to be released, we might do something like a consent study. We have done consent studies in the past where we have written to the people who would have their information shared and sought their consent before that information is shared. So it is not that it cannot happen, but in the public domain that data is not available because of the risk that somebody may be identified. Under our processes they can come to the department and ask for the data or information and have that released. Of course, we work very closely with the Department of Health in those circumstances and we have a committee process that considers each of those requests.

CHAIR: One of the recommendations put to us today—and it reminds me of the charitable donations inquiry—is about seeking ethics approval. It seems the researcher had to go through six different ethics approvals for the same project because it required information from a range of agencies. I am not familiar with the details. Are you aware of a proposal that would address that issue for researchers?

Dr Antonius : If I may clarify, Health's position on this is for our ethics committee to recognise any pre-existing approvals from other ethics bodies which are of a similar standard. Normally we would refer to the NHMRC's human research ethics standards and, if we receive proof that a research application has had ethics approval from another body, we would not make them go through another ethics process. Sorry, Mr Richter.

Mr Richter : I was going to say very much the same thing. On the very positive side, with access to MBS and PBS unit record data, the public interest certificate that will be issued by the department will allow us to release non-identified data on the basis of an approval from our ethics committee under our legislation without having to go back to the ethics committee in the department. That is going to be a very significant step forward.

CHAIR: Will there now be a spot where people can say, 'All these agencies have the same standard. If I get through one, I will get through the lot'?

Mr Richter : I have to qualify that a bit, because it is certainly going to be applicable to the MBS and the PBS datasets but when you start to introduce other datasets you get into different legal domains, particularly when you cross into another jurisdiction.

CHAIR: Do you mean internationally or within Australia?

Mr Richter : Within Australia. If we want to link data from, say, Victoria to Commonwealth data, there may be a separate ethics committee process that the Victorians would insist on going through.

CHAIR: That is exactly the kind of thing where people are saying 'Enough'. We are in the same country. We are all citizens of the same country. They are talking about getting through the state level a whole lot more quickly than the Commonwealth level. The blame has been apportioned to the Commonwealth to date. You are talking about a significant cultural change. Can I just say that, based on what we have been hearing today, the last thing that we need to be hearing now is: 'Now it's going back to the states for their verification as well.' These conversations need to be had in order to streamline the process, with no diminution of integrity. I think it was the replication—because it is more than duplication, isn't it?—and then the reporting that follows on from that.

Mr Richter : I completely agree. We have actually addressed that in our submission. It is still a problem. It varies across Australia, so you will have a much more accommodating arrangement in one state than you do in another. It is certainly something that I think the Population Health Research Network have talked about as well. So we really need to get together and talk through this and get one sensible, consistent approvals process.

CHAIR: Is that in hand?

Mr Richter : For example, we are working with New South Wales on the enduring data linkage key. Associated with that, I hope they will agree to a similar arrangement which we have with the Department of Health that if our ethics committee approves something or if theirs does then we would be able to link the data. That is not in place yet. I think it is going to take some time for that to unfold.

CHAIR: One of the other issues that was raised was access to data custodians. This was from Dr Gidding. She was talking about the immunisation register and the response that she had over the two years, I think, it took her to get the approvals. Trying to get to the data custodians was finding out who had what and then getting them and actually speaking to them. When I asked her for more information she spoke about people saying, 'That's not really my job,' or 'I'm so busy I can't do it.'

Senator MOORE: It is not my core business.

CHAIR: 'It's not my core business.' What is the status of data custodians? We have got an old culture and clearly it looks like we are shifting to a new culture. How does that fit in with people's KPIs? How much is that a driver of success in different departments and agencies to make the management of data, the sharing and integrity of data, core business rather than peripheral? Where is that?

Ms Foster : In terms of access to data and better use of data, the Department of Health has been through a number of different reviews that talked about better use of data, better evaluation, better research to inform policy development towards better health outcomes. In fact, that was the reason the division I head up was established. It was to try and make better use of data and ensure better use of evaluation. Certainly, in my KPIs it would be a fairly key feature that there be better use of data. In the department's vision, for instance, as I mentioned earlier, there is that reference to better health outcomes and one of the key underpinnings of that is that better use of research and data. So it really is quite widespread throughout the department. There is a major cultural change that is occurring.

Dr Antonius : I think Dr Gidding's question relates to the expertise and the knowledge in regards to the data that she was going to request access for. When I listened to that question my suggestion would be to direct her to something like METeOR that AIHW maintains. That is publicly accessible infrastructure by which all researchers, all members of the public, can access and find out what datasets are being collected, what the data items are, what they are being collected for and how they can be used for research. For example, the Department of Health has recently released metadata around the MBS and the PBS data collections on METeOR and that has become publicly accessible.

Dr von Sanden : I believe I worked on that same project with Heather Gidding—if not, it was one very similar. I want to support what Nicky just said in that essentially the data custodian is usually responsible for arrangements for people to access the data, and the questions Heather Gidding had with the project we were on were: what do that data and the variables actually mean? That information is stored in metadata repositories or data dictionaries, so it is not usually the responsibility of the data custodian to know that but it can be the responsibility of the data custodian to help produce that.

CHAIR: And have it stored in the data dictionary?

Dr von Sanden : Or METeOR or some other metadata repository, yes.

CHAIR: So how many metadata repositories does a country need? I am thinking of a researcher who has just finished their degree. They are so excited because they have got their PhD—and they are still not going to have to pay for it because we have managed to hold that back—and are ready to do research. Where do they go?

Mr Richter : That is a good question. I am volunteering to answer this because it is kind of dear to my heart. I would like to think I have some expertise in it from a long background at the ABS and other places. Think of metadata as being of three kinds. There is definitional metadata that says 'age' means how old you are, if you like. Then there is procedural operational metadata, which is actually the kind of metadata that systems need to be driven by. So it will have protocols in there and descriptions of field lengths, character types and things like that. It is very technical. You do not need to access that usually if you are going to do your PhD thesis. What you are really interested in is the definition of the item. You may also be interested in the other set of metadata which is conceptual metadata, concept sources and methods. How the ABS compiles the balance of payments, for example, is described in a lot of documentation.

The problem with metadata is that it is so easy to get lazy when you are doing analysis and creating data. You get carried away with creating a new variable, coming up with a wonderful design of the collection and you do not have the discipline to sit down and describe this thing in the detail that it deserves. Also, when you have an administrative by-product dataset like, for example, the Medicare Benefits Schedule and the Pharmaceutical Benefits Scheme, the kinds of changes and the nuances of the changes are not always described in metadata in a way that would be—

CHAIR: Accessible to somebody who was not there at the time watching the process?

Mr Richter : Yes. So the answer to your question is we think that we could get to the stage in government where there is one place to go to for definitional metadata that is in support of the major datasets. We have been talking to PM&C about that and we are hoping that METeOR will be picked up as the place for that to be stored, and the Department of Health is already doing it. Concept sources and methods metadata could be put into there but it is going to require that kind of cultural change and the kind of discipline that is hard to make happen, but that is what is needed.

CHAIR: That is a different answer to what I thought I might get.

Senator MOORE: I am still not sure of the difference between metadata and data, but I am not going to go there. Mr Richter, in your presentation you spoke about the new METeOR and said that you had an industry presentation yesterday. What does that mean, and what constitutes 'industry' in that definition?

Mr Richter : I should have said 'briefing'. It was an industry briefing presentation. We used AusTender to publicise the fact that we, with the assistance of the Department of Health and the Australian Health Ministers' Advisory Council, got some funding to go forward with a request for expressions of interest from industry. So we publicised the fact that we were going to be doing that and invited people to Canberra and on a webinar, and we just talked about what we would be asking for. There was also an accompanying, quite voluminous set of documents describing the requirements. That is essentially—

Senator MOORE: Without having to access that information, which we will not, can you tell us in brief what is going to be required. What is the intent of the project and what have you been funded for?

Mr Richter : We have been funded at this stage to call for requests for expressions of interest and to evaluate them and to produce a report that will go through the health committee structure, which is NEHIPC, the National E-Health and Information Principal Committee. It will eventually go up through to the Australian Health Ministers' Advisory Council mid next year, and that will contain a proposal for calling for tenders and getting funding to actually build this new thing.

The first part of your question is: what is it? Essentially, this facility that currently exists has been in place for 10 years, and it is just really old fashioned and clunky. It does its job—it contains and stores metadata—but it is hard to use in a modern way. It does not interface with modern systems and it does not use what the IT people call 'metaphors'. It does not use modern metaphors to access it, so it does not look like a modern IT system, and people do not like to use it.

There are a lot of opportunities as well to be much cleverer about the way metadata is created and accessed and searched for. The idea is that you create metadata once and you use it many times. If you can use it many times, you are actually then getting into a situation where you have got, to use the technical term, a master data management situation where you are reducing the number of information concepts that are used. So, instead of using your own area classification, for example, you use the Australian Bureau of Statistics area classification, which means, then, you can compare the data items across areas. That is essentially what it is about; it is facilitating this information management across government.

Senator MOORE: So it is going to be a tool.

Mr Richter : Yes, exactly.

Senator MOORE: When you say 'industry', what is industry in this sense?

Mr Richter : Largely software companies.

Senator MOORE: So it is IT.

Mr Richter : Yes, IT companies mostly.

Senator MOORE: The way you describe it, it will go up through the process by mid-2016.

Mr Richter : Yes. But, hopefully, we will get funding, and then they will start to build it, and we will have a new facility.

Senator MOORE: You do not have any idea of the size of that tender yet?

Mr Richter : No, we do not. We know how much we estimate it at—and it is in the documents, so I can tell you. We thought that, to provide the basic mandatory requirements that we have specified, it would cost about $1½ million. But to do more than that—which we are looking for; we are looking for ideas—we do not have a figure for that.

Senator MOORE: That would be funding through the health department?

Mr Richter : It would come from jurisdictions. The Commonwealth, typically, in these circumstances, funds about 50 per cent and then the rest—

Senator MOORE: So this would be a Health project.

Mr Richter : Yes.

Senator MOORE: We talked a little bit about the PHN. The department's submission talks about the PHN having a new website that is going to have all the relevant databases on it and is to be launched in early December 2015. Have I missed it?

Mr Crettenden : It was released this week.

Senator MOORE: What is on that? The description here is:

It will also provide links to other relevant data sources. The PHN website aims to reduce duplication of effort in data collection and storage and improve the quality and availability of data.

Mr Crettenden : What we have provided on the website at this point is all of the data we were able to pull together from within the department at SA3 level for each of the PHNs. We have information on MBS and PBS activity. We have information on a number of aged-care programs and a number of mental health programs. There is a very simple data display-and-analysis tool so that the PHNs can compare themselves to other areas, and they can also compare the smaller SA3s within their own area.

Senator MOORE: So it is all the existing data that is considered to be relevant?

Mr Crettenden : Yes.

CHAIR: And it is now available to the PHNs.

Mr Crettenden : Yes. Most of it is also publicly available via the department's website; we just have not publicised it at this point.

Senator MOORE: And the S-thing is a stats definition?

Mr Crettenden : Yes, SA level 3. It is the size of a local government area.

Senator MOORE: And that is a stats determination, isn't it?

Mr Crettenden : Yes, that is right.

Senator MOORE: Do all of the PHNs have SA2s, neatly?

Mr Crettenden : Yes, they are all composed of whole SA3s, so it is possible to build them up.

Senator MOORE: That was not the case in the past, so that is good.
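Editorial illustration: because each PHN is said above to be composed of whole SA3s, PHN-level figures can in principle be built up by summing SA3-level figures through a concordance. The following is a minimal sketch of that roll-up, assuming a simple one-to-one mapping from SA3 to PHN. The area codes, counts and the helper `roll_up` are hypothetical and do not reflect the department's actual data or tools.

```python
from collections import defaultdict

# Hypothetical concordance: each SA3 belongs to exactly one PHN.
sa3_to_phn = {"SA3_001": "PHN_X", "SA3_002": "PHN_X", "SA3_003": "PHN_Y"}

# Hypothetical service counts reported at SA3 level.
sa3_counts = {"SA3_001": 120, "SA3_002": 80, "SA3_003": 50}


def roll_up(counts, concordance):
    """Sum SA3-level counts into totals for the PHN each SA3 maps to."""
    totals = defaultdict(int)
    for sa3, n in counts.items():
        totals[concordance[sa3]] += n
    return dict(totals)


phn_totals = roll_up(sa3_counts, sa3_to_phn)
print(phn_totals)
```

The roll-up only works cleanly when no SA3 is split across PHNs. Where boundaries do not align, some form of apportionment would be needed, which this sketch does not attempt.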

Ms Foster : Just to clarify, the portal does have two tiers, I suppose. There is the publicly available front or part—

CHAIR: So a local journalist could get on and see some data?

Ms Foster : They could see some information. But, for the PHNs themselves, there is another element, another lot of data, that is password protected that only they can access. That is because of the detail that is available, and that needs to be that bit more protected in terms of privacy. But, yes, anyone can get onto the portal now. I have it loaded on my iPad.

Senator MOORE: Will that be maintained by the department or by the PHNs?

Ms Foster : By the department.

CHAIR: Can you just remind me what data is on there now?

Ms Foster : There is MBS data, PBS data, aged-care data and mental health data, the Australian Childhood Immunisation Register, chronic disease data—that means it includes the Diabetes Care Project and the National Diabetes Audit data—and health workforce data.

Senator MOORE: What constitutes mental health data in that sense?

Ms Foster : That is referring to access to allied psychological services and the Mental Health Nurse Incentive Program—

Senator MOORE: So it is about the usage of those programs. That is what constitutes mental health data.

Ms Foster : Yes.

Mr Crettenden : And the Medicare mental health items.

Senator MOORE: MBS and PBS?

Mr Crettenden : Yes.

CHAIR: There is plenty more to ask, and I indicate that we might put a few questions on notice. But, given that we were unable to connect with Mr Innes and Dr Kalisch, the committee might actually get back together with you—certainly with their two agencies—in Canberra early in the New Year if we have some questions to clarify some of the things we have opened up today. Thank you very much for being with us.

Senator MOORE: I have one last question for Ms Foster. Another committee we were on looked at the commitment of the department to—what did they call them?—the determining factors of health. It is the term for all of the other things that can impact health.

Unidentified speaker: The social determinants?

Senator MOORE: That is it. I went completely blank. There was a commitment that all decisions in the department were going to be made with an awareness of the social determinants of health. So how is that being used in this project around megadata? I have always been interested in the department's response, in that other committee we were on, that its overwhelming focus was on the social determinants of health. Within this project, how were they factored into deciding the data sources, what will be published and those sorts of things?

Mr Crettenden : Are we still talking about the PHNs?

Senator MOORE: No, we are talking about the whole project. Maybe that could go on notice. I think it has already been said that many of the people who came to us were talking about this issue—that the data sources with which they wished to be linked in terms of developing analysis were very much based on the social determinants of health.

Mr Crettenden : I think the multi-agency dataset that the ABS is sponsoring is very much directed at that kind of thing.

Senator MOORE: It could be, but I would just like to get something back from the department about how the social determinants of health are factored into the work that has been done under the departments in this process.

CHAIR: I think the other thing was, according to a question today, the delivery of contracts to different agencies to do different work—

Senator MOORE: Consulting.

CHAIR: Yes, consulting. But also there was the fact that they were given projects and then sent, much later, data requirements about what they had to report on. So they had started the work and it was sort of retro-fitting data down the track. The indications of the work that they were doing, the gathering of data, the forward thinking—'What data do we hope to collect from this for greater purposes rather than just project by project and, certainly, just putting indicators of performance in down the track rather than in embedding them up-front in the process?'—were raised with us today as well. I just wanted to put that on the record and see if you do have any policies around thinking about big data and bringing it down in scale to the outlines of projects that the department might be funding or research that you are commissioning. That was another question that was raised—the capacity of the department to commission research. We have more to talk about, but we will draw to a close.

Can I just see if Senator Moore will—

Senator MOORE: I will—

CHAIR: the tabling of the National Rural Health Alliance statement that we received today. I am also seeking a resolution for the date of return for answers to questions on notice. We are indicating 14 January as the date. Would you like to move that way, Senator Moore? I think we have done all of those official things. I want to thank all of the witnesses who have been with us today. Thank you for giving your time and expertise. Thank you also to Hansard, broadcasting and the secretariat. That concludes today's public hearing. This is the last hearing for this committee for 2015. I particularly acknowledge the great work of the secretariat, Mr Stephen Palethorpe, Ms Jed Reardon, who is writing furiously in Canberra and not with us today in Sydney, Mr Michael Kirby, who has joined us lately, and Josh See. They have done great work for this committee in the course of 2015. I wish everybody a very Merry Christmas. The committee stands adjourned.

Committee adjourned at 17:17