
Integrating Data: Boosting the capabilities of researchers to inform policymaking.
Miles explores how data linking can help tackle cross-cutting issues in an increasingly uncertain world, and how the ONS' new Integrated Data Service will provide a step-change transformation in how researchers will be able to access public data.
Joining him are ONS colleagues Bill South, Deputy Director of Research Services and Data Access; Jason Yaxley, Director of the Integrated Data Programme; and award-winning researcher Dr Becky Arnold, from the University of Keele.
TRANSCRIPT
MILES FLETCHER
Welcome again to Statistically Speaking - the Office for National Statistics Podcast. I'm Miles Fletcher and in this episode, we're going to step back from the big news-making numbers and take a detailed look at an aspect of the ONS which is less well known, but arguably just as important.
The ONS gather an awful lot of data of course, and much of it remains valuable long after it's been turned into published statistics. It is used by analysts in government, universities and the wider research community. So we're going to explain how that's done and look at some really interesting and valuable examples of how successful that has been to date. And we're also going to hear about a step-change transformation that's now underway in how public data is made available to researchers, and the future potential of that really important, exciting process. Our guides through this subject are Jason Yaxley, Director of the ONS's Integrated Data Programme; Bill South, who is Deputy Director of the Research Services and Data Access Division here at the ONS; and later in the podcast we'll hear from Dr. Becky Arnold, who is an award-winning researcher from Keele University.
Right Bill, set the scene for us to start with then, we are talking here about the ONS Secure Research Service, take it from the top please. What is it? What's it all about? What does it do? What do we get from it?
BILL SOUTH
Hi Miles, thank you. Yes, the Secure Research Service, or the SRS, is the ONS' trusted research environment. We've been running now for about 15 years, and we provide secure access to unpublished, de-identified microdata for research that's in the public good. So in terms of numbers, we hold over 130 datasets, we've got about 5,000 researchers accredited to use the service, and about 1,500 of those would be working in the system at any given time on about 600 live projects.
MF
So what sort of data? What is stored and what's made available? Is this survey responses?
BS
Traditionally the SRS has held most of our ONS surveys. So that's the labour market, business... all of our surveys really. In the last four years, thanks to funding we've received from Administrative Data Research UK (ADRUK), we've been able to grow the amount of data we hold, so now we've increasingly got data coming from other government departments. And we've got more linked datasets that enable us to offer new insights into the data.
MF
And so these are people's responses to survey questions and people's records, as well as data that are held by other departments?
BS
Indeed, yes. The data coming from other departments is often administrative data, so not from surveys but more admin data.
MF
And a lot of the value in that is in being able to compare and to link this data to achieve different research insights?
BS
Absolutely. I mean, a good example of that is a dataset that's been added in the last year or so, where our ONS census data from 2011 was linked to educational attainment data from the Department for Education into a research dataset called Growing up in England (GUiE). And it's hugely important because we have a lot of rich information from the census, but linking that with the educational attainment data offers new insights about how kids do at school, and how that's linked to the characteristics of their background.
MF
So you use the underpinning of census to provide a really universal picture of what's going on across that particular population, and therefore gain some insight into how people have achieved educationally in a way that we wouldn't have done before. Of course, the power of all this is clear in that example, but a lot of people might think, oh my gosh, they must know an awful lot about me in that case. Tell us about how privacy and anonymity are protected in those circumstances.
BS
Yeah, absolutely. It's a central part of our operation, and clearly the word secure in the name is key there. So we follow a five safes principle which underpins everything we do. The five safes are safe people, so anyone who uses the SRS has to be trained and go through an assessment to be accredited by us to use the environment. Once they're accredited, they then have to apply to have a project running in the system, and that gets independently assessed. There are a number of checks around whether it's ethically sound and whether the use of data is appropriate, but the key thing really is around the public good. So all research projects that happen in the SRS have to be in the public good and there's a commitment to be transparent. So for every project that happens in the SRS, there's a record which is published on the UK Statistics Authority website.
The third safe is around the settings, so it's a very controlled environment where people access the data. The fourth safe is around the data, so although we've said it's record level data, it's already de-identified. Names and addresses, any identifiers, are stripped out of the data before researchers can access it. And the final safe, the final part of the researcher journey if you like, is around outputs. What that means is we do checks to ensure that when any analysis leaves the environment, no individual or business can be identified from the published results.
MF
So in essence, you must convince the ONS that you are a bona fide researcher, and you also have to convince them that what you're doing is definitely for the public benefit.
BS
That's right. And the other thing that's worth noting is that the SRS, like a number of other trusted research environments across the country, has been accredited under the Digital Economy Act to be a data processor, which means we go through a rigorous assessment process around the security of the environment, but also our capability to run it. So that's our processes, our procedures, whether our staff are adequately trained to run the service. That's a key part of that accreditation under the Digital Economy Act.
MF
So, on that point then about anonymity, you can drill right down to individual level, but you'll never know who those individuals actually are or be able to identify them?
BS
That's right. Researchers typically will run their code against the record level data, but when they've got the results of the analysis, there are clear rules that say you won't be allowed to take out very low counts. So that means, like our published outputs, there's no way of identifying anyone once the research is published.
MF
And the SRS has built up over the years a good reputation for actually doing this effectively and efficiently.
BS
Yes, I think that's fair to say.
We have a good reputation, and the service is growing in terms of the number of datasets and the number of projects and the number of people using it. So, I think that speaks for itself.
MF
Okay, let's pull out another, I think, powerful example of why this facility is so important, and that comes from the recent COVID pandemic. Many listeners will be aware that the ONS ran a very, very large survey involving upwards of 100,000 people providing samples and taking COVID tests, which were sent off to be analysed, creating an awful lot of community level data about COVID infections. We in the ONS then publish our estimates, and continue to do so as we record estimates every week of fluctuating infection levels. But behind all that work, there were expert researchers in institutions around the country who were doing far more with that data. And the SRS was fundamental to delivering the data to them. Tell us about how that operated, Bill, and some of the results that we got out of it.
BS
Yeah, sure. I mean, the COVID infection survey that you refer to there, that dataset is available for accredited researchers to apply to use, and they have done. But we've also brought in a number of others: about 20 COVID-related datasets are in the SRS, so things around vaccination or the schools infection survey, mortality, etc.
So since the start of the pandemic we've had over 50 projects that have either taken place and completed, or are currently underway, in the environment. Some of those are directly using the COVID-related datasets, so looking, if you like, at the health impact. But there are also projects that are looking at, if you like, non-COVID data, economic data or education data, that are dedicated to understanding the impact of COVID.
MF
What sort of insights have we seen from those?
BS
In terms of those using the COVID-related data, there's been analysis to highlight the disproportionate impact of the virus on ethnic minorities, which went on to inform a number of government interventions. Another project assessed the role of schools in coronavirus transmission. We had another project that was run specifically on behalf of local authorities to inform their response to the pandemic, which offered insights into how risk differs between occupations. There was also research into footfall in retail centres and how business sectors were affected by the pandemic. So a really huge range of things. There were other research projects looking at the impact, and an example there was a project that looked at learning loss, so kids not being in school for that sort of 2020 to 2021 academic year. Similarly, the Bank of England ran a project looking at the financial stability of the UK during the pandemic period. So hopefully those examples give
Information
- Show
- Frequency: Updated monthly
- Published: 31 January 2023 at 1:00 am UTC
- Length: 31 minutes
- Season: 1
- Episode: 11
- Rating: Clean