Prof. Sabina Leonelli (Professor of Philosophy and History of Science) talks to Dr Chris Tibbs, Research Data Officer at University of Exeter about open research, the use of Artificial Intelligence in research, and the importance of understanding the diversity of research environments when implementing open research practices. Podcast transcript Chris Tibbs: Hello and welcome. I'm Dr Chris Tibbs and I'm the University Research Data Officer, part of the open research team based in the library here at the University of Exeter. My role involves providing support for researchers across the university as they work with and manage their research data, and today I have the pleasure to be joined by Professor Sabina Leonelli, a Professor of Philosophy and History of Science at the University of Exeter. So welcome, Sabina. Just to start, would you like to tell us a little bit about the research area that you work in? Sabina Leonelli: Thank you and hello everyone. So, I'm interested in the dynamics of research and research processes. Why is it that people who work in science use the methods that they use, handle data in particular ways, decide to publish in particular ways? Why do they choose certain research goals, and how does that occur historically but also conceptually? And what are the social implications of those choices? Chris Tibbs: That's very interesting. So you're really looking at these sort of different approaches and the different methodologies that different researchers are taking and that's very interesting because obviously different research areas will have different approaches and methodologies that they use. Now one thing that I noticed that you're very interested in, based on your web profile, is obviously open science and openness in research, and the European Commission and the United Nations, among others, all use this term of open science and just so that everyone listening is clear, open science is the approach to research based on openness and co-operative working, and it really emphasises the sharing of knowledge, results and the tools as widely as possible. But I just wanted to point out also that obviously these approaches can apply to all research disciplines, not just science. And so, for example, we are the open research team. And so, I tend to regard open research and open science as synonymous. So, I just wanted to get your take on this, Sabina. Do you see these as separate terms, or do you use them interchangeably? Sabina Leonelli: I also tend to use them interchangeably, but I think it is very unfortunate that it’s the term open science that has gotten so much mileage in the English language because in the English language we are aware of the fact that it does tend to be taken to refer to the natural sciences, more rarely to the social sciences, and never to the humanities and the arts. And this is different for lots of other languages. I mean, most, I guess famously the term wissenschaft in German tends to encompass all of the research domains, including humanities and the arts. I’m very partial to that, partly because I think that we're in a moment where research is so interdisciplinary and the boundaries between domains are so blurred, that actually making strict distinctions between what counts as a humanist approach, and what counts as a natural science approach, or a mathematical approach is becoming more and more difficult. As of course in history it has been very difficult throughout. So yeah, so I'm very partial to the use of the idea of open research in English, but of course we tend to use a lot the term open science too, because this is, as you were saying, very well recognised by policymakers and by funding bodies and a lot of people working in academia more generally. Chris Tibbs: OK. Well, thank you for explaining that. And again, the reason I just wanted to confirm this is because I want to ensure that everyone listening can be clear that what we mean by the term open science and that they don't feel that this doesn't apply to them, maybe because they don't see themselves as a scientist. So that's what I just wanted to clear up, and so that these practises do apply to all disciplines. Now moving on. Sabina, you hold many different roles and one in particular that I would like to mention is that you are the theme lead for the data governance, openness and ethics strand of the Exeter Institute for Data Science and Artificial Intelligence. So, given this particular role, I'd really be interested to hear your thoughts on how you feel artificial intelligence can play a role in the research process, and particularly around openness and the open research. Sabina Leonelli: Yes, thank you. So, I guess openness lies at the heart of what it means to do research, no matter how you look at it, right. I mean, doing research basically means trying to answer a certain question, trying to solve a problem that you may have encountered in your everyday life and within the more scientific landscape, it means doing it in a way that's a bit more systematic, that is susceptible to scrutiny and can be evaluated by others. So, research and science are public enterprises pretty much by definition. If it was only something that, you know, the one individual does in their own room, then it wouldn't really be something that we count as being research or science. And in that sense, openness, the availability of the outputs of research, being able to discuss the methods one uses, the procedure one uses and make them available for scrutiny really are what defines the very idea of research. So, given this, of course, the fact that we now have an emergence of more and more artificial intelligent tools that can be pretty much directly applied to the research process affects the ways in which we think about openness, because this means accelerating the research process to some extent. It means the potential to automate some parts of it, or at the very least work together with machines so that some of the, if you want, hopefully at least more tedious tasks associated to research, the more repetitive ones, the ones that more easily standardised can actually be delegated to machines and then can be iteratively dealt with in collaboration with humans. So in that respect there is a strong temptation to think that bringing AI into research processes will almost automatically improve the openness of research because it will allow, and in fact it will be almost incentive for people to make their methods ever more transparent, to make their data more available, and to be more careful in noting down the procedures that they’re using and make them available to others because all of those strategies make it easier to adapt research work to AI and to make it machine readable if you want so that machines can actually take over some parts of that work. The problem in assuming that AI automatically enhances openness comes at several levels, however. First of all, there is this problem, I'm sure many people have heard about of opacity in AI. The fact that because a lot of the reasoning that machines go through to produce certain outputs, so the type of algorithms that are used, particularly machine learning, tend to become less and less transparent and more and more opaque as time goes by – precisely because the machine is doing operations that humans wouldn't quite do or wouldn't be able to follow in the same way, and we can't quite track every single step that the machine is making in that sense. Then actually the system that is AI powered, becomes by definition less open because it's less obvious how do we read the system, how do we make it more transparent, more scrutinizable and more open for review. Given that there are all these parts of the system which are not necessarily intelligible to humans. So that's one issue that is happening in the area. Another very big issue is the fact that many of the providers of artificial intelligence technologies and particularly tools that are then applied in research, we can only think about large language modelling and tools like Chat GPT, which is produced by a company which contrary to his name, is not publicly funded but is a private company, Open AI. Many of those tools are privately funded. The ways in which they operate is even less transparent because a lot of the algorithms that are used, are actually trademarked, are not available for public scrutiny. A lot of the training data that is used to refine those algorithms is also not necessarily transparent, in some cases not even entirely clear that the data are data that are, you know, in fact right to use, they tend to have been data scraped off the internet in a variety of ways that may or may not be ethically acceptable. And so, we are in a situation where whenever we pick a tool from the internet thinking, oh, great, this is going to help me to do my bibliography; this is going to help me to write my essays; this is going to help me to search for, you know, literary sources on a particular topic, we are immediately given data and relying on tools that are not openly accessible, that have been developed in a way which is not immediately scrutinizable. And in fact, it may be using some of our own information in a way that is not open. So, let's just say that there are quite a few questions that are open around whether the use of AI in research in fact is favouring openness, and whether that's going to happen more and more in the future or in fact it’s going to have the opposite effect. Chris Tibbs: Wow, that's really interesting. I really love that sort of yes, you might think that it's going to help make research more open, but actually in fact it's not so clear and so that is really interesting and I think, the one thing I just wanted to add is about obviously any data that's being used by AI systems, the data need to be well maintained, well documented. You need good data going into training t