University of California Libraries: Research Data Matters

October 13, 2019 By Stanley Isaacs


Something becomes data when it is used as evidence of some phenomenon for some research purpose. And that allows us to recognize that we’re surrounded by data. Data is everywhere and nowhere at the same time. Everything in this room, the lighting, the people, the dust, the sounds, can be treated as data if we choose to, but they’re simply part of our atmosphere until we do so.

Anything that you’re using to help answer research questions becomes your data. It’s what you’re basing your answers on, and your methodology is how you get those answers from the data.

I’m an Arts professor, so we define our research differently depending on whether we’re involved in practice or critical studies. In critical studies our research is very much the same as other people’s in the arts and humanities. It’s words, text sometimes accumulated into coherent articles, sometimes not. It can also be images and sound. As far as production is concerned, the research objects are quite often moving images, formerly analog, these days increasingly video.

Data management isn’t something we talked about until recently. I mean, certainly when I was going through graduate school and my early years, we never talked about it. Grants liked you to make your data available somewhere eventually, but it wasn’t pushed that much. Now it’s pushed much more.

For me what it means is, first of all, recognition that when we’re doing research, we don’t own the data. So typically for university professors, the data are owned by the board of regents where we are, because we’re doing our work in our capacity as professors or overseeing the work of graduate students. If it’s grant-funded data from federal grants or state grants, it’s taxpayer money, and so we have a responsibility to manage the data such that other people get to use it afterwards.

We all want to exploit our own data in better ways and make them useful not only for ourselves but, if we can, for other people. Most researchers are plagued with the problems of what happens when their graduate students graduate, when their postdocs leave, when their other research staffers move on, or when the software changes, the tools change, the instruments change. It’s very difficult to migrate and sustain data so you can combine them and reuse them over long periods of time. Where we would like to do data release 1, 2 and 3, what usually happens is it’s versions of data: grad student 1, grad student 2, grad student 3. This is a problem that everyone has, and the sooner the researchers and principal investigators think about this and start to build some management structures that will let them sustain data over time, the better off they’ll be.

I got a grant from the National Institute of Justice to reanalyze some existing data. In that case, we had the data available to us, but it was incomplete. There was so much missing data, and there were such real problems with it, that we never ended up being able to publish anything from it. This was a quantitative data set. It had been maintained. The data had been archived, but no one in those days was watching to make sure that it came in with enough quality that you could really work with it.

I don’t think people who are in research or scholarship understand that they’re not just speaking to their students and their peers. They’re also speaking to the world. You can go ahead and try to write a book for a popular publisher, outside the university press world, but if you let your materials be seen and reused by members of the public who may not have academic credentials, it gives your work legs. It gives your work a level of justification and a level of gravity that it would otherwise not have.

The library is developing a larger and larger group that has expertise across different kinds of disciplines, different domains, different structures, that can help people in managing their data better. I am delighted to see that the library here is taking the lead on this. Between libraries and research offices, that’s a major place to get started in getting the word out to everyone.

And I think universities have a real responsibility. The University of California is taking the lead and talking about open access publications and data management. We need to make our data and our publications available. These are part of the public good. Education is a public good, and we need to make the fruits of that education or that research available to others.