Dean’s Blog: Data Science Ten Years from Now

June 29, 2022

I am currently undergoing my 5-year Dean’s review, a standard practice applied to all University of Virginia (UVA) leadership. As part of that process, one of the reviewers asked me whether I saw data science existing as a discrete field in ten years, which could be interpreted as: will we really need a School of Data Science at UVA in ten years’ time?

It’s a good question, and my answer, after some thought, is biased by my own background in bioinformatics, later computational biology, then systems biology, and now biomedical data sciences, as I have previously written. My recommendation in that paper was to forget about what it is called and just do it. While that seems an appropriate answer for that particular field and audience of life scientists, it does not seem an appropriate answer for a school which covers a broad set of disciplines (a.k.a. verticals) and is a distinct organizational structure within a university framework. Nor does it seem a fair answer to those faculty who have or seek tenure in such a school and the many dedicated staff who make it run.

Another way of addressing the reviewer’s question is, what exactly is data science today and how do we see the School of Data Science supporting that definition? And, will that need hold up over time as other disciplines [and fields of study] become more digitally data-driven and practice data science themselves?

Now that we have broken the question down, let’s consider the two interrelated parts. First, what is data science and will that definition change in ten years? Second, will the organizational structure embodied by a School of Data Science applying that definition still be needed and relevant? Let’s take them one at a time.

Will the Definition of Data Science Change?

There are multiple varying definitions of data science. For us, it is defined by the 4+1 model of data science which is described in the same article I referenced above. In brief, we subscribe to 4 domains of data science:

  1. Systems - The hardware and software architectures that permit the study.
  2. Analytics - The methods, such as machine and deep learning, that are applied.
  3. Design - The interrelationship between the human and computer as it relates to data.
  4. Value - The tension that exists between the societal good that comes from the research and the harm that can be done.

While the tools behind these four domains will change and the processes evolve, the conceptualization seems likely to be robust over time. Nevertheless, part of our School’s five-year strategic plan is to evaluate the durability of the model. Will the Domains of Data Science model make sense to us, other scholars, and those that employ our students in years to come? 

At the intersection of the four domains are the disciplines, the “+1”. This seems to be the most volatile part of the model, as disciplines as diverse as religious studies, economics, chemistry, English literature, and many others are embracing the four domains of data science as part of their extended set of research and teaching activities. Over time, if all these disciplines embrace this or some other model of data science, do we need a separate school of data science? In other words, what does a school of data science contribute that is not found elsewhere within the university organization? The answer to this question lies in the current organizational structure of academic institutions and the fact that they are not likely to change significantly in the next ten years, even though, in my opinion, they should.

Organization of Academic Institutions

Academic institutions are traditionally organized around schools and departments. These provide faculty with a sense of belonging around long-standing traditional areas of study like those mentioned above. Conversely, this organizational structure is constraining, as we do not study exclusively within one of those areas. Academia gets around this by having dual faculty appointments across schools and departments, or by creating centers and institutes, all of which are multi-disciplinary and encourage research, cross-pollination, and collaboration, yet frequently struggle to exist. These work-arounds in a fairly rigid framework are challenged by competing interests, guarded resources, inter-departmental policies, and lack of central funding.

The beauty of data science - and a school bearing its name - is that, if done right, it is by nature free from such constraints. As I have previously written, the idea of departments in our school does not make sense, as we subscribe to the philosophy of being a “School Without Walls” to promote a sense that something different is going on relative to a traditional school. That difference is embodied in a Collaboratory model involving the School of Data Science and another school (e.g., the Collaboratory for Educational Analytics) or centers (e.g., the Intelligent Analytics and Policy Institute) focusing on a specific topic that is the responsibility of the school. In a heady moment, I might even say we represent how a university of the future writ large should identify and be organized.

The School of Data Science is actively working across multiple disciplines with the intent that while data science will increasingly infiltrate all schools and departments, we are the glue and the leader where the exchange of best practices takes place. We aim to be the place where disparate data, methods, protocols, and workflows from many disciplines can be shared and exchanged in a way that would just not happen in a traditional university environment. A place that has the expertise and the students to participate broadly across disciplines. A place that trains students to be renaissance scholars. Therefore, until universities change, schools of data science will be needed. I suspect that will be for a lot longer than ten years.

Author

Stephenson Dean