The Web of Science™ has always served as a chronicle of research, elucidating the threads and interconnections of scientific and scholarly inquiry over time. Recently, a group of scientists has begun harnessing Web of Science data for a slightly different end: improving the conduct of research itself. Via automated analysis of countless research reports, this work is pointing the way toward greater efficiency and more precisely targeted effort, raising the promise of faster progress toward curative therapies and other advances.

“There are a number of things you can use from the scientific literature of the past to help direct the future,” says James A. Evans, director of Knowledge Lab, part of the Computation Center at the University of Chicago. “That’s what we’re trying to do.”

As part of this effort, Web of Science files – hosted in the Knowledge Lab’s secure data enclave – undergo advanced analytic scrutiny in order to establish precise connections based on the subject matter and findings of millions of individual papers – far more material, of course, than human readers could usefully digest. Ultimately, the connections form networks in which links between the various nodes can be discerned. This high-level view of the dynamics of research affords insights into a range of questions, such as how different researchers have proposed and investigated various scientific claims, and which of their approaches have accelerated the accumulation of new knowledge. These insights, in turn, can inform current research.

Evans, along with his Knowledge Lab colleagues and their fellow members of the large, multi-institution collective known as the Metaknowledge Research Network, is advancing an emergent discipline known as the “science of science.” This field seeks to understand on a deeper level how research works, in terms of such standard practices as identifying a research topic, framing a hypothesis, and carrying out experimental investigation. For practitioners like Evans, the challenge is to turn their analysis into practical improvements that will benefit not only scientists but also funding agencies, administrators, and everyone else with a stake in the vitality and progress of research.

Molecular Markers

In research published in 2015, for example, Evans and colleagues used natural language processing and other computerized analysis to scan the contents of millions of biomedical papers, searching for the names of individual molecules under investigation. Groupings of molecules constituted thousands of nodes in a self-generating network, with the links between these nodes demonstrating which papers tended to cover the same material and which branched off in new directions.
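The network idea described above can be illustrated with a small sketch. This is not the researchers' actual pipeline: the paper IDs, molecule names, and the helper functions build_cooccurrence and new_links are hypothetical, and the natural language processing step is assumed to have already reduced each paper to the set of molecule names it mentions. Two molecules are linked when they appear in the same paper, and a later paper that joins a previously unlinked pair counts as a new link.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each paper reduced to the molecule names found in it.
papers = {
    "paper_a": {"p53", "MDM2"},
    "paper_b": {"p53", "MDM2", "ATM"},
    "paper_c": {"ATM", "BRCA1"},
}


def build_cooccurrence(papers):
    """Count, for every pair of molecules, how many papers mention both."""
    edges = Counter()
    for molecules in papers.values():
        # Sort so that each pair has one canonical (a, b) ordering.
        for pair in combinations(sorted(molecules), 2):
            edges[pair] += 1
    return edges


def new_links(edges_before, later_papers):
    """Return molecule pairs in later papers that were not linked before."""
    later_edges = build_cooccurrence(later_papers)
    return {pair for pair in later_edges if pair not in edges_before}


edges = build_cooccurrence(papers)

# A later (hypothetical) paper connecting two molecules never studied together.
fresh = new_links(edges, {"paper_d": {"BRCA1", "p53"}})
```

In this toy setup, a heavily weighted edge marks a well-trodden combination, while an entry in fresh marks a paper that branched off in a new direction.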

Based on their analysis, Evans and colleagues concluded that most scientists, due to career-related or institutional pressures, tend to be conservative in their approach to research. Fresh findings and progress, however – as denoted by new links between the molecular nodes in the network over time – tend to result from a less risk-averse approach.

“It happens that, as a field matures, the most efficient experiment for science is not the most efficient for scientific careers,” says Evans. “It’s more efficient for science to pick risky experiments, but scientists are evaluated on their personal productivity, so there’s no incentive structure for scientists to more systematically pick risky experiments, even if it would be the best thing for the field.”

Assessing scientific claims – for example, that a given drug regulates a given gene product in a certain way – can point toward better-designed experiments, says Evans. A claim made by 10 researchers, all in the same lab and all using basically the same experimental setup, is less compelling than a claim generated independently by five socially independent groups using different research protocols.

“In the second case,” says Evans, “you might have more confidence that the claim is valid – there’s more independent experimental support. And that’s an observation that resources like the Web of Science make possible. You can basically trace out the structure of the scientific communities that are generating a very precise, particular piece of knowledge, and use that to either assess its relative likelihood given a new experiment, or actually help run new experiments. This capacity can also suggest areas that would be promising, but the community of scientists hasn’t yet studied them.”
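The contrast Evans draws can be made concrete with a rough sketch. It assumes, purely for illustration, that each report supporting a claim is tagged with a lab and an experimental protocol; the Report class, the independent_support function, and all the labels are hypothetical, and counting distinct lab-and-protocol combinations is only a crude stand-in for whatever measure of independence the researchers actually use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    """One researcher's report supporting a claim (hypothetical schema)."""
    researcher: str
    lab: str
    protocol: str


def independent_support(reports):
    """Count distinct (lab, protocol) combinations backing a claim.

    Reports sharing both a lab and a protocol collapse into a single
    unit of support, so repetition within one group adds nothing.
    """
    return len({(r.lab, r.protocol) for r in reports})


# Ten researchers, one lab, one experimental setup.
same_lab = [Report(f"r{i}", "lab_1", "assay_x") for i in range(10)]

# Five separate groups, each using its own protocol.
five_groups = [Report(f"g{i}", f"lab_{i}", f"assay_{i}") for i in range(5)]

same_lab_score = independent_support(same_lab)
five_groups_score = independent_support(five_groups)
```

Under this simple count, the ten same-lab reports amount to one unit of independent support and the five separate groups to five, which matches the intuition in the passage that the second claim deserves more confidence.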

Across Boundaries

Another benefit of the work by Evans and colleagues is the surmounting of boundaries between disciplines – a task made easier with citation data. For example, researchers in human and mammalian biology are much less likely to review work in, say, plant biology, than are plant biologists to refer to research in humans and mammals.

“What happens is that questions from human biology enter plant biology, but discoveries and advances in plant biology are far less likely to enter human biology,” says Evans. “So you have cases where researchers in human and mammalian biology will discover something that’s been known in plant biology for decades, and then they’ll get a Nobel Prize.

“By building these networks with citation data and other tools, you can identify questions or possibilities that have been discovered in certain areas but, because of biases in the attention of various specialty fields, are just never given air time in places where they potentially could have had a lot of value. Researchers could have been asking those questions, in some cases, 50 years earlier. We can now discern and overcome a lot of those boundaries with resources like the Web of Science, through citation patterns, co-authorship networks, and even the patterns of shared distributions of words.”

Evans and his Knowledge Lab colleagues, along with the Metaknowledge Research Network, continue their work to advance the science of science. One large project is a DARPA-funded study to determine the most efficacious cocktail of anti-cancer agents. Other work is aimed at building effective research teams and choosing the optimal experiment.

Evans’s hope is that, ultimately, this work will continue to prove itself and will earn more adherents. “You win over science with science,” he says. “If huge stakeholders such as the NIH and the Department of Defense – these concentrated decision-makers who control enormous quantities of the resources behind entire fields – if they want to organize things differently, it can happen.”

As Evans notes, the proper tools have made all the difference. “Looking at science as a system allows you to think about portfolios of projects and portfolios of hypotheses in a way that was just not accessible before these kinds of resources, and Web of Science is the premier arrow in that quiver.”