
Right to Read, Right to Mine

10 Apr, 2015

Reading, assimilating and making sense of the mountain of literature available in any scientific field is a daily challenge for researchers. Earlier this year, the Wellcome Trust held a workshop to explore the available tools and technologies to help researchers mine this information, and their potential to accelerate biomedical research and innovation. Holly Baines, Policy Officer at the Wellcome Trust, shares some of the insights gained.

With the growing avalanche of new scientific research data and papers published every day, traditional methods for identifying, appraising and assimilating high-quality research evidence into a systematic review of a field are no longer fit for purpose. The sheer volume of information being generated means that manual approaches to analysis are becoming unviable, and reviews are out of date almost immediately because new knowledge accumulates so rapidly.

Unlike humans, computers have an almost unlimited capacity to process and analyse this information.  So-called data and text mining tools offer the potential to help researchers extract relevant nuggets of information and to identify links and associations that would otherwise have been difficult or impossible to determine.

The Wellcome Trust has a strong commitment to maximising the availability and use of research publications and data. In light of the increasing importance of computational tools to help derive value from this research, we convened the workshop to explore the available technologies and their potential uses. The workshop brought together a select group of leading experts in data and text mining and web data platforms, along with publishers, research funders, and pre-clinical and clinical scientists.

It was clear from the outset that keeping track of the literature and identifying papers of interest was an ever-increasing challenge across the biomedical sciences. During the first session of the meeting, we heard about the computational tools and platforms that could help researchers address these challenges and extract additional value, from developments in text mining and human-machine symbiosis to new approaches for crowd-sourcing and online scientific search engines.



The discussion then turned to the applications. One area of real potential for these tools is in enabling ‘living systematic reviews’, which incorporate newly published research on an ongoing basis. Such reviews could help academic researchers and pharmaceutical companies make better-informed decisions about the need for further experiments before moving to costly clinical trials. They could also help to increase the efficiency of the research enterprise and avoid unnecessary duplication and waste.

Participants highlighted a varied range of other potential application areas for these tools – ranging from enabling gene and protein function prediction, through to novel uses in clinical psychiatry and in enabling the move towards personalised drug treatments in the pharmaceutical industry.

Throughout the day, there was lively discussion and debate around the opportunities and key roadblocks to using these tools. Publication bias, differing academic standards and non-standardised datasets were some of the challenges mentioned. However, it was also pointed out that no single technology is without its difficulties, and that community platforms can help individuals compare tools and tailor them to their specific needs.

The overarching message was that there is a great appetite for automated data and text extraction tools to help scientists filter and make sense of relevant literature.  These hold great potential in biomedical research if the right tools are used appropriately for specific tasks, and there was a strong desire to begin piloting and evaluating their application.

Building on these discussions, the Wellcome Trust is keen to promote the wider uptake of text mining tools in the research community, and to help overcome the existing barriers in copyright law which constrain their use.  Over the last three years, we worked with a broad coalition of partners to advocate for an exception in UK law to enable text and data mining for non-commercial research purposes – which was eventually enacted last year.  We are now working with partner organisations in the research and library sectors to support the call for the same exception to be made throughout the European Union.

We also hosted a ‘hack’ event run by ContentMine, to train researchers in using these tools and then let them loose on the literature to see what they could do with them.  In just a few hours, our ‘hackers’ developed new text mining strategies to query the literature in areas as diverse as cancer genetics and bio-engineering.  The event closed with a policy session, at which representatives of major funders and library organisations met with the hackers to discuss next steps for advocating European copyright reform and promoting the wider use of these tools in the research community.  A fuller summary of the hack event is available on Peter Murray-Rust’s blog.

To find out more about the potential of data and text mining, please see these short video clips from experts who attended our workshop, and check out our new policy spotlight page on data and text mining.

Image credit: Data path by r2hox via Flickr; CC-BY-SA
