Juan Manuel Torres Moreno, senior scientist, Computer Science Laboratory LIA, Université d’Avignon, France

The science behind ScioWire — Interview with Juan-Manuel Torres-Moreno

An interview with Juan-Manuel Torres-Moreno, senior scientist at the Computer Science Laboratory LIA, at Université d’Avignon, France.

State-of-the-art summarisation technologies are the key to SciencePOD’s latest innovation: ScioWire, a newsfeed bringing together the latest open research, developed in collaboration with Université d’Avignon. To better understand the role automated summarisation plays in supporting scientists and knowledge economy professionals, let’s explore the summarisation processes that underpin the ScioWire newsfeed.

In a remote video interview, SciencePOD asked Juan-Manuel Torres-Moreno, Senior Scientist and Lecturer HDR with the Laboratoire Informatique d’Avignon (LIA) at the Université d’Avignon, about his collaboration with SciencePOD. We also asked him why summarisation matters, and how it can offer productivity solutions for scientists and knowledge professionals.

Bridging the science communication gap

Torres-Moreno explains that scholarly communication normally adheres to a common format, observing that “scientists tend to write in a very particular manner” and that “the scientific domain itself is very particular, with its measurements, tables, figures and mathematical equations.”

“Being a good scientist is not [always] synonymous with being a good communicator,” Torres-Moreno points out, “and therefore our system is attempting to bridge the gap, among other things.” 

It’s certainly true that the content of a scientific paper is not always directly accessible to those outside its field of research, which is where contextualised summaries come in. 

ScioWire includes this contextual information by default with every summary in the newsfeed. This boosts accessibility for non-experts by giving an overview of the research landscape and clarifying technical terms. Each summary comes with auto-extracted keywords, a guide to technical acronyms and lay definitions of specialist terms, all intended to aid interdisciplinary research.

SciencePOD wanted to collaborate with Torres-Moreno to draw on his expertise in natural language processing; specifically, in automatic summarisation. He describes the work he did as part of the collaboration with SciencePOD as “a real challenge”, especially when “transposing purely theoretical algorithms into practical applications.” 

Unexpected challenges arose, including issues with the PDF format common to many papers. PDFs are not ideal for auto-summarisation: their formatting had to be stripped before summarisation work could begin.
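To give a feel for what stripping PDF formatting can involve, here is a minimal Python sketch. It is not the ScioWire pipeline, which the interview does not describe in detail. It assumes the text has already been pulled out of the PDF by some other tool, and the function name clean_extracted_text is hypothetical. It handles three common leftovers: lines holding only a page number, words hyphenated across line breaks, and hard line wraps inside paragraphs.

```python
import re


def clean_extracted_text(raw: str) -> str:
    """Tidy text already extracted from a PDF (hypothetical helper, illustration only)."""
    # Drop lines that contain nothing but a page number.
    lines = [ln for ln in raw.splitlines() if not re.fullmatch(r"\s*\d+\s*", ln)]
    text = "\n".join(lines)

    # Rejoin words hyphenated across a line break: "summari-\nsation" -> "summarisation".
    # Limitation: a genuine hyphen that happens to sit at a line end is also removed.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)

    # Blank lines mark paragraph breaks; single line breaks inside a paragraph are unwrapped.
    paragraphs = re.split(r"\n\s*\n", text)
    paragraphs = [" ".join(p.split()) for p in paragraphs]
    return "\n\n".join(p for p in paragraphs if p)


if __name__ == "__main__":
    sample = "Automatic summari-\nsation is hard.\n\n12\n\nNew para-\ngraph  here."
    print(clean_extracted_text(sample))
```

Real papers add further complications, such as multi-column layouts, tables, figures and equations, which a few regular expressions like these would not handle.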

Overcoming these challenges and numerous others was well worth the effort, though, as Torres-Moreno developed the tools to summarise research studies on a large scale. Indeed, the collaboration between SciencePOD and his laboratory, LIA, led to “real progress, at the technical level as well as at the scientific level for the automated summarisation field, in particular.”

Extractive vs abstractive

Part of the project’s success hinged on the choice of summarisation method. 

“The selected method is extractive summarisation,” explains Torres-Moreno. He combined this approach with additional post-summarisation processing to deliver a clear, coherent summary. This processing included developing a method for analysing, assembling and shortening sentences from the original study, retaining the meaning of the original text. 

“Although there are other approaches to produce summaries, namely abstractive summarisation, these systems tend to perhaps modify original sentences a little too much,” he adds. “In the scientific domains, we cannot … create or produce [summarised] sentences that are too creative.”

Unique features

So what makes ScioWire’s summarisation unique? 

“The originality of our system lies in the adequate management of extractive summarisation tools,” says Torres-Moreno, “coupled with a method of generating pertinent keywords, which are themselves guiding the summarisation process.” 
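The idea of keywords guiding an extractive summary can be illustrated with a small Python sketch. This is a toy example under stated assumptions, not the ScioWire algorithm: keywords are simply the most frequent non-stopwords, sentences are scored by the share of their words that are keywords, and the top-scoring sentences are returned verbatim in their original order. All names (STOPWORDS, extract_keywords, summarise) are hypothetical.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; a real system would use a much fuller one.
STOPWORDS = {"the", "and", "was", "for", "with", "into", "are", "that", "this", "during"}


def tokenize(text: str) -> list[str]:
    """Lower-case the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def split_sentences(text: str) -> list[str]:
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_keywords(text: str, k: int = 5) -> list[str]:
    """Assumed keyword method: the k most frequent non-stopwords longer than two letters."""
    counts = Counter(w for w in tokenize(text) if w not in STOPWORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(k)]


def summarise(text: str, n_sentences: int = 2, k: int = 5) -> list[str]:
    """Select the n sentences with the highest keyword density, unchanged and in order."""
    sentences = split_sentences(text)
    keywords = set(extract_keywords(text, k))
    scored = []
    for index, sentence in enumerate(sentences):
        words = tokenize(sentence)
        hits = sum(1 for w in words if w in keywords)
        density = hits / len(words) if words else 0.0
        scored.append((density, index))
    # Highest density first; ties go to the earlier sentence.
    best = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:n_sentences]
    return [sentences[i] for i in sorted(i for _, i in best)]


if __name__ == "__main__":
    text = (
        "Solar cells convert light into electricity. "
        "The weather was pleasant during the conference. "
        "New perovskite solar cells reach higher efficiency. "
        "Solar cells remain expensive to manufacture."
    )
    for sentence in summarise(text, n_sentences=2, k=2):
        print(sentence)
```

Because every output sentence is copied from the input, a sketch like this cannot introduce wording that the authors did not write, which is the property Torres-Moreno highlights when explaining the choice of extractive summarisation.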

ScioWire summaries “elucidate keywords and technical, scientific and novel terms, using an appropriate ontology.” The algorithm gathers all relevant terms from the original study to create this resource.

ScioWire beta launched at the Frankfurt Book Fair 2022. It has been thrilling to see how scientists, clinicians and other knowledge economy professionals benefit from this novel productivity tool.

Discover the ScioWire research newsfeed: summarised scientific knowledge ready to digest.