Search Queries for “Mapping Research Output to the Sustainable Development Goals (SDGs)”

Select one of the SDG icons to go to the latest Search Query.

SDG01 SDG02 SDG03 SDG04 SDG05 SDG06
SDG07 SDG08 SDG09 SDG10 SDG11 SDG12
SDG13 SDG14 SDG15 SDG16 SDG17  

This package contains machine-readable (XML) search queries for the Scopus publications database, designed to find domain-specific research output related to the 17 Sustainable Development Goals (SDGs).

Click here for HUMAN READABLE version | Click here for PROJECT WEBSITE

Please note: we are currently working on a multi-lingual SDG classifier; training an AI using the data from these queries.

A global effort in "Mapping research output to the Sustainable Development Goals". An initiative by the Aurora Universities Network.

Illustration of XML and translation in HTML

Introduction

The Sustainable Development Goals are the 17 global challenges set by the United Nations. Within each goal, specific targets and indicators are defined to monitor progress toward reaching the goals by 2030. In an effort to capture how research contributes to moving the needle on those challenges, we earlier built an initial classification model that makes it possible to quickly identify which research output relates to which SDG. (This Aurora SDG dashboard is the initial outcome as proof of practice.)

History & Intended Usage

The initiative started in the Aurora Universities Network in 2017, in the working group “Societal Impact and Relevance of Research”, to investigate and make visible (1) what research is being done that is relevant to topics or challenges that live in society (for the proof of practice this has been scoped down to the SDGs), and (2) what the effect or impact is of applying those research outcomes to those societal challenges (this has also been scoped down, to research output being cited in policy documents from national and local governments and NGOs).

Method

The classification model consists of 17 search queries on the Scopus database, composed of 169 sub-queries, one for each target. The search queries are constructed from keyword combinations with boolean and proximity operators, in a syntax specific to the Scopus Query Language. We used Scopus because it covers more research areas that are relevant to the SDGs, and because it made filtering on the Aurora institutions much easier.
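As a rough illustration of how target-level sub-queries might be combined into one goal-level query, here is a minimal Python sketch. It assumes the sub-queries are simply joined with OR, which this page does not state explicitly. The function name `combine_subqueries` and the example keywords are hypothetical and are not taken from the package; `TITLE-ABS-KEY` and the `W/n` proximity operator are regular Scopus search syntax.

```python
def combine_subqueries(subqueries):
    """Join target-level sub-queries into one goal-level query with OR.

    Assumption: a goal query is the OR-combination of its target sub-queries.
    """
    cleaned = [q.strip() for q in subqueries if q.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty sub-query is required")
    # Wrap each sub-query in parentheses so its internal operators stay grouped.
    return " OR ".join(f"({q})" for q in cleaned)


# Hypothetical sub-queries, for illustration only (not the actual Aurora queries).
target_a = 'TITLE-ABS-KEY("extreme poverty")'
target_b = 'TITLE-ABS-KEY(("social protection") W/3 (system OR scheme))'

goal_query = combine_subqueries([target_a, target_b])
print(goal_query)
```

Wrapping every sub-query in its own parentheses keeps one target's boolean and proximity operators from interacting with another's, which is also what makes it practical to maintain each target separately.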

Accuracy (Precision and Recall)

To validate the accuracy of the classification model, we tested these queries in a survey among a panel of researchers. The survey was open from October 2019 until January 2020 and captured data from 244 respondents in Europe and North America.

The survey data can be found here: Survey data of “Mapping Research output to the SDGs” by Aurora Universities Network (AUR) doi:10.5281/zenodo.3798385

We used that data to gather input for improvement and to measure the accuracy.

Regarding accuracy, we found the version 5 queries to have an average precision of 70% (70% of the publications found are truly related to that SDG, judged against papers selected by the researchers) and an average recall of 14% (14% of the publications in a ‘true’ corpus, that is, papers suggested beforehand by the researchers, are found by the queries).
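To make the two measures concrete, the following minimal Python sketch computes precision and recall from sets of publication identifiers. The identifiers and variable names are made up for illustration and the numbers are not survey results. As described above, precision is measured against the papers researchers accepted, while recall is measured against the papers researchers suggested beforehand.

```python
def precision(retrieved, relevant):
    """Share of retrieved publications that are relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    if not retrieved:
        return 0.0
    return len(retrieved & relevant) / len(retrieved)


def recall(retrieved, relevant):
    """Share of relevant publications that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    if not relevant:
        return 0.0
    return len(retrieved & relevant) / len(relevant)


# Made-up identifiers, for illustration only.
query_results = {"p1", "p2", "p3", "p4"}            # found by an SDG query
accepted_by_researchers = {"p1", "p2", "p3"}        # results judged truly related
suggested_by_researchers = {"p1", "p2", "p3", "p5", "p6", "p7"}  # 'true' corpus

print(precision(query_results, accepted_by_researchers))  # 0.75
print(recall(query_results, suggested_by_researchers))    # 0.5
```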

The Evaluation report can be found here: Schmidt, Felix, & Vanderfeesten, Maurice. (2021). Evaluation on accuracy of mapping science to the United Nations’ Sustainable Development Goals (SDGs) of the Aurora SDG queries (v1.0.2). Zenodo. https://doi.org/10.5281/zenodo.4964606.

precision and recall graph

precision and recall table

Versions

Different versions of the search queries have been made over the past years to improve the precision (soundness) and recall (completeness) of the results. The queries were made in a team effort by several bibliometric experts from the Aurora Universities. Each expert took two or three SDGs, and they then reviewed each other’s work.

| Version | Date | Changes | Download |
| --- | --- | --- | --- |
| 5.0.3 | July 2021 | Typos and inconsistencies fixed thanks to community feedback. | GitHub |
| 5.0 | June 2020 | ‘Improved’ version. To better reflect how academic research output relates to the SDGs, we added more keyword combinations to the queries, using academic terminology, to increase the recall and yield more research papers related to the SDGs. First, we mainly used the input from Vanderfeesten, Maurice, Spielberg, Eike, & Gunes, Yassin. (2020) Survey data of “Mapping Research output to the SDGs” by Aurora Universities Network (AUR). We ran several text analyses on it: frequent term combinations in titles and abstracts from suggested papers and from selected (accepted) papers, suggested journals, etc. Second, we drew inspiration from the Elsevier SDG queries: Jayabalasingham, Bamini; Boverhof, Roy; Agnew, Kevin; Klein, Lisette (2019), “Identifying research supporting the United Nations Sustainable Development Goals”, Mendeley Data, v1. Third, we drew inspiration from a controlled vocabulary containing closely related terms: Duran-Silva, Nicolau, Fuster, Enric, Massucci, Francesco Alessandro, & Quinquillà, Arnau. (2019). A controlled vocabulary defining the semantic perimeter of Sustainable Development Goals (Version 1.2) [Data set]. Zenodo. | GitHub / Zenodo |
| 4.0 | August 2019 | Uniform ‘split’ version. Over the years, the UN changed and added targets and indicators. To keep track of whether we missed a target, we split the queries to match the targets within the goals. This gives much more control when maintaining the queries. Also in this version, the use of brackets, quotation marks, etc. has been made uniform, so the queries work with APIs and not only with GUIs. This version was evaluated using a survey, to get baseline measurements for the precision and recall. Published here: Survey data of “Mapping Research output to the SDGs” by Aurora Universities Network (AUR) doi:10.5281/zenodo.3798385 | GitHub / Zenodo |
| 3.0 | May 2019 | ‘Echo chamber’ version. We noticed that, when strictly using the terms that UN policy makers use in the targets and indicators, much of the research that did not use those specific terms was left out of the result set (e.g. “mortality” vs “deaths”). To increase the recall without reducing the precision of the papers in the results, we added keywords that were obvious synonyms and antonyms of the existing ‘strict’ keywords. This was done based on the keywords that appeared in papers in the result set of version 2. This creates an ‘echo chamber’ that results in more of the same papers. | GitHub / Zenodo |
| 2.0 | March 2018 | Reviewed ‘strict’ version. Same as version 1, but now reviewed by peers. | GitHub / Zenodo |
| 1.0 | January 2018 | Initial ‘strict’ version. In this version only the terms were used that appear in the SDG policy text of the targets and indicators defined by the UN. At this point we were aware of the SDSN Compiled list of keywords, and used it as inspiration. The rule of thumb was to use keyword-combination searches as much as possible rather than single-keyword searches, to be more precise rather than yield large numbers of false positive papers. We also did not use the inverse or ‘NOT’ operator, to prevent removing true positives from the result set. This version has not been reviewed by peers. | GitHub / Zenodo |

Contribute and improve the SDG Search Queries

We welcome you to join the GitHub community and to fork, branch, improve, and open a pull request to add your improvements to the next version of the SDG queries. https://github.com/Aurora-Network-Global/sdg-queries

Queries in XML - Human editable, Machine readable, Version controllable

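We have made all the queries available in XML format from version 4 onward, and have also applied the format retroactively, back to version 1. This makes the queries human editable, machine readable, and version controllable.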

Translation to HTML (XSL) and a descriptive schema (XSD) are available in this package.

License and Attribution, Acknowledgements and how to Cite

Acknowledgements

We would like to thank the presidents of the Aurora Universities for their support and executive representation in this project. We would also like to thank all the researchers involved for sharing their expertise. Many people from within the Aurora Universities Network were involved in making these query improvements possible.

If you want to pay tribute to this hard work, please reuse these SDG queries to create or improve your services and share your outcomes.

Do so by respecting the license and attributing the contributors.

Please cite these search query sets as follows:

Search Queries for “Mapping Research Output to the Sustainable Development Goals (SDGs)” v5.0 by Aurora Universities Network (AUR) doi:10.5281/zenodo.3817445

Search Queries for “Mapping Research Output to the Sustainable Development Goals (SDGs)” v4.0 by Aurora Universities Network (AUR) doi:10.5281/zenodo.3817443

Search Queries for “Mapping Research Output to the Sustainable Development Goals (SDGs)” v3.0 by Aurora Universities Network (AUR) doi:10.5281/zenodo.3817437

Search Queries for “Mapping Research Output to the Sustainable Development Goals (SDGs)” v2.0 by Aurora Universities Network (AUR) doi:10.5281/zenodo.3817433

Search Queries for “Mapping Research Output to the Sustainable Development Goals (SDGs)” v1.0 by Aurora Universities Network (AUR) doi:10.5281/zenodo.3817352

License for reuse:

Creative Commons Attribution 4.0 International License

License Attribution when reusing this data:

Note: different people contributed to different versions.

Contributors, full list

Below you’ll see the full list of contributors to this project. Without them this would not have been possible.

| Full name | Organisation Name | Org Abr | ORCiD | Project role | SDG query v1 | SDG query v2 | SDG query v3 | SDG query v4 | SDG query v5 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Vanderfeesten, Maurice | Vrije Universiteit Amsterdam | VUA | 0000-0002-5119-3514 | Project leader | x | x | x | x | x |
| Spielberg, Eike | University of Duisburg-Essen | UDE | 0000-0002-3333-5814 | Workpackage leader | x | x | x | x | x |
| Otten, René | Vrije Universiteit Amsterdam | VUA | 0000-0002-6485-8810 | Project member | x | x | x | x | x |
| Strong, Nykohla | University of Aberdeen | UAB | 0000-0002-6137-591X | Project member | x | x | x | x | x |
| Schmidt, Felix | University of Duisburg-Essen | UDE |  | Project member | x | x | x | x | x |
| Zarioh, Baldvin | University of Iceland | UIC | 0000-0001-9317-2597 | Project member |  | x | x | x | x |
| Vercueil, Didier | Université Grenoble-Alpes | UGA |  | Project member | x | x | x | x | x |
| Guns, Raf | University of Antwerp | UAN | 0000-0003-3129-0330 | Project member |  | x | x | x |  |
| Arienzo, Alessandro | Università degli Studi di Napoli Federico II | UNA | 0000-0002-2867-5363 | Project member |  |  |  |  | x |
| Delle Donne, Roberto | Università degli Studi di Napoli Federico II | UNA | 0000-0001-8331-9436 | Project member |  |  |  |  | x |
| Salvadó Estivill, Ignasi | Universitat Rovira i Virgili | URV |  | Project member |  |  |  |  | x |
| González Ugarte, José Luis | Universitat Rovira i Virgili | URV |  | Project member |  |  |  |  | x |
| Mikki, Susanne | University of Bergen | UBE | 0000-0001-7078-7126 | Project member |  | x |  |  |  |
| Hasse, Linda | University of Duisburg-Essen | UDE |  | Project member |  |  |  | x | x |
| Green, Adam | University of East Anglia | UEA | 0000-0002-7397-6579 | Project member | x |  |  |  |  |
| Sesma, Ane | University of East Anglia | UEA | 0000-0003-3982-8932 | Project member |  |  | x |  | x |
| Farrar, Jaqui | University of East Anglia | UEA |  | Project member | x | x |  |  |  |
| Kullman, Lars | University of Gothenburg | UGO | 0000-0002-8871-7887 | Project member | x | x | x | x |  |
| Gaigg, Friedrich | University of Innsbruck | UIN |  | Project member |  |  |  |  | x |
| Grijp, Nicolien van der | Vrije Universiteit Amsterdam | VUA | 0000-0002-5119-3514 | Project member |  |  |  |  | x |
| Gunes, Yasin | Vrije Universiteit Amsterdam | VUA |  | Project member |  |  |  |  | x |
| Besselaar, Peter van den | Vrije Universiteit Amsterdam | VUA | 0000-0002-8304-8565 | Supervisor |  |  | x | x | x |
| Both, Joeri | Vrije Universiteit Amsterdam | VUA |  | Supervisor |  | x | x | x | x |
| Kouwenaar, Kees | AURORA University Network | AUR |  | Sponsor | x | x | x | x | x |
| Beukering, Pieter van | Vrije Universiteit Amsterdam | VUA | 0000-0001-7146-4409 | Sponsor | x | x | x | x |  |

For an updated version see our SDG knowledge base

Related works
http://ap-unsdsn.org/webinar-mapping-university-contributions-to-the-sdgs/ A compiled list of keywords used by the Sustainable Development Solutions Network in the Pacific to map university contributions to the Sustainable Development Goals (SDGs)
Vanderfeesten, Maurice, Spielberg, Eike, & Gunes, Yassin. (2020) Survey data of “Mapping Research output to the SDGs” by Aurora Universities Network (AUR). This contains publications hand picked by researchers that relate to an SDG. We ran several text analyses: Frequent term combination in title and abstracts from Suggested papers, and in selected (accepted) papers, suggested journals, etc.
Jayabalasingham, Bamini; Boverhof, Roy; Agnew, Kevin; Klein, Lisette (2019), “Identifying research supporting the United Nations Sustainable Development Goals”, Mendeley Data, v1. These Elsevier SDG queries are used in the Times Higher Education Impact Ranking. This has more recall but is less precise.
Duran-Silva, Nicolau, Fuster, Enric, Massucci, Francesco Alessandro, & Quinquillà, Arnau. (2019). A controlled vocabulary defining the semantic perimeter of Sustainable Development Goals (Version 1.2). Zenodo. A controlled vocabulary containing closely related terms, based on word vectors.