Estimating the carbon footprint of computer science conferences

Supervisors

  • Antoine Amarilli, INFRES
  • Email: antoine.amarilli@telecom-paris.fr
  • Office: N/A

Number of students per project instance:

  • Minimum: 3
  • Maximum: 5

Number of project instances:

1

Course unit codes covered and/or keywords:

DBLP, environment, carbon footprint, data analysis, geocoding, optimization

Project description:

The publication of research articles in computer science typically revolves around academic conferences, where authors present and publish their recent research. The largest conferences in artificial intelligence routinely gather tens of thousands of researchers each year. However, travelling to conferences has a significant carbon footprint, which is at odds with the imperative of mitigating climate change.

The goal of this project is to estimate the total carbon footprint of all conferences in computer science. These conferences are indexed by several resources, most notably DBLP, which lists over 10,000 conferences, but also WikiCFP and general bibliographic databases like Crossref. The location of the conference is often present in the metadata (though some geocoding will be necessary). To estimate the travel of conference attendees, a lower bound follows from the fact that each accepted paper must typically be presented by one of its authors. Author affiliations are sometimes given by DBLP and sometimes by ORCID; more advanced ways to determine affiliations can also be envisioned, e.g., by scraping articles on arXiv or from open-access publishers.
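To make the per-paper reasoning concrete, here is a minimal sketch of a rough travel estimate, assuming that exactly one author travels per paper and that this author is the closest one with a known location. The function names, the input layout (a list of author coordinates per paper, with `None` for authors that could not be geocoded), and the `kg_per_km` emission factor are all assumptions made for illustration; none of them comes from DBLP or from the related work. Distances are great-circle distances computed with the haversine formula, which ignores actual flight routes and transport modes.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def rough_estimate_kg(venue, papers, kg_per_km):
    """Sum, over papers, the round trip of the closest geocoded author.

    venue: (lat, lon) of the conference.
    papers: one list of author locations per paper; None = unknown location.
    kg_per_km: hypothetical emission factor, to be supplied by the caller.
    """
    total = 0.0
    for author_locations in papers:
        known = [loc for loc in author_locations if loc is not None]
        if not known:
            continue  # no geocoded author: the paper is skipped
        closest = min(haversine_km(venue, loc) for loc in known)
        total += 2 * closest * kg_per_km  # round trip
    return total
```

This sketch counts one trip per paper, so an author who is the closest for several papers is counted several times, and papers without any geocoded author contribute nothing. Both effects would need to be handled before the result can be read as a bound.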

Project objectives:

The concrete steps to get started on this project are:

  • Investigate feasibility: estimate what fraction of DBLP data includes location information for conferences and affiliation information for researchers, and whether it can be easily geocoded (converted to geographical coordinates) or not.
  • Investigate related work such as [1], [2], or [3].
  • Download or scrape data from DBLP to obtain a list of conferences, publications, and authors with their metadata information.
  • Use geocoding to locate the conferences and people for which location information is available.
  • Compute a first rough estimate of the order of magnitude of the total yearly emissions of computer science conferences, assuming that for every publication the geographically closest author travels to the conference.
  • Obtain more location information about conferences and authors by scraping other sources such as Crossref, HAL, or WikiCFP.
  • Derive more precise bounds on the travel for every conference: find the least costly set of authors whose travel covers all the papers published at the conference. Determine the computational complexity of this problem, and design heuristics for it.
  • Handle overlapping conferences and chronologically consecutive conferences to take into account the fact that authors can travel to multiple events at once.
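
The step on finding the least costly set of authors resembles a weighted set cover problem: each author "covers" the papers they co-authored, at the cost of their trip. As one possible heuristic (not necessarily the one the project should settle on), the sketch below applies the standard greedy rule for weighted set cover, repeatedly picking the author with the lowest travel cost per newly covered paper. The function name and the two input dictionaries are hypothetical and chosen for illustration only.

```python
def greedy_cover(author_cost, author_papers):
    """Greedy heuristic for choosing travelling authors.

    author_cost: dict mapping author -> travel cost (e.g. kg CO2e).
    author_papers: dict mapping author -> set of paper ids they co-authored.
    Returns the list of chosen authors, in the order they were picked.
    """
    # Every paper that has at least one listed author must be covered.
    uncovered = set().union(*author_papers.values())
    chosen = []
    while uncovered:
        # Pick the author with the lowest cost per newly covered paper.
        best = min(
            (a for a in author_papers if author_papers[a] & uncovered),
            key=lambda a: author_cost[a] / len(author_papers[a] & uncovered),
        )
        chosen.append(best)
        uncovered -= author_papers[best]
    return chosen


# Made-up example: three authors and three papers at a single conference.
costs = {"alice": 100, "bob": 30, "carol": 50}
papers = {"alice": {"p1", "p2", "p3"}, "bob": {"p1"}, "carol": {"p2", "p3"}}
selection = greedy_cover(costs, papers)
```

The greedy rule gives no guarantee of finding the cheapest set, so its output is only an upper bound on the optimal cover cost for a given conference.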

References:

[1] Laurent Feuilloley and Tijn de Vos. The environmental cost of our conferences: The CO2 emissions due to travel at PODC and DISC. ACM SIGACT News, 54(4), 2024. https://dl.acm.org/doi/abs/10.1145/3639528.3639537.
[2] Jérôme Mariette, Odile Blanchard, Olivier Berné, Olivier Aumont, Julian Carrey, Anne-Laure Ligozat, Emmanuel Lellouch, Philippe-Emmanuel Roche, Gaël Guennebaud, Joel Thanwerdas, et al. An open-source tool to assess the carbon footprint of research. Environmental Research: Infrastructure and Sustainability, 2(3), 2022. https://iopscience.iop.org/article/10.1088/2634-4505/ac84a4/meta.
[3] Diomidis Spinellis and Panos Louridas. The carbon footprint of conference papers. PLoS ONE, 8(6), 2013. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0066508.