
Publications

Peer-reviewed scientific publications 

1. Rizzetto, E., Peroni, S. (2024). Mapping bibliographic metadata collections: the case of OpenCitations Meta and OpenAlex. In: CEUR Workshop Proceedings, vol 3643, 20th Conference on Information and Research Science Connecting to Digital and Library Science (IRCDL 2024), Bressanone, Italy. https://ceur-ws.org/Vol-3643/paper15.pdf. Also available in Open Access at https://arxiv.org/abs/2312.16523

Abstract
This study describes the methodology and analyses the results of the process of mapping entities between two large open bibliographic metadata collections, OpenCitations Meta and OpenAlex. The primary objective of this mapping is to integrate OpenAlex internal identifiers into the existing metadata of bibliographic resources in OpenCitations Meta, thereby interlinking and aligning these collections. Furthermore, analysing the output of the mapping provides a unique perspective on the consistency and accuracy of bibliographic metadata, offering a valuable tool for identifying potential inconsistencies in the processed data.


2. Massari, A., Mariani, F., Heibi, I., Peroni, S., Shotton, D. (2024). OpenCitations Meta. Quantitative Science Studies 1-26. https://doi.org/10.1162/qss_a_00292. Also available in Open Access at https://arxiv.org/abs/2306.16191.  

Abstract
OpenCitations Meta is a new database for open bibliographic metadata of scholarly publications involved in the citations indexed by the OpenCitations infrastructure, adhering to Open Science principles and published under a CC0 license to promote maximum reuse. It presently incorporates bibliographic metadata for publications recorded in Crossref, DataCite, and PubMed, making it the largest bibliographic metadata source using Semantic Web technologies. It assigns new globally persistent identifiers (PIDs), known as OpenCitations Meta Identifiers (OMIDs), to all bibliographic resources, enabling it both to disambiguate publications described using different external PIDs (e.g., a DOI in Crossref and a PMID in PubMed) and to handle citations involving publications lacking external PIDs. By hosting bibliographic metadata internally, OpenCitations Meta eliminates its former reliance on API calls to external resources and thus enhances performance in response to user queries. Its automated data curation, following the OpenCitations Data Model, includes deduplication, error correction, metadata enrichment, and full provenance tracking, ensuring transparency and traceability of data and bolstering confidence in data integrity, a feature unparalleled in other bibliographic databases. Its commitment to Semantic Web standards ensures superior interoperability compared to other machine-readable formats, with availability via a SPARQL endpoint, REST APIs, and data dumps.

3. Koloveas, P., Chatzopoulos, S., Tryfonopoulos, C., Vergoulis, T. (2023). BIP! NDR (NoDoiRefs): A Dataset of Citations from Papers Without DOIs in Computer Science Conferences and Workshops. In: Alonso, O., Cousijn, H., Silvello, G., Marrero, M., Teixeira Lopes, C., Marchesin, S. (eds) Linking Theory and Practice of Digital Libraries. TPDL 2023. Lecture Notes in Computer Science, vol 14241. Springer, Cham. https://doi.org/10.1007/978-3-031-43849-3_9. Also available in Open Access at https://arxiv.org/abs/2307.12794

Abstract
In the field of Computer Science, conference and workshop papers serve as important contributions, carrying substantial weight in research assessment processes compared to other disciplines. However, a considerable number of these papers are not assigned a Digital Object Identifier (DOI), and hence their citations are not reported in widely used citation datasets like OpenCitations and Crossref, limiting citation analysis. While the Microsoft Academic Graph (MAG) previously addressed this issue by providing substantial coverage, its discontinuation has created a void in available data. BIP! NDR aims to alleviate this issue and enhance the research assessment processes within the field of Computer Science. To accomplish this, it leverages a workflow that identifies and retrieves Open Science papers lacking DOIs from the DBLP Corpus and, by performing text analysis, extracts citation information directly from their full text. The current version of the dataset contains more than 510K citations made by approximately 60K open access Computer Science conference or workshop papers that, according to DBLP, do not have a DOI.

4. Chatzopoulos, S., Vichos, K., Kanellos, I., Vergoulis, T. (2023). Piloting Topic-Aware Research Impact Assessment Features in BIP! Services. In: Pesquita, C., et al. The Semantic Web: ESWC 2023 Satellite Events. ESWC 2023. Lecture Notes in Computer Science, vol 13998. Springer, Cham. https://doi.org/10.1007/978-3-031-43458-7_15. Also available in Open Access at https://arxiv.org/abs/2305.06047. 

Abstract
Various research activities rely on citation-based impact indicators. However, these indicators are usually globally computed, hindering their proper interpretation in applications like research assessment and knowledge discovery. In this work, we advocate for the use of topic-aware categorical impact indicators to alleviate the aforementioned problem. In addition, we extend BIP! Services to support those indicators and showcase their benefits in real-world research activities.

5. Santos, E.A.d., Peroni, S. and Mucheroni, M.L. (2023). An analysis of citing and referencing habits across all scholarly disciplines: approaches and trends in bibliographic referencing and citing practices. Journal of Documentation, Vol. 79 No. 7, pp. 196-224. https://doi.org/10.1108/JD-10-2022-0234. Also available in Open Access at https://doi.org/10.48550/arXiv.2202.08469

Abstract
Purpose: In this study, the authors aim to identify the current possible causes of citing and referencing errors in the scholarly literature and to assess whether anything has changed since the snapshot provided by Sweetland in his 1989 paper.
Design/Methodology/Approach: The authors analysed reference elements, i.e. bibliographic references, mentions, quotations and respective in-text reference pointers, from 729 articles published in 147 journals across 27 subject areas.
Findings: The outcomes of the analysis pointed out that bibliographic errors have been perpetuated for decades and that their possible causes have increased, despite the encouraged use of technological facilities, i.e. the reference managers.
Originality/value: As far as the authors know, this study is the most recent comprehensive analysis of errors in referencing and citing practices in the literature since Sweetland (1989).


Media articles and blog posts

Do you want to know more?
We would be happy to hear from you. Your needs and ideas are invaluable for building a collaborative infrastructure.