Abstract
Subject classification is essential for navigating scientific literature, yet the influential All Science Journal Classification (ASJC) has limited practical applicability. Its limitations stem from reliance on an incomplete source list restricted to Scopus content, and from journal-level classifications that often misrepresent individual documents. The most significant recent development in ASJC-based classification is OpenAlex, but it narrows the framework by reducing the number of categories and enforcing single-label assignments, both of which diminish classification accuracy. In response, this study introduces the first open, multi-label implementation of the ASJC taxonomy that more accurately classifies individual documents, including those published in general science or interdisciplinary journals. We develop a fine-tuned SciBERT model for multi-label classification across 307 ASJC subjects, trained on a large-scale Crossref dataset using title, abstract, and source title metadata. On a Crossref test set with full metadata, the model achieves a weighted F1-score of 0.892 on the 307 subjects and 0.934 on their 26 parent subjects. It maintains respectable performance (0.532 and 0.694, respectively) even without the source title information that ASJC classification relies upon. Our fine-tuning strategy includes selective metadata omission to mitigate overfitting and data augmentation for underrepresented categories. In addition, we introduce a tailored label-averaging method for assessing and comparing the disciplinary orientation of individual documents and larger collections, such as researcher portfolios, institutions, and entire databases. To promote transparency, reproducibility, and further research, we openly release our model via Hugging Face (https://huggingface.co/asjc-classification), providing ready-to-use ASJC-based subject classification.
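The abstract does not spell out how the label-averaging method works, so the following is only a minimal sketch of one plausible reading, not the authors' procedure. It assumes fractional counting: each document splits a weight of 1 equally among its predicted labels, and a collection's profile is the mean of its documents' profiles. The function names `document_profile` and `collection_profile` and the sample portfolio are hypothetical.

```python
from collections import defaultdict


def document_profile(labels):
    """Split a weight of 1 equally across a document's labels (assumed scheme)."""
    if not labels:
        return {}
    share = 1.0 / len(labels)
    profile = defaultdict(float)
    for label in labels:
        profile[label] += share
    return dict(profile)


def collection_profile(documents):
    """Average the per-document profiles; unlabeled documents are skipped."""
    totals = defaultdict(float)
    counted = 0
    for labels in documents:
        profile = document_profile(labels)
        if not profile:
            continue
        counted += 1
        for label, weight in profile.items():
            totals[label] += weight
    if counted == 0:
        return {}
    return {label: weight / counted for label, weight in totals.items()}


# Hypothetical portfolio: each inner list holds one document's predicted labels.
portfolio = [
    ["Library and Information Sciences", "Artificial Intelligence"],
    ["Artificial Intelligence"],
    [],  # a document with no predicted label
]
profile = collection_profile(portfolio)
for label, weight in sorted(profile.items()):
    print(f"{label}: {weight:.2f}")
```

Under this assumption every labeled document contributes equally regardless of how many labels it carries, so profiles of collections with different sizes remain comparable.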
| Original language | English |
|---|---|
| Number of pages | 38 |
| Journal | Scientometrics |
| Publication status | Published - 01 Dec 2025 |
Fields of science
- 502015 Innovation management
- 502 Economics
JKU Focus areas
- Sustainable Development: Responsible Technologies and Management
- Digital Transformation