Beyond the Hype: Assessing Limitations of Large Language Models in Support Ticket Anonymization

Activity: Talk or presentation, Contributed talk, science-to-science

Description

This work focuses on data anonymization in the domain of support ticket data. Textual data collected during the customer support process often carries sensitive personally identifiable information (PII). In the context of this work, data anonymization is defined as the redaction of PII from textual data. Until now, transformer-based Named Entity Recognition (NER) models have been considered state of the art for anonymizing unstructured textual data. The recent success of large language models (LLMs) in various NLP tasks raises the question of whether they can be applied effectively to the anonymization of support ticket data. This study therefore aims to close the research gap in the anonymization and de-identification of unstructured data that various scientific works have identified, by applying LLM-based anonymization to a real-world use case: the support ticket process of an industrial automation company.
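
To make the definition of anonymization as PII redaction concrete, here is a minimal sketch of the redaction step alone. It assumes some detector (an NER model or an LLM) has already produced character spans. The span format, the `redact` and `span_of` helpers, and the sample ticket are hypothetical illustrations, not the study's implementation.

```python
from typing import List, Tuple

# Hypothetical span format: (start, end, label), with end exclusive.
Span = Tuple[int, int, str]

def span_of(text: str, needle: str, label: str) -> Span:
    """Build a span for the first occurrence of needle (stands in for a detector)."""
    start = text.index(needle)
    return (start, start + len(needle), label)

def redact(text: str, spans: List[Span]) -> str:
    """Replace each detected PII span with a placeholder such as [EMAIL]."""
    out = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start < cursor:
            # Skip spans that overlap a region that was already redacted.
            continue
        out.append(text[cursor:start])
        out.append(f"[{label}]")
        cursor = end
    out.append(text[cursor:])
    return "".join(out)

# Invented sample ticket; no real customer data.
ticket = "Hello, this is Jane Doe (jane.doe@example.com), my drive shows an error."
spans = [
    span_of(ticket, "Jane Doe", "PERSON"),
    span_of(ticket, "jane.doe@example.com", "EMAIL"),
]
print(redact(ticket, spans))
# prints: Hello, this is [PERSON] ([EMAIL]), my drive shows an error.
```

The quality of the result depends entirely on the detector that supplies the spans, which is the part the study compares.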

The results of the study reveal significant limitations of LLMs in overall performance, particularly in real-world anonymization scenarios. In the given context, the LLM-based approaches to PII detection perform significantly worse than the more traditional approach, trailing the benchmark in both precision and recall. The established anonymization solution, which uses an ensemble architecture of specific transformer and NER technologies, demonstrated superior performance in detecting PII and anonymizing the support ticket data. The findings of this study provide a nuanced understanding of the role and limitations of LLMs in the domain of support ticket anonymization. Despite the advancements and hype surrounding LLMs, their applicability can be limited by the nature of the task and the specific requirements of the domain.
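
As an illustration of the two metrics named above, the following sketch computes precision and recall for PII detection over labeled character spans. It assumes an exact-match criterion (same start, end, and label). The abstract does not state the matching rule the study used, and the spans below are invented.

```python
from typing import Iterable, Tuple

Span = Tuple[int, int, str]  # hypothetical format: (start, end, label)

def precision_recall(predicted: Iterable[Span], gold: Iterable[Span]) -> Tuple[float, float]:
    """Exact-match precision and recall over PII spans."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: share of predicted spans that are real PII.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of real PII spans that were found.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

gold = [(15, 23, "PERSON"), (25, 45, "EMAIL")]
predicted = [(15, 23, "PERSON"), (50, 55, "PERSON")]  # one hit, one false alarm
p, r = precision_recall(predicted, gold)
print(p, r)
# prints: 0.5 0.5
```

In an anonymization setting, low recall means PII leaks through unredacted, while low precision means harmless text is redacted and the ticket loses useful content.
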
Period: 22 Jun 2025
Event title: 27th International Conference on Human-Computer Interaction
Event type: Conference
Location: Gothenburg, Sweden
Degree of Recognition: International

Fields of science

  • 102020 Medical informatics
  • 102022 Software development
  • 102006 Computer supported cooperative work (CSCW)
  • 102027 Web engineering
  • 502050 Business informatics
  • 102040 Quantum computing 
  • 102016 IT security
  • 503015 Subject didactics of technical sciences
  • 509026 Digitalisation research
  • 102015 Information systems
  • 102034 Cyber-physical systems
  • 502032 Quality management
  • 211928 Systems engineering

JKU Focus areas

  • Digital Transformation