Multi-language dynamic taint analysis in a polyglot virtual machine

Jacob Kreindl, Daniele Bonetta, Lukas Stadler, David Leopoldseder, Hanspeter Mössenböck

Research output: Chapter in Book/Report/Conference proceedingConference proceedingspeer-review

Abstract

Dynamic taint analysis is a popular program analysis technique in which sensitive data is marked as tainted and the propagation of tainted data is tracked in order to determine whether that data reaches critical program locations. This analysis technique has been successfully applied to software vulnerability detection, malware analysis, testing and debugging, and many other fields. However, existing approaches of dynamic taint analysis are either language-specific or they target native code. Neither is suitable for analyzing applications in which high-level dynamic languages such as JavaScript and low-level languages such as C interact.In these approaches, the language boundary forms an opaque barrier that prevents a sound analysis of data flow in the other language and can thus lead to the analysis being evaded. In this paper we introduce TruffleTaint, a platform for multi-language dynamic taint analysis that uses language-independent techniques for propagating taint labels to overcome the language boundary but still allows for language-specific taint propagation rules. Based on the Truffle framework for implementing runtimes for programming languages, TruffleTaint supports propagating taint in and between a selection of dynamic and low-level programming languages and can be easily extended to support additional languages. We demonstrate TruffleTaint’s propagation capabilities and evaluate its performance using several benchmarks from the Computer Language Benchmarks Game, which we implemented as combinations of C, JavaScript and Python code and which we adapted to propagate taint in various scenarios of language interaction. Our evaluation shows that TruffleTaint causes low to zero slowdown when no taint is introduced, rivaling state-of-the-art dynamic taint analysis platforms, and only up to ∼40x slowdown when taint is introduced.
Original languageEnglish
Title of host publication17th International Conference on Managed Programming Languages and Runtimes (MPLR´20)
PublisherACM Digital Library
Pages15-29
Number of pages15
ISBN (Print)978-1-4503-8853-5
DOIs
Publication statusPublished - Nov 2020

Fields of science

  • 102 Computer Sciences
  • 102009 Computer simulation
  • 102011 Formal languages
  • 102013 Human-computer interaction
  • 102022 Software development
  • 102024 Usability research
  • 102029 Practical computer science

JKU Focus areas

  • Digital Transformation

Cite this