Taint Analysis - Key Concepts Explained

TL;DR

Taint analysis is a static security analysis technique that tracks the flow of sensitive or untrusted data through a program without executing it.
It relies on clearly defined sources (origins of sensitive data), sinks (unsafe usage points), and sanitizers (code that neutralizes or validates data).
Taint analysis helps detect critical vulnerabilities such as SQL injection, XSS, data leaks, and misuse of sensitive information early in development.

Taint analysis identifies software vulnerabilities by following the flow of tainted data through a program. It shows developers how data is handled within their applications and whether sensitive information stays protected along the way. Used well, it reduces risks such as data leaks and injection attacks. As software systems grow more complex, these data flows become harder to review by hand, which makes automated tracking more valuable. The following sections cover the core concepts, inner workings, and applications of taint analysis.

What is Taint Analysis?

Taint analysis is a core technique used in Static Application Security Testing (SAST). It tracks the flow of data of particular interest through the program to ensure it is not mishandled. Integrating well-configured taint analyses into the development process helps developers address security risks before they turn into vulnerabilities.

Unlike testing or dynamic analysis, static taint analysis does not execute the program under analysis. It can therefore consider every possible path of execution, not only the paths that happen to run in a test. This makes it well suited to supporting robust security practices and compliance with regulatory standards.

Key Concepts in Taint Analysis

Taint analysis is instantiated by a configuration that describes which flows of data through the program under analysis shall be followed and which shall not. There are three core configuration aspects: sources, sinks and sanitizers.

A source is an origin of data whose flow through the analyzed program the taint analysis shall follow, typically a call to an API that returns user-controlled input or secrets.
A sink is a program location that consumes data and that tainted data must not reach, such as an argument of a call that writes to a database.
A sanitizer is code in the analyzed program that either checks whether the provided data contains sensitive information or transforms the data so that it no longer needs to be considered sensitive.

As an example, consider a taint analysis that shall detect plaintext passwords being saved to a database. Sources would be all calls to routines that return the plaintext password value provided by the user. Sinks would be the arguments of calls to routines that add or modify database contents. Sanitizers would be calls to routines that securely hash a given password.
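The password example can be written down as a small configuration plus a checker. The following Python code is a minimal sketch, not the format of any particular tool: the routine names (`read_password_field`, `hash_password`, `db_insert`) and the toy statement representation are hypothetical, and the checker only handles straight-line code.

```python
from dataclasses import dataclass

# Hypothetical configuration; all routine names are made up for illustration.
@dataclass(frozen=True)
class TaintConfig:
    sources: frozenset      # routines whose return value is tainted
    sinks: frozenset        # routines whose arguments must not be tainted
    sanitizers: frozenset   # routines whose return value is considered clean

PASSWORD_CONFIG = TaintConfig(
    sources=frozenset({"read_password_field"}),
    sinks=frozenset({"db_insert"}),
    sanitizers=frozenset({"hash_password"}),
)

def check(program, config):
    """Return the indices of statements where tainted data reaches a sink.

    `program` is a toy straight-line program: a list of
    (target, callee, arguments) tuples, where target is None if the
    result of the call is discarded.
    """
    tainted = set()
    findings = []
    for index, (target, callee, args) in enumerate(program):
        any_tainted_arg = any(arg in tainted for arg in args)
        if callee in config.sinks and any_tainted_arg:
            findings.append(index)
        if target is None:
            continue
        if callee in config.sources:
            tainted.add(target)
        elif callee in config.sanitizers:
            tainted.discard(target)      # result of a sanitizer is clean
        elif any_tainted_arg:
            tainted.add(target)          # taint propagates through other calls
        else:
            tainted.discard(target)      # target overwritten with clean data
    return findings

INSECURE = [
    ("pw", "read_password_field", []),
    (None, "db_insert", ["user", "pw"]),          # plaintext password stored
]
SECURE = [
    ("pw", "read_password_field", []),
    ("digest", "hash_password", ["pw"]),
    (None, "db_insert", ["user", "digest"]),      # only the hash is stored
]

print(check(INSECURE, PASSWORD_CONFIG))  # [1]
print(check(SECURE, PASSWORD_CONFIG))    # []
```

The checker itself knows nothing about passwords. Everything specific to the problem lives in the configuration, so a different set of sources, sinks, and sanitizers turns the same checker into a different analysis.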

How Taint Analysis Works

Taint analysis typically constructs one or more graphs that represent the analyzed program. This reduces the problem of taint checking to graph reachability, which is efficiently solvable. To construct these graphs, taint analysis relies on helper analyses such as call graph construction and pointer analysis.
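The reduction to graph reachability can be sketched in a few lines of Python. This is an illustrative simplification under stated assumptions: the data-flow graph is given by hand as an adjacency dictionary, whereas a real analysis would derive it from the program with the help of call graph and pointer analysis. All node names are hypothetical.

```python
from collections import deque

def find_tainted_paths(edges, sources, sinks, sanitizers):
    """Breadth-first search from the sources.

    Returns a dict mapping each reachable sink to one shortest
    source-to-sink path. Flows that pass through a sanitizer node
    are not followed any further.
    """
    parent = {source: None for source in sources}
    queue = deque(sources)
    findings = {}
    while queue:
        node = queue.popleft()
        if node in sinks and node not in findings:
            # Reconstruct the path by walking the parent links backwards.
            path = []
            current = node
            while current is not None:
                path.append(current)
                current = parent[current]
            findings[node] = path[::-1]
        for successor in edges.get(node, ()):
            if successor in sanitizers or successor in parent:
                continue
            parent[successor] = node
            queue.append(successor)
    return findings

# Hypothetical data-flow graph: an edge a -> b means "data flows from a to b".
EDGES = {
    "read_input": ["name"],
    "name": ["query", "escape"],
    "escape": ["safe_name"],
    "safe_name": ["log_line"],
    "query": ["db_execute"],
    "log_line": ["write_log"],
}

paths = find_tainted_paths(
    EDGES,
    sources=["read_input"],
    sinks={"db_execute", "write_log"},
    sanitizers={"escape"},
)
print(paths)  # {'db_execute': ['read_input', 'name', 'query', 'db_execute']}
```

Only `db_execute` is reported. The flow into `write_log` passes through the sanitizer `escape` and is therefore cut off. The returned path is also what lets a tool explain a finding step by step.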

The graph representation built by any static analysis is always an approximation of the analyzed program, since in general it is undecidable whether an arbitrary program may really propagate sensitive data to a sink at runtime. Hence, false positive and false negative findings cannot be excluded for all programs. Reducing the number of false findings is desirable but requires more precise, and therefore more costly, approximations.
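How an approximation leads to false findings can be shown with a toy analysis that does not evaluate branch conditions. The Python sketch below is an assumption-laden illustration, not a description of how any specific tool works: the mini statement format is hypothetical, and the `join` parameter decides how the taint information of the two branches of an `if` is merged.

```python
def analyze(block, tainted=frozenset(), join=frozenset.union):
    """Return (tainted variables after the block, list of sink findings).

    Statements of the hypothetical mini language:
      ("source", var)        var receives tainted data
      ("const", var)         var receives a clean constant
      ("sink", var)          var is passed to a sink
      ("if", then, else_)    branch; the condition itself is NOT modelled
    """
    findings = []
    for statement in block:
        kind = statement[0]
        if kind == "source":
            tainted = tainted | {statement[1]}
        elif kind == "const":
            tainted = tainted - {statement[1]}
        elif kind == "sink":
            if statement[1] in tainted:
                findings.append(statement[1])
        elif kind == "if":
            then_taint, then_findings = analyze(statement[1], tainted, join)
            else_taint, else_findings = analyze(statement[2], tainted, join)
            tainted = join(then_taint, else_taint)   # merge both branches
            findings += then_findings + else_findings
    return tainted, findings

# Imagine the condition of this `if` can never be true at runtime.
PROGRAM = [
    ("const", "data"),
    ("if", [("source", "data")], []),
    ("sink", "data"),
]

# Over-approximation: a variable is tainted if it is tainted on ANY branch.
print(analyze(PROGRAM)[1])                               # ['data']
# Under-approximation: tainted only if tainted on ALL branches.
print(analyze(PROGRAM, join=frozenset.intersection)[1])  # []
```

With the union, the analysis reports a flow that cannot happen if the branch is dead, which is a false positive. With the intersection, it stays silent, which would be a false negative if the branch can in fact be taken. Deciding which case applies requires reasoning about the condition, and that extra precision is what makes an analysis more expensive.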

Applications of Taint Analysis

Many of the most prominent common security weaknesses and software vulnerabilities can be detected with static taint analysis, including but not limited to SQL injection, cross-site scripting, use-after-free, double-free, memory leaks and API misuse.

A common application of taint analysis is detecting various potential security issues. Taint can be used in two ways to do so. For detecting injection attacks, being “tainted” means that a value comes from the outside and should not be trusted. Such a value should not go directly to a database command (SQL injection) or be part of executable code (code injection). Taint analysis can show where such values flow and where additional sanitization might be needed to make sure the values are safe to use.
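To make the injection case concrete, here is a minimal Python sketch using the standard `sqlite3` module. The table layout and function names are hypothetical. A taint analysis configured for SQL injection would be expected to flag `lookup_unsafe`, where the untrusted `name` flows into the query text, and to accept `lookup_safe`, assuming its configuration treats bound parameters as safe.

```python
import sqlite3

def make_db():
    """Create a small in-memory database for the demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn

def lookup_unsafe(conn, name):
    # Source: `name` is untrusted input.
    # The tainted value becomes part of the SQL text itself ...
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    # ... and reaches the sink: the statement passed to execute().
    return conn.execute(query).fetchall()

def lookup_safe(conn, name):
    # The SQL text is a constant; `name` is bound as a parameter and
    # is never interpreted as SQL.
    return conn.execute(
        "SELECT secret FROM users WHERE name = ?", (name,)
    ).fetchall()

conn = make_db()
payload = "x' OR '1'='1"
print(len(lookup_unsafe(conn, payload)))  # 2: every secret is returned
print(lookup_safe(conn, payload))         # []: no user has that name
```

The payload closes the string literal and appends a condition that is always true, so the unsafe variant returns every row.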

The opposite direction, however, can also be interesting. In this case, “tainted” can mean that data is private and should not leave trusted boundaries. Think of using taint analysis to follow the flow of private data such as patient data or banking information to ensure that it is not sent over networks or ends up in log files where it could be read in cleartext.
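Both readings of "tainted" can be served by the same reachability check; only the configuration changes. The Python sketch below illustrates this under simplifying assumptions: the data-flow graph is written by hand, all node names are hypothetical, and treating encryption as a sanitizer for the privacy configuration is a policy choice made for the example.

```python
def reachable_sinks(flows, sources, sinks, sanitizers=frozenset()):
    """Return the sorted list of sinks reachable from any source.

    Flows that pass through a sanitizer node are not followed.
    """
    seen = set()
    stack = list(sources)
    while stack:
        node = stack.pop()
        if node in seen or node in sanitizers:
            continue
        seen.add(node)
        stack.extend(flows.get(node, ()))
    return sorted(seen & set(sinks))

# Hypothetical data-flow graph of a small application.
FLOWS = {
    "http_param": ["search_term"],
    "search_term": ["sql_text"],
    "sql_text": ["sql_execute"],
    "patient_record": ["diagnosis"],
    "diagnosis": ["debug_message", "encrypt"],
    "debug_message": ["log_write"],
    "encrypt": ["ciphertext"],
    "ciphertext": ["socket_send"],
}

# Injection: tainted means "comes from outside and is not trusted".
injection = reachable_sinks(
    FLOWS, sources={"http_param"}, sinks={"sql_execute"}
)

# Privacy: tainted means "private and must not leave trusted boundaries".
privacy = reachable_sinks(
    FLOWS,
    sources={"patient_record"},
    sinks={"log_write", "socket_send"},
    sanitizers={"encrypt"},
)

print(injection)  # ['sql_execute']
print(privacy)    # ['log_write']
```

In the privacy configuration, the encrypted transfer is accepted, while the diagnosis that ends up in a debug log message is reported.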

Ready to Reduce Risks in Your Software System?

Axivion has you covered. Axivion Static Code Analysis provides ready-to-use analysis rules that cover large parts of MISRA, CERT, AUTOSAR, and CWE. Many of these rules configure and use a taint analysis and produce detailed explanations of their findings. Reach out to our experts if you want to explore your options.

Take the chance to register for our recently held webinar and get practical insights into this topic.
