HiveBrain v1.2.0
Get Started
← Back to all entries
patternbashModerate

'find' template for handling any file

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
handlingtemplatefileanyforfind

Problem

After growing tired of all the ways in which a file loop can be broken (find -print | while read) or unreadable (find -exec with complex commands), I think I've managed to build a find template which can handle any and all files which could possibly exist on a Linux system (not so famous last words). Can you find a way to break it, by changing either the test_ variables or the environment? For example, is it possible to mess with file descriptor 9 outside the script so that it won't work?

The only requirement is "sanity." In other words, test_file_name and test_dir_path cannot contain \0 or /, test_file_path cannot contain \0 (or be more than 1 level deep, since mkdir for the sake of the test is run without -p), and /bin/bash must be a stable version of Bash 4.

``
#!/bin/bash
# Filenames can contain any character except only null (\0) and slash (/);
# here's some general rules to handle them:
#
# $'...' can be used to create human readable strings with escape sequences.
#
# ' -- ' in commands is necessary to separate arguments from filenames, since
# filenames can start with '--', and would therefore be handled as parameters.
# To handle parameters properly (like GNU tools) use
getopt.
#
#
find doesn't support this syntax, so we use readlink to get an absolute
# path which by definition starts with slash.
#
# The "$()" construct strips trailing newlines, so we have to add a different
# character and then strip it outside the "$()" construct.
#
#
IFS=` is necessary to avoid that any characters in IFS are stripped from
# the start and end of $path.
#
# '-r' avoids interpreting backslash in filenames specially.
#
# '-d '' splits filenames by the null character.
#
# '-print0' separates find output by null characters.
#
# Variables inside '$()' have to be quoted just like outside this construct.
#
# Use process substitution with "<(" instead of pipes to avoid broken pipes.
#
# Use file descriptor 9 for data storage instead of standa

Solution

This code is vulnerable to TOCTOU. There is a tiny gap between the time that "plain files" are read from the process substitution (find -type f ...) and the time that readlink(1) is called on those filenames.

An attacker could create a program that waits for your program to run and then quickly deletes one of those files found by find(1) and replaces it with a symlink to somewhere else. readlink(1) will then dutifully return the target of that symlink and this is the path that will be output. The target could be outside $absolute_dir_path and a file of any type (directory, device node, ...).

Eg:

#!/bin/bash

set -o errexit
set -o nounset
set -o noclobber

# setup directory containing two plain files
# hardcode absolute_dir_path to /tmp/dir for simplicity
mkdir -p /tmp/dir
rm -f /tmp/dir/a /tmp/dir/b
touch /tmp/dir/a /tmp/dir/b

# emulate OP's find loop, but with inserted actions
# performed by an attacker (in real attack these would
# happen in an external program).
exec 9< <( find /tmp/dir -type f -print0 )
while IFS= read -r -d '' -u 9
do
        file_path_x="$(readlink -fn -- "$REPLY"; echo x)"
        file_path="${file_path_x%x}"
        ls -l "${file_path}"
        # attacker jumps in here and does:
        rm /tmp/dir/b
        ln -s /etc/passwd /tmp/dir/b
done


Output:

-rw-r--r-- 1 martin martin 0 2011-03-31 10:56 /tmp/dir/a
-rw-r--r-- 1 root root 2119 2011-03-28 11:35 /etc/passwd


The risk can be mitigated by only seeking files in directories beneath which untrusted users do not have write access. Directories such as /tmp and /var/tmp are problematic though and this is hard to solve. See the source of (eg) tmpreaper for some ideas.

Code Snippets

#!/bin/bash

set -o errexit
set -o nounset
set -o noclobber

# setup directory containing two plain files
# hardcode absolute_dir_path to /tmp/dir for simplicity
mkdir -p /tmp/dir
rm -f /tmp/dir/a /tmp/dir/b
touch /tmp/dir/a /tmp/dir/b

# emulate OP's find loop, but with inserted actions
# performed by an attacker (in real attack these would
# happen in an external program).
exec 9< <( find /tmp/dir -type f -print0 )
while IFS= read -r -d '' -u 9
do
        file_path_x="$(readlink -fn -- "$REPLY"; echo x)"
        file_path="${file_path_x%x}"
        ls -l "${file_path}"
        # attacker jumps in here and does:
        rm /tmp/dir/b
        ln -s /etc/passwd /tmp/dir/b
done
-rw-r--r-- 1 martin martin 0 2011-03-31 10:56 /tmp/dir/a
-rw-r--r-- 1 root root 2119 2011-03-28 11:35 /etc/passwd

Context

StackExchange Code Review Q#1343, answer score: 13

Revisions (0)

No revisions yet.