Discussion:
[dev] [sbase] Bugs in find
Tavian Barnes
2018-09-27 00:48:07 UTC
Permalink
Hi! As the author of a find-compatible tool, whenever I find another
find implementation I run my testsuite against it to see if I find any
bugs in either one. sbase/find helped me identify many places in my
POSIX tests that use extensions to POSIX when they shouldn't, so
thanks! Here's the bugs that I found in sbase/find:

- Errors reported for broken symlinks in -L/-H modes -

$ mkdir foo
$ ln -s nowhere foo/broken
$ touch foo/bar
$ ln -s bar/baz foo/notdir
$ ./find -L foo
foo
foo/bar
./find: failed to stat foo/broken: No such file or directory
./find: failed to stat foo/notdir: Not a directory
$ ./find -H foo/broken
./find: failed to stat foo/broken: No such file or directory
$ ./find -H foo/notdir
./find: failed to stat foo/notdir: Not a directory

For -H/-L, POSIX says
(http://pubs.opengroup.org/onlinepubs/9699919799/utilities/find.html)
If the referenced file does not exist, the file information and type shall be for the link itself.
- Names longer than PATH_MAX are broken -

POSIX says
The find utility shall be able to descend to arbitrary depths in a file hierarchy and shall not fail due to path length limitations (unless a path operand specified by the application exceeds {PATH_MAX} requirements).
but:

$ name="0123456789ABCDEF"
$ name="${name}${name}${name}${name}"
$ name="${name}${name}${name}${name}"
$ name="${name:0:255}"
$ (mkdir deep && cd deep && for i in {1..17}; do mkdir $name && cd $name; done)
$ ./find deep
...
./find: failed to stat deep/...ABCDE: File name too long

Despite that, it exits successfully, which is probably also a bug.

- find -newer follows symbolic links -

$ mkdir foo
$ touch foo/bar
$ ln -s bar foo/baz
$ touch foo/qux
$ ./find foo -newer foo/baz
foo
foo/baz
foo/qux

foo/baz is not newer than itself though.

- find -perm doesn't support X -

$ ./find . -perm a+rX,u+w
./find: a+rX,u+w: invalid mode

About the mode format for -perm, POSIX says
It shall be identical in format to the symbolic_mode operand described in chmod
chmod supports X, so find should too.

If you want to run these tests yourself, clone
https://github.com/tavianator/bfs and run

$ ./tests.sh --bfs=path/to/sbase/find --posix

--
Tavian Barnes
Evan Gates
2018-09-27 15:03:53 UTC
Permalink
Post by Tavian Barnes
Hi! As the author of a find-compatible tool, whenever I find another
find implementation I run my testsuite against it to see if I find any
bugs in either one. sbase/find helped me identify many places in my
POSIX tests that use extensions to POSIX when they shouldn't, so
thanks!
Awesome! Those extensions sneak in don't they?
Post by Tavian Barnes
- Errors reported for broken symlinks in -L/-H modes -
$ mkdir foo
$ ln -s nowhere foo/broken
$ touch foo/bar
$ ln -s bar/baz foo/notdir
$ ./find -L foo
foo
foo/bar
./find: failed to stat foo/broken: No such file or directory
./find: failed to stat foo/notdir: Not a directory
$ ./find -H foo/broken
./find: failed to stat foo/broken: No such file or directory
$ ./find -H foo/notdir
./find: failed to stat foo/notdir: Not a directory
For -H/-L, POSIX says
(http://pubs.opengroup.org/onlinepubs/9699919799/utilities/find.html)
If the referenced file does not exist, the file information and type
shall be for the link itself.
Good catch.
Post by Tavian Barnes
- Names longer than PATH_MAX are broken -
POSIX says
The find utility shall be able to descend to arbitrary depths in
a file hierarchy and shall not fail due to path length limitations
(unless a path operand specified by the application exceeds {PATH_MAX}
requirements).
OK. Only the paths given as arguments can fail due to path lengths.
Got it.
Post by Tavian Barnes
$ name="0123456789ABCDEF"
$ name="${name}${name}${name}${name}"
$ name="${name}${name}${name}${name}"
$ name="${name:0:255}"
$ (mkdir deep && cd deep && for i in {1..17}; do mkdir $name && cd $name; done)
$ ./find deep
...
./find: failed to stat deep/...ABCDE: File name too long
Interesting. It looks like we'd have to either cd() or fstatat(). I'm
thinking we could use dirfd() to get an fd from the DIR* we already have.
Then we could pass the fd along in order to use fstatat(). If not for the
-path primary we could just pass the fd and the filename when recursing.
But because of -path it would probably make the most sense to pass the
fd, filename, and full path.
Post by Tavian Barnes
Despite that, it exits successfully, which is probably also a bug.
Good point. I wanted to keep on processing instead of exiting
immediately, but it makes sense to set a flag so we exit failure.
Post by Tavian Barnes
- find -newer follows symbolic links -
$ mkdir foo
$ touch foo/bar
$ ln -s bar foo/baz
$ touch foo/qux
$ ./find foo -newer foo/baz
foo
foo/baz
foo/qux
foo/baz is not newer than itself though.
Well that's just weird. In a brief look at the code I'm not sure why
that's happening.
Post by Tavian Barnes
- find -perm doesn't support X -
$ ./find . -perm a+rX,u+w
./find: a+rX,u+w: invalid mode
That seems to be missing from libutil/mode.c
Post by Tavian Barnes
If you want to run these tests yourself, clone
https://github.com/tavianator/bfs and run
$ ./tests.sh --bfs=path/to/sbase/find --posix
Cool!

Always happy to see someone excited about sbase. Thanks for reporting
the bugs. Would you be interested in submitting patches, too? That would
be most helpful. You can think of it as an excuse to see the inner
workings of another find implementation :-)
Tavian Barnes
2018-09-28 02:56:56 UTC
Permalink
Post by Evan Gates
Post by Tavian Barnes
Hi! As the author of a find-compatible tool, whenever I find another
find implementation I run my testsuite against it to see if I find any
bugs in either one. sbase/find helped me identify many places in my
POSIX tests that use extensions to POSIX when they shouldn't, so
thanks!
Awesome! Those extensions sneak in don't they?
Yeah they sure do.
Post by Evan Gates
[snip]
Interesting. It looks like we'd have to either cd() or fstatat(). I'm
thinking we could use dirfd() to get an fd from the DIR* we already have.
Then we could pass the fd along in order to use fstatat(). If not for the
-path primary we could just pass the fd and the filename when recursing.
But because of -path it would probably make the most sense to pass the
fd, filename, and full path.
fstatat() is better than chdir() IMO. Can't forget to chdir() back in
-exec for example. Your coding style says you use POSIX 2008 so we
can rely on it. There's additional complications if you want to
handle directory structures deeper than OPEN_MAX.

Any objections to changing find.c to use recurse()? That way this
could be fixed for a few tools at once.
Post by Evan Gates
Post by Tavian Barnes
Despite that, it exits successfully, which is probably also a bug.
Good point. I wanted to keep on processing instead of exiting
immediately, but it makes sense to set a flag so we exit failure.
Makes sense!
Post by Evan Gates
[snip]
Post by Tavian Barnes
foo/baz is not newer than itself though.
Well that's just weird. In a brief look at the code I'm not sure why
that's happening.
Looks like stat() instead of lstat() in get_newer_arg(). A strict
reading of the POSIX spec implies it should always be lstat() (the
-H/-L docs speak only about the path operands, not other arguments
that happen to be paths), but I don't know of any find implementations
that do that. GNU find and several others respect -H/-L for whether
to follow symbolic links passed to -newer. But the POSIX conformance
tests check for this.
Post by Evan Gates
Post by Tavian Barnes
- find -perm doesn't support X -
$ ./find . -perm a+rX,u+w
./find: a+rX,u+w: invalid mode
That seems to be missing from libutil/mode.c
I see. So this is missing from sbase/chmod too I guess. Guess I can
fix both at once :)
Post by Evan Gates
[snip]
Always happy to see someone excited about sbase. Thanks for reporting
the bugs. Would you be interested in submitting patches, too? That would
be most helpful. You can think of it as an excuse to see the inner
workings of another find implementation :-)
Yep, I'll start digging into it.
--
Tavian Barnes
Continue reading on narkive:
Loading...