LowEndBox - Cheap VPS, Hosting and Dedicated Server Deals

How to Find Files in Linux
There’s nothing more frustrating than knowing that a file exists on your system but not knowing where it is. It’s like losing your car keys or misplacing your phone.

Sometimes you’ll have contextual clues – for example, system configuration files are usually in /etc. But even there, you’ve got 253 directories and subdirectories and over 1,730 files.

Let’s look at some tools that make finding files easier.

Setting Up a VPS

I’ve set up a Debian 10 VPS and done the following:

  • created two users, named frank and mary
  • in /home/mary, I’ve created a directory called diary, placed some files in it (entry1.txt, entry2.txt, etc.), and made it mode 700 so only mary (and root) can see files in that directory.
  • in /home/frank, I’ve created files called “plan1.txt” and “plan2.txt”, both of which are mode 600, so only frank (and root) can read them.
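If you want to get a feel for those modes without creating users, here is a minimal sketch that rebuilds the same layout in a throwaway directory. The scratch variable is a hypothetical stand-in for /home, and everything here is owned by whoever runs the script, so it only illustrates the modes, not the frank/mary separation.

```shell
#!/bin/sh
# Sketch: reproduce the permission modes from the list above in a scratch tree.
# "scratch" is a hypothetical stand-in for /home; no users are created.
scratch=$(mktemp -d)

mkdir -p "$scratch/mary/diary" "$scratch/frank"
touch "$scratch/mary/diary/entry1.txt" "$scratch/mary/diary/entry2.txt"
touch "$scratch/frank/plan1.txt" "$scratch/frank/plan2.txt"

chmod 700 "$scratch/mary/diary"                                   # rwx for the owner only
chmod 600 "$scratch/frank/plan1.txt" "$scratch/frank/plan2.txt"   # rw for the owner only

# find's -perm test with a plain octal mode matches exactly that mode.
private_dir=$(find "$scratch" -type d -name diary -perm 700)
private_files=$(find "$scratch/frank" -type f -perm 600 | sort)

echo "$private_dir"
echo "$private_files"

rm -rf "$scratch"
```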

The mlocate package

mlocate (Merging Locate) is a package that builds a global database of files that can be queried to find files. locate has a long history in Unix (1982). slocate was an improvement on locate and mlocate is a further improvement which seeks to operate more quickly and not blow up the filesystem cache when it updates.

mlocate also only shows files that are available to the user, not all files. All these programs run as unprivileged users, so you needn’t worry that user A is going to see user B’s file called /home/userb/my-secret-diary.txt.

To install on Debian:

apt-get install mlocate

Right after installation the database does not exist yet, so trying to use the locate command produces an error:

root@vnc:~# locate hosts
locate: can not stat () `/var/lib/mlocate/mlocate.db': No such file or directory

Let’s get the database up to date:

# time updatedb

real 0m0.593s
user 0m0.018s
sys 0m0.112s

On an AMD EPYC 7601 with a 25GB RAID SSD disk that is 95% free, updating the database takes less than a second. For comparison, on my home fileserver (an i3 with 6TB of spinning-disk RAID-1 on less-than-enterprise-grade storage), updating the database can take several minutes.

Let’s see what the mlocate database looks like:

root@vnc:~# locate -S
Database /var/lib/mlocate/mlocate.db:
3,260 directories
33,859 files
1,433,462 bytes in file names
640,990 bytes used to store database

Now that we’re up to date, I can query. As root:

root@vnc:~# locate hosts
/etc/hosts
/etc/hosts.allow
/etc/hosts.deny
/usr/lib/x86_64-linux-gnu/security/pam_rhosts.so
/usr/share/man/man5/hosts.5.gz
/usr/share/man/man5/hosts.allow.5.gz
/usr/share/man/man5/hosts.deny.5.gz
/usr/share/man/man5/hosts.equiv.5.gz
/usr/share/man/man5/hosts_access.5.gz
/usr/share/man/man5/hosts_options.5.gz
/usr/share/man/man8/pam_rhosts.8.gz
/usr/share/vim/vim81/ftplugin/denyhosts.vim
/usr/share/vim/vim81/ftplugin/hostsaccess.vim
/usr/share/vim/vim81/syntax/denyhosts.vim
/usr/share/vim/vim81/syntax/hostsaccess.vim
/usr/share/zsh/vendor-completions/_sd_hosts_or_user_at_host

If I wanted to do a case-insensitive locate, I could use locate -i.

Because I’m root, I can see files in mary’s home directory:

# locate entry1.txt
/home/mary/diary/entry1.txt

But if I’m frank, I cannot:

# su - frank
$ locate entry1.txt
$

updatedb runs nightly from /etc/cron.daily/mlocate to keep the database up to date.

One weakness of locate is that if you create a file during the day, you have to wait until the overnight run of updatedb (or run it manually) before the file is in the database. For real-time queries, we can use find.
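Running updatedb by hand does not have to mean rebuilding the system-wide database as root. As a rough sketch, assuming an mlocate-style updatedb that accepts the -l, -o and -U options and a locate that accepts -d, you can index a single tree into a private database and query that. The tree and db paths and the plan3.txt file are made up for the illustration, and the script skips itself if the tools are missing or reject those options.

```shell
#!/bin/sh
# Sketch: build a private locate database for one tree and query it.
# Assumes mlocate-style options (updatedb -l/-o/-U, locate -d).
tree=$(mktemp -d)      # hypothetical directory to index
db="$tree.db"          # hypothetical database file
touch "$tree/plan3.txt"

status=skipped
hits=""
if command -v updatedb >/dev/null 2>&1 && command -v locate >/dev/null 2>&1; then
    # -l 0 : do not require visibility checks (the database is private)
    # -o   : write the database to this file instead of the system one
    # -U   : index only this tree instead of /
    if updatedb -l 0 -o "$db" -U "$tree" 2>/dev/null; then
        hits=$(locate -d "$db" plan3.txt)
        status=queried
    fi
fi
echo "$status: $hits"

rm -rf "$tree" "$db"
```

The freshly created file shows up as soon as the private database is rebuilt, with no need to wait for the nightly run.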

Using find

Find is a very old Linux command that goes back to Unix version 5 (1978!). It does a real-time search of filesystems. Unlike locate, you can search with criteria besides just a name.

The general format for find is

find PATH EXPRESSIONS... ACTIONS...

Let’s say I wanted to find /etc/passwd. I would type:

find /etc -name passwd -print

This means

  • “go look in the /etc directory and all its subdirectories”
  • “match files named ‘passwd’”
  • “print out each file you find”

Here are the results:

# find /etc -name passwd -print
/etc/pam.d/passwd
/etc/cron.daily/passwd
/etc/passwd
#

The -print is optional: it is the default action when you don’t specify one, so if you leave it off, you’ll get the same result.
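To check that for yourself without touching the real /etc, here is a small sketch against a scratch tree. The etc_like directory and the files in it are hypothetical stand-ins. It also shows that -name compares against the whole file name, so a file called hosts is not matched by passwd.

```shell
#!/bin/sh
# Sketch: compare find output with and without an explicit -print.
# "etc_like" is a hypothetical scratch tree standing in for /etc.
etc_like=$(mktemp -d)
mkdir -p "$etc_like/pam.d" "$etc_like/cron.daily"
touch "$etc_like/passwd" "$etc_like/pam.d/passwd" \
      "$etc_like/cron.daily/passwd" "$etc_like/hosts"

with_print=$(find "$etc_like" -name passwd -print | sort)
without_print=$(find "$etc_like" -name passwd | sort)

if [ "$with_print" = "$without_print" ]; then
    echo "identical output"
fi

# Count the matches; "hosts" is not among them.
match_count=$(printf '%s\n' "$with_print" | grep -c passwd)
echo "$match_count matches"

rm -rf "$etc_like"
```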

Of course, I may not know the directory, so I could run this against the root filesystem:

# find / -name passwd -print
/etc/pam.d/passwd
/etc/cron.daily/passwd
/etc/passwd
/usr/bin/passwd
/usr/share/doc/passwd
/usr/share/lintian/overrides/passwd
/usr/share/bash-completion/completions/passwd
#

-iname is the case-insensitive parallel to -name.
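A quick sketch of the difference, using made-up file names in a scratch directory. Both tests take shell-style wildcards, which should be quoted so that the shell passes them to find untouched.

```shell
#!/bin/sh
# Sketch: -name is case-sensitive, -iname is not.
# The scratch directory and file names are hypothetical.
scratch=$(mktemp -d)
touch "$scratch/Plan1.TXT" "$scratch/plan2.txt" "$scratch/notes.md"

# Quote the pattern so the shell does not expand the * itself.
exact_count=$(find "$scratch" -type f -name 'plan*.txt' | grep -c .)
any_case_count=$(find "$scratch" -type f -iname 'plan*.txt' | grep -c .)

echo "-name matched $exact_count, -iname matched $any_case_count"

rm -rf "$scratch"
```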

With find, I can also find based on other criteria. Some examples to whet your appetite:

# dd if=/dev/zero of=/root/bigfile bs=1048576 count=512
512+0 records in
512+0 records out
536870912 bytes (537 MB, 512 MiB) copied, 0.490124 s, 1.1 GB/s
# find / -size +100M -print
/proc/kcore
find: ‘/proc/2381/task/2381/fd/6’: No such file or directory
find: ‘/proc/2381/task/2381/fdinfo/6’: No such file or directory
find: ‘/proc/2381/fd/5’: No such file or directory
find: ‘/proc/2381/fdinfo/5’: No such file or directory
/root/bigfile

Here I’ve created a 512MB file and then searched for files bigger than 100M (“-size +100M”). The errors appear because I asked find to search / and that includes /proc: some processes that were running when find started had exited by the time it reached their entries, so those files no longer existed.
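If that noise bothers you, one approach is to tell find not to descend into /proc at all and to throw away whatever error messages remain. Here is a minimal sketch on a scratch tree, where root is a hypothetical stand-in for / and the file sizes are scaled down to keep it quick.

```shell
#!/bin/sh
# Sketch: find large files while skipping one subtree and hiding errors.
# "root" is a hypothetical stand-in for /; sizes are scaled down.
root=$(mktemp -d)
mkdir -p "$root/proc" "$root/root"
dd if=/dev/zero of="$root/root/bigfile" bs=1024 count=2048 2>/dev/null   # 2 MiB
dd if=/dev/zero of="$root/proc/kcore"   bs=1024 count=2048 2>/dev/null   # 2 MiB decoy
dd if=/dev/zero of="$root/root/small"   bs=1024 count=10   2>/dev/null   # 10 KiB

# -path ... -prune keeps find out of the proc subtree entirely;
# 2>/dev/null discards any error messages that are still produced.
big=$(find "$root" -path "$root/proc" -prune -o -type f -size +1M -print 2>/dev/null)
echo "$big"

rm -rf "$root"
```

On a real system the same shape would be find / -path /proc -prune -o -size +100M -print 2>/dev/null. Another option is -xdev, which keeps find from descending into directories on other filesystems, and /proc is one of those.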

I can also find files based on date:

# mkdir /backup
# touch -t 201008201111 /backup/some_old_backup.tar.gz
# touch /backup/current_backup.tar.gz
# ll /backup
total 8
drwxr-xr-x 2 root root 4096 May 24 18:42 .
drwxr-xr-x 19 root root 4096 May 24 18:40 ..
-rw-r--r-- 1 root root 0 May 24 18:42 current_backup.tar.gz
-rw-r--r-- 1 root root 0 Aug 20 2010 some_old_backup.tar.gz
# find /backup -mtime +30 -print
/backup/some_old_backup.tar.gz

That find expression (“-mtime +30”) means “last modified more than 30 days ago”.
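The sign in front of the number matters: a plus means “more than” and a minus means “less than”. Here is a small sketch of both directions in a scratch directory, where backup is a hypothetical stand-in for /backup.

```shell
#!/bin/sh
# Sketch: -mtime +N (older than N days) versus -mtime -N (newer than N days).
# "backup" is a hypothetical stand-in for /backup.
backup=$(mktemp -d)
touch -t 201008201111 "$backup/some_old_backup.tar.gz"   # dated Aug 20 2010
touch "$backup/current_backup.tar.gz"                    # dated now

older=$(find "$backup" -type f -mtime +30)    # modified more than 30 days ago
recent=$(find "$backup" -type f -mtime -1)    # modified less than 1 day ago

echo "older:  $older"
echo "recent: $recent"

rm -rf "$backup"
```

GNU find counts age in whole 24-hour periods and discards the fraction, so “+30” in practice means at least 31 full days old.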

There is much more you can do with find – for example, printing is not the only action. There is also a galaxy of expressions you can use to search by: owner, group owner, newer than or older than other files, type of file, etc. Consult the find man page to learn all about find.
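As a taste, here is a rough sketch that combines a few of those tests with an action other than printing. The scratch tree and file names are hypothetical; the -type, -user, -newer, -perm and -exec options are standard find options.

```shell
#!/bin/sh
# Sketch: a few other find tests, plus -exec as an action.
# The scratch tree and file names are hypothetical.
scratch=$(mktemp -d)
mkdir "$scratch/dir1"
touch -t 202001010000 "$scratch/reference"               # an old reference file
touch "$scratch/newer.txt" "$scratch/dir1/also_newer.txt"

dirs=$(find "$scratch" -type d | grep -c .)                          # directories only
mine=$(find "$scratch" -type f -user "$(id -u)" | grep -c .)         # files owned by me
newer=$(find "$scratch" -type f -newer "$scratch/reference" | grep -c .)

# -exec runs a command on the matches instead of printing them.
find "$scratch" -type f -name '*.txt' -exec chmod 600 {} +
locked=$(find "$scratch" -type f -name '*.txt' -perm 600 | grep -c .)

echo "dirs=$dirs mine=$mine newer=$newer locked=$locked"

rm -rf "$scratch"
```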

raindog308
