How to find all hard links to a file or folder in Linux
In many operating systems, such as Linux and Unix, file linking is a commonly used feature. Two types of links can be created: soft (or symbolic) links and hard links. A hard link is a directory entry that associates a name with an already existing file in the file system. It is one way to create an alias to the file content, so that the same content is reachable under multiple names.
Hard links can be used and accessed just like any other file. When you modify the contents of the file through one of its names, the changes are visible when you access it through any of the other names. This differs from how soft links create aliases: a soft link is an alias to a file name, while a hard link is an alias to the file content.
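The difference between the two kinds of alias can be tried out in a throwaway directory. The following is a minimal sketch, not a definitive recipe; the file names (original.txt, hard.txt, soft.txt) are hypothetical and it assumes the mktemp and ln utilities are available.

```shell
#!/usr/bin/env bash
# Sketch: compare how a hard link and a soft link behave.
# All file names are hypothetical and live in a temporary directory.
start_dir=$PWD
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "first version" > original.txt
ln original.txt hard.txt        # hard link: a second name for the same content
ln -s original.txt soft.txt     # soft link: an alias for the name original.txt

# A change made through one name is visible through the others.
echo "second version" > hard.txt
via_original=$(cat original.txt)
via_soft=$(cat soft.txt)
echo "original.txt: $via_original"
echo "soft.txt:     $via_soft"

# Removing the original name keeps the content reachable through the hard
# link, while the soft link now refers to a name that no longer exists.
rm original.txt
via_hard=$(cat hard.txt)
if [ -e soft.txt ]; then soft_state="resolves"; else soft_state="dangling"; fi
echo "hard.txt after rm: $via_hard"
echo "soft.txt after rm: $soft_state"

cd "$start_dir" || exit 1
rm -rf "$workdir"
```

The final two lines of output show why the distinction matters: the hard link still returns the content, while the soft link is left dangling.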
Find if the file has hard links
Soft links are very easy to identify using the ls command. In the long listing of a directory, soft links are clearly marked with a -> marker, which shows the link name followed by the path to the referenced file. Hard links are harder to identify unless you know what you are looking for. They can still be identified with the same ls command, using the long listing format enabled by the -l command line option.
bash$ ls -l
total 4
-rw-r--r-- 1 root root 71 May 10 13:37 dtwosource
-rw-r--r-- 2 root root  0 May 10 13:37 hl1
-rw-r--r-- 4 root root  0 May 10 13:37 hl2
-rw-r--r-- 4 root root  0 May 10 13:37 hl3
-rw-r--r-- 4 root root  0 May 10 13:37 hl4
lrwxrwxrwx 1 root root 10 May 10 13:37 linkthree -> dtwosource
In the long listing format, the second column is the number of hard links to the file. This is the number between the permissions column and the owner name in the output. In the above example, the file hl1 has 2 links, while hl2, hl3 and hl4 each have 4. A regular file with a count of 1, such as dtwosource, has no names other than the one listed.
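If you would rather not read the count out of the ls columns, the same number can be queried directly. The sketch below assumes GNU coreutils, where stat accepts -c with the %h (link count) and %n (name) format sequences, and uses the -links test of find to list regular files with more than one name. The file names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: read link counts without parsing ls output.
# Assumes GNU stat (-c FORMAT); file names are made up for the example.
start_dir=$PWD
workdir=$(mktemp -d)
cd "$workdir" || exit 1

touch hl2 single
ln hl2 hl3
ln hl2 hl4

# Print "<link count> <name>" for each file.
stat -c '%h %n' hl2 hl3 hl4 single

links_hl2=$(stat -c '%h' hl2)
links_single=$(stat -c '%h' single)

# List every regular file below the current directory that has more than one link.
multi=$(find . -type f -links +1 | sort)
multi_count=$(printf '%s\n' "$multi" | wc -l)
echo "files with more than one link:"
echo "$multi"

cd "$start_dir" || exit 1
rm -rf "$workdir"
```

Here hl2 is created with one name and gains two more, so its count is 3, while single stays at 1 and is not listed by find.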
The count works differently for directories. Most file systems do not allow users to create hard links to directories, which avoids loops in the directory tree and ambiguity about a directory's parent. However, every directory contains two implicit entries that act as hard links: . refers to the directory itself and .. refers to its parent. On file systems that follow this traditional convention, the link count of a directory is therefore 2 plus the number of its immediate sub-directories (level 1): one for its name in the parent directory, one for its own . entry, and one for the .. entry of each sub-directory. Some file systems do not maintain directory link counts this way.
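The directory count can be checked with a small experiment. This is only a sketch: it assumes GNU stat and a find that supports -mindepth and -maxdepth, the directory names are hypothetical, and the result depends on the file system. Traditional file systems report 2 plus the number of immediate sub-directories, while others report a different value, so the script prints which case it observed instead of assuming one.

```shell
#!/usr/bin/env bash
# Sketch: compare a directory's link count with 2 + number of sub-directories.
# Assumes GNU stat; the outcome depends on the underlying file system.
workdir=$(mktemp -d)
mkdir "$workdir/parent"
links_empty=$(stat -c '%h' "$workdir/parent")

mkdir "$workdir/parent/a" "$workdir/parent/b" "$workdir/parent/c"
touch "$workdir/parent/file.txt"     # regular files do not affect the count
links_full=$(stat -c '%h' "$workdir/parent")

# Count only the immediate sub-directories (level 1).
subdirs=$(find "$workdir/parent" -mindepth 1 -maxdepth 1 -type d | wc -l)
expected=$((subdirs + 2))

echo "link count when empty:        $links_empty"
echo "link count with $subdirs sub-dirs: $links_full"
if [ "$links_full" -eq "$expected" ]; then
    echo "matches the traditional 2 + sub-directories convention"
else
    echo "this file system counts directory links differently"
fi

rm -rf "$workdir"
```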
Most file systems implement hard links using reference counting. The file system keeps a count of the names associated with a data section. The count is increased when a link is created and decreased when one is deleted, and the data is released only after the last name is removed (and no process still has the file open). This reference count is what the ls command displays as the link count.
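The reference count can be watched as names are added and removed. The following sketch assumes GNU stat for reading the count; the names data, alias1 and alias2 are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: observe the link count rise and fall as names are created and deleted.
# Assumes GNU stat (-c '%h' prints the link count).
start_dir=$PWD
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "payload" > data
count_start=$(stat -c '%h' data)       # one name so far

ln data alias1
ln data alias2
count_linked=$(stat -c '%h' data)      # three names

rm alias1
count_after_rm=$(stat -c '%h' data)    # back to two names

# Removing the original name only decrements the count again;
# the content stays reachable through the remaining name.
rm data
count_last=$(stat -c '%h' alias2)
remaining=$(cat alias2)

echo "counts: $count_start -> $count_linked -> $count_after_rm -> $count_last"
echo "content via alias2: $remaining"

cd "$start_dir" || exit 1
rm -rf "$workdir"
```

Note that rm on the original name does not destroy the data here, since one reference is still left.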
Find all hard links to the file
Once you know that the file has hard links, you might want to find all of them. To do that, you first need the inode number of the file. This is the number the file system uses to identify the file and its data section.
As all hard links point to the same data section, they all share the same inode number. This is the field we will use to find and identify all the hard links to the file. You can again use the ls command: with the -i command line option, it shows the inode number of each file.
bash$ ls -li
total 4
136172648 -rw-r--r-- 1 root root 71 May 10 13:37 dtwosource
136172640 -rw-r--r-- 2 root root  0 May 10 13:37 hl1
136172639 -rw-r--r-- 4 root root  0 May 10 13:37 hl2
136172639 -rw-r--r-- 4 root root  0 May 10 13:37 hl3
136172639 -rw-r--r-- 4 root root  0 May 10 13:37 hl4
136172649 lrwxrwxrwx 1 root root 10 May 10 13:37 linkthree -> dtwosource
The first column in the output is the inode number associated with the file. The link count has now moved to the third column. In the above example the files hl2, hl3 and hl4 all show the same inode number, 136172639, which means they are all names for the same file. Since the link count is 4 but only three of the names appear here, the fourth link must be in another directory.
Once you have the inode number, you can search the file system for files with that number. The best tool is the find command, which offers a couple of suitable options. The first is the -inum option, which takes the inode number.
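The comparison of inode numbers can also be scripted. The sketch below assumes GNU stat, whose %i format sequence prints the inode number; the file names mirror the example above but are created fresh in a temporary directory, so the actual numbers will differ.

```shell
#!/usr/bin/env bash
# Sketch: decide whether two names refer to the same file by comparing inodes.
# Assumes GNU stat; both names must be on the same file system for the
# comparison to be meaningful.
start_dir=$PWD
workdir=$(mktemp -d)
cd "$workdir" || exit 1

touch hl1 hl2
ln hl2 hl3
ln hl2 hl4

inode_hl1=$(stat -c '%i' hl1)
inode_hl2=$(stat -c '%i' hl2)
inode_hl3=$(stat -c '%i' hl3)

if [ "$inode_hl2" = "$inode_hl3" ]; then
    echo "hl2 and hl3 are names for the same file (inode $inode_hl2)"
fi
if [ "$inode_hl1" != "$inode_hl2" ]; then
    echo "hl1 is a different file (inode $inode_hl1)"
fi

# Sorting by inode number places names of the same file next to each other.
stat -c '%i %h %n' hl1 hl2 hl3 hl4 | sort -n

cd "$start_dir" || exit 1
rm -rf "$workdir"
```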
bash$ find /path/to/search -inum 136172640
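The narrower the directory you search recursively, the faster the command completes. In the worst case, you will have to search the entire file system by using / as the starting path. Inode numbers are unique only within a single file system and hard links cannot span file systems, so when the search path contains other mounted file systems, adding the -xdev option keeps find on the file system of the starting path and avoids unrelated matches.
The find command also has another option, -samefile, which finds links based on a file name. That means you do not have to look up the inode number of the file as above. So, if you want to find all the hard links to a file named hl2, you can run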
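The whole lookup can be put together as follows. This is a sketch under a few assumptions: GNU stat is used to read the inode number, the directory layout is hypothetical, and -xdev is added so that find does not descend into other mounted file systems, where the same inode number could belong to an unrelated file.

```shell
#!/usr/bin/env bash
# Sketch: look up a file's inode number, then search a tree for every name
# that shares it. Directory and file names are made up for the example.
workdir=$(mktemp -d)
mkdir -p "$workdir/a/b" "$workdir/c"

touch "$workdir/a/hl2"
ln "$workdir/a/hl2" "$workdir/a/b/hl3"
ln "$workdir/a/hl2" "$workdir/c/hl4"
touch "$workdir/c/unrelated"          # a separate file with its own inode

inode=$(stat -c '%i' "$workdir/a/hl2")

# -xdev: stay on the file system of the starting path.
# -inum: match entries whose inode number equals the given value.
matches=$(find "$workdir" -xdev -inum "$inode" | sort)
match_count=$(printf '%s\n' "$matches" | wc -l)

echo "inode $inode has $match_count names:"
echo "$matches"

rm -rf "$workdir"
```

The three linked names are reported regardless of which sub-directory they live in, and the unrelated file is left out.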
bash$ find /path/to/search -samefile path/to/hl2
Again, you can use / (instead of /path/to/search in the above example) to search the entire file system. You can specify either the relative or absolute path to the file.
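One practical way to tell whether the search path was wide enough is to compare the number of names found with the link count of the file. The sketch below assumes GNU stat and a find that supports -samefile (GNU findutils does); the directory names docs and backup are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: use -samefile and compare the number of matches with the link count
# to see whether every hard link has been found.
workdir=$(mktemp -d)
mkdir -p "$workdir/docs" "$workdir/backup"

echo "notes" > "$workdir/docs/hl2"
ln "$workdir/docs/hl2" "$workdir/backup/hl3"
ln "$workdir/docs/hl2" "$workdir/hl4"

target="$workdir/docs/hl2"
expected=$(stat -c '%h' "$target")    # how many names exist in total

# A narrow search finds only the names below docs/ ...
found_narrow=$(find "$workdir/docs" -xdev -samefile "$target" | wc -l)
# ... while a wider search finds all of them.
found_wide=$(find "$workdir" -xdev -samefile "$target" | wc -l)

echo "link count: $expected"
echo "found below docs/: $found_narrow"
echo "found below the whole tree: $found_wide"
if [ "$found_wide" -eq "$expected" ]; then
    echo "all hard links located"
else
    echo "some links are outside the searched tree"
fi

rm -rf "$workdir"
```

When the two numbers differ, the remaining names are somewhere else on the same file system, and the search path needs to be widened.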