How to find all files containing specific text in a Linux filesystem?
Most search utilities work by crawling and parsing file content and building an index that can be searched later. Many operating systems and desktop environments follow this approach to make searching faster.
We will see how to search through file content from the command line, without the use of such indexes or engines. Because every file has to be read at search time, these searches take longer the more files and content you are searching.
The only command you need to know is the powerful grep and its variations such as egrep and fgrep. This command searches the content of the files you specify for a pattern and then outputs each line that contains the pattern, along with the filename, line number or any other information that you ask for.
The generic syntax of the grep command is
grep <options> pattern input_files
So, if you want to search for some text, for example vodka martini, in a set of files under the folder recipes, you could do something like this
$ grep -irnH "vodka martini" recipes/
The most useful command line options of grep are
-i or --ignore-case: This switch makes the search case insensitive.
-r or --recursive: This option allows you to search all files and folders recursively.
-n or --line-number: This will print out the line number of the matched line within the file.
-H or --with-filename: This will print out the name of the file that contains the text.
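To see the four options working together, here is a small sketch you can paste into a terminal. The temporary recipes tree and its contents are made up for illustration; only the grep flags come from the list above.

```shell
# Build a throwaway recipes/ tree (hypothetical sample data).
demo=$(mktemp -d)
mkdir -p "$demo/recipes/cocktails"
printf 'Gin and tonic\nVodka Martini, shaken\n' > "$demo/recipes/cocktails/classics.txt"
printf 'Pancakes\n' > "$demo/recipes/breakfast.txt"

# -i ignores case, -r descends into subfolders,
# -n prints the line number, -H prints the file name.
matches=$(cd "$demo" && grep -irnH "vodka martini" recipes/)
echo "$matches"

# Clean up the sample data.
rm -rf "$demo"
```

The single match is reported as file name, line number and matching line, separated by colons, even though the case of the text in the file differs from the pattern.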
As I mentioned earlier, the more content grep needs to search, the slower the search becomes. You can use some filters to make sure that only relevant files are searched, which will make the operation faster.
For example, you can search only the files that have an extension of .txt instead of all the files in the folder.
$ grep -inH "vodka martini" recipes/*.txt
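One thing to keep in mind is that the shell expands recipes/*.txt before grep runs, so only .txt files directly inside recipes are searched. If you also want the subfolders, GNU grep has an --include option that filters by file name during a recursive search. The sketch below compares the two on a made-up directory tree; the file names are assumptions for illustration.

```shell
# Hypothetical sample tree: two .txt files at different depths and one .md file.
demo=$(mktemp -d)
mkdir -p "$demo/recipes/drinks"
printf 'vodka martini\n' > "$demo/recipes/top.txt"
printf 'vodka martini\n' > "$demo/recipes/drinks/a.txt"
printf 'vodka martini\n' > "$demo/recipes/drinks/notes.md"

# The shell glob only sees .txt files directly inside recipes/.
glob_hits=$(cd "$demo" && grep -inH "vodka martini" recipes/*.txt)

# --include keeps the recursion but skips files whose names do not match.
include_hits=$(cd "$demo" && grep -irnH --include='*.txt' "vodka martini" recipes/ | sort)

echo "glob only:"
echo "$glob_hits"
echo "recursive with --include:"
echo "$include_hits"

rm -rf "$demo"
```

The glob version reports only top.txt, while the --include version also finds drinks/a.txt and still skips notes.md.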
In the earlier example, we searched only the files in the recipes folder that have a .txt extension. Sometimes you want to search for more than one phrase or word. For example, let's say you want to search for vodka or martini. You can use regular expressions with the grep command
$ grep -inrH -e 'vodka\|martini' recipes/
For example, if you want to search through log files and print out all lines that contain error, warning or severe, you could do something like
$ grep -inrH -e 'error\|warning\|severe' /var/log/
You could use egrep instead of grep in such instances as well. It works pretty much the same way, but you don't have to escape the OR (|) operator. egrep is essentially a shortcut for grep -E, where -E or --extended-regexp interprets the pattern as an extended regular expression. Recent GNU grep releases print a warning that egrep is obsolescent, so grep -E is the safer spelling in scripts.
$ egrep -inHr 'error|warning|severe' /var/log/
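If you want to convince yourself that the escaped and unescaped forms are equivalent, the following sketch runs both against a made-up log file and compares the results. The file contents are assumptions for illustration, and the \| form relies on GNU grep's handling of basic regular expressions.

```shell
# Hypothetical log file with three interesting lines and one boring one.
demo=$(mktemp -d)
printf 'all fine\nWARNING: low disk\nsevere failure\nError: timeout\n' > "$demo/app.log"

# Basic regular expression: the alternation operator must be escaped.
basic=$(grep -in -e 'error\|warning\|severe' "$demo/app.log")

# Extended regular expression (-E): a plain | is enough.
extended=$(grep -inE 'error|warning|severe' "$demo/app.log")

[ "$basic" = "$extended" ] && echo "both forms match the same lines"
echo "$extended"

rm -rf "$demo"
```

Both commands report lines 2, 3 and 4, and the -i option takes care of the mixed case in the log.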
If you don't know the location of the file, then you are stuck with searching all the files in the filesystem. This is quite resource intensive, but you can use the find command to narrow the files down to something much more manageable.
$ find / -type f -iname "*.log" -exec egrep -inH 'error|warning' "{}" \;
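If all you need is the list of files that contain the text, two small variations may help: grep's -l option prints only the names of matching files, and ending -exec with + instead of \; lets find hand many files to a single grep process. The sketch below tries this on a temporary directory rather than on /, with made-up file names and contents.

```shell
# Hypothetical sample tree: two log files and one unrelated text file.
demo=$(mktemp -d)
mkdir -p "$demo/app/logs"
printf 'INFO started\nERROR disk full\n' > "$demo/app/logs/server.log"
printf 'INFO all good\n' > "$demo/app/logs/clean.log"
printf 'error in a non-log file\n' > "$demo/app/readme.txt"

# -iname limits the search to *.log files; "{} +" batches them into one grep call.
# -l prints only the names of files that contain a match.
files=$(find "$demo" -type f -iname "*.log" -exec grep -Eil 'error|warning' {} +)
echo "$files"

rm -rf "$demo"
```

Only server.log is listed: clean.log has no match, and readme.txt is never passed to grep because of the -iname filter.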
grep can interpret the pattern as a basic regular expression (-G or --basic-regexp, the default), an extended regular expression (-E or --extended-regexp), a Perl-compatible regular expression (-P or --perl-regexp) or just a fixed string (-F or --fixed-strings).
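The difference matters most when the text you are looking for contains characters that are special in regular expressions. Here is a minimal sketch, using a made-up log file, that counts matching lines with the -c option, first with the default basic regular expression and then with -F.

```shell
# Hypothetical log file; only the first line contains the literal text [error].
demo=$(mktemp -d)
printf 'status: [error] disk full\nretry scheduled\n' > "$demo/app.log"

# Default (basic regex): [error] is a bracket expression matching any one
# of the characters e, r or o, so both lines match.
regex_count=$(grep -c '[error]' "$demo/app.log")

# -F treats the pattern as a literal string, so only the first line matches.
fixed_count=$(grep -cF '[error]' "$demo/app.log")

echo "basic regex: $regex_count, fixed string: $fixed_count"

rm -rf "$demo"
```

This prints basic regex: 2, fixed string: 1, which is why -F is a good choice when you are searching for exact text rather than a pattern.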