How to copy the directory structure without the files in Linux

Sometimes you want to replicate just the structure of a directory in another location, without any of the files that reside in the directory or its sub-directories. It is quite easy to copy an entire directory, files included, with the cp command. But cp does not provide an easy way to exclude files while copying, nor an option to copy just the directory structure.

There are still a couple of command line options available to you, if you do not mind piping some commands together. Most of the commands to copy a directory structure, and their variations, use either the find command or rsync in some fashion.

In the examples below, assume that the source directory has an absolute path /path/to/source and that the destination folder is at /path/to/dest.

Using find and mkdir

Many of the options available involve the find command in some way. That is because the find command is quite versatile and is an easy way to list just the files, just the folders, or both.

This option uses the mkdir command with the find command. This method also requires that you be inside the source folder while executing the command.

bash$ cd /path/to/source && find . -type d -exec mkdir -p /path/to/dest/{} \;
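
Before running this against real data, it can help to try it in a sandbox. The following is a minimal sketch, assuming a throwaway tree created with mktemp; the directory and file names (a, b, c, top.txt, deep.txt) are made up for illustration.

```shell
# Build a throwaway source tree under a temporary directory.
work=$(mktemp -d)
mkdir -p "$work/source/a/b" "$work/source/c" "$work/dest"
touch "$work/source/top.txt" "$work/source/a/b/deep.txt"

# Same find + mkdir technique as above, run from inside the source folder.
(cd "$work/source" && find . -type d -exec mkdir -p "$work/dest/{}" \;)

# List what ended up in the destination: directories only, no files.
copied=$(cd "$work/dest" && find . | LC_ALL=C sort)
echo "$copied"

# Clean up the sandbox.
rm -rf "$work"
```

The listing should contain only ., ./a, ./a/b and ./c, with neither top.txt nor deep.txt present.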

Using find and cpio

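The cpio command in Linux is used to copy files to and from archives. We can re-purpose it to copy just the folders to another folder. This method has the additional advantage that you need not be in the source folder while executing the command, unlike the earlier method. Be aware, though, that cpio recreates each directory under the destination using the path exactly as find prints it, so with an absolute source path the tree ends up nested at /path/to/dest/path/to/source. If you want the sub-directories directly under the destination, run find from inside the source folder with a relative path (find .).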

cpio may not be installed by default on certain Linux distros. If that is the case, you will need to install it before using this command.

bash$ find /path/to/source -type d | cpio -pd /path/to/dest/
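
As a sketch of the relative-path variant, the snippet below runs find from inside a throwaway source tree so that the directories land directly under the destination. The sandbox layout is hypothetical, and the snippet simply reports when cpio is not installed.

```shell
# Build a throwaway source tree under a temporary directory.
work=$(mktemp -d)
mkdir -p "$work/source/a/b" "$work/source/c" "$work/dest"
touch "$work/source/a/b/deep.txt"

if command -v cpio >/dev/null 2>&1; then
    # find prints relative names (./a, ./a/b, ...), which cpio recreates
    # under the destination in pass-through mode (-p), creating leading
    # directories as needed (-d).
    (cd "$work/source" && find . -type d | cpio -pd "$work/dest" 2>/dev/null)
    cpio_tree=$(cd "$work/dest" && find . | LC_ALL=C sort)
else
    cpio_tree="cpio not installed"
fi
echo "$cpio_tree"

# Clean up the sandbox.
rm -rf "$work"
```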

Using rsync

rsync is yet another command that can be used to copy a directory structure. It is especially useful if you would like to keep two directory structures in sync over a period of time, or if you need to copy the directory structure in its entirety multiple times.

When run multiple times, rsync also lets you replicate by updating only what has changed, and it can handle deletions in the source without your having to remove the entire destination directory. You may use either of the following command lines to exclude files.

bash$ rsync -a --include '*/' --exclude '*' /path/to/source /path/to/dest

bash$ rsync -a -f"+ */" -f"- *" /path/to/source /path/to/dest

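We use the -a (or --archive) option, which provides several useful functions, such as recursion, symlink support, and preserving the times, permissions and owner of the directories. Note that without a trailing slash on /path/to/source, rsync creates a source directory inside /path/to/dest; append a slash (/path/to/source/) to copy its contents directly into the destination. After the first replication, additional changes can be propagated without having to copy over the entire structure again.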

bash$ rsync -a -u --del --force --include '*/' --exclude '*' /path/to/source /path/to/dest

bash$ rsync -a -u --del --force -f"+ */" -f"- *" /path/to/source /path/to/dest

The options used above are:

-u (--update) : skip anything that is already newer in the destination, so that only changed or modified directories or files are updated
--del : delete during the copy (an alias for --delete-during). The alternatives include --delete-before and --delete-after, which delete files before or after the file transfers
--force : delete a directory in the destination even if it is not empty, which rsync needs when a non-empty directory is to be replaced by a non-directory
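
To see the incremental behaviour in action, here is a minimal sketch that syncs a throwaway tree twice. The directory names (keep, old, new) are invented for the example, and the trailing slash on the source is an assumption of this sketch so that the contents land directly in the destination.

```shell
# Build a throwaway source tree under a temporary directory.
work=$(mktemp -d)
mkdir -p "$work/source/keep" "$work/source/old" "$work/dest"
touch "$work/source/keep/file.txt"

if command -v rsync >/dev/null 2>&1; then
    # First pass: replicate directories only.
    rsync -a --include '*/' --exclude '*' "$work/source/" "$work/dest/"

    # Change the source: drop one directory, add another.
    rmdir "$work/source/old"
    mkdir "$work/source/new"

    # Second pass: propagate the change, deleting what vanished from the source.
    rsync -a -u --del --force --include '*/' --exclude '*' "$work/source/" "$work/dest/"
    synced=$(cd "$work/dest" && find . | LC_ALL=C sort)
else
    synced="rsync not installed"
fi
echo "$synced"

# Clean up the sandbox.
rm -rf "$work"
```

After the second pass the destination should hold keep and new, but no longer old, and still no files.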

Excluding some sub-directories

All of the above options copy over the entire directory structure or update it. But sometimes you might want to exclude some sub-directories based on name. For example, you might want to exclude all the directories named logs that are scattered over various sub-directories.

One option is to copy over all the directories and then delete the ones that are not needed. Even better is the option to not copy over the directories that are not needed.

bash$ find /path/to/source -type d -not -name logs | cpio -pd /path/to/dest/

This will not copy over directories whose path ends with the folder name logs. If a logs folder happens to have sub-folders, however, those will still be copied, and creating them recreates the parent logs folder as well. In order to avoid that, and to ignore all the sub-directories under logs/ too, use the following. The pattern is quoted so that the shell passes it to find unexpanded.

bash$ find /path/to/source -type d -not -name logs -not -path '*/logs/*' | cpio -pd /path/to/dest
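
An alternative sketch, not the command above but a related technique, uses find's -prune action so that find never descends into a logs directory in the first place. The sandbox layout (app, conf, archive) is hypothetical, and mkdir stands in for cpio so that the example needs nothing beyond find.

```shell
# Build a throwaway source tree with logs directories at two levels.
work=$(mktemp -d)
mkdir -p "$work/source/app/logs/archive" "$work/source/app/conf" \
         "$work/source/logs" "$work/dest"

# -prune stops find from descending into any directory named logs,
# so neither logs nor anything beneath it is printed.
pruned=$(cd "$work/source" && find . -type d -name logs -prune -o -type d -print | LC_ALL=C sort)
echo "$pruned"

# Same expression, but creating each surviving directory in the destination.
(cd "$work/source" && find . -type d -name logs -prune -o -type d -exec mkdir -p "$work/dest/{}" \;)
pruned_dest=$(cd "$work/dest" && find . | LC_ALL=C sort)

# Clean up the sandbox.
rm -rf "$work"
```

Pruning also saves find the work of walking large logs trees that would only be filtered out afterwards.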

You can do the same thing with the rsync command as well. The order in which the -f options are given does matter, in that the first matching pattern is honored.

bash$ rsync -a -f"- logs/" -f"+ */" -f"- *" /path/to/source /path/to/dest
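
The effect of rule order can be checked with a small experiment. This sketch, which assumes a throwaway tree and two made-up destination folders named right and wrong, runs the same filters in both orders.

```shell
# Build a throwaway source tree containing a nested logs directory.
work=$(mktemp -d)
mkdir -p "$work/source/app/logs/archive" "$work/source/app/conf" \
         "$work/right" "$work/wrong"

if command -v rsync >/dev/null 2>&1; then
    # Exclusion listed first: logs/ is rejected before "+ */" can accept it.
    rsync -a -f"- logs/" -f"+ */" -f"- *" "$work/source/" "$work/right/"

    # Exclusion listed second: "+ */" matches logs/ first, so it is copied.
    rsync -a -f"+ */" -f"- logs/" -f"- *" "$work/source/" "$work/wrong/"

    right=$(cd "$work/right" && find . | LC_ALL=C sort)
    wrong=$(cd "$work/wrong" && find . | LC_ALL=C sort)
else
    right="rsync not installed"
    wrong=$right
fi
printf 'exclude first:\n%s\n\ninclude first:\n%s\n' "$right" "$wrong"

# Clean up the sandbox.
rm -rf "$work"
```

Only the first run should leave logs and its archive sub-folder out of the destination.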

Excluding some of the files and not all

All of the above methods exclude all the files inside the source folder. Instead of excluding all the files, you can exclude files selectively based on criteria such as file names, extensions, owner, modification dates etc.

We will see how you can exclude all the log files (i.e. files with a .log extension) while still copying over all the other files.

bash$ rsync -a -f"+ */" -f"- *.log" -f"+ *" /path/to/source /path/to/dest
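
A quick sandbox run shows which files survive these filters. The file names below (run.log, settings.conf, readme.txt) are invented for the sketch.

```shell
# Build a throwaway source tree with a mix of log and non-log files.
work=$(mktemp -d)
mkdir -p "$work/source/app" "$work/dest"
touch "$work/source/app/run.log" "$work/source/app/settings.conf" \
      "$work/source/readme.txt"

if command -v rsync >/dev/null 2>&1; then
    # Keep every directory, drop *.log, keep everything else.
    rsync -a -f"+ */" -f"- *.log" -f"+ *" "$work/source/" "$work/dest/"
    kept=$(cd "$work/dest" && find . -type f | LC_ALL=C sort)
else
    kept="rsync not installed"
fi
echo "$kept"

# Clean up the sandbox.
rm -rf "$work"
```

Since rsync includes anything that matches no rule, a plain --exclude '*.log' would likely give the same result here; the explicit include rules just make the intent visible.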

Both the find and rsync commands are flexible enough to be used to copy a directory structure, but rsync is the better option for two main reasons: 1) it creates the directories itself and does not need to be piped to another command such as mkdir or cpio, and 2) its include and exclude options are much more flexible.

All of the above commands can be made into custom scripts so that they can be used and re-used. This is especially handy if you use them frequently.
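
As one possible starting point, the find and mkdir method could be wrapped in a small shell function. The name copy_tree and its argument order are hypothetical, chosen only for this sketch.

```shell
# copy_tree SOURCE_DIR DEST_DIR
# Recreate the directories found under SOURCE_DIR inside DEST_DIR,
# without copying any files. Hypothetical helper built on find and mkdir.
copy_tree() {
    if [ "$#" -ne 2 ] || [ ! -d "$1" ]; then
        echo "usage: copy_tree SOURCE_DIR DEST_DIR" >&2
        return 1
    fi

    # Make sure the destination exists, then resolve it to an absolute
    # path, because find is run from inside the source directory.
    mkdir -p "$2" || return 1
    dest_abs=$(cd "$2" && pwd) || return 1

    (cd "$1" && find . -type d -exec mkdir -p "$dest_abs/{}" \;)
}
```

It could then be called as copy_tree /path/to/source /path/to/dest, for instance after placing the function in your shell startup file.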