Before we merge two or more directories, we need to define what “merge” means in the context of directories, since several different requirements could be described as merging. Let’s assume that merging (two or more) directories means “creating a new directory with files from all of the merged directories”. There can be other requirements associated with this, but we will tackle those as we go along.
Creating a new directory with files from multiple directories is not, in itself, very complicated. The difficulty arises when the directories contain files with the same or conflicting names. In such cases, we need to decide what the name and content of the resulting file should be.
The merge process is essentially a combination of moving, renaming and deleting files. That means there are several different ways to achieve the required result by using different Linux commands such as find, mv, cp, rm, rsync etc.
Let’s assume that we want to merge two different folders named first/ and second/ into a new folder named third/. You should be able to extrapolate to other cases, such as merging first/ and second/ into the existing folder second/.
Merging two directories with unique file names
Let’s say we want to merge two different directories (first/ and second/), each of which contains a set of files and some sub-directories. All of those files have unique names that do not clash with each other. This is probably the easiest case that you will encounter.
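We can use the cp (copy) command here, as there are no file name conflicts. The contents of each directory, including its sub-directories, are copied into the new folder. Note the trailing /. on each source: with GNU cp, copying first/ into an already existing third/ would create third/first/ instead of merging the contents.
bash$ mkdir -p third/ && cp -fr first/. third/ && cp -fr second/. third/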
You can remove the directories using the rm command after you have copied the files to the third/ folder.
bash$ rm -fr first/ second/
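The copy step above can be tried out safely in a scratch directory. The following is a minimal sketch, not a definitive recipe: the file names (a.txt, docs/b.txt and so on) and the temporary working directory are made up for illustration.

```shell
#!/bin/sh
# Sketch: merge two directories with unique file names using cp.
# All file names below are hypothetical sample data.
set -eu

work=$(mktemp -d)        # scratch area so nothing real is touched
cd "$work"

# Build two small source trees with no overlapping file names.
mkdir -p first/docs second/img
echo "a" > first/a.txt
echo "b" > first/docs/b.txt
echo "c" > second/c.txt
echo "d" > second/img/d.txt

# Copy the contents (note the trailing /.) of each source into third/.
mkdir -p third
cp -R first/. third/
cp -R second/. third/

# Show the merged result.
find third -type f | sort
```

Only once the listing looks right would you go on to remove first/ and second/.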
Although cp will work in this situation, I would recommend the rsync utility for large-scale directory syncing. When you don’t know for sure, always assume that there are file collisions. Let’s see the options that rsync provides to merge directories.
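If you would rather check than assume, one possible approach is to compare the relative file paths of the two directories before merging. This is only a sketch: the helper name list_files and the sample files are assumptions made for the example, and it compares names only, not contents.

```shell
#!/bin/sh
# Sketch: list relative paths that exist in both first/ and second/.
# list_files and the sample files are hypothetical, for illustration only.
set -eu
LC_ALL=C; export LC_ALL  # consistent ordering for sort and comm

work=$(mktemp -d)
cd "$work"

mkdir -p first/docs second/docs
echo "one" > first/notes.txt
echo "two" > second/notes.txt              # same name as in first/
echo "x"   > first/docs/only-first.txt
echo "y"   > second/docs/only-second.txt

# Print every regular file under a directory, relative to that directory.
list_files() {
    (cd "$1" && find . -type f | sort)
}

list_files first  > first.list
list_files second > second.list

# comm -12 keeps only the lines common to both sorted lists.
collisions=$(comm -12 first.list second.list)
echo "$collisions"
```

An empty result means the simple copy approach is safe. Anything listed needs a decision about which version should win.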
Merging two directories with file name collisions
We will now assume the previous use case, but with some file names that are the same in both directories, which causes collisions when they are merged. The simplest thing to do when there is a file name collision is to delete the previous file and overwrite it with the new file.
bash$ rsync -avz first/ second/ third/
-a or --archive: a shortcut for several options (-rlptgoD) that recurses through folders and preserves file metadata such as permissions, timestamps and ownership.
-v or --verbose: prints out what is being copied.
-z or --compress: compresses file data during the transfer. This is usually not necessary if the directories are on the same machine.
You can choose to overwrite a file only if the incoming file is newer than the existing version. You can also choose to delete the source files once they have been successfully copied. The options to do this are -u (or --update) and --remove-source-files.
bash$ rsync -auvz first/ second/ third/ --delete-after --remove-source-files
-u or --update: skips overwriting a file if the existing file in the destination has a newer modification time than the incoming file.
--delete-after: deletes files from the destination that do not exist in the source, after the transfer has completed.
--remove-source-files: removes each source file after it has been transferred. Only files are removed; the source directories themselves are left behind.
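Because --remove-source-files removes files but not directories, the source trees can be left behind as empty folders. One way to tidy up, sketched below, is to delete only the directories that are empty, so that any file that was not transferred survives. The script simulates the leftover state by hand rather than running rsync, and the directory and file names are made up for the example.

```shell
#!/bin/sh
# Sketch: remove empty directories left in the source trees.
# The tree below is hypothetical; it imitates what might remain after a
# transfer that removed the source files.
set -eu

work=$(mktemp -d)
cd "$work"

# first/ has only empty directories left; second/ still holds one file
# (for example, one that was not transferred).
mkdir -p first/docs/old second/img
echo "not transferred" > second/img/kept.txt

# -depth visits children before parents, so nested empty directories
# are removed bottom-up; non-empty directories are left untouched.
find first second -depth -type d -empty -delete

ls
```

This is more cautious than rm -fr, which would also delete any file that remained in the source.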
You can also choose not to overwrite existing files at all. This results in the first file winning, with all other files of the same name being lost. Use the --ignore-existing command line option to do that.