Sometimes you need to create a large file in Linux, or a file with a specific size and content, especially if you are a developer. Such files are useful for testing, for example to measure the upload speed of your internet connection. You might want large text files with readable characters, or binary files might be good enough. Either way, you want a quick and easy way to create files of varying sizes.
One option is to search your file system for existing files that are large enough for your needs. You can search from the command line for files based on their size. Most of the time, though, it is easier and faster to create files of the desired size.
There are several commands you can use to create large files. We will look at some of them, namely dd, fallocate and mkfile, and see the options available with each.
dd is probably the most popular command for creating files of a desired size. It is a utility for copying and converting files, but it has enough command line options to generate files of any size with null or random content. Its primary uses are replicating disk images and backing up hard disks.
The advantages of using dd are that it creates files of an exact size and that it is generally very fast, limited only by the write speed of your disk. The generic command format is
dd if=inputfile of=outputfile count=blockcount bs=blocksize seek=skipblocks
The options in the above command provide the following functions
if: the input file to be used. If this is not provided, the command reads from stdin. Often a device file such as /dev/urandom or /dev/zero is used to generate files.
of: the output file name. If not specified, the output is written to stdout.
count: an integer that specifies how many input blocks to copy
bs: an integer that specifies how many bytes to read and write at a time, i.e. the block size
seek: the number of blocks to skip at the start of the output before writing. The default block size is 512 bytes, but you can change the output block size with the obs argument.
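Note that seek skips blocks in the output without writing them. Reaching a given file size with a larger seek value and a smaller count will typically decrease the execution time of the command, but it will also result in a sparse file on filesystems that support them.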
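The effect of seek can be seen by comparing a file's apparent size with the space it actually uses on disk. The following is a minimal sketch; the file name sparse.bin and the 100 MiB size are arbitrary choices for illustration, and the disk usage reported depends on the filesystem.

```shell
# Copy zero blocks (count=0) but seek 100 blocks of 1 MiB into the output,
# so dd only sets the file length and writes no data.
dd if=/dev/zero of=sparse.bin bs=1048576 count=0 seek=100 2>/dev/null

# Apparent size in bytes (what ls -l reports).
apparent=$(wc -c < sparse.bin)

# Space actually used on disk, in KiB.
used_kib=$(du -k sparse.bin | cut -f1)

echo "apparent size: ${apparent} bytes"
echo "disk usage:    ${used_kib} KiB"
```

On a filesystem that supports sparse files, the disk usage should be far smaller than the apparent size.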
The total size of the generated file depends on the values provided to the count and bs arguments. The size in bytes is the product of count and bs, i.e. size = count * bs. If seek is used, the skipped blocks add a further seek * bs bytes.
If the content of the file is not important, you can fill the file with null bytes. Use the /dev/zero device file as the input, which returns an endless stream of null bytes when read.
bash$ dd if=/dev/zero of=output.bin count=1024 bs=1024
The above command will generate an output file named output.bin of size 1 MiB (size = 1024 x 1024 = 1048576 bytes).
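To double-check the arithmetic, you can compute the expected size in the shell and compare it with what dd actually wrote. This is a small sketch; the variable names are only illustrative.

```shell
# Block size and block count; the expected size is their product.
bs=1024
count=1024
expected=$((bs * count))

# Create the file, discarding dd's transfer statistics.
dd if=/dev/zero of=output.bin bs="$bs" count="$count" 2>/dev/null

# wc -c reports the size of the file in bytes.
actual=$(wc -c < output.bin)

echo "expected: ${expected} bytes, actual: ${actual} bytes"
```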
Sometimes, you need some “useful” content in the file, not just a series of null bytes. In that case, you can use /dev/urandom as your input. The content is still not human readable, and because the bytes are random they can include null bytes, but they will also include end of line (EOL) characters scattered throughout. That means the file has lines in it and can be used as input to programs that expect some kind of line-based data.
bash$ dd if=/dev/urandom of=output.bin count=4096 bs=1024
The above command will generate an output.bin file that is 4 MiB, or about 4.2MB, in size (size = 4096 x 1024 = 4194304 bytes).
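As a quick sanity check, you can count the bytes and the newline characters in the random file. This is a sketch only; the exact line count varies from run to run because the data is random.

```shell
# Generate 4 MiB of random bytes.
dd if=/dev/urandom of=output.bin count=4096 bs=1024 2>/dev/null

# Total size in bytes.
size=$(wc -c < output.bin)

# Number of newline bytes, i.e. how many "lines" the file contains.
lines=$(wc -l < output.bin)

echo "size: ${size} bytes, newline characters: ${lines}"
```

With uniformly random bytes, roughly one byte in 256 is a newline, so a file of this size should contain several thousand lines.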
Another command that can be used to generate large files in Linux is fallocate. This command works by pre-allocating space for a file without writing any data to it. That means it requires little to no disk I/O, which makes it much faster than the other commands.
This also means that you have no control over the contents of the file: the space is reserved but never written, and it reads back as zeros.
bash$ fallocate -l filesize filename
filesize: the desired size of the file
filename: the name of the output file
bash$ fallocate -l 1G output.bin
This will create a file called output.bin that is 1GB in size (fallocate treats the G suffix as GiB, so 1073741824 bytes).
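The sketch below preallocates a smaller file and inspects the result. Not every filesystem supports fallocate, so as an assumption for portability it falls back to truncate from GNU coreutils, which creates a sparse file of the same apparent size instead of reserving space. The file name prealloc.bin is arbitrary.

```shell
# Preallocate 10 MiB; fall back to truncate if fallocate is unsupported.
if ! fallocate -l 10M prealloc.bin 2>/dev/null; then
    truncate -s 10M prealloc.bin
fi

# Apparent size in bytes and space used on disk in KiB.
size=$(wc -c < prealloc.bin)
used_kib=$(du -k prealloc.bin | cut -f1)

# Strip all null bytes and count what is left; zero means the file
# reads back as nothing but null bytes.
nonzero=$(tr -d '\0' < prealloc.bin | wc -c)

echo "apparent size: ${size} bytes, disk usage: ${used_kib} KiB"
echo "non-null bytes: ${nonzero}"
```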
The mkfile command is another option you have. It may not be available on your distro by default, as it comes from Solaris and also ships with macOS. This command serves the same purpose as the fallocate command in the previous section. The file is typically filled with zero (null) bytes. In order to create a file of 1G, use the following command…
bash$ mkfile 1G output.bin
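Because mkfile is often missing on Linux, a script can check for it before use. The sketch below is one way to do that, using dd as a fallback; the file name zeros.bin and the 10 MiB size are arbitrary choices for illustration.

```shell
# Use mkfile when it is installed, otherwise fall back to dd.
if command -v mkfile >/dev/null 2>&1; then
    # The m suffix means mebibytes.
    mkfile 10m zeros.bin
else
    dd if=/dev/zero of=zeros.bin bs=1048576 count=10 2>/dev/null
fi

# Both branches should give a file of 10 x 1048576 bytes.
size=$(wc -c < zeros.bin)
echo "size: ${size} bytes"
```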
Most times the file size is the important factor when the need arises to create a large file. But sometimes, you also want to have control over the content of the file. This is especially true if the file is going to be used as an input to test software utilities.
You can use the tr command found in Linux, along with /dev/urandom, to achieve this. This is much slower than any of the commands described above, especially for very large files, but it gets you the content you need. In order to generate a 2KB file (2048 bytes) with only English letters and digits, you can use the following command
bash$ tr -dc "A-Za-z0-9" < /dev/urandom | head -c 2048 > output.txt
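To confirm that the filter works as intended, you can check the size of the result and count any bytes that are not letters or digits. In this sketch, LC_ALL=C is an addition not present in the command above; it makes tr treat its input as raw bytes regardless of the system locale.

```shell
# Keep only letters and digits from the random stream, stop at 2048 bytes.
LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 2048 > output.txt

# Size of the generated file in bytes.
size=$(wc -c < output.txt)

# Delete every letter and digit and count what remains; should be zero.
stray=$(LC_ALL=C tr -d 'A-Za-z0-9' < output.txt | wc -c)

echo "size: ${size} bytes, unexpected bytes: ${stray}"
```

To get a larger file, raise the byte count passed to head -c.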
If you want to include end of line characters, so that the file is split into lines, then you can modify the above as below…
bash$ tr -dc "A-Za-z0-9\n\r" < /dev/urandom | head -c 2048 > output.txt
To generate a file of any size, but with a specific number of lines you can further modify this to use the -n option of the head command…
bash$ tr -dc "A-Za-z0-9\n\r" < /dev/urandom | head -n 100 > output.txt
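With the commands above, line lengths are random because newlines appear wherever the random stream happens to produce them. If you need lines of a fixed width, one variation (not part of the commands above) is to drop the newline characters from the tr set and let fold insert them. The 80-character width and the file name lines.txt are arbitrary choices for this sketch, and LC_ALL=C is added so that tr works on raw bytes.

```shell
# Random letters and digits, wrapped to 80 characters per line by fold,
# with head keeping the first 100 lines.
LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | fold -w 80 | head -n 100 > lines.txt

# Each line is 80 characters plus one newline, so 100 lines are 8100 bytes.
lines=$(wc -l < lines.txt)
size=$(wc -c < lines.txt)

echo "lines: ${lines}, size: ${size} bytes"
```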