Viewing Files & Comparing File Contents

less - View text file, with scrolling and searching
head - View top n lines of a text file
tail - View last n lines of a text file
diff - Compare contents of two files

Viewing Files

Let's assume we have the file mydata.log with the following contents:

This is the start of our data file: started at Mon March 1, 2025
- Code processing starting
- Data processing starting
- Analysis pass 1 running...
... completed after 30 seconds
- Analysis pass 2 running...
... completed after 30 seconds
- Analysis pass 3 running...
... completed after 30 seconds
- Analysis pass 4 running...
... completed after 30 seconds
- All passes complete
- All processing now complete
Script completed at Mon March 1, 2025

less

The less command opens a text file and lets us scroll forwards and backwards through it using the cursor keys. Pressing the / key followed by a search term searches forward through the file, and n jumps to the next match. The q key quits and returns us to the command line.

$ less mydata.log

head

The head command shows the start of a file. By default it will show the first 10 lines of a file:

$ head mydata.log
This is the start of our data file: started at Mon March 1, 2025
- Code processing starting
- Data processing starting
- Analysis pass 1 running...
... completed after 30 seconds
- Analysis pass 2 running...
... completed after 30 seconds
- Analysis pass 3 running...
... completed after 30 seconds
- Analysis pass 4 running...
$
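The description at the top of this page mentions viewing the top n lines; the count is set with the -n option. The following is a minimal sketch which recreates the first few lines of the example log in a scratch directory (the directory and the variable name first_three are illustrative assumptions) and then shows only the first 3 lines:

```shell
# Work in a throwaway directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the first five lines of the example mydata.log.
printf '%s\n' \
  'This is the start of our data file: started at Mon March 1, 2025' \
  '- Code processing starting' \
  '- Data processing starting' \
  '- Analysis pass 1 running...' \
  '... completed after 30 seconds' > mydata.log

# Show only the first 3 lines instead of the default 10.
first_three=$(head -n 3 mydata.log)
printf '%s\n' "$first_three"
```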

tail

The tail command shows the end of a file. By default it will show the last 10 lines of a file and then return to the command line:

$ tail mydata.log
... completed after 30 seconds
- Analysis pass 2 running...
... completed after 30 seconds
- Analysis pass 3 running...
... completed after 30 seconds
- Analysis pass 4 running...
... completed after 30 seconds
- All passes complete
- All processing now complete
Script completed at Mon March 1, 2025
$
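As with head, the number of lines shown can be changed with the -n option. A small sketch, assuming a scratch copy of the last few lines of the example log (the directory and the variable name last_three are illustrative only):

```shell
# Work in a throwaway directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the last four lines of the example mydata.log.
printf '%s\n' \
  '... completed after 30 seconds' \
  '- All passes complete' \
  '- All processing now complete' \
  'Script completed at Mon March 1, 2025' > mydata.log

# Show only the last 3 lines instead of the default 10.
last_three=$(tail -n 3 mydata.log)
printf '%s\n' "$last_three"
```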

tail is often used to show the latest contents of a log file or text data file which is still being written. You can also use the -f (for follow) option to keep the file open and see new content as it is added in real time:

$ tail -f mydata.log
... completed after 30 seconds
- Analysis pass 2 running...
... completed after 30 seconds
- Analysis pass 3 running...
... completed after 30 seconds
- Analysis pass 4 running...
... completed after 30 seconds
- All passes complete
- All processing now complete
Script completed at Mon March 1, 2025

Note how with -f it does not return to the command line - it will wait indefinitely for more output to be written to mydata.log. This is very handy when monitoring output files to ensure that they are still being written to.

To exit from tail -f, press the Control + C key combination.
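The follow behaviour can be tried out without a real job by appending to a file while tail -f is watching it. This is only a sketch: the scratch directory, the followed.txt capture file and the sleep intervals are assumptions made so the example can run unattended, and kill stands in for pressing Control + C.

```shell
# Work in a throwaway directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo '- Analysis pass 1 running...' > mydata.log

# Follow the log in the background, capturing whatever tail prints.
tail -f mydata.log > followed.txt &
tail_pid=$!

# Simulate a job appending a new line while tail is watching.
sleep 1
echo '... completed after 30 seconds' >> mydata.log
sleep 2

# Stop following (interactively you would press Control + C instead).
kill "$tail_pid"
wait "$tail_pid" 2>/dev/null || true

# The captured output contains both the original and the appended line.
cat followed.txt
```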

Comparing Files

It is often useful to be able to check whether two files are the same. Some common scenarios are checking:

If two scripts are exactly the same
If the output file from a job is the same as the output from a previous run
If two input data sets are identical

Text

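Assume two files, file1.txt and file2.txt, containing ASCII text data which represent the output of two different processing runs, two different sets of data, or two variants of the same application script file. In this example file1.txt holds the digits 1 to 9 followed by 0, one per line, and file2.txt is the same except that lines 2, 6 and 10 contain a, b and z.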

We can use the diff command to scan for differences:

$ diff file1.txt file2.txt
2c2
< 2
---
> a
6c6
< 6
---
> b
10c10
< 0
---
> z
$

If the output from diff is blank, then the two files are identical. Any other output shows the differences between the two files in a format which can be used to create patches, automate changes etc. Each difference starts with a change command such as 2c2, meaning that line 2 of the first file must be changed to match line 2 of the second. The lines which differ are then shown using left (<) and right (>) chevrons to indicate whether they come from file1.txt or file2.txt.
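To reproduce the comparison above, the two files can be recreated by hand. The contents used here are inferred from the diff output shown on this page, and the scratch directory and the variable name diff_output are illustrative assumptions:

```shell
# Work in a throwaway directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# file1.txt: the digits 1-9 then 0, one per line.
printf '%s\n' 1 2 3 4 5 6 7 8 9 0 > file1.txt
# file2.txt: the same, but lines 2, 6 and 10 are replaced by letters.
printf '%s\n' 1 a 3 4 5 b 7 8 9 z > file2.txt

# diff exits with status 1 when the files differ, so "|| true" stops a
# script running under "set -e" from aborting here.
diff_output=$(diff file1.txt file2.txt) || true
printf '%s\n' "$diff_output"
```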

The other useful piece of information is the return code of the diff command: it is 1 if the files are different (a value of 2 or more indicates an error, such as a missing file):

$ diff file1.txt file2.txt
2c2
< 2
---
> a
6c6
< 6
---
> b
10c10
< 0
---
> z
$ echo $?
1
$

If the files are identical in content then no output is shown and the return code of the diff command is 0:

$ diff file1.txt file1.txt
$ echo $?
0
$
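The return code makes diff easy to use inside scripts. The sketch below wraps it in a small helper; compare_files is a hypothetical name chosen for this example, and the file contents are the same assumed sample data as in the comparison above.

```shell
# Work in a throwaway directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 1 2 3 4 5 6 7 8 9 0 > file1.txt
printf '%s\n' 1 a 3 4 5 b 7 8 9 z > file2.txt

# Hypothetical helper: prints a word describing diff's return code.
compare_files() {
  status=0
  # Discard diff's own output; only the return code matters here.
  diff "$1" "$2" > /dev/null 2>&1 || status=$?
  case $status in
    0) echo "identical" ;;
    1) echo "different" ;;
    *) echo "error" ;;
  esac
}

result_changed=$(compare_files file1.txt file2.txt)
result_same=$(compare_files file1.txt file1.txt)
echo "file1.txt vs file2.txt: $result_changed"
echo "file1.txt vs file1.txt: $result_same"
```

Treating any code other than 0 or 1 as an error avoids mistaking a missing or unreadable file for a genuine difference.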

Creating Patch Files

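You can use diff to produce a patch file, which records the changes needed to turn the first file into the second. This is often useful for distributing changes to code or data without distributing the entire file.

A patch in unified format can be generated from two files using the following diff options (-u selects unified output, -N treats a missing file as empty, -a treats all files as text, and -r recurses into directories). The patch is written to standard output, so redirect it to a file if you want to save it: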

$ diff -Naru file1.txt file2.txt

For the two example files above, this produces the following patch:

--- file1.txt   2025-03-31 10:59:24.952670568 +0100
+++ file2.txt   2025-03-31 10:59:41.650482844 +0100
@@ -1,10 +1,10 @@
 1
-2
+a
 3
 4
 5
-6
+b
 7
 8
 9
-0
+z
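A patch like the one above is normally saved to a file and later applied with the patch command. The following is a sketch under a few assumptions: the file name changes.patch and the copy patched.txt are made up for this example, the file contents are the sample data inferred from the output above, and the patch step is skipped if the patch command is not installed.

```shell
# Work in a throwaway directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 1 2 3 4 5 6 7 8 9 0 > file1.txt
printf '%s\n' 1 a 3 4 5 b 7 8 9 z > file2.txt

# Save the unified diff to a file. diff exits with 1 because the files
# differ, so "|| true" keeps a "set -e" script from stopping here.
diff -u file1.txt file2.txt > changes.patch || true

# Apply the patch to a copy of file1.txt, if patch is available.
if command -v patch > /dev/null; then
  cp file1.txt patched.txt
  patch patched.txt < changes.patch
  # No output and return code 0 means the copy now equals file2.txt.
  diff patched.txt file2.txt && echo "patched.txt now matches file2.txt"
fi
```

Applying the patch to a copy leaves the original file1.txt untouched, which makes the result easy to check.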

Binary / Data Files

While you can use diff on binary files (that is, files which are not just text), it will only state whether the files are different, not what the differences are.

Assuming the files binary1.dat and binary2.dat contain different data:

$ diff binary1.dat binary2.dat
Binary files binary1.dat and binary2.dat differ
$
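The return code works the same way for binary files, so a script can still test whether two data files match. A minimal sketch, assuming two tiny files whose bytes are arbitrary and were chosen only so that diff treats them as binary; the exact wording of the message may vary between diff implementations:

```shell
# Work in a throwaway directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Four bytes each, including a NUL byte; only the last byte differs.
printf '\000\001\002\003' > binary1.dat
printf '\000\001\002\004' > binary2.dat

# Capture both the message and the return code.
status=0
message=$(diff binary1.dat binary2.dat) || status=$?
echo "$message"
echo "return code: $status"
```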
