====== Viewing Files & Comparing File Contents ======
* [[#less]] - View text file, with scrolling and searching
* [[#head]] - View top //n// lines of a text file
* [[#tail]] - View last //n// lines of a text file
* [[#text|diff]] - Compare contents of two files
===== Viewing Files =====
Let's assume we have the file ''mydata.log'' with the following contents:
This is the start of our data file: started at Mon March 1, 2025
- Code processing starting
- Data processing starting
- Analysis pass 1 running...
... completed after 30 seconds
- Analysis pass 2 running...
... completed after 30 seconds
- Analysis pass 3 running...
... completed after 30 seconds
- Analysis pass 4 running...
... completed after 30 seconds
- All passes complete
- All processing now complete
Script completed at Mon March 1, 2025
==== less ====
The ''less'' command opens a text file and lets us scroll forwards and backwards through it using the cursor keys. Press ''/'' followed by a search term to search forwards through the file, and ''q'' to quit and return to the command line.
$ less mydata.log
==== head ====
The ''head'' command shows the start of a file. By default it will show the __first__ **10** lines of a file:
$ head mydata.log
This is the start of our data file: started at Mon March 1, 2025
- Code processing starting
- Data processing starting
- Analysis pass 1 running...
... completed after 30 seconds
- Analysis pass 2 running...
... completed after 30 seconds
- Analysis pass 3 running...
... completed after 30 seconds
- Analysis pass 4 running...
$
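The default of 10 lines can be changed with the ''-n'' option. The following is a minimal sketch: ''demo.log'' and its contents are hypothetical stand-ins, generated so that the example runs on its own, and a real file such as ''mydata.log'' could be used in their place.

```shell
# Build a throwaway 14-line file so the example is self-contained.
# demo.log is a hypothetical name; use your own file (e.g. mydata.log) in practice.
i=1
while [ "$i" -le 14 ]; do
    echo "entry $i"
    i=$((i + 1))
done > demo.log

# Show only the first 3 lines instead of the default 10
head -n 3 demo.log
```

This prints ''entry 1'' through ''entry 3'' and then returns to the command line.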
==== tail ====
The ''tail'' command shows the end of a file. By default it will show the __last__ **10** lines of a file and then return to the command line:
$ tail mydata.log
... completed after 30 seconds
- Analysis pass 2 running...
... completed after 30 seconds
- Analysis pass 3 running...
... completed after 30 seconds
- Analysis pass 4 running...
... completed after 30 seconds
- All passes complete
- All processing now complete
Script completed at Mon March 1, 2025
$
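As with ''head'', the number of lines can be set with ''-n''. A short sketch, again using a hypothetical generated file named ''demo.log'' rather than real data:

```shell
# Build a throwaway 14-line file so the example is self-contained.
# demo.log is a hypothetical name; use your own file (e.g. mydata.log) in practice.
i=1
while [ "$i" -le 14 ]; do
    echo "entry $i"
    i=$((i + 1))
done > demo.log

# Show only the last 3 lines instead of the default 10
tail -n 3 demo.log

# Show everything from line 12 onwards (note the leading +)
tail -n +12 demo.log
```

For this 14-line file both commands print the same three lines, ''entry 12'' to ''entry 14'': the first counts back from the end of the file, the second counts forward from the start.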
It is often useful to use ''tail'' to show the latest contents of a log file or text data file that is still being written. You can also use the ''-f'' (for //follow//) option to keep the file open and see new content as it is added in real time:
$ tail -f mydata.log
... completed after 30 seconds
- Analysis pass 2 running...
... completed after 30 seconds
- Analysis pass 3 running...
... completed after 30 seconds
- Analysis pass 4 running...
... completed after 30 seconds
- All passes complete
- All processing now complete
Script completed at Mon March 1, 2025
Note how with ''-f'' it //does not// return to the command line - it will wait indefinitely for more output to be written to ''mydata.log''. This is very handy when monitoring output files to ensure that they are still being written to.
To exit from ''tail -f'', use the //Control + C// key combination.
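The follow behaviour can also be demonstrated non-interactively. The sketch below is only an illustration: the file names ''follow_demo.log'' and ''follow_demo.out'' are hypothetical, the loop stands in for a real job writing to its log, and ''kill'' takes the place of pressing //Control + C//.

```shell
# follow_demo.log and follow_demo.out are hypothetical names for this demonstration.
: > follow_demo.log          # start with an empty log file

# Follow the log in the background, capturing whatever tail prints
tail -f follow_demo.log > follow_demo.out &
tail_pid=$!

# Simulate a job appending to the log while tail is watching
for pass in 1 2 3; do
    echo "- Analysis pass $pass running..." >> follow_demo.log
    sleep 1
done

# Give tail a moment to pick up the final line
sleep 2

# tail -f never exits by itself, so stop it explicitly
# (interactively this would be Control + C)
kill "$tail_pid"
wait "$tail_pid" 2>/dev/null || true

# Everything that was appended to the log has been echoed by tail
cat follow_demo.out
```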
----
===== Comparing Files =====
It is often useful to be able to check if two files are the same. Some common scenarios:
* If two scripts are exactly the same
* If a job has produced the same output file as a previous run
* If two input data sets are identical
==== Text ====
Assume two files, ''file1.txt'' and ''file2.txt'', containing ASCII text data which represent the output of two different processing runs, two different sets of data, or two variants of the same application script file:
''file1.txt'':
1
2
3
4
5
6
7
8
9
0
''file2.txt'':
1
a
3
4
5
b
7
8
9
z
We can use the ''diff'' command to scan for differences:
$ diff file1.txt file2.txt
2c2
< 2
---
> a
6c6
< 6
---
> b
10c10
< 0
---
> z
$
If ''diff'' produces no output, the two files are identical. Any other output shows the differences between the two files in a format which can be used to create patches, automate changes etc. Each group of differences starts with a line such as ''2c2'', meaning that line 2 of the first file must be changed (''c'') to match line 2 of the second. The differing lines are then shown with left (<) and right (>) chevrons to indicate whether they come from //file1// or //file2// respectively.
The other useful piece of information is the //return code// of the ''diff'' command, which is **1** if the files are different (a value of 2 or more means that ''diff'' itself hit a problem, such as a missing file):
$ diff file1.txt file2.txt
2c2
< 2
---
> a
6c6
< 6
---
> b
10c10
< 0
---
> z
$ echo $?
1
$
If the files are identical in content then no output is shown __and__ the //return code// of the ''diff'' command is **0**:
$ diff file1.txt file1.txt
$ echo $?
0
$
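Because the return code is 0 for identical files, 1 for differing files and higher when ''diff'' itself fails, it can drive decisions in a script. A minimal sketch, which recreates the two example files so that it runs on its own (the variable name ''status'' is just an illustrative choice):

```shell
# Recreate the two example files from above
printf '%s\n' 1 2 3 4 5 6 7 8 9 0 > file1.txt
printf '%s\n' 1 a 3 4 5 b 7 8 9 z > file2.txt

# Capture the return code; the differences themselves are discarded
status=0
diff file1.txt file2.txt > /dev/null || status=$?

case "$status" in
    0) echo "Files are identical" ;;
    1) echo "Files differ" ;;
    *) echo "diff reported an error (return code $status)" >&2 ;;
esac
```

With the example files this prints ''Files differ''. Handling the error case separately avoids mistaking a missing or unreadable file for a genuine difference.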
=== Creating Patch Files ===
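You can use ''diff'' to produce a //patch// file, which records the changes needed to turn the first file into the second. This is often useful for distributing changes to code or data, without distributing the entire original file.
Patch files can be generated from two files using the following diff options (with GNU ''diff'', ''-u'' selects the unified format, ''-r'' recurses into directories, ''-N'' treats missing files as empty and ''-a'' treats all files as text):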
$ diff -Naru file1.txt file2.txt
For the two example files above, this prints the following patch, which can be redirected into a file to save it:
--- file1.txt 2025-03-31 10:59:24.952670568 +0100
+++ file2.txt 2025-03-31 10:59:41.650482844 +0100
@@ -1,10 +1,10 @@
 1
-2
+a
 3
 4
 5
-6
+b
 7
 8
 9
-0
+z
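To show how such a patch might be saved and then applied, here is a hedged sketch. It assumes the separate ''patch'' utility is installed (the apply step is skipped if it is not), uses only ''-u'' since the other options make no difference for two existing text files, and introduces the hypothetical names ''changes.patch'' and ''patched.txt''.

```shell
# Recreate the two example files
printf '%s\n' 1 2 3 4 5 6 7 8 9 0 > file1.txt
printf '%s\n' 1 a 3 4 5 b 7 8 9 z > file2.txt

# Save the unified diff to a file (changes.patch is a hypothetical name).
# diff returns 1 because the files differ, which is expected here.
diff -u file1.txt file2.txt > changes.patch || true

# Apply the patch to a copy so that file1.txt itself is left untouched
cp file1.txt patched.txt
if command -v patch > /dev/null; then
    patch patched.txt < changes.patch
    # No output from diff means the patched copy now matches file2.txt
    diff patched.txt file2.txt && echo "patched.txt now matches file2.txt"
else
    echo "patch is not installed; skipping the apply step" >&2
fi
```

Only ''changes.patch'' needs to be sent to someone who already has ''file1.txt''; applying it reproduces ''file2.txt'' on their side.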
==== Binary / Data Files ====
While you can use ''diff'' on binary files (that is, files which are not just text), it will only state whether the files are different, not what the differences are.
Given the files ''binary1.dat'' and ''binary2.dat'', which contain different data:
$ diff binary1.dat binary2.dat
Binary files binary1.dat and binary2.dat differ
$
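When only a yes/no answer is needed for binary data, the standard ''cmp'' command is a possible alternative, since it compares files byte by byte. The sketch below is illustrative rather than a recommended procedure: it generates two tiny stand-in files named ''binary1.dat'' and ''binary2.dat'' instead of using real data.

```shell
# Create two small binary files that differ only in their third byte
printf '\000\001\002\003' > binary1.dat
printf '\000\001\377\003' > binary2.dat

# Plain cmp reports the position of the first differing byte
cmp binary1.dat binary2.dat || true

# cmp -s prints nothing and communicates only through its return code
if cmp -s binary1.dat binary2.dat; then
    echo "Binary files are identical"
else
    echo "Binary files differ"
fi
```

As with ''diff'', a return code of 0 means the files are identical and 1 means they differ.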
----
[[:faq:index|Back to FAQ]]