Monday, December 1, 2014

Finding the files that appear in both file lists

This problem comes up often in file processing, and on Linux the solution is simple.

First, sort both file lists:

sort file_list_1 > sorted_list_1
sort file_list_2 > sorted_list_2

Then print only the lines common to both:

comm -12 sorted_list_1 sorted_list_2
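The two steps can also be wrapped into one small helper. This is only a sketch: the function name common_lines is made up for illustration, and setting LC_ALL=C is my own assumption, added so that sort and comm agree on the ordering.

```shell
# Hypothetical helper: print the lines that appear in both files.
# $1 and $2 are unsorted file lists, one name per line.
common_lines() {
    tmp1=$(mktemp) || return 1
    tmp2=$(mktemp) || { rm -f "$tmp1"; return 1; }
    # Assumption: force byte-wise ordering so sort and comm agree.
    LC_ALL=C sort "$1" > "$tmp1"
    LC_ALL=C sort "$2" > "$tmp2"
    # -12 suppresses columns 1 and 2, leaving only the common lines.
    LC_ALL=C comm -12 "$tmp1" "$tmp2"
    rm -f "$tmp1" "$tmp2"
}
```

Usage would be: common_lines file_list_1 file_list_2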


The -12 option suppresses columns 1 and 2, so only column 3 (the lines common to both files) is printed. From the help page of comm:

Usage: comm [OPTION]... FILE1 FILE2
Compare sorted files FILE1 and FILE2 line by line.

With no options, produce three-column output.  Column one contains
lines unique to FILE1, column two contains lines unique to FILE2,
and column three contains lines common to both files.

  -1              suppress column 1 (lines unique to FILE1)
  -2              suppress column 2 (lines unique to FILE2)
  -3              suppress column 3 (lines that appear in both files)

...
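Going by the same option table, the other combinations give the lines unique to each list. A minimal sketch, assuming both inputs are already sorted as above (the function names are made up for illustration):

```shell
# Hypothetical helpers; both arguments must already be sorted.

# Lines only in the first list: suppress columns 2 and 3.
only_in_first() {
    comm -23 "$1" "$2"
}

# Lines only in the second list: suppress columns 1 and 3.
only_in_second() {
    comm -13 "$1" "$2"
}
```

For example, only_in_first sorted_list_1 sorted_list_2 lists the files missing from the second list.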

Wonderful.


