Using comm as an Alternative to diff

March 14, 2025 | linux, shell, cli, text-processing, tools

Today I discovered the comm command, an alternative way to compare files. Unlike diff, comm expects both input files to be sorted, which makes it a natural fit for comparing sorted lists line by line.

The Problem

Every so often, I need to compare two sets of data, such as records from Service A and Service B. The goal is to quickly find the entries that exist in one set but not in the other.

What I Used to Do

I used to rely on diff with a set of specific flags:

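diff --old-line-format="%L" --new-line-format="" --unchanged-line-format="" a.sorted b.sorted > missing.txt

Explanation:

- --old-line-format="%L" prints each line that appears only in a.sorted (%L is the line including its trailing newline).
- --new-line-format="" suppresses lines that appear only in b.sorted.
- --unchanged-line-format="" suppresses lines common to both files.
- Any line format left unset falls back to printing the line, so all three are given explicitly. These options are GNU diff extensions.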
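To see what each line-format option controls, here is a minimal sketch on made-up sample data (the names and the temporary directory are hypothetical). It assumes GNU diff, since the line-format options are GNU extensions, and it sets all three formats explicitly so nothing depends on defaults.

```shell
# Hypothetical sample data in a throwaway directory.
export LC_ALL=C
workdir=$(mktemp -d)
printf '%s\n' alice bob carol dave > "$workdir/a.sorted"
printf '%s\n' bob dave erin > "$workdir/b.sorted"

# diff exits with status 1 when the files differ, so treat 1 as success.
# Lines only in a.sorted: print old lines, drop new and unchanged ones.
only_in_a=$(diff --old-line-format='%L' --new-line-format='' \
  --unchanged-line-format='' "$workdir/a.sorted" "$workdir/b.sorted" \
  || [ $? -eq 1 ])

# Lines only in b.sorted: the mirror image.
only_in_b=$(diff --old-line-format='' --new-line-format='%L' \
  --unchanged-line-format='' "$workdir/a.sorted" "$workdir/b.sorted" \
  || [ $? -eq 1 ])

printf 'only in a:\n%s\n' "$only_in_a"
printf 'only in b:\n%s\n' "$only_in_b"
```

With this sample data, the first command should report alice and carol, and the second should report erin.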

A Simpler Alternative: comm

With comm, I can achieve the same result more concisely:

comm -23 a.sorted b.sorted > missing.txt

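Explanation:

- comm prints three columns: lines only in the first file, lines only in the second file, and lines common to both.
- -2 suppresses column 2 (lines only in b.sorted).
- -3 suppresses column 3 (lines common to both files).
- What remains is column 1: the lines that appear only in a.sorted.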
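As a quick illustration, here is the same command run on small made-up inputs. The sample names and the temporary directory are assumptions for the sketch; only the comm invocation comes from the command above.

```shell
# Hypothetical sample data; LC_ALL=C keeps the sort order byte-wise,
# so the input files and comm agree on what "sorted" means.
export LC_ALL=C
workdir=$(mktemp -d)
printf '%s\n' alice bob carol dave > "$workdir/a.sorted"
printf '%s\n' bob dave erin > "$workdir/b.sorted"

# Keep column 1 only: lines in a.sorted that are absent from b.sorted.
comm -23 "$workdir/a.sorted" "$workdir/b.sorted" > "$workdir/missing.txt"
cat "$workdir/missing.txt"
```

Here missing.txt should contain alice and carol, the two entries that b.sorted lacks.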

comm vs diff

While both comm and diff compare files line by line, they serve different purposes:

| Feature | comm | diff |
|---|---|---|
| Comparison Basis | Works on sorted files | Works on any files |
| Sorting Required? | Yes | No |
| Output Format | Three-column output (unique & common) | Change hunks (normal, context, or unified format) |
| Suppressing Output | Can hide columns (-1, -2, -3) | Only through line-format options (GNU diff) |
| Finding Unique Lines | comm -23 file1 file2 | diff --old-line-format="%L" --new-line-format="" --unchanged-line-format="" file1 file2 |
| Finding Common Lines | comm -12 file1 file2 | diff --old-line-format="" --new-line-format="" --unchanged-line-format="%L" file1 file2 |
| Best Used For | Comparing sorted lists and extracting unique/common lines | Identifying line-by-line differences |
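The column flags are easier to remember after seeing the raw three-column output once. The sketch below uses hypothetical file names and contents to show the unfiltered output next to two common flag combinations.

```shell
# Hypothetical sample data in a throwaway directory.
export LC_ALL=C
workdir=$(mktemp -d)
printf '%s\n' alice bob carol dave > "$workdir/file1"
printf '%s\n' bob dave erin > "$workdir/file2"

# No flags: all three columns. Column 1 has no indent, column 2 is
# indented by one tab, and column 3 by two tabs.
comm "$workdir/file1" "$workdir/file2"

# -12 hides columns 1 and 2, leaving the lines common to both files.
common=$(comm -12 "$workdir/file1" "$workdir/file2")

# -13 hides columns 1 and 3, leaving the lines only in file2.
only_in_2=$(comm -13 "$workdir/file1" "$workdir/file2")

printf 'common:\n%s\n' "$common"
printf 'only in file2:\n%s\n' "$only_in_2"
```

Each flag names a column to hide, so the digits that are left out are the columns that remain visible.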

My Takeaways

For this specific task, I now prefer comm because it’s simpler and more intuitive. A key advantage is that comm requires sorted input, so the sorting step stays front of mind. With diff, I often forgot to sort first, and unsorted files produced misleading results for this kind of set comparison.
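One way to make the sorting step hard to forget is to check and sort the inputs right before the comparison. This is a minimal sketch with hypothetical unsorted inputs (a.txt and b.txt), not a prescribed workflow. It relies on sort -c, which exits non-zero when a file is not already sorted.

```shell
# Hypothetical unsorted exports in a throwaway directory.
export LC_ALL=C
workdir=$(mktemp -d)
printf '%s\n' dave alice carol bob > "$workdir/a.txt"
printf '%s\n' erin bob dave > "$workdir/b.txt"

# sort -c only checks the order; it writes nothing to stdout.
if sort -c "$workdir/a.txt" 2>/dev/null; then
  a_state=sorted
else
  a_state=unsorted
fi
echo "a.txt is $a_state"

# Sort both inputs with the same locale that comm will run under.
sort "$workdir/a.txt" > "$workdir/a.sorted"
sort "$workdir/b.txt" > "$workdir/b.sorted"

missing=$(comm -23 "$workdir/a.sorted" "$workdir/b.sorted")
printf '%s\n' "$missing"
```

Setting LC_ALL=C for both sort and comm is a deliberate choice here: the two tools need to agree on the collation order, and a byte-wise order is the simplest one to reason about.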