comm is a shell command for comparing two files for common and distinct lines. It reads the files as lines of text and outputs text as three columns. The first two columns contain lines unique to the first and second file, respectively. The last column contains lines common to both. Columns are typically separated with the tab character. If the input text contains lines beginning with the separator character, the output columns can become ambiguous.
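The column layout described above can be illustrated with a small sketch. The file names a.txt and b.txt and their contents are hypothetical sample data, and `mktemp -d` is assumed to be available (it is common but not required by POSIX). The `-1`, `-2` and `-3` options, which suppress the corresponding output column, are standard but not mentioned in the text above.

```shell
# Scratch directory holding two small, already sorted sample files (hypothetical data).
dir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$dir/a.txt"
printf 'banana\ncherry\ndate\n' > "$dir/b.txt"

# Default output: three tab-separated columns.
full=$(LC_ALL=C comm "$dir/a.txt" "$dir/b.txt")

# Suppress columns 1 and 2 to keep only the lines common to both files.
common=$(LC_ALL=C comm -12 "$dir/a.txt" "$dir/b.txt")

# Suppress columns 2 and 3 to keep only the lines unique to the first file.
only_a=$(LC_ALL=C comm -23 "$dir/a.txt" "$dir/b.txt")

printf '%s\n' "$full"
rm -rf "$dir"
```

In the default output, lines unique to the first file carry no leading tab, lines unique to the second file carry one, and common lines carry two. This is why an input line that itself begins with a tab can be mistaken for a line in a later column.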
For efficiency, standard implementations of comm expect both input files to be sorted in the same collation order. The sort command can be used for this purpose. The comparison uses the collating sequence of the current locale. If the lines in either file are not collated in accordance with the current locale, the result is undefined.
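As a minimal sketch of the sorting requirement, the following sorts two unsorted files under a single pinned locale before comparing them. The file names and contents are hypothetical, `mktemp -d` is assumed to be available, and choosing the C locale is an assumption made here for reproducibility. Any locale works as long as sort and comm use the same one.

```shell
# Scratch directory holding two unsorted sample files (hypothetical data).
dir=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$dir/x.txt"
printf 'fig\nkiwi\napple\n' > "$dir/y.txt"

# Sort both inputs under the same locale that comm will use.
LC_ALL=C sort "$dir/x.txt" > "$dir/x.sorted"
LC_ALL=C sort "$dir/y.txt" > "$dir/y.sorted"

# Keep only the lines present in both files.
shared=$(LC_ALL=C comm -12 "$dir/x.sorted" "$dir/y.sorted")

printf '%s\n' "$shared"
rm -rf "$dir"
```

Setting LC_ALL on each command keeps the sort order and the comparison order consistent. Sorting under one locale and comparing under another is a common way to end up with the undefined results described above.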