Find duplicate records in text file

Example (contents of f1.txt):
abc 1000 3452 2463 2343 2176 7654 3452 8765 5643 3452
abc 1000 3452 2463 2343 2176 7654 3452 8765 5643 3452
tas 3420 3562 2123 1343 2176 7654 3252 8765 5643 3452
aer 1000 3452 2463 2343 2176 7654 3452 8765 5643 3452
tas 3420 3562 2123 1343 2176 7654 3252 8765 5643 3452

UNIX:

Display the number of occurrences of each record, followed by the record itself
> sort f1.txt | uniq -c
   2 abc 1000 3452 2463 2343 2176 7654 3452 8765 5643 3452
   1 aer 1000 3452 2463 2343 2176 7654 3452 8765 5643 3452
   2 tas 3420 3562 2123 1343 2176 7654 3252 8765 5643 3452
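
As a small extension of the command above, the counts can be ordered so that the most frequent records come first. This is only a sketch: it recreates the sample file as f1.txt in the current directory, and the variable name `ranked` is chosen here for illustration.

```shell
# Recreate the sample file from the example above.
cat > f1.txt <<'EOF'
abc 1000 3452 2463 2343 2176 7654 3452 8765 5643 3452
abc 1000 3452 2463 2343 2176 7654 3452 8765 5643 3452
tas 3420 3562 2123 1343 2176 7654 3252 8765 5643 3452
aer 1000 3452 2463 2343 2176 7654 3452 8765 5643 3452
tas 3420 3562 2123 1343 2176 7654 3252 8765 5643 3452
EOF

# uniq only collapses adjacent lines, so sort first.
# The second sort orders by the leading count, highest first.
ranked=$(sort f1.txt | uniq -c | sort -rn)
printf '%s\n' "$ranked"
```

Records that occur twice are listed before the record that occurs once; the order among records with the same count may depend on the locale.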

Display only the duplicated records (one copy of each)
> sort f1.txt | uniq -d
abc 1000 3452 2463 2343 2176 7654 3452 8765 5643 3452
tas 3420 3562 2123 1343 2176 7654 3252 8765 5643 3452
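
The complement of `uniq -d` is `uniq -u`, which prints only the records that occur exactly once. A minimal sketch, assuming the same sample data written to f1.txt (the variable name `unique_only` is illustrative):

```shell
# Recreate the sample file from the example above.
cat > f1.txt <<'EOF'
abc 1000 3452 2463 2343 2176 7654 3452 8765 5643 3452
abc 1000 3452 2463 2343 2176 7654 3452 8765 5643 3452
tas 3420 3562 2123 1343 2176 7654 3252 8765 5643 3452
aer 1000 3452 2463 2343 2176 7654 3452 8765 5643 3452
tas 3420 3562 2123 1343 2176 7654 3252 8765 5643 3452
EOF

# -u keeps only lines that are not repeated in the sorted input.
unique_only=$(sort f1.txt | uniq -u)
printf '%s\n' "$unique_only"
```

For the sample file this leaves only the "aer" record.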

Display distinct records (each record once, in sorted order)
> sort f1.txt | uniq
abc 1000 3452 2463 2343 2176 7654 3452 8765 5643 3452
aer 1000 3452 2463 2343 2176 7654 3452 8765 5643 3452
tas 3420 3562 2123 1343 2176 7654 3252 8765 5643 3452
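
Two common variations on the command above, sketched under the assumption that the sample data is in f1.txt: `sort -u` gives the same result as `sort | uniq` in one step, and a short awk filter removes duplicates while keeping the records in their original order. The array name `seen` and the variable names are arbitrary.

```shell
# Recreate the sample file from the example above.
cat > f1.txt <<'EOF'
abc 1000 3452 2463 2343 2176 7654 3452 8765 5643 3452
abc 1000 3452 2463 2343 2176 7654 3452 8765 5643 3452
tas 3420 3562 2123 1343 2176 7654 3252 8765 5643 3452
aer 1000 3452 2463 2343 2176 7654 3452 8765 5643 3452
tas 3420 3562 2123 1343 2176 7654 3252 8765 5643 3452
EOF

# Same result as "sort f1.txt | uniq", in a single command.
sorted_unique=$(sort -u f1.txt)

# Keep the first occurrence of each line without sorting:
# seen[$0]++ is 0 (false) the first time a line appears, so
# the negation is true and awk prints the line.
in_order=$(awk '!seen[$0]++' f1.txt)

printf '%s\n' "$in_order"
```

The awk version prints the records in the order abc, tas, aer, matching their first appearance in the file, whereas the sorted versions print abc, aer, tas.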

Windows:

Notepad++ with the TextFX plugin can sort lines and remove duplicate lines in a single step. Note that the result is sorted, so the original line order is not kept.
  1. Open the menu TextFX --> TextFX Tools.
  2. Make sure "sort outputs only unique..." is checked.
  3. Select a block of text (Ctrl-A selects the entire document).
  4. Click "sort lines case sensitive" or "sort lines case insensitive".