Unix Day 3: Removing Duplicates from a File

How to clean up your dictionary file for brute-force attacks

Ned Poplaski (CISSP)
May 18, 2021 · 2 min read
How to find and remove duplicate words in your dictionary file

Why do you need to do this?

At some point when you learn to crack passwords, you will end up creating your own custom dictionaries. Over time these dictionaries grow as you add more words, and that's when it becomes important to remove the duplicates from the file.

Enter Unix

View the dictionary file

$ cat 1.txt
password
password123
hello
hello
password
password

Note: the words password and hello are duplicated.
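If the file is too long to eyeball, you can count how many times each word appears; uniq -c prefixes every line with its occurrence count (shown here against the example file 1.txt):

$ sort 1.txt | uniq -c
      2 hello
      3 password
      1 password123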

Unix Solution:

sort 1.txt | uniq > out.txt

This sorts the words in 1.txt and pipes them to the uniq command, which keeps only the unique words and writes them out to the desired file (out.txt). Note that uniq only removes adjacent duplicates, which is why the input has to be sorted first. Conversely, if you want only the list of duplicated words, use the -d flag:

sort 1.txt | uniq -d > dupes.txt
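If your sort supports it (GNU and BSD versions both do), the -u flag combines the two steps, sorting and deduplicating in a single command:

sort -u 1.txt > out.txt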

Word of caution:

Sometimes the last word in the file might not end with a newline. Such a word can be mistakenly treated as unique in the list.
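One way to guard against this is to check the file's last byte and append a newline if it is missing. This is a minimal sketch, assuming out.txt from the earlier step; it relies on command substitution stripping a trailing newline:

# append a newline only if the last byte of out.txt is not already one
[ -n "$(tail -c 1 out.txt)" ] && echo >> out.txt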

Note: some cracking software, such as hashcat, performs this check before each run, but it is still good to start with a clean dictionary.
