Passphrases and the Passphrase Token Attack

Never say “passphrase” around a pedant. Peasants and pheasants are OK, but pedants will bring up the passphrase token attack, frequently overstate the threat, or flat-out get it wrong. It isn’t that these pedants can’t do math; it’s just that it didn’t occur to them to do the math.

Let’s start with some definitions before we disembowel the passphrase token attack.

Pedant (noun): A person you don’t want to talk about passphrases with

Passphrase (noun): A password composed of words. Note that it doesn’t have to consist exclusively of words

Token (noun):

  1. A sign of appreciation
  2. A member of The Tokens, the band that recorded “The Lion Sleeps Tonight”
  3. A necessary component of a passphrase token attack

In a passphrase token attack, each word is a “token” that represents a collection of letters. In a typical brute force password attack you test combinations of characters. If you have a standard 95-character set, then there are 95^8 potential passwords for an 8-character password. This means that you build each guess one character at a time. You might start with all of the combinations of lowercase letters, then upper, then numbers, then symbols, and then start combining character sets.

For a passphrase token attack, you are testing entire words at a time. The word “plate” is a token representing the set of letters p-l-a-t-e that make up the word “plate,” and in that specific sequence. The word “of” is a token representing the letters “o” and “f” in that specific sequence. The word “cauliflower” is a token representing… you know.

Our passphrase is the three-word (token) phrase “plate of cauliflower.”
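To make the difference concrete, here is a minimal Python sketch of both ways of generating guesses. The function names and the four-word toy dictionary are made up for illustration; a real cracking tool is far more elaborate and orders its guesses much more cleverly.

```python
from itertools import product

def char_candidates(charset, length):
    """Classic brute force: every string of `length` characters from `charset`."""
    for combo in product(charset, repeat=length):
        yield "".join(combo)

def token_candidates(wordlist, n_words, sep=" "):
    """Token attack: every phrase of `n_words` words drawn from `wordlist`."""
    for combo in product(wordlist, repeat=n_words):
        yield sep.join(combo)

# Hypothetical toy dictionary, far smaller than anything an attacker would use.
toy_words = ["plate", "of", "cauliflower", "dogs"]

char_guesses = list(char_candidates("ab", 2))         # 2^2 = 4 guesses
token_guesses = list(token_candidates(toy_words, 3))  # 4^3 = 64 guesses
```

Either way the attacker is walking through a set raised to a power. The only things that change are what counts as one "object" and how big the set is.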

This is where some pedants fail their math tests. Some will claim that the three tokens are no better than a three-letter password. Can I get a head slap from you all, please?

Let’s do the math here. If the entire 95-character set is available, then 3 characters provide a maximum of 95^3 possible passwords. That’s about 857,375 potential passwords. OK, it’s exactly 857,375 potential passwords. And a three-word passphrase?

The number of potential permutations of passwords and passphrases alike is the size of the set raised to the power of the number of objects used. The printable ASCII set is 95 characters. Get ready to have your mind blown (or not): there are over 1,000,000 words in the English language alone. And so, three words provide at least 1,000,000^3 potential passwords. Passphrases are passwords after all. That is 1 quintillion (1 followed by 18 zeros) passwords.
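If you would rather not take my word for it, the arithmetic is a one-liner. This is a small sketch; `keyspace` and `PRINTABLE` are names I picked for illustration, and the one-million-word figure is simply the round number used above.

```python
import string

def keyspace(set_size, length):
    """Number of possible sequences: set size raised to the number of objects."""
    return set_size ** length

# Letters, digits, punctuation, and the space: the 95 printable ASCII characters.
PRINTABLE = string.ascii_letters + string.digits + string.punctuation + " "

three_chars = keyspace(len(PRINTABLE), 3)   # three characters from the 95-character set
three_words = keyspace(1_000_000, 3)        # three words from a million-word language

print(f"{three_chars:,} vs {three_words:,}")
```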

However, nobody is going to use or know a million or more words. An article in The Economist will provide you with some vocabulary size-related information. What it boils down to is that your “dictionary” will be limited to a subset of all of the words. The smaller the subset, the higher the risk that a required word isn’t in your dictionary. However, the smaller the subset, the less time it takes to go through all of the possible passphrases. Kind of. People will do things like use the number “4” to represent the word “for.” To cover that eventuality the number 4 must be in the dictionary. If we limit our dictionary to the 2,000 most commonly used words, then there are 2,000^3, or 8 billion, potential 3-word passphrases in the small collection of words. While 8 billion is better than 857,375, it is still barely better than a 5-character password. Although I advocate for passphrases of five or more words, the three-word passphrase isn’t as bad as it sounds. I’ll get back to that later.
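One way to compare a passphrase against a conventional password is to ask how many random characters it is "worth." The sketch below does that with logarithms. The helper name `equivalent_chars` is mine, and it assumes the attacker knows both the dictionary and the number of words, which is the worst case for the passphrase.

```python
import math

def equivalent_chars(dict_size, n_words, charset_size=95):
    """Length of a random password (from `charset_size` characters) whose
    keyspace matches `n_words` words drawn from a `dict_size`-word dictionary."""
    return n_words * math.log(dict_size) / math.log(charset_size)

three_word_space = 2_000 ** 3   # three words from a 2,000-word dictionary
five_char_space = 95 ** 5       # five characters from the 95-character set

print(round(equivalent_chars(2_000, 3), 2))  # three words, in "characters"
print(round(equivalent_chars(2_000, 5), 2))  # five words, in "characters"
```

Under those assumptions, three words from a 2,000-word dictionary land just above five characters, and five words land a bit above eight. Every word an attacker has to add to the dictionary pushes those numbers up.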

Does it help to use words in a foreign language? One pedant claimed that it didn’t because they have foreign language words in their dictionary too. Wrong answer. Adding words in a foreign language means you increase the dictionary size, and that means a larger number of potential passphrases to test. That means it may take a lot more time to crack. I actually have a passphrase that includes one word each from two obscure languages. How big is a dictionary that includes a reasonable set of words from a few dozen different languages?

So, what is the magnitude of the passphrase token attack threat? Minuscule, if the passphrase contains enough letters and is not a common phrase. Unless an attacker knows that I use a passphrase, they will spend a long, long time exhausting a traditional brute force attack, which will eventually create the combination of characters that form the words in the passphrase. Essentially, they won’t get to the passphrase token attack. If they do start with the token attack, then a single word that is not in their dictionary will result in a passphrase that is 100% immune to the token attack.
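The "one missing word" point is easy to see in code. This is a simplified sketch: the attacker's dictionary here is a hypothetical seven-word set, and the check assumes tokens are separated by single spaces.

```python
def reachable_by_token_attack(passphrase, dictionary):
    """True only if every word of the passphrase is a token in the dictionary."""
    return all(word in dictionary for word in passphrase.split())

# Hypothetical attacker dictionary; note that it holds "for" but not "4".
attacker_words = {"plate", "of", "cauliflower", "for", "dogs", "and", "cats"}

all_known = reachable_by_token_attack("plate of cauliflower", attacker_words)
foreign_word = reachable_by_token_attack("plate of blumenkohl", attacker_words)
digit_swap = reachable_by_token_attack("cauliflower 4 dogs", attacker_words)
```

Only the first phrase can ever be produced by combining tokens from that dictionary. The other two each contain one token the attacker never generates, so no amount of token combining will reach them.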

The passphrase “plate of cauliflower” is 20 characters long. Before you say “but it doesn’t have uppercase, or numbers, or symbols,” do the math. There are 26 lowercase letters and one special character. The space between words is a character, but let’s pretend it is one of the 26 lowercase letters for the time being. Now we have 26^20. That’s almost 37,000 times more potential permutations than a complex 12-character password can provide. One more thing. Most people are probably going to start their passphrase with an uppercase letter, and that ups the count to 52^20, or roughly 20,896,178,655,943,100,000,000,000,000,000,000. Fine, because we did use a space, the actual number is 53^20, roughly 3.06 × 10^34. That is larger, but the difference is irrelevant in terms of current processing power.
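Python integers have arbitrary precision, so these very large counts can be checked directly rather than trusted. A short sketch follows; the variable names are mine, and "complex 12-character password" is taken to mean 12 characters from the 95-character printable set.

```python
passphrase = "plate of cauliflower"
length = len(passphrase)            # spaces count as characters

lowercase_only = 26 ** length       # pretend the space is just another lowercase letter
complex_12 = 95 ** 12               # 12 characters from the full 95-character set
ratio = lowercase_only // complex_12

with_uppercase = 52 ** length       # upper and lower case letters
with_space = 53 ** length           # upper, lower, and the space character

print(f"{ratio:,}")
print(f"{with_uppercase:.3e}  {with_space:.3e}")
```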

Now, if my three-word passphrase is “dogs and cats,” then we have a problem. Although the 13-character lowercase password provides about 4 times as many permutations as a complex nine-character password, it’s a common phrase. An attacker is likely to start with common phrases because testing the 10 million most common phrases takes very little time.
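Here is why keyspace stops mattering once the phrase is on a list. The sketch below uses a hypothetical three-entry list standing in for the attacker's millions of common phrases; `guesses_needed` is a made-up helper name.

```python
def guesses_needed(target, common_phrases):
    """Return the 1-based guess number that finds `target`, or None if the list misses it."""
    for guess_number, phrase in enumerate(common_phrases, start=1):
        if phrase == target:
            return guess_number
    return None

# Hypothetical stand-in for a list of common phrases.
common_phrases = ["to be or not to be", "dogs and cats", "salt and pepper"]

# 13 lowercase characters versus 9 characters from the 95-character set.
keyspace_ratio = 26 ** 13 / 95 ** 9

common_hit = guesses_needed("dogs and cats", common_phrases)
original_miss = guesses_needed("piglet was one cute little dude", common_phrases)
```

The ratio comes out a little under 4, which is nice, but the common phrase falls on the second guess anyway. An original sentence is simply not on the list, so the attacker has to fall back to the expensive approaches.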

The Composition of a Great Passphrase:

Use an original sentence. This defeats the dictionary attack looking for common phrases.

Make sure there are at least 16 characters, especially if you are only using one character set. Most experts recommend a minimum of 20 characters. You might want to create your passphrases in a word processor so as to easily get the character count.

Make your passphrase easy to remember. The entire purpose of using passphrases is to be able to make a long password that is easy to remember.

If you are concerned that your passphrase isn’t resilient enough to a passphrase token attack, then use one or more words from an obscure foreign language, or one or two obscure words from a common foreign language, such as Klingon or Pig Latin.

Let me give you an example of a stupendously great passphrase: piglet was one cute little dude.

The passphrase is original. The passphrase is easy to remember. The passphrase is 31 characters long. The passphrase makes me smile.

When you encounter a situation where the password has to be complex, make it easy. Like this: Piglet was one cute little dude1! Still easy to remember, and it meets the requirements for complexity.
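If a word processor feels like overkill for counting characters, a few lines of Python will do the same job and check the usual complexity boxes while they are at it. This is only a sketch: `character_classes` is a name I made up, and actual complexity rules vary from site to site.

```python
def character_classes(password):
    """Report which of the four usual character classes appear in `password`."""
    return {
        "lower": any(c.islower() for c in password),
        "upper": any(c.isupper() for c in password),
        "digit": any(c.isdigit() for c in password),
        # Anything that is neither a letter, a digit, nor whitespace.
        "symbol": any(not c.isalnum() and not c.isspace() for c in password),
    }

plain = "piglet was one cute little dude"
complex_version = "Piglet was one cute little dude1!"

print(len(plain), character_classes(plain))
print(len(complex_version), character_classes(complex_version))
```

Needless to say, run something like this locally and never paste a real passphrase into a website that offers to "check" it for you.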

Do bear in mind that there are three predators that present the exact same risk to passphrases as they do to conventional passwords.

  1. Phishing: If you tell me your password, why do I need to crack it?
  2. Keystroke loggers: You never saw it coming.
  3. Data breaches: Specifically breaches where a company uses little or weak encryption. They gave away your password.

All three weaknesses are overcome by one defense: Multi-factor authentication.

Be sure to save your passphrases as you change each one. Eventually, you can string them together for use in a poetry slam.

Randy Abrams
Senior Security Analyst
SecureIQLab