Robust Dictionary Attack of Short Simple
Substitution Ciphers
Edwin Olson
MIT Computer Science and Artiﬁcial Intelligence Laboratory
Computer Engineering
Cambridge, MA 02140
Email: eolson@mit.edu
Abstract— Simple substitution ciphers are a class of puzzles
often found in newspapers, in which each plaintext letter is
mapped to a ﬁxed ciphertext letter and spaces are preserved.
In this paper, we describe a system for automatically solving
them, even when the ciphertext is too short for statistical
analysis, and when the puzzle contains non-dictionary words.
Our approach is a based around a dictionary attack; we describe
several important performance optimizations, as well as effective
techniques for dealing with non-dictionary words. We present
quantitative performance results for several variations of our
approach as well as two other implementations.

I. I NTRODUCTION
Approaches to automatically solving simple substitution
ciphers fall into two basic groups. The ﬁrst family relies on
statistical analysis of ciphertext frequencies (including digrams
and N-grams) and permutes the mapping from ciphertext
to plaintext such that the plaintext is highly probable. This
family includes methods based on relaxation [1] and genetic
algorithms [2]; they are typically iterative in nature. These
approaches rely on having relatively long ciphertext passages
(hundreds of letters) in order for their measured statistics to be
meaningful. Notably, some methods can recover the plaintext
even when spaces are permuted along with the other 26 letters
[3].
The second family of approaches is based on dictionary
attacks [4], [5], [6]. These approaches can be exceptionally
fast and can operate on very short ciphertexts where the
statistics are virtually meaningless. This performance comes
at a cost: they typically require that spaces be preserved in the
ciphertext and require a large dictionary that contains most (if
not all) of the plaintext. Unfortunately, large dictionaries tend
to produce spurious (non-sensical) solutions. The method by
Hart [5] differs from other methods in this family, in that it
purposefully uses a very small dictionary (135 words) and
explicitly considers hypotheses that ciphertext words are not
found in the dictionary.
In this paper we present our approach, a dictionary-based
attack with several reﬁnements:
• At each level of the search tree, we perform a setintersection based optimization to reduce the search space
of child nodes.
• We allow the substitution of a single letter at a node in
the search tree, rather than requiring a whole word to be
substituted.

We dynamically select which ciphertext word or ciphertext letter to expand at each node. Our selection heuristic
was selected after comparing several static and dynamic
planning strategies.
• We employ three separate mechanisms for dealing with
non-dictionary words; the resulting algorithm can robustly handle ciphertexts with multiple non-dictionary
words, albeit with additional search costs.
• We compute posterior probabilities of candidate solutions
using a priori trigram probabilities; this allows multiple
solutions to be meaningfully ranked. This allows us to use
larger dictionaries: even if many solutions exist (many
non-sensical), the most likely solutions are automatically
identiﬁed and brought to the user’s attention.
Our approach is effective and fast on virtually all cryptograms, including those with very short ciphertexts (under 40
letters), even when non-dictionary words are present (including
proper names, and word-play). In addition, our software is
available under an open source license.
•

II. A PPROACH OVERVIEW
Our algorithm begins by parsing the ciphertext into a
number of ciphertext words. There are a number of challenges
even here. Hyphenated words, for example, could be treated
at least three different ways: the dictionary could contain a
list of acceptable hyphenated words (including the hyphen),
hyphens could be stripped (producing compound words),
or hyphens could be replaced with spaces (producing two
separate words). Casual English is wildly inconsistent with
regard to hyphenated words, and the ﬁrst two methods are
likely to produce non-dictionary words: we cannot reasonably
compile a list of all words that could be hyphenated. We opt for
the conservative approach of splitting hyphenated words into
separate words since it is more likely to result in dictionary
words.
Apostrophes present a similar dilemma: while there are
relatively few contractions, there are countless possessives. For
simplicity, we strip apostrophes, with the result that possessive
constructions masquerade as plurals.
For each ciphertext word, we look-up a list of candidate words with identical letter patterns. For example, both
“HELLO” and “SILLY” are suitable replacements for the
ciphertext “RXQQL”, all of which have the canonical letter

Algorithm 2 SolveRecursive(M ap)
1: if AllCipherTextKnown() then
2:
ReportFullSolution(M ap)
3:
return
4: C = PlannerSelectUnknownLetterOrWord()
5: has child = false
6: M ap = SelfIntersection(M ap)
7: for all P in Candidates(C) do
8:
if (IsConsistent(M ap, C, P )) then
9:
N ewM ap = AddMappings(M ap, C, P )
10:
SolveRecursive(N ewM ap)
11:
has child = true
12: if (has child = false) then
13:
ReportPartialSolution(M ap)
14: return

pattern “ABCCD”. We accelerate the dictionary look-up by
storing our dictionary on disk sorted by letter pattern, rather
than alphabetically.
Like most dictionary attacks, our algorithm is fundamentally
a depth-ﬁrst search (see Alg. 1 and Alg. 2). At every node in
the search tree, a ciphertext word is selected and its various
plaintext alternatives are substituted. Each substitution creates
a child node in which more of the plaintext is known than
at the parent node. The state of the solver is a map, which
records for each ciphertext letter the set of possible plaintext
letters. Initially, each ciphertext letter could be any plaintext
letter, i.e., has the set {A, B, C, ..., Z}.
When a node performs a substitution, it must make sure
that the substitution is consistent with the substitutions made
by its ancestor nodes:
•
•

No ciphertext letter may be mapped to more than one
plaintext letter. For example, (X=A, X=B) is forbidden.
A plaintext letter can only be mapped to once. E.g.,
(X=A, Y=A) is forbidden.

III. S ET- INTERSECTION CANDIDATE PRUNING
The set-intersection candidate pruning method considers
constraints resulting from the simultaneous consideration of
multiple candidate lists. It is a purely logic-inferential algorithm with no searching. It is an effective method: it can often
solve cryptograms by itself, without any search.
Let us begin with an example. Suppose we have two
ciphertext words and their candidate lists:

We extend the classical dictionary attack by allowing a
node in the tree to substitute a single letter, rather than a
whole word. This is an important capability that improves
performance, as we will show in Section VIII.
The particular choice of which ciphertext word (or letter) to
substitute at each node has a profound impact on the search
complexity. Some words have more candidate substitutions
than others and will yield more child search nodes. Conversely,
words that are longer and contain more frequently occurring
letters will, once substituted, more greatly constrain future
child nodes, resulting in fewer search nodes. The choice of
which ciphertext word or letter to substitute at each node is
the responsibility of a planner, which we will discuss in detail
in Section V.
A signiﬁcant optimization can performed on each node,
reducing the set of all candidate lists signiﬁcantly. This optimization is described in Section III. Its effect is to eliminate
candidate words for ciphertext words that have not yet been
substituted.
When the entire plaintext is known, we report a solution, but
we do not terminate the search. Many ambiguous solutions are
possible, and our algorithm enumerates them all. Additionally,
whenever we reach a “dead-end” in the search tree (i.e., a node
that has no children), we report a partial solution; these are
useful when non-dictionary words are present, as described in
Section VI.

MCDMRCNSFX
deadweight
disdainful
gregarious
perplexity

MSCNPPRX
aﬂutter
bedrooms
gorillas
proceeds
typhoons

Note that the candidate list for MCDMRCNSFX requires
that M ∈ {d, g, p}, while the candidate list for MSCNPPRX
requires that M ∈ {a, b, g, p, t}. Clearly, only M ∈ {g, p}
satisﬁes both constraints. We can thus eliminate all of the
candidates where this joint constraint is not satisﬁed, yielding:
MCDMRCNSFX
gregarious
perplexity

MSCNPPRX
gorillas
proceeds

Now note that the candidate list for MCDMRCNSFX requires that C ∈ {r, e} while the candidate list for MSCNPPRX
requires that C ∈ {o, r}. Satisfying both constraints requires
that we assign C = r, yielding a unique decryption: “gregarious gorillas”.
We now describe the algorithm in general, as listed in
Alg. 3. We begin by selecting one ciphertext word and its
corresponding list of plaintext candidates. Any candidates that
are not consistent with the puzzle’s current map are discarded.
Assuming that the word’s plaintext is one of the remaining
candidate words, we compute the possible set of ciphertext-toplaintext letter mappings. After considering all of the candidate
words, we then compute the intersection of this set with the
puzzle’s current map. This intersection becomes the puzzle’s
new map.

Algorithm 1 Solve(M ap)
1: for all X in {A, B, C, ..., Z} do
2:
if (UserProvidedClue(X) then
3:
M ap(X) = { UserClue(X) }
4:
else
5:
M ap(X) = {A, B, C, ..., Z}
6: SolveRecursive(M ap)

2

Index 0

We iterate over the ciphertext words, performing an intersection operation for each, until the puzzle’s map has reached
steady-state, i.e., until no more reductions are possible.
Algorithm 3 SelfIntersection(M ap)
1: {Initialize Map to full sets}
2: repeat
3:
for all C in CiphertextW ords do
4:
{Initialize NewMap to empty sets}
5:
for all X in {A, B, C, ..., Z} do
6:
N ewM ap(X) = {}
7:
for all P in Candidates(C) do
8:
if (IsConsistent(M ap, C, P )) then
9:
N ewM ap =AddMappings(N ewM ap, C, P )
10:
M ap = Intersect(M ap, N ewM ap)
11: until no reductions performed
12: return M ap

Candidates
eliminated
by root node
(level 0)
Candidates
eliminated
by level 1

firstcand
le wo

disab

rd

firstcand

Index N-1

Fig. 1. Candidate list data structure. Each ciphertext word has an array of
candidate words. As successive nodes in the search eliminate candidates, the
eliminated candidates are shufﬂed to the end of the list of eliminated words
and f irstcand is incremented. A node can re-enable the words it disabled
by simply changing the pointer f irstcand.

SetIntersection is run at every node in the tree (including
the root): it typically results in signiﬁcant reductions to the
size of the candidate lists, which reduces the branch factor
of the depth-ﬁrst search. Since run-time cost is exponential in
the branching factor, this results in signiﬁcant speed-ups, as
quantiﬁed in Section VIII.
We note that this algorithm does not detect all possible
inferences. For example, suppose that three separate candidate
lists require that X ∈ {a, b}, Y ∈ {a, b}, and Z ∈ {a, b, c}.
It can be deduced that Z = c, but this would not be detected
by the algorithm described above. Our efforts to exploit these
inferences have indicated that the search costs to identify them
generally exceeds the beneﬁt.

At each node in the search tree, we push the value of
f irstcand, onto a stack. Disabling a word is an O(1) operation: the candidate to be disabled is swapped with the
candidate at f irstcand, and f irstcand is incremented. Reenabling all of the words is done by popping the value
f irstcand off the stack; no shufﬂing of words is required,
resulting in an O(1) operation.
Note that this scheme does not preserve the order of the
candidates, but the order is not important for correctness.
(In fact, the candidate lists are presorted according to the
frequency of the occurrence of each candidate. This does not
change the total run-time of the algorithm, but it does cause
the search to explore the most likely solutions ﬁrst. In other
words, the most likely solution tend to be found earlier in
the search. The permutation of the candidate order due to the
enabling/disabling logic described above might seem to defeat
this optimization. However, the root node is totally unaffected
by the permutation, and the root node’s search order has the
greatest effect by far.)

IV. M AINTAINING C ANDIDATE L ISTS
As we have described the algorithm thus far, we build
a list of plaintext candidates for each ciphertext word and
make extensive use of the function IsConsistent to skip over
candidates that are not consistent with the puzzle’s current
map. If implemented in this manner, a great deal of CPU time
would be spent in IsConsistent eliminating the same candidates
over and over again.
When a word is eliminated at a node in the search tree,
it can be eliminated from all of its children as well. Conceptually, when a candidate is ruled out, it is moved into a
“disabled” list; only “enabled” words are searched at children
nodes. Naturally, when a node has ﬁnished iterating over its
candidates (and is about to return to the parent node), all of
the candidates it disabled must be re-enabled. It is important
that both of these operations–disabling individual words and
re-enabling all the words that were disabled at one node in
the tree– be very fast.
Our solution is to store each ciphertext word’s candidate
list in an array divided into two parts: the ﬁrst part contains
disabled words, and the second part contains enabled words
(see Fig. 1). An index, f irstcand, maintains the index of the
ﬁrst enabled word. This is initialized to zero at the root node,
meaning that no candidates are disabled.

V. P LANNING
At each node in the search tree, the planner selects an
unknown ciphertext letter or word for substitution. The choice
is critical for good performance, since each option involves a
different amount of work for a different amount of plaintext
recovery. Planners fall into one of two basic categories:
• Static: The ciphertext words are sorted according to some
metric when the puzzle is initialized, and the ith word is
always selected at depth i in the search tree.
• Dynamic: At every level of the search tree, the ciphertext
word or letter to substitute is chosen, independently of
what other nodes at the same level of the tree have
done. Some metric is computed for each ciphertext word,
which could include the amount of plaintext that would
be recovered if that ciphertext word was selected, and/or
the number of candidate words enabled for that ciphertext
word. The best ciphertext word or letter is then selected.
3

Static planners are attractive since they do not require
computation at each level of the search tree. However, they
are unable to take exploit opportunities that arise in the
middle of a search: the size of candidate lists is sensitive
to the particular plaintext substituted made at earlier nodes
in the tree, not just which ciphertext word was substituted.
For example, substituting “RXQQL=hello” may greatly reduce
the candidate list for one ciphertext word while having little
effect on a second; substituting “RXQQL=silly” could have
the opposite effect. Dynamic planners, capable of exploiting
these situations, almost invariably outperform static planners.
We consider only dynamic planners for the remainder of this
paper.
In either static or dynamic planners, there are a number
of plausible cost metrics that could be used to determine the
best word or letter to substitute. Suppose, for example, that
a particular ciphertext word C has N plaintext candidates. If
substituting C gives us L mappings (i.e., we learn the correct
translation of L ciphertext letters), we could deﬁne the cost
as:
cost(N, L) = N 1/L

cost(N, L) = N

This cost function is, in aggregate, the best performing
of the cost functions we evaluated. Like the linear metric, it appears to be successful due to the effects of the
SetIntersection algorithm. Performing a substitution– even
a single letter– reduces the candidate lists so dramatically
that it is worth doing it, even though the amount of plaintext
recovered at that node is rather small.
While the effective branching factor is principled, it
is out-performed by heuristics that attempt to model the
beneﬁts of the SetIntersection algorithm. However, the
SetIntersection algorithm’s performance varies from puzzle
to puzzle. When SetIntersection does not reduce the candidate lists substantially, the other methods can out-perform it.
It is possible that a more complicated metric could be devised
that would better estimate the efﬁcacy of SetIntersection;
this would lead to an even faster algorithm.
VI. N ON -D ICTIONARY W ORDS
Non-dictionary words generally cause dictionary attacks to
fail since substituting the incorrect plaintext word usually leads
to a contradiction.
Our algorithm includes three strategies for dealing with
non-dictionary words. The ﬁrst strategy is to report partial
solutions at any terminal node in the search tree. By the time a
contradiction is encountered, many letter mappings have been
made; often, many of them are correct. Thus, we report a
partial solution; a human user can often trivially determine
the correct answer from such a partial solution.
The second strategy for dealing with partial solutions is to
ignore a subset of the ciphertext. If we are unable to solve
the puzzle with all words enabled, we try solving the puzzle
with one word disabled at a time. If we are still unsuccessful,
we try solving the puzzle with all combinations of two words
disabled at a time, then three. This strategy is sufﬁcient to solve
all puzzles with up to three non-dictionary words. Note that
the Self Intersection algorithm operates as usual, though it
also ignores the disabled words.
Puzzles with more than three non-dictionary words pose
an additional challenge. At this point, trying to blindly guess
which words are non-dictionary words becomes unreasonable.
Instead, we do two things:
• We disable the Self Intersection algorithm. The
Self Intersection algorithm considers all the ciphertext
words at every node; if a non-dictionary word is considered, candidate lists for all other words are immediately
affected, even at the root node. The result is that candidates for in-dictionary words can be erroneously pruned
out, leading to failure.
• We switch to a Random planner. Any time that the
planner picks a ciphertext word whose plaintext is in the
dictionary, a signiﬁcant amount of plaintext is recovered.
If the planner picks a few dictionary words, a human can
usually discern the entire correct plaintext.

(1)

This cost metric, which we call the effective branching
factor (EBF) metric, computes the effective branching factor
B that would occur in the hypothetical situation that only
one mapping is discovered per search node. Note that the
total work of this hypothetical situation is N = B L , which
matches the actual amount of work N at this node. It is an
appealing metric because it captures the fact that the cost of a
depth-ﬁrst search is the product of the branch factors at every
depth of the tree. It provides a principled way of computing
a one-dimensional ranking from two different statistics of a
ciphertext word: the number of candidates and the amount of
plaintext that would be recovered.
A simpler metric, the linear metric, is the number of
candidates per amount of plaintext recovered:
cost(N, L) = N/L

(3)

(2)

Eqn. 2 tends to more heavily penalize words with many
candidates and, to our initial surprise, generally out-performs
the previous metric. We attribute this to the effects of the
SetIntersection algorithm: at each node in the tree, the
number of candidates is reduced substantially. It is thus advantageous to make a substitution as quickly as possible (i.e.,
pick a word with few candidates), and let the SetIntersection
optimization reduce the branching factor of child nodes. In
other words, a ciphertext word with many candidates is bad,
even if the effective branching factor is small, because all of
those candidates must be considered without the beneﬁts of
SetIntersection along the way.
Is it advantageous to run SetIntersection at the earliest
opportunity, without regard to how much plaintext would be
recovered? The lazy cost function does exactly that:
4

The quick brown fox jumped over the lazy dogs.
The quick frown box jumped over the lazy dogs.
The jacky frown box palmed over the quiz dogs.
The quick franz wax jumped aver the glob days.
Bye lamps grown fox jacked over bye quiz doth.
She fatly grown pox jambed over she quiz dock.

1000
900
800
700

TABLE I

600

“T HE QUICK BROWN
FOX JUMPED OVER THE LAZY DOGS .”

time (s)

A SUBSET OF THE 1750

POSSIBLE SOLUTIONS TO

500
400
300
200

The combination of disabling the Self Intersection algorithm and the Random planner ensure that the entire search
space will eventually be searched. The success of this strategy
is dependent upon the Random planner selecting ciphertext
words that are in the dictionary. (In contrast, note that our
second strategy relies on guessing the set of words that are not
in the dictionary.) In practice, the Random planner must only
identify a few dictionary words correctly before an adequate
partial solution is found. are In addition, we provide a method
for users to indicate those words that are likely to be nondictionary words. Many newspaper cryptograms consist of
famous quotes, with the authors attributed at the end; it is
easy to pick out the proper names in this case.

100
0
Lazy

Linear

EBF

No
SetIntersection

No single letters

Static

Fig. 2. Performance results. In each experiment, only a single parameter
is altered; otherwise, the lazy planner with all optimizations is used. The
lazy planner is typically faster than the EBF, linear, or static planners.
Set-intersection and the use of single-letters show dramatic performance
improvements.

In the case of “the quick brown fox...”, the desired solution
was ranked ﬁrst, out of 1750.
In our implementation, we compute chi2 values rather than
probabilities. The best solutions are kept in a Min-Heap [7]
with a ﬁxed maximum capacity (nominally 500). When the
solution set grows too large, we can quickly extract the worst
solution using the standard heap-remove method.

VII. R ANKING S OLUTIONS
Particularly with short cryptograms, many puzzles have
multiple solutions. If the puzzle contains non-dictionary
words, the problem is compounded since our mechanisms
for dealing with non-dictionary words involve discarding
constraints that would otherwise eliminate many possible
solutions.
For example, the puzzle “The quick brown fox jumped over
the lazy dogs” has 1750 different solutions (see Table I), even
when using our smallest library with 110,916 words. (Keep
in mind that the dictionary contains common abbreviations,
foreign words, and proper names.) With 1750 possibilities,
some computer assistance is needed to bring the most likely
candidates to the operator’s attention.
We do not attempt to enforce rules of grammar; instead,
we compute the posterior probability of each solution using
a table of trigram probabilities. Our probability tables include
the occurrence of spaces, but not punctuation. We found that
trigram tables produced a noticeable improvement in solution
ranking over digrams.
Recall that our algorithm reports partial solutions, in addition to full solutions. In full solutions, all ciphertext letters
are mapped to plaintext letters; in partial solutions, some
letters are not mapped. This presents a small difﬁculty in
producing a posterior probability, since we do not know the
entire outcome. We opt to treat unknown letters by selecting
the trigram entry that matches and has the lowest probability.
This creates a desirable bias towards more-complete solutions
over less-complete ones.
Our trigram ranking system is very effective; the intended
answer usually appears within the ﬁrst handful of solutions.

VIII. R ESULTS
We have performed quantitative measurements of the performance impacts of our improvements over naive-dictionary
based attacks (see Fig. 2). We compare three different dynamic
planners, a static planner, the impact of the set-intersection
optimization, and the impact of allowing a planner to select
a single-letter substitution (rather than requiring a node to
substitute an entire ciphertext word). In each case, we have
modiﬁed only a single experimental variable: when not otherwise speciﬁed, the lazy planner is used with all optimizations.
Each column in the plot represents the total run time for the
algorithm on a set of 21 cryptograms; colors within a column
indicate the run time on a particular puzzle. We have opted for
a cumulative time metric, rather than arithmetic or geometric
mean, since it best models the experience of a user trying a
variety of puzzles. Note that many puzzles are easily solved (in
milliseconds) by all implementations, and their contribution to
the total run time is not readily discerned from the ﬁgure. In
all cases, either the correct plaintext or a readily interpretable
plaintext (with only minor errors due to non-dictionary words)
was recovered.
The performance of the “lazy” planner with our other
improvements is signiﬁcantly better than other permutations.
The choice of dynamic planner (lazy, EBF, or linear) impacts
performance within about a factor of three. The impact of the
set intersection algorithm is even more substantial: it reduces
run time by a factor of 5.45. The ability to substitute a single
letter improves performance by a factor of 6.79. Dynamic
5

Algorithm

Puzzles solved

Time (solved puzzles)

Our time

Hart
Dunn

24%
38%

0.143 s
18.664 s

1.045 s
0.567 s

Our algorithm can be applied to non-English languages
with appropriate modiﬁcation of the parser (the handling of
diacritical marks is more complex in other languages), and an
appropriate dictionary. The solver itself makes no assumptions
about the number of letters in the language, for example.
In the future, we are considering more elaborate statistical
analysis of proposed plaintext solutions. These improvements
could not only improve the posterior rankings of candidates,
but could direct the search towards more probable candidates
in the ﬁrst place. On the other hand, our existing algorithm
only struggles on short puzzles, and short puzzles often contain
statistically unusual puzzles; the added complexity may not be
worth the effort. This is an area for future experimentation.

Fig. 3. Comparison to other algorithms. Cumulative run times for puzzles
successfully solved are given, versus the cumulative run time for our approach
on the same set of puzzles. Hart and Dunn’s methods solved only a fraction
of the puzzles in our benchmark set. Hart’s method is typically very fast, and
in small number of cases that it is successful, is noticeably faster than our
approach. Dunn’s method is both less successful and slower than ours. Note
that our method successfully solved all the puzzles in the benchmark set.

planning is the biggest win of all, improving performance
by a factor of over 13. We do not claim that all of these
improvements are cumulative, but the point remains that
dramatic speedups have been obtained.
We also compared our algorithm’s performance to two
other implementations whose source code was readily available: Hart’s method [5] which is a dictionary attack with
an intentionally minimal dictionary, and a more conventional
dictionary attack by K. Dunn [6] using a 133k word dictionary
and employing a simple static planner. Neither method does
well on our benchmark set (see Fig. 3): they produce gibberish
on the majority of the test cases. Hart’s algorithm fails on 76%
of our tests, though when it does work, it is generally faster
than our algorithm. Dunn’s algorithm fails on 62%, and is
typically slower than our algorithm.
We note that the average difﬁculty of our benchmark puzzles
is quite high: many puzzles contain non-dictionary words, and
some of the puzzles have only very common letter patterns.

ACKNOWLEDGEMENTS
We are indebted to Pete Wiedman, an avid solver and creator
of cryptograms, for his extensive testing of our software.
His reports of performance anomalies led to some of the
improvements described in this paper. He also contributed
extensively to the dictionary.
R EFERENCES
[1] S. Peleg and A. Rosenfeld, “Breaking substitution ciphers using a
relaxation algorithm,” Commun. ACM, vol. 22, no. 11, pp. 598–605, 1979.
[2] R. Spillman, M. Janssen, B. Nelson, and M. Kepner, “Use of a genetic algorithm in the cryptanalysis of simple substitution ciphers,” Cryptologia,
vol. XVII, no. 1, pp. 31–44, 1993.
[3] T. Jakobsen, “A fast method for cryptanalysis of substitution ciphers,”
Cryptologia, vol. 19, no. 3, pp. 265–274, 1995. [Online]. Available:
citeseer.ist.psu.edu/jakobsen95fast.html
[4] M. Lucks, “A constraint satisfaction algorithm for the automated
decryption of simple substitution ciphers,” in CRYPTO, 1988, pp.
132–144. [Online]. Available: citeseer.ist.psu.edu/lucks88constraint.html
[5] G. W. Hart, “To decode short cryptograms,” Commun. ACM, vol. 37,
no. 9, pp. 102–108, 1994.
[6] K. Dunn, “Ciphergram solution assistant,” 1997. [Online]. Available:
”http://www.gtoal.com/wordgames/kdunn/CSA unix/csa.htm”
[7] T. H. Cormen, C. E. Leiserson, and R. L. Rivest, Introduction to
Algorithms, 1st ed., ser. The MIT Electrical Engineering and Computer
Science Series. The MIT Press, 1990.

IX. AVAILABILITY
Our software, including source code, is available
under the GNU Public License (GPL). The project’s
website is at http://www.blisstonia.com/software/Decrypto.
An
on-line
version
is
available
at
http://www.blisstonia.com/software/WebDecrypto.
X. C ONCLUSION
We have presented a robust dictionary-based attack on simple substitution ciphers. Our approach employs dynamic planning, set-intersection optimizations, and several mechanisms
for dealing with non-dictionary words. Multiple solutions are
ranked according to their posterior probability using a table
of trigram probabilities. We also employ an effective yet
simple data structure that allows candidate lists to be efﬁciently
maintained.

6