# Research papers

- Zhong. "Cryptanalysis of Homophonic Substitution Cipher Using Hidden Markov Models." 2016
- Abstract: We investigate the effectiveness of a Hidden Markov Model (HMM) with random restarts as a means of breaking a homophonic substitution cipher. Based on extensive experiments, we find that such an HMM-based attack outperforms a previously developed nested hill climb approach, particularly when the ciphertext message is short. We then consider a combination cipher, consisting of a homophonic substitution and a column transposition. We develop and analyze an attack on such a cipher. This attack employs an HMM (with random restarts), together with a hill climb to recover the column permutation. We show that this attack can succeed on relatively short ciphertext messages. Finally, we test this combined attack on the unsolved Zodiac 340 cipher.

- Vobbilisetty, Rohit, et al. "Classic cryptanalysis using hidden Markov models." Cryptologia (2016): 1-28.
- Abstract: In this article, the authors present a detailed introduction to hidden Markov models (HMM). They then apply HMMs to the problem of solving simple substitution ciphers, and they empirically determine the accuracy as a function of the ciphertext length and the number of random restarts. Application to homophonic substitutions and other classic ciphers is briefly considered.

- Prajapat, Shaligram, et al. "Cryptic Mining in Light of Artificial Intelligence." (2015)
- Abstract: The analysis of cryptic text is a hard problem, and there is no fixed algorithm for generating plain-text from cipher text. Human brains do this intelligently. The intelligent cryptic analysis process needs learning algorithms, co-operative effort of cryptanalyst and mechanism of knowledge based inference engine. This information of knowledge base will be useful for mining data (plain-text, key or cipher text plain-text relationships), classification of cipher text based on enciphering algorithms, key length or any other desirable parameters, clustering of cipher text based on similarity and extracting association rules for identifying weaknesses of cryptic algorithms. This categorization will be useful for placing given cipher text into a specific category or solving difficult level of cipher text to plain text conversion process. This paper elucidates the cipher text to plain text process first, then utilizes it to create a framework for AI-enabled-Cryptanalysis system. The process demonstrated in this paper attempts to analyze captured cipher from scratch. The system design elements presented in the paper give all hints and guidelines for development of AI enabled Cryptic analysis tool.

- Nuhn, Malte, Julian Schamper, and Hermann Ney. "UNRAVEL—A Decipherment Toolkit." ACL-IJCNLP Proceedings Volume 2: Short Papers: 549. (2015)
- Abstract: "In this paper we present the UNRAVEL toolkit: It implements many of the recently published works on decipherment, including decipherment for deterministic ciphers like e.g. the ZODIAC-408 cipher and Part two of the BEALE ciphers, as well as decipherment of probabilistic ciphers and unsupervised training for machine translation. It also includes data and example configuration files so that the previously published experiments are easy to reproduce."

- Vobbilisetty, Rohit. "Cryptanalysis of Classic Ciphers Using Hidden Markov Models." (2015).
- Abstract: "Cryptanalysis is the study of identifying weaknesses in the implementation of cryptographic algorithms. This process would improve the complexity of such algorithms, making the system secure. In this research, we apply Hidden Markov Models (HMMs) to classic cryptanalysis problems. We show that with sufficient ciphertext, an HMM can be used to break a simple substitution cipher. We also show that when limited ciphertext is available, using multiple random restarts for the HMM increases our chance of successful decryption."

- Serengil, Sefik Ilkin, and Murat Akin. "Attacking Turkish texts encrypted by homophonic cipher." Proceedings of the 10th WSEAS International Conference on Electronics, Hardware, Wireless and Optical Communications. 2014.
- Abstract: "Homophonic cipher is developed as an alternative to substitution cipher to compose more resistant ciphertexts against to the frequency analysis attacks. Nevertheless, Attacking with taking advantage of characteristic vulnerabilities of the language is probable. In this paper, characteristic vulnerabilities of the Turkish Language for homophonic cipher are exposed and attacking approaches are illustrated."

- Nuhn, Malte, Julian Schamper, and Hermann Ney. "Improved Decipherment of Homophonic Ciphers." (2014)
- Abstract: "In this paper, we present two improvements to the beam search approach for solving homophonic substitution ciphers presented in Nuhn et al. (2013): An improved rest cost estimation together with an optimized strategy for obtaining the order in which the symbols of the cipher are deciphered reduces the beam size needed to successfully decipher the Zodiac-408 cipher from several million down to less than one hundred: The search effort is reduced from several hours of computation time to just a few seconds on a single CPU. These improvements allow us to successfully decipher the second part of the famous Beale cipher (see (Ward et al., 1885) and e.g. (King, 1993)): Having 182 different cipher symbols while having a length of just 762 symbols, the decipherment is way more challenging than the decipherment of the previously deciphered Zodiac-408 cipher (length 408, 54 different symbols). To the best of our knowledge, this cipher has not been deciphered automatically before."

- Yi, Jeffrey. "Cryptanalysis of Homophonic Substitution-Transposition Cipher." (2014).
- Abstract: "Homophonic substitution ciphers employ a one-to-many key to encrypt plaintext. This is in contrast to a simple substitution cipher where a one-to-one mapping is used. The advantage of a homophonic substitution cipher is that it makes frequency analysis more difficult, due to a more even distribution of plaintext statistics. Classic transposition ciphers apply diffusion to the ciphertext by swapping the order of letters. Combined transposition-substitution ciphers can be more challenging to cryptanalyze than either cipher type separately. In this research, we propose a technique to break a combined simple substitution-column transposition cipher. We also consider the related problem of breaking a combination homophonic substitution-column transposition cipher. These attacks extend previous work on substitution ciphers. We thoroughly analyze our attacks and we apply the homophonic substitution-columnar transposition attack to the unsolved Zodiac-340 cipher."

- Nuhn, Malte, and Kevin Knight. "Cipher Type Detection." Proc. EMNLP, 2014.
- Abstract: "Manual analysis and decryption of enciphered documents is a tedious and error prone work. Often—even after spending large amounts of time on a particular cipher—no decipherment can be found. Automating the decryption of various types of ciphers makes it possible to sift through the large number of encrypted messages found in libraries and archives, and to focus human effort only on a small but potentially interesting subset of them. In this work, we train a classifier that is able to predict which encipherment method has been used to generate a given ciphertext. We are able to distinguish 50 different cipher types (specified by the American Cryptogram Association) with an accuracy of 58.5%. This is a 11.2% absolute improvement over the best previously published classifier."

- Dhavare, Amrapali, Richard M. Low, and Mark Stamp. "Efficient cryptanalysis of homophonic substitution ciphers." Cryptologia 37.3 (2013): 250-281.
- Abstract: "Substitution ciphers are among the earliest methods of encryption. Examples of classic substitution ciphers include the well-known simple substitution and the less well-known homophonic substitution. Simple substitution ciphers are indeed simple— both in terms of their use and their cryptanalysis. Homophonic substitutions are also easy to use, but far more challenging to break. Even with modern computing technology, homophonic substitutions can present a significant cryptanalytic challenge. This paper focuses on the design and implementation of an efficient algorithm to break homophonic substitution ciphers. We employ a nested hill climb approach that generalizes the fastest known attack on simple substitution ciphers. We test our algorithm on a wide variety of homophonic substitutions and provide success rates as a function of both the ciphertext alphabet size and ciphertext length. Finally, we apply our technique to the “Zodiac 340” cipher, which is an unsolved message created by the infamous Zodiac killer."

- Nuhn, Malte, and Hermann Ney. "Decipherment Complexity in 1:1 Substitution Ciphers." ACL (1). 2013.
- Abstract: "In this paper we show that even for the case of 1:1 substitution ciphers—which encipher plaintext symbols by exchanging them with a unique substitute—finding the optimal decipherment with respect to a bigram language model is NP-hard. We show that in this case the decipherment problem is equivalent to the quadratic assignment problem (QAP). To the best of our knowledge, this connection between the QAP and the decipherment problem has not been known in the literature before."

- Nuhn, Malte, Julian Schamper, and Hermann Ney. "Beam Search for Solving Substitution Ciphers." ACL (1). 2013.
- Abstract: "In this paper we address the problem of solving substitution ciphers using a beam search approach. We present a conceptually consistent and easy to implement method that improves the current state of the art for decipherment of substitution ciphers and is able to use high order n-gram language models. We show experiments with 1:1 substitution ciphers in which the guaranteed optimal solution for 3-gram language models has 38.6% decipherment error, while our approach achieves 4.13% decipherment error in a fraction of time by using a 6-gram language model. We also apply our approach to the famous Zodiac-408 cipher and obtain slightly better (and near to optimal) results than previously published. Unlike the previous state-of-the-art approach that uses additional word lists to evaluate possible decipherments, our approach only uses a letter-based 6-gram language model. Furthermore we use our algorithm to solve large vocabulary substitution ciphers and improve the best published decipherment error rate based on the Gigaword corpus of 7.8% to 6.0% error rate."

- Berg-Kirkpatrick, Taylor, and Dan Klein. "Decipherment with a Million Random Restarts." EMNLP. 2013.
- Abstract: "This paper investigates the utility and effect of running numerous random restarts when using EM to attack decipherment problems. We find that simple decipherment models are able to crack homophonic substitution ciphers with high accuracy if a large number of random restarts are used but almost completely fail with only a few random restarts. For particularly difficult homophonic ciphers, we find that big gains in accuracy are to be had by running upwards of 100K random restarts, which we accomplish efficiently using a GPU-based parallel implementation. We run a series of experiments using millions of random restarts in order to investigate other empirical properties of decipherment problems, including the famously uncracked Zodiac 340."

- Ravi, Sujith, and Kevin Knight. "Bayesian inference for Zodiac and other homophonic ciphers." Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies-Volume 1. Association for Computational Linguistics, 2011.
- Abstract: "We introduce a novel Bayesian approach for deciphering complex substitution ciphers. Our method uses a decipherment model which combines information from letter n-gram language models as well as word dictionaries. Bayesian inference is performed on our model using an efficient sampling technique. We evaluate the quality of the Bayesian decipherment output on simple and homophonic letter substitution ciphers and show that unlike a previous approach, our method consistently produces almost 100% accurate decipherments. The new method can be applied on more complex substitution ciphers and we demonstrate its utility by cracking the famous Zodiac-408 cipher in a fully automated fashion, which has never been done before."

- Dhavare, Amrapali. Efficient attacks on homophonic substitution ciphers. Diss. San Jose State University, 2011.
- Abstract: "Substitution ciphers are one of the earliest types of ciphers. Examples of classic substitution ciphers include the well-known simple substitution and the less well-known homophonic substitution. Although simple substitution ciphers are indeed simple - both in terms of their use and attacks; the homophonic substitution ciphers are far more challenging to break. Even with modern computing technology, homophonic substitution ciphers remain a significant challenge. This project focuses on designing, implementing, and testing an efficient attack on homophonic substitution ciphers. We use an iterative approach that generalizes the fastest known attack on simple substitution ciphers and also employs a heuristic search technique for improved efficiency. We test our algorithm on a wide variety of homophonic substitution ciphers. Finally, we apply our technique to the “Zodiac 340” cipher, which is an unsolved ciphertext created in the 1970s by the infamous Zodiac killer."

- Corlett, Eric, and Gerald Penn. "An exact A* method for deciphering letter-substitution ciphers." Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 2010.
- Abstract: "Letter-substitution ciphers encode a document from a known or hypothesized language into an unknown writing system or an unknown encoding of a known writing system. It is a problem that can occur in a number of practical applications, such as in the problem of determining the encodings of electronic documents in which the language is known, but the encoding standard is not. It has also been used in relation to OCR applications. In this paper, we introduce an exact method for deciphering messages using a generalization of the Viterbi algorithm. We test this model on a set of ciphers developed from various web sites, and find that our algorithm has the potential to be a viable, practical method for efficiently solving decipherment problems."

- Raddum, Håvard, and Marek Sýs. "The zodiac killer ciphers." Tatra Mountains Mathematical Publications 45.1 (2010): 75-91.
- Abstract: "We describe the background of the Zodiac killer’s cipher, and present a strategy for how to attack the unsolved Z340 cipher. We present evidence that there is a high degree of non-randomness in the sequence of ciphertext symbols in this cipher, suggesting it has been constructed in a systematic way. Next, we use this information to design a tool for solving the Zodiac ciphers. Using this tool we are able to re-solve the known Z408 cipher."

- Erickson, Derrick, and Michael Hausman. "A Dominant Gene Genetic Algorithm for a Substitution Cipher in Cryptography." (2009)
- Abstract: "This study is about breaking the substitution cipher in cryptography. The substitution cipher replaces every letter in a document with a different letter. This makes the document unreadable unless one can find the key to decrypt the document. A genetic algorithm is a way to combine the Darwin theory and genetics to converge on the solution after many generations or iterations. This genetic algorithm will decode the message by using unigram, bigram, trigram, and four-gram statistics to build the key. These statistics will not only determine the cost of a chromosome, how "good" a solution is, but will determine whether or not a gene is dominant. The mating function focuses on building a chromosome with the dominant genes. All dominant genes are passed to the child chromosomes to quickly converge on a solution. The mutation function searches "local" solutions by manipulating genes to make the overall solution better. After many generations, the encrypted message should look close to the original message."

- Basavaraju, Pallavi Kanagalakatte. "Heuristic Search Cryptanalysis of the Zodiac 340 Cipher." (2009).
- Abstract: "The Zodiac 340 cipher is one of the most famous unsolved ciphers of all time. It was allegedly written by “the Zodiac”, whose identity remains unknown to date. The Zodiac was a serial killer who killed a number of people in and around the San Francisco Bay area during the 1960s. He is confirmed to have seven victims, two of whom survived [1], although in taunting letters to the news media he claims to have killed 37 people. During this time, an encrypted message known as the Zodiac 408 cipher was mailed to 3 different newspapers in the San Francisco bay area. This was a homophonic cipher and was successfully decoded. Within a few days he sent out another cipher that was 340 characters long [4]. This cipher, which is known as the Zodiac 340 cipher, is unsolved to date. Many cryptologists have tried to crack this cipher but with no success. In this project, we implemented a novel genetic algorithm in an attempt to crack the Zodiac 340 cipher. We have attacked the cipher as a homophonic cipher where each cipher symbol is mapped to only a single English letter, but each English letter can be mapped to multiple cipher symbols. In the genetic algorithm, we implemented two variants of crossover: simple and intelligent. The simple crossover looks for commonly occurring substrings, without looking for actual English words in a putative decrypt. The intelligent crossover counts the number of actual English words that can be found in a putative decrypt when evaluating each solution. We implemented a dictionary lookup for quickly identifying English words for the intelligent crossover. The genetic algorithm using a combination of simple and intelligent crossovers was able to identify many English words in various putative decrypts but no solution was found."

- Ravi, Sujith, and Kevin Knight. "Attacking decipherment problems optimally with low-order n-gram models." Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2008.
- Abstract: "We introduce a method for solving substitution ciphers using low-order letter n-gram models. This method enforces global constraints using integer programming, and it guarantees that no decipherment key is overlooked. We carry out extensive empirical experiments showing how decipherment accuracy varies as a function of cipher length and n-gram order. We also make an empirical investigation of Shannon’s (1949) theory of uncertainty in decipherment."

- Oranchak, David. "Evolutionary algorithm for decryption of monoalphabetic homophonic substitution ciphers encoded as constraint satisfaction problems." Proceedings of the 10th annual conference on Genetic and evolutionary computation. ACM, 2008.
- Abstract: "A homophonic substitution cipher maps each plaintext letter of a message to one or more ciphertext symbols [4]. Monoalphabetic homophonic ciphers do not allow ciphertext symbols to map to more than one plaintext letter. Homophonic ciphers conceal language statistics in the enciphered messages, making statistical-based attacks more difficult. We present a dictionary-based attack using a genetic algorithm that encodes solutions as plaintext word placements subjected to constraints imposed by the cipher symbols. We test the technique using a famous cipher (with a known solution) created by the Zodiac serial killer. We present several successful decryption attempts using dictionary sizes of up to 1,600 words."

- Kulich, Martin. "Cracking of the Substitution Ciphers in Classical Cryptography." Faculty of Information Technology (FIT), Brno University of Technology (2008, approx.)
- Abstract: "The thesis provides a tool for automatic monoalphabetic substitution ciphers cracking using most common words dictionary. It is based on some previous solutions for English language which were not designed for cracking ciphertext without word divisions. This thesis is focused on cracking ciphertexts without word divisions. It also looks at specifics of Czech language - declension and its influence on dictionary size."

- Banks, Michael James. "A Search-Based Tool for the Automated Cryptanalysis of Classical Ciphers." The University of York Department of Computer Science May (2008).
- Abstract: "The field of classical cryptography encompasses various forms of simple pen-and-paper ciphers that were in widespread use until the early 20th century. Although these ciphers have long been surpassed by modern cryptographic systems, they can still be challenging to break using manual methods alone. Indeed, there exist several well-known classically-encrypted cryptograms which, at present, remain unbroken. Automated cryptanalysis of classical ciphers has been carried out in existing research, using optimisation techniques in conjunction with appropriate heuristics to evaluate the validity of decryptions. However, this work is largely limited to a few kinds of simple ciphers and the results obtained by some researchers have been criticised by others as being suboptimal. Building on the approaches used by earlier work, a flexible software tool is constructed to perform automated cryptanalysis on texts encrypted with various kinds of classical ciphers. The tool is expressly designed to support the tailoring of cryptanalysis to particular styles of ciphertext, featuring an extensible framework for defining ciphers and supporting different evaluation heuristics and optimisation algorithms. The efficacy of the tool is investigated using a selection of sample ciphertexts and unsolved cryptograms. Topics for further research into automated cryptanalysis are proposed."

- Dao, Thang. Analysis of the zodiac 340-cipher. Diss. San Jose State University, 2007.
- Abstract: "The main purpose of this project is to determine whether the method used in the Zodiac 340-cipher (Z340) letter was a homophonic substitution, an improved version of the well-known simple substitution. A homophonic substitution employs a "one-to-many mapping" technique, as opposed to the "one-to-one mapping" of a simple substitution. Due to the complexity of the homophonic substitution, an exhaustive solution to the Z340 is not possible in a feasible amount of time. This research proposes an approach to implement an automated solution to a homophonic substitution based on a hill-climb technique. The software will be used to attempt to solve the Z340. Even if the software fails to solve the Z340, useful conclusions could be drawn. The objective is to reduce the number of methods that could have been used to encrypt the original message."

- Samuel, D. "Code breaking in law enforcement: A 400-year history." Forensic Science Communications 8.2 (2006).
- Abstract: "The introduction profiles a 2004 case in which the FBI decrypted an enciphered message a jailed man wrote to his brother, which contained incriminating references to hiding evidence and moving the victim's body. The second case involved Unabomber Theodore Kaczynski (1978-1996), who kept notebooks in which he logged his crimes in a handwritten numerical code in both English and Spanish. The decryption and translation of these notebooks sealed the case against him. The third case reported involved the Zodiac killer, who has yet to be identified. He wrote ciphers related to his serial murders for publication in newspapers. Zodiac's most famous cipher was broken within a few hours by a husband and wife team of amateur code breakers. Other Zodiac ciphers, however, remain unsolved. The fourth case mentioned is the "Hollow Nickel Case" (1953-57), which involved a newspaper boy's accidental discovery of a microphotograph of a numbered code inside a hollow nickel that split when he dropped it on a sidewalk. The numbered code was a Soviet spy's one-time pad encryption system. The code was not broken until 1957, after a Soviet KGB officer defected to the United States. This eventually led to the conviction of a Soviet spy known by his alias of Rudolf Abel. Other cases include the work of cryptanalysts William and Elizabeth Friedman; the decryption of coded telegrams related to the Teapot Dome Scandal of 1924; and ciphers used in messages between conspirators in Lincoln's assassination and the Confederate government of Jefferson Davis. Also described are the use of ciphers in communications between Mary, Queen of Scots, and her coconspirators in the plot to kill her cousin Elizabeth, Queen of England. 18 references"

- Garici, Mohamed Amine, and Habiba Drias. "Cryptanalysis of substitution ciphers using scatter search." Artificial Intelligence and Knowledge Engineering Applications: A Bioinspired Approach. Springer Berlin Heidelberg, 2005. 31-40.
- Abstract: "This paper presents an approach for the automated cryptanalysis of substitution ciphers based on a recent evolutionary metaheuristic called Scatter Search. It is a population-based metaheuristic founded on a formulation proposed two decades ago by Fred Glover. It uses linear combinations on a population subsets to create new solutions while other evolutionary approaches like genetic algorithms resort to randomization. First, we implement the procedures of the scatter search for the cryptanalysis of substitution ciphers. This implementation can be used as a framework for solving permutation problems with scatter search. Then, we test the algorithm and show the importance of the improvement method and the contribution of subset types. Finally, we compare its performances with those of a genetic algorithm."

- Delman, Bethany. "Genetic algorithms in cryptography." (2004).
- Abstract: "Genetic algorithms (GAs) are a class of optimization algorithms. GAs attempt to solve problems through modeling a simplified version of genetic processes. There are many problems for which a GA approach is useful. It is, however, undetermined if cryptanalysis is such a problem. Therefore, this work explores the use of GAs in cryptography. Both traditional cryptanalysis and GA-based methods are implemented in software. The results are then compared using the metrics of elapsed time and percentage of successful decryptions. A determination is made for each cipher under consideration as to the validity of the GA-based approaches found in the literature. In general, these GA-based approaches are typical of the field. Of the genetic algorithm attacks found in the literature, totaling twelve, seven were re-implemented. Of these seven, only three achieved any success. The successful attacks were those on the transposition and permutation ciphers by Matthews, Clark, and Gründlingh and Van Vuuren, respectively. These attacks were further investigated in an attempt to improve or extend their success. Unfortunately, this attempt was unsuccessful, as was the attempt to apply the Clark attack to the monoalphabetic substitution cipher and achieve the same or indeed any level of success. Overall, the standard fitness equation genetic algorithm approach, and the scoreboard variant thereof, are not worth the extra effort involved. Traditional cryptanalysis methods are more successful, and easier to implement. While a traditional method takes more time, a faster unsuccessful attack is worthless. The failure of the genetic algorithm approach indicates that supplementary research into traditional cryptanalysis methods may be more useful and valuable than additional modification of GA-based approaches."
- List of attacks mentioned in the paper

- Pimenidis, Lexi. "HOMOPHONE KRYPTOGRAPHIE." (2002) (German-language paper. Mentions Zodiac codes.)
- Abstract (translated from German): "Homophonic substitution encryption can be used when traditional algorithms no longer offer enough security, and as a basis for stronger encryption. By introducing random elements and using sets instead of single characters, cryptanalysis is meant to be made more difficult. Using simple examples, this work shows the possibilities, limits, and extensions of homophonic cryptography."

- Jakobsen, Thomas. "A fast method for cryptanalysis of substitution ciphers." Cryptologia 19.3 (1995): 265-274.
- Abstract: "It is possible to cryptanalyze simple substitution ciphers (both mono and polyalphabetic) by using a fast algorithm based on a process where an initial key guess is refined through a number of iterations. In each step the plaintext corresponding to the current key is evaluated and the result used as a measure of how close we are in having discovered the correct key. It turns out that only knowledge of the digram distribution of the ciphertext and the expected digram distribution of the plaintext is necessary to solve the cipher. The algorithm needs to compute the distribution matrix only once and subsequent plaintext evaluation is done by manipulating this matrix only, and not by decrypting the ciphertext and reparsing the resulting plaintext in every iteration. The paper explains the algorithm and it shows some of the results obtained with an implementation in Pascal. A generalized version of the algorithm can be used for attacking other simple ciphers as well."

- Forsyth, William S., and Reihaneh Safavi-Naini. "Automated cryptanalysis of substitution ciphers." Cryptologia 17.4 (1993): 407-418.
- Abstract: "We use simulated annealing to provide an automated method for the cryptanalysis of mono-alphabetic substitution ciphers. We prove the convergence of the algorithm and study its performance for a specific cooling schedule. We discuss the merits of this approach and show that it provides a simple, fast and elegant solution to the cryptanalysis problem which is also promising for more complex types of block ciphers."

- King, John C., and Dennis R. Bahler. "An algorithmic solution of sequential homophonic ciphers." Cryptologia 17.2 (1993): 148-165.
- Abstract: "REMOVE_HOMOPHONES is a new cryptanalytic algorithm for the reduction of a sequential homophonic cipher without word divisions into a simple substitution cipher [8]. Sets of homophones, defined in the cipher alphabet, are detected algorithmically, without the use of either frequency analysis or trial-and-error backtracking, in a ciphertext-only attack. Given the output of REMOVE_HOMOPHONES, a simple substitution cipher, probabilistic relaxation [9,13] can complete the algorithmic solution of sequential homophonic ciphers without word divisions."

- Spillman, Richard, et al. "Use of a genetic algorithm in the cryptanalysis of simple substitution ciphers." Cryptologia 17.1 (1993): 31-44.
- Abstract: "This paper considers a new approach to cryptanalysis based on the application of a directed random search algorithm called a genetic algorithm. It is shown that such an algorithm could be used to discover the key for a simple substitution cipher."

- Stahl, Fred A. "A homophonic cipher for computational cryptography." Proceedings of the June 4-8, 1973, national computer conference and exposition. ACM, 1973.
- Abstract: "Computational cryptography deals with the storage and processing of sensitive information in computer systems by enciphering. Sensitive information is information that for one reason or another must be protected from being disclosed to individuals without proper authorization. The need for systems to be secure from unauthorized access to sensitive information has been well documented. Cryptographic techniques appear to be one of the most simple and secure methods of providing this much needed protection."

- Language models - A list of a few papers that involve the use of probabilistic language models for cryptanalysis.