Difference between revisions of "Corpus search results"
Line 1: | Line 1: | ||
− | + | This experiment involved a systematic search for words and phrases shared between Zodiac's correspondences and a large corpus. The content of Zodiac's correspondences was reduced to a stream of alphabetic characters with no spacing or punctuation. The corpus was similarly reduced. Then, all possible substrings of each of Zodiac's correspondences were compared to items in the corpus, and matches were ordered from longest to shortest. Matches of the same length were ordered from most frequently found to least frequently found. | |
− | [[Corpus Search Results - Page 2]] | + | The corpus used for this experiment consisted of the nearly 30,000 books on the [http://www.gutenberg.org/wiki/Gutenberg:The_CD_and_DVD_Project Project Gutenberg April 2010 DVD].
+ | |||
+ | * [[Corpus Search Results - Page 1]] | ||
+ | * [[Corpus Search Results - Page 2]] |
Revision as of 06:15, 23 June 2012
This experiment involved a systematic search for words and phrases shared between Zodiac's correspondences and a large corpus. The content of Zodiac's correspondences was reduced to a stream of alphabetic characters with no spacing or punctuation. The corpus was similarly reduced. Then, all possible substrings of each of Zodiac's correspondences were compared to items in the corpus, and matches were ordered from longest to shortest. Matches of the same length were ordered from most frequently found to least frequently found.
The corpus used for this experiment consisted of the nearly 30,000 books on the Project Gutenberg April 2010 DVD.
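The procedure described above can be illustrated with a minimal Python sketch. It is not the code used for the experiment. The function names, the lowercasing step, the `min_len` and `max_len` limits, and the choice to count non-overlapping occurrences across all corpus texts are assumptions made for illustration. A naive scan like this is only practical for small inputs; a corpus of tens of thousands of books would presumably need some form of index.

```python
import re
from collections import Counter


def normalize(text):
    """Reduce text to a stream of letters a-z with no spacing or punctuation.

    Lowercasing is an assumption; the original only says "alphabet characters".
    """
    return re.sub(r"[^a-z]", "", text.lower())


def shared_substrings(letter, corpus_texts, min_len=4, max_len=40):
    """Return (substring, count) pairs shared by one letter and the corpus.

    Results are ordered from longest to shortest; substrings of equal length
    are ordered from most to least frequently found. min_len and max_len are
    hypothetical limits added to keep the naive search small.
    """
    target = normalize(letter)
    corpus = [normalize(text) for text in corpus_texts]

    counts = Counter()
    seen = set()
    longest = min(max_len, len(target))
    for length in range(min_len, longest + 1):
        for start in range(len(target) - length + 1):
            sub = target[start:start + length]
            if sub in seen:
                continue  # each distinct substring is looked up once
            seen.add(sub)
            # str.count counts non-overlapping occurrences in each text
            total = sum(doc.count(sub) for doc in corpus)
            if total:
                counts[sub] = total

    # Sort by length (descending), then by frequency (descending)
    return sorted(counts.items(), key=lambda item: (-len(item[0]), -item[1]))


if __name__ == "__main__":
    # Made-up sample inputs, not taken from the actual letters or corpus
    sample_letter = "I like killing"
    sample_corpus = ["He was killing time.", "Killing time; I like it."]
    for substring, count in shared_substrings(sample_letter, sample_corpus, min_len=5):
        print(len(substring), count, substring)
```

Because spacing is removed before matching, a result can span word boundaries or start in the middle of a word, so long matches are the ones most likely to correspond to real shared phrases.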