(boing!) Cnoocy Mosque O'Witz

[personal profile] cnoocy
On the morning of Saturday, November 27, I was thinking about NaNoGenMo. NaNoGenMo is National Novel Generation Month, which challenges participants to use technology to generate a novel (very loosely defined) of at least 50,000 words during the month of November. While doing so, I came up with an idea that I wanted to attempt.

You may notice that coming up with the idea on the 27th gave me four days to complete the task, two of them workdays. But I succeeded!


What I made


To start with, here's a picture:
A crossword grid labeled "Grid #4107: 9-5-5-5-5-5-5-5". There are Across and Down entries listed next to the crossword.

There are 6,170 of these, in three volumes. The title of the work as a whole is Reference Settings for all 15x15 Contiguous Black-Square Cryptic Grids with Word Lengths of 5, 7, and 9 Letters, and the three volumes are Volume I: 1-Across Length = 5, Volume II: 1-Across Length = 7, and Volume III: 1-Across Length = 9. (These links go to a GitHub display page, but there's a download link there if you like.)

The idea is that this is an imaginary 1970s project from IBM or some similar company, to produce a sample word setting for every possible cryptic crossword grid within a set of constraints:

  1. Grids are 15x15 and use black squares rather than bars

  2. Every even letter is unchecked

  3. Words may be of 5, 7, or 9 letters

  4. Grids must be contiguous

  5. Grids have rotational symmetry



This leads to grids like the one you see above, with eight rows and eight columns of words, each split either at the middle, just before it, or just after it. Since the grids have rotational symmetry, each can be fully specified by just listing the lengths of the first word in the first four rows and columns. So that's the code you see for the grid above: the first across clues in the first four rows have length 9, 5, 5, and 5, and the first down clues all have length 5. The Across and Down listings are one possible set of entries that fit in the grid. You can fill out the whole thing if you like, but the easy thing to see is that the first eight Down entries start with every other letter in the first two Across entries. There are some other touches on the actual PDF: running heads, page numbers, a title page, even a documentation joke on page 2.
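To make the grid code concrete, here is a minimal Python sketch of how an eight-number code could be expanded into the word lengths of all eight rows and all eight columns. The function name `expand_code` is made up for illustration and is not taken from the repositories; the sketch only assumes what the constraints say, namely that each line holds two words separated by one black square and that the grid looks the same when rotated 180 degrees.

```python
def expand_code(code):
    """Expand a grid code into (first, second) word lengths per row and column.

    Hypothetical helper for illustration; not the author's actual code.
    """
    firsts = [int(n) for n in code.split("-")]
    if len(firsts) != 8 or any(n not in (5, 7, 9) for n in firsts):
        raise ValueError(f"not a valid grid code: {code!r}")

    def all_eight(first_four):
        # Each line is two words plus one black square: first + 1 + second = 15.
        top = [(n, 14 - n) for n in first_four]
        # A 180-degree rotation reverses the order of the lines and swaps
        # the two words within each line.
        bottom = [(second, first) for first, second in reversed(top)]
        return top + bottom

    # The first four numbers describe rows, the last four describe columns.
    return all_eight(firsts[:4]), all_eight(firsts[4:])


across, down = expand_code("9-5-5-5-5-5-5-5")
print(across)
print(down)
```

For the grid pictured above, the top row is split 9 + 5, so the bottom row has to be split 5 + 9, and so on inward.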



How I made it


tl;dr: https://github.com/mmoskowitz/NaNoGenMo2021 and https://github.com/mmoskowitz/grimdrake, but here are more details.

One of the reasons I started this project was that I knew I could use Grimdrake. Grimdrake is now up on GitHub, and it's a tool for working with grid puzzles. I'll be writing up a more formal release of Grimdrake once I polish it some, but the short version is that Grimdrake is the program I wrote when I realized I had no idea how to create certain types of grid by hand. And then it expanded to include code for filling those grids or any other. If you've solved a puzzle I've written since 2004 that has letters that interact, Grimdrake helped create that puzzle. But in creating this, I used it more as a separate library, which led to some polishing of the code in that library.

Grimdrake isn't the whole story, though. The code in my NaNoGenMo2021 repository is what does the iteration for creating the grids that were then filled by Grimdrake.* The iteration through grids uses the code I mentioned earlier, but the final result doesn't contain 3^8 (6,561) grids, because the constraints include a requirement that the grids be contiguous. So before I filled the grids I checked for that. I could have implemented a contiguousness algorithm, but it was easier to just check for the specific configurations that could make a grid non-contiguous.** Once I had the grid generation working, I passed each one to Grimdrake to fill, and then wrote the resulting grid along with its number, its code, and its entries to a target directory in a file named by its code. This task started around 9 pm on Sunday night, and I didn't pick up the project (other than watching the task's progress) until it completed a little before 5 pm on Monday. Convenient!
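As an illustration of the enumeration and the contiguity filter, here is a rough Python sketch. It is not the code in the repository: it uses a general flood fill rather than the configuration checks described above, and the names `build_grid` and `is_contiguous` are invented for this example. It also assumes a layout in which words sit on the even-numbered rows and columns (counting from 0), so that every square on an odd row and an odd column is black.

```python
from itertools import product

SIZE = 15


def build_grid(firsts):
    """Return a 15x15 grid of booleans (True = white) for eight first-word lengths."""
    # Assumed layout: only squares on an odd row AND an odd column start out black.
    white = [[r % 2 == 0 or c % 2 == 0 for c in range(SIZE)] for r in range(SIZE)]
    for i, n in enumerate(firsts[:4]):
        r = 2 * i
        white[r][n] = False                        # gap after the first Across word
        white[SIZE - 1 - r][SIZE - 1 - n] = False  # its 180-degree partner
    for i, n in enumerate(firsts[4:]):
        c = 2 * i
        white[n][c] = False                        # gap after the first Down word
        white[SIZE - 1 - n][SIZE - 1 - c] = False
    return white


def is_contiguous(white):
    """Flood fill from one white square and check that it reaches all of them."""
    cells = {(r, c) for r in range(SIZE) for c in range(SIZE) if white[r][c]}
    start = next(iter(cells))
    seen = {start}
    stack = [start]
    while stack:
        r, c = stack.pop()
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if nxt in cells and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen == cells


# Every code is eight choices from three lengths, so there are 3**8 candidates.
codes = list(product((5, 7, 9), repeat=8))
print(len(codes))  # 6561

print(is_contiguous(build_grid((9, 5, 5, 5, 5, 5, 5, 5))))  # the grid pictured above
print(is_contiguous(build_grid((5, 5, 5, 5, 5, 5, 5, 5))))  # has a sealed-off 5x5 corner
```

Filtering `codes` with `is_contiguous(build_grid(code))` would give the surviving grids under these assumptions. A flood fill is more general than checking a handful of known bad shapes, but with only 6,561 small grids either approach is cheap.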

Then it was time to lay the content out. First I needed to parse the files I had created. This was fairly straightforward, with a simple script that even my limited current grasp of Python can handle. I had been testing it with just a few files by running it on, e.g., 9-9-9-9-5-5*.txt, but once I had it working, I was pleased to see that it could parse the thousands of files in 9*.txt in a few seconds.

Then I turned to producing the laid-out file. I already have a PostScript file that I adapt to produce printable cryptics, but it's designed for one page with one grid, not 1,032 pages with two grids each. So I started by adapting the PS file to move as much functionality as possible out of the page and into the definitions of procedures. I also made sure to fully comply with the PS Document Structuring Conventions to give the eventual PDF conversion as easy a time as possible. Then I just needed to write the Python to include the file of procedures and generate a lot of minimal pages. Again I was pleased at how quickly the finished thousand-page files were generated, parsed, and converted to PDF.
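The general shape of that last step might look something like the following Python sketch. It is only an illustration: `build_postscript` and the PostScript procedure `drawgrid` are hypothetical names, and the placeholder prolog stands in for the real file of procedures. The `%%` lines are standard Document Structuring Conventions comments (header, prolog, one `%%Page:` marker per page, trailer).

```python
def build_postscript(prolog, codes, per_page=2):
    """Return DSC-structured PostScript text with `per_page` grids on each page."""
    pages = [codes[i:i + per_page] for i in range(0, len(codes), per_page)]
    lines = [
        "%!PS-Adobe-3.0",
        f"%%Pages: {len(pages)}",
        "%%EndComments",
        "%%BeginProlog",
        prolog.strip(),  # all the drawing procedures live here, defined once
        "%%EndProlog",
    ]
    for number, page in enumerate(pages, start=1):
        lines.append(f"%%Page: {number} {number}")
        for slot, code in enumerate(page):
            # Each page body is minimal: push arguments, call a prolog procedure.
            # `drawgrid` is a hypothetical procedure name for this sketch.
            lines.append(f"({code}) {slot} drawgrid")
        lines.append("showpage")
    lines += ["%%Trailer", "%%EOF"]
    return "\n".join(lines) + "\n"


# Placeholder prolog: a real one would draw the grid instead of discarding its arguments.
PROLOG = "/drawgrid { pop pop } def  % placeholder procedure"

doc = build_postscript(PROLOG, ["9-5-5-5-5-5-5-5", "9-5-5-5-5-5-5-7", "9-5-5-5-5-5-5-9"])
print(doc)
```

Keeping each page down to a few procedure calls is what keeps a thousand-page file small, and the `%%Page:` markers let a converter find page boundaries without interpreting the drawing code.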

On Tuesday morning, I made the last adjustments to the layout, regenerating the contents multiple times. Once I was satisfied, I posted my completed texts to the NaNoGenMo issues list and got the "completed" badge on my submission. I had succeeded!

*The word list for filling them is adapted from SCOWL, which is really nice if you want to order your words in a rough frequency order.

**5x5 corners, 5x7 corners, straight cut through the middle, cut through the middle with one jog, and cut through the middle with two jogs.


What I learned


Mostly I learned that I should do more recreational programming. I also learned that I should set up my projects better to begin with; there was some adjustment as I turned Grimdrake from a program for generating one grid to a tool for generating thousands. The algorithms didn't change, but I found a number of places where I hadn't taken that kind of usage into account. A lot of this was changing code that assumed it could write to the screen whenever to code that wrote to a filehandle and only when asked.
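The screen-versus-filehandle change is a common refactor, and a tiny Python sketch shows the idea. This is not Grimdrake's code (the post doesn't show it), and `write_grid` is a hypothetical name: the point is only that the function stays silent unless the caller hands it somewhere to write.

```python
import io
import sys


def write_grid(rows, out=None):
    """Write grid rows to `out`; do nothing unless a filehandle is supplied."""
    if out is None:
        return
    for row in rows:
        out.write(row + "\n")


rows = ["ABC", "D#E", "FGH"]

write_grid(rows)                  # silent: nobody asked for output
write_grid(rows, out=sys.stdout)  # the old behaviour, now opt-in

buffer = io.StringIO()            # any file-like object works, including a real file
write_grid(rows, out=buffer)
```

A batch job can then call the same function thousands of times and decide for itself which results get written where.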

I also produced a usage file: this is the frequency, in descending order, of all 19,233 words that ended up in grids. Despite my wordlist and thus my priority for word usage being based on frequency in English writing, the top words are mostly not everyday ones. The top ten are "nightclub", "honeycomb", "rhinoceri", "crescendi", "potpourri", "eucalypti", "bathtub", "spaghetti", "alibi", and "impromptu." This is the result of the same letters not being common everywhere in a word, so that uncommon word-ending letters like B and I are needed to match the middles of words they cross. This also works the other way, so that the top 30 contains both "sadists" and "sises"*** to meet the demand for words to cross the ends of words ending in S.
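A usage table like this is easy to build once the entries have been parsed. Here is a small Python sketch using `collections.Counter`. The function name `usage_table` and the sample entries are invented for illustration and are not taken from the real output files.

```python
from collections import Counter


def usage_table(filled_grids):
    """Count how often each word appears across all grids, most frequent first."""
    counts = Counter(word for entries in filled_grids for word in entries)
    # Sort by descending count, then alphabetically so that ties come out stable.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Made-up sample data: each inner list stands for one grid's entries.
sample = [
    ["nightclub", "alibi", "bathtub"],
    ["nightclub", "honeycomb", "alibi"],
    ["nightclub", "sises"],
]

for word, count in usage_table(sample):
    print(f"{count}\t{word}")
```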

***plural of "sis." (Yes, really.)

Date: 2021-12-01 09:02 pm (UTC)
yomikoma: (capital_r)
From: [personal profile] yomikoma
> Mostly I learned that I should do more recreational programming.

Strong agree on this! I'm tempted to pick one and clue it.
