Crossword filling - New problem

Back to General discussions forum

Quandray     2020-08-06 16:59:34
User avatar

Hi Rodion,

Please consider this for a new problem

You are given a partially completed crossword grid, such as

.IL.#.I.U.  
A##O#I#A#.  
.UG.U.#.N.  
###F#E#E#A  
..X.R#S##W  
I##E#T.EA.  
S#O#.#E###  
.E.#.T.LI.  
O#E#N#T##O  
.A.IC#.AR.  

A . is an empty cell and there are 26 of them
A # is an unused cell

Each of the empty cells has to be filled with one of the 26 letters of the alphabet.
Each of the 26 letters can only be used once.
The words in the crossword must be valid words. In that grid, English.
Ideally there should only be one solution.

I can provide a (long) list of valid English words.

In case it helps, that grid contains

.
.T.LI.
N
C

There are two possible four letter words that end in NC, but only one of those words gives a letter that will fit into .T.LI.

Rodion (admin)     2020-08-07 07:01:18
User avatar

Grae, Hello!

This really looks suitable by difficulty - at least I don't think I know clever way of solving it, after brooding upon it since yesterday! I only think of brute force trying possible words with some heuristic to chose those having less variants first :)

However for myself, it is more necessary to come up with algorithm for generating input data for this! (and this gives another coding exercise) The shape of grid probably could be the same, but it would be good people get different crossword every time the page is loaded.

Coul you help with some ready idea for this? Quick googling hints again a kind of brute-force but we then should remove some of letters...

Each of the 26 letters can only be used once.

Is this important for some algorithm of solving you have in mind? It is about letters to be added, right - not those already existing in the grid?

Ideally there should only be one solution.

This seemingly relates to algorithm of generating input data. One of limitations is that we can't spent too much time on generation (so the page doesn't take too long to load). As a last resort we can precalculate a hundred of input datasets and provide randomly from them...

I can provide a (long) list of valid English words.

In some problems we already use words.txt (which in turn was stolen from standard [unix words list](https://en.wikipedia.org/wiki/Words_(Unix) but with hyphenated words etc removed). If this will do, it will be good.

However I can't at once solve your crossword manually either. I put dugout in the 3-rd horisontal line (not finding other alternative) and then I can't find .OOF.E to cross with it in 4-th vertical column. So I suspect your list may have more words.

Aha. That hints the problem should also give limited time to respond, say 3 minutes, so that person won't solve it manually.

Quandray     2020-08-07 08:12:26
User avatar

Solving: For each empty cell, I'd work out which letters can/can't go there. For my mini example, only SYNC and ZINC will fit vertically, but there are no words that will fit YT.LI. , so it must be an I. That limits what the other empty cells in IT.LI. can be.

Generating: You can use different grids and different shaped grids. You need a crossword that only uses words in the list and that uses all letters A-Z, then remove one of each letter A-Z.

Each of the 26 letters can only be used once: Maybe "Each of the 26 letters can only be added once" would be better. The problem is to fit each letter A-Z into the empty cells.

Only one solution: not essential, but it makes checking the answer easier, and maybe makes the problem easier (to solve)

Word lists: My list has a lot more words than your words.txt

.UG.U. : My list includes JUGFUL and MUGFUL

Time Limit: That sounds good. I solved the example puzzle manually and it took a lot longer than 3 minutes.

Goodwin Lu     2020-08-08 16:56:37

as an extension, perhaps Sudoku solver could be a puzzle as well?

Rodion (admin)     2020-08-09 08:43:08
User avatar

Dear Friends, sorry for delay :)

perhaps Sudoku solver could be a puzzle as well?

Please note again, to create problem on sudoku solving I need first to invent code for sudoku generator! The same trouble as with crossword puzzle...

Well, while I'm trying to come up with something working for crosswords, please feel free to have some recreation with couple of new tasks!

Quandray     2020-08-12 08:10:58
User avatar

I have code that will solve the puzzle above and one other. Smaller grids are better than larger. I'll start on a generator now.

Quandray     2020-08-13 08:47:10
User avatar

I have a file of over 105000 words up to 10 characters.
I have 4 grids; they can be of variable sizes/shapes.
I have a C++ solver that can solve a puzzle with a unique solution, or say it can't with return code 4.
I have a C++ generator that, given a grid, can usually produce a puzzle and the solution, but sometimes fails with return code 4.
I think the generator takes too long to be run when the problem page is reloaded; I think the inputs need to be precalculated.

The next step is to write a script to run 100 or maybe 1000 times to
select one of the grids randomly
run the generator (until it gets rc 0)
split the puzzle input from the solution
run the solver on the puzzle input
if we get rc > 0 ignore that puzzle input
check that the solvers solution matches that produced by the generator
if it does we have valid puzzle input, otherwise ignore that puzzle input (and maybe investigate what happened)

If I produce some puzzles, how do you want them presented? With/without solutions? Separate files/all in one file?

Rodion (admin)     2020-08-13 12:27:54
User avatar

Grae hello!

I'm away from keyboard now, but this sounds inspiring!

There are few questions:

  • how fast is generation when not fails?
  • does speed depend on dict size?
  • do you generate input with strictly single solution (if so, is it faster to generate with non-unique solution?)

As for storing precalc, I'll think a bit about most compact way and share...

Thanks a lot!

Quandray     2020-08-13 13:22:49
User avatar

My perl script takes about 16 seconds to generate 10 inputs
It takes roughly the same time (maybe slightly longer) using your words.txt
After generating an input file, I don't know if it has a single solution until I run it through my solver. Most have single solutions.
Running it through the solver confirms that the generation hasn't had any problems and that there is a solution.
You are welcome to try it.

Rodion (admin)     2020-08-15 11:57:54
User avatar

Dear Graeme, Hi, and sorry for delay!

My perl script takes about 16 seconds to generate 10 inputs

I'm very poorly acquainted with Perl, but I suspect it has similar speed to PHP in which checker code is... 1.6 second for loading the page, really, is less or more tolerable, so we can try even "live-generation" option!

The next step is to write a script to run 100 or maybe 1000 times to

Well, if we choose "precalc" version, generally 100-150 samples are enough. There is no specific limit on checker code, but due to inexplicable features of the site shorter things are handled easier, e.g. few kilobytes (so I still think I'll try to find some way of represending dataset compactly).

If you mean you already have such a dataset generated - you are welcome to email it to me - or if not, I would be glad to try extending the script you created.

Then I'll invent some more code to check user's answer fits the input data and everything is ready!

Quandray     2020-08-15 12:32:35
User avatar

Hi Rodion, no problem with any delay. I've tried 3 e-mail addresses for you, but it looks like none of them reached you. Please can you try e-mailing me.

Please login and solve 5 problems to be able to post at forum