The quest for better spell-checking
May 15, 2012
When I first added spell-checking to FocusWriter I tried using enchant, but I ran into problems that stopped me. The biggest was that I couldn’t get it working on the Mac. I also couldn’t figure out how to support user-installed dictionaries. Because of those issues I decided to use hunspell instead.
Over time users have requested support for languages that hunspell does not handle, such as Finnish. I was recently contacted again for that very reason, and this time I decided to tackle it. My first approach was simply to switch the source code over to enchant on all platforms, which was easy. I solved the handling of user-installed dictionaries by using enchant’s C API and telling the plugin where to look (enchant’s C++ API doesn’t expose setting parameters for plugins, as far as I can tell). The tricky part was getting it to work on the Mac. I spent almost a week trying to compile glib and enchant, trying to get them properly linked into a framework, and then getting enchant to detect its plugins. I’m not confident in the end result, and I’m not sure it would work on other people’s computers.
Undeterred, I came up with a slightly different approach. I now use enchant on Linux and Windows and NSSpellChecker on the Mac. That only took me a couple of hours to whip together, which made me wonder why I put myself through the pain of trying to get enchant working there in the first place. NSSpellChecker handles finding the words along with checking their spelling, so Mac users should now have inline spell-checking that matches their other programs!
There are still a few rough spots. I can’t find any way to control whether or not NSSpellChecker ignores words with numbers or words all in uppercase. I also can’t find any way to tell it to look in places other than ~/Library/Spelling for user-installed dictionaries, which is obviously non-portable. And finally, it seems NSSpellChecker only scans the list of user-installed dictionaries when it is first created, so using a newly installed dictionary requires restarting FocusWriter.
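For context, here is a minimal sketch of the two "ignore" options mentioned above, as they could be applied on the enchant side, where the application splits out the words itself. This is not FocusWriter's actual code: the names `SkipOptions` and `shouldSkipWord` are hypothetical, and the sketch assumes ASCII-only words to stay short (real text would need Unicode-aware checks).

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical settings mirroring the two options discussed in the post.
struct SkipOptions {
    bool ignoreNumbers = true;    // skip words that contain any digit
    bool ignoreUppercase = true;  // skip words written entirely in uppercase
};

// Returns true if the word should not be passed to the spell-checker.
// Assumption: ASCII input only; non-ASCII letters are not classified here.
bool shouldSkipWord(const std::string& word, const SkipOptions& options)
{
    auto isDigit = [](unsigned char c) { return std::isdigit(c) != 0; };
    auto isAlpha = [](unsigned char c) { return std::isalpha(c) != 0; };
    auto isLower = [](unsigned char c) { return std::islower(c) != 0; };

    if (options.ignoreNumbers && std::any_of(word.begin(), word.end(), isDigit)) {
        return true;
    }

    if (options.ignoreUppercase) {
        const bool hasLetter = std::any_of(word.begin(), word.end(), isAlpha);
        const bool hasLower = std::any_of(word.begin(), word.end(), isLower);
        // "All uppercase" means at least one letter and no lowercase letters.
        if (hasLetter && !hasLower) {
            return true;
        }
    }

    return false;
}
```

A filter like this only helps when the application finds the words itself, which is exactly the control that is missing when NSSpellChecker does the word finding.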
There was only one rough spot on Windows, and that was that I couldn’t get the enchant plugin to find the data files for Voikko (kind of important, since adding Finnish support was what started all of this). I fixed that by modifying the plugin source and adding support for a dictionary path. Just for fun I also changed it to use the thread-safe API of Voikko.
Along the way I took the time to dramatically clean up the internal dictionary API inside of FocusWriter. I had originally written it in a way to hide the fact that there was a shared database of dictionaries, but that made the code kind of awkward and unclear. I think it is a much cleaner design now.
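To illustrate the general idea of making the shared database explicit rather than hidden, here is a rough sketch. It is not the actual FocusWriter code: `Dictionary` and `DictionaryManager` are hypothetical names, and the dictionary here is just a placeholder where a real one would wrap enchant or NSSpellChecker.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical placeholder for a loaded dictionary.
class Dictionary {
public:
    explicit Dictionary(std::string language) : m_language(std::move(language)) {}
    const std::string& language() const { return m_language; }

private:
    std::string m_language;
};

// The shared database is visible in the API: callers ask the manager for a
// language and get back a handle to the one shared instance for it.
class DictionaryManager {
public:
    std::shared_ptr<Dictionary> request(const std::string& language)
    {
        auto& slot = m_dictionaries[language];
        if (!slot) {
            // Load on first request; later requests reuse the same object.
            slot = std::make_shared<Dictionary>(language);
        }
        return slot;
    }

    std::size_t loadedCount() const { return m_dictionaries.size(); }

private:
    std::map<std::string, std::shared_ptr<Dictionary>> m_dictionaries;
};
```

With this shape, two documents that request the same language visibly share one dictionary object, instead of each appearing to own a private copy.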
The new code also makes it very easy for me to add the ability to change what language to use for checking the spelling of each document. I have not yet exposed this through the program interface, because I can’t decide how it should be stored. RTF and ODT (I think) both allow you to specify the spell-checking language for a document, but obviously there is no way to embed that in a plain text file. I could just store what language a document is in inside of the session, but that wouldn’t help with file exchange with other word processors.
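One possible way to combine the two storage options is a simple fallback chain: use the language embedded in the file when the format supports it, otherwise fall back to the session, otherwise to a default. The sketch below only illustrates that idea and is not a decision about how FocusWriter will work. All names (`Format`, `Document`, `resolveLanguage`) are hypothetical, and treating ODT as able to embed a language is an assumption.

```cpp
#include <optional>
#include <string>

// Hypothetical document formats.
enum class Format { PlainText, Rtf, Odt };

struct Document {
    Format format;
    // Language read from the file itself, if the file contained one.
    std::optional<std::string> embeddedLanguage;
};

// Assumption: every format except plain text can carry a language tag.
bool canEmbedLanguage(Format format)
{
    return format != Format::PlainText;
}

// Pick the spell-checking language for a document:
// 1. the language embedded in the file, when the format allows it;
// 2. the language remembered in the session, if any;
// 3. the application-wide default.
std::string resolveLanguage(const Document& document,
                            const std::optional<std::string>& sessionLanguage,
                            const std::string& defaultLanguage)
{
    if (canEmbedLanguage(document.format) && document.embeddedLanguage) {
        return *document.embeddedLanguage;
    }
    if (sessionLanguage) {
        return *sessionLanguage;
    }
    return defaultLanguage;
}
```

This keeps rich text files portable to other word processors while still letting plain text files remember a language locally, at the cost of two places where the setting can live.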
Despite the headaches and unsolved issues, I have had a lot of fun working on this.