I was working on a presentation today and happened to come across an article by Peter Norvig (Director of Research at Google) on how to build a statistical spelling corrector: http://norvig.com/spell-correct.html.
The basic idea is straightforward, and Peter explains it well, even including a sample program in Python. It would have been interesting if he had dug deeper into how Google uses query logs to build the language and error models behind its search term suggestion feature (i.e., “Did you mean:”), but he steered clear of that. He only describes a generic spelling corrector trained on generic text documents that anyone can find on the Web. I suppose that makes sense for a tutorial article, since none of us has access to Google’s query logs.
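To give a rough feel for the idea, here is a minimal sketch in Python. It is not Peter's program, just a simplified illustration I put together: the language model is nothing more than word counts from some training text, and the corrector picks the most frequent known word that is one edit away from the input. The tiny `SAMPLE_TEXT` corpus is made up for the example, and the sketch assumes a uniform error model (every single-edit typo is treated as equally likely), which a real corrector would refine.

```python
import re
from collections import Counter

LETTERS = "abcdefghijklmnopqrstuvwxyz"


def train(text):
    """Build the language model: a count of every lowercase word in the text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def edits1(word):
    """Return every string that is one edit away from word.

    An edit is a deletion, a transposition of adjacent letters,
    a replacement of one letter, or an insertion of one letter.
    """
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in LETTERS]
    inserts = [a + c + b for a, b in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)


def correct(word, counts):
    """Return the most frequent known word within one edit of word.

    Known words are returned unchanged. If no known candidate exists,
    the input is returned as-is (this sketch does not try two edits).
    """
    if word in counts:
        return word
    candidates = [w for w in edits1(word) if w in counts]
    if not candidates:
        return word
    return max(candidates, key=counts.get)


# Made-up training text; a real corrector would train on a large corpus.
SAMPLE_TEXT = (
    "the quick brown fox jumps over the lazy dog "
    "and the spelling of the word is correct"
)
counts = train(SAMPLE_TEXT)

print(correct("speling", counts))  # a one-letter insertion reaches "spelling"
print(correct("teh", counts))      # a transposition reaches "the"
```

Swapping the toy corpus for a few megabytes of real text is what makes this useful; the quality of the corrections depends almost entirely on the training data, which is exactly why query logs would be so valuable.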
Alternatively, if you just don’t care to build your own corrector, there’s a Yahoo Spelling Suggestion API that you can use. It’s free if you’re making fewer than 5,000 queries per day.