nlp - Algorithm to determine if a word could be English?


I have a list of strings that I need to check against an English dictionary, but I don't want to start checking every piece of gibberish in the list. So first, I want to check whether a string could be an English word at all.

Does anyone know of an algorithm, or at least a set of rules I need to apply, to verify a word?

For example:

No spoken word can start with more than 3 consonants, and if there are 3 initial consonants in a word, the first one must be "s".
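
A rule like this is easy to try out, so here is a minimal sketch of it in Python. The function name `plausible_onset` and the choice to treat "y" as a vowel are my own assumptions. Note that the rule is about spoken sounds, while this code only sees letters, so digraphs such as "th" or "ch" make it reject real words like "three".

```python
VOWELS = set("aeiouy")  # treating "y" as a vowel is an assumption of this sketch


def plausible_onset(word: str) -> bool:
    """Return False if the word breaks the initial-consonant rule, applied to letters."""
    word = word.lower()
    # Count the letters that come before the first vowel.
    count = 0
    for ch in word:
        if ch in VOWELS:
            break
        count += 1
    if count > 3:
        return False
    if count == 3 and not word.startswith("s"):
        return False
    return True


print(plausible_onset("string"))  # True: "str" starts with "s"
print(plausible_onset("xkcd"))    # False: four consonant letters, no vowel
print(plausible_onset("three"))   # False: a real word, rejected because "th" is one sound
```

The last call shows why a letter-based version of the rule is only a rough filter and would need digraph handling before it could be trusted.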

Finding a word in a data structure is going to be fast (e.g. use a Bloom filter (mind the false positives!) or a set), so chances are a pre-check is not worth doing for efficiency reasons.
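
To make the two lookup options concrete, here is a rough sketch, not a tuned implementation. The `BloomFilter` class, its default sizes, and the three-word list are all made up for illustration; a real dictionary would need a bit array sized for its word count.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits: int = 1 << 16, num_hashes: int = 4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item: str):
        # Derive several bit positions by salting one hash function with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


words = ["apple", "banana", "cherry"]  # stand-in for a real word list

exact = set(words)  # exact membership, average O(1) lookup
bloom = BloomFilter()
for w in words:
    bloom.add(w)

print("apple" in exact, "apple" in bloom)  # True True
print("xqzv" in exact)                     # False
```

The set gives exact answers and is usually fine if the word list fits in memory. The Bloom filter uses less space, but a hit only means "probably in the dictionary", so it may need confirming against the real list.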

If you want to provide suggestions, look at Peter Norvig's spell checking implementation.

If you really want to go that way, I'd construct frequencies of which letter follows which from existing text, and then see whether a given sequence occurs within English words.
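
A minimal sketch of that idea, assuming letter bigrams with `^` and `$` as start and end markers. The function names `train_bigrams` and `looks_english`, the `min_count` threshold, and the tiny corpus are all illustrative choices of mine; a real model would be trained on a large word list or text.

```python
from collections import Counter


def train_bigrams(words):
    """Count adjacent letter pairs, with ^ and $ marking word start and end."""
    counts = Counter()
    for word in words:
        padded = f"^{word.lower()}$"
        counts.update(zip(padded, padded[1:]))
    return counts


def looks_english(word, counts, min_count=1):
    """Accept a word only if every adjacent letter pair was seen often enough."""
    padded = f"^{word.lower()}$"
    return all(counts[pair] >= min_count for pair in zip(padded, padded[1:]))


corpus = ["the", "then", "hen", "ten"]  # toy corpus, for illustration only
counts = train_bigrams(corpus)

print(looks_english("te", counts))   # True: every pair occurs in the corpus
print(looks_english("thn", counts))  # False: "hn" never occurs
print(looks_english("xq", counts))   # False: no corpus word starts with "x"
```

Note that "te" passes even though it is not in the corpus. That is the point of the approach: it judges whether a string could be a word, not whether it is one, so it works as a pre-filter before the dictionary lookup.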

