language agnostic - Machine learning of word structure


I am working on a system that can create made-up fantasy words based on a variety of user input, such as syllable templates or a modified Backus-Naur form. One new mode, though, is planned to use machine learning. Here, the user does not explicitly define any rules but pastes some text, and the system learns the structure of the given words and creates similar words.

My current naïve approach would be to create a table of letter neighborhood probabilities (including a special end-of-word "letter") and fill it by scanning the input in letter pairs (using whitespace and punctuation as word boundaries). Creating a word would then mean looking up the probabilities of every letter that can follow the current letter, randomly choosing one according to those probabilities, appending it, and reiterating until end-of-word is encountered.
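A minimal sketch of that letter-pair table, written in Python only because the question is language agnostic. The names `build_pair_table` and `make_word`, the `^` start marker, the `$` end marker and the `max_len` cap are assumptions made for illustration. The description above does not say how the first letter is chosen, so a start marker is used here to learn that from the input as well.

```python
import random
import re
from collections import Counter, defaultdict

START = "^"  # hypothetical start-of-word marker (not in the original description)
END = "$"    # special end-of-word "letter"; assumed never to occur in the input


def build_pair_table(text):
    """Count, for every letter, how often each letter (or END) follows it."""
    table = defaultdict(Counter)
    # Runs of letters are words; whitespace, digits and punctuation are boundaries.
    for word in re.findall(r"[^\W\d_]+", text.lower()):
        previous = START
        for letter in word:
            table[previous][letter] += 1
            previous = letter
        table[previous][END] += 1
    return table


def make_word(table, rng=random, max_len=20):
    """Walk the table from START, picking each next letter by its observed frequency."""
    current, letters = START, []
    while len(letters) < max_len:  # max_len is an assumed safety cap
        followers = table.get(current)
        if not followers:  # empty table: nothing was learned
            break
        current = rng.choices(list(followers), weights=list(followers.values()))[0]
        if current == END:
            break
        letters.append(current)
    return "".join(letters)
```

Raw counts are kept instead of normalized probabilities because `random.choices` accepts relative weights directly, so `make_word(build_pair_table(pasted_text))` samples with the same distribution.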

But I am looking for more sophisticated approaches that (probably?) provide better results. I do not know much about machine learning, so pointers to topics, techniques or algorithms are appreciated.

I think that for independent words (and names), a simple Markov chain system (which is what you seem to describe when talking about using letter pairs) can perform really well. You feed it a lexicon and then throw in a seed to generate a new name based on what it learned. You may want to tweak the prefix length of the Markov chain to get nicely sounding results (as pointed out in a comment on the question, 2 letters are better than one).
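A rough sketch of such a chain with a tunable prefix length, in Python. The function names `train` and `generate`, the `$` end marker and the `max_len` cap are made up for this example, and "seed" is read here as the seed of the random number generator, which is only one possible interpretation.

```python
import random
from collections import Counter, defaultdict

END = "$"  # end-of-word marker; assumed absent from the lexicon


def train(lexicon, order=2):
    """Map each prefix of up to `order` letters to counts of the letter that follows."""
    if order < 1:
        raise ValueError("order must be at least 1")
    model = defaultdict(Counter)
    for word in lexicon:
        word = word.lower()
        if not word:
            continue
        for i in range(len(word) + 1):
            prefix = word[max(0, i - order):i]  # shorter than `order` near the word start
            following = word[i] if i < len(word) else END
            model[prefix][following] += 1
    return model


def generate(model, order=2, seed=None, max_len=20):
    """Build one word letter by letter; `order` should match the value used in train()."""
    if order < 1:
        raise ValueError("order must be at least 1")
    rng = random.Random(seed)  # same seed and same model give the same word
    word = ""
    while len(word) < max_len:  # max_len is an assumed safety cap
        followers = model.get(word[-order:])
        if not followers:
            break
        following = rng.choices(list(followers), weights=list(followers.values()))[0]
        if following == END:
            break
        word += following
    return word
```

With `order=1` this reduces to the letter-pair approach from the question. Larger values stay closer to the lexicon, and with a small lexicon they tend to reproduce its entries verbatim, so the prefix length is worth tuning against the amount of input.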

I once tried it with dictionaries of Elvish and Orcish names and got quite satisfying results.

