Fojiba-Jabba Notes

From SlugWiki
Revision as of 20:01, 8 June 2006 by Edchen (Talk)

Fojiba-Jabba is the module of Cruft Alarm supporting Automatic Text Generation.

Theoretical Foundations

Fojiba-Jabba uses techniques from the theory of Markov Chains and of Recursive Transition Networks.

Markov Chains

One method of text generation involves Markov Chains. In theory, Markov Chains can produce a delightfully quirky text; in practice, with a small corpus, the output tends to be ungrammatical and obviously machine-made.

Process

The process can be summarized as follows:

  • The user specifies an initial word and the number of sentences desired in the text.
  • Fojiba-Jabba, having previously analyzed a set of texts in order to gather statistics on which words follow which words, uses these data to generate the next word.
  • This process repeats until the desired number of sentences is obtained.
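
The steps above can be sketched roughly as follows. This is a minimal illustration, not Fojiba-Jabba's actual code: the function names `build_table` and `generate`, the toy corpus, the `max_words` safety cap, and the rule that a sentence ends at a word finishing in ".", "!" or "?" are all assumptions made for the example.

```python
import random
from collections import defaultdict

def build_table(text):
    """Map each word to the list of words observed right after it."""
    words = text.split()
    table = defaultdict(list)
    for current, following in zip(words, words[1:]):
        table[current].append(following)  # duplicates preserve the observed frequencies
    return table

def generate(table, start, num_sentences, rng=None, max_words=200):
    """Walk the chain from `start` until `num_sentences` sentences are produced."""
    rng = rng or random.Random()
    output = [start]
    enders = (".", "!", "?")  # assumed sentence-boundary rule
    sentences = 1 if start.endswith(enders) else 0
    while sentences < num_sentences and len(output) < max_words:
        followers = table.get(output[-1])
        if not followers:  # dead end: this word was never followed by anything
            break
        word = rng.choice(followers)  # frequent followers are picked more often
        output.append(word)
        if word.endswith(enders):
            sentences += 1
    return " ".join(output)

# Toy corpus, invented for this example.
corpus = "the cat sat on the mat. the dog sat on the cat. the mat sat still."
table = build_table(corpus)
print(generate(table, "the", 2, rng=random.Random(0)))
```

Storing followers in a list with repeats keeps the sketch short: picking uniformly from that list is the same as picking each follower in proportion to how often it was seen.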

Problems

There are, however, several problems with this method:

  • The corpus available is too limited to attempt anything but an Order-1 Markov Chain, as anything higher essentially reproduces the original text.
  • An Order-1 Markov Chain is often too simplistic to produce anything but rather ungrammatical (and clearly fake) sentences.
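
One way to see the first problem is to count how many contexts leave the chain no choice at all. The sketch below is only an illustration: `successor_table` and `forced_fraction` are hypothetical helpers, and "fraction of contexts with exactly one successor" is a measure invented here, not something Fojiba-Jabba computes. When that fraction approaches 1, the chain can do little more than replay its source.

```python
from collections import defaultdict

def successor_table(words, order):
    """Map each run of `order` consecutive words to the set of words seen after it."""
    table = defaultdict(set)
    for i in range(len(words) - order):
        context = tuple(words[i:i + order])
        table[context].add(words[i + order])
    return table

def forced_fraction(words, order):
    """Fraction of contexts that have exactly one possible successor."""
    table = successor_table(words, order)
    if not table:  # text shorter than the requested order
        return 0.0
    forced = sum(1 for successors in table.values() if len(successors) == 1)
    return forced / len(table)

# Tiny invented text: at order 1 the word "a" has two possible successors,
# at order 2 every context has exactly one.
words = "a b a c".split()
for order in (1, 2):
    print(order, forced_fraction(words, order))
```

On a real but small corpus the same effect shows up more gradually: each increase in order leaves fewer contexts with more than one continuation.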

Possible Solutions

  • Use highly advanced linguistic knowledge to improve grammaticality (e.g., a noun or an adjective must follow a determiner). A Brill Part-of-Speech Tagger or the Stanford Parser may be useful here.
  • Use Google to find likely following words, or to enlarge the dataset somehow.
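
The first idea could look something like the sketch below, which filters candidate next words by part of speech before the Markov Chain picks one. Everything here is assumed for illustration: the `TAGS` lexicon is hand-written and stands in for the output of a real tagger (it is not the Brill tagger or the Stanford Parser), and `ALLOWED_AFTER` encodes only the determiner rule mentioned above.

```python
import random

# Hypothetical hand-written lexicon; a real system would take tags from a tagger.
TAGS = {"the": "DET", "a": "DET", "cat": "NOUN", "dog": "NOUN",
        "lazy": "ADJ", "sat": "VERB", "quickly": "ADV"}

# Allowed tag transitions. Only the determiner rule comes from the notes above.
ALLOWED_AFTER = {"DET": {"NOUN", "ADJ"}}

def filter_followers(previous, candidates):
    """Drop candidates whose tag may not follow the tag of `previous`."""
    allowed = ALLOWED_AFTER.get(TAGS.get(previous))
    if allowed is None:  # no rule for this tag: keep every candidate
        return list(candidates)
    return [w for w in candidates if TAGS.get(w) in allowed]

def next_word(previous, candidates, rng=None):
    """Pick a next word from a non-empty candidate list, preferring grammatical ones."""
    rng = rng or random.Random()
    kept = filter_followers(previous, candidates)
    # Fall back to the unfiltered list if the rule rejects everything.
    return rng.choice(kept or list(candidates))

print(filter_followers("the", ["sat", "cat", "lazy", "quickly"]))
```

Falling back to the unfiltered candidates keeps generation from stalling when the corpus offers no grammatical follower, at the cost of occasionally letting an ungrammatical word through.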

Recursive Transition Networks

Less idiosyncratic than Markov Chains.