Difference between revisions of "Cruft Alarm Notes"
Revision as of 21:32, 18 May 2006
Cruft Alarm is a sophisticated computer program that screens several mailing lists for desirable items. It is written in the Ruby programming language.
Two classes, crufter (in crufter.rb) and post (in post.rb), form the basis of Cruft Alarm.
The crufter class connects to the cruftalarm-at-gmail.com email account. If there are any new messages, it gives them to the post class, which manages the information in each message. The crufter class then prints this information.
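The hand-off described above can be sketched roughly as follows. This is only an illustration: the real crufter.rb and post.rb are not reproduced in these notes, so the method names (`check`, `summary`) and the `mailbox` argument are assumptions, and the POP3 connection is replaced by any object that yields raw message strings.

```ruby
# Hypothetical sketch of the crufter -> post flow; not the actual source.
class Post
  attr_reader :body

  def initialize(raw)
    # Keep the raw text; the real class also cleans it and extracts details.
    @body = raw.to_s.strip
  end

  # Assumed helper: a one-line description that the crufter can print.
  def summary
    "Post: #{@body}"
  end
end

class Crufter
  # `mailbox` is any object that responds to #each and yields raw message
  # strings. A real version would fetch these from the Gmail account.
  def initialize(mailbox)
    @mailbox = mailbox
  end

  # Wrap every new message in a Post and return the printable summaries.
  def check
    posts = []
    @mailbox.each { |raw| posts << Post.new(raw) }
    posts.map(&:summary)
  end
end

puts Crufter.new(["  free monitor outside 10-250 \n"]).check
```

Passing the mailbox in as an argument keeps the sketch runnable without a network connection or stored credentials.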
The post class manages the information in each cruft email. It currently cleans up an email and extracts a location and a list of items from it.
Gmail is used as the email account of choice for several reasons:
- Using an Athena account would require hardcoding in someone's username and password.
- Yahoo! Mail and Hotmail reportedly do not support POP3 (this has not been verified).
By using Gmail to check for new messages, Cruft Alarm in fact receives emails considerably faster than normal Athena accounts do (contrary to what was initially believed). According to Ruth Shewmon, this may be because Gmail updates its email servers more often than Athena does.
Cruft Alarm receives e-mails from the gmail account cruftalarm-at-gmail.com. cruftalarm-at-gmail.com is subscribed to the Athena mailing list, the-companion. the-companion, in turn, is subscribed to the Athena mailing lists reuse and free-food.
Mailing lists that cruftalarm/the-companion still needs to be added to: freefood.
Natural Language Processing
Cruft Alarm is rumored to have passed the Turing Test.
HTML tags are annoying. Thus the post class removes them (cf. the 'cleanup' method).
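A minimal sketch of what such a cleanup step might look like is given below. The regular expression and the whitespace handling are assumptions rather than the contents of the actual 'cleanup' method, and a regex like this does not cope with every kind of HTML (comments or scripts, for example).

```ruby
# Hypothetical stand-in for the 'cleanup' method in post.rb.
# Assumed behaviour: replace anything that looks like <...> with a space,
# then collapse runs of whitespace into single spaces.
def cleanup(text)
  text.gsub(/<[^>]*>/, " ").gsub(/\s+/, " ").strip
end

puts cleanup("<p>Free <b>lamp</b> in W20</p>")
```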
Cruft Alarm requires dictionaries in order to work. Cruft Alarm's current dictionaries are the following:
- Cruft: this is a list of desirable cruft items (e.g., Pentium IV).
- Food: this is a list of food items (e.g., Bertucci's pizza).
- Location: this is a list of locations at MIT (e.g., Walcott).
- Next: this is a list of words X such that if 'X Y' appears in the email, where Y is any other word, then the whole phrase 'X Y' should be returned (e.g., if 'outside 10-250' appears in an email, and 'outside' is a word in Next, then 'outside 10-250' is returned).
- Prev: this is analogous to the Next dictionary, but the dictionary word is paired with the word before it rather than the word after it.
- Remove: this is a list of words to remove from an email (e.g., 'of'). The reasons for removing these words may be discussed later.
The necessity of the Next, Prev, and Remove dictionaries is currently in question.
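To make the Next and Prev dictionaries concrete, here is a small sketch of how they could be applied. The dictionary contents and the `phrases` helper are invented for illustration, and the sketch assumes that Prev pairs a dictionary word with the word before it.

```ruby
# Hypothetical dictionary contents; the real files are not shown in these notes.
NEXT_WORDS = ["outside"]
PREV_WORDS = ["lobby"]

# Return every two-word phrase whose first word is in Next
# or whose second word is in Prev (the assumed meaning of Prev).
def phrases(text, next_words, prev_words)
  found = []
  text.split.each_cons(2) do |a, b|
    if next_words.include?(a.downcase) || prev_words.include?(b.downcase)
      found << "#{a} #{b}"
    end
  end
  found
end

puts phrases("Pizza is outside 10-250 now", NEXT_WORDS, PREV_WORDS)
```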
(cf. the 'get_items' method in the post class) In order to find the cruft items in a message, the post class checks each of the words in the message against the words in the cruft dictionary.
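The word-by-word check could look something like the sketch below. The signature, the lowercasing, and the sample dictionary are assumptions; the real 'get_items' may differ.

```ruby
# Hypothetical cruft dictionary; the real list is not shown in these notes.
CRUFT = ["monitor", "lamp", "keyboard"]

# Assumed approach: lowercase the message, split it into alphanumeric words,
# and keep each word that appears in the dictionary (without repeats).
def get_items(text, dictionary)
  words = text.downcase.scan(/[a-z0-9]+/)
  words.select { |w| dictionary.include?(w) }.uniq
end

puts get_items("Free lamp and a working Monitor. Lamp has no bulb.", CRUFT)
```

A single-word comparison like this cannot match multi-word entries such as 'Pentium IV'; handling those would need phrase matching.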
(cf. the 'get_location' method in the post class) In order to find where the items in a message have been posted, the post class does the following:
- It looks for any words beginning with 'NE', 'E', or 'W' (e.g., NE42, E50, and W20).
- It looks for any words containing a hyphen (e.g., 26-100).
- It checks the words in the message against the words in the location dictionary.
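The three checks above can be sketched as one filter. This is an illustration, not the actual 'get_location' method: the sample dictionary is invented, and the first rule is tightened to require digits after the building prefix so that ordinary words such as 'We' or 'Extra' are not reported as locations.

```ruby
# Hypothetical location dictionary; the real list is not shown in these notes.
LOCATIONS = ["walcott", "lobby"]

def get_location(text, locations)
  # Strip punctuation from both ends of each word before testing it.
  words = text.split.map { |w| w.gsub(/\A[^A-Za-z0-9]+|[^A-Za-z0-9]+\z/, "") }
  hits = words.select do |w|
    next false if w.empty?
    w.match?(/\A(NE|E|W)\d+\z/) ||      # building numbers such as NE42, E50, W20
      w.include?("-") ||                # hyphenated words such as 26-100
      locations.include?(w.downcase)    # entries in the location dictionary
  end
  hits.uniq
end

puts get_location("Boxes are in W20, near 26-100 and Walcott.", LOCATIONS)
```

The hyphen rule is as broad here as in the description, so words like 'well-known' would also be reported.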
- Cruft items are often posted in list form. For example, a reuse email may go as follows:
I have accumulated too much crap. Following items will be left in EC, in the Wood stairwell (on the West parallel, closest to the triangle building). In a box...look for it!
- lamp with a missing foot. Comes with light bulb!
- 50 cent bulletproof DVD with over 50 songs and 12 music videos!
- mechanical panda bear, does all sorts of tricks. comes with a bottle
- Maxwell House Hazelnut coffee + godiva chocolate box filled with splenda packets
- antique hourglass ...actually an hour!
- Blueberry shower gel
- book: Flaubert - "Three Tales"
- book: Martin Page - "How I Became Stupid"
- Red Gel toothpaste
Find a way to extract items from such lists.
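One possible starting point, assuming each item sits on its own line and begins with a '-' or '*' marker as in the example email, is sketched below. The `list_items` name is hypothetical, and lists formatted differently (numbered lists, or items separated by commas) would need other rules.

```ruby
# Hypothetical helper: collect the text of every line that starts with a
# '-' or '*' bullet marker followed by whitespace.
def list_items(body)
  body.each_line
      .map(&:strip)
      .select { |line| line.match?(/\A[-*]\s+\S/) }
      .map { |line| line.sub(/\A[-*]\s+/, "") }
end

email = "Following items will be left in EC:\n" \
        "- lamp with a missing foot\n" \
        "- Blueberry shower gel\n" \
        "Thanks!\n"
puts list_items(email)
```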
Ideas to Improve
- For an item not in the dictionaries, use Froogle to estimate how desirable the item is (based on its price), and also to get its category.
To Do (General)
- Add telephone capabilities, so that Cruft Alarm can dial a telephone number if a desirable item is posted.
- Set up Cruft Alarm's visual interface (the LCDs).
- Set up version control. (?)
- Add Cruft Alarm to more mailing lists (especially freefood).
- Add URL-removing capabilities (for links such as http://foo) to the cleanup method in post.rb
- Add Markov Chain-Artificial Intelligence-Natural Language Processing-Advanced Modelling Genomic Genetic-Behavior to Cruft Alarm.
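For the link-removal item in the list above, a rough sketch of what could be added to the cleanup step is shown here. The `strip_urls` name and the pattern are assumptions; the pattern only catches links that start with http:// or https://.

```ruby
# Hypothetical helper: delete http:// and https:// links, then tidy whitespace.
def strip_urls(text)
  text.gsub(%r{https?://\S+}, "").gsub(/\s+/, " ").strip
end

puts strip_urls("See http://foo.example/pics for photos")
```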