Sentiment analysis is a technique used to infer or interpret emotional context in communication - in this case, text. It is a growing field for analysts and researchers who would like to determine consensus opinions on products (for example) by automatically interpreting thousands of Twitter feeds and microblogs. There are plenty of detractors of the method who make a reasonable case that the techniques are unreliable and non-robust - especially for small-form communication (like a single sentence) - but that lack of success doesn't make it any less fun to play with.
I started out by wanting to make a mood light which would reflect the mood of aggregate twitter users in real-time. For now I am working with sentiment analysis methods to control the expressions on an animatronic/robotic face.
In this very basic form, I am using a word list that associates a valence (i.e., degree of emotion) with certain words - all un-indexed words are assumed neutral and given a valence of 0. The valence for each word in the sentence is summed, and that value expresses the sentiment of the sentence. For example...
The poor[-2], sad[-2], lonely[-2] man was sad[-2] that his beautiful[+3] daughter had died[-3].
The total valence for this sentence is [-8], and is indeed a sad sentence. Of course it is quite easy to cause this method to fail - especially with sarcasm and double-negatives. This is one reason this method is not robust for small sample sizes. I am also working on more robust techniques and methods.
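The word-summing method above is simple enough to sketch in a few lines. The actual implementation here is a Perl script; this is just an illustrative Python sketch using a tiny made-up mini-lexicon rather than the full AFINN file:

```python
import re

# Hypothetical mini-lexicon for illustration; the real AFINN list
# rates 2400+ words from -5 to +5.
VALENCE = {"poor": -2, "sad": -2, "lonely": -2, "beautiful": 3, "died": -3}

def sentence_valence(sentence):
    # Un-indexed words are treated as neutral (valence 0).
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(VALENCE.get(w, 0) for w in words)

print(sentence_valence(
    "The poor, sad, lonely man was sad that his beautiful daughter had died."))
# → -8
```

Note that the repeated "sad" is counted twice - each occurrence of an indexed word contributes its valence to the total.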
The word valence list I used for this test was the AFINN lexicon developed by a Danish researcher. It gives 2400+ English words a rating from [-5] (sad) to [+5] (happy). Another list I plan on using is the Affective Norms for English Words (ANEW) list. It rates words on three dimensions - pleasure, arousal and dominance. The most recent ANEW database is not available to mere commoners, but the data from the original 1999 paper is available online.
Comments on Software
The sentiment analysis is performed on a Win32 machine running a Perl script. The script calculates the valence of a sentence, then sends one of 5 single-character commands to the Picaxe 18M2 to command the servo into a facial expression (neutral, small frown, big frown, small smile, big smile). I am still just learning Perl and the Perl/Tk GUI widgets. My largest problem was actually with the Microsoft Speech API (SAPI) module. I am still having problems with timing and with the SAPI object grabbing too many strings at once - I suspect it is a problem with event handlers and other stuff I know nothing about. It doesn't affect the sentiment analysis or commanding the Picaxe, but I would like to sync the expressions on the face to the synthesized voice as it reads long passages. I will continue to struggle with Win32::OLE until I can have it work the way I want it to.
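The valence-to-expression step amounts to binning the sentence score into one of the five commands. The script here is Perl and its actual thresholds and command characters aren't given, so the values below are hypothetical; this Python sketch just shows the shape of the mapping:

```python
# Bin a sentence valence into one of five single-character expression
# commands. Thresholds and command letters are hypothetical - the
# original Perl script's values are not documented here.
def expression_command(valence):
    if valence <= -4:
        return 'F'   # big frown
    elif valence < 0:
        return 'f'   # small frown
    elif valence == 0:
        return 'n'   # neutral
    elif valence < 4:
        return 's'   # small smile
    else:
        return 'S'   # big smile

# The chosen byte would then go to the Picaxe over a serial link,
# e.g. with pyserial: serial.Serial("COM3", 4800).write(cmd.encode())
```

The port name and baud rate in the comment are placeholders - whatever the Picaxe's serial input is actually configured for would go there.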
This is really just a prototype to see if I like this approach for generating a smile - and I do. I don't need an articulate mouth to sync with speech - I'm really just wanting to display expressions. And I think I can do without the "o" shape that would require both upper and lower lips. For the next iteration I am thinking I will fix the location of the drive pins (i.e., through-holes, not slots) but provide a method of sliding/stretching/translation somewhere along the lip. I have also thought about replacing the single servo and drive belts with two separate servos - one for each corner of the mouth. But I am attracted to the symmetry and simplicity of the single-servo mechanism.
Hardware: New face with this mouth style (perhaps a modified mechanism), eyes (pan & tilt), upper eyelids and eyebrows. The Picaxe will be replaced with an Arduino clone.
Software: More sophisticated methods of sentiment analysis using the ANEW dataset. Better implementation of the speech synthesis module.