Let's Make Robots!

Super Droid Bot "Anna" - w/ Learning AI and Thermal Array

Learns from Listening and Asking Questions, Tracks Colors and Heat, 6 Web Services, Database of Learned Knowledge

7/1/14 Update - It's all about the "Babble"

I've been making lots of improvements to Anna's conversational capabilities.  My primary accomplishment has been to give the robot "Initiative" and a real "contribution" in conversation by making relevant statements of its own choosing, not just asking or answering questions.  It also attempts to keep a conversation going during lulls or when a person is not talking, for a time.   To implement this, I created "Topic Agents", "Learning Agents" and most importantly, "Babble Agents".

Topic Agents

Topic agents determine what the current topic is, and whether it should change.  This topic is then used by the learning and babble agents.  When a topic is not already active, the primary way a topic is chosen is by using the longest words from the previous human statement as candidates.  If the bot recognizes a candidate as something it has a knowledge base for (like marriage, school, etc.), then that topic will "win" and be chosen; otherwise, longer words will tend to win.
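As a rough illustration of the selection rule described above, here is a minimal Python sketch.  It is not Anna's actual code: the function name, the KNOWN_TOPICS set, and the strict "known topic first, then length" scoring are all assumptions made for the example (the real bot only says longer words "tend" to win).

```python
import re

# Hypothetical set of topics the bot already has a knowledge base for.
KNOWN_TOPICS = {"marriage", "school"}

def choose_topic(statement, known_topics=KNOWN_TOPICS):
    """Pick a topic word from a human statement.

    Known topics beat everything else; among the rest, the longest word wins.
    """
    words = re.findall(r"[a-z']+", statement.lower())
    if not words:
        return None

    def score(word):
        # Tuples compare element by element: known-topic flag first, then length.
        return (word in known_topics, len(word))

    return max(words, key=score)

print(choose_topic("My sister is starting school in September"))
print(choose_topic("I bought a telescope"))
```

A weighted or randomized score would be closer to "tend to win", but the strict ordering keeps the idea easy to see.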

Learning Agents

Learning agents go out on the web and gather knowledge worth contributing to a conversation about a given topic, and then store this info in the robot's database of knowledge.  

LearnQuotesAgent (unsupervised) - calls a web service to retrieve all quotes (from a 3rd party) about a given topic.  The robot has learned tens of thousands of quotes, which it can then use in conversation.

LearnWebAgent (semi-supervised) - retrieves a web page on a topic (say, from Wikipedia), removes all markup and other junk, and parses through it to find anything that looks like a complete sentence containing the given topic word.  I have a Windows app that lets me review all the sentences before "approving" their import into the robot's knowledge base.  I've been experimenting so far with astronomy and marine biology.
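The extraction step of a LearnWebAgent-style agent could be sketched roughly as follows.  This is a simplified illustration in Python, not the author's implementation: the function name, the regex-based markup stripping, and the "looks like a complete sentence" test (capitalized, ends in punctuation, at least four words) are assumptions.  Fetching the page and the manual review step are left out.

```python
import re
from html import unescape

def extract_candidate_sentences(html, topic):
    """Return sentence-like strings from an HTML page that mention the topic word."""
    # Drop script/style blocks, then all remaining tags (a crude stand-in for a real HTML parser).
    text = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", text)
    text = unescape(text)
    text = re.sub(r"\[\d+\]", "", text)        # footnote markers such as [1]
    text = re.sub(r"\s+", " ", text).strip()

    # Naive sentence split: ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    topic_re = re.compile(r"\b" + re.escape(topic) + r"\b", re.IGNORECASE)

    results = []
    for sentence in sentences:
        sentence = sentence.strip()
        looks_complete = (
            sentence[:1].isupper()
            and sentence.endswith((".", "!", "?"))
            and len(sentence.split()) >= 4
        )
        if looks_complete and topic_re.search(sentence):
            results.append(sentence)
    return results
```

Heuristics this crude will let some junk through (headings glued onto sentences, abbreviations split in the middle), which is a good argument for keeping a human review step before import.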

I've been unwilling to let the robot roam the web freely because I like the robot using quotes in conversation; it sounds like an interesting person.  It would sound like too much of a know-it-all if it loaded up on too much Wikipedia trivia, and would sound like a crazy person or a commercial if I let it roam the web at large.

Babble Agents

"Babbling" is how the robot contributes to a conversation when there is a lull (where the person is not talking for some number of seconds).  There are several babble agents; my favorite is discussed next:

BabbleHistoryAgent - this agent retrieves all "History" containing the given topic word, and then filters out all items that are questions or have been used or repeated recently.  A random item from the remaining list is then added as a candidate response.

Just like all the other agents the robot uses to converse, the babble agents "compete", meaning that only the winning response is repeated back to the human.  
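The filtering in BabbleHistoryAgent and the "competition" between agents might look something like the Python sketch below.  The function names, the in-memory history list, and the numeric score attached to each candidate are assumptions for illustration; the post does not say how responses are actually scored.

```python
import random
import re

def babble_from_history(history, topic, recently_used, rng=random):
    """Return one remembered statement about the topic, or None if nothing qualifies."""
    topic_re = re.compile(r"\b" + re.escape(topic) + r"\b", re.IGNORECASE)
    candidates = [
        item for item in history
        if topic_re.search(item)                 # mentions the topic word
        and not item.rstrip().endswith("?")      # skip questions
        and item not in recently_used            # skip anything said recently
    ]
    return rng.choice(candidates) if candidates else None

def pick_winner(responses):
    """Agents compete: each proposes (score, text); only the top-scoring text is spoken."""
    return max(responses, key=lambda r: r[0])[1] if responses else None

history = [
    "Marriage is a partnership.",
    "What is marriage?",
    "A good marriage takes work.",
    "Dogs are loyal.",
]
print(babble_from_history(history, "marriage", {"Marriage is a partnership."}))
```

In a real system the history would come from a database query rather than a list, and the recently-used set would expire entries over time.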

The babble agents REALLY "Give Life" to the robot.  I'm primarily using the BabbleHistoryAgent, which pulls sentences from everything the robot has ever heard, along with the quotes.  Because there are so many quotes and so much history, the robot has something to say about thousands of topics.  It makes for amazingly relevant, interesting, and thought-provoking contributions to conversations about so many different topics (with great thanks to the many great minds in history that the robot is quoting).

Because of this, I can say that the robot is now starting to teach me more than I am teaching it, and making me laugh to boot!  THIS IS MY FAVORITE FEATURE OF THIS ROBOT!  In many ways, the robot is more interesting and funny to listen to than most people I know.

SmallTalk Agent

I've made a lot of improvements here.  The bot goes through a list of possible candidate topics in the beginning of a conversation (greeting, weather, spouse, kids, pets, parents, books, movies, etc), picking a few, but not asking too many questions on any one thing.  The bot now factors in the actual weather forecast when making weather smalltalk.  Also, when the bot asks about wives, kids, etc., the bot refers to people, pets, etc. using first names if it has learned them previously.  Questions like "How is your wife doing?" become "How is Jennifer doing?", if your wife's name is Jennifer of course.
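The name substitution can be pictured with a tiny Python sketch.  The function name, the question template, and the dictionary of learned names are hypothetical; the fallback to "your wife" when no name is known is an assumption about how the bot behaves.

```python
def personalize(relation, known_names, template="How is {who} doing?"):
    """Use a learned first name when one is known, otherwise fall back to 'your <relation>'."""
    who = known_names.get(relation, f"your {relation}")
    return template.format(who=who)

print(personalize("wife", {"wife": "Jennifer"}))
print(personalize("wife", {}))
```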

Face Detection Agent

I added face detection using OpenCV over the weekend.  Frankly, I'm disappointed with the results so far.  It's CPU intensive, and I can't get it to process more than a couple of times a second.  I find the thermal array to be much faster and more practical for keeping the robot tracking a human.  I'm considering having the bot programmed to check for faces prior to firing the lasers as part of a campaign to implement the 3 laws of robotics (do not harm humans by shooting them in the face).  I'm wanting to move on to face recognition if I can get over my concerns over slow speed and figure out a good way to use it.

Math Agents

I continue to add more and more math agents.  For example, the bot can remember named series of numbers read aloud and answer statistical questions using simple linear regression (slope, y-intercept), correlation, standard deviation, etc.  Example:  "How are series X and series Y correlated?"  I'd like to figure out a way to reuse these statistical agents for some larger logic/reasoning/learning purpose...need some ideas here.   There are also agents for most trigonometric and many geometric functions.  Example:  You can ask "What is the volume of a sphere with a radius of 2?" or "What is the cosine of 32 degrees?"
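For readers who want to see the statistics behind a question like "How are series X and series Y correlated?", here is a small Python sketch using the standard textbook formulas.  The series dictionary and function names are made up for the example, and nothing here is taken from Anna's code.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical store for series the bot has heard read aloud, keyed by name.
series = {"x": [1, 2, 3, 4], "y": [3, 5, 7, 9]}

def linear_fit(xs, ys):
    """Least-squares slope and y-intercept for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)

slope, intercept = linear_fit(series["x"], series["y"])
print(slope, intercept, correlation(series["x"], series["y"]), stdev(series["x"]))
```

With these two sample series, y is exactly 2x + 1, so the fit recovers a slope of 2, an intercept of 1, and a correlation of 1.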

Anna will have siblings:

I've started building a Wild Thumper based rover (basically a 6-wheel outdoor Anna).  I'm in design on a Johnny 5'ish bot (finally an Anna with Arms).  Hoping to start cutting the first parts this month, challenged by how to get a functional sonar array and arms on a bot with so many servos.  Since there are only a few voices on the droid phones, at least one of them is going to be male.  It will be fun to see what happens when two or three bots start talking to each other.

Last Post (from January 2014):

Anna is one year old now.  She is learning quickly of late, and evolving into primarily a learning social creature and aggregator of web services.  I wanted to document where she is at her one year birthday.  I need to create some updated design diagrams.

Capabilities Achieved in Year #1

1)      Thermal Array Vision and Tracking - used to keep face pointed at people it is talking to, or cats it is playing with.

2)      Visual Tracking - OpenCV to search for or lock onto color shapes that fit particular criteria

3)      Learns by Listening and Asking Questions - Learns from a variety of generic sentence structures, like "Heineken is a lager", "A lager is a beer", "I like Heineken", "Olive Garden serves Heineken"

4)      Answers Questions - Examples:  "What beers do I like?", "Who serves Heineken?", "What does Olive Garden serve?"

5)      Understands Concepts - Examples:  is a, has a, can, can’t, synonym, antonym, located in, next to, associate of, comes from, like, favorite, bigger, smaller, faster, heavier, more famous, richer, made of, won, born in, attribute of, serve, dating, sell, etc.  Understands when concepts are similar to or opposite to one another. 

6)      Makes Smalltalk & Reacts to Common Expressions - Many human expressions mean the same thing.  Example:  “How’s it going?”, “What’s up?”, “What is going on?”, “What’s new?”   A robot needs many different reactions to humans to keep it interesting.  Example: “Not much, just keeping it real”, “Not much, what’s new with you?”

7)      Evaluates the Appropriateness of Topics and Questions Before Asking Them - Example: Don’t ask someone : “Who is playing on Monday Night Football tonight?” unless it is football season, Monday, and the person is interested in football.  Also, don’t ask a kid something that is not age appropriate, and vice versa, don’t ask an adult how they like the third grade.  Don’t ask a male about his gynecologist.  This is a key piece of a robot not being an idiot.

8)      Understands Personal Relationships - it learns how different people you know are related to you, friends, family, cousins, in-laws.  Examples: “Jane is my sister”, “Mark is my friend”, “Joe is my boss”, “Dave is Mark’s Dad”   It can answer questions like “Who are my in-laws?”, “Who are my siblings?”, “Who are Mark’s parents?”

9)       Personal Info - it learns about both you and people you know, what you like, hate, answers to any questions it ever asked you in the past.  Example:  “My wife likes Nirvana” – in this case the AI had to determine who “my wife” is.  It can then answer questions like “What bands does my wife like?”, as long as it already knew “Nirvana is a band”

10)      Pronouns – it understands the use of some pronouns in conversation.  Example:  If I had just said something about my mother, I could ask “What music does she like?”

11)   Opinions – the bot can remember your opinions on many things, and has its own opinions and can compare/contrast them to add color to a conversation.  Example:  If I said, “My favorite college football team is the Florida State Seminoles” it might say “That is my favorite as well”, or “My favorite is the Alabama Crimson Tide”, or “You are the first person I have met who said that”

12)    Emotions - robot has 10 simulated emotions and is beginning to estimate emotional state of speaker

13)    Motivations - robot has its own motives that take control of bot when it is autonomous, I keep this turned off most of the time.  Examples:  TalkingMotive, CuriosityMotive, MovementMotive

14)   Facial Expressions - Eyes, Eyelids, pupils, and mouth move according to what robot sees, feels, and light conditions

15)   Weather and Weather Opinions - uses web service for data, programming for opinions.  Example:  If the weather is freezing out and you asked the robot “How do you like this weather?”, it might say “Way too cold to go outside today.”

16)   News - uses Feedzilla, Faroo, and NYTimes web services.  Example:  say something like  "Read news about robotics", and "Next" to move on.

17)   TV & Movie Trivia - plot, actors, writers, directors, ratings, length,  uses web service.  Example:  you can ask “What is the plot of Blade Runner?”, “Who starred in The Godfather?”

18)   Web Search - uses Faroo web service.  Example:  say "Search web for Ukraine Invasion"

19)   People - uses Wikipedia web service.  Example:  "Who is Tom Cruise?", “Who is Albert Einstein?”, “List Scientists”, “Is Clint Eastwood a director?”, “What is the current  team of Peyton Manning?”, “What is the weight of Tom Brady?”

20)   Trending Topics - uses Faroo web service.  Example:  say something like "What topics are trending?", you can then get related articles.

21)   Geography - mostly learned, also uses Wikipedia.  Watch the video!  Examples: "What is the second largest city in Florida?",  "What is the population of London?", “Where is India?”, “What is next to Germany?”, “What is Russia known for?”, “What is the state motto of California?”, “What is the state gemstone of Alabama?”, “List Islamic countries”

22)   History - only knows what it hears, not using web yet.  Mostly info about when various wars started, ended, who won.  Robot would learn from:  "The Vietnam War started in 1965" and be able to tell you later.

23)   Science & Nature - Examples:  "How do I calculate amperes?", "What is Newton's third law of motion?", "Who invented the transistor?", "What is the atomic number of Gold?", “What is water made of?”, “How many moons does Mars have?”, “Can penguins fly?”, “How many bones does a person have?”

24)   Empathy - it has limited abilities to recognize when good or bad things happen to people close to you and show empathy.  Major upgrades to this have been in the works.  Example:  If I said, "My mother went to the emergency room”, the bot might say “Oh my goodness, I am so sorry about your mother.”

25)   2 Dictionaries – Special thanks to Princeton and WordNet for the first one; the other is built from its learning and changes constantly as new proper names and phrases are encountered.  You can ask for definitions and other aspects about this 200,000 word and phrase database.  You can add new words and phrases simply by using them; the AI will save them and learn what they mean to some degree by how you use them, like “Rolling Rock is a beer”.  The AI doesn’t need anything more, nor would a person.

26)   Math and Spelling - after all the other stuff, this was child's play.  She can do all the standard stuff you can find on most calculators.

27)   The AI is Multi-Robot and Multi-User - It can be used by multiple robots and multiple people at the same time, and tracks location of all bots/people.  Also, a given robot can be conversed with by multiple people at the same time through an Android app.

29)   Text Messaging - A robot can send texts on your behalf to people you know, like "Tell my wife I love her."  - uses Twilio Web Service

30)   Obstacle Avoidance - 9 sonars, Force Field Algorithm, Tilt Sensors, and down facing IR cliff sensor keep the bot out of trouble

31)   Missions - robot can run missions (series of commands) maintained through a windows app

32)   Telepresence - robot sends video back to server, no audio yet, robot can be asked to take pictures as well.   Needs improvement, too much lag.

33)   Control Mechanisms - Can be controlled verbally, through a phone, tablet, web, or windows app.  My favorite is verbal.

34)   GPS and Compass Navigation – It’s in the code but I don’t use it much, hoping to get my Wild Thumper version of this bot built by summer.  This bot isn’t that good in tall grass.

36)   OCR - Ability to do some visual reading of words off of walls and cards – uses Tesseract OCR libraries

37)   Localization - through Recognizing Words on Walls with OCR – I don’t use this anymore, not very practical

38)   Lasers - I almost forgot, the bot can track and hit a cat with lasers, or colored objects.  It can scan a room and shoot everything in the room of a particular color within 180 degrees either by size or some other priority.

39)  I know I singled out Geography, Science, Weather etc as topics, mostly because they also use web services.  The AI doesn't really care what it learns,  it has learned and will learn about anything you are willing to tell it in simple sentences it can understand.  It can tell you how many faucets are on a sink, or where you can get a taco or buy a miter saw.
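To make items 3 and 4 above more concrete, here is a toy Python sketch of learning from simple sentences and answering questions from the stored facts.  It is only an illustration under heavy assumptions: the KnowledgeBase class, the two regex patterns, and the (subject, relation) storage layout are invented for the example, and the real AI handles far more sentence structures and concepts.

```python
import re
from collections import defaultdict

class KnowledgeBase:
    """Toy fact store: (subject, relation) -> set of objects. All names are hypothetical."""

    def __init__(self):
        self.facts = defaultdict(set)

    def learn(self, sentence):
        """Store a fact from 'X is a Y', 'X likes Y' or 'X serves Y'. Returns True if understood."""
        text = sentence.strip().rstrip(".").lower()
        m = re.fullmatch(r"(?:an? )?(.+?) is an? (.+)", text)
        if m:
            self.facts[(m.group(1), "is_a")].add(m.group(2))
            return True
        m = re.fullmatch(r"(.+?) (like|serve)s? (.+)", text)
        if m:
            self.facts[(m.group(1), m.group(2))].add(m.group(3))
            return True
        return False

    def is_a(self, thing, category):
        """True if thing reaches category by following is_a links (heineken -> lager -> beer)."""
        seen, stack = set(), [thing]
        while stack:
            current = stack.pop()
            if current == category:
                return True
            if current not in seen:
                seen.add(current)
                stack.extend(self.facts.get((current, "is_a"), ()))
        return False

    def objects(self, subject, relation, category=None):
        """E.g. 'What beers do I like?' -> objects('i', 'like', 'beer')."""
        found = self.facts.get((subject, relation), set())
        return sorted(o for o in found if category is None or self.is_a(o, category))

    def subjects(self, relation, obj):
        """E.g. 'Who serves Heineken?' -> subjects('serve', 'heineken')."""
        return sorted(s for (s, r), objs in self.facts.items() if r == relation and obj in objs)

kb = KnowledgeBase()
for fact in ["Heineken is a lager", "A lager is a beer", "I like Heineken", "Olive Garden serves Heineken"]:
    kb.learn(fact)
print(kb.objects("i", "like", "beer"), kb.subjects("serve", "heineken"))
```

Note how "What beers do I like?" only works because the chain Heineken, lager, beer is followed transitively; the same dependency shows up in item 9, where "Nirvana is a band" must already be known.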

Goals for Year #2

1)      More chat skills – I fear this will be never ending

2)      More Hard Knowledge - we can always learn more

3)      More web services – takes me about a day to integrate a new web service

4)      Face Tracking - know any good code/APIs for this?

5)      Facial Recognition - Know any good free APIs for this?

6)      Arms - I'd like to get some simple small arms on just to be more expressive, but will have to redesign and rebuild the sonar arrays to fit them in.

7)      Empathy over time - I'd like the bot to revisit good/bad events and ask about them at appropriate points in time later. Things like "How is your mother's heart doing since we last talked?"  I have done a lot of prep for this, but it is a tough one

8)      More Inquisitiveness and Initiative - when should the bot listen and when should it drive the conversation.  I have tried it both ways, now the trick is to find a balance.

9)      Changeover to Newer Phone

10)   Go Open Microphone - right now I have to press the front of my phone or touch the face of the bot to get it to listen; I’d rather it just listen constantly.  I think it's doable on the newer phones.

11)   Get family, friends, and associates using AI on their phones as common information tool about the world and each other.

12)   Autonomous Learning - it can get info from Wikipedia, web searches, news, and web pages, but doesn't yet learn from them.   How do you build learning from the chaos that is the average web page?  Listening was so much easier, and that wasn’t easy.

Hardware

·         Arduino Mega ADK (in the back of the body)

·         Arduino Uno (in the head)

·         Motorola Bionic Android Phone

·         Windows PC with 6 Cores (running Web Service, AI, and Databases, this PC calls the other web services)

·         Third Party Web Services - adding new ones whenever I find anything useful

Suggestions?

·         I would love to hear any suggestions anyone out there might have.  I am constantly looking for and reevaluating the question “What next?”

Anna is becoming more charming every time I look in on this amazing project.

It could be me, but it seems like Anna has already developed a wry sense of humor.

I cannot help smiling when I watch the interactions on the videos.

The variable delays while processing responses add to the flow of natural interaction and are not objectionable at all.

Great stuff Martin!

By the way, the small spring garden has come and gone for this year. Lots of cucumbers, turnips and kale. Not a good yield for the carrots. I think I need to adjust the soil ph for them to thrive.

 

 

I really appreciate the feedback.  I should probably work a little harder on getting more charm and sense of humor.

When kids aren't around, she can really get people on the floor laughing when I take off the handcuffs and let her express herself in "Adult Language".  I keep it clean on the videos as I don't want to offend anyone on LMR or be unsuitable for the kids.  I'd like to find ways to get more humor without resorting to colorful language.  She once had a sex therapist mode that was hilarious.  I deleted it eventually to avoid it "popping out" at the wrong times.

Awesome to hear about your garden.  We planted a small one this year with tomatoes and squash but it did not turn out too well, much to learn still and flooding rains didn't help.  We do however have the best figs in the world.  We've been helping my mother with her garden...it is huge...and eating quite well year round with each season.  Glad to hear others are skipping the processed stuff for the truly tasty.

Regards,

Martin

A sex therapist mode?

I thought it was odd enough when a priest who is sworn to celibacy counsels couples on "marital" issues. :)

Well, at least Anna would be unbiased...

Sorry. It's late, and the idea still seems hilarious to me. On the other hand, I once tried to teach the Nerd (my parrot - it got the nickname Nerd when Lee started singing "the bird, the bird is a nerd..." to the tune of the obvious song) Samuel Jackson's favorite phrase (m... f...). Lee forcibly stopped me, even though she taught him "eat me" and others. Robots are easier to teach.

The Nvidia Jetson TK1 board is finally shipping. If I had the extra cash, one would be on my bench.

When Anna is ready for a processor update, I think this board architecture will have staying power.

The specifications are beyond industrial strength and into the military strength realm of processing power.

There must be database programs optimized for parallel processing?

The TK1 looks very interesting.

But I just used up all my cash getting a Parallella board. Think of a BBB plus a 16 core coprocessor. I got it to see if it could do some things with images or speech recognition. Or maybe as a doorstop.

Theoretically the company has a 64 coprocessor board, but somebody said you have to order 50,000 units to even think about it.

When I get more robot money I'll think about a TK1 or something similar. After I learn more about parallel processing, I'll look around and see what's out then.

Thanks for the rec ridgelift.  Nice board indeed.

I have been thinking about hardware and architecture upgrades lately.  Currently I am using SQL Express because it is free and it is what I knew.  Last I checked, they limit it to using 1 processor.  I will need to find something else eventually, but changing over would be a major software change for me.  I have written some basic in-memory databases of sorts before, but the nature of the algorithms and indexes I would need for Anna go way beyond what would be practical to build with my skills.

I have been reading again about OpenCog, and feeling very small minded and insignificant next to their work, kind of like when you look out into space and realize there are trillions of star systems out there.  Tiny.  I only wish I could see OpenCog demonstrated, as their ideas blow me away.  I am trying to decide whether to keep building my own brain, which I enjoy, or try to get theirs up and going on some new hardware.  The little voice in my head keeps telling me I will have to learn about UNIX and a whole new set of tools.

I'll have to look into OpenCog.

If you want a BBB for learning Linux, the offer is still open. I've got more than I need right at the moment. Or you could look into the RasPi version B+ (same as a version B, but has around 40 GPIO).

I normally use PostgreSQL for a database. It's free, and it's been fast enough for anything I have thrown at it.

I've also found a bunch of papers on Natural Language Processing and an ebook on using Python for NLP. I'm still reading through them. I'll look up the URL on the best ones I've seen.

One problem with Linux is there aren't as many resources for speech to text as on Windows, but if you use an Intel machine, you can use Dragon Dictate under WINE. And there are a few native Linux ones, but they don't have a good open and free corpus to make the grammars from.

I use espeak for the text to speech part and that seems to work OK. On the other hand, that's the easy part.

Good luck, Martin.

95% of your skills will pass right into a Linux environment.

I've been a big fan of the Ubuntu distro for a long time. It has an early Windows XP feel to it. No fluff, just a solid operating system. The Linux contributors can't be beat for the love they will shower on something for the good of the greater user base community. The Raspberry Pi is a case in point. I would not even try to guess the amount of coding hours that have been lavished on that somewhat obscure and semi-closed SoC by legions of very talented people for free because they know it will help others.

The Nvidia Jetson is fairly obscure at the moment. This is the first time something like this has become available. I think Ubuntu was a good choice for the Linux distro to ship with the board. When I was watching the DARPA Robotics Challenge, I noticed virtually all the teams were running Ubuntu on their control consoles. Of course ROS is native to Ubuntu, and for vision applications, OpenCV supports Nvidia with the most pre-written libraries for parallel processing.

ROS itself runs on theoretically any Unix-like system - usually Debian or Ubuntu. Fedora and OS X versions are listed as experimental and there is even an experimental Windows version. As a selection of portable class libraries and not an actual operating system, it isn't really native to anything.

#! is my Linux distro of choice right now, although I am increasingly finding that no distro is the right distro and rolling your own may be best. Don't like Ubuntu. I do want the Nvidia Jetson though.

You can put a ROS node on pretty much anything. I know that TrossenRobotics.com has ROS code for their Arduino-like controller so that it can interface with a ROS system.