Let's Make Robots!

Artificial Intelligence framework

MarcusB recently posted his Variable Stochastic Learning Automaton code.


Basically, his code does the math to randomly select an action when an event occurs. The robot then waits for input from the user as to whether it was a good action or not. If it was good, that action gets a higher chance of being chosen the next time the event occurs, while the chances for the other actions go down. Eventually, the best course of action for an event bubbles up as the highest priority. The idea is similar to how a baby responds to stimuli in its environment.
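
To make the idea concrete, here is a minimal sketch of the select-then-update cycle. It assumes a textbook linear reward-penalty rule, which is not necessarily the exact math in MarcusB's code, and the names selectAction, updateProbabilities and rate are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pick an action index with chance proportional to its probability.
// `roll` is a uniform random number in [0, 1) supplied by the caller,
// for example random(10000) / 10000.0 on an Arduino.
std::size_t selectAction(const std::vector<double>& probs, double roll) {
    assert(!probs.empty());
    double cumulative = 0.0;
    for (std::size_t i = 0; i < probs.size(); ++i) {
        cumulative += probs[i];
        if (roll < cumulative) return i;
    }
    return probs.size() - 1;  // guard against floating-point rounding
}

// Assumed linear reward-penalty update; `rate` lies in (0, 1).
// Both branches keep the probabilities summing to 1.
void updateProbabilities(std::vector<double>& probs, std::size_t chosen,
                         bool favorable, double rate) {
    assert(chosen < probs.size() && probs.size() > 1);
    const double others = static_cast<double>(probs.size() - 1);
    for (std::size_t i = 0; i < probs.size(); ++i) {
        if (favorable) {
            // Reward: pull the chosen action toward 1, shrink the rest.
            probs[i] = (i == chosen) ? probs[i] + rate * (1.0 - probs[i])
                                     : probs[i] * (1.0 - rate);
        } else {
            // Penalty: shrink the chosen action, share the loss equally.
            probs[i] = (i == chosen) ? probs[i] * (1.0 - rate)
                                     : probs[i] * (1.0 - rate) + rate / others;
        }
    }
}
```

With four actions at 0.25 each and a rate of 0.5, one favorable result for action 0 moves it to 0.625 and the other three to 0.125.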

I thought the algorithm was fascinating. His code really caught my imagination because it looked relatively easy to fit his algorithm into something more reusable than the straight C it is written in. So, I did the easy stuff and created base classes that make the code reentrant, etc. This is a framework that developers can hook into to define their own events and actions. It is only a framework; making something like this work still requires a lot of work and code, but it is a good start on a difficult problem.

Event - abstract base class - must be overridden

bool RunLogic() - overridden method must return true when the event occurs - this signals the framework to call the OnStart() method
bool OnStart() - while RunLogic() returns true, OnStart() is called on each scan until it returns true - success triggers the framework to get a random action and then try it
bool OnComplete() - last method called after an event has been triggered - it runs with each scan until it returns true; the framework then calls CheckIfActionIsFavorable() and updates the action probabilities
bool CheckIfActionIsFavorable() - overridden method determines whether the action was favorable or not - its return value is what is used in the probability matrix
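
The lifecycle above might be driven roughly as follows. This is only a sketch, not the code in the zip: the four method names come from the list, while Scan(), lastVerdict(), Phase and DemoEvent are hypothetical, and running the chosen action between OnStart() and OnComplete() is left out to keep it short. For simplicity it also stops polling RunLogic() once the event has fired.

```cpp
class Event {
public:
    virtual ~Event() {}
    virtual bool RunLogic() = 0;                  // true when the event occurs
    virtual bool OnStart() { return true; }       // polled until it returns true
    virtual bool OnComplete() { return true; }    // polled until it returns true
    virtual bool CheckIfActionIsFavorable() = 0;  // feeds the probability update

    // Hypothetical driver: one non-blocking step per loop() pass.
    // Returns true on the scan where a verdict becomes available.
    bool Scan() {
        if (phase_ == Phase::Waiting && RunLogic()) phase_ = Phase::Starting;
        if (phase_ == Phase::Starting) {
            // A full framework would pick and run a random action here.
            if (OnStart()) phase_ = Phase::Completing;
            return false;
        }
        if (phase_ == Phase::Completing && OnComplete()) {
            verdict_ = CheckIfActionIsFavorable();
            phase_ = Phase::Waiting;
            return true;
        }
        return false;
    }
    bool lastVerdict() const { return verdict_; }

private:
    enum class Phase { Waiting, Starting, Completing };
    Phase phase_ = Phase::Waiting;
    bool verdict_ = false;
};

// Demo subclass: fires at once, needs two scans to start, always favorable.
class DemoEvent : public Event {
public:
    int startCalls = 0;
    bool RunLogic() override { return true; }
    bool OnStart() override { return ++startCalls >= 2; }
    bool CheckIfActionIsFavorable() override { return true; }
};
```

Calling Scan() three times on a DemoEvent walks it through the whole cycle: two passes in OnStart(), then one that completes and records the verdict.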

Action - class can be overridden, or the object can be used as a "container" for child actions

bool OnStart() - overridden method is called as the first method on a custom action - this code will be run with every scan until method returns true
bool OnComplete() - last method called after an action is run - this code will be run with every scan until method returns true

Actions can also have child actions, each of which won't start until the previous action in the list is complete. The code is designed for cooperative multitasking, meaning that it yields if it has nothing to do. There should never be a delay(...) in code that overrides my classes.
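
A rough sketch of how sequential child actions can run without blocking is shown here. The OnStart()/OnComplete() names are from the description; Scan(), AddChild() and scansUntilDone() are hypothetical, the order "parent OnStart, then children, then parent OnComplete" is an assumption, and std::vector is used only for brevity (on an AVR board a fixed array would take its place).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Action {
public:
    virtual ~Action() {}
    virtual bool OnStart() { return true; }     // polled until it returns true
    virtual bool OnComplete() { return true; }  // polled until it returns true

    void AddChild(Action* child) {
        assert(child != nullptr);
        children_.push_back(child);
    }

    // One cooperative step. Returns true once this action and all of
    // its children have finished; never waits or calls delay().
    bool Scan() {
        if (!started_) {
            started_ = OnStart();
            return false;
        }
        if (next_ < children_.size()) {
            // Only the current child runs; the next waits its turn.
            if (children_[next_]->Scan()) ++next_;
            return false;
        }
        if (!OnComplete()) return false;
        started_ = false;  // reset so the action can be reused
        next_ = 0;
        return true;
    }

private:
    std::vector<Action*> children_;
    std::size_t next_ = 0;
    bool started_ = false;
};

// Usage helper: count how many loop() passes an action needs.
int scansUntilDone(Action& action) {
    int scans = 1;
    while (!action.Scan()) ++scans;
    return scans;
}
```

Because every Scan() returns immediately, the main loop stays free to poll other events between steps.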

As an example, take a very basic robot with two motors and bumpers on the front. I create a custom class for each event and each potential action on an Arduino. I have an Arduino so that is what I am testing with, but really anything could be used. All code to drive the robot is in the actions' OnStart() methods. The learned action probabilities drive the robot's personality. Ultimately, I see a database that drives events and actions and saves probabilities on a separate "big" controller, which communicates with a "small" processor that does the fiddly bits like encoders. Perhaps a RasPi with a Dagu Mini Driver doing the IO.


CollisionEvent (if one or both bumper limit switches are closed)



On each loop() call, I check my list of events to see if an event occurred. If one did, I randomly select an action, try it, decide whether it was favorable or not and then update the probabilities. Eventually, GoBackwards, GoBackwardsAndTurnLeft, GoBackwardsAndTurnRight will each be close to 33% which is what you would expect. I can write the event to wait 5 seconds and then, depending on whether the robot has run into anything in those 5 seconds, decide whether the chosen action was favorable or unfavorable. Whatever. Over time the robot trains itself how best to respond to an event. It is up to the developer to build in the success or failure criteria for an event.
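
The five-second check can be done without blocking. The sketch below is one possible way, assuming timestamps come from millis(); the class QuietWindow and its methods are invented for illustration and are not part of the posted code.

```cpp
#include <cassert>
#include <stdint.h>

// Judges an action favorable if no bump is seen within a time window.
class QuietWindow {
public:
    explicit QuietWindow(uint32_t windowMs) : windowMs_(windowMs) {
        assert(windowMs > 0);
    }

    // Call when the action finishes; nowMs would come from millis().
    void Start(uint32_t nowMs) {
        startMs_ = nowMs;
        bumped_ = false;
    }

    // Call whenever a bumper switch closes during the window.
    void NoteBump() { bumped_ = true; }

    // True once a verdict is possible: a bump happened or time ran out.
    // Unsigned subtraction keeps this correct across millis() rollover.
    bool Done(uint32_t nowMs) const {
        return bumped_ || static_cast<uint32_t>(nowMs - startMs_) >= windowMs_;
    }

    bool Favorable() const { return !bumped_; }

private:
    uint32_t windowMs_;
    uint32_t startMs_ = 0;
    bool bumped_ = false;
};
```

An event's OnComplete() could return Done(millis()) and CheckIfActionIsFavorable() could return Favorable(), so the loop keeps scanning while the window is open.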

Note: this program uses around 11k of the 32k of flash (program memory) and around 300 bytes of SRAM for global variables on my test Arduino. That is actually fairly small and a lot more could be done on an Arduino, but a real-world problem is soon going to be way too big.

I was going to show a video, but the bot looks like a drunk moose during mating season as it rams walls and smashes into things, and then the video camera ran out of room on its SD card. If people really want a video I can oblige.

These events could be anything. An event could be that the robot sees a human face, hears someone speak, etc. Using this algorithm, a robot could learn how best to keep people engaged. Or it could learn what to say when it sees someone in order to get that person to spend more time with it. What would be very cool is if a robot could look at its environment, or at patterns in the events that are occurring, and from that create its own custom events and actions.

I also include the class LearningEvent, which basically allows one to dynamically learn the criteria for an event. I haven't thought this through as well yet, so this idea may get trashed, but I like the idea of being able to teach a robot what criteria can be used to generate an event. For instance, if one puts an ultrasonic sensor on a bot that previously only had bumpers, one could use this class to "learn" what the minimum distance is before a bumper is touched. Based on this value, it can then "learn" what the best strategy is to deal with the event when it occurs. It might also be used to tell how successful a hand is at grabbing a soda can. After each try that isn't 100% successful, it can retrain by trying different numbers. This could also be used to tune the P, I and D terms of a PID loop. I am sure this will change; I appreciate input and ideas on this.
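
As a purely illustrative sketch of the ultrasonic example (not how LearningEvent in the zip works), a learned threshold could be as simple as remembering the largest sonar reading seen at the moment a bumper closed, plus a safety margin. LearnedThreshold, RecordContact, ShouldFire and the centimetre units are all assumptions.

```cpp
#include <cassert>

// Learns a distance below which a collision is likely.
class LearnedThreshold {
public:
    explicit LearnedThreshold(double marginCm) : marginCm_(marginCm) {
        assert(marginCm >= 0.0);
    }

    // Call with the sonar reading taken when a bumper switch closes.
    void RecordContact(double distanceCm) {
        if (samples_ == 0 || distanceCm > maxAtContactCm_) {
            maxAtContactCm_ = distanceCm;
        }
        ++samples_;
    }

    bool IsTrained() const { return samples_ > 0; }

    // Candidate event criterion: the reading is inside the learned zone.
    bool ShouldFire(double distanceCm) const {
        return IsTrained() && distanceCm <= maxAtContactCm_ + marginCm_;
    }

private:
    double marginCm_;
    double maxAtContactCm_ = 0.0;
    int samples_ = 0;
};
```

Once trained, ShouldFire() could serve as the RunLogic() of a new "about to collide" event, whose actions are then learned in the usual way.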

This is a very simple example, and the code works fine with only one event. With larger event queues, there will be problems that have to be dealt with, such as two events happening simultaneously. Whose actions do we choose? The bot will also need emergency events that override all other events (over amperage on motors - stop!) and default actions (ok, no events are occurring, what should the robot be doing?). I am really going to need another class to arbitrate these issues, so more cuts are coming. I also need to change my code to use better random number generation, per what MarcusB had.
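
One simple shape such an arbiter could take is a priority pick with an emergency override and a fallback. This is a sketch under assumptions only: PendingEvent, its fields, kDefaultBehaviour and arbitrate() are hypothetical, and std::vector stands in for whatever container fits the target board.

```cpp
#include <cassert>
#include <vector>

struct PendingEvent {
    int id;          // hypothetical identifier, must be >= 0
    int priority;    // higher number wins among ordinary events
    bool emergency;  // emergency events beat every ordinary event
};

const int kDefaultBehaviour = -1;  // returned when nothing is pending

// Choose which of the simultaneously pending events gets to act.
// Ties go to the event that appears first in the list.
int arbitrate(const std::vector<PendingEvent>& pending) {
    int winner = kDefaultBehaviour;
    int bestPriority = 0;
    bool bestEmergency = false;
    for (const PendingEvent& e : pending) {
        assert(e.id >= 0);
        const bool better =
            winner == kDefaultBehaviour ||
            (e.emergency && !bestEmergency) ||
            (e.emergency == bestEmergency && e.priority > bestPriority);
        if (better) {
            winner = e.id;
            bestPriority = e.priority;
            bestEmergency = e.emergency;
        }
    }
    return winner;
}
```

An empty list yields kDefaultBehaviour, which the main loop could map to whatever the robot should do when nothing is happening.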

Thanks MarcusB for the idea and for doing the math.  I changed your code a little bit, but I think it keeps the spirit if not the exact math of what you were trying to do.  I know this is an interim cut of the code, so any ideas or suggestions are welcome.  I can also do more documentation if people seem interested in pursuing this idea since I only have a brief description of the classes.

I do my development in MS Visual Studio and then download to the Arduino when ready.  So there are some #define NOARDUINO etc which allow me to seamlessly go from PC to Arduino worlds with the same code base. 
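
For readers who have not seen this trick, here is a minimal sketch of the kind of shim a NOARDUINO switch makes possible. Only the macro name comes from the post; the millis() stand-in and the intervalElapsed() helper are assumptions about how one might use it.

```cpp
#define NOARDUINO  // PC build; leave undefined when compiling for the Arduino

#include <stdint.h>

#ifdef NOARDUINO
#include <chrono>
// PC stand-in for the Arduino millis(): milliseconds since the first call.
inline uint32_t millis() {
    using clock = std::chrono::steady_clock;
    static const clock::time_point start = clock::now();
    return static_cast<uint32_t>(
        std::chrono::duration_cast<std::chrono::milliseconds>(
            clock::now() - start).count());
}
#else
#include <Arduino.h>  // the real millis() comes from the Arduino core
#endif

// Shared code: a non-blocking interval check that builds in both worlds.
inline bool intervalElapsed(uint32_t sinceMs, uint32_t intervalMs) {
    return static_cast<uint32_t>(millis() - sinceMs) >= intervalMs;
}
```

Everything below the #endif is ordinary shared code, so it can be stepped through in a desktop debugger and then compiled unchanged for the board.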





LearningAutomaton.zip (41.51 KB)


I am using the algorithm at the moment for a chatbot. Instead of actions, the robot randomly chooses topics. Based on your responses, the robot learns after a while which topics you want to talk about and which ones not so much.

Thanks for your interest, and thanks for describing the algorithm in an easily understandable, non-mathematical way, which is quite difficult for me.

I thought your description was pretty clear.  I certainly had little trouble understanding the gist of what this was supposed to do.  The math didn't make sense until I actually stepped through the code in a debugger.

Thanks for doing the hard part on this - the math. I did change it a bit; I think my changes reduced controller cycles, although I could have broken it. If you have a minute and can verify I didn't mess up your math, that would be great. I plan to replace the randomSeed() and random() Arduino calls with the entropy calls to make sure we are getting truly random values.

You brought up a use case I hadn't thought of. I suppose one could have the event be that one gets a response from a user (or a particular amount of time goes by and you don't get a response) and the potential actions could be topics with specific statements connected to each topic (action in my object model).  It doesn't fit well into my object model; I will need to think about this some more.




Another possibility would be, if single actions do not lead to success over a certain time, to combine basic actions and apply them to the environment again. Let's say two of the pre-defined actions are "Go back" and "Turn left", and neither of them leads to success over time; a new randomly assembled action would be, for instance, "Go back-Turn left". With just a few pre-defined actions you would get a very large number of possible permutations, which leads to very complex behavior of the robot.
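
As a small illustration of how quickly the combinations grow, the sketch below enumerates every ordered pair of distinct primitive actions; a robot could then pick one of them at random. The function name composePairs and the use of strings as action labels are assumptions made for the example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Build every ordered pair of distinct primitives, e.g. "Go back-Turn left".
// n primitives give n * (n - 1) two-step actions.
std::vector<std::string> composePairs(const std::vector<std::string>& primitives) {
    std::vector<std::string> combos;
    for (std::size_t i = 0; i < primitives.size(); ++i) {
        for (std::size_t j = 0; j < primitives.size(); ++j) {
            if (i != j) {
                combos.push_back(primitives[i] + "-" + primitives[j]);
            }
        }
    }
    return combos;
}
```

Three primitives already give six two-step actions, and allowing longer sequences grows the count much faster.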

That is a good idea.  It is kind of what we do as people; we reevaluate our success (or lack thereof) with certain behaviors and then try to think of new ways to deal with situations. 

I had also thought about how to randomize all of the inputs and outputs into a matrix of potential events and actions and then let the robot go. I was concerned about how to build in safeguards to make sure the robot doesn't go from full forward to full backward, and also how to manage a complex series of events. Ok, A happened, so how long do I wait for B to happen? In the meantime, C, D and E events happened which kicked off random action Z. Do I reset event A? But what if events A, B, C, D, E in order are a pattern that I want? What do I do if I get pattern A, B, C, G, H, D, E? How do I know if any other atomic actions occurred, and at what point do I reset the event to start from scratch?

These issues aren't show stoppers; I just need to think them through carefully. Let me know if you have any other ideas. I will think about how to fit this into an overall object oriented design.





I would build the safeguards into the Actions rather than the Events. In other words, GoFullBackwards should have the responsibility of checking the current speed to see how it can work. Basically any Event that changes motor speed, including turning, has to check for this, because if the bot is going full forward it would have to slow down in order to turn (at least with differential steering).

But this problem crops up in any type of event driven programming.

For this type of case I would have a way of pushing another action, and then the current action, onto the event stack at the position of the current event.

For the case of backing up when you're going too fast, I'd put in a SlowDown event then try the GoFullBackwards event. Or else just bounce it back and respond with some sort of "Sorry, I can't do this now" response.
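
A minimal sketch of that idea: the requested step is expanded into a safe sequence by checking the current motor speed first. The action names SlowDown and GoFullBackwards are from the comments above; MotorState, the -100..100 speed scale, kSafeReversalSpeed and planSafely() are assumptions for illustration.

```cpp
#include <deque>
#include <string>

struct MotorState {
    int speed;  // assumed scale: -100 full reverse .. +100 full forward
};

const int kSafeReversalSpeed = 20;  // assumed limit for reversing directly

// Expand a requested action into a sequence that is safe to execute,
// inserting a SlowDown step ahead of it when the bot is moving too fast.
std::deque<std::string> planSafely(const std::string& requested,
                                   const MotorState& motors) {
    std::deque<std::string> plan;
    if (requested == "GoFullBackwards" && motors.speed > kSafeReversalSpeed) {
        plan.push_back("SlowDown");
    }
    plan.push_back(requested);
    return plan;
}
```

At full forward speed the plan becomes SlowDown followed by GoFullBackwards; from a standstill it is just GoFullBackwards.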

I would also like to apply this algorithm, especially to a chatbot, one with a body.  

Thank you to MarkusB and Bill for the work, and to everyone else who is participating.

For me, there is this gray area I am getting into where the person is not holding up their end of the chat.  I want the bot to be polite for a time, speak from time to time on its own initiative in an attempt to continue a conversation/topic, attempt to change topics, and begin to speak less and attend to other bot behaviors more (like scanning or driving around), if the person is not engaging in dialogue.

As soon as I can find the time to digest this work you guys have done, I'll try to find something more simple to apply it to.  For me it will probably be in C# on a server since I'm running a database with thousands of possible "Events".

Bill, I also found your mention of "writing code in Visual Studio, running it on PC and also on Arduino later" to be fascinating.  I've never done that.  I would love to see a post on that if you ever find the time.