Let's Make Robots!

Software/hardware infrastructure project to control a vehicle over the Internet

Drive around, stream video back to the driver.

We are working on the software and hardware infrastructure to control vehicles (car, tank, quad-copter, etc.) over the internet. This is a completely open-source project with all the source code available here.

Video is streamed in real time from the on-board camera using an adaptive algorithm to keep a stable frame rate, which is required for comfortable control. A BeagleBoard is used as the on-board computer. In addition to high-level data processing like video compression, our goal was to control the servos from the BeagleBoard as well, without additional microcontrollers. GPIOs and the two built-in PWM generators are used. The only additional electronics required are voltage level shifters to convert the 1.8V signals to the standard 5V levels required by servos and speed controllers.
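
To give an idea of what driving those pins looks like from user space, here is a minimal sketch using the standard Linux sysfs GPIO interface (the pin number is only an example and not necessarily what is wired on our board; plain sysfs toggling like this is fine for switching things on and off but not precise enough for servo pulses, which is what the hardware PWM generators are for):

    // Minimal sketch: toggling a BeagleBoard GPIO through the Linux
    // sysfs interface. The pin number is only an example.
    #include <fstream>
    #include <unistd.h>

    int main()
    {
      const int pin = 168;  // example GPIO number, check your own wiring

      // Export the pin so /sys/class/gpio/gpio168/ appears
      { std::ofstream f("/sys/class/gpio/export"); f << pin; }

      // Configure it as an output
      { std::ofstream f("/sys/class/gpio/gpio168/direction"); f << "out"; }

      // Toggle it a few times; good enough for switching, too jittery
      // for accurate servo pulse widths
      for (int i = 0; i < 100; ++i)
      {
        { std::ofstream v("/sys/class/gpio/gpio168/value"); v << "1"; }
        usleep(10000);
        { std::ofstream v("/sys/class/gpio/gpio168/value"); v << "0"; }
        usleep(10000);
      }
      return 0;
    }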

A hacked RC racing car was used as the first prototype. Currently we are working on a much slower tank, which is more convenient to use at home because the RC car is just too fast for indoors. Our main focus at the moment is to implement stereo vision and then build a quad-copter with this software and hardware.


I took a look at your blog. Very interesting project. I hope you will share more info on LMR as it develops!

Frits just posted about his own multi-copter project. It would be great to see if you manage to adapt your vision processing to a flying platform.

Hit the wrong button to reply... sorry :-)

Thank you very much for the positive feedback!

We are almost done with the next iteration of the software and hardware, and I am currently in the process of writing a series of articles about the whole system architecture and its separate components. It will include things like generating precise PWMs with the Xenomai real-time Linux extensions, real-time communication over the I2C bus, video streaming, OpenGL-based visualization on the driver side, OpenCV-based anaglyph image generation and much more.
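
As a little teaser for the Xenomai part: the core of the precise PWM generation is essentially a periodic real-time task. A rough sketch using the Xenomai 2 native skin (gpio_set() is a hypothetical placeholder for the actual pin access, and the 20 ms / 1.5 ms values are just typical servo numbers, not necessarily what we use):

    // Rough sketch: a periodic Xenomai task generating a 50 Hz servo
    // signal in software (Xenomai 2 "native" skin, assuming the default
    // nanosecond time base). gpio_set() is a hypothetical placeholder.
    #include <native/task.h>
    #include <native/timer.h>

    static const RTIME PERIOD_NS = 20000000ULL;  // 20 ms servo frame
    static const RTIME PULSE_NS  = 1500000ULL;   // 1.5 ms = neutral

    void gpio_set(int value);  // placeholder, implemented elsewhere

    void pwm_loop(void *cookie)
    {
      (void)cookie;
      // Release this task every 20 ms from now on
      rt_task_set_periodic(NULL, TM_NOW, PERIOD_NS);
      while (1)
      {
        gpio_set(1);                // rising edge of the pulse
        rt_task_sleep(PULSE_NS);    // hold the pulse width
        gpio_set(0);                // falling edge
        rt_task_wait_period(NULL);  // sleep until the next period
      }
    }

    int main()
    {
      RT_TASK task;
      rt_task_create(&task, "servo-pwm", 0, 99, T_JOINABLE);
      rt_task_start(&task, &pwm_loop, NULL);
      rt_task_join(&task);
      return 0;
    }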

I will definitely post further information here as soon as it is ready.

Better late than never :-) - here, as promised, is more info about the project: http://letsmakerobots.com/node/29746

and even more details are available here: https://github.com/veter-team/veter/wiki


This is a very nice robot... It's very similar to what I had in mind for my own 3D vision robot; I even bought the same cameras.

I went for a different type of brain though: an Atom-based PC with an Arduino or two to interface it to the "real" world.

> This is a very nice robot...

Thanks! :-)

> I went for a different type of brain though: an atom based pc with an arduino

The very first version of mine was also made with an Atom-based Mini-ITX board and a microcontroller. The advantage of that solution is that the Atom CPU is rather powerful, so there are more things you can do on board. However, I decided to use the BeagleBoard without a microcontroller. The reasons are: a) the BeagleBoard requires considerably less power (much longer working time on batteries); b) using the DSP, it is not a problem to compress H.264 video in real time; c) the BeagleBoard has enough I/O pins, so there is no need for an additional microcontroller, which makes the system more compact and cheaper; d) the BeagleBoard is smaller. Taking into account that I am going to do the complicated image processing and decision making *not* on board but on the "driver" side or even on a special separate computer (GPU, etc.), the BeagleBoard has enough horsepower to deal with all the tasks that need to be done on board.

 

I had thought about adding 3D to a telepresence system using a couple of webcams like you guys are doing. It hasn't gotten off the ground yet - still just rolling around in my head - but I would love to bounce ideas about 3D video transmission and display techniques.

Sorry for the slow reaction. I was on vacation.

Regarding stereo vision - we decided to go the simplest way. What we are doing is just merging the two captured frames side by side into one wide frame using GStreamer's videomixer element. After that, everything is the same as for a normal single-camera source until the last visualization step. There, we use a simple function written with OpenCV which splits the wide image into two parts and generates the anaglyph picture.

You can get a general idea of how to use the videomixer element by looking at this example:
https://www.gitorious.org/veter/veter/blobs/master/misc/car.config#line58
It is not exactly what is necessary (I have not checked the code in yet for some reasons) but it is very close.
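
In case the config file is not handy, the basic idea expressed as a gst-launch pipeline looks roughly like this (GStreamer 0.10 syntax; the device names and resolutions are only examples):

    gst-launch-0.10 \
      videomixer name=mix ! ffmpegcolorspace ! autovideosink \
      v4l2src device=/dev/video0 \
        ! video/x-raw-yuv,width=320,height=240 \
        ! videobox border-alpha=0 left=0 ! mix. \
      v4l2src device=/dev/video1 \
        ! video/x-raw-yuv,width=320,height=240 \
        ! videobox border-alpha=0 left=-320 ! mix.

The negative left border on the second videobox shifts that camera into the right half of the mixed frame, so the output is one wide side-by-side image.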

The function to generate an anaglyph image from the two merged frames can be found here:
https://www.gitorious.org/veter/cockpit/blobs/master/src/toanaglyph.cpp#line29
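
The gist of it, as a rough sketch rather than a copy of that file: split the wide frame into its two halves, then take the red channel from the left eye and the green and blue channels from the right eye:

    // Rough OpenCV sketch of the idea (not a copy of toanaglyph.cpp):
    // split a side-by-side BGR frame and build a red/cyan anaglyph.
    #include <opencv2/opencv.hpp>
    #include <vector>

    cv::Mat to_anaglyph(const cv::Mat &side_by_side)
    {
      const int w = side_by_side.cols / 2;

      // Left and right halves of the wide frame
      cv::Mat left  = side_by_side(cv::Rect(0, 0, w, side_by_side.rows));
      cv::Mat right = side_by_side(cv::Rect(w, 0, w, side_by_side.rows));

      // Split into colour planes (OpenCV order is B, G, R)
      std::vector<cv::Mat> l, r;
      cv::split(left, l);
      cv::split(right, r);

      // Red channel from the left eye, blue and green from the right eye
      std::vector<cv::Mat> planes;
      planes.push_back(r[0]);  // B from right
      planes.push_back(r[1]);  // G from right
      planes.push_back(l[2]);  // R from left

      cv::Mat anaglyph;
      cv::merge(planes, anaglyph);
      return anaglyph;
    }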

So you are compressing and transmitting the side-by-side video stream to the driver computer, and then the driver computer is generating the anaglyph for display - correct?  It seems like since you are using anaglyph, you would save half your bandwidth by generating the anaglyph on the robot and transmitting an anaglyph video stream.  Not only would you reduce your video image size by half, but it seems that the color change to anaglyph works like a lossy filter.  It would reduce the amount of information in the image (fewer colors to encode) and make the image even more compressible.  The additional processing time on the robot would probably be offset by a greatly reduced transmission size and you might see a lot less total latency.  Maybe I'm missing something in the details?

Of course this only applies for anaglyph. If you were trying to do something cooler like display the dual images on a 3D monitor or VR glasses then you would need both images intact. I have not done any testing myself but have done some thinking about the best way to transmit the dual images. Side-by-side is a very logical choice and is probably best, but another possibility would be interlaced - the reason being that each image should be nearly the same. The same pixels in each image should be highly correlated, so the interlaced image should be very compressible (since vertically adjacent pixels would often but not always be the same color). It would be an interesting experiment to see which compressed better - the two side-by-side images with the large vertical discontinuity in the middle or the interlaced images with smaller local discontinuities.
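
If anyone wants to try that experiment, interleaving the rows of two equally sized frames is only a few lines with OpenCV (just a sketch of the preprocessing step before the encoder):

    // Sketch of the row-interleaving idea: even rows from the left
    // camera, odd rows from the right one (frames assumed equal size).
    #include <opencv2/opencv.hpp>

    cv::Mat interlace(const cv::Mat &left, const cv::Mat &right)
    {
      cv::Mat out(left.rows * 2, left.cols, left.type());
      for (int y = 0; y < left.rows; ++y)
      {
        left.row(y).copyTo(out.row(2 * y));
        right.row(y).copyTo(out.row(2 * y + 1));
      }
      return out;
    }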

>So you are compressing and transmitting the side-by-side video
>stream to the driver computer, and then the driver computer is
>generating the anaglyph for display - correct?

Yes. Correct.

>It seems like since you are using anaglyph, you would save half
>your bandwidth by generating the anaglyph on the robot

Your considerations about the advantages of generating the anaglyph image on the robot are absolutely correct. As you pointed out, anaglyph generation leads to a loss of information, and this is why we decided not to do it. Our goal is to make the complete data available on the driver side. The reason for this is possible further processing, ranging from higher quality (full color) stereo vision to obstacle detection, etc. So we do not want to lose information which might be important for that processing. In general, our strategy is to use the robot to collect the sensor data (including video) and do as much processing as possible on the driver computer, or even on a dedicated cluster (CUDA, OpenCL, OpenCV, etc.) for complex decisions in real time.

>but another possibility would be interlaced - the reason being
>that each image should be nearly the same.

Very interesting idea! I did not consider it before.

>It would be an interesting experiment to see which compressed
>better - the two side-by-side images with the large vertical
>discontinuity in the middle or the interlaced images with smaller
>local discontinuities.

It would definitely be an interesting experiment! I also believe the results might be interesting enough for a publication at some conference. That is actually the reason why I am trying hard to build a community around this project and to encourage robotics and computer vision students to use what we have done as a base for their further experiments. I am even thinking about offering a kit with the required hardware to make it easier to get started.