Let's Make Robots!

Real-time enough? About PWMs and shaky servos

We are currently working on the next incarnation of our software and hardware platform for vehicles controlled over the Internet. Among other things, we decided to investigate how to control our robot without microcontrollers. The BeagleBoard has a lot of GPIOs available. In addition, three hardware PWM generators and an I2C bus should be enough at least to control the tank with camera, compass, GPS, and sonar. We believe that the BeagleBoard is also powerful enough to accomplish all these tasks (including DSP-based real-time video encoding) without additional microcontrollers. If this claim is true, then it could be an advantage in terms of price, space and complexity.

Hardware setup
Many devices can be controlled using Pulse Width Modulation (PWM). A typical example is the standard servo motor. PWM is also commonly used to talk to motor controllers, which in turn control DC motors. That is why we decided to test the real-time performance of our whole system by implementing a PWM generator on one of the available GPIO pins. The standard PWM signal we need to generate is shown in the following picture:
Servo motor PWM timing diagram

Pulses are 20 ms apart (50 Hz), and the servo position is encoded by the pulse width, with ~0.7 ms and ~2.0 ms meaning the leftmost and rightmost positions respectively (actual values typically vary a little depending on the servo model and vendor).
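
To make the timing concrete, here is a minimal sketch of how a desired position could be turned into a pulse width. The function names and the normalized 0.0 to 1.0 position scale are our own illustration, and the end points are just the approximate values quoted above, so they would need calibration for a real servo.

```c
/* Approximate end points from the text above; real servos differ,
 * so treat these constants as placeholders to calibrate. */
#define PULSE_MIN_US 700.0    /* ~0.7 ms, one end of the travel */
#define PULSE_MAX_US 2000.0   /* ~2.0 ms, the other end of the travel */
#define FRAME_US     20000.0  /* 20 ms between pulse starts (50 Hz) */

/* Map a normalized position in [0.0, 1.0] to a pulse width in
 * microseconds. Out-of-range input is clamped. */
double servo_pulse_us(double position)
{
    if (position < 0.0) position = 0.0;
    if (position > 1.0) position = 1.0;
    return PULSE_MIN_US + position * (PULSE_MAX_US - PULSE_MIN_US);
}

/* Low time that completes one 20 ms frame after the pulse. */
double servo_gap_us(double pulse_us)
{
    return FRAME_US - pulse_us;
}
```

For example, servo_pulse_us(0.5) gives 1350 microseconds for the middle position, and servo_gap_us(1350.0) gives the 18650 microseconds of low time that follow it.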

From the hardware point of view, level shifting is necessary because GPIO pins on BeagleBoard have 1.8V output but we need 5V to control the servo. For these purposes we used TI's TXS0108E voltage-level translator with corresponding SSOP to DIP adapter (a little bit difficult to solder but is doable if you did not drink a lot of bear the day before :-) ).

Software setup
Pins on the output connector on BeagleBoard can be configured to play different roles. This configuration process is called PinMuxing. For me personally it was, and still is, a little confusing where and how to do it. There are in general three places where you can do it: in u-boot, in the kernel, and in your application with direct memory access over /dev/mem.

The first two ways give you a file system interface with /sys/class/gpio/gpio* entries. The standard read() and write() functions can be used with these files to control GPIO direction and output.
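
As a rough illustration of the file system route, the sketch below drives a pin through the sysfs files. The helper names are hypothetical, and it assumes the pin has already been exported (by writing its number to /sys/class/gpio/export) so that the gpio<N> directory exists.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write a short string to a file; returns 0 on success, -1 on failure. */
int write_str(const char *path, const char *text)
{
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;
    size_t len = strlen(text);
    ssize_t n = write(fd, text, len);
    close(fd);
    return (n == (ssize_t)len) ? 0 : -1;
}

/* Build "<root>/gpio<N>/<attr>"; returns -1 if the buffer is too small. */
int gpio_attr_path(char *buf, size_t size, const char *root,
                   int gpio, const char *attr)
{
    int n = snprintf(buf, size, "%s/gpio%d/%s", root, gpio, attr);
    return (n > 0 && (size_t)n < size) ? 0 : -1;
}

/* Configure an exported pin as an output (done once at start-up). */
int gpio_make_output(const char *root, int gpio)
{
    char path[128];
    if (gpio_attr_path(path, sizeof path, root, gpio, "direction") != 0)
        return -1;
    return write_str(path, "out");
}

/* Drive the pin high (non-zero) or low (zero). */
int gpio_write(const char *root, int gpio, int high)
{
    char path[128];
    if (gpio_attr_path(path, sizeof path, root, gpio, "value") != 0)
        return -1;
    return write_str(path, high ? "1" : "0");
}
```

A caller would use gpio_make_output("/sys/class/gpio", 158) once and then gpio_write("/sys/class/gpio", 158, 1) or gpio_write("/sys/class/gpio", 158, 0) for each edge. Every call opens, writes and closes a file, so each edge costs several system calls.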

The third way requires mapping the physical memory using /dev/mem and then reading and writing special addresses to control the output of the GPIO pin. To make it work properly, the kernel config option CONFIG_OMAP_RESET_CLOCKS, which is enabled in the default BeagleBoard defconfigs (and also in the kernel used by the Angstrom distribution we are using), should be switched off. This is a kernel power-saving feature, and it somehow leads to a kernel oops in this scenario.
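
The mapping step could look roughly like the sketch below. The helper names map_registers() and reg_index() are hypothetical and not taken from our sources. The physical base address and window length are left to the caller and have to come from the OMAP3 reference manual.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: convert a register byte offset into an index
 * into an array of 32-bit registers. */
size_t reg_index(size_t byte_offset)
{
    return byte_offset / sizeof(uint32_t);
}

/* Map len bytes of physical address space starting at phys_base
 * (must be page aligned). Returns NULL on failure. Needs root. */
volatile uint32_t *map_registers(off_t phys_base, size_t len)
{
    int fd = open("/dev/mem", O_RDWR | O_SYNC);
    if (fd < 0)
        return NULL;
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED,
                   fd, phys_base);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    return (p == MAP_FAILED) ? NULL : (volatile uint32_t *)p;
}
```

Once the window is mapped, a register write is a plain store such as regs[reg_index(offset)] = mask, with no system call involved.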

Let's trigger GPIO
The obvious and straightforward approach to generate PWM signals mentioned above could be described with the following pseudo-code:

  for(;;) {
    set_gpio_output_to_high();
    sleep(something_between_0.7_and_2.0ms);
    set_gpio_output_to_low();
    sleep(20ms);
  }
This loop can even be implemented as a shell script. Alternatively, the read() and write() functions can be used in C to implement set_gpio_output_to_high() and set_gpio_output_to_low(). What is important, however, is to ensure that the pulse width remains exactly the same for as long as you want the servo to stay in a certain position. This is where the problem arises: Linux is not a real-time system, so there is no guarantee that the sleep() function will return exactly after the requested amount of time. Context switching between kernel and user space (as happens with system calls such as read() and write()) may also add to the unpredictable timing. As a result the servos become "nervous": they shake periodically instead of staying in the desired position.
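
One way to see this effect without an oscilloscope is to measure how late a sleep actually returns. The sketch below is only an illustration with hypothetical function names. It uses the POSIX calls nanosleep() and clock_gettime() and reports the overshoot in nanoseconds.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Monotonic time in nanoseconds. */
int64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Sleep for request_ns and return how much later than requested
 * the call came back. */
int64_t sleep_overshoot_ns(int64_t request_ns)
{
    struct timespec req = {
        .tv_sec  = request_ns / 1000000000LL,
        .tv_nsec = request_ns % 1000000000LL
    };
    int64_t start = now_ns();
    nanosleep(&req, NULL);
    return now_ns() - start - request_ns;
}

/* Largest overshoot seen over a number of sleeps of request_ns each. */
int64_t worst_overshoot_ns(int64_t request_ns, int rounds)
{
    int64_t worst = 0;
    for (int i = 0; i < rounds; i++) {
        int64_t over = sleep_overshoot_ns(request_ns);
        if (over > worst)
            worst = over;
    }
    return worst;
}
```

Calling worst_overshoot_ns(1000000, 1000) with and without background load gives a rough idea of how much a 1 ms pulse could stretch. The numbers depend entirely on the kernel, the hardware and the load.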

Problem
In order to achieve stable servo positioning, two sources of non-deterministic timing behavior should be eliminated:

  1. Unpredictable delays caused by context switching between kernel and user space.
  2. Unpredictable thread sleep time with the standard sleep() function.

Solution 1 - eliminating context switches
Inspired by this and this posting, we decided to use direct memory access from our application to trigger the GPIO. We wrote a very simple test program which implements the PWM generation loop as follows:
  for(;;) {
      //set_data_out has offset 0x94: drive GPIO158 high
      gpio[OFFSET(0x6094)] = 0x40000000;
      usleep(up_period);    //pulse width, in microseconds
      //clear_data_out has offset 0x90: drive GPIO158 low
      gpio[OFFSET(0x6090)] = 0x40000000;
      usleep(20 * 1000);    //20 ms between pulses
  }

Here we replaced the write() system call with direct memory access to trigger GPIO158. This should eliminate the non-deterministic timing caused by context switching. So let us see what we can achieve with this implementation. To make this post less boring, instead of showing typical oscilloscope measurements, we decided to post a video of our current test vehicle, where the video camera is mounted on the servo we are controlling with our PWM generation application.
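
For readers wondering where the constants in the loop above come from, the sketch below reproduces them from the GPIO number. The function names are hypothetical. It assumes 32 pins per bank and that banks 2 to 6 sit 0x2000 bytes apart inside the mapped window, which is what the 0x6090/0x6094 offsets imply; please verify against the OMAP3 reference manual before relying on it.

```c
#include <stdint.h>

#define GPIO_CLEARDATAOUT 0x90u  /* clear_data_out offset used in the loop */
#define GPIO_SETDATAOUT   0x94u  /* set_data_out offset used in the loop */

/* 1-based bank number, assuming 32 pins per bank. */
unsigned gpio_bank(unsigned gpio) { return gpio / 32 + 1; }

/* Bit position of the pin inside its bank. */
unsigned gpio_bit(unsigned gpio) { return gpio % 32; }

/* Mask to write into the set/clear registers for this pin. */
uint32_t gpio_mask(unsigned gpio) { return (uint32_t)1 << gpio_bit(gpio); }

/* Byte offset of a bank (2..6) inside a window that starts at bank 2.
 * Assumption: consecutive banks are 0x2000 bytes apart. */
uint32_t bank_offset(unsigned bank) { return (bank - 2) * 0x2000u; }
```

For GPIO158 this gives bank 5, bit 30, mask 0x40000000, and bank_offset(5) + GPIO_SETDATAOUT equal to 0x6094, matching the values in the loop.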

The "No Xenomai, no system load" video fragment illustrates the servo behavior when running the PWM generation with direct memory writes. No other load was put on the system (~99% system idle).

Initially, the servo is turned away from the camera. After a couple of seconds, we run the test application, which rotates the servo towards the camera to show that we are actually controlling the servo. It can be clearly observed that the servo is shaking. In addition, the typical servo noise can be heard, which is also a sign that the position is constantly changing.

The situation becomes much worse (as illustrated by the "No Xenomai, high system load" video fragment) if we put considerable load on the system. In this case we run another process in the background which creates about 98% system load (~1% system idle shown by top).

In fact, those results were easy to predict. It is also obvious that the version with the write() function (not to mention the corresponding shell script) would show even worse results. That is why we decided not to test this configuration at all.

Solution 2 - predictable timing with Xenomai
So now it is clear that we need deterministic sleep() behavior to achieve stable servo positioning. There were two options we were considering: the PREEMPT_RT kernel patches and Xenomai. As for PREEMPT_RT, we did not manage to find a configuration of the Linux kernel used by Angstrom/OpenEmbedded which works on the BeagleBoard and has PREEMPT_RT patches for it.

Installing Xenomai was not a straightforward task either. There were problems applying the patch to the kernel we were using, but we were able to solve them. In addition, according to our understanding, Xenomai is considered "more real-time" than PREEMPT_RT because it runs the whole Linux kernel as one of its tasks in parallel with other real-time tasks and can preempt it at almost any point if necessary. So we decided to continue our experiments with Xenomai. At the time of writing, we were using kernel version 2.6.35.9 with Xenomai version 2.5.6.

After getting Xenomai running, we rewrote the test PWM generation application. The core generation loop runs as a real-time thread with high priority (99). We also replaced the sleep() invocations with the corresponding Xenomai functions:
  for(;;) {
      //set_data_out has offset 0x94
      gpio[OFFSET(0x6094)] = 0x40000000;
      rt_task_sleep(up_period);
      //clear_data_out has offset 0x90
      gpio[OFFSET(0x6090)] = 0x40000000;
      rt_task_wait_period(NULL);
  }


Test results were much better! Actually, without system load, the servo controlled by the Xenomai application was perfectly stable. That is why there is no video for this case - there is nothing to see :-) because the servo just remains at the defined position. No shaking, no noise.

As the next test, we put the system under heavy load, either by running the xeno-test program in the background (puts 100% load on the system) or by running our own vehicle control software, which performs real-time video compression using a DSP-enabled h264 codec and communicates intensively over the network with the vehicle operator software (puts about 60% load on the system). Under these conditions, unfortunately, the servo starts shaking again, as illustrated by the "With Xenomai and high system load" video fragment.

Nevertheless, this behavior is better than the one without Xenomai and in fact the best we have been able to achieve so far. According to our (informal) observations, what typically increases the shaking is extensive console IO (connected with SSH over the USB WLAN adapter on the BeagleBoard) or network activity in general.

Running Xenomai's latency application in parallel with our test program reveals a latency of around 40 microseconds. Based on what we have read on the Internet, 40 microseconds is considered a "normal/OK" latency for Linux/Xenomai running on ARM at 600MHz. Taking into account a typical pulse width of about 1 millisecond, 40 microseconds is about 4%. Is this 4% in fact what we actually see in the video? If yes, does it mean that even with Xenomai it is not possible to control servos reliably even under moderate system load?!
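
To put the 40 microseconds into perspective, the small sketch below expresses a timing error as a fraction of a pulse and as an approximate position error. The function names are hypothetical. The conversion to degrees assumes the position depends linearly on pulse width and needs a total travel angle, which is a placeholder that differs between servos.

```c
/* Fraction of a reference time span that a timing error represents. */
double jitter_fraction(double jitter_us, double span_us)
{
    return jitter_us / span_us;
}

/* Approximate position error in degrees, assuming the position depends
 * linearly on pulse width between min_us and max_us and that the full
 * range corresponds to travel_deg degrees (servo-specific assumption). */
double jitter_degrees(double jitter_us, double min_us, double max_us,
                      double travel_deg)
{
    return jitter_fraction(jitter_us, max_us - min_us) * travel_deg;
}
```

With the figures from this post, jitter_fraction(40.0, 1000.0) is 0.04, the 4% mentioned above. Relative to the full 0.7 ms to 2.0 ms range, the same 40 microseconds is about 3% of the travel, which would be roughly 5.5 degrees on a servo with an assumed 180 degrees of travel.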

We would really appreciate any feedback if someone familiar with the matter can share some thoughts on how to improve the situation. We do not want to accept that Linux running on a system as powerful as the BeagleBoard cannot be real-time enough to generate 1-2ms wide pulses at 50Hz with latency considerably less than 40us. We really hope that there are some tweaks that can achieve much better real-time performance.

Software availability
Our whole software (including Xenomai-based control module) is open-source and available online here.

The relevant Xenomai-based servo control sources can be found here.

The whole source code repository contains the OE layer for Angstrom to generate an image with all patches and dependencies needed to run our on-board software. This software illustrates, among other things, how to control servo motors with direct memory access and Xenomai, how to communicate over I2C without using the Linux kernel (for better real-time behavior), how to work with I2C sensors such as compass and sonar, how to receive the location from a GPS sensor, how to compress video in real-time using DSP-enabled codecs with Gstreamer, how to send video and other sensor data over the network in real-time and much more.
At the client (driver) side we are using OpenGL and high performance GLSL-based rendering to display 3D driver console with live video applied as a texture to the defined surface, displaying map for received GPS position in real-time using openstreetmap.org and much more.
We are currently working hard on documenting all the software and hardware we have made so far. There is already some documentation for the project available here and here.

We would be very glad to receive any comments on the question about Linux/Xenomai real-time performance raised in this post and on our project in general. Contributions are, of course, also very welcome.

In the second video scenario, where you have not used Xenomai, have you experimented with renice values to give your PWM process a higher priority with the kernel process scheduler?

Also, have you experimented at all with the BeagleBone's onboard hardware PWM?

From reading the AM335x PWM Driver's Guide (http://processors.wiki.ti.com/index.php/AM335x_PWM_Driver's_Guide) it looks like EPWMA and EPWMB can be configured for:

1) Two independent PWM outputs with single-edge operation

2) Two independent PWM outputs with dual-edge symmetric operation

3) One independent PWM output with dual-edge asymmetric operation

Maybe options 2 or 3 would yield stable results and without the requirement for realtime kernel patches?

 

> In the second video scenario, where you have not used Xenomai, have you experimented with renice values to give your PWM process a higher priority with the kernel process scheduler?

Yes. It improves the situation slightly, but nowhere near an acceptable level.

> Also, have you experimented at all with the BeagleBone's onboard hardware PWM?

Yes. They work just fine. We control two main motors with them. But this article is about PWM generation with GPIO for the case where more PWMs are needed. Or even more generally - to understand what level of real-time behavior could be achieved with Linux on BeagleBoard.

 

So it's safe to say that EPWMA and EPWMB function without any type of jitter or instability?

Other than your userland PWM process, what type of system loading on the BeagleBoard are you seeing percentage-wise when all of your other processes are running?  For example, if you are only running at 50% load on everything other than your PWM process and have plenty of extra clock cycles to spare, would it be possible to run your GPIO toggling routines at a much higher speed (say 10X or 100X your current toggle rate) to monopolize more of the process scheduler's resources and thereby increase the resolution/accuracy of your PWM method?

> So it's safe to say that EPWMA and EPWMB function without any type of jitter or instability?

Yes. Since they are HW PWM generators, they are very stable regardless of the system load.

> Other than your userland PWM process, what type of system loading on the BeagleBoard are you seeing

In addition to the PWM generation, I am doing H264 video compression (running on DSP and CPU) and a couple of other things. As a result, overall system load is about 80%.

> would it be possible to run your GPIO toggling routines at a much higher speed (say 10X or 100X your current toggle rate) to monopolize more of the process scheduler's resources

Maybe. I do not know. But in general, I do not think this is the right way to go. It burns CPU without a real need for it. That is why I decided to move the PWM generation into the kernel (as a module) by writing a Xenomai RTDM driver. Also, instead of using threads, I switched to timers. All of this together finally led to acceptable results.

I am currently in the process of writing an article about it. If you are interested, you can take a look at this post to see the performance of the "intermediate" step, i.e. a kernel module using threads (not timers).

what the hell kind of bear were you drinking when you put together the questions required to create an account on this site?

Which site? Which questions?

...and was it a grizzly bear, a polar bear or a small brown bear?  (smiles)

 

—Sorry, GP, I realise it was just a typo, but I couldn't resist.

When you put a realtime task to sleep, you lose your realtime behavior. Instead of the for loop, you need to work with the realtime task period, choosing the period as the minimum time step that you need. For example, if your PWM has a work cycle of 300 ms, you can use a task period of 300 ms. But if you need to generate a 20 ms I/O pulse with a PWM work cycle of 300 ms, you need a task period of 20 ms and a counter: when it reaches 15 (300 ms / 20 ms), you can change your PWM output if needed and reset the counter to 0.

I hope that you understand me.

 

First of all, thank you for your interest and suggestions!

> When you put to sleep a realtime task you lost your realtime

If you are referring to the "Solution 1" section and the corresponding code snippet here, then I agree with you. That is the reason why we decided to continue with the Xenomai real-time supervisor as described in section "Solution 2". We start the periodic task and use the rt_task_sleep() and rt_task_wait_period() functions instead of plain sleep().

Despite the clear improvement, the results were not good enough. This article describes how we further improved the timing for considerably better results. But, as mentioned in this article, there is still room for improvement. As suggested on the Xenomai mailing list, I've implemented a timer-based (instead of task-based) alternative, and only the variant with spin-waiting led to behavior which we can accept. I am currently in the process of writing the blog post about the final results.

You didn't understand. When you put a task to sleep, you lose control of the realtime behavior; it will depend on your load. You need to replace the delay with a periodic task that controls the timing through its period.