Tag Archives: C

NixCore X1 with RGB LED Strip

My friends here in Colorado started a new company called NixCore, which makes a Linux-enabled processor board that takes 1 watt of power. They asked me to take their new NixCore X1 product for a spin and see what I could make with it. I looked around the lab/office and came upon my RGB LED strip from China. Since this RGB strip uses a variant of SPI, I thought it would be a good test for the little NixCore X1 board.

The NixCore X1 is a Ralink RT5350 SoC (System on Chip) processor running at 360MHz, with 8MB of flash and 32MB of RAM. The board takes only 1 watt of power (less than 200mA at 5V) and has all the outputs you would expect from a microcontroller: I2C, GPIO and PWM. It also has software-based SPI (since the hardware SPI port is used by the flash), which is still pretty fast. I’ve worked with embedded Linux systems and know how much of a pain it is to get a driver running, so having a Linux install with an SPI driver exposed to userspace was a godsend. With some commands, a simple C file, and a Buildroot compiler I was able to port my Mbed code to the NixCore X1 pretty easily. All I had to do was make a new SPI device on some GPIO pins, open the "/dev/spidev1.0" device and start writing data to it; the driver takes care of all the hard stuff. Using C you can fopen("/dev/spidev1.0", "w") and then write to it as you would any other file. Here are the steps:

Build the compiler:
NixCore helped me out with that, but just select mips32r2, Little Endian and uClibc in Buildroot and you should be good.

Compile the code:
Once the code is compiled, you can control the strip from the command line.
(Make sure you add -I and -L entries pointing to the Buildroot install)

Load the spi-gpio-custom driver on the running X1:
insmod spi-gpio-custom bus0=1,22,23,24,0,50000

Run the code:
./[WHATEVER_YOU_NAMED_IT]

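This driver is software based and (from my tests) runs up to 400kHz. In the insmod line above, the bus0 arguments are the bus number (1, hence /dev/spidev1.0), the three GPIOs, the SPI mode (0) and the maximum clock in Hz (50000). It uses GPIO 22 as CLK, 23 as MOSI and 24 as MISO (even though there is no input data) on the NixCore X1. This translates to pins 27, 30 and 22 on the header. I hooked up the RGB strip directly to the 3.3V CLK and MOSI lines and wrote a simple C file based on my Mbed code.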
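As a rough idea of what writing to the device looks like, here is a minimal sketch using only standard C file I/O. The function name spi_write_bytes is hypothetical and this is not the code linked from this post; it only assumes that the driver accepts plain writes on the device node, as described above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write len bytes to the device (or file) at path.
 * Returns 0 on success, -1 on any error. */
int spi_write_bytes(const char *path, const uint8_t *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    FILE *dev = fopen(path, "wb");
    if (dev == NULL)
        return -1;

    size_t written = fwrite(buf, 1, len, dev);

    /* fclose() flushes the stdio buffer, so a frame smaller than that
     * buffer normally reaches the driver as a single write(). */
    int close_err = fclose(dev);

    return (written == len && close_err == 0) ? 0 : -1;
}
```

On the X1 this would be called as spi_write_bytes("/dev/spidev1.0", frame, frame_len) once the spi-gpio-custom module is loaded.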

Honestly, to my surprise I was able to control a single pixel of the strip right off the bat. I expected the driver to work, but I was still a little skeptical. It didn’t take long to address the entire strip. At a clock rate of 50kHz the strip updates in about 40ms, or 25Hz. As I mentioned, the rate could be raised to about 400kHz, which would be about 200Hz for a 5m strip, more than enough to beat the human eye.

After I made the C application to set the color I took it a step further and added a web page and CGI script so the color of the strip can be changed from a browser.
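For anyone who has not written a CGI program in C: the web server passes the part of the URL after the "?" in the QUERY_STRING environment variable. The sketch below pulls one color level out of such a string. The parameter names (r, g, b) and the function name are assumptions made for illustration, not the ones used in the linked files.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: find "key=<number>" in a query string such as
 * "r=31&g=0&b=15" and store the number, clamped to the 5-bit range 0..31.
 * Returns 0 on success, -1 if the key is missing or has no digits. */
int query_get_level(const char *qs, const char *key, int *out)
{
    assert(qs != NULL && key != NULL && out != NULL);

    size_t klen = strlen(key);
    const char *p = qs;

    while (p != NULL && *p != '\0') {
        if (strncmp(p, key, klen) == 0 && p[klen] == '=') {
            const char *value = p + klen + 1;
            char *end;
            long v = strtol(value, &end, 10);
            if (end == value)
                return -1;          /* nothing numeric after '=' */
            if (v < 0)  v = 0;
            if (v > 31) v = 31;
            *out = (int)v;
            return 0;
        }
        p = strchr(p, '&');         /* jump to the next key=value pair */
        if (p != NULL)
            p++;
    }
    return -1;
}
```

A CGI program would call this on getenv("QUERY_STRING"), print a Content-Type header followed by a blank line, and then update the strip.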

Here are some videos of the strip in action:

And, of course, the code and webpage for the processing: Code1, Code2, Makefile, Webpage files.
Go to http://[IP_ADDRESS]/color.html

Overall I really like the NixCore X1 (I am biased, since they are my buddies), and you might want to check them out at http://nixcores.com.

LED light strip LPD6803 code

Wow, it’s been a crazy few months. I am taking some courses for my masters, started a few contracting gigs and went to China with DangerousPrototypes Hacker Camp. There is so much to write about and I’ve been dropping the ball on Protological. Time to change that. Time for some cool LED stuff!

While in China I picked up a 5 meter “5050” 12V LED strip with 30 LEDs per meter for 90RMB, which is about $14.50 USD. I didn’t buy the controller because I wanted to control it with a microcontroller of course! The controller chip is a knockoff LPD6803 LED driver chip that listens for clocked data. The only datasheet I could find was from Adafruit here and the English is really, really bad. I didn’t know ‘grey’ == ‘color’?! After some playing around I figured out the protocol for how the data is sent. The commands are 16 bits wide and shifted out MSB first, the clock idles low and the data is latched in on the rising edge of the clock. The MSB is always 1 to indicate that the value is data, and the 3 colors are 5 bits each, allowing for 31 steps in brightness.

1000 0000 0000 0000
D|Col1||Col2||Col3|

One would think that the colors are Red, Green and Blue, however some testing showed they are Green, Red and Blue.

1000 0000 0000 0000
D|Grn ||Red ||Blue|

So if you want to turn on an LED with 1 step of each color, the 16-bit value is 1 00001 00001 00001 = 1000 0100 0010 0001 = 0x8421.
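The bit layout above can be written as a small packing function. This is a minimal sketch; the name lpd6803_pack is made up for illustration and does not come from the source file linked at the end of this post.

```c
#include <assert.h>
#include <stdint.h>

/* Pack three 5-bit levels (0..31) into one 16-bit data word.
 * Bit 15 is the data flag; the field order is green, red, blue,
 * as found by testing on this strip. */
uint16_t lpd6803_pack(uint8_t green, uint8_t red, uint8_t blue)
{
    assert(green <= 31 && red <= 31 && blue <= 31);

    return (uint16_t)(0x8000u
                      | ((uint16_t)green << 10)
                      | ((uint16_t)red << 5)
                      | blue);
}
```

Calling lpd6803_pack(1, 1, 1) gives the 0x8421 value worked out above.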

These strips can be chained together, so to ‘reset’ the strip pixel ID and have it listen for a new set of colors you send 32 zero bits, which is 4 bytes of 0x00 (my code sends 8 bytes of 0x00). Once I got the protocol figured out it was really easy to hook up a micro and control the strip.
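Putting the reset and the pixel words together, a full update is just a run of zero bytes followed by each 16-bit word, high byte first. The sketch below builds that byte stream into a caller-supplied buffer. The names are hypothetical, and the reset is sent as 8 zero bytes to match the byte counts used elsewhere in this post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LPD6803_RESET_BYTES 8   /* hypothetical name; length taken from the post */

/* Fill out[] with the reset bytes followed by each pixel word, MSB first.
 * Returns the number of bytes written, or 0 if out[] is too small. */
size_t lpd6803_build_frame(const uint16_t *pixels, size_t count,
                           uint8_t *out, size_t out_size)
{
    assert(pixels != NULL || count == 0);

    size_t needed = LPD6803_RESET_BYTES + 2 * count;
    if (out == NULL || out_size < needed)
        return 0;

    size_t pos = 0;
    while (pos < LPD6803_RESET_BYTES)
        out[pos++] = 0x00;                          /* reset / start of frame */

    for (size_t i = 0; i < count; i++) {
        out[pos++] = (uint8_t)(pixels[i] >> 8);     /* high byte goes out first */
        out[pos++] = (uint8_t)(pixels[i] & 0xFF);
    }
    return pos;
}
```

For a 50 pixel strip this produces 108 bytes, which can then be handed to whatever SPI write routine the target provides.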

I used an LPC1768 Mbed microcontroller (because it’s super easy) and made a demo program with some simple animations. The strip is 30 LEDs per meter, 5 meters long. Each LED ‘pixel’ is a group of 3 RGB LEDs and they all show the same color. So there are 50 groups of 3 in my 5 meter strip. The way I drive the LEDs is I make a uint16_t array in RAM for the 50 pixels, then create a timer to run through the array and send it out to the strip using the SPI hardware. With 8 bytes of reset and 100 bytes of data at 500kHz, I was able to address the whole strip in about 4ms. I set my timer to update the strip every 10ms, giving me a 100Hz refresh rate.
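As a sanity check on those numbers, the ideal time on the wire is simply bits divided by clock rate. The sketch below (the function name is made up) computes that lower bound. For 108 bytes at 500kHz it comes out at 1728us; the measured figure of about 4ms is higher, which would be consistent with gaps between bytes and other overhead that this calculation ignores.

```c
#include <assert.h>
#include <stdint.h>

/* Ideal transfer time in microseconds for byte_count bytes at clock_hz,
 * rounded up. Ignores inter-byte gaps and any software overhead. */
uint32_t spi_wire_time_us(uint32_t byte_count, uint32_t clock_hz)
{
    assert(clock_hz > 0);

    uint64_t bits = (uint64_t)byte_count * 8u;
    return (uint32_t)((bits * 1000000u + clock_hz - 1u) / clock_hz);
}
```

Either way, the transfer fits comfortably inside the 10ms timer period.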

The rest of the application just writes the data array in memory and lets the timer clock it out to the strip. I included the code here with 4 demos. This will build in the Mbed compiler and uses the P5 & P6 SPI pins on the LPC1768 for data and clock. It should be pretty straightforward to port to an Arduino or other microcontroller; just set up a timer and update the SPI config and write functions.
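To show how little an animation has to do under this scheme, here is a sketch of one step of a simple "chase" pattern that only touches the pixel array. It is a made-up example in plain C, not one of the four demos in the source file, and the names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PIXEL_OFF 0x8000u   /* data flag set, all three colors at zero */

/* One animation step: turn every pixel off except the one at
 * index (step % count), which is set to color. The timer that clocks
 * the array out to the strip is not shown here. */
void chase_step(uint16_t *pixels, size_t count, size_t step, uint16_t color)
{
    assert(pixels != NULL && count > 0);

    for (size_t i = 0; i < count; i++)
        pixels[i] = PIXEL_OFF;

    /* Force the data flag on in case the caller passed a bare color. */
    pixels[step % count] = (uint16_t)(color | 0x8000u);
}
```

Calling this with an increasing step value between timer updates moves a single lit pixel down the strip and wraps around at the end.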

Here is a video of the demos in action:

And the source file: ledstrip_mbed_demo.cpp

Enjoy, Drew

Arduino execution time analysis

Recently I was asked to take a look at some code for the Arduino Yun that was developed as a proof of concept for a medical device. The company developing the device was “running out of room” and couldn’t get the Arduino to sample all the ADC inputs in the time period they wanted. I suspect they mean they are running out of processing time to sample/control multiple outputs within their processing window. I have always worked a level down from the Arduino wiring framework and honestly have not worked with the Arduino family much, but when a contract job comes up, you take it. The first thing I noticed in the code was heavy use of floating point values as parameters and return values for lots of functions. Anyone who works in embedded knows that floating point operations take WAY more time than integer math, but I was curious as to how much longer. I didn’t find any good online resources that say “floating point divides take XXX instructions”, so I decided to get that information myself.

While I don’t have a Yun, I do have an Arduino Uno which uses the ATmega328, an 8-bit micro with most instructions running at 1 instruction per clock using a 16MHz crystal. I decided to look at each basic math operation for unsigned integer types and floating point types. All input and output variables were declared volatile so the compiler wouldn’t optimize the operations away, each operation was performed 1000 times and the results are the average. The times were recorded with the micros() function, which has a resolution of 4us and whose own overhead includes an unsigned long shift, adding a uint8_t and a multiply by a uint16 literal value. Here are the results on the Uno:

Starting test, looping 1000 times
Control 10ms loop: time 10009 us
Float div: 34 us ~544 instructions, 29/ms
Float mul: 12 us ~192 instructions, 83/ms
Float add: 11 us ~176 instructions, 90/ms
Float sub: 11 us ~176 instructions, 90/ms
uint8 div: 8 us ~128 instructions, 125/ms
uint8 mul: 3 us ~48 instructions, 333/ms
uint8 add: 3 us ~48 instructions, 333/ms
uint8 sub: 3 us ~48 instructions, 333/ms
uint16 div: 16 us ~256 instructions, 62/ms
uint16 mul: 4 us ~64 instructions, 250/ms
uint16 add: 3 us ~48 instructions, 333/ms
uint16 sub: 3 us ~48 instructions, 333/ms
uint32 div: 41 us, ~656 instructions, 24/ms
uint32 mul: 9 us, ~144 instructions, 111/ms
uint32 add: 4 us, ~64 instructions, 250/ms
uint32 sub: 4 us, ~64 instructions, 250/ms
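The measurement approach described above (volatile operands, a fixed loop count, an averaged microsecond timer) can be sketched in portable C as follows. This is not the linked sketch: the names are hypothetical, the clock is passed in as a function pointer so that a wrapper around micros() could be used on an Arduino, and host_now_us is only a stand-in built on the standard clock() call. Note that the average includes the loop overhead.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

typedef uint32_t (*now_us_fn)(void);    /* returns a running microsecond count */

/* Stand-in clock for a desktop build, based on standard clock(). */
uint32_t host_now_us(void)
{
    return (uint32_t)((uint64_t)clock() * 1000000u / CLOCKS_PER_SEC);
}

/* Average time in microseconds of one float divide over `loops` runs. */
uint32_t time_float_div_us(now_us_fn now_us, uint32_t loops)
{
    /* volatile keeps the compiler from folding the divide away */
    volatile float a = 355.0f, b = 113.0f, result = 0.0f;

    assert(now_us != NULL && loops > 0);

    uint32_t start = now_us();
    for (uint32_t i = 0; i < loops; i++)
        result = a / b;
    uint32_t elapsed = now_us() - start;  /* unsigned math survives wraparound */

    (void)result;
    return elapsed / loops;
}
```

The same pattern applies to the other rows by swapping the operand types and the operator inside the loop.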

What is interesting is that a uint32_t divide takes more time than a floating point divide! Overall it is clear that as the integer gets larger the processing time increases. Floating point adds, subtracts and multiplies take roughly 3-4 times as long as the same operations on 8- and 16-bit integers, and a floating point divide takes about 2-4 times as long as a uint8_t or uint16_t divide. I estimated the number of instructions per operation by multiplying the time in microseconds by 16 (16MHz at one instruction per clock), and the last column is how many of each operation could be performed in 1ms.
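The two derived columns in the results are simple arithmetic on the measured time. A small sketch (the type and function names are made up) that reproduces them, assuming one instruction per clock cycle:

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t cycles;        /* estimated instructions at 1 per clock */
    uint32_t ops_per_ms;    /* whole operations that fit in 1 ms */
} op_estimate;

/* Derive both estimates from an average time in microseconds and the
 * CPU clock in MHz (16 for the Uno). */
op_estimate estimate_from_us(uint32_t avg_us, uint32_t cpu_mhz)
{
    assert(avg_us > 0);

    op_estimate e;
    e.cycles = avg_us * cpu_mhz;
    e.ops_per_ms = 1000u / avg_us;
    return e;
}
```

For the float divide, estimate_from_us(34, 16) gives 544 cycles and 29 operations per millisecond, matching the table.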

The next step is to get these values for the Yun and look for performance improvement areas.

Here is the code I used to get these values: http://protological.com/browser/files/timer_sketch.ino.
(Here is the same code but with a macro function, it’s a little cleaner to look at)

Cross compile for Raspberry Pi, VideoCore, MMAL

I have a project which requires full control of the camera board on a Raspberry Pi from a custom C application, so I recently started looking into getting a toolchain and code running on the Pi. The Raspberry Pi is actually a really powerful and advanced SBC, with a full Broadcom VideoCore GPU and a connection to a 5MP camera that can do full 1080p HD video at 30 FPS. The project I am working on takes a single picture from the camera and saves it to the flash. I could use the raspistill application and some bash scripting, however I want full control over the camera, and since I will eventually be processing video from a camera, the access should be as fast as possible by using the GPU.

The first step was to build a toolchain and cross compiler to get things working, and then make a test application that uses the Multi-Media Abstraction Layer (mmal) library to access the VideoCore (VC) pipeline. I followed these two articles on how to use crosstool-ng to make a cross compiler for the Pi. The next step was to make a simple application with some hooks into mmal and link it against the mmal libraries. It turns out the easiest way to build the mmal libraries is to clone the userland code for the Pi and build it locally using your new cross compiler. I modified the cmake file in the makefile/cmake/toolchain/ directory to point to my custom crosstool-ng compiler rather than the compiler from git (Here are my modified buildme and cmake files).

Everything built successfully and the final step was to add in some mmal variables and function calls and build against the mmal libraries. This was more of a pain than I thought it was going to be. There are 9 libraries that need to be referenced and 7 include paths for VideoCore and mmal. After a few hours of searching and trial and error I found the correct include paths and library paths for GCC and the linker. One really annoying part was the fact that you have to explicitly add ALL libraries to the ld command, even if they are dependencies of other libraries and the .so file is in the same directory as an already linked .so file. Take a look at this image and you will get the idea; all was fixed when I added -l entries for each missing lib. Here is what you need to build and link against if you want to access the camera on the Pi via mmal using VideoCore.

Include paths:
-I ../userland/
-I ../userland/build/inc/interface/vcos/
-I ../userland/host_applications/linux/apps/raspicam/
-I ../userland/host_applications/linux/libs/bcm_host/include/
-I ../userland/interface/vmcs_host/linux/
-I ../userland/interface/vcos/pthreads/
-I ../userland/interface/vcos/

Library search path:
-L ../userland/build/lib

Libraries to link:
-lmmal_core
-lmmal
-lmmal_util
-lvcos
-lcontainers
-lbcm_host
-lmmal_vc_client
-lmmal_components
-lvchiq_arm