Background
-
LWIP
LWIP (Official Website) is a widely used open-source TCP/IP stack designed for embedded systems. LWIP was originally developed by Adam Dunkels at the Swedish Institute of Computer Science and
is now developed and maintained by a worldwide network of developers. As documented on its site, LWIP's focus is to reduce RAM usage while still providing a full-scale TCP implementation. This makes LWIP suitable for use
in embedded systems with tens of kilobytes of free RAM and room for around 40 kilobytes of code ROM. In this project, we ported LWIP in its "bare metal" (NO-SYS) mode, i.e. without an OS abstraction layer.
-
cc2650
The CC2650 Wireless MCU LaunchPad is the micro-controller board used throughout the course and in this final project. Its MCU contains a 32-bit ARM Cortex-M3 processor running at 48 MHz as the
main controller, and a rich feature set that includes Bluetooth, ZigBee, 6LoWPAN, UART over a micro-USB connection, and programming and debugging with the CCS IDE. The launchpad also
comes with 2 LEDs, 2 buttons and more.
Motivation
The TCP/IP protocol suite allows computers of all sizes, from many different vendors, running totally different operating systems, to communicate with each other.
By implementing the TCP/IP protocol suite (including UDP) on top of TI-RTOS, we can easily port any application protocol to the launchpad, and hence run complex applications on it.
Moreover (and perhaps more obviously), with the TCP/IP suite we can connect the cc2650 to the Internet (WAN instead of PAN) using popular protocols and technologies (e.g. HTTP/S)
and common client applications (e.g. web browsers, web applications).
Challenges
-
Memory Issues
Before porting LWIP to the cc2650, we compiled LWIP for our own Windows PC (using LWIP-contrib) to get a feeling for what it takes to port LWIP in NO-SYS mode.
When we moved on to porting LWIP to TI-RTOS, we had to handle memory issues that were new to us.
At first our code would not fit in the launchpad's flash memory, partly because LWIP allocates its memory statically and manages its own heap.
To reduce the memory footprint, we cut down LWIP's TCP settings: send/receive buffers, maximum window size, segment size, number of connections, PCB memory size, etc.
This came, of course, at the cost of LWIP's performance.
We faced this issue again later, when writing the applications themselves. When writing a web client, the HTTP GET request was larger than the free space LWIP had allocated for the
TCP send-side buffer, so the send request failed. We handled this as follows:
- The application performs 'send' in chunks: before each send, we query how much free space is available and send a matching part of the data.
- We registered a callback that is called when a TCP ACK is received; in it we check how much free space is available and send another chunk.
- Saving the state between callbacks in order to send a large application message made the development a bit more complex.
(For example, implementing a 'receive' function for a web server on Windows would have required only a 'recv' call and a parsing method,
while on TI-RTOS our callback is called whenever a new segment is received, and we had to maintain global state: copy the received buffer aside, try to parse it, and then
check whether the received data is complete or not.)
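The chunked sending and the receive-side reassembly described above can be modeled with a short sketch. It is written in Python (the language of the bridge) purely to illustrate the state that has to survive between callbacks; it is not the project's C code. The names ChunkedSender, RequestAssembler, write and free_space are hypothetical. In the LWIP raw API the corresponding pieces would be tcp_sndbuf() for the free space, tcp_write() for queuing data, and a callback registered with tcp_sent() for the ACK notification. The assembler assumes a request without a body, so the blank line ends it.

```python
class ChunkedSender:
    """Keeps the send offset between callbacks (hypothetical helper)."""

    def __init__(self, data, write, free_space):
        self._data = data
        self._offset = 0
        self._write = write            # models queuing bytes for transmission
        self._free_space = free_space  # models querying the free send buffer

    @property
    def done(self):
        return self._offset >= len(self._data)

    def pump(self):
        """Send as much as fits right now.

        Call once to start, then again from every 'ACK received' callback.
        Returns True once the whole message has been handed over.
        """
        room = self._free_space()
        chunk = self._data[self._offset:self._offset + room]
        if chunk:
            self._write(chunk)
            self._offset += len(chunk)
        return self.done


class RequestAssembler:
    """Collects received segments until a full HTTP header block arrived."""

    def __init__(self):
        self._buf = bytearray()

    def on_segment(self, segment):
        """Called per received segment; returns the request or None."""
        self._buf += segment
        end = self._buf.find(b"\r\n\r\n")
        if end < 0:
            return None  # incomplete, wait for the next callback
        request = bytes(self._buf[:end + 4])
        del self._buf[:end + 4]
        return request
```

In this model the application never blocks: pump() returns immediately when the buffer is full, and progress resumes only when the next ACK callback fires, which is exactly why the offset has to be stored outside the callback.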
-
Raw Communication API
Aside from the memory issues, we compiled LWIP in NO-SYS mode, which does not support the familiar Berkeley-sockets-like API.
That complicated the programming of applications and prevented us from porting many of them.
We followed the explanation given here.
Project Description
The project was done in steps.
-
Step 1 - Porting LWIP to the cc2650 launchpad
The goal at the beginning of this project was to give the cc2650 launchpad a static IP address and, in particular, to make it respond to ICMP echo request packets (pings) sent
from the computer connected to it over a serial line.
In this step, following the link here, we did the following (under the Code Composer Studio IDE):
-
Excluding unneeded code from compilation (PPP support, IPv6 support, and unused protocols such as SNMP, DHCP, etc.).
-
Implementing a network driver interface: in port/sio.c we wrote our implementation, wrapping TI-RTOS' UART driver.
Later on, we improved this code: the UART driver has a small FIFO buffer, and unless LWIP polls it fast enough, bytes are lost. We moved to callback-oriented programming:
each time a number of bytes is received, our handler is called (by a SWI mechanism) and pushes the input into a mailbox.
The API of the network driver then tries to read bytes from the mailbox (with or without a timeout).
-
Writing the 'main' program code.
Initializing all the needed parts (LWIP, the launchpad), creating a single task, bringing up a network interface (IP address, gateway, subnet mask) over SLIP, initializing the needed LWIP applications,
and running a main while loop that polls the network interface and handles timeouts.
-
Handling the memory issues mentioned before (done mainly in port/lwipopts.h). We decreased LWIP's memory requirements to fit the launchpad's flash memory by
decreasing the number of connections that can be made simultaneously, the heap size, the send/receive buffers, the MSS, the window size, etc.
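The callback-plus-mailbox arrangement used by the network driver can be modeled as follows. This is a Python sketch of the idea only, not the TI-RTOS code in port/sio.c: queue.Queue stands in for the TI-RTOS mailbox, on_uart_bytes stands in for the SWI handler, and the class and method names are hypothetical.

```python
import queue


class UartMailbox:
    """Model of a bounded mailbox between a UART callback and a polling loop."""

    def __init__(self, capacity=256):
        self._box = queue.Queue(maxsize=capacity)
        self.dropped = 0  # bytes lost because the mailbox was full

    def on_uart_bytes(self, data):
        """Models the high-priority handler: never blocks, only enqueues."""
        for byte in data:
            try:
                self._box.put_nowait(byte)
            except queue.Full:
                self.dropped += 1

    def read(self, timeout=None):
        """Blocking read of one byte; timeout=None waits forever.

        Returns the byte value, or None if the timeout expired.
        """
        try:
            return self._box.get(timeout=timeout)
        except queue.Empty:
            return None

    def try_read(self, max_len):
        """Non-blocking read of up to max_len bytes that are already queued."""
        out = bytearray()
        while len(out) < max_len:
            try:
                out.append(self._box.get_nowait())
            except queue.Empty:
                break
        return bytes(out)
```

The point of the design is that the producer side only copies bytes and returns, so a small hardware FIFO is drained quickly even when the consumer loop is busy handling timeouts.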
-
Step 2 - First interactions with the cc2650 launchpad
After porting the library and implementing the wrapping code, we tested the code. At first we used PuTTY over a COM port; then we compiled our Windows port of LWIP (with a UART driver) and wrote a ping-sender application whose target IP was the IP of the launchpad.
Without sniffing capabilities (libpcap/Wireshark do not sniff serial ports), we declared success when the program on the PC received the ICMP echo replies and the traces of the program running on the cc2650 showed the received packets.
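For reference, the ICMP echo request that such a ping sender emits can be built by hand. The sketch below is in Python rather than the C used with the Windows LWIP port, and the function names are hypothetical; it follows the standard ICMP layout (type 8, code 0, checksum, identifier, sequence number) and the Internet checksum. A receiver validates a packet by checksumming it as a whole and expecting zero.

```python
import struct

ICMP_ECHO_REQUEST = 8


def internet_checksum(data):
    """16-bit ones' complement of the ones' complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return ~total & 0xFFFF


def build_echo_request(ident, seq, payload=b""):
    """Return the ICMP message (without the IP header)."""
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, ident, seq)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, checksum, ident, seq)
    return header + payload
```

An echo reply has the same layout with type 0, so the same checksum routine can be used to validate what comes back from the launchpad.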
-
Step 3 - Connecting the cc2650 to the LAN
After running TCP and UDP applications successfully on the launchpad, we wanted to develop a better and more generic solution for interacting with the micro-controller. In particular, we wanted it to be accessible from any host in the LAN/WAN.
By doing this (explained below) we achieved the following:
-
Communication through standard client software (ping.exe, Chrome, telnet, PuTTY and more), instead of proprietary test clients written with the Windows LWIP port over serial.
-
The board would be accessible from any device (PC/mobile) in the LAN/WAN (we tested it), and not just from the computer to which it is connected serially.
-
Viewing and debugging network traffic (with Wireshark, for example).
This was achieved by implementing a Python framework that acts as a bridge from the PC's serial connection to another NIC (e.g. a Wi-Fi NIC). Moreover, this bridge converts SLIP to Ethernet and vice versa.
The framework 'adds' the MCU to the LAN that the hosting PC is a member of (from a network point of view, the MCU and the hosting computer have different IP and MAC addresses and share the same physical NIC).
The framework uses pyserial (for the serial communication) and scapy (for raw read/write over the NIC).
The framework runs 3 threads:
-
"Ethernet2Slip" - Opens a raw socket on the WIFI interface.
Whenever an IP packet destined to the MCU (the MCU's IP and dummy MAC address) is received, a callback is called that removes the Ethernet header and applies the SLIP encoding. The new packet is sent over the serial line.
-
"Slip2Ethernet" - Reads bytes from the serial line.
Whenever a SLIP frame is received, a callback is called that decodes the SLIP frame and adds the Ethernet header.
The source MAC of the new Ethernet frame is the dummy MAC given to the MCU.
The destination MAC has to be resolved dynamically (we implemented an ARP request-reply mechanism with a cache), since the TI-RTOS side speaks SLIP and not Ethernet.
-
"ARP responser".
To let other hosts in the LAN communicate with the launchpad transparently (without configuring a static MAC entry), we implemented a thread that answers ARP requests asking for the 'MAC' of the launchpad.
The fake MAC is returned in the ARP response.
-
TBD - bridge's code pic.
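As a rough illustration of the framing work the three threads perform, here is a minimal sketch that uses only the Python standard library. It is not the code of bridge.py, which relies on pyserial and scapy; all function and class names are hypothetical. The SLIP byte values follow RFC 1055, and the ARP reply layout follows the standard Ethernet/IPv4 ARP format.

```python
import struct

END, ESC, ESC_END, ESC_ESC = 0xC0, 0xDB, 0xDC, 0xDD  # RFC 1055 special bytes
ETHERTYPE_IPV4, ETHERTYPE_ARP = 0x0800, 0x0806


def slip_encode(packet):
    """Escape END/ESC bytes and delimit the packet with END on both sides."""
    out = bytearray([END])
    for byte in packet:
        if byte == END:
            out += bytes([ESC, ESC_END])
        elif byte == ESC:
            out += bytes([ESC, ESC_ESC])
        else:
            out.append(byte)
    out.append(END)
    return bytes(out)


class SlipDecoder:
    """Incremental decoder: feed serial bytes, get back complete packets."""

    def __init__(self):
        self._buf = bytearray()
        self._escaped = False

    def feed(self, data):
        packets = []
        for byte in data:
            if self._escaped:
                self._escaped = False
                # Unknown escapes are kept as-is instead of being dropped.
                self._buf.append({ESC_END: END, ESC_ESC: ESC}.get(byte, byte))
            elif byte == ESC:
                self._escaped = True
            elif byte == END:
                if self._buf:  # ignore empty frames between END bytes
                    packets.append(bytes(self._buf))
                    self._buf.clear()
            else:
                self._buf.append(byte)
        return packets


def ethernet_wrap(payload, dst_mac, src_mac, ethertype=ETHERTYPE_IPV4):
    """Prepend a 14-byte Ethernet header (no VLAN tag, no FCS)."""
    return dst_mac + src_mac + struct.pack("!H", ethertype) + payload


def ethernet_unwrap(frame):
    """Strip the 14-byte Ethernet header and return the payload."""
    return frame[14:]


def build_arp_reply(mcu_mac, mcu_ip, asker_mac, asker_ip):
    """Answer 'who has mcu_ip?' on behalf of the MCU with its dummy MAC."""
    arp = struct.pack("!HHBBH6s4s6s4s", 1, ETHERTYPE_IPV4, 6, 4, 2,
                      mcu_mac, mcu_ip, asker_mac, asker_ip)
    return ethernet_wrap(arp, asker_mac, mcu_mac, ETHERTYPE_ARP)
```

The decoder is deliberately stateful, because bytes read from a serial port arrive in arbitrary pieces and a SLIP frame may be split across several reads.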
-
Step 4 - Developing an application using raw API
In the last step we wrote an HTTP server application (following proofs of concept of working TCP and UDP implementations).
It is explained in the next section.
Example Application
We created an example inspired by the world of the Web.
The Web is a relevant and highly developed technology these days.
We believe that adding web capabilities to an IoT component can increase user interaction and give developers more power.
So, we did the following:
-
The cc2650 launchpad runs a web server that accepts TCP connections on port 80 and serves static HTML pages. These pages also support basic CGI.
-
We added our own HTML page that provides an LED management interface (following the first exercise of the course).
-
After configuring port forwarding in our local router (NAT), we turned the LEDs on the launchpad on and off from a Chrome web browser running in one house, while the MCU was in another house.
-
Moreover, we wrote code that sends HTTP requests, allowing the launchpad to send dynamic information to an Internet web server via URL parameters.
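To make the last item concrete, the sketch below shows what a request carrying information in URL parameters looks like on the wire. It is a Python illustration, not the project's http_client.c; the function name, host, path and parameter names are all hypothetical.

```python
from urllib.parse import urlencode


def build_get_request(host, path, params):
    """Return the raw bytes of an HTTP/1.1 GET with URL-encoded parameters."""
    query = urlencode(params)
    target = path + ("?" + query if query else "")
    lines = [
        "GET %s HTTP/1.1" % target,
        "Host: %s" % host,
        "Connection: close",
        "",  # blank line ends the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")
```

Because all of the dynamic data travels in the request line, the server side needs nothing beyond ordinary query-string parsing, and the client does not have to build a request body.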
Src
Our project's code can be found in the following Git repository.
License
LWIP's license can be found here.
Future Challenges
-
Power Consumption
Our current port is quite inefficient.
LWIP runs as a single task, with a main while loop that polls the network driver and handles timeouts.
The main processor runs continuously, even without incoming traffic.
The following is a better design, enabling the cc2650 (as a server) to enter standby mode when no live connections exist:
-
Already implemented: instead of reading received bytes synchronously (UART_MODE_BLOCKING with a short timeout) from the UART driver, we use UART_MODE_CALLBACK. When a new byte is received by the UART driver (into its 32-byte FIFO), our SWI is called and pushes it to a (bigger) mailbox queue.
-
We need to research the TI-RTOS UART driver: how can UART_MODE_CALLBACK be power-efficient when no data is received?
-
sys_check_timeouts() should be executed in a different task with higher priority that wakes once every X ms, using the clock mechanism or Task_sleep().
-
After handling sys_check_timeouts() asynchronously, the main while loop performs only try_read(). Therefore we can change the Mailbox_pend(...) timeout to BIOS_WAIT_FOREVER. Now the LWIP task will "wake up" only when new bytes are received.
-
Maybe when there are no live TCP connections, we can disable the periodic sys_check_timeouts() calls and enter continuous standby mode. Further research into LWIP's internal implementation is required.
-
Tuning LWIP parameters - performance vs RAM limits:
The trade-off between the RAM limits of the cc2650 and LWIP's performance can be tuned better, in particular according to specific application requirements.
If an application can make do with fewer parallel TCP connections, each connection's send/receive buffer, MSS, window size (...) can be increased.
The default configuration we set can be improved as well, but more research is required.
-
Create a better API than the LWIP raw API
The main challenge that prevents us from supporting the "netconn API" is the lack of threads (time-shared scheduling). Other requirements, such as message queues, semaphores and ISR sampling of the UART, are supported by TI-RTOS.
We would like to run LWIP and the application as different threads (tasks). An application is significantly easier to program as continuous code than as a set of raw callbacks inside the LWIP thread.
Proposed design:
-
Already implemented: sampling the UART in callback mode. When a new byte is received, it is handled (pushed to the mailbox queue) immediately in the context of a SWI (higher priority than the LWIP/app tasks, so incoming bytes won't be lost).
-
Checking and handling LWIP's timeouts in the highest-priority task, which wakes once every X ms.
-
Application runs in medium priority task.
-
LWIP stack runs in lowest priority task.
-
We will wrap a "socket" object. socket_create() creates a new TCP_pcb object and registers a set of generic callbacks. It also creates a new semaphore and global recv/send buffers. A handle is returned (the index of the TCP_pcb in an array).
-
Socket APIs like send/recv will be translated to the raw API's callbacks, and will use inter-task communication (global buffers and semaphores).
-
Example: when a (blocking) recv(socket, *buff, size) is called by the application, its implementation calls Semaphore_pend, which blocks the app task and lets the low-priority LWIP task run. LWIP reads the received bytes from the mailbox (they are not lost, as explained above) and processes them. LWIP's raw-API receive callback appends the received data to <*buff>. If the required size or the push flag is reached, it also posts the semaphore, returning control to the app task.
This design fits only some kinds of applications: it requires them to enter our blocking API (send/receive/accept) or to sleep very often, so that the lower-priority LWIP task can handle communications.
Further thought is needed to support multiple sockets simultaneously and to handle edge cases properly.
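The blocking recv() idea can be prototyped on a PC before committing to a TI-RTOS implementation. The sketch below is a Python model under stated assumptions: threading.Condition is swapped in for the TI-RTOS semaphore plus global buffer, on_receive stands in for the raw-API receive callback running in the LWIP task, and the class name is hypothetical. It handles a single socket and ignores the push flag.

```python
import threading


class BlockingSocket:
    """Model of a blocking recv() built on top of a receive callback."""

    def __init__(self):
        self._buf = bytearray()
        self._cond = threading.Condition()

    def on_receive(self, segment):
        """Models the receive callback: append data and wake the app task."""
        with self._cond:
            self._buf += segment
            self._cond.notify_all()

    def recv(self, size, timeout=None):
        """Block until `size` bytes are buffered, then return them.

        Returns b"" if the timeout expires first (timeout=None waits forever).
        """
        with self._cond:
            ready = self._cond.wait_for(lambda: len(self._buf) >= size, timeout)
            if not ready:
                return b""
            data = bytes(self._buf[:size])
            del self._buf[:size]
            return data
```

A quick way to exercise it is to call on_receive() from a second thread with a request split into several segments and check that recv() returns only once the requested number of bytes has accumulated.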
Code - Focus on the interesting pieces of our code
-
port/sio.c - Implementation of the network driver (wrapping the TI-RTOS uart read/write interface)
-
port/lwipcfg.h - LWIP's related applications - Initialized applications, and their settings (IP, NETMASK, GW for example).
-
port/lwipopts.h - LWIP's configuration for this port (size of blocks to allocate, number of pcbs and more).
-
Used applications (ran on the launchpad) source:
-
src/apps/tcpecho_raw/tcpecho_raw.c - From LWIP repo, a TCP echo server that uses a raw API.
-
src/apps/udpecho_raw/udpecho_raw.c - From LWIP repo, a UDP echo server that uses a raw API.
-
src/apps/httpd/http_client.c - Our own HTTP client (based on the LWIP API); sends HTTP requests to a
hardcoded address.
-
lwip_main.c - A file that exports one function, which initializes LWIP itself, its network interfaces and
all of the relevant applications. Right after that, an infinite LWIP loop is started, in which we read packets from
the driver and LWIP handles timeouts.
-
main.c - The 'main' of our program. Initializes 2 tasks. One task initializes our LWIP code, and the other
is an HTTP client that sends updates whenever a button is pressed (for the demo we only support updating
whenever a button is pressed and do not send which button was pressed).
-
external_files/bridge.py - Our bridge between the SLIP link (from the cc2650 to the serially connected computer)
and Ethernet (the LAN). It enables any PC in the LAN to send packets (HTTP packets, for example) to the cc2650.
Pcaps Examples
Screenshots