Implementing a LIN slave on STM32L011
LIN bus is a communication standard that originated in the automotive industry but is slowly gaining ground in other areas. It is less known than CAN, RS-485, I2C, SPI or USB but has some significant advantages when it comes to developing really cheap and simple devices that have to be networked across medium distances (~50 meters). In cars it is typically used for features like electrical mirror adjustments, motorized seats or parking sensors that do not require much bandwidth and are not safety-critical. LIN bus can also be found in IKEA desks.
Physical layer
LIN specification is freely available. Electrically, it is just a glorified, half-duplex, open-drain UART with a single bus pull-up that operates up to 20 kbaud. Every device needs three signals: ground, bus, supply (also called VBat). Due to low speed the wiring can be practically "anything" (up to a point - limited by capacitance), there is no need for termination or controlled impedance.
The supply in a car often comes from the 12 V electrical system. Logic levels on the bus are specified in terms of the supply voltage. This means that in practice all bus transceivers must use the same supply voltage (ie. mixing 12 V and 24 V supplies will not work). Of course the node can be powered from anything. Only the transceivers need the common bus supply voltage.
Transceiver chips are used to connect nodes to the bus. A LIN transceiver has VBat and bus pins on the LIN "side" and regular TXD/RXD on the low voltage side (plus a common ground of course). There are also system basis chips that have a built-in LDO so you can build a LIN node using just two chips (the system basis chip and an MCU). It is easy to get transceivers that can operate with 8-28 V bus voltages.
In a LIN system (called a "cluster" in the spec) there is a single master node and multiple slave nodes. The only hardware difference (the communication protocol aside) is that the master node has a 1 kΩ bus pull-up resistor to VBat.
Protocol
On the protocol level LIN is just a regular half-duplex UART, typically running at 2400, 9600 or 19200 baud (anything up to 20 kbaud is allowed). There are some additional constraints. Data is delivered in frames that can hold up to 8 bytes of payload. Every frame consists of a break, a sync byte (0x55), a protected identifier byte (PID), the payload and a checksum byte.
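As a rough worked example, the time a frame occupies the bus can be estimated from the bit counts. This is only a sketch: it assumes the nominal figures from the LIN 2.x specification (a 13-bit break, a 1-bit delimiter and 10 bits on the wire per byte) and ignores any gaps between bytes. The function name is made up for illustration.

```python
def nominal_frame_time_s(payload_len, baud):
    """Nominal time on the wire for one LIN frame, in seconds."""
    if not 1 <= payload_len <= 8:
        raise ValueError("LIN payload is 1 to 8 bytes")
    header_bits = 13 + 1 + 10 + 10          # break, delimiter, sync, PID
    response_bits = 10 * (payload_len + 1)  # data bytes plus checksum, 8N1
    return (header_bits + response_bits) / baud


# A full 8-byte frame at 19200 baud takes roughly 6.5 ms.
print(round(nominal_frame_time_s(8, 19200) * 1000, 2), "ms")
```

Real frames may take longer because the standard lets nodes insert some idle time between bytes.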
Taken directly from the standard:
Clock tolerance
The standard was designed with very cheap devices in mind. A slave node does not need an accurate clock as it uses the sync byte to find out what the actual baud is. The standard allows for 14% clock tolerance. On-chip oscillators can easily achieve this accuracy so no crystals are needed. The standard demands an accurate clock only in the master but communication will work as long as both clocks (ie. baud) are within 14%. It does not matter if the master is transmitting at 9600 and the slave sees it at 9700 baud or the other way around because the timings are relative anyway.
Read and write frames
When a master wants to write data to the slaves it sends the complete frame with payload bytes and checksum. When a master wants to read data from a slave it sends only the break, sync and PID. The slave then replies with up to 8 data bytes and a checksum byte.
You can distinguish the devices on the bus with an oscilloscope. Every transceiver will have slightly different logic 0 levels (logic 1 level is set by the common pull-up resistor). If all "bottoms" are on the same level you can be sure that the whole frame was sent by one node, and it has to be the master node because only the master can send complete frames. If the "bottoms" of the data bytes differ from those of the sync and PID bytes, you know that it is a response from a slave.
TI has a good application note about LIN.
Protected identifier (PID)
The protected identifier field deserves some attention. The PID is a regular 8-bit byte. It is created from a frame ID that is 6 bits long. The two MSBs are parity bits calculated from the six LSBs. Practically speaking, only 64 of the 256 possible PID values are valid. The frame ID is simply PID & 0x3F.
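The parity calculation is short enough to show in full. The sketch below follows the parity equations from the LIN 2.x specification (P0 = ID0 ^ ID1 ^ ID2 ^ ID4, P1 = inverted ID1 ^ ID3 ^ ID4 ^ ID5); the function names are my own and not taken from any driver.

```python
def lin_pid(frame_id):
    """Return the protected identifier for a 6-bit frame ID."""
    if not 0 <= frame_id <= 0x3F:
        raise ValueError("frame ID must fit in 6 bits")

    def bit(n):
        return (frame_id >> n) & 1

    p0 = bit(0) ^ bit(1) ^ bit(2) ^ bit(4)        # goes to bit 6
    p1 = 1 ^ (bit(1) ^ bit(3) ^ bit(4) ^ bit(5))  # inverted, goes to bit 7
    return frame_id | (p0 << 6) | (p1 << 7)


def frame_id_from_pid(pid):
    """Return the frame ID, or None if the parity bits are wrong."""
    frame_id = pid & 0x3F
    return frame_id if lin_pid(frame_id) == pid else None
```

For example, frame ID 0x10 becomes PID 0x50, and the diagnostic IDs 0x3C and 0x3D become 0x3C and 0x7D.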
Frame ID is also used to set the length of the frame. The convention is that frames with IDs 0-31 (decimal) have 2 payload bytes, 32-47 have 4, 48-63 have 8. This constraint can be overridden in almost any LIN tool. The communication will work without issues but the device will not be LIN-compliant.
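The length convention maps directly to a small lookup. This is just the rule from the paragraph above written out; the function name is hypothetical.

```python
def conventional_payload_len(frame_id):
    """Payload length implied by the frame ID under the 2/4/8 convention."""
    if not 0 <= frame_id <= 63:
        raise ValueError("frame ID must be in 0..63")
    if frame_id < 32:
        return 2
    if frame_id < 48:
        return 4
    return 8
```

A node that ignores the convention would simply replace this lookup with its own table of frame lengths.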
Other parts of the LIN standard
The standard of course defines many more things like CAN message translation to LIN, exact electrical specs, file format of node capabilities, slave polling schedule, standard API of the protocol stack etc. However, all the details are not that critical if you just want to implement your own network and don't have to actually integrate something inside a car.
Having an MCU with a UART and a LIN transceiver you can of course implement any kind of protocol you want (eg. Modbus or XMODEM, or a totally custom text-based protocol) using only the physical layer of LIN. It is a UART just like any other.
I recommend keeping at least the basic structure of the frame the same as in the standard (ie. break + sync + PID + 8 bytes + checksum) so that the device works with off-the-shelf LIN tools and analyzers (eg. from Kvaser). The device can then also coexist with other LIN-compliant devices on the same bus.
How to implement a LIN master
This article is mainly about implementing a LIN slave node but what about a master? Or any simple tool to test the slave? I made a trivial adapter connecting a USB-to-UART FTDI chip to a LIN transceiver. The Python implementation with a hardcoded frame is trivial:
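A minimal sketch of such a master could look like the code below. It is an illustration under a few assumptions and not a tested tool: `port` is expected to behave like a pyserial `Serial` object (a `break_condition` attribute and a `write()` method), the enhanced checksum is used, and the frame ID, payload and 2 ms break length are arbitrary example values.

```python
import time


def lin_pid(frame_id):
    """Protected identifier: 6-bit frame ID plus two parity bits."""
    b = [(frame_id >> n) & 1 for n in range(6)]
    p0 = b[0] ^ b[1] ^ b[2] ^ b[4]
    p1 = 1 ^ (b[1] ^ b[3] ^ b[4] ^ b[5])
    return (frame_id & 0x3F) | (p0 << 6) | (p1 << 7)


def enhanced_checksum(pid, data):
    """Inverted 8-bit sum with carry over the PID and the data bytes."""
    total = pid
    for value in data:
        total += value
        if total > 0xFF:
            total -= 0xFF
    return ~total & 0xFF


def build_frame(frame_id, data):
    """Everything that follows the break: sync, PID, payload, checksum."""
    pid = lin_pid(frame_id)
    return bytes([0x55, pid]) + bytes(data) + bytes([enhanced_checksum(pid, data)])


def send_frame(port, frame_id, data, break_s=0.002):
    """Send a break by holding the line low, then the rest of the frame."""
    port.break_condition = True   # drive the bus low
    time.sleep(break_s)           # inaccurate, but only a minimum matters
    port.break_condition = False  # release the bus
    port.write(build_frame(frame_id, data))
```

With pyserial installed this would be called as `send_frame(serial.Serial("/dev/ttyUSB0", 19200), 0x10, [1, 2])`, where the device path is just an example. To read from a slave, the same function can be cut short after the PID byte.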
The only tricky part is the break period. It must be longer than the transmission time of a single byte (with 14% tolerance) to be correctly detected by an unsynchronized slave. The delay in Python naturally will be inaccurate and wasteful. Another approach is to switch the port to a significantly lower speed (eg. 4800 baud), send a 0x00 byte and switch back to the intended baud.
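The baud-switching trick works because a 0x00 byte keeps the line low for nine bit times (start bit plus eight data bits) at the slower rate. A small sketch, again assuming a pyserial-like `port` object with a writable `baudrate` attribute plus `write()` and `flush()` methods; the helper names are mine.

```python
def break_bits_at_nominal(nominal_baud, slow_baud):
    """Length of a 0x00 byte sent at slow_baud, in nominal bit times."""
    return 9 * nominal_baud / slow_baud  # start bit + 8 zero data bits


def send_break_by_baud(port, nominal_baud, slow_baud=4800):
    """Generate a break by sending 0x00 at a lower baud rate."""
    port.baudrate = slow_baud
    port.write(b"\x00")
    port.flush()                  # wait until the byte has been handed over
    port.baudrate = nominal_baud  # back to normal speed for sync, PID, data
```

At 19200 baud a 0x00 byte sent at 4800 baud looks like a 36-bit break, comfortably above the 13-bit minimum. One caveat: with USB adapters `flush()` may return before the byte has physically left the chip, so switching back too early can cut the break short.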
The side effect of the bus being half-duplex is that every node receives its own data back during transmission. This is important for collision monitoring. If a node wants to send a logic 1 but sees a logic 0 on the bus it has to give up the transmission (and report the collision to its software). This means that it is perfectly fine to implement a frame ID meaning "does anybody have something to report to the master" and if two or more devices start transmitting at the same time the frame that has the most leading zeros will "win" and will be correctly received by the master. The process can then be repeated until all events are reported back.
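The arbitration described above can be modelled in a few lines. This is an idealized simulation and not firmware: it assumes all nodes start their response in the same bit slot, send responses of equal length, and stop instantly when they read back a 0 after sending a 1. Node names and payloads are made up.

```python
def uart_bits(byte):
    """One 8N1 character in wire order: start bit, data LSB first, stop bit."""
    return [0] + [(byte >> i) & 1 for i in range(8)] + [1]


def arbitrate(responses):
    """responses maps a node name to the bytes it tries to send.

    Returns (sorted names of nodes still transmitting at the end,
    bytes actually seen on the bus).
    """
    streams = {name: [bit for byte in data for bit in uart_bits(byte)]
               for name, data in responses.items()}
    lengths = {len(s) for s in streams.values()}
    if len(lengths) != 1:
        raise ValueError("responses must have equal length")
    active = set(streams)
    bus = []
    for i in range(lengths.pop()):
        level = min(streams[name][i] for name in active)  # wired-AND: any 0 wins
        bus.append(level)
        # A node that sent 1 but reads back 0 gives up.
        active = {name for name in active if streams[name][i] == level}
    seen = bytes(
        sum(bus[start + 1 + i] << i for i in range(8))
        for start in range(0, len(bus), 10)
    )
    return sorted(active), seen
```

Note that "leading zeros" means leading in wire order. UART sends the LSB first, so a node sending 0x10 beats a node sending 0x01 even though 0x10 is the larger number.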
Ways to implement a LIN slave
There are several ways of implementing a LIN node:
- Hardware controller - everything is handled by the hardware, the firmware reads and writes full frames and the controller takes care of the rest
- UART with LIN features - the hardware handles break detection and autobauding based on the sync field, bytes & checksumming are then handled by the firmware
- Generic UART - firmware needs a mechanism to detect the break and configure the UART baud settings based on the sync field. This is the approach shown in this article.
- Bit-banging - with a fast enough CPU everything can be implemented with only GPIOs, interrupts and timers. Of course it will place some load on the CPU.
Some features required by the LIN standard may be impossible to implement with just a regular UART. For example collision detection is not a standard UART feature so the node may keep sending its frame instead of stopping immediately when a collision occurs. Due to the echo from the transceiver the node can still detect that a collision has taken place (because the echo won't match the transmitted data) but it will only be detected after a complete byte (while a compliant node should stop transmitting immediately). The standard also requires detection of other corner cases like a bad stop bit or a timeout if the PID byte is not received in time.
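The byte-level echo check can be modelled like this. It is a Python illustration of the idea rather than firmware code, and it assumes a `port` object with blocking `write()` and `read(1)` methods whose receiver sees everything that appears on the bus.

```python
def send_with_echo_check(port, data):
    """Send data byte by byte; return how many bytes were echoed back intact."""
    for count, value in enumerate(data):
        out = bytes([value])
        port.write(out)
        if port.read(1) != out:  # mismatch or read timeout: treat as collision
            return count         # stop here, the rest is not sent
    return len(data)
```

If the return value is smaller than `len(data)` the caller knows that the response was disturbed and can report the collision.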
My general approach in this implementation is that if the node can't receive correct traffic then there is not much it can do. Of course the firmware could track the various failures independently and report back when the bus starts working again but the correct thing to do is always the same - do nothing or stop and put the node into a safe state. For example if the node controls a window motor in a car it should simply stop. In cars the master can also stop polling when the car shuts down. This is a normal situation so a LIN node should enter a low-power sleep mode after several seconds of bus inactivity.
STM32L011 implementation
STM32L011 is one of the smallest (by memory and by package size) STM32 available. What can you do with such a small device? Still a lot. For example remote IO, control of relays, sensors.
The STM32L011 USART lacks any LIN features (unlike the USARTs in larger STM32s) so the break detection and autobauding have to be done in firmware. After autobauding the reception and transmission of data is handled just like in any other UART.
Header
The interface is pretty simple. The task function runs in the main loop. Frames with data are delivered from the task but reception of the frame ID is handled in the interrupt context. It simplifies the overall logic & control flow. When a frame ID is received the code calls lin_slave_handle_rx_frame_id_fromISR_EXTERNAL(), which is implemented in the application. That function has to decide if it wants to receive data (and how much), transmit data back to the master (by calling lin_slave_send_response()), or ignore the incoming frame altogether.
Implementation
The driver uses several peripherals:
- EXTI for GPIO interrupts
- Timer for measuring the sync field and timeouts
- USART for reception and transmission after autobauding
Reception is handled in several steps. The most important concern when developing the driver is that it is robust in presence of interference or junk data on the bus, and is always able to recover itself after some timeout. The code is not that easy to follow because the linear chain of events (as the frame arrives) happens in three separate interrupt handlers (with some state shared between them). Maybe the better way would be to have a single function with a state machine that consumes an "event" and the three interrupt handlers only calling it with a proper argument...
The first step is to detect the falling edge (marking the beginning of the break) and start a timer.
Next, EXTI is configured to trigger on the rising edge. The timer is captured on the rising edge and a minimal sanity check is applied to reject very short pulses (and avoid waiting for a sync field). The timer can overflow before EXTI senses a rising edge and this is okay (only the minimum break length is defined in the standard, infinitely long breaks are okay).
The second step is to measure the sync field and prepare the UART. This is done by starting a timer on the first edge, stopping it on the fifth and applying sanity checking. There is a slight standard non-conformance in this step as the specification requires rejecting the synchronization if the pulses differ too much from each other. My implementation measures the length of all five pulses without checking if they are similar. If the timer overflows then the whole process is reset and the driver starts from the beginning (waiting for the break period).
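For comparison, a stricter measurement that also checks pulse similarity could be sketched as follows. This is not what the driver does. It assumes that the timer value is captured on each of the five falling edges of the sync byte (so the first and last edge are 8 bit times apart and neighbouring edges 2 bit times apart), and the 15% limit is an arbitrary choice of mine.

```python
def measure_sync(edge_ticks, max_deviation=0.15):
    """Return timer ticks per bit, or None if the sync field looks wrong.

    edge_ticks: timer values captured at the five falling edges of 0x55.
    """
    if len(edge_ticks) != 5:
        return None
    total = edge_ticks[4] - edge_ticks[0]  # spans 8 bit times
    if total <= 0:
        return None
    expected = total / 4                   # each interval should be 2 bit times
    for first, second in zip(edge_ticks, edge_ticks[1:]):
        if abs((second - first) - expected) > max_deviation * expected:
            return None                    # pulses differ too much, reject
    return total / 8
```

A rejected measurement would be handled like a timer overflow: reset and wait for the next break.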
The sync measurement is used to calculate the baud settings for the UART. It can also be used to update a variable that holds the apparent frequency of the chip. This can be used by the whole application for accurate timings and delays after at least one LIN frame is received.
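The arithmetic behind both uses is simple. The sketch below assumes that the measurement spans 8 bit times, that the timer and the USART run from the same unprescaled clock, and that the USART uses 16x oversampling (where the divider is just the clock divided by the baud rate). The example numbers are invented.

```python
def baud_from_sync(sync_ticks, timer_hz):
    """Baud rate seen by a timer running at timer_hz."""
    return 8 * timer_hz / sync_ticks


def brr_from_sync(sync_ticks):
    """USART divider: f_ck / baud = sync_ticks / 8, rounded to nearest."""
    return (sync_ticks + 4) // 8


def apparent_clock_hz(sync_ticks, master_baud):
    """Chip clock frequency, trusting the master's baud rate to be exact."""
    return sync_ticks * master_baud / 8


# A nominally 16 MHz chip measuring 6800 ticks on a 19200 baud bus
# is actually running at about 16.32 MHz.
print(brr_from_sync(6800), apparent_clock_hz(6800, 19200))
```

Note that the divider does not depend on knowing the clock frequency at all, which is why an inaccurate on-chip oscillator is good enough.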
The final step is to enable the UART and keep receiving the data bytes (while computing the checksum on the fly). This code uses only enhanced checksums. Enhanced checksum includes the PID byte while a classic checksum doesn't. The LIN standard generally accepts both but only one type can be used in the same cluster (with the exception that classic checksum must always be used for some particular diagnostic frames for backwards compatibility).
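Both checksum types are the same inverted 8-bit sum with carry; they differ only in whether the PID is included. A minimal sketch with a function name of my own choosing:

```python
def lin_checksum(data, pid=None):
    """Classic checksum if pid is None, enhanced checksum otherwise."""
    total = 0 if pid is None else pid
    for value in data:
        total += value
        if total > 0xFF:   # add the carry back in
            total -= 0xFF
    return ~total & 0xFF


def checksum_ok(data, received, pid=None):
    """Receiver-side check of a complete frame."""
    return lin_checksum(data, pid) == received
```

For the data bytes 0x4A 0x55 0x93 0xE5 the classic checksum is 0xE6. Because the sum is updated one byte at a time it can be computed on the fly as bytes arrive.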
When testing the code on the STM32L011 I found out that the USART loses data if it arrives immediately after reconfiguration or enabling. The only way I could make the reception reliable was to enable the USART (and apply baud settings) while the sync word was still being received. This leads of course to the problem that the USART has to be configured before the sync field is measured.
Ultimately, if the initial baud settings are wrong then the very first frame after power-on will be dropped. Allowing the initial frame to be dropped (and using the sync length from the previous frame to receive the current one) simplifies the code. This is a non-issue when there is periodic traffic on the bus.
Application
Frame ID handling can be implemented, for example, this way:
The convention is that if the handler returns a positive integer it means the number of expected bytes to receive, zero to ignore the frame, and a negative number if the node has sent a response to the master.
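The return value convention can be modelled in Python like this. It is only a model of the convention and not the driver's C code: the two frame IDs, the payload and the `send_response` callback are hypothetical stand-ins.

```python
RX_SET_OUTPUTS = 0x10  # hypothetical "write" frame carrying 2 payload bytes
TX_STATUS = 0x11       # hypothetical "read" frame answered by this node


def handle_frame_id(frame_id, send_response, status=b"\x12\x34"):
    """Decide what to do with a frame as soon as its ID is known."""
    if frame_id == RX_SET_OUTPUTS:
        return 2               # positive: expect this many payload bytes
    if frame_id == TX_STATUS:
        send_response(status)  # the data must already be prepared
        return -1              # negative: a response has been sent
    return 0                   # zero: not for this node, ignore the frame
```

The handler does no real work itself. It only picks one of the three outcomes so that it stays short enough to run in the interrupt context.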
The response must of course be ready before the frame ID is received. The usual LIN convention is that the master first sends a frame that instructs one slave what it should return and then sends a "read" frame (without payload) that the slave fills with its own data. The lin_slave_send_response() function should be called immediately in the handler. If it were called from the main task the timing might not be predictable and the master might interpret a delayed response as a timeout.
The application has to implement lin_slave_handle_rx_frame_EXTERNAL(). It is called from the main task after a complete frame has been received.
I release the code into the public domain.