M0AGX / LB9MG

Amateur radio and embedded systems

Implementing a LIN slave on STM32L011

LIN bus is a communication standard that originated in the automotive industry but is slowly gaining ground in other areas. It is less well known than CAN, RS-485, I2C, SPI or USB but has some significant advantages when it comes to developing really cheap and simple devices that have to be networked across medium distances (~50 meters). In cars it is typically used for features like electric mirror adjustment, motorized seats or parking sensors that do not require much bandwidth and are not safety-critical. LIN bus can also be found in IKEA desks.

Physical layer

The LIN specification is freely available. Electrically, it is just a glorified, half-duplex, open-drain UART with a single bus pull-up that operates at up to 20 kbaud. Every device needs three signals: ground, bus and supply (also called VBat). Due to the low speed the wiring can be practically "anything" (up to a point - limited by capacitance); there is no need for termination or controlled impedance.

The supply in a car often comes from the 12 V electrical system. Logic levels on the bus are specified relative to the supply voltage. This means that in practice all bus transceivers must use the same supply voltage (i.e. mixing 12 V and 24 V supplies will not work). Of course the node itself can be powered from anything. Only the transceivers need the common bus supply voltage.

Transceiver chips are used to connect nodes to the bus. A LIN transceiver has VBat and bus pins on the LIN "side" and regular TXD/RXD on the low voltage side (plus a common ground of course). There are also system basis chips that have a built-in LDO so you can build a LIN node using just two chips (the system basis chip and an MCU). It is easy to get transceivers that can operate with 8-28 V bus voltages.

In a LIN system (called a "cluster" in the spec) there is a single master node and multiple slave nodes. The only difference (apart from the communication protocol) is that the master node has a 1k bus pull-up resistor to VBat.

Protocol

On the protocol level LIN is just a regular half-duplex UART running nominally at 9600, 19200 or 20000 baud. There are some additional constraints. Data is delivered in frames. Every frame consists of a break, a sync byte (0x55), a protected identifier byte (PID), a payload (up to 8 bytes) and a checksum byte.

[Figure: LIN frame, taken directly from the standard]

Clock tolerance

The standard was designed with very cheap devices in mind. A slave node does not need an accurate clock as it uses the sync byte to find out what the actual baud is. The standard allows for 14% clock tolerance. On-chip oscillators can easily achieve this accuracy so no crystals are needed. The standard demands an accurate clock only in the master but communication will work as long as both clocks (ie. baud) are within 14%. It does not matter if the master is transmitting at 9600 and the slave sees it at 9700 baud or the other way around because the timings are relative anyway.

Read and write frames

When a master wants to write data to the slaves it sends the complete frame with payload bytes and checksum. When a master wants to read data from a slave it sends only the break, sync and PID. The slave then replies with up to 8 data bytes and a checksum byte.

You can distinguish the devices on the bus with an oscilloscope. Every transceiver will have a slightly different logic 0 level (the logic 1 level is set by the common pull-up resistor). If all "bottoms" are on the same level you can be sure that the whole frame was sent by one node, and it has to be the master node because only the master can send complete frames. If the "bottoms" of the data bytes differ from those of the sync and PID bytes you know that the data is a response from a slave.

TI has a good application note about LIN.

Protected identifier (PID)

The protected identifier field deserves some attention. The PID is a regular 8-bit byte. It is created from a frame ID that is 6 bits long. The two MSBs are parity bits calculated from the 6 LSBs. Practically speaking, only 64 of the 256 possible PID values are valid. The frame ID is simply PID & 0x3F.
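To make the parity bits less magic, here is a minimal Python sketch of the PID calculation. The parity equations are the ones from the LIN 2.x specification as I understand them, and the function names are my own, not part of any library. The example value matches the frame used elsewhere in this article (ID 0x30, PID 0xF0).

```python
def lin_pid(frame_id):
    """Return the protected identifier (PID) for a 6-bit LIN frame ID."""
    if not 0 <= frame_id <= 0x3F:
        raise ValueError("frame ID must fit in 6 bits")

    def bit(n):
        return (frame_id >> n) & 1

    p0 = bit(0) ^ bit(1) ^ bit(2) ^ bit(4)        # parity over ID0, ID1, ID2, ID4
    p1 = (bit(1) ^ bit(3) ^ bit(4) ^ bit(5)) ^ 1  # inverted parity over ID1, ID3, ID4, ID5
    return frame_id | (p0 << 6) | (p1 << 7)


def lin_frame_id(pid):
    """Recover the frame ID from a PID and verify the parity bits."""
    frame_id = pid & 0x3F
    if lin_pid(frame_id) != pid:
        raise ValueError("parity bits do not match the frame ID")
    return frame_id


print(hex(lin_pid(0x30)))  # 0xf0
```

A receiver that wants to be strict can use the second function to reject headers with bad parity. The driver in this article only masks the PID with 0x3F.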

The frame ID is also used to set the length of the frame. The convention is that frames with IDs 0-31 (decimal) have 2 payload bytes, 32-47 have 4, and 48-63 have 8. This constraint can be overridden in almost any LIN tool. The communication will work without issues but the device will not be LIN-compliant.
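The length convention can be written down as a tiny lookup. This is only a sketch of the rule stated above; the function name is hypothetical.

```python
def conventional_payload_length(frame_id):
    """Payload length implied by the frame ID under the convention described above."""
    if not 0 <= frame_id <= 63:
        raise ValueError("frame ID must be in the range 0-63")
    if frame_id <= 31:
        return 2
    if frame_id <= 47:
        return 4
    return 8


for example_id in (10, 33, 49):
    print(example_id, conventional_payload_length(example_id))
```

A node that overrides the convention simply uses its own table instead of this function.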

Other parts of the LIN standard

The standard of course defines many more things like CAN message translation to LIN, exact electrical specs, file format of node capabilities, slave polling schedule, standard API of the protocol stack etc. However, all the details are not that critical if you just want to implement your own network and don't have to actually integrate something inside a car.

Having an MCU with a UART and a LIN transceiver you can of course implement any kind of protocol you want (eg. Modbus or XMODEM, or a totally custom text-based protocol) using only the physical layer of LIN. It is a UART just like any other.

I recommend keeping at least the basic structure of the frame the same as in the standard (i.e. break + sync + PID + up to 8 bytes + checksum) so that the device works with off-the-shelf LIN tools and analyzers (e.g. from Kvaser). The device can then also coexist with other LIN-compliant devices on the same bus.

How to implement a LIN master

This article is mainly about implementing a LIN slave node but what about a master? Or any simple tool to test the slave? I made a trivial adapter connecting a USB-to-UART FTDI chip to a LIN transceiver. The Python implementation with a hardcoded frame is trivial:

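import sys
import time

import serial  # pyserial

# The serial port name is passed as the first argument
p = serial.Serial(sys.argv[1], 9600, timeout=10.0)

# Example frame:
# ID is 0x30, PID is 0xF0
# 00 00 55 F0 CA FE D0 0D BA AD BE EF 50

# Hold the line low for ~2 ms (about 19 bit times at 9600 baud) to generate the break
p.break_condition = True
time.sleep(0.002)
p.break_condition = False

# sync, PID, 8 data bytes, enhanced checksum
p.write(bytes([0x55, 0xF0, 0xCA, 0xFE, 0xD0, 0x0D, 0xBA, 0xAD, 0xBE, 0xEF, 0x50]))
p.flush()  # wait until the data has been written before closing the port
p.close()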

The only tricky part is the break period. It must be longer than the transmission time of a single byte (with 14% tolerance) to be correctly detected by an unsynchronized slave. The delay in Python naturally will be inaccurate and wasteful. Another approach is to switch the port to a significantly lower speed (eg. 4800 baud), send a 0x0 byte and switch back to the intended baud.
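The lower-baud approach could be sketched like this. The function names are hypothetical, and `port` is assumed to behave like a pyserial `Serial` object (a `baudrate` attribute plus `write()` and `flush()` methods). I have not measured this variant on real hardware.

```python
def send_break_via_slow_byte(port, nominal_baud=9600, break_baud=4800):
    """Generate a LIN break by sending 0x00 at a lower baud rate.

    `port` is assumed to look like a pyserial Serial object.
    """
    port.baudrate = break_baud
    port.write(b"\x00")
    port.flush()  # wait until the byte has been written out
    port.baudrate = nominal_baud


def break_length_in_nominal_bits(nominal_baud=9600, break_baud=4800):
    """Length of the resulting low pulse, expressed in bit times of the nominal baud."""
    # the start bit and the 8 zero data bits are all low: 9 slow bit times
    return 9 * nominal_baud / break_baud


print(break_length_in_nominal_bits())  # 18.0
```

With 4800 baud against a nominal 9600 the low pulse lasts 18 nominal bit times. Be aware that with USB adapters `flush()` may return before the byte has physically left the chip, so a short delay before switching the baud back may still be needed.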

The side effect of the bus being half-duplex is that every node receives its own data back during transmission. This is important for collision monitoring. If a node wants to send a logic 1 but sees a logic 0 on the bus it has to give up the transmission (and report the collision to its software). This means that it is perfectly fine to implement a frame ID meaning "does anybody have something to report to the master" and if two or more devices start transmitting at the same time the frame that has the most leading zeros will "win" and will be correctly received by the master. The process can then be repeated until all events are reported back.
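The arbitration described above can be modelled in a few lines of Python. This is a simplified simulation, not driver code: it ignores start and stop bits, assumes all nodes start at exactly the same moment, and lets nodes back off after a single bit (a node built around a plain UART would only notice after a whole byte). The function name and the node names are made up.

```python
def arbitrate(frames):
    """Simulate bitwise arbitration on a wired-AND bus.

    `frames` maps a node name to the bytes it tries to send.
    Returns (names of nodes still transmitting, bytes seen on the bus).
    """
    active = set(frames)
    length = min(len(f) for f in frames.values())
    seen = bytearray()
    for byte_index in range(length):
        value = 0
        for bit in range(8):  # a UART sends the least significant bit first
            sent = {n: (frames[n][byte_index] >> bit) & 1 for n in active}
            level = min(sent.values())  # any logic 0 pulls the whole bus low
            # nodes that sent a 1 but read back a 0 give up
            active = {n for n in active if sent[n] == level}
            value |= level << bit
        seen.append(value)
    return sorted(active), bytes(seen)


winners, bus = arbitrate({"a": bytes([0x12, 0x34]), "b": bytes([0x10, 0xFF])})
print(winners, bus.hex())  # node "b" wins, its frame arrives intact
```

Node "a" loses on the second transmitted bit of the first byte, where it sends a 1 while "b" sends a 0. The master receives the frame of "b" unmodified.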

Ways to implement a LIN slave

There are several ways of implementing a LIN node:

  • Hardware controller - everything is handled by the hardware, the firmware reads and writes full frames and the controller takes care of the rest
  • UART with LIN features - the hardware handles break detection and autobauding based on the sync field, bytes & checksumming are then handled by the firmware
  • Generic UART - firmware needs a mechanism to detect the break and configure the UART baud settings based on the sync field. This is the approach shown in this article.
  • Bit-banging - with a fast enough CPU everything can be implemented with only GPIOs, interrupts and timers. Of course it will place some load on the CPU.

Some features required by the LIN standard may be impossible to implement with just a regular UART. For example, collision detection is not a standard UART feature, so the node may keep sending its frame instead of stopping immediately when a collision occurs. Thanks to the echo from the transceiver the node can still detect that a collision has taken place (because the echo won't match the transmitted data) but it will only be detected after a complete byte (while a compliant node should stop transmitting immediately). The standard also requires detecting other corner cases like a bad stop bit or a timeout if the PID byte is not received in time.

My general approach in this implementation is that if the node can't receive correct traffic then there is not much it can do. Of course the firmware could track the various failures independently and report back when the bus starts working again but the correct thing to do is always the same - do nothing or stop and put the node into a safe state. For example if the node controls a window motor in a car it should simply stop. In cars the master can also stop polling when the car shuts down. This is a normal situation so a LIN node should enter a low-power sleep mode after several seconds of bus inactivity.

STM32L011 implementation

STM32L011 is one of the smallest (by memory and by package size) STM32 available. What can you do with such a small device? Still a lot. For example remote IO, control of relays, sensors.

The STM32L011 USART lacks any LIN features (unlike the USARTs in larger STM32s) so break detection and autobauding have to be done in firmware. After autobauding, reception and transmission of data are handled just like in any other UART.

Header

#pragma once
#include <stdint.h>

void lin_slave_init(void);

void lin_slave_send_response(const uint8_t *data, uint8_t length);
void lin_slave_task(void);

//provided by the application, called from the USART ISR
//returns >0: number of payload bytes to receive, 0: ignore the frame, <0: a response has been sent
int8_t lin_slave_handle_rx_frame_id_fromISR_EXTERNAL(uint8_t frame_id);

//data[0] contains frame ID
void lin_slave_handle_rx_frame_EXTERNAL(const uint8_t *data, uint8_t length); //provided by the application

The interface is pretty simple. The task function runs in the main loop. Frames with data are delivered from the task but reception of the frame ID is handled in the interrupt context. It simplifies the overall logic & control flow. When the frame ID is received the code calls lin_slave_handle_rx_frame_id_fromISR_EXTERNAL(), which is implemented in the application. That function has to decide if it wants to receive data (and how much), transmit data back to the master (by calling lin_slave_send_response()), or ignore the incoming frame altogether.

Implementation

The driver uses several peripherals:

  • EXTI for GPIO interrupts
  • Timer for measuring the sync field and timeouts
  • USART for reception and transmission after autobauding

Reception is handled in several steps. The most important concern when developing the driver is that it is robust in presence of interference or junk data on the bus, and is always able to recover itself after some timeout. The code is not that easy to follow because the linear chain of events (as the frame arrives) happens in three separate interrupt handlers (with some state shared between them). Maybe the better way would be to have a single function with a state machine that consumes an "event" and the three interrupt handlers only calling it with a proper argument...

The first step is to detect the falling edge (marking the beginning of the break) and start a timer. Next, EXTI is configured to trigger on the rising edge. Timer is captured on the rising edge and a minimal sanity check is applied to reject very short pulses (and avoid waiting for a sync field). The timer can overflow before EXTI senses a rising edge and this is okay (only minimum break length is defined in the standard, infinitely long breaks are okay).
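As a worked example of the sanity check, the sketch below computes the minimum break length in timer ticks the same way the LIN_BREAK_MIN_LENGTH_TICKS macro in the driver does. The 16 MHz clock is an assumption taken from the comments in the driver; the Python names are mine.

```python
SYSCLK_HZ = 16_000_000  # assumed timer clock (16 MHz, as in the driver comments)
LIN_BAUD = 9600         # nominal baud
BREAK_BITS = 14         # break length in bit times used by the driver
TOLERANCE = 0.8         # accept breaks down to 80% of that length


def break_min_ticks(sysclk_hz=SYSCLK_HZ, baud=LIN_BAUD,
                    bits=BREAK_BITS, tolerance=TOLERANCE):
    """Shortest low pulse (in timer ticks) that is accepted as a break."""
    return tolerance * bits * sysclk_hz / baud


def classify_low_pulse(ticks, overflowed=False):
    """Decide what a measured low pulse means."""
    if overflowed:
        return "break"  # very long breaks are fine
    return "break" if ticks > break_min_ticks() else "too short"


print(round(break_min_ticks()))  # 18667 ticks, about 1.17 ms at 16 MHz
```

Anything shorter than that (for example a single data byte full of zeros, which is low for 9 bit times, or 15000 ticks) is rejected without waiting for a sync field.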

The second step is to measure the sync field and prepare the UART. This is done by starting a timer on the first edge, stopping it on the fifth and applying sanity checking. There is a slight standard non-conformance in this step as the specification requires rejecting the synchronization if the pulses differ too much from each other. My implementation measures the total length of all five pulses without checking if they are similar. If the timer overflows then the whole process is reset and the driver starts from the beginning (waiting for the break period).
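The arithmetic behind the sync measurement is short enough to model in Python. The sync byte 0x55 is sent LSB first, so its falling edges are two bit times apart and the first and the fifth falling edge are separated by exactly 8 bit times. The sketch below mirrors the calculation done in the driver; the function name and default values are assumptions (16 MHz clock, 9600 baud, 20% window).

```python
def brr_from_sync(sync_ticks, sysclk_hz=16_000_000, nominal_baud=9600,
                  tolerance=0.2, previous_brr=None):
    """Turn a sync field measurement into a UART baud divider.

    Returns the new divider, or the previous one if the measurement
    falls outside the accepted window around the nominal baud.
    """
    nominal_brr = sysclk_hz // nominal_baud  # integer division, as in C
    if previous_brr is None:
        previous_brr = nominal_brr
    new_brr = sync_ticks // 8  # timer ticks per bit
    if nominal_brr * (1 - tolerance) < new_brr < nominal_brr * (1 + tolerance):
        return new_brr
    return previous_brr


print(brr_from_sync(13600))  # 1700, master is about 2% slower than nominal
print(brr_from_sync(24000))  # 1666, measurement rejected
```

With 16x oversampling the divider is simply the number of clock ticks per bit, which is why the measured value can be used directly.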

The sync measurement is used to calculate the baud settings for the UART. It can also be used to update a variable that holds the apparent frequency of the chip. This can be used by the whole application for accurate timings and delays after at least one LIN frame is received.
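A sketch of that frequency estimate, assuming the master transmits at exactly the nominal baud (the function names are hypothetical):

```python
def apparent_sysclk_hz(sync_ticks, master_baud=9600):
    """Estimate the real timer clock from the sync field measurement.

    The sync measurement spans 8 bit times, and the master's baud is
    assumed to be accurate.
    """
    return sync_ticks * master_baud / 8


def ticks_for_delay(delay_s, sync_ticks, master_baud=9600):
    """Number of timer ticks that corresponds to a delay in seconds."""
    return round(delay_s * apparent_sysclk_hz(sync_ticks, master_baud))


print(apparent_sysclk_hz(13600))  # 16320000.0, the oscillator runs 2% fast
```

Delays computed from the estimate instead of the nominal clock compensate for the error of the on-chip oscillator.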

The final step is to enable the UART and keep receiving the data bytes (while computing the checksum on the fly). This code uses only enhanced checksums. Enhanced checksum includes the PID byte while a classic checksum doesn't. The LIN standard generally accepts both but only one type can be used in the same cluster (with the exception that classic checksum must always be used for some particular diagnostic frames for backwards compatibility).
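For reference, both checksum variants fit in one short Python function. It uses the same add-with-carry scheme as the C code in this article; the function name is my own. The payload and PID are taken from the example frame in the Python master script, whose last byte is 0x50.

```python
def lin_checksum(data, pid=None):
    """LIN checksum: inverted 8-bit sum with carry wrap-around.

    Pass the PID to get the enhanced checksum, leave it out for the classic one.
    """
    total = 0 if pid is None else pid
    for byte in data:
        total += byte
        if total >= 256:
            total -= 255  # add the carry back into the sum
    return ~total & 0xFF


payload = bytes([0xCA, 0xFE, 0xD0, 0x0D, 0xBA, 0xAD, 0xBE, 0xEF])
print(hex(lin_checksum(payload, pid=0xF0)))  # 0x50 (enhanced)
print(hex(lin_checksum(payload)))            # 0x41 (classic)
```

The two results differ, so a node expecting one variant will reject frames that use the other.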

When testing the code on the STM32L011 I found out that the USART loses data if it arrives immediately after reconfiguration or enabling. The only way I could make the reception reliable was to enable the USART (and apply baud settings) while the sync word was still being received. This of course leads to the problem that the USART has to be configured before the sync field is measured.

Ultimately, if the initial baud settings are wrong then the very first frame after power-on will be dropped. Allowing the initial frame to be dropped (and using the sync length from the previous frame to receive the current one) simplifies the code. This is a non-issue when there is periodic traffic on the bus.

#include "lin_slave.h"
#include <stdbool.h>
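#include <stm32l0xx.h>
//SYSCLK_Hz (system clock frequency in Hz) is not defined in this file, it has to be provided by the project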

/* This timer is used to measure the length of the break
 * period and then to calculate effective baud of the sync word.
 * The nominal baud is 9600, let's say +/-20%.
 */
#define LIN_TIMER TIM2
#define LIN_UART USART2
#define LIN_BAUD 9600 //nominal speed
#define LIN_BREAK_LENGTH_SYMBOLS 14 //break must be at least this bit times long
#define LIN_BREAK_MIN_LENGTH_TICKS (0.8/*tolerance*/ * LIN_BREAK_LENGTH_SYMBOLS * SYSCLK_Hz / LIN_BAUD)
#define LIN_BAUD_TOLERANCE 0.2

static volatile bool _got_break;
static volatile uint8_t _sync_pulse_count;
static volatile uint8_t _buffer[16];
static volatile uint8_t _buffer_index;
static volatile uint8_t _bytes_to_receive;
static volatile uint8_t _bytes_to_send;
static volatile uint32_t _uart_brr = SYSCLK_Hz / LIN_BAUD; //startup value
static volatile bool _deliver_rx_callback;
static volatile uint8_t _received_length_for_rx_callback; //a byte count, a bool would collapse it to 0 or 1
static volatile uint8_t _last_PID;

void lin_slave_init(void){
    RCC->APB1ENR |= RCC_APB1ENR_TIM2EN_Msk | RCC_APB1ENR_USART2EN_Msk; //enable bus clock to the timer and USART

    LIN_UART->BRR = _uart_brr;
    LIN_UART->CR2 = USART_CR2_SWAP; //PCB has RX & TX miswired
    //rest of UART initialization is done after sync word is received

#if 0 //dirty transmit test
    LIN_UART->CR1 =
            USART_CR1_TE_Msk/*transmitter enable*/ |
            USART_CR1_UE_Msk/*USART enable*/;
    LIN_UART->TDR = 0x55;
    while (1){}
#endif

    LIN_TIMER->DIER = TIM_DIER_UIE_Msk; //enable overflow interrupt

    LIN_TIMER->CR1 = TIM_CR1_URS_Msk/*software update does not IRQ*/; //stop the timer
    LIN_TIMER->PSC = 0; //prescaling is not needed, we want to measure ~15 bit times at 9600baud-20%, that is 31250 ticks
    LIN_TIMER->EGR = TIM_EGR_UG_Msk; //zero timer & update prescaler

    //pins alternate function is done in pins_init()
    //SYSCFG_EXTICR3 is zeros at reset, so PA is the EXTI source
    EXTI->FTSR = EXTI_FTSR_FT9_Msk; //enable PA9 EXTI falling edge
    EXTI->IMR = EXTI_IMR_IM9_Msk; //enable PA9 EXTI interrupt

    NVIC_EnableIRQ(TIM2_IRQn);
    NVIC_EnableIRQ(EXTI4_15_IRQn);
    NVIC_EnableIRQ(USART2_IRQn);
}

void TIM2_IRQHandler(void){ //called when timer overflows (input pulse too long)
    LIN_TIMER->SR = 0; //clear overflow flag (and other flags)
    LIN_TIMER->CR1 = TIM_CR1_URS_Msk/*software update does not IRQ*/; //stop the timer
    LIN_TIMER->PSC = 0; //no prescaling
    LIN_TIMER->EGR = TIM_EGR_UG_Msk; //zero timer & update prescaler

    if (_got_break == false){
        //break period was so long that the timer overflowed - it is okay
        _got_break = true;

        //Reconfigure the timer for total frame timeout, this will reset
        //the state to default in case no bytes arrive after the break
        //PSC = 32 divides the clock by 33, which gives a ~135ms timeout @16MHz sysclk
        LIN_TIMER->CR1 = TIM_CR1_URS_Msk/*software update does not IRQ*/; //stop the timer
        LIN_TIMER->PSC = 32; //divide the timer clock by PSC + 1
        LIN_TIMER->EGR = TIM_EGR_UG_Msk; //zero timer & update prescaler
        LIN_TIMER->CR1 = TIM_CR1_URS_Msk/*software update does not IRQ*/ | TIM_CR1_CEN_Msk; //start the timer

    } else {
        //sync word was so long or the frame timed out, the timer overflowed - reset everything
        _got_break = false;
        _buffer_index = 0;
        _sync_pulse_count = 0;
        _bytes_to_receive = 0;

        LIN_UART->CR1 = 0; //disable UART

        //detect the falling edge of next break period
        EXTI->RTSR = 0; //disable rising edge interrupt
        EXTI->FTSR = EXTI_FTSR_FT9_Msk; //enable falling edge interrupt
    }
}

void EXTI4_15_IRQHandler(void){
    //This EXTI ISR is connected to RXD pin.
    //It is used for:
    // * break start detection (when executing the main loop)
    // * break end detection
    // * sync word measurement

    EXTI->PR = EXTI_PR_PIF9_Msk; //clear EXTI interrupt/event flag

    if (_got_break == false){ //we are measuring the break period

        if (EXTI->FTSR & EXTI_FTSR_FT9_Msk){
            //EXTI was configured for falling edge so this is the beginning of the break period
            _sync_pulse_count = 0;
            LIN_TIMER->PSC = 0; //no prescaling
            LIN_TIMER->EGR = TIM_EGR_UG_Msk; //zero timer & update prescaler
            LIN_TIMER->CR1 = TIM_CR1_URS_Msk/*software update does not IRQ*/ | TIM_CR1_CEN_Msk; //start the timer

            EXTI->FTSR = 0; //disable falling edge interrupt
            EXTI->RTSR = EXTI_RTSR_RT9_Msk; //enable rising edge interrupt

        } else {
            //EXTI was configured for rising edge so this is the end of the break period

            LIN_TIMER->CR1 = TIM_CR1_URS_Msk/*software update does not IRQ*/; //stop the timer
            uint32_t break_length_ticks = LIN_TIMER->CNT;
            if (break_length_ticks > LIN_BREAK_MIN_LENGTH_TICKS){
                _got_break = true;
            } //else break was too short (or timer overflowed), but _got_break could already be set by timer ISR

            //configure EXTI and timer for sync word measurement (or another break detection)
            LIN_TIMER->EGR = TIM_EGR_UG_Msk; //zero timer & update prescaler

            //start the timer - it can timeout if first sync pulse does not arrive
            LIN_TIMER->CR1 = TIM_CR1_URS_Msk/*software update does not IRQ*/ | TIM_CR1_CEN_Msk; //start the timer

            EXTI->RTSR = 0; //disable rising edge interrupt
            EXTI->FTSR = EXTI_FTSR_FT9_Msk; //enable falling edge interrupt
        }

    } else { //we are measuring the sync word, timer is already running

        if (_sync_pulse_count == 0){
            //this is the first low pulse, timer is configured for timeout
            //restart the timer to measure sync length

            LIN_TIMER->PSC = 0; //no prescaling
            LIN_TIMER->EGR = TIM_EGR_UG_Msk; //zero timer & update prescaler
            _sync_pulse_count = 1;

            //UART has to be enabled very early to capture PID byte without problems.
            //In the worst case the very first frame will be lost due to baud mismatch.
            LIN_UART->BRR = _uart_brr; //best known baud setting
            LIN_UART->CR1 =
                    USART_CR1_RE_Msk/*receiver enable*/ |
                    USART_CR1_RXNEIE_Msk/*receive buffer not empty interrupt enable*/ |
                    USART_CR1_UE_Msk/*USART enable*/;

            _bytes_to_receive = 2; //try to receive at least the sync and PID bytes
            _buffer_index = 0;

        } else {
            _sync_pulse_count++;

            if (_sync_pulse_count == 5/*includes start bit*/){
                //we got the whole sync word - tune our USART speed and start receiving bytes
                uint32_t sync_word_length_ticks = LIN_TIMER->CNT;

                uint32_t new_uart_brr = sync_word_length_ticks / 8;
                if (
                        (new_uart_brr > ((SYSCLK_Hz / LIN_BAUD)*(1 - LIN_BAUD_TOLERANCE))) &&
                        (new_uart_brr < ((SYSCLK_Hz / LIN_BAUD)*(1 + LIN_BAUD_TOLERANCE)))
                ){
                    _uart_brr = new_uart_brr;
                }

                //Reconfigure the timer for total frame timeout
                //PSC = 32 divides the clock by 33, which gives a ~135ms timeout @16MHz sysclk
                LIN_TIMER->CR1 = TIM_CR1_URS_Msk/*software update does not IRQ*/; //stop the timer
                LIN_TIMER->PSC = 32; //divide the timer clock by PSC + 1
                LIN_TIMER->EGR = TIM_EGR_UG_Msk; //zero timer & update prescaler
                LIN_TIMER->CR1 = TIM_CR1_URS_Msk/*software update does not IRQ*/ | TIM_CR1_CEN_Msk; //start the timer

                EXTI->RTSR = 0; //disable rising edge interrupt
                EXTI->FTSR = 0; //disable falling edge interrupt
            }
        }
    }
}

void USART2_IRQHandler(void){
    if (LIN_UART->ISR & USART_ISR_RXNE_Msk){
        _buffer[_buffer_index] = LIN_UART->RDR;

        if (_buffer_index == 1){
            //this is the frame ID

            _last_PID = _buffer[_buffer_index];
            int8_t status = lin_slave_handle_rx_frame_id_fromISR_EXTERNAL(_buffer[_buffer_index] & 0x3F/*convert PID to ID*/);
            if (status > 0){
                _bytes_to_receive = status + 1/*checksum byte*/ + 1; //second +1 because the variable is immediately decremented
            } else if (status < 0){
                //handler has set up response transmission
                return;
            } else {
                //handler wants to ignore this frame
                goto setup_next_frame_reception;
            }
        }
        _buffer_index++;
        _bytes_to_receive--;

        if ((_bytes_to_receive == 0) && _buffer_index){
            //we have received everything we wanted

            uint32_t checksum = 0;
            //enhanced checksum includes the original PID
            for (uint32_t i = 1/*skip sync byte*/; i < (uint32_t)(_buffer_index - 1/*received checksum byte*/); i++){
                checksum += _buffer[i]; //checksum per LIN spec 2.3.1.5
                if (checksum >= 256){
                    checksum -= 255;
                }
            }
            checksum = ~checksum;
            checksum &= 0xFF; //leave just the LSB
            if (checksum == _buffer[_buffer_index - 1]){

                _buffer[1] &= 0x3F/*convert PID to ID*/;
                _deliver_rx_callback = true;
                _received_length_for_rx_callback = _buffer_index - 1/*sync*/ - 1/*PID*/;

            } else {
                //checksum is invalid, can't do anything, the frame is dropped
            }
            goto setup_next_frame_reception;
        } //end if ((_bytes_to_receive == 0) && _buffer_index)...

        if (_buffer_index == sizeof(_buffer)){ //overflow
            goto setup_next_frame_reception;
        }

        return;

    } else if (_bytes_to_send){ //this must be the TXEIE interrupt

        LIN_UART->TDR = _buffer[_buffer_index];
        _bytes_to_send--;
        _buffer_index++;
        if (_bytes_to_send == 0){ //disable TXEIE interrupt, enable TC interrupt
            LIN_UART->CR1 =
                    USART_CR1_TE_Msk/*transmitter enable*/ |
                    USART_CR1_TCIE_Msk/*TX complete interrupt enable*/ |
                    USART_CR1_UE_Msk/*USART enable*/;
        }
        return;
    } else { //this must be the TC interrupt - transmission of last byte is done, prepare reception for next frame

        goto setup_next_frame_reception;
    }

    setup_next_frame_reception:
    LIN_TIMER->CR1 = TIM_CR1_URS_Msk/*software update does not IRQ*/; //stop the timer

    _got_break = false;
    _buffer_index = 0;
    _sync_pulse_count = 0;
    _bytes_to_receive = 0;
    LIN_UART->CR1 = 0; //disable UART

    //detect the falling edge of next break period
    EXTI->RTSR = 0; //disable rising edge interrupt
    EXTI->FTSR = EXTI_FTSR_FT9_Msk; //enable falling edge interrupt
    EXTI->IMR = EXTI_IMR_IM9_Msk; //enable PA9 EXTI interrupt
}

void lin_slave_send_response(const uint8_t *data, uint8_t length){
    //this function is called by lin_slave_handle_rx_frame_id_fromISR_EXTERNAL() in ISR context

    //stop the timer, we're not waiting for any timeout in this state
    LIN_TIMER->CR1 = TIM_CR1_URS_Msk/*software update does not IRQ*/;

    uint32_t checksum = _last_PID; //enhanced checksum includes the original PID

    uint8_t *dst = (uint8_t*)&(_buffer[0]); //the RX buffer is reused for the response (PID was saved in _last_PID)
    for (uint32_t i = 0; i < length; i++){
        *dst = data[i];
        dst++;

        checksum += data[i]; //checksum per LIN spec 2.3.1.5
        if (checksum >= 256){
            checksum -= 255;
        }
    }

    *dst = ~checksum;
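    //_buffer[0] is written to TDR directly below, so the ISR only has to send
    //the remaining length - 1 data bytes and the checksum (length bytes in total)
    _bytes_to_send = length;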
    _buffer_index = 1;

    LIN_UART->CR1 =
            USART_CR1_TE_Msk/*transmitter enable*/ |
            USART_CR1_UE_Msk/*USART enable*/;

    LIN_UART->TDR = _buffer[0];

    LIN_UART->CR1 =
            USART_CR1_TE_Msk/*transmitter enable*/ |
            USART_CR1_TXEIE_Msk/*TX register empty interrupt enable*/ |
            USART_CR1_UE_Msk/*USART enable*/;

    //now the TX register empty interrupt will transmit rest of the bytes
}

void lin_slave_task(void){
    if (_deliver_rx_callback){
        _deliver_rx_callback = false;

        lin_slave_handle_rx_frame_EXTERNAL(
                (uint8_t*)_buffer + 1/*skip sync*/, //includes the ID
                _received_length_for_rx_callback);
    }
}

__attribute__((weak)) void lin_slave_handle_rx_frame_EXTERNAL(const uint8_t *data __attribute__((unused)), uint8_t length __attribute__((unused))){
    __BKPT(123); //has to be implemented in the application
}

__attribute__((weak)) int8_t lin_slave_handle_rx_frame_id_fromISR_EXTERNAL(uint8_t frame_id __attribute__((unused))){
    __BKPT(123); //has to be implemented in the application
    return 0;
}

Application

Frame ID handling can be implemented, for example, this way:

int8_t lin_slave_handle_rx_frame_id_fromISR_EXTERNAL(uint8_t frame_id){
    //requests
    switch (frame_id){
    case 10:
        return 2;
    case 33:
        return 4;
    case 0:
    case 49:
        return 8;
    }
    //response
    if (frame_id == GLOBAL_LIN_response_id){
        lin_slave_send_response((const uint8_t*)GLOBAL_LIN_response_buffer, GLOBAL_LIN_response_length);
        int8_t r = -GLOBAL_LIN_response_length;
        GLOBAL_LIN_response_length = 0;
        GLOBAL_LIN_response_id = 0;
        return r;
    }
    return 0;
}

The convention is that if the handler returns a positive integer it means the number of expected bytes to receive, zero to ignore the frame, and a negative number if the node has sent a response to the master.

The response must of course be ready before the frame ID is received. The usual LIN convention is that the master first sends a frame that instructs one slave what it should return and then the master sends a "read" frame (without payload) that the slave fills with its own data. The lin_slave_send_response() function should be called immediately in the handler. If it were called from the main task the timing would not be predictable and the master might interpret a delayed response as a timeout.

The application has to implement lin_slave_handle_rx_frame_EXTERNAL(). It is called from the main task after a complete frame has been received.

I release the code into the public domain.