M0AGX / LB9MG

Amateur radio and embedded systems

Connecting a high speed parallel ADC to SAM E70

ADCs come in all shapes and sizes. Converters built into MCUs have made great progress over the years, but sometimes you need a dedicated discrete ADC. At the extremes of the "speed spectrum" there are "slow but precise" ADCs (that tend to use SPI or I2C), and absolute monsters with SerDes outputs that can only be consumed by an FPGA.

My application needed an ADC from the middle ground but with multiple channels and simultaneous sampling. Interfacing a medium-speed multichannel ADC to a SAM E70 MCU turned out to be more interesting than I wished for. 😊

After some searching I selected the AD7606 for the gadget I was working on. The AD7606 in my application is an 8-channel ADC with 16-bit resolution. The most important feature is simultaneous sampling, which means that all channels are sampled at the exact same moment, so all your samples are already aligned in time when they come out of the ADC. This ADC has both serial (SPI) and parallel interfaces. However, SPI was too slow for my sample rate so I had to somehow marry the parallel interface to the SAM E70.

This is where the fun began...

The problems

Sampling has to be uniform, so the ADC needs a stable trigger signal with a frequency equal to the desired sample rate. This is easy to do using a timer in PWM mode. 🙂

The whole system will shuffle plenty of data so all operations should be handled by DMA. However, there is no easy way to trigger DMA from a timer on the SAM E70.

The ADC SPI interface is specified up to 23.5 MHz, which means that the ADC can output data at up to 23.5 Mbps. There are 8 channels, each 16-bit, so a "packet" from all channels is 128 bits. If you divide 23.5 Mbps by 128 bits you will theoretically get around 0.18 Mpacket/s. Alternatively, you can divide the other way around to find out that a complete readout at maximum SPI speed takes around 5.44 µs. The ADC also needs time to convert (t_conv in the datasheet), which is around 4.2 µs (and even longer with oversampling). Conversion and readout must happen sequentially (there are no FIFOs or buffers in this ADC), so the "total time" to convert & read a packet of samples is around 10 µs. This means an effective sample rate of only 100 kSPS. My application needed the maximum rate of 200 kSPS so I had to use the parallel interface.
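The serial-mode arithmetic above can be checked with a few lines of C. This is only a sketch: the function names are made up for illustration, and the 4.2 µs conversion time is the approximate figure quoted above, not a guaranteed datasheet limit.

```c
#include <assert.h>
#include <stdint.h>

#define ADC_CHANNELS        8u
#define ADC_BITS_PER_SAMPLE 16u

/* Time in nanoseconds (rounded up) to clock one packet, i.e. one sample
 * from every channel, out of a single serial line running at sck_hz. */
uint32_t serial_readout_ns(uint32_t sck_hz)
{
    assert(sck_hz > 0);
    uint64_t bits = ADC_CHANNELS * ADC_BITS_PER_SAMPLE; /* 128 bits */
    return (uint32_t)((bits * 1000000000ull + sck_hz - 1) / sck_hz);
}

/* Highest packet rate when conversion and readout happen back to back. */
uint32_t max_packet_rate_hz(uint32_t t_conv_ns, uint32_t readout_ns)
{
    assert(t_conv_ns + readout_ns > 0);
    return 1000000000u / (t_conv_ns + readout_ns);
}
```

Calling `serial_readout_ns(23500000)` gives roughly 5.45 µs, and `max_packet_rate_hz(4200, 5447)` lands a little above 100 kSPS, which matches the estimate in the text and is well short of the 200 kSPS target.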

The parallel interface is 8 or 16 bit wide. The readout is naturally 8/16 times faster but needs more pins. Big MCUs like the SAM E70 have an external bus interface that could be used to interface with the AD7606 but... not in the package that I had available.

This brings us to the issue of "How to do a parallel bus interface without such peripheral in the MCU?". First I thought about using the GPIOs. GPIO inputs can be read all at once so it should be easy. However, I could not find any GPIO port that would have at least 8 consecutive pins available in the package starting from bit 0 (to avoid having to shift the samples to make sense).

Fortunately, the SAM E70 has a "parallel input/output" (PIO) peripheral that can read 8 input pins at a time, organize them into 8/16/32-bit values, and supports DMA! Even better, the required pins were available in the MCU package I was using. The catch is that PIO can only read data when clocked externally. How do I generate the clock signal?

The ADC has a CS pin that behaves identically in SPI and parallel modes. The SCK pin also behaves identically in both modes. What if I use the MCU SPI peripheral in master mode only to generate the CS & SCK signals and leave MOSI and MISO disconnected? 🙂

Generating the timebase

I use a timer in PWM mode to generate the CONVST signal. The ADC starts converting on the rising edge. The falling edge is used to trigger the readout. Figuring out how to trigger DMA from a timer took quite some effort. The PWM duty cycle has to be high enough to give the ADC time to convert, but also as low as possible to leave time to read out the data from the ADC. The duty cycle has to be just right.
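To make the "just right" duty cycle concrete, here is a small sketch that turns a sample rate and a conversion time into timer counts. The struct, the function names and the 75 MHz timer clock used in the usage note are all hypothetical; the real values depend on how the timer is clocked in a given design.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t period_ticks; /* timer counts per sample period */
    uint32_t high_ticks;   /* counts CONVST stays high (conversion window) */
    uint32_t low_ticks;    /* counts left over for the readout */
} convst_timing_t;

/* Convert nanoseconds to timer ticks, rounding up so the ADC never gets
 * less time than requested. */
uint32_t ns_to_ticks_ceil(uint32_t timer_hz, uint32_t ns)
{
    return (uint32_t)(((uint64_t)timer_hz * ns + 999999999ull) / 1000000000ull);
}

convst_timing_t convst_timing(uint32_t timer_hz, uint32_t sample_hz,
                              uint32_t t_conv_ns)
{
    assert(sample_hz > 0 && timer_hz >= sample_hz);
    convst_timing_t t;
    t.period_ticks = timer_hz / sample_hz;
    t.high_ticks   = ns_to_ticks_ceil(timer_hz, t_conv_ns);
    /* Whatever is left of the period is the readout window. */
    t.low_ticks    = (t.high_ticks < t.period_ticks)
                   ? t.period_ticks - t.high_ticks : 0;
    return t;
}
```

With an assumed 75 MHz timer clock, `convst_timing(75000000, 200000, 4200)` yields a period of 375 counts, of which 315 are spent high. The remaining `low_ticks` is the window in which the readout clocks have to fit.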

[Diagram: overview]

How everything fits together

The application looks like a Jenga tower made of timer channels, DMA channels, SPI, PIO. And it actually works! Timer channels provide the signals for the ADC to convert and also trigger one of the DMA channels, that in turn tells SPI to transmit. SPI generates the CS & SCK signals for the ADC and PIO. PIO then reads the bits from the ADC. The hackiest part was to realize that two pin loopbacks on the PCB are needed for the MCU to keep kicking itself.

With all these things in place the CPU load to get data from the ADC is exactly 0% (if you are not counting the buffer full interrupt). There is of course some bus load but the SAM E70 can move a lot of data with the bus running at 150 MHz.

[Diagram: overview]

Timer configuration

Getting the timers and DMA to work was worth its own article so feel free to have a look. The only difference compared to the old article is that the descriptor triggered by timer capture is linked to the SPI descriptor (and the SPI descriptor links back to the timer capture descriptor).

SPI configuration

SPI initialization is very simple. Just a plain master mode:

SPI0->SPI_MR = SPI_MR_MSTR_MASTER | SPI_MR_PS; // master mode, variable peripheral select (CS line is chosen per transfer via TDR)

SPI0->SPI_CSR[DAQ_SPI_CS] =
    SPI_CSR_SCBR(15)       | // clock divider
    SPI_CSR_BITS_16_BIT    | // every transfer is 16-bits long
    SPI_CSR_CPOL_IDLE_HIGH | // clock idle high
    SPI_CSR_CSAAT;           // keep CS low until LASTXFER is written

SPI0->SPI_CR = SPI_CR_SPIEN; // enable the peripheral

The tricky issues:

  • Clock polarity has to be idle high. The ADC outputs on the falling edge and PIO has to read on the rising edge.
  • Backpressure has to be disabled (the WDRBT bit). Usually the SPI master would wait for the last received byte to be read out before transmitting the next byte to avoid losing data. In this case we don't care about the received data as the peripheral is only needed to generate CS & SCK.
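As a rough check of the clock divider, the SCK frequency is the SPI peripheral clock divided by SCBR. The sketch below assumes a 150 MHz peripheral clock (the bus speed mentioned earlier; the actual SPI clock in a given design may differ) and uses illustrative function names. It ignores any CS setup and hold delays.

```c
#include <assert.h>
#include <stdint.h>

/* SCK frequency for a given peripheral clock and SCBR divider value. */
uint32_t spi_sck_hz(uint32_t periph_hz, uint32_t scbr)
{
    assert(scbr >= 1 && scbr <= 255); /* SCBR is an 8-bit, non-zero divider */
    return periph_hz / scbr;
}

/* Time in ns (rounded up) for a number of SCK pulses. On the 8-bit
 * parallel bus every pulse clocks out one byte. */
uint32_t sck_pulses_ns(uint32_t sck_hz, uint32_t pulses)
{
    assert(sck_hz > 0);
    return (uint32_t)(((uint64_t)pulses * 1000000000ull + sck_hz - 1) / sck_hz);
}
```

Under that assumption `SPI_CSR_SCBR(15)` gives a 10 MHz SCK, and the 16 pulses of one transfer take 1.6 µs.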

SPI DMA descriptor

SPI has to generate 16 clock pulses. Reminder why: the ADC has 8 channels, each sample is 16 bits wide, and the interface is 8 bits wide. The nice thing about the SPI peripheral in the SAM E70 is that it has a 32-bit wide TX register. A single write can specify up to 2 data bytes to be sent and the CS line to use. This means that all I have to do is a single word write to the TX register using DMA. The SPI hardware takes care of the rest.
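To illustrate what that single 32-bit word contains, here is a host-side sketch that packs the data, chip select code and "last transfer" flag by hand. The macro names and bit positions are my reading of the SPI_TDR layout, not the vendor header definitions, so verify them against the datasheet before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed SPI_TDR field layout (check the datasheet): */
#define TDR_TD_MASK      0xFFFFu      /* bits 15:0, data to transmit */
#define TDR_PCS_SHIFT    16           /* bits 19:16, chip select code */
#define TDR_LASTXFER_BIT (1u << 24)   /* release CS after this transfer */

/* Without an external decoder, chip select 0 is encoded as 0b1110. */
#define PCS_NPCS0 0xEu

uint32_t spi_tdr_word(uint16_t data, uint32_t pcs, int last)
{
    assert(pcs <= 0xFu);
    uint32_t w = (uint32_t)data & TDR_TD_MASK;
    w |= pcs << TDR_PCS_SHIFT;
    if (last) {
        w |= TDR_LASTXFER_BIT;
    }
    return w;
}
```

`spi_tdr_word(0, PCS_NPCS0, 1)` produces `0x010E0000`: dummy data, chip select 0, and CS released once the 16 clocks are done.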

uint32_t spi_tx_data[] = {
    0x0               | // dummy data
    SPI_TDR_PCS_NPCS0 | // CS line to use
    SPI_TDR_LASTXFER    // pull CS high when done
};

dma_view2_desc_t dma_spi_tx_desc = {
    .NDA = (uint32_t)&dma_timer_capture_desc,  // link to the timer capture descriptor
    .UBC = XDMAC_CUBC_UBLEN(sizeof(spi_tx_data) / sizeof(spi_tx_data[0])) | // length in words
        (2 << 27) | // NVIEW = 2, next descriptor type is view 2
        (1 << 26) | // NDEN, update destination from next descriptor
        (1 << 25) | // NSEN, update source from next descriptor
        (1 << 24) , // NDE, next descriptor is enabled
    .SA = (uint32_t)spi_tx_data, // source
    .DA = (uint32_t)&(SPI0->SPI_TDR), //destination
    .CFG =
        XDMAC_CC_PERID(XDMAC_CHANNEL_HWID_SPI0_TX) | // use SPI TX trigger signal
        XDMAC_CC_DAM_FIXED_AM  | // fixed destination
        XDMAC_CC_SAM_FIXED_AM  | // fixed source
        XDMAC_CC_DIF_AHB_IF1   | // use AHB port 1 for destination
        XDMAC_CC_SIF_AHB_IF1   | // use AHB port 1 for source
        XDMAC_CC_DWIDTH_WORD   | // 32-bit data size
        XDMAC_CC_CSIZE_CHK_1   | // 1 data chunk
        XDMAC_CC_DSYNC_MEM2PER | // memory to peripheral
        XDMAC_CC_MBSIZE_SINGLE | // burst = 1 data
        XDMAC_CC_TYPE_PER_TRAN   // transfer synchronized with peripheral
};

Alternatively, if there is a spare DMA channel left this descriptor could be triggered from the timer capture signal and link back to itself instead of linking to the timer capture descriptor. This may save a couple of bus cycles the DMA would spend loading the linked descriptor.

PIO configuration

PIO is now my favorite peripheral. All it takes to set it up is:

PIOA->PIO_PCMR = PIO_PCMR_ALWYS | PIO_PCMR_DSIZE_WORD | PIO_PCMR_PCEN;

This configures capture on any enable pin (the enable pin has to be pulled up and cannot be used for anything else on the board) and sets the capture size to 32 bits. The peripheral has only 8 data inputs, so it buffers 4 bytes internally before generating the DMA request.

PIO DMA descriptor

Every ADC "packet" (16-bit readings of all 8 channels) consists of 16 bytes, so it makes the most sense to read a multiple of 16 bytes from the ADC before processing the data. Bigger buffers lead to more efficiency because the IRQ and the start of processing happen less often (at the price of increased latency). The size of the buffer can also be matched to fit neatly into a USB packet or Ethernet frame. The descriptor simply reads the PIO data register (a word at a time), places the data in the buffer until it reaches the end of the buffer, and then starts over from the beginning of the buffer. The DMA channel can deliver an interrupt after the whole buffer has been filled to tell the firmware to process the data.
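Once a buffer is full, each 16-byte packet still has to be turned into eight signed samples. The sketch below is one possible way to do it and rests on two assumptions that are not confirmed here: that the bytes sit in memory in the order they were captured, and that the ADC is configured to output the high byte of each sample first. The function name is illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADC_CHANNELS     8
#define ADC_PACKET_BYTES (ADC_CHANNELS * 2)

/* Decode one packet into per-channel samples.
 * Assumes capture-order bytes and high byte first (see the text). */
void adc_decode_packet(const uint8_t raw[ADC_PACKET_BYTES],
                       int16_t out[ADC_CHANNELS])
{
    assert(raw != NULL && out != NULL);
    for (size_t ch = 0; ch < ADC_CHANNELS; ch++) {
        uint16_t u = (uint16_t)((raw[2 * ch] << 8) | raw[2 * ch + 1]);
        out[ch] = (int16_t)u; /* samples are two's complement */
    }
}
```

If the byte order turns out to be the other way around, swapping the two indices inside the loop is the only change needed.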

Depending on the data processing speed it may be better to have two buffers (also called "ping-pong mode" in Silicon Labs' documentation) and two descriptors. When one is filled, the other one is being processed. First descriptor moves PIO data to the first buffer, then links to the second descriptor that fills the second buffer, and the second descriptor links back to the first descriptor.
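The bookkeeping for the two-buffer scheme can be very small. This is a host-runnable sketch with hypothetical names; on the target, `pingpong_swap()` would be called from the DMA "block done" interrupt, and the two buffers would be the destinations of the two linked descriptors.

```c
#include <assert.h>
#include <stdint.h>

#define PINGPONG_WORDS 512u

static uint32_t pingpong_buf[2][PINGPONG_WORDS];
static volatile unsigned filling = 0; /* buffer the DMA is writing now */

/* Call when a buffer has just been filled. Flips the active buffer and
 * returns the one that is now safe to process. */
uint32_t *pingpong_swap(void)
{
    unsigned done = filling;
    filling = done ^ 1u;
    assert(filling < 2u);
    return pingpong_buf[done];
}

/* Buffer currently owned by the DMA; do not touch it from firmware. */
uint32_t *pingpong_filling(void)
{
    return pingpong_buf[filling];
}
```

The firmware then has one full buffer period to finish processing before the DMA wraps around to the same buffer again.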

#define ADC_CAPTURE_WORDS 512
uint32_t adc_data_buffer[ADC_CAPTURE_WORDS];

dma_view2_desc_t dma_pio_read_desc; //forward declaration

dma_view2_desc_t dma_pio_read_desc = {
    .NDA = (uint32_t)&dma_pio_read_desc,  // next linked descriptor - itself
    .UBC = XDMAC_CUBC_UBLEN(ADC_CAPTURE_WORDS) | //length in words
           (2 << 27) | // NVIEW = 2, next descriptor type is view 2
           (1 << 26) | // NDEN, update destination from next descriptor
           (0 << 25) | // NSEN, do not update source from next descriptor
           (1 << 24) , // NDE, next descriptor is enabled
    .SA = (uint32_t) &(PIOA->PIO_PCRHR), // source
    .DA = (uint32_t) adc_data_buffer, //destination
    .CFG =
        XDMAC_CC_PERID_PIOA         | // trigger on PIO signal
        XDMAC_CC_DAM_INCREMENTED_AM | // incrementing destination
        XDMAC_CC_SAM_FIXED_AM       | // fixed source
        XDMAC_CC_DIF_AHB_IF1        | // use AHB port 1 for destination
        XDMAC_CC_SIF_AHB_IF1        | // use AHB port 1 for source
        XDMAC_CC_DWIDTH_WORD        | // 32-bit data size
        XDMAC_CC_CSIZE_CHK_1        | // 1 data chunk
        XDMAC_CC_DSYNC_PER2MEM      | // peripheral to memory
        XDMAC_CC_MBSIZE_SINGLE      | // burst = 1 data
        XDMAC_CC_TYPE_PER_TRAN        // transfer synchronized with peripheral
};
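A quick way to sanity-check a buffer size is to compute how many whole packets it holds and how often the "buffer full" interrupt fires. The helper names below are illustrative, and the 200 kSPS figure is the target rate mentioned earlier.

```c
#include <assert.h>
#include <stdint.h>

#define ADC_PACKET_BYTES 16u

/* Number of complete 16-byte packets in a buffer of 32-bit words.
 * Asserts that the buffer holds a whole number of packets. */
uint32_t packets_per_buffer(uint32_t words)
{
    uint32_t bytes = words * 4u;
    assert(bytes % ADC_PACKET_BYTES == 0);
    return bytes / ADC_PACKET_BYTES;
}

/* Time in microseconds between "buffer full" interrupts. */
uint32_t buffer_period_us(uint32_t words, uint32_t sample_hz)
{
    assert(sample_hz > 0);
    return (uint32_t)((uint64_t)packets_per_buffer(words) * 1000000ull
                      / sample_hz);
}
```

For the 512-word buffer above this gives 128 packets per buffer, so at 200 kSPS the interrupt would arrive every 640 µs.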

DMA configuration

Two DMA channels are used. The descriptors have to be loaded only once. Everything happens without firmware involvement after the timer output is started.

Channel 0 is used for the timer capture and SPI descriptors:

XDMAC->XdmacChid[0/*channel*/].XDMAC_CNDA = (uint32_t)&dma_timer_capture_desc;
XDMAC->XdmacChid[0/*channel*/].XDMAC_CNDC = TRIGGER_NEXT_DESC_SETTINGS;
XDMAC->XDMAC_GE = (1 << 0/*channel*/); // enable the channel

Channel 1 does PIO data reception and raises an interrupt when the buffer is full:

XDMAC->XdmacChid[1/*channel*/].XDMAC_CNDA = (uint32_t)&dma_pio_read_desc;
XDMAC->XdmacChid[1/*channel*/].XDMAC_CNDC = TRIGGER_NEXT_DESC_SETTINGS;
XDMAC->XDMAC_GE = (1 << 1/*channel*/);  // enable the channel
XDMAC->XDMAC_GIE = (1 << 1/*channel*/); // enable channel interrupt
NVIC_EnableIRQ(XDMAC_IRQn);

The CNDC constant is explained in the previous article.

Diagrams were made with Ditaa.