DMA for embedded systems

One of the key features of LwRB library is that it can be seamlessly integrated with DMA controllers on embedded systems.

Note

DMA stands for Direct Memory Access controller and is usually used to off-load CPU. More about DMA is available on Wikipedia.

DMA controllers normally use source and destination memory addresses to transfer data in-between. This features, together with LwRB, allows seamless integration and zero-copy of application data at interrupts after DMA transfer has been completed. Some manual work is necessary to be handled, but this is very minor in comparison of writing byte-by-byte to buffer at (for example) each received character.

Below are 2 common use cases:

DMA transfers data from LwRB memory to (usually) some hardware IP
DMA transfers data from hardware IP to memory

Zero-copy data from LwRB memory

This describes how to pass LwRB output memory address as pointer to DMA (or any other processing function). After data is successfully processed, application can skip processed data and mark buffer as free for new data being written to it.

Data transfer from memory to hardware IP

Case A: Initial state, buffer is full and holds 7 bytes
Case B: State after skipping R pointer for 3 bytes. Buffer now holds 4 remaining bytes
Case C: Buffer is empty, no more memory available for read operation

Code example:

Skip buffer data after usage

#include "lwrb/lwrb.h"

/* Declare rb instance & raw data */
lwrb_t buff;
uint8_t buff_data[8];

size_t len;
uint8_t* data;

/* Initialize buffer, use buff_data as data array */
lwrb_init(&buff, buff_data, sizeof(buff_data));

/* Use write, read operations, process data */
/* ... */

/* IMAGE PART A */

/* At this stage, we have buffer as on image above */
/* R = 5, W = 4, buffer is considered full */

/* Get length of linear memory at read pointer */
/* Function returns 3 as we can read 3 bytes from buffer in sequence */
/* When function returns 0, there is no memory available in the buffer for read anymore */
if ((len = lwrb_get_linear_block_read_length(&buff)) > 0) {
    /* Get pointer to first element in linear block at read address */
    /* Function returns &buff_data[5] */
    data = lwrb_get_linear_block_read_address(&buff);

    /* Send data via DMA and wait to finish (for sake of example) */
    send_data(data, len);

    /* Now skip sent bytes from buffer = move read pointer */
    lwrb_skip(&buff, len);

    /* Now R points to top of buffer, R = 0 */
    /* At this point, we are at image part B */
}

/* IMAGE PART B */

/* Get length of linear memory at read pointer */
/* Function returns 4 as we can read 4 bytes from buffer in sequence */
/* When function returns 0, there is no memory available in the buffer for read anymore */
if ((len = lwrb_get_linear_block_read_length(&buff)) > 0) {
    /* Get pointer to first element in linear block at read address */
    /* Function returns &buff_data[0] */
    data = lwrb_get_linear_block_read_address(&buff);

    /* Send data via DMA and wait to finish (for sake of example) */
    send_data(data, len);

    /* Now skip sent bytes from buffer = move read pointer */
    /* Read pointer is moved for len bytes */
    lwrb_skip(&buff, len);

    /* Now R points to 4, that is R == W and buffer is now empty */
    /* At this point, we are at image part C */
}

/* IMAGE PART C */

/* Buffer is considered empty as R == W */

Part A on image clearly shows that not all data bytes are linked in single contiguous block of memory. To send all bytes from lwrb, it might be necessary to repeat procedure multiple times

Skip buffer data for non-contiguous block

/* Initialization part skipped */

/* Get length of linear memory at read pointer */
/* When function returns 0, there is no memory
   available in the buffer for read anymore */
while ((len = lwrb_get_linear_block_read_length(&buff)) > 0) {
    /* Get pointer to first element in linear block at read address */
    data = lwrb_get_linear_block_read_address(&buff);

    /* If max length needs to be considered */
    /* simply decrease it and use smaller len on skip function */
    if (len > max_len) {
        len = max_len;
    }

    /* Send data via DMA and wait to finish (for sake of example) */
    send_data(data, len);

    /* Now skip sent bytes from buffer = move read pointer */
    lwrb_skip(&buff, len);
}

Zero-copy data to LwRB memory

Similar to reading data from buffer with zero-copy overhead, it is possible to write to lwrb with zero-copy overhead too. Only difference is that application now needs pointer to write memory address and length of maximal number of bytes to directly copy into buffer. After successful processing, buffer advance operation is necessary to manually increase write pointer and to increase number of bytes in buffer.

Case A: Initial state, buffer is empty as R == W
- Based on W pointer position, application could write 4 bytes to contiguous block of memory
Case B: State after advancing W pointer for 4 bytes. Buffer now holds 4 bytes and has 3 remaining available
Case C: Buffer is full, no more free memory available for write operation

Code example:

Advance buffer pointer for manually written bytes

/* Declare rb instance & raw data */
lwrb_t buff;
uint8_t buff_data[8];

size_t len;
uint8_t* data;

/* Initialize buffer, use buff_data as data array */
lwrb_init(&buff, buff_data, sizeof(buff_data));

/* Use write, read operations, process data */
/* ... */

/* IMAGE PART A */

/* At this stage, we have buffer as on image above */
/* R = 4, W = 4, buffer is considered empty */

/* Get length of linear memory at write pointer */
/* Function returns 4 as we can write 4 bytes to buffer in sequence */
/* When function returns 0, there is no memory available in the buffer for write anymore */
if ((len = lwrb_get_linear_block_write_length(&buff)) > 0) {
    /* Get pointer to first element in linear block at write address */
    /* Function returns &buff_data[4] */
    data = lwrb_get_linear_block_write_address(&buff);

    /* Receive data via DMA and wait to finish (for sake of example) */
    /* Any other hardware may directly write to data array */
    /* Data array has len bytes length */
    /* Or use memcpy(data, my_array, len); */
    receive_data(data, len);

    /* Now advance buffer for written bytes to buffer = move write pointer */
    /* Write pointer is moved for len bytes */
    lwrb_advance(&buff, len);

    /* Now W points to top of buffer, W = 0 */
    /* At this point, we are at image part B */
}

/* IMAGE PART B */

/* Get length of linear memory at write pointer */
/* Function returns 3 as we can write 3 bytes to buffer in sequence */
/* When function returns 0, there is no memory available in the buffer for write anymore */
if ((len = lwrb_get_linear_block_write_length(&buff)) > 0) {
    /* Get pointer to first element in linear block at write address */
    /* Function returns &buff_data[0] */
    data = lwrb_get_linear_block_write_address(&buff);

    /* Receive data via DMA and wait to finish (for sake of example) */
    /* Any other hardware may directly write to data array */
    /* Data array has len bytes length */
    /* Or use memcpy(data, my_array, len); */
    receive_data(data, len);

    /* Now advance buffer for written bytes to buffer = move write pointer */
    /* Write pointer is moved for len bytes */
    lwrb_advance(&buff, len);

    /* Now W points to 3, R points to 4, that is R == W + 1 and buffer is now full */
    /* At this point, we are at image part C */
}

/* IMAGE PART C */

/* Buffer is considered full as R == W + 1 */

Example for DMA transfer from memory

This is an example showing pseudo code for implementing data transfer using DMA with zero-copy overhead. For read operation purposes, application gets direct access to LwRB read pointer and length of contiguous memory.

It is assumed that after DMA transfer completes, interrupt is generated (embedded system) and buffer is skipped in the interrupt.

Note

Buffer skip operation is used to mark sent data as processed and to free memory for new writes to buffer

DMA usage with buffer

/* Declare rb instance & raw data */
lwrb_t buff;
uint8_t buff_data[8];

/* Working data length */
volatile size_t len;

/* Send data function */
void send_data(void);

int
main(void) {
    /* Initialize buffer */
    lwrb_init(&buff, buff_data, sizeof(buff_data));

    /* Write 4 bytes of data */
    lwrb_write(&buff, "0123", 4);

    /* Send data over DMA */
    send_data();

    while (1);
}

/* Send data over DMA */
void
send_data(void) {
    /* If len > 0, DMA transfer is on-going */
    if (len > 0) {
        return;
    }

    /* Get maximal length of buffer to read data as linear memory */
    len = lwrb_get_linear_block_read_length(&buff);
    if (len > 0) {
        /* Get pointer to read memory */
        uint8_t* data = lwrb_get_linear_block_read_address(&buff);

        /* Start DMA transfer */
        start_dma_transfer(data, len);
    }

    /* Function does not wait for transfer to finish */
}

/* Interrupt handler */
/* Called on DMA transfer finish */
void
DMA_Interrupt_handler(void) {
    /* Transfer finished */
    if (len > 0) {
        /* Now skip the data (move read pointer) as they were successfully transferred over DMA */
        lwrb_skip(&buff, len);

        /* Reset length = DMA is not active */
        len = 0;

        /* Try to send more */
        send_data();
    }
}

Tip

Check STM32 UART DMA TX RX Github repository for use cases.