Last time, we talked about UART, which is asynchronous. Now let’s talk about some synchronous interfaces! Synchronous simply means that a clock is provided along with data signals, so it is clear to both the master and slave when the data should be shifted out, or when the data should be sampled. Let’s look at SPI first.
Overview
SPI, which stands for Serial Peripheral Interface, is simpler than I2C. SPI generally has four or more signals: MOSI, MISO, SCK and at least one CS. MOSI, which stands for Master Out Slave In, is driven by the master and is read by the slaves. MISO, Master In Slave Out, is driven by the selected slave and read by the master. SCK (sometimes written SCLK) is the synchronous clock and is always driven by the master. CS, which stands for chip select, is driven by the master. While MOSI, MISO and SCK are all connected to the master and to every slave, CS is unique to each slave. When CS is high, that slave is not selected and is unresponsive to the master. When CS is low, that slave is selected, and it starts outputting on MISO, as well as sampling MOSI based on SCK. For example, the Variable Load schematic has two devices on the SPI bus:


SPI signals highlighted
The top picture shows the SPI signals on the microcontroller. Since there are two slave devices on the SPI bus, there are two CS signals: ADC-CS and DAC-CS. When the microcontroller wants to talk to the ADC (U2), ADC-CS is driven low. When the microcontroller wants to talk to the DAC (U1), DAC-CS is driven low. The microcontroller, ADC, and DAC all share MOSI and SCK. The DAC does not have MISO because it has nothing to say to the microcontroller; it only receives data from the master.
Strengths and Weaknesses
SPI’s advantage is its speed. The SPI interface is extremely fast for three reasons:
- The data clock can be tens of megahertz. Since one bit is transferred per clock cycle, that puts the bandwidth in the tens of megabits per second range. (The data clock can also be as slow as you want; it doesn’t have to be that fast if you don’t want it to be.)
- There is little overhead. I2C, as we’ll see later, must transmit an address to initiate data transfer. UART, as mentioned last time, needs stop and start bits, and maybe a parity bit. The CS signal(s) make addressing unnecessary, and the data clock makes the stop and start bits unnecessary.
- The communication is full-duplex. This means that data can be sent and received simultaneously. I2C, as we’ll see later, is half-duplex; this means that a device can only send or receive, but not both at the same time.
Because SPI is so fast, the SPI bus should be used in applications where you need high speed or a high update rate. For the Variable Load project, I’ll need to read the ADC and update the DAC rapidly to monitor and control the load. I’m not sure how frequently I’ll need to sample each device, but having the capability to do it in the tens or hundreds of kilohertz range would be good.
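To put rough numbers on that, here is a small back-of-the-envelope sketch. The helper names, the 4 MHz SCK and the 16-bit word size are assumptions for illustration only (they are not taken from the Variable Load parts), and the math ignores the time spent toggling CS between transfers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: upper bound on transfers per second when SCK runs
// back to back. One bit moves per SCK cycle; CS toggling time is ignored.
std::uint32_t max_transfers_per_second(std::uint32_t sck_hz,
                                       std::uint32_t bits_per_transfer) {
    assert(bits_per_transfer > 0);
    return sck_hz / bits_per_transfer;
}

// Fraction of a UART frame that is payload: data bits divided by
// start bit + data bits + optional parity bit + stop bits.
double uart_payload_fraction(unsigned data_bits, bool parity, unsigned stop_bits) {
    const unsigned frame_bits = 1 + data_bits + (parity ? 1 : 0) + stop_bits;
    return static_cast<double>(data_bits) / frame_bits;
}
```

With an assumed 4 MHz SCK and 16-bit words, max_transfers_per_second(4000000, 16) gives 250000, which is above the hundreds-of-kilohertz target. For comparison, uart_payload_fraction(8, false, 1) gives 0.8, because two of every ten UART bits are framing.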
SPI really has only one disadvantage: the number of signals. If you have a SPI bus with, say, 3 devices on it, then you’ll need a whopping six signals: SCK, MOSI, MISO, CS1, CS2, and CS3. Compare that to I2C, which can have over a hundred devices on it, with just two wires! The large number of signals becomes a pain if you’re trying to send SPI over a harness. This pain is compounded if you have to add another device: you’ll have to modify the wire harness to accommodate another chip select signal.
Protocol
SPI is pretty straightforward: MOSI and MISO carry data, and SCK is used to shift data out, or shift data in. However, there is one key question: which edge (rising or falling) do you use to update the data output, or sample the data input? There are four generally accepted modes of transfer: mode 0, mode 1, mode 2, and mode 3. Another way to describe the mode is using CPOL and CPHA:

From ATMEGA32U4 datasheet
Mode 0 has CPOL = 0 and CPHA = 0. Mode 1 has CPOL = 0, CPHA = 1. Mode 2 has CPOL = 1, CPHA = 0. Mode 3 has CPOL = 1, CPHA = 1. What does that mean? Well, CPOL = 0 means the clock is idle low, and CPOL = 1 means that the clock is idle high. CPHA = 0 means the first edge (and third, and fifth…) should be used for sampling, while CPHA = 1 means the second edge (and fourth, and sixth…) should be used for sampling. The figures above illustrate that point. Let’s look at each mode:
- Mode 0: the clock is idle low, which means the first transition is a rising edge (from low to high). This means that the data should be sampled on rising edges. However, since the very first edge is used for sampling, the data must be valid before the first edge. In order to avoid changing data near the sampling point, the data is updated on the opposite edge as the sampling edge, which means the output is changed on the clock’s falling edges.
- Mode 1: the clock is idle low, which means the first transition is a rising edge, while the second transition is a falling edge. Since CPHA = 1, the second edge, the falling edge, is used for sampling. This means the data is updated on rising edges.
- Mode 2: the clock is idle high, which means the first transition is a falling edge. This means that data is sampled on the falling edge, and data is updated on the rising edge. Again, since the very first edge is when the data is sampled, the data must be valid before the first edge.
- Mode 3: the clock is idle high, which means the second transition is a rising edge. This means that data is sampled on the rising edge, and data is updated on the falling edge.
Mode 0 and mode 3 are quite similar: data is sampled on the rising edge, and updated on the falling edge. Likewise, mode 1 and mode 2 are similar: data is sampled on the falling edge, and updated on the rising edge. Within each pair, the differences are the idle level of the clock and whether data must be valid before the very first edge: it must be for the CPHA = 0 modes (mode 0 and mode 2), since they sample on the first edge, while the CPHA = 1 modes (mode 1 and mode 3) can use the first edge to put the data out.
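The four modes can be summarized in a few lines of code. This is a minimal sketch that runs on a PC; SpiMode, Edge and decode_mode are hypothetical names made up for illustration, not part of any library.

```cpp
#include <cassert>

enum class Edge { Rising, Falling };

struct SpiMode {
    bool cpol;         // true: clock idles high
    bool cpha;         // true: sample on the second edge of each clock cycle
    Edge sample_edge;  // edge on which the input is sampled
    Edge update_edge;  // edge on which the output is changed
};

// Hypothetical helper: expand a mode number (0..3) into CPOL, CPHA and edges.
SpiMode decode_mode(unsigned mode) {
    assert(mode < 4);
    const bool cpol = (mode & 0x2u) != 0;
    const bool cpha = (mode & 0x1u) != 0;
    // The first edge is rising when the clock idles low and falling when it
    // idles high. CPHA = 0 samples on the first edge, CPHA = 1 on the second,
    // so the sampling edge is rising exactly when CPOL and CPHA are equal.
    const Edge sample = (cpol == cpha) ? Edge::Rising : Edge::Falling;
    const Edge update = (sample == Edge::Rising) ? Edge::Falling : Edge::Rising;
    return SpiMode{cpol, cpha, sample, update};
}
```

For example, decode_mode(3) reports an idle-high clock with sampling on the rising edge, matching the description of mode 3 above.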
Peripheral Hardware
Let’s take a look at how the peripheral is set up to implement SPI:

From ATMEGA32U4 datasheet
The SPI peripheral in the ATMEGA32U4 is shown above. The most important parts are:
- SPI clock generator: SCK is generated using a divided down system clock
- Pin control logic: this module is responsible for shifting data out on MOSI and in from MISO, as well as outputting the clock for the data.
- Shift register & buffer: data to be sent is loaded into the shift register, and the received data is automatically loaded into the buffer to be read later.
- SPI control register: this is where CPOL and CPHA are set. This register also controls what the SCK frequency is, and whether the SPI is in master or slave mode.
Note that none of the chip selects are shown here. Chip select signals are actually just GPIO pins for the ATMEGA32U4; there’s no special hardware dedicated to controlling them in the SPI protocol.
Software
Let’s see how the software communicates with the hardware:

Here’s the constructor. First, the software sets the SPI into master or slave mode. Although the microcontroller will be the master for this application, it is possible for the microcontroller to act as a slave; this means something else (probably another microcontroller) would be the one controlling SCK, MOSI and chip select. Anyways, the bit order is set, which means either the most significant bit is transmitted first, or the least is; I’ve only ever seen most significant bit first, though. Then, the clock speed is set. The SCK clock is produced by dividing down the system clock.
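As a rough illustration of the clock setting (this is a host-side sketch, not the code from the constructor), the divider choice could look something like the following. ClockBits and pick_divider are hypothetical names; the table is the SPI2X/SPR1/SPR0 encoding used by the AVR SPI peripheral, and the 16 MHz system clock in the usage note is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// One row of the divider table: SCK = system clock / divider.
struct ClockBits {
    unsigned divider;
    bool spi2x;  // SPI2X bit in SPSR
    bool spr1;   // SPR1 bit in SPCR
    bool spr0;   // SPR0 bit in SPCR
};

// Hypothetical helper: pick the fastest SCK that does not exceed max_sck_hz.
// Falls back to the slowest divider (128) if even that is too fast.
ClockBits pick_divider(std::uint32_t f_cpu_hz, std::uint32_t max_sck_hz) {
    assert(max_sck_hz > 0);
    static const ClockBits table[] = {
        {2,   true,  false, false},
        {4,   false, false, false},
        {8,   true,  false, true},
        {16,  false, false, true},
        {32,  true,  true,  false},
        {64,  false, true,  false},
        {128, false, true,  true},
    };
    for (const ClockBits& entry : table) {
        if (f_cpu_hz / entry.divider <= max_sck_hz) {
            return entry;
        }
    }
    return table[6];
}
```

With an assumed 16 MHz system clock and a device limited to 1 MHz, pick_divider(16000000, 1000000) selects the divide-by-16 row, which means SPR0 set and SPR1 and SPI2X clear.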
Next up is setting up pins. Like the UART peripheral, the SPI peripheral doesn’t have the ability to set the data direction of some of its pins. In this case, MOSI and SCK aren’t automatically set to outputs when the SPI peripheral is in master mode. For this reason, bits 1 and 2 of DDRB (PB1 is SCK and PB2 is MOSI on the ATMEGA32U4) must be set to configure those signals as outputs when in master mode. Then, SS direction is set. SS is the chip select the microcontroller would use, if it were in slave mode. Since we’ll be using the microcontroller in master mode for this application, we can set that pin to either input or output. However, setting this pin to input leads to complications, which I’ll talk about in a bit.
Lastly, the SPI peripheral is enabled.
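The pin setup described above can be sketched as follows. This is a host-side illustration rather than the actual constructor: the ddrb parameter stands in for the real DDRB register so the code compiles on a PC, configure_master_pins is a hypothetical name, and the bit positions are the port B assignments from the ATMEGA32U4 pinout.

```cpp
#include <cstdint>

// Port B bit positions of the SPI pins on the ATMEGA32U4.
constexpr unsigned SS_BIT = 0;
constexpr unsigned SCK_BIT = 1;
constexpr unsigned MOSI_BIT = 2;

// Hypothetical helper: takes the current DDRB value and returns the value to
// write back for master mode. A set bit in DDRB makes that pin an output.
std::uint8_t configure_master_pins(std::uint8_t ddrb, bool ss_as_output) {
    unsigned value = ddrb;
    value |= (1u << SCK_BIT) | (1u << MOSI_BIT);  // SCK and MOSI are outputs
    if (ss_as_output) {
        value |= (1u << SS_BIT);                  // SS as a plain output
    } else {
        value &= ~(1u << SS_BIT);                 // SS as an input
    }
    return static_cast<std::uint8_t>(value);
}
```

On the real part, the returned value would be written to DDRB before the peripheral is enabled. All other bits of the register pass through unchanged.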

Above are the helper methods. Not a whole lot to say, as it’s just bit manipulation like usual. However, take a look at the comments in set_ss_dir: if SS is set to input while the SPI peripheral is in master mode, and the pin is then driven low, the SPI peripheral will automatically change to slave mode! Unfortunately, I didn’t realize this at the time when I was working on my schematic, and so I hooked this pin up to a button on the rotary encoder I plan to use. This means the microcontroller will change to a slave when I press the button, which is behavior I don’t want. I’ll have to fix that problem in hardware or in software, but for now, I’ll ignore it.
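To make the SS behavior concrete, here is a tiny simulation of it. SpiSim and its members are hypothetical names, and the model only covers the one rule in question: in master mode, with SS configured as an input, a low level on SS clears the master bit (MSTR). The ensure_master method shows one possible software workaround, sketched here as an assumption rather than a decided fix: check the bit before each transfer and set it again if it was cleared.

```cpp
// Hypothetical model of the master/slave switching rule only.
struct SpiSim {
    bool master = true;         // models the MSTR bit
    bool ss_is_output = false;  // models the SS direction bit in DDRB

    // Called when the level on the SS pin changes.
    void ss_pin_level(bool high) {
        if (master && !ss_is_output && !high) {
            master = false;  // the hardware drops to slave mode
        }
    }

    // Possible workaround: restore master mode before starting a transfer.
    // Returns true if master mode had been lost.
    bool ensure_master() {
        const bool was_lost = !master;
        master = true;
        return was_lost;
    }
};
```

With SS as an input, calling ss_pin_level(false) (the button press) flips master to false. With ss_is_output set, the same call leaves master untouched.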

As we saw, the hardware has one shift register for transmitting and receiving data. As each bit is shifted out for transmission, one bit is shifted in from the receiver. That’s why send_byte will also return a byte: the send and receive happen simultaneously. Here, data is written to the shift register (SPDR), and then the method waits until the transmission/reception is complete, as indicated by the interrupt flag. Then, the received data is returned. Note that the interrupt flag is cleared by reading the flag, and then reading the data. This is done in the last two lines of the method.
At the start of the method, the SPI transfer mode can be set. Initially, I wasn’t planning on having mode be an argument, but I realized that since different devices often have different transfer modes (for example, the ADC and DAC have different transfer modes), it makes sense to force the caller to provide a mode, to prevent accidentally calling this method with the wrong mode.
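The two ideas in send_byte (set the mode, then swap a byte through the shift register) can be tried out on a PC with a small simulation. This is a sketch under stated assumptions, not the real method: SimSpi and sim_send_byte are hypothetical names, the slave is modeled as a single 8-bit shift register, bits go out most significant bit first, and the starting SPCR value of 0x50 (SPE and MSTR set) is made up for the example. The CPOL and CPHA positions (bits 3 and 2 of SPCR) are from the AVR datasheet.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned CPOL_BIT = 3;  // SPCR bit 3
constexpr unsigned CPHA_BIT = 2;  // SPCR bit 2

struct SimSpi {
    std::uint8_t spcr = 0x50;       // simulated control register (assumed start value)
    std::uint8_t slave_reg = 0x00;  // byte the simulated slave will shift out
};

// Simulated transfer: returns the byte received while `data` is sent.
std::uint8_t sim_send_byte(SimSpi& spi, unsigned mode, std::uint8_t data) {
    assert(mode < 4);
    // Mode bit 1 is CPOL and bit 0 is CPHA, which line up with SPCR bits 3:2.
    const unsigned mode_mask = (1u << CPOL_BIT) | (1u << CPHA_BIT);
    spi.spcr = static_cast<std::uint8_t>((spi.spcr & ~mode_mask) | (mode << CPHA_BIT));

    std::uint8_t master_reg = data;
    for (int clock = 0; clock < 8; ++clock) {
        const unsigned mosi = (master_reg >> 7) & 1u;     // master shifts out its MSB
        const unsigned miso = (spi.slave_reg >> 7) & 1u;  // slave shifts out its MSB
        master_reg = static_cast<std::uint8_t>((master_reg << 1) | miso);
        spi.slave_reg = static_cast<std::uint8_t>((spi.slave_reg << 1) | mosi);
    }
    return master_reg;  // after 8 clocks the two registers have swapped contents
}
```

If slave_reg starts as 0x3C and the master sends 0xA5, the call returns 0x3C and the slave ends up holding 0xA5, which is why every send is also a receive.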

I also wrote code to send two, three and four bytes at a time. The code for sending two bytes is shown above; the three and four byte versions look about the same. Note, however, that the methods for transferring data don’t set or clear the CS signal. This is so that the application code can send an arbitrary number of bytes. For flash chips, for example, you can read or write to pages or sectors, which can be hundreds of bytes. For these types of transfers, you would clear the CS signal at the start and set it at the very end. If manipulating CS was done inside the send_byte function, then you would only be able to work with one byte at a time, rather than sectors or pages.
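Here is one way the application side of that could look, as a sketch that runs on a PC. FakeBus is a hypothetical stand-in for the real SPI class (it only counts bytes sent while CS is low), and ChipSelect and write_block are made-up names. The point is just the shape: CS goes low once, any number of bytes are sent, and CS goes high at the end.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the SPI driver and one CS pin.
struct FakeBus {
    bool cs_high = true;                   // CS idles high (not selected)
    std::size_t bytes_while_selected = 0;  // bytes sent while CS was low

    std::uint8_t send_byte(std::uint8_t) {
        if (!cs_high) {
            ++bytes_while_selected;
        }
        return 0xFF;  // dummy received byte
    }
};

// Drives CS low on construction and high again on destruction.
class ChipSelect {
public:
    explicit ChipSelect(FakeBus& bus) : bus_(bus) { bus_.cs_high = false; }
    ~ChipSelect() { bus_.cs_high = true; }
    ChipSelect(const ChipSelect&) = delete;
    ChipSelect& operator=(const ChipSelect&) = delete;

private:
    FakeBus& bus_;
};

// Sends a whole block inside a single CS-low window.
void write_block(FakeBus& bus, const std::uint8_t* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    ChipSelect select(bus);  // CS low for the entire block
    for (std::size_t i = 0; i < len; ++i) {
        bus.send_byte(data[i]);
    }
}  // CS returns high here, when `select` goes out of scope
```

Tying CS to the lifetime of an object is a design choice, not a requirement; setting and clearing the pin by hand around the loop works just as well. The scoped version just makes it harder to forget to release CS on an early return.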

That’s about it for SPI! Above are some miscellaneous methods for interrupts and checking flags, but that’s it. SPI is pretty straightforward: set CS low for the chip you want, output the data and clock, save the received data, then set CS high.