How to make a USB Mass Storage Device part 2

I've been putting this off for some time because writing these posts takes a long time. I know the audience for this stuff is incredibly small, if it exists at all, but I hope to keep updating this as I learn more about the PIC32MZ.

Today I'm going to cover reading from and writing to an SD card in a mass storage device.

Read - Command 0x28

In a previous post I used DMA to read blocks from an SD card attached to SPI2. What I found was that when combined with a lot of interrupts and USB DMA it kept falling over, so I removed it. The good news is the new method is far more stable and just as fast. Let's take a look at the code:

case 0x28: // Read (10)
{
    if (MSD_ready)
    {
        read_length = (int)(cbw.CBWCB[7] << 8) | cbw.CBWCB[8];
        read_address  = (int)(cbw.CBWCB[2] << 24) | (int)(cbw.CBWCB[3] << 16) | (cbw.CBWCB[4] << 8) | (cbw.CBWCB[5]);

        blocks_left = read_length;
        toggle = 0;

        // Start SD multiblock read
        result = SD_start_multi_block_read(read_address);
        // Read first block
        addr = USB_data_buffer;
        result |= SD_read_multi_block(addr, 512);
        blocks_left--;

        // Wait until we are ready to send on EP1
        USB_EP1_wait_TXRDY();

        while (blocks_left)
        {
            // Send this block via DMA
            USB_EP1_DMA_DONE = 0;
            addr = USB_data_buffer + (toggle * 512);
            USB_EP1_send_DMA(addr, 512);

            // While it's sending, read the next block in
            toggle ^= 1;
            addr = USB_data_buffer + (toggle * 512);
            result |= SD_read_multi_block(addr, 512);

            // Wait until DMA transfer is done before continuing
            while (!USB_EP1_DMA_DONE);
            USB_EP1_wait_TXRDY();
            blocks_left--;
        }

        // Stop the SD card multiblock read
        SD_stop_multi_block_read();

        // There will be one block that is left unsent, send it now
        USB_EP1_DMA_DONE = 0;
        USB_EP1_send_DMA(addr, 512);
        while (!USB_EP1_DMA_DONE);
        USB_EP1_wait_TXRDY();

        if (result != RES_OK) 
        {
            requestSenseAnswer[2] = 0x5;
            requestSenseAnswer[12] = 0x20;
            csw.dCSWDataResidue = 252;
            csw.bCSWStatus |= 1; // Error occurred   
        }
    }
    else
    {
        csw.bCSWStatus |= 1; // Error occurred   
        requestSenseAnswer[2] = 2;
    }

    break;

}

OK, it's not that long. Let's take a look at it bit by bit. Oh, and MSD_ready is a flag I use to signify if an SD card is attached.

read_length = (int)(cbw.CBWCB[7] << 8) | cbw.CBWCB[8];
read_address  = (int)(cbw.CBWCB[2] << 24) | (int)(cbw.CBWCB[3] << 16) | (cbw.CBWCB[4] << 8) | (cbw.CBWCB[5]);

The host will send us the amount of blocks (that is, 512-byte blocks on an SD card), to read as well as the Logical Block Address (LBA) to start reading at.

For the rest of it, instead of line by line I think it'd be better to explain what it's doing. The way I have implemented the reading system is to first initiate a multi-block read (SD_start_multi_block_read()) from the SD card. You do this by sending it a special command (CMD18) along with the starting address. If you are reading many blocks, this works out much faster than reading multiple single blocks. Once I have done this, I read 512 bytes from the SD card (SD_read_multi_block()) and send it to the host via USB DMA (USB_EP1_send_DMA()) and immediately start reading another block from the SD card. In this way the amount of time the SD card is idle and the amount of time the USB transfer is idle is greatly reduced.

I've also implemented a double-buffer system so that when I'm reading I can send at the same time. In this way, I can read large 64kB blocks from the SD card and send them to the host at a fairly high speed while only using 1kB of RAM.

On reasonably good SD cards, I've managed transfer speeds of 5.5MB/s, which is way faster than the 2MB/s I was getting from Harmony.

USB DMA

The USB peripheral on the PIC32MZ has its own DMA system, separate from the DMA system I covered before. In the previous section I talked about sending blocks via DMA, let's take a quick look at how this would look:

void USB_EP1_send_DMA(unsigned char *buffer, uint32_t dmaCount)
{
    USB_EP1_DMA_DONE = 0;                           // Flag to indicate transfer is done

    USBE1CSR0bits.MODE = 1;                         // Set the mode to transfer (for receiving data, it'd be set to 0)

    USBDMA1Abits.DMAADDR = virt_to_phys(buffer);    // Set the address of the buffer to read from
    USBDMA1Nbits.DMACOUNT = dmaCount;               // Set the number of bytes to receive
    USBDMA1Cbits.DMABRSTM = 3;                      // Set DMA burst mode 3
    USBDMA1Cbits.DMAMODE = 0;                       // Set DMA mode 0
    USBDMA1Cbits.DMAEP = 1; // OUT Endpoint 1       // Set USB DMA channel 1 to work with endpoint 1
    USBDMA1Cbits.DMADIR = 1;                        // Set the DMA direction to transfer (again, for receiving you'd set this to 0)
    USBDMA1Cbits.DMAIE = 1;                         // Enable the USB DMA interrupt
    USBDMA1Cbits.DMAEN = 1;                         // Enable USB DMA channel 1
}

As with all DMA, you must declare the buffer "coherent", like this:

unsigned char __attribute__ ((coherent, aligned(8))) USB_data_buffer[USB_DATA_BUFFER_SIZE];

If you just declare it as always, that is:

unsigned char USB_data_buffer[USB_DATA_BUFFER_SIZE];

then your DMA transfers will not work.

Write - Command 0x2A

Let's take a look at the code first:

case 0x2A: // Write (10)
{
    if (MSD_ready)
    {
        read_length = (int)(cbw.CBWCB[7] << 8) | cbw.CBWCB[8];
        read_address  = (int)(cbw.CBWCB[2] << 24) | (int)(cbw.CBWCB[3] << 16) | (cbw.CBWCB[4] << 8) | (cbw.CBWCB[5]);

        USBCSR3bits.ENDPOINT = 2;

        blocks_left = read_length;

        while (blocks_left > 0)
        {
            if (blocks_left > USB_DATA_BUFFER_SIZE / 512)
                blocks_to_read = USB_DATA_BUFFER_SIZE / 512;
            else
                blocks_to_read = blocks_left;

            for (cnt = 0; cnt < blocks_to_read; cnt++)
            {
                while (!USBE2CSR1bits.RXPKTRDY)
                addr = USB_data_buffer + (512 * cnt);
                USB_EP2_DMA_DONE = 0;
                USB_EP2_receive_DMA(addr, 512);
                while (!USB_EP2_DMA_DONE);
                USBE2CSR1bits.RXPKTRDY = 0;
                blocks_left--;
            }                    

            result = disk_write(0, USB_data_buffer, read_address, blocks_to_read);
            result = RES_OK;

            read_address += (blocks_to_read * 512);

            if (result != RES_OK) csw.bCSWStatus |= 1; // Error occurred   
        }
    }
    else
    {
        csw.bCSWStatus |= 1; // Error occurred   
        requestSenseAnswer[2] = 2;
    }

    break;

}

When reading worked so well, I felt sure that I could do writing in a similar way. That is, read 512 bytes from the USB host and write them while simultaneously reading another 512 bytes from the host via DMA. However, when I tried this I had terrible writing speeds. At 2MB/s it was still double Harmony, but I was sure I could do better.

What I found was that reading the entire (up to) 64kB of data into a buffer and then writing it in one shot ended up way faster. I can now get consistent 4MB/s writes over USB, which makes the writing actually useful for many projects.

Writing to the SD card

I had never implemented SD card writing before, I was using ancient code I found almost 10 years ago online. It turns out multiblock writing is not terribly hard. Here's the code:

static
int xmit_datablock (    /* 1:OK, 0:Failed */
    const BYTE *buff,   /* 512 byte data block to be transmitted */
    BYTE token,         /* Data token */
    UINT count
)
{
    static BYTE resp;
    UINT bc = 512;

    // Clear out SPIBUF
    while (SPI_CHAN_FUNC(STATbits).RXBUFELM > 0)
    {
        resp = SPI_CHAN_FUNC(BUF);
        bc = resp;
    }

    if (wait_ready() != 0xFF) return 0;

    xmit_spi(token);        /* Xmit a token */
    if (token != 0xFD) 
    {   /* Not StopTran token */

        // Send 16 bytes first
        for (bc = 0; bc < 16; bc++)
        {
            SPI_CHAN_FUNC(BUF) = *buff++;
        }

        bc = 512 - 16;

        while (bc > 0)
        {
            if (SPI_CHAN_FUNC(STATbits).RXBUFELM > 0)
            {
                // Read an 0xFF
                resp = SPI_CHAN_FUNC(BUF);
                // Send next byte
                SPI_CHAN_FUNC(BUF) = *buff++;
                bc--;
            }
        }

        while (SPI_CHAN_FUNC(STATbits).SPIBUSY);


        // Clear out the FIFO

        while (SPI_CHAN_FUNC(STATbits).RXBUFELM > 0)
        {
            resp = SPI_CHAN_FUNC(BUF);
        }

        // Get 2x CRC and 1x response
        SPI_CHAN_FUNC(BUF) = 0xFF;
        SPI_CHAN_FUNC(BUF) = 0xFF;
        SPI_CHAN_FUNC(BUF) = 0xFF;
        while (SPI_CHAN_FUNC(STATbits).SPIBUSY);
        while ((SPI_CHAN_FUNC(STATbits).RXBUFELM > 0))
        resp = SPI_CHAN_FUNC(BUF);

        if ((resp & 0x1F) != 0x05)  /* If not accepted, return with error */
            return 0;
    }


    return 1;
}
  • Send CMD25 (multiple block write) and the starting address
  • With each 512-byte block, wait until the SD card sends you 0xFF in return, indicating it is idle / ready
  • Then write the token 0xFC before writing the data, which you can write immediately
  • After you write the data, the SD card controller will send you 2 bytes of CRC and one byte of response data. If the lower 4 bits of the response are equal to 0x5, the write was successful, otherwise it failed
  • After the last block has been written, send the token 0xFD to the SD card to indicate the multiblock write is done

The biggest change I made was that I made use of the Enhanced Buffers (ENHBUF) mode of SPI, enabling me to queue up to 16 bytes at once. This vastly reduced the amount of time the SPI port (and thus the SD card) were idle, and sped up transfers.

OK, long enough for one day. Next time I'll have a look at USB HID (keyboard) and show how I used it to make a joystick.

Tags: USB, MSD, tl;dr

Using the PIC32MZ EF USB module in host mode

Writing about USB MSD is fairly dry, boring stuff, so while I'm investigating USB on the PIC32MZ series, I thought I might as well try and get host mode working. Boy, what a mission. There's even less documentation on how things work, even less examples for it and there are so many quirks and weird things about it. I've finally gotten to the stage where I can connect to a device and query it (on endpoint 0), so I thought I'd share my code for anyone that may need this.

Setting up host mode

OK, here's the code I (currently) use. You'll notice it's quite different from the device mode initialisation.

void USB_init()
{
    volatile uint8_t * usbBaseAddress;

    USBCRCONbits.USBIE = 1;

    *((unsigned char*)&USBE0CSR0 + 0x7F) = 0x3;
    delay_ms(10);
    *((unsigned char*)&USBE0CSR0 + 0x7F) = 0;

    USBCSR2bits.SESSRQIE = 1;
    USBCSR2bits.CONNIE = 1;
    USBCSR2bits.RESETIE = 1;
    USBCSR2bits.VBUSERRIE = 1;
    USBCSR2bits.DISCONIE = 1;
    USBCSR2bits.EP1RXIE = 1;
    USBCSR1bits.EP1TXIE = 1;

    IEC4bits.USBIE = 0;         // Enable the USB interrupt    
    IFS4bits.USBIF = 0;         // Clear the USB interrupt flag.
    IPC33bits.USBIP = 7;        // USB Interrupt Priority 7
    IPC33bits.USBIS = 1;        // USB Interrupt Sub-Priority 1
    IPC33bits.USBDMAIP = 5;
    IPC33bits.USBDMAIS = 1;
    IFS4bits.USBDMAIF = 0;
    IEC4bits.USBDMAIE = 0;

    USB_init_endpoints();

    USBCSR0bits.HSEN = 1;
    USBCRCONbits.USBIDOVEN = 1;
    USBCRCONbits.PHYIDEN = 1;

    USBCRCONbits.USBIDVAL = 0; 
    USBCRCONbits.USBIDVAL = 0; 

    IFS4bits.USBIF = 0;         // Clear the USB interrupt flag.
    IFS4bits.USBDMAIF = 0;

    IEC4bits.USBDMAIE = 1;
    IEC4bits.USBIE = 1;         // Enable the USB interrupt

    USBOTGbits.SESSION = 1;
}

Let's get straight to it. What the heck is *((unsigned char*)&USBE0CSR0 + 0x7F) = 0x3; and why am I doing it like that? I first saw this code in Harmony and I wondered the same thing. First off, what does it mean? For that, we need to take a look at the datasheet:

USBCSR0 address

The important piece of information is the address of USBCSR0, which is listed as 3000 (which is actual hexadecimal, so 0x3000). So to get the target address of that piece of code, we need to see what's at 0x3000 + 0x7F, or 0x307F:

USBEOFRST address

Side note: The datasheet has split USBEOFRST into USB and EOFRST, so you can't search for USBEOFRST in the address list. Way to go Microchip!

OK, so USBEOFRST, the register controlling "USB END-OF-FRAME/SOFT RESET CONTROL" starts at 0x307C. So the first byte of the 4-byte register is at 0x307C, the second byte at 0x307D, the third at 0x307E and the fourth at 0x307F, which is the address we are looking for. Setting this to 3 will set the NRST and NRSTX bits. But what do those bits do? Let's look further down in the datasheet:

USBEOFRST bits

OK, so it resets some clock or other. Point is, if you don't reset this, USB host will not work at all. In Harmony code I saw, Microchip describes it as a "workaround for an error in the PHY", though I cannot find this in any errata anywhere.

So we know what that line of code is doing, but why are we doing it like that? Surely we can go:

USBEOFRSTbits.NRST = 1;  
USBEOFRSTbits.NRSTX = 1;

and have the same result? I mean, surely, right? XC32 even has the bit definitions there and everything. And yet, it doesn't work. It sometimes seems to, but most often not. There are a few registers relating to USB that you have to access indirectly like this or nothing works at all! If anybody knows why, I'd sure appreciate a message. Anyway, Harmony does it like this and for once it makes sense why they did it in this way.

So we enable interrupts turn on the "soft reset" bits, wait 10 milliseconds, turn them off (which turns the USB clock back on) and then disable interrupts again. Why enable them to disable them straight away? I don't rightly know, this is what Harmony seemed to do and it took me a week of solid trying to get anything to work, so maybe I'm just superstitious at this point! Let's take a look at the next block:

USBCSR0bits.HSEN = 1;
USBCRCONbits.USBIDOVEN = 1;
USBCRCONbits.PHYIDEN = 1;

USBCRCONbits.USBIDVAL = 0; 
USBCRCONbits.USBIDVAL = 0; 

Enable High Speed mode by setting HSEN to 1. Enable the USB ID override enable bit by setting USBIDEOVEN to 1. Enable monitoring of the PHY ID by setting PHYIDEN to 1 and then set USBIDVAL to 0 (0 = host, 1 = device). The value of USBID is very important for the USB module, I've started using USB-C connectors on my boards, and they don't have a USBID pin.
So I control this via software now. Please note that you should also enable the pull-down for pin RF3 (the USB ID pin) like this:

CNPDFbits.CNPDF3 = 1; // Pull down to ensure host mode

to ensure the USB ID pin's value is 0. I don't know if you have to do this even with USB ID override enabled, but like I said it took a week of pulling my hair out before I finally got this to work and I didn't mess with it further yet.

The final piece of the puzzle is setting USBOTGbits.SESSION to 1, which starts a session with an attached device. In device mode, we had to set USBCSR0bits.SOFTCONN to 1, but that is not the case in host mode.

The program flow after intialising the USB port

This part again took a while to get my head around, mostly because I didn't know much about USB before I started this. The flow, from what I can see, seems to be:

  • Once a device is plugged in, a Device Connection Interrupt (enabled by setting CONNIE to 1 earlier) will be thrown and your ISR needs to catch this.
  • When a connection interrupt is thrown, you need to tell the device to reset itself. This is where the device will set up its endpoints so this is very important!
  • After that, you can communicate with the device on endpoint 0.

OK, easier said than done right? Right.

Catching the Device Connection interrupt

Let's take a look at the part of my USB ISR in question:

    unsigned int CSR0, CSR1, CSR2;

    CSR2 = USBCSR2;

    RESETIF = (CSR2 & (1 << 18)) ? 1 : 0;

    CONNIF = (CSR2 & (1 << 20)) ? 1 : 0;

Why do it like that? If you'll remember from device mode, once you read USBCSR2 (or USBCSR0 or USBCSR1) all the interrupt flags will be reset! You need to store the values beforehand if you want to check for multiple interrupts, which we do!

Telling the device to reset itself

Fairly straightforward, thankfully:

USBCSR0bits.RESET = 1;
delay_ms(100);
USBCSR0bits.RESET = 0;
delay_ms(100);

You don't need to wait 100ms, this code is still in the early stages so I'm playing around to see how long I have to wait. It works with a 100ms delay. Again, this will tell the attached USB device to reset its USB stack and initialise its own endpoints. Depending on the device, this may be the difference between it working or not.

Communicating with an attached device on endpoint 0

Here's where the real fun begins! This is pretty much the opposite of device mode in that instead of receiving queries and answering them, we will be sending the queries and reading the replies. The difference is, we now need to set slightly different bits to communicate. These endpoint 0 packets, called setup packets, are special and different from packets on the other endpoints. Let's take a look at my code for sending on endpoint 0:

void USB_EP0_send_setup(unsigned char *buffer, uint32_t length)
{
    int cnt;
    unsigned char *FIFO_buffer;

    FIFO_buffer = (unsigned char *)&USBFIFO0;

    for (cnt = 0; cnt < length; cnt++)
    {
        *FIFO_buffer = *buffer++; // Send the bytes
    }

    *((unsigned char*)&USBE0CSR0 + 0x2) = 0xA;
}

First off, the length of these setup packets seems to always be 8 bytes, and some devices can only handle 8 bytes at a time. So be warned! In device mode, we needed to set TXPKTRDY to 1 but here we need to set both TXPKTRDY and SETUPPKT to 1. This tells the PIC32MZ USB hardware to send a setup token instead of an OUT (i.e. from host to device) token. Some requests, for example assigning an address to the device, will not require any data from the device. However, if the device does need to reply, what do we do? Let's take a look at my code for requesting and reading a device descriptor:

void USB_HOST_read_device_descriptor(unsigned char *buffer)
{
    int received_length;
    int bytes_to_read;
    int buffer_index;

    bytes_to_read = 0x12; // 18 bytes for device descriptor
    buffer_index = 0; // Start at the beginning

    // Send descriptor request
    USB_EP0_send_setup(USB_DEVICE_DESCRIPTOR_REQUEST, 8);

    // Wait for the TX interrupt to fire, indicating it was sent
    USB_EP0_IF = 0;
    while (USB_EP0_IF == 0);

    // Once it is sent, request a packet from the device
    *((unsigned char*)&USBE0CSR0 + 0x3) = 0x60;

    while (bytes_to_read > 0)
    {
        USB_EP0_IF = 0;
        while (USB_EP0_IF == 0);
        received_length = USBE0CSR2bits.RXCNT;
        USB_EP0_receive(&buffer[buffer_index], USBE0CSR2bits.RXCNT);

        buffer_index += received_length;
        bytes_to_read -= received_length;
        if (bytes_to_read > 0)
            // Request another packet (set REQPKT)
            *((unsigned char*)&USBE0CSR0 + 0x3) = 0x20;
        else
            // The read is done, clear STATUS bit and REQPKT bit
            *((unsigned char*)&USBE0CSR0 + 0x3) = 0x0;
    }    
}

As the comments state, we send the request and then we wait until the TX interrupt fires, indicating that we have actually sent the packet. Then, vitally, we need to set some more bits to tell the USB hardware we want a packet from the device. We do this by setting the bits STATPKT and REQPKT to 1. Now the USB hardware will actually request an IN packet (i.e. from device to host transfer). Once it arrives, an interrupt will fire (EP0IF will be set), indicating we have received some data. We can read this data from EP0 at usual with the following code:

void USB_EP0_receive(unsigned char *buffer, uint32_t length)
{
    int cnt;
    unsigned char *FIFO_buffer;

    // Get 8-bit pointer to USB FIFO for endpoint 0
    FIFO_buffer = (unsigned char *)&USBFIFO0;

    for(cnt = 0; cnt < length; cnt++)
    {
        // Read in one byte at a time
        *buffer++ = *(FIFO_buffer + (cnt & 3));
    }

    USBE0CSR0bits.RXRDYC = 1;
}

This is exactly the same as the reading procedure for device mode. Now, depending on the device, it may only be able to send 8 bytes at a time. For example, my PlayStation 5 controller (no, I don't have a PS5, just a controller :)) can send 64 bytes at a time. My Logitech Unifying receiver can only send 8 bytes at a time. For this particular request (get device descriptor), I need to receive 18 bytes. This means that the PS5 DualSense controller will send the reply all at once, but the Unifying receiver will split it into 3 packets of 8 + 8 + 2 bytes in length. If you are still expecting more bytes, you need to set the REQPKT bit again. If you are done receiving, you must clear both the STATUS and REQPKT bits.

While this all seems perfectly straightforward in hindsight, believe me when I say finding this all out without any documentation was a real pain in the butt.

That's all for today. Next time I'll either continue the MSD posts or upload something on HID. Hope this helps!

Tags: USB, host

How to make a USB Mass Storage Device part 1

When writing my code the biggest problem I ran across was a lack of documentation on any of the things I needed to know. I found bits and pieces here and there on many, many sites but there was no one site I could go have a look at for a quick reference. So today, I'm going to explain the commands that you need to support, exactly what they mean and what the data structures you need to have look like.

In other words, extremely tl;dr. If you just want some code that worked for me (and hopefully will for you too!) check yesterday's post.

For today's post, I am assuming we want to use an SD card attached to a PIC32MZ as a Mass Storage Device.

SCSI Transparent Command Set

Small Computer Systems Interface (SCSI) Primary Commands-2 (SPC-2) is a standard used to connect hard drives and optical drives to a computer. So why care? Because the USB Mass Storage Device (hereafter MSD) code we will be writing makes use of these commands, or rather a reduced set of them (thankfully) called the SCSI Transparent Command Set. The commands are as follows:

SCSI Transparent Command Set

OK. Before going into what each of those mean, let's take a look at how these commands are sent to us. Each of these commands come to you in structured called a Command Block Wrapper (CBW) and after dealing with them you need to send back another structure called a Command Status Wrapper (CSW) to the host. So let's take a look at what the CBW and CSW look like:

(Images copied from Microchip's AN2554 app note)

Command Block Wrapper format

USB CBW structure

Command Status Wrapper format

USB CSW structure

So what'll happen is the host will issue a command, for example Read. You process the command and then send back a CSW.

A brief note on error handling

Before proceeding, it is important to understand how error handling works because it is also how we tell the host we don't support certain commands. So let's take an example. The host asks us to read some data and while we are performing the read operation an error occurs. What does this look like?

Error handling flow with SPC-2

So either when something goes wrong or we don't support a command, it is not enough to send back an error in the CSW. The host will ask what the error was by sending a Request Sense command which we then have to reply to. Only then is the error handling process finished!

This is extra important because you must use this method to tell the host that you don't support certain commands (like Write, if you want a read-only device)!

Back to those SCSI Commands

OK, with that out the way, let's look at the 11 commands one by one and what replies we are expected to send for each.

Test Unit Ready Command 0x0

This command, as the name suggests, means the host is asking us to check if our connected SD card is attached and ready for reading and writing. If the card is ready, simply send back a Command Status Wrapper (hereafter CSW) with a bmCBWStatus of zero, indicating no error. If, however, the card isn't ready (for example, isn't attached yet), you can set the bmCBWStatus to 1 and also get ready for the host to issue a Request Sense command asking what exactly the error is. More info on that in the Request Sense command explanation, which is next.

Request Sense Command 0x3

The host sense the Request Sense command when you have previously send back a CSW with a bmCBWStatus value of anything but 0. In the status field, 1 is used to represent "Command failed" and 2 is used to represent "Phase error". To oversimplify, for Command failed, the host will reset the current operation and for Phase Error, which usually means the device has an irrecoverable error, the host will issue a Reset command to the device.

Anyway! The Request Sense command is used for the host to get detail information about what error has occurred, and it requires that you send back a fixed-format structure that looks like this:

Request sense reply structure

(This image copied from https://docs.oracle.com/en/storage/tape-storage/storagetek-sl150-modular-tape-library/slorm/img/slk_117.png)

OK wow that's a lot of extra stuff! Fortunately, you don't need to mess with any of it, except for the Sense Key and the Additional Sense Code fields. For a detailed look at all the error codes possible, head over to This excellent site on oracle.com.

For my code, however, it was enough to do the following:

  • When there was a command I didn't support, set the Sense Key field to 0x5, which is "Illegal Command" (aka Unsupported Command)
  • When there was an error with reading or writing, I set the Sense Key field to 0x2, which is "Not ready" and the Additional Sense Code field to 0x01 which is "In Process of Becoming Ready"
  • When the SD card was not attached, I set the Sense Key field to 0x2 and the Additional Sense Code field to 0x1

Inquiry Command 0x12

This command is for the host to find out information about our device. We send back a fixed-format structure, again, that looks like this:

Inquiry reply structure

(Image copied from Microchip's AN2554 app note)

You need to set Peripheral Device Type to 0, indicating a device that has direct block access. That is, Windows (or whatever the host is) will tell you specifically which blocks to read and write from. Apart from that, there's not much we need to change here apart from the Vendor Identification, Product Identification and Product Revision Level fields.

Mode Sense(6) Command 0x1A

The Mode Sense 6-byte command (so named because the CBW is 6 bytes long) is again a way for the host to get more specific information about our device. We need to send back a structure with the following format:

Mode Sense(6) reply structure

(Image copied from Microchip's AN2554 app note)

You can just reply with the values 3, 0, 0, 0 for our purposes.

Please note that I've read that Mac OS may instead issue a Mode Sense(10) Command (opcode 0x5A), which requires a slightly different 8 byte long structure in reply. I don't have a Mac so I can't test it but I believe the values 0,8,0,0,0,0,0,0 will work as a reply.

Start/Stop Unit Command 0x1B

This command only needs to be supported if you want to be able to eject your device from, for example, My Computer and have it disappear. There is no special reply required, just a CSW with a bmCSWStatus of 0.
Interestingly (well, interesting to me), many USB drives and Microchip Harmony do not seem to support this command (or at least Harmony v2.04 did not), which means you can "eject" them from Windows but they'll just stubbornly stay there in My Computer.

Prevent Allow Medium Removal 0x1E

This command caused me endless headaches. I implemented it at first and it all worked fine until it came to writing to my disk. You see, in Windows if you accept this command, Windows assumes your device will not suddenly be removed and therefore it caches everything it copies to the device. Why does this matter? Well, if you're copying a 1GB file to the SD card, and it's writing at 2MB/s, it's going to naturally take a while. But the progress bar for the copy will immediately jump to 99% as Windows caches the entire copy and you will not be able to cancel the copy until it's done. Me, I like to see the progress bar. So what I had to do was, when this command came in, set bmCSWStatus to 1 (error) and set my Request Sense Sense Key to 0x5, Illegal Command.
Now that I don't support the command, Windows no longer assumes the drive won't suddenly disappear and doesn't enable write caching!

Read Format Capacity Command 0x23

This command requires us to send back a 12 byte array indicating the formatted capacity of our media. The format looks as follows:

Read Format Capacity reply structure

(Image taken from https://www.mikrocontroller.net/attachment/41812/0x23_READFMT_070123.pdf and edited a lot)

Let's take a look at this:

  • The first 3 bytes must be 0, then the capacity list length must be set to 8.
  • The next 4 bytes are an integer representing how many blocks we have on our SD card .
  • Next is 1 byte representing the descriptor type. We set this to 2, formatted media.
  • Then finally we have 3 bytes representing block length. For SD cards this is always 512 bytes!

However, we have an Endian problem here! Each of those groups of 4 bytes need to be put into reverse byte order! So for the first four bytes, you'd expect to send 0, 0, 0, 8 but in actuality you need to send 8, 0, 0, 0.

This command is also optional, and Harmony doesn't support it either. But it's so similar to the next command I supported it anyway.

Read Capacity Command 0x25

Very similar to the Read Format Capacity, except here we only send 8 bytes back:

  • The first 4 bytes are the number of blocks on our device (blocks = sectors for our SD card)
  • The last 4 bytes are the size of one sector (so 512 for our SD card example)

Like the Read Format Capacity command, the two groups of 4 bytes must be put into Little Endian order with the LSB first. So for "512" (which is 0x00000200 in hex) you actually need to send 0x00020000. A nice little trick to be aware of.

Read(10) Command 0x28

This command instructs us to read a certain amount of blocks (sectors)from the SD card and send them to the host. Again, a 10-byte Command Block will be sent from the host, hence the name. The flow of dealing with a read request looks like this:

  • The host sends the number of the block you need to start reading on and the number of blocks to read in the CBW (this means that for an SD card, if the host asks you to read 3 blocks, it is expecting 3 x 512 bytes in return)
  • Upon receipt of this instruction, immediately read from the SD card and send the data back to the host. You can read it all into a buffer and send it all back at once if you wish, the data will never be longer than 64kB per Read(10) request.
  • Only after you have sent back all the data can you send a CSW indicating the operation was successful

Write(10) Command 0x2A

Very much like the Read(10) command except the flow is slightly different:

  • The host sends the number of the block you need to start writing on and the number of blocks to write in the CBW
  • The host immediately starts sending you the data you need to write. You can buffer this data and write it all at once later on, the data will never be longer than 64kB per Write(10) request.
  • After sending all the data, send a CSW indicating success (or failure)

Verify Command 0x2F

This is a command for the host to ask the device to verify the data in a specified range of blocks. According to Microchip's AN282 app note, "this command is used when the host PC formats the filesystem". Honestly, by the time I got here I just wanted to get this code done and didn't implement it, instead just sending a "Passed" / "Good" status in the CSW.
However, for completeness, the host can either ask you to verify the data yourself and not send you any data to compare with, or it can send you the data it expects to be there so you can do a byte-by-byte comparison.

Overview

Phew that is extremely tl;dr, even for me and I can get pretty chatty. Next time, I'll start taking a look at how to implement this in code. Cheers for now!

Tags: USB, MSD, tl;dr

USB MSD code

Code for USB MSD

OK, the mass storage device code is finally done! What a wild ride it's been. Bits and pieces of information everywhere, some of it very hard to find! It's been a good few weeks so I need a break before writing about it but I'm uploading the code for anyone who may want to use it.

Please bear in mind that on my board, the SD card is attached to SPI2 on RB3 (for SDI) and RB5 (for SDO). The Chip Select (CS) pin is attached to RA0. If you wish to use this code, you will have to modify the appropriate values in mmcpic32.h and also the PPS settings in main.c.

If anyone has any questions, please drop a comment and I'll try and help out.

Here's the code.

Tags: code, USB, MSD