USB HID joystick

Right, as promised, here are the files for the USB HID joystick. The source code only makes sense for my schematic, so adjust as you see fit. As always, if you use this code, please do at least give credit where credit is due. It's been a long battle trying to understand both USB on the PIC32MZ and how USB works in general.

Schematic

PIC32MZ HID joystick by Aidan Mocke - Schematic

Here's a low-quality render to give some approximation of what the board looks like in real life:

USB HID Joystick Board render

OK, enough of that. Here's the source code

And here are the KiCad files if you need them.

Tags: USB, HID

USB HID joystick

Hello there world. I've finally found the time to complete my USB HID joystick and it seems to be working. In a few days I'll upload both the KiCad files and the source code, along with a very long-winded explanation (yay) of how USB HID and its associated reports work. Hope you're all well!

Tags: USB, HID

How to make a USB Mass Storage Device part 2

I've been putting this off for some time because writing these posts takes a long time. I know the audience for this stuff is incredibly small, if it exists at all, but I hope to keep updating this as I learn more about the PIC32MZ.

Today I'm going to cover reading from and writing to an SD card in a mass storage device.

Read - Command 0x28

In a previous post I used DMA to read blocks from an SD card attached to SPI2. What I found was that when combined with a lot of interrupts and USB DMA it kept falling over, so I removed it. The good news is the new method is far more stable and just as fast. Let's take a look at the code:

case 0x28: // Read (10)
{
    if (MSD_ready)
    {
        read_length = (int)(cbw.CBWCB[7] << 8) | cbw.CBWCB[8];
        read_address  = (int)(cbw.CBWCB[2] << 24) | (int)(cbw.CBWCB[3] << 16) | (cbw.CBWCB[4] << 8) | (cbw.CBWCB[5]);

        blocks_left = read_length;
        toggle = 0;

        // Start SD multiblock read
        result = SD_start_multi_block_read(read_address);
        // Read first block
        addr = USB_data_buffer;
        result |= SD_read_multi_block(addr, 512);
        blocks_left--;

        // Wait until we are ready to send on EP1
        USB_EP1_wait_TXRDY();

        while (blocks_left)
        {
            // Send this block via DMA
            USB_EP1_DMA_DONE = 0;
            addr = USB_data_buffer + (toggle * 512);
            USB_EP1_send_DMA(addr, 512);

            // While it's sending, read the next block in
            toggle ^= 1;
            addr = USB_data_buffer + (toggle * 512);
            result |= SD_read_multi_block(addr, 512);

            // Wait until DMA transfer is done before continuing
            while (!USB_EP1_DMA_DONE);
            USB_EP1_wait_TXRDY();
            blocks_left--;
        }

        // Stop the SD card multiblock read
        SD_stop_multi_block_read();

        // There will be one block that is left unsent, send it now
        USB_EP1_DMA_DONE = 0;
        USB_EP1_send_DMA(addr, 512);
        while (!USB_EP1_DMA_DONE);
        USB_EP1_wait_TXRDY();

        if (result != RES_OK) 
        {
            requestSenseAnswer[2] = 0x5;
            requestSenseAnswer[12] = 0x20;
            csw.dCSWDataResidue = 252;
            csw.bCSWStatus |= 1; // Error occurred   
        }
    }
    else
    {
        csw.bCSWStatus |= 1; // Error occurred   
        requestSenseAnswer[2] = 2;
    }

    break;

}

OK, it's not that long. Let's take a look at it bit by bit. Oh, and MSD_ready is a flag I use to signify if an SD card is attached.

read_length = (int)(cbw.CBWCB[7] << 8) | cbw.CBWCB[8];
read_address  = (int)(cbw.CBWCB[2] << 24) | (int)(cbw.CBWCB[3] << 16) | (cbw.CBWCB[4] << 8) | (cbw.CBWCB[5]);

The host sends us the number of blocks to read (that is, 512-byte blocks on an SD card) as well as the Logical Block Address (LBA) to start reading at. Both values arrive most significant byte first.
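Here's a minimal standalone sketch of that decoding, assuming the standard 10-byte READ(10)/WRITE(10) command layout. The helper names cdb10_lba and cdb10_block_count are made up for illustration and are not part of my source:

```c
#include <stdint.h>

/* Hypothetical helpers (not from the original source): decode the
   big-endian fields of a 10-byte READ(10)/WRITE(10) command block. */
static uint32_t cdb10_lba(const uint8_t *cdb)
{
    /* Bytes 2..5 hold the logical block address, most significant byte first */
    return ((uint32_t)cdb[2] << 24) | ((uint32_t)cdb[3] << 16) |
           ((uint32_t)cdb[4] << 8)  |  (uint32_t)cdb[5];
}

static uint16_t cdb10_block_count(const uint8_t *cdb)
{
    /* Bytes 7..8 hold the transfer length in blocks, most significant byte first */
    return (uint16_t)(((uint16_t)cdb[7] << 8) | cdb[8]);
}
```

Casting each byte to an unsigned type before shifting avoids shifting a promoted int into its sign bit, which could otherwise happen when byte 2 is 0x80 or higher.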

For the rest of it, instead of going line by line I think it'd be better to explain what it's doing. The way I have implemented the reading system is to first initiate a multi-block read (SD_start_multi_block_read()) from the SD card. You do this by sending it a special command (CMD18) along with the starting address. If you are reading many blocks, this works out much faster than reading multiple single blocks. Once I have done this, I read 512 bytes from the SD card (SD_read_multi_block()), send them to the host via USB DMA (USB_EP1_send_DMA()) and immediately start reading another block from the SD card. In this way the amount of time the SD card is idle and the amount of time the USB transfer is idle are greatly reduced.

I've also implemented a double-buffer system so that when I'm reading I can send at the same time. In this way, I can read large 64kB blocks from the SD card and send them to the host at a fairly high speed while only using 1kB of RAM.
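To make the ping-pong idea easier to follow away from the hardware, here is a rough PC-runnable sketch. Everything in it (fake_card, fake_host, read_block, send_block, stream_blocks) is a stand-in I made up for illustration. On the real board the send is a DMA transfer that overlaps with the next SD read, whereas here both are plain memcpy calls:

```c
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 512

/* Stand-ins for the SD card and the USB host so the sketch runs on a PC.
   All names are hypothetical. */
static uint8_t  fake_card[8][BLOCK_SIZE];   /* pretend SD card contents */
static uint8_t  fake_host[8][BLOCK_SIZE];   /* what the host received   */
static unsigned next_read, next_sent;

static void read_block(uint8_t *dst)       { memcpy(dst, fake_card[next_read++], BLOCK_SIZE); }
static void send_block(const uint8_t *src) { memcpy(fake_host[next_sent++], src, BLOCK_SIZE); }

/* Stream 'blocks' blocks using only two 512-byte buffers (1 kB in total). */
static void stream_blocks(unsigned blocks)
{
    static uint8_t buf[2][BLOCK_SIZE];
    unsigned toggle = 0;

    if (blocks == 0) return;

    read_block(buf[toggle]);        /* Prime the first buffer */
    while (--blocks)
    {
        send_block(buf[toggle]);    /* Hardware: start the DMA send here...   */
        toggle ^= 1;
        read_block(buf[toggle]);    /* ...and fill the other buffer meanwhile */
    }
    send_block(buf[toggle]);        /* The last block read is still unsent    */
}
```

The buffer being sent is never the one being filled, which is the whole point of the toggle.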

On reasonably good SD cards, I've managed transfer speeds of 5.5MB/s, which is way faster than the 2MB/s I was getting from Harmony.

USB DMA

The USB peripheral on the PIC32MZ has its own DMA system, separate from the DMA system I covered before. In the previous section I talked about sending blocks via DMA, so let's take a quick look at how that works:

void USB_EP1_send_DMA(unsigned char *buffer, uint32_t dmaCount)
{
    USB_EP1_DMA_DONE = 0;                           // Clear the flag that indicates the transfer is done

    USBE1CSR0bits.MODE = 1;                         // Set the endpoint to transmit (for receiving data, it'd be set to 0)

    USBDMA1Abits.DMAADDR = virt_to_phys(buffer);    // Set the address of the buffer to read from
    USBDMA1Nbits.DMACOUNT = dmaCount;               // Set the number of bytes to send
    USBDMA1Cbits.DMABRSTM = 3;                      // Set DMA burst mode 3
    USBDMA1Cbits.DMAMODE = 0;                       // Set DMA mode 0
    USBDMA1Cbits.DMAEP = 1;                         // Set USB DMA channel 1 to work with endpoint 1
    USBDMA1Cbits.DMADIR = 1;                        // Set the DMA direction to transmit (again, for receiving you'd set this to 0)
    USBDMA1Cbits.DMAIE = 1;                         // Enable the USB DMA interrupt
    USBDMA1Cbits.DMAEN = 1;                         // Enable USB DMA channel 1
}

As with all DMA, you must declare the buffer "coherent", like this:

unsigned char __attribute__ ((coherent, aligned(8))) USB_data_buffer[USB_DATA_BUFFER_SIZE];

If you just declare it as always, that is:

unsigned char USB_data_buffer[USB_DATA_BUFFER_SIZE];

then your DMA transfers will not work.
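The virt_to_phys() call in USB_EP1_send_DMA() isn't shown in this post. As a rough sketch of what such a conversion usually looks like on a 32-bit MIPS core, assuming the buffer lives in KSEG0 or KSEG1: both segments map onto physical memory by dropping the top three address bits. The name kseg_to_phys is hypothetical, and it takes a plain integer rather than a pointer so it can be tried on a PC:

```c
#include <stdint.h>

/* Sketch of a KSEG0/KSEG1 virtual-to-physical conversion (assumption:
   the address is in one of those two segments). Both segments alias
   the same physical memory, so masking off the top three bits works
   for either. */
static uint32_t kseg_to_phys(uint32_t virt_addr)
{
    return virt_addr & 0x1FFFFFFFu;
}
```

For example, the cached address 0x80001000 and the uncached address 0xA0001000 both come out as physical address 0x00001000, which is the form the DMA address register needs.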

Write - Command 0x2A

Let's take a look at the code first:

case 0x2A: // Write (10)
{
    if (MSD_ready)
    {
        read_length = (int)(cbw.CBWCB[7] << 8) | cbw.CBWCB[8];
        read_address  = (int)(cbw.CBWCB[2] << 24) | (int)(cbw.CBWCB[3] << 16) | (cbw.CBWCB[4] << 8) | (cbw.CBWCB[5]);

        USBCSR3bits.ENDPOINT = 2;

        blocks_left = read_length;

        while (blocks_left > 0)
        {
            if (blocks_left > USB_DATA_BUFFER_SIZE / 512)
                blocks_to_read = USB_DATA_BUFFER_SIZE / 512;
            else
                blocks_to_read = blocks_left;

            for (cnt = 0; cnt < blocks_to_read; cnt++)
            {
                while (!USBE2CSR1bits.RXPKTRDY);    // Wait until a packet has arrived from the host
                addr = USB_data_buffer + (512 * cnt);
                USB_EP2_DMA_DONE = 0;
                USB_EP2_receive_DMA(addr, 512);
                while (!USB_EP2_DMA_DONE);
                USBE2CSR1bits.RXPKTRDY = 0;
                blocks_left--;
            }                    

            // Keep the real return value so the error check below can actually fire
            result = disk_write(0, USB_data_buffer, read_address, blocks_to_read);

            read_address += (blocks_to_read * 512);

            if (result != RES_OK) csw.bCSWStatus |= 1; // Error occurred   
        }
    }
    else
    {
        csw.bCSWStatus |= 1; // Error occurred   
        requestSenseAnswer[2] = 2;
    }

    break;

}

When reading worked so well, I felt sure that I could do writing in a similar way. That is, read 512 bytes from the USB host and write them while simultaneously reading another 512 bytes from the host via DMA. However, when I tried this I had terrible writing speeds. At 2MB/s it was still double Harmony, but I was sure I could do better.

What I found was that reading the entire (up to) 64kB of data into a buffer and then writing it in one shot ended up way faster. I can now get consistent 4MB/s writes over USB, which makes the writing actually useful for many projects.
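Here's a small sketch of just the chunking logic, stripped of the USB and SD parts so it runs on a PC. The names storage_write, write_in_chunks and BUFFER_BLOCKS are made up for this example, the 64kB buffer size is an assumption, and I'm assuming a storage layer that is addressed in blocks rather than bytes:

```c
#include <stdint.h>

#define BUFFER_BLOCKS 128u      /* Assumed 64 kB buffer: 128 x 512 bytes */

/* Hypothetical storage hook: records each call instead of touching a card. */
static uint32_t write_calls, last_lba, last_count;

static int storage_write(uint32_t lba, uint32_t count)
{
    write_calls++;
    last_lba = lba;
    last_count = count;
    return 0;                   /* 0 = success in this sketch */
}

/* Split a host write of 'blocks' blocks starting at 'lba' into
   buffer-sized chunks, with one storage_write() call per chunk. */
static int write_in_chunks(uint32_t lba, uint32_t blocks)
{
    while (blocks > 0)
    {
        uint32_t chunk = (blocks > BUFFER_BLOCKS) ? BUFFER_BLOCKS : blocks;

        /* On hardware: receive 'chunk' blocks from the host into the buffer here */

        if (storage_write(lba, chunk) != 0)
            return -1;          /* Stop at the first failed chunk */

        lba    += chunk;        /* Assumption: the storage layer is block-addressed */
        blocks -= chunk;
    }
    return 0;
}
```

A 300-block write would be split into chunks of 128, 128 and 44 blocks. If your storage layer takes byte addresses instead, the address step would need to be multiplied by 512.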

Writing to the SD card

I had never implemented SD card writing before; I was using ancient code I found online almost 10 years ago. It turns out multiblock writing is not terribly hard. Here's the code:

static
int xmit_datablock (    /* 1:OK, 0:Failed */
    const BYTE *buff,   /* 512 byte data block to be transmitted */
    BYTE token,         /* Data token */
    UINT count
)
{
    static BYTE resp;
    UINT bc = 512;

    // Clear out SPIBUF
    while (SPI_CHAN_FUNC(STATbits).RXBUFELM > 0)
    {
        resp = SPI_CHAN_FUNC(BUF);
        bc = resp;
    }

    if (wait_ready() != 0xFF) return 0;

    xmit_spi(token);        /* Xmit a token */
    if (token != 0xFD) 
    {   /* Not StopTran token */

        // Send 16 bytes first
        for (bc = 0; bc < 16; bc++)
        {
            SPI_CHAN_FUNC(BUF) = *buff++;
        }

        bc = 512 - 16;

        while (bc > 0)
        {
            if (SPI_CHAN_FUNC(STATbits).RXBUFELM > 0)
            {
                // Read an 0xFF
                resp = SPI_CHAN_FUNC(BUF);
                // Send next byte
                SPI_CHAN_FUNC(BUF) = *buff++;
                bc--;
            }
        }

        while (SPI_CHAN_FUNC(STATbits).SPIBUSY);


        // Clear out the FIFO

        while (SPI_CHAN_FUNC(STATbits).RXBUFELM > 0)
        {
            resp = SPI_CHAN_FUNC(BUF);
        }

        // Send 2 dummy CRC bytes, then clock in 1 response byte
        SPI_CHAN_FUNC(BUF) = 0xFF;
        SPI_CHAN_FUNC(BUF) = 0xFF;
        SPI_CHAN_FUNC(BUF) = 0xFF;
        while (SPI_CHAN_FUNC(STATbits).SPIBUSY);
        // Drain the FIFO; the last byte read is the data response
        while ((SPI_CHAN_FUNC(STATbits).RXBUFELM > 0))
            resp = SPI_CHAN_FUNC(BUF);

        if ((resp & 0x1F) != 0x05)  /* If not accepted, return with error */
            return 0;
    }


    return 1;
}

The procedure for a multiblock write is:

  • Send CMD25 (multiple block write) and the starting address
  • With each 512-byte block, wait until the SD card sends you 0xFF in return, indicating it is idle / ready
  • Then write the token 0xFC before writing the data, which you can write immediately
  • After you write the data, send 2 bytes of CRC (dummy values work as long as CRC checking is left off, which is the default in SPI mode) and read one byte of response data from the card. If the lower 5 bits of the response are equal to 0x05, the write was accepted, otherwise it failed
  • After the last block has been written, send the token 0xFD to the SD card to indicate the multiblock write is done
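
The response check in xmit_datablock() only distinguishes "accepted" from "anything else". As a sketch, the data response byte can be decoded a little further using the status values from the SD specification's SPI mode. The enum and function names here are made up for illustration:

```c
#include <stdint.h>

/* Hypothetical names; the status values come from the SD SPI-mode
   data response token, whose low 5 bits are 0, 3 status bits, 1. */
typedef enum {
    SD_WRITE_ACCEPTED,      /* Low 5 bits == 0x05 */
    SD_WRITE_CRC_ERROR,     /* Low 5 bits == 0x0B */
    SD_WRITE_ERROR,         /* Low 5 bits == 0x0D */
    SD_WRITE_UNKNOWN        /* Anything else, e.g. no response yet */
} sd_write_status;

static sd_write_status sd_decode_data_response(uint8_t resp)
{
    /* The upper three bits are undefined, so mask them off first */
    switch (resp & 0x1F)
    {
        case 0x05: return SD_WRITE_ACCEPTED;
        case 0x0B: return SD_WRITE_CRC_ERROR;
        case 0x0D: return SD_WRITE_ERROR;
        default:   return SD_WRITE_UNKNOWN;
    }
}
```

Knowing whether a block was rejected for a CRC error or a write error can save some guessing when a card misbehaves.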

The biggest change I made was to use the Enhanced Buffers (ENHBUF) mode of SPI, enabling me to queue up to 16 bytes at once. This vastly reduced the amount of time the SPI port (and thus the SD card) was idle, and sped up transfers.

OK, long enough for one day. Next time I'll have a look at USB HID (keyboard) and show how I used it to make a joystick.

Tags: USB, MSD, tl;dr

Using the PIC32MZ EF USB module in host mode

Writing about USB MSD is fairly dry, boring stuff, so while I'm investigating USB on the PIC32MZ series, I thought I might as well try and get host mode working. Boy, what a mission. There's even less documentation on how things work, even fewer examples for it, and there are so many quirks and weird things about it. I've finally gotten to the stage where I can connect to a device and query it (on endpoint 0), so I thought I'd share my code for anyone who may need it.

Setting up host mode

OK, here's the code I (currently) use. You'll notice it's quite different from the device mode initialisation.

void USB_init()
{
    volatile uint8_t * usbBaseAddress;

    USBCRCONbits.USBIE = 1;

    *((unsigned char*)&USBE0CSR0 + 0x7F) = 0x3;
    delay_ms(10);
    *((unsigned char*)&USBE0CSR0 + 0x7F) = 0;

    USBCSR2bits.SESSRQIE = 1;
    USBCSR2bits.CONNIE = 1;
    USBCSR2bits.RESETIE = 1;
    USBCSR2bits.VBUSERRIE = 1;
    USBCSR2bits.DISCONIE = 1;
    USBCSR2bits.EP1RXIE = 1;
    USBCSR1bits.EP1TXIE = 1;

    IEC4bits.USBIE = 0;         // Disable the USB interrupt during setup
    IFS4bits.USBIF = 0;         // Clear the USB interrupt flag.
    IPC33bits.USBIP = 7;        // USB Interrupt Priority 7
    IPC33bits.USBIS = 1;        // USB Interrupt Sub-Priority 1
    IPC33bits.USBDMAIP = 5;
    IPC33bits.USBDMAIS = 1;
    IFS4bits.USBDMAIF = 0;
    IEC4bits.USBDMAIE = 0;

    USB_init_endpoints();

    USBCSR0bits.HSEN = 1;
    USBCRCONbits.USBIDOVEN = 1;
    USBCRCONbits.PHYIDEN = 1;

    USBCRCONbits.USBIDVAL = 0; 
    USBCRCONbits.USBIDVAL = 0; 

    IFS4bits.USBIF = 0;         // Clear the USB interrupt flag.
    IFS4bits.USBDMAIF = 0;

    IEC4bits.USBDMAIE = 1;
    IEC4bits.USBIE = 1;         // Enable the USB interrupt

    USBOTGbits.SESSION = 1;
}

Let's get straight to it. What the heck is *((unsigned char*)&USBE0CSR0 + 0x7F) = 0x3; and why am I doing it like that? I first saw this code in Harmony and I wondered the same thing. First off, what does it mean? For that, we need to take a look at the datasheet:

USBCSR0 address

The important piece of information is the address of USBCSR0, which is listed as 3000 (which is actually hexadecimal, so 0x3000). So to get the target address of that piece of code, we need to see what's at 0x3000 + 0x7F, or 0x307F:

USBEOFRST address

Side note: The datasheet has split USBEOFRST into USB and EOFRST, so you can't search for USBEOFRST in the address list. Way to go Microchip!

OK, so USBEOFRST, the register controlling "USB END-OF-FRAME/SOFT RESET CONTROL" starts at 0x307C. So the first byte of the 4-byte register is at 0x307C, the second byte at 0x307D, the third at 0x307E and the fourth at 0x307F, which is the address we are looking for. Setting this to 3 will set the NRST and NRSTX bits. But what do those bits do? Let's look further down in the datasheet:

USBEOFRST bits

OK, so it resets some clock or other. Point is, if you don't reset this, USB host will not work at all. In Harmony code I saw, Microchip describes it as a "workaround for an error in the PHY", though I cannot find this in any errata anywhere.
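The byte-address arithmetic from above can be sanity-checked with a few lines of C. This is only a sketch of the arithmetic, with made-up helper names, and it assumes a little-endian core (which the PIC32MZ is), so byte 3 of a 32-bit register holds bits 24 to 31:

```c
#include <stdint.h>

/* Hypothetical helpers: given a byte offset into a bank of 32-bit
   registers, find the register it falls in and where in that register. */
static uint32_t reg_offset(uint32_t byte_offset)
{
    return byte_offset & ~3u;           /* Round down to a 4-byte boundary */
}

static uint32_t byte_lane(uint32_t byte_offset)
{
    return byte_offset & 3u;            /* 0 = lowest byte, 3 = highest byte */
}

static uint32_t lowest_bit(uint32_t byte_offset)
{
    return byte_lane(byte_offset) * 8u; /* First register bit held by that byte (little-endian) */
}
```

For offset 0x307F this gives register offset 0x307C, byte lane 3 and lowest bit 24, so writing 0x3 to that byte sets bits 24 and 25 of the 32-bit register.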

So we know what that line of code is doing, but why are we doing it like that? Surely we can go:

USBEOFRSTbits.NRST = 1;  
USBEOFRSTbits.NRSTX = 1;

and have the same result? I mean, surely, right? XC32 even has the bit definitions there and everything. And yet, it doesn't work. It sometimes seems to, but most often not. There are a few registers relating to USB that you have to access indirectly like this or nothing works at all! If anybody knows why, I'd sure appreciate a message. Anyway, Harmony does it like this and for once it makes sense why they did it in this way.

So we enable interrupts, turn on the "soft reset" bits, wait 10 milliseconds, turn them off (which turns the USB clock back on) and then disable interrupts again. Why enable them only to disable them straight away? I don't rightly know. This is what Harmony seemed to do and it took me a week of solid trying to get anything to work, so maybe I'm just superstitious at this point! Let's take a look at the next block:

USBCSR0bits.HSEN = 1;
USBCRCONbits.USBIDOVEN = 1;
USBCRCONbits.PHYIDEN = 1;

USBCRCONbits.USBIDVAL = 0; 
USBCRCONbits.USBIDVAL = 0; 

Enable High Speed mode by setting HSEN to 1. Enable the USB ID override by setting USBIDOVEN to 1. Enable monitoring of the PHY ID by setting PHYIDEN to 1 and then set USBIDVAL to 0 (0 = host, 1 = device). The value of USBID is very important for the USB module. I've started using USB-C connectors on my boards, and they don't have a USBID pin, so I control this via software now. Please note that you should also enable the pull-down for pin RF3 (the USB ID pin) like this:

CNPDFbits.CNPDF3 = 1; // Pull down to ensure host mode

to ensure the USB ID pin's value is 0. I don't know if you have to do this even with USB ID override enabled, but like I said it took a week of pulling my hair out before I finally got this to work and I haven't messed with it further yet.

The final piece of the puzzle is setting USBOTGbits.SESSION to 1, which starts a session with an attached device. In device mode, we had to set USBCSR0bits.SOFTCONN to 1, but that is not the case in host mode.

The program flow after initialising the USB port

This part again took a while to get my head around, mostly because I didn't know much about USB before I started this. The flow, from what I can see, seems to be:

  • Once a device is plugged in, a Device Connection Interrupt (enabled by setting CONNIE to 1 earlier) will be thrown and your ISR needs to catch this.
  • When a connection interrupt is thrown, you need to tell the device to reset itself. This is where the device will set up its endpoints so this is very important!
  • After that, you can communicate with the device on endpoint 0.

OK, easier said than done, right? Right.

Catching the Device Connection interrupt

Let's take a look at the part of my USB ISR in question:

    unsigned int CSR0, CSR1, CSR2;

    CSR2 = USBCSR2;

    RESETIF = (CSR2 & (1 << 18)) ? 1 : 0;

    CONNIF = (CSR2 & (1 << 20)) ? 1 : 0;

Why do it like that? If you'll remember from device mode, once you read USBCSR2 (or USBCSR0 or USBCSR1) all the interrupt flags will be reset! You need to store the values beforehand if you want to check for multiple interrupts, which we do!
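The same snapshot idea as a standalone sketch: read the register once into a local variable, then pull out every flag you care about from that copy. The bit positions (18 for reset, 20 for connect) are the ones used in my ISR above, while the struct and function names are made up for this example:

```c
#include <stdint.h>

#define RESETIF_BIT 18      /* Bit positions as used in the ISR above */
#define CONNIF_BIT  20

/* Hypothetical container for the decoded flags */
typedef struct {
    unsigned reset   : 1;
    unsigned connect : 1;
} usb_events;

/* Decode flags from a value that was read from the register ONCE.
   Reading the real register a second time would return cleared flags. */
static usb_events decode_events(uint32_t csr2_snapshot)
{
    usb_events ev;
    ev.reset   = (csr2_snapshot >> RESETIF_BIT) & 1u;
    ev.connect = (csr2_snapshot >> CONNIF_BIT) & 1u;
    return ev;
}
```

In the ISR you would call it as decode_events(USBCSR2) exactly once per interrupt and then act on the returned struct.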

Telling the device to reset itself

Fairly straightforward, thankfully:

USBCSR0bits.RESET = 1;
delay_ms(100);
USBCSR0bits.RESET = 0;
delay_ms(100);

You don't need to wait 100ms. This code is still in the early stages, so I'm playing around to see how long I have to wait, and it works with a 100ms delay. Again, this will tell the attached USB device to reset its USB stack and initialise its own endpoints. Depending on the device, this may be the difference between it working or not.

Communicating with an attached device on endpoint 0

Here's where the real fun begins! This is pretty much the opposite of device mode in that instead of receiving queries and answering them, we will be sending the queries and reading the replies. The difference is, we now need to set slightly different bits to communicate. These endpoint 0 packets, called setup packets, are special and different from packets on the other endpoints. Let's take a look at my code for sending on endpoint 0:

void USB_EP0_send_setup(unsigned char *buffer, uint32_t length)
{
    int cnt;
    unsigned char *FIFO_buffer;

    FIFO_buffer = (unsigned char *)&USBFIFO0;

    for (cnt = 0; cnt < length; cnt++)
    {
        *FIFO_buffer = *buffer++; // Send the bytes
    }

    *((unsigned char*)&USBE0CSR0 + 0x2) = 0xA;
}

First off, setup packets are always 8 bytes long, and some devices can only handle 8 bytes per packet on endpoint 0. So be warned! In device mode, we needed to set TXPKTRDY to 1 but here we need to set both TXPKTRDY and SETUPPKT to 1. This tells the PIC32MZ USB hardware to send a setup token instead of an OUT (i.e. from host to device) token. Some requests, for example assigning an address to the device, will not require any data from the device. However, if the device does need to reply, what do we do? Let's take a look at my code for requesting and reading a device descriptor:

void USB_HOST_read_device_descriptor(unsigned char *buffer)
{
    int received_length;
    int bytes_to_read;
    int buffer_index;

    bytes_to_read = 0x12; // 18 bytes for device descriptor
    buffer_index = 0; // Start at the beginning

    // Send descriptor request
    USB_EP0_send_setup(USB_DEVICE_DESCRIPTOR_REQUEST, 8);

    // Wait for the TX interrupt to fire, indicating it was sent
    USB_EP0_IF = 0;
    while (USB_EP0_IF == 0);

    // Once it is sent, request a packet from the device
    *((unsigned char*)&USBE0CSR0 + 0x3) = 0x60;

    while (bytes_to_read > 0)
    {
        USB_EP0_IF = 0;
        while (USB_EP0_IF == 0);
        received_length = USBE0CSR2bits.RXCNT;
        USB_EP0_receive(&buffer[buffer_index], USBE0CSR2bits.RXCNT);

        buffer_index += received_length;
        bytes_to_read -= received_length;
        if (bytes_to_read > 0)
            // Request another packet (set REQPKT)
            *((unsigned char*)&USBE0CSR0 + 0x3) = 0x20;
        else
            // The read is done, clear STATPKT bit and REQPKT bit
            *((unsigned char*)&USBE0CSR0 + 0x3) = 0x0;
    }    
}
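
The contents of USB_DEVICE_DESCRIPTOR_REQUEST aren't shown in this post. As a sketch, a standard "get device descriptor" setup packet as defined by the USB specification can be built like this. The struct and function names are made up for illustration, and I'm assuming the array holds these same 8 bytes:

```c
#include <stdint.h>

/* The five fields of a USB setup packet (8 bytes on the wire) */
typedef struct {
    uint8_t  bmRequestType;
    uint8_t  bRequest;
    uint16_t wValue;
    uint16_t wIndex;
    uint16_t wLength;
} usb_setup_packet;

/* Standard GET_DESCRIPTOR request for the device descriptor */
static usb_setup_packet usb_get_device_descriptor_request(void)
{
    usb_setup_packet s;
    s.bmRequestType = 0x80;     /* Device-to-host, standard request, recipient = device */
    s.bRequest      = 0x06;     /* GET_DESCRIPTOR */
    s.wValue        = 0x0100;   /* Descriptor type 1 (device), index 0 */
    s.wIndex        = 0x0000;
    s.wLength       = 18;       /* A device descriptor is 18 bytes long */
    return s;
}

/* Serialise to the 8 bytes sent on the bus; 16-bit fields go low byte first */
static void usb_setup_to_bytes(const usb_setup_packet *s, uint8_t out[8])
{
    out[0] = s->bmRequestType;
    out[1] = s->bRequest;
    out[2] = (uint8_t)(s->wValue & 0xFF);
    out[3] = (uint8_t)(s->wValue >> 8);
    out[4] = (uint8_t)(s->wIndex & 0xFF);
    out[5] = (uint8_t)(s->wIndex >> 8);
    out[6] = (uint8_t)(s->wLength & 0xFF);
    out[7] = (uint8_t)(s->wLength >> 8);
}
```

The resulting byte array is what would be passed to USB_EP0_send_setup() with a length of 8.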

As the comments state, we send the request and then we wait until the TX interrupt fires, indicating that we have actually sent the packet. Then, vitally, we need to set some more bits to tell the USB hardware we want a packet from the device. We do this by setting the bits STATPKT and REQPKT to 1. Now the USB hardware will actually request an IN packet (i.e. from device to host transfer). Once it arrives, an interrupt will fire (EP0IF will be set), indicating we have received some data. We can read this data from EP0 as usual with the following code:

void USB_EP0_receive(unsigned char *buffer, uint32_t length)
{
    int cnt;
    unsigned char *FIFO_buffer;

    // Get 8-bit pointer to USB FIFO for endpoint 0
    FIFO_buffer = (unsigned char *)&USBFIFO0;

    for(cnt = 0; cnt < length; cnt++)
    {
        // Read in one byte at a time
        *buffer++ = *(FIFO_buffer + (cnt & 3));
    }

    USBE0CSR0bits.RXRDYC = 1;
}

This is exactly the same as the reading procedure for device mode. Now, depending on the device, it may only be able to send 8 bytes at a time. For example, my PlayStation 5 controller (no, I don't have a PS5, just a controller :)) can send 64 bytes at a time. My Logitech Unifying receiver can only send 8 bytes at a time. For this particular request (get device descriptor), I need to receive 18 bytes. This means that the PS5 DualSense controller will send the reply all at once, but the Unifying receiver will split it into 3 packets of 8 + 8 + 2 bytes in length. If you are still expecting more bytes, you need to set the REQPKT bit again. If you are done receiving, you must clear both the STATPKT and REQPKT bits.
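
The packet-splitting arithmetic is simple enough to sketch in a few lines. The function names are made up for illustration, and max_packet is the largest packet the device can send on endpoint 0 (8 for the Unifying receiver, 64 for the DualSense):

```c
/* Number of IN packets needed to receive total_bytes (rounds up).
   max_packet must be non-zero. */
static unsigned packets_needed(unsigned total_bytes, unsigned max_packet)
{
    return (total_bytes + max_packet - 1) / max_packet;
}

/* Length of packet number 'index' (0-based); returns 0 past the end */
static unsigned packet_length(unsigned total_bytes, unsigned max_packet, unsigned index)
{
    unsigned offset = index * max_packet;
    unsigned left;

    if (offset >= total_bytes)
        return 0;

    left = total_bytes - offset;
    return (left < max_packet) ? left : max_packet;
}
```

For the 18-byte device descriptor this gives 3 packets of 8, 8 and 2 bytes with an 8-byte maximum, and a single 18-byte packet with a 64-byte maximum, matching what I see from the two devices.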

While this all seems perfectly straightforward in hindsight, believe me when I say finding this all out without any documentation was a real pain in the butt.

That's all for today. Next time I'll either continue the MSD posts or upload something on HID. Hope this helps!

Tags: USB, host