JTAG Programming on a PIC
Programming an FPGA is a simple job if you have a JTAG interface and programmer to hand. When a product is out in the field, though, it's nice to have a more 'customer friendly' method of update.
We're using Lattice XP chips. Lattice supplies the source code for an embedded programmer with their ispVM programming solution. This uses 'SlimVME' files that are compressed for a smaller footprint.
Instead of using .JED files, you get two VME files: a 'DATA' file containing the fuse settings and an 'ALGO' file that has the programming algorithm. You can create them either in binary form or as HEX files with .c extensions that let you compile them directly into your code (each file contains a single big initialised array).
In the Lattice example, you compile a version of these files in with their programming code. Obviously this leads to a large code footprint (~200K for my code), which is not so good on a PIC!
My solution is to use the binary export option instead and feed the data to the PIC from a PC application as it is needed. The Lattice code accesses the data in a non-linear fashion, so it's not as simple as streaming it.
My solution:
Using a PIC 18F87J50 processor
1) Use the Microchip 2.5 USB stack example, with 64 byte packets
2) Move the USB service routines out of the main loop and into a timer controlled interrupt
This cleans up the main loop and makes processing the USB easier
3) When a USB packet comes in (interrupt scope, from the USB service routine), copy the data into an 'execution buffer', set a flag to show a command is ready and exit the interrupt
4) The main loop runs until the flag is set, then parses the 64 byte data packet
5) For sending commands, the PC builds a 64 byte command packet and sends it. The PIC receives it, parses it and sends a 64 byte packet in return (I know it's a bit inefficient, but it's easy!)
6) When this is working, add the JTAG code files and fix all the compile errors. The Lattice code is designed for a different processor, so you'll need to fix various syntax and types that are different. Note:
- PIC C30 ints are unsigned and the Lattice code needs signed ints, so add 'signed' to all ints
- Replace 'short int' with 'int'
- C30 ints are 16 bit, not 32 bit, so replace 'int' with 'signed long', especially for anything that stores a position in the data buffer.
7) Replace the hardware functions that set and clear pins with PORT and LAT register accesses, so the pins are read (PORT) and written (LAT) correctly
8) To start the update, send a command packet from the PC. In the command parser (called from the main loop), call the main function of the Lattice code. The main loop will now be blocked until it completes
9) Now the clever bit. The GetByte function returns a specified byte from the DATA or ALGO array. As we don't have the entire array loaded, we have to rewrite it
Create 2 small buffers (55 bytes in my case, allowing 9 bytes overhead in the packet), one for data and one for algo, plus a variable to hold the start position of each buffer. This is the buffer's index into the full array, which is stored on the PC.
10) When GetByte is called, we check whether the requested index is in our current buffer. If so, we return the byte. If not, we have to request a new packet of data from the PC that starts with the requested byte.
11) To request an update of the buffer, the PIC sends a request packet to the PC, specifying the type (data or algo), the start index and the number of bytes (up to 55) to send.
12) The GetByte function then drops into a while loop that spins until the PC responds with the new data. As the main loop is currently blocked, we have to poll for the 'command ready' flag inside this loop and call the parser when it turns up. The parser should recognise the type of data that has been sent, copy it to the DATA or ALGO buffer as necessary and update the start index for that buffer.
13) When the buffer has been updated, we can exit the while loop and return the byte that was requested some time ago! The extra delay in waiting for the PC to send the data is not a problem.
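The buffered GetByte replacement described in the last few steps can be sketched roughly as below. This is a host-runnable illustration, not the actual firmware: get_byte, pc_attach, refill_window and the struct layout are all hypothetical names, the real Lattice GetByte signature may differ, and the USB round trip (send a request packet, poll the 'command ready' flag, run the parser) is replaced here by a plain copy from an in-memory array standing in for the file on the PC.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WINDOW_SIZE 55u   /* payload bytes per 64 byte packet (9 bytes overhead) */

enum stream_id { STREAM_ALGO = 0, STREAM_DATA = 1 };

/* One window per stream: a small slice of the full file held on the PC. */
struct window {
    uint8_t  bytes[WINDOW_SIZE];
    uint32_t start;   /* index of bytes[0] within the full file */
    uint32_t count;   /* number of valid bytes currently held   */
};

static struct window windows[2];

/* Stand-in for the PC side (assumption): the full file lives in memory here. */
static const uint8_t *pc_file[2];
static uint32_t pc_file_len[2];

uint32_t refill_count;    /* how many round trips to the "PC" were needed */

void pc_attach(enum stream_id id, const uint8_t *file, uint32_t len)
{
    pc_file[id] = file;
    pc_file_len[id] = len;
    windows[id].start = 0;
    windows[id].count = 0;    /* empty window forces a fetch on first use */
}

/* On the PIC this would build the request packet (type, start index, byte
 * count), send it, then poll the command-ready flag and call the parser
 * until the reply has been copied in. Here it just copies from memory. */
static void refill_window(enum stream_id id, uint32_t index)
{
    struct window *w = &windows[id];
    uint32_t remaining, n;

    assert(index < pc_file_len[id]);
    remaining = pc_file_len[id] - index;
    n = remaining < WINDOW_SIZE ? remaining : WINDOW_SIZE;
    memcpy(w->bytes, pc_file[id] + index, n);
    w->start = index;         /* new window starts at the requested byte */
    w->count = n;
    refill_count++;
}

/* Returns 0 and stores the byte in *out, or -1 if index is past the end. */
int get_byte(enum stream_id id, uint32_t index, uint8_t *out)
{
    struct window *w = &windows[id];

    if (index >= pc_file_len[id])
        return -1;
    /* Unsigned subtraction: an index below w->start wraps to a huge value,
     * so this one test catches both forward and backward misses. */
    if (index - w->start >= w->count)
        refill_window(id, index);
    *out = w->bytes[index - w->start];
    return 0;
}
```

Because every refill starts the window at the requested byte, a request for data 'before' the current position is handled the same way as one beyond it. Using 32 bit indices throughout avoids the 16 bit int problem mentioned in step 6.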
In my example, it takes about 4 minutes to update the FPGA, although it's still pretty inefficient so I hope to improve it later.
Any questions, please stick them in the comments.
Hello,
I am trying to develop similar functionality, i.e. program the MachXO chip using a PIC 18F87J50. My data array was 46KB so I store it in the ROM of the PIC. Right now, my program bails out after transferring 32KB of data to the FPGA with ERROR_DATA_FILE. I know my data and algo files are right, as I verified them using the PC parallel port with ispVM and they passed its test.
Also, using the same code and setup, if I change the algo file to something simple like SRAM Erase, Verify ID etc. then the code exits without any errors. However, for large data files the code does not execute to the end.
I commented out all error statements in slim_pro.c just to let all the data bytes get transferred into the FPGA. However, I observed that at the end of the transfer, while movingalgoindex pointed at the last instruction in the algo array, movingdataindex ended at different values on different runs. Seems like there is a race condition happening somewhere?
At my wits' end right now. Any help/advice/pointers highly appreciated!
Thank you !
Hi Pheonix,
Are you using interrupts anywhere in the code? Your arrays might have to be marked as 'volatile' if you are loading data into the arrays from the interrupt context...
I also had a problem with a data error, but it was due to my code corrupting the file as it was sent to the PIC. It might be worth recording the last few bytes you are executing before the crash and checking they are in sequence in the original file. I was losing \r and \n chars due to legacy code. I'd also check your algo file is buffering correctly; I was losing bytes on the boundary of each buffer segment for a while!
As soon as I started feeding the correct Data and Algo bytes in, though, it worked fine...
Andy
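To illustrate the 'volatile' point for the interrupt-to-main-loop handoff used in the post (copy the packet into an execution buffer, set a flag, let the main loop pick it up), here is a minimal sketch. The names on_usb_packet and take_command are made up for this example and are not part of the Microchip USB stack; it assumes a single interrupt source filling a single 64 byte buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PACKET_SIZE 64u

/* Both the buffer and the flag are written in interrupt context and read
 * from the main loop, so both are volatile: the compiler must not cache
 * them in registers or optimise the polling away. */
static volatile uint8_t exec_buffer[PACKET_SIZE];
static volatile uint8_t command_ready;

/* Called from the USB service routine (interrupt scope).
 * Returns 0 if the packet was stored, -1 if the previous one is unread. */
int on_usb_packet(const uint8_t *packet)
{
    uint8_t i;

    if (command_ready)
        return -1;
    for (i = 0; i < PACKET_SIZE; i++)
        exec_buffer[i] = packet[i];
    command_ready = 1;        /* set the flag last, once the copy is done */
    return 0;
}

/* Called from the main loop, and from the wait loop inside GetByte.
 * Returns 1 and fills 'out' if a command was waiting, otherwise 0. */
int take_command(uint8_t *out)
{
    uint8_t i;

    assert(out != NULL);
    if (!command_ready)
        return 0;
    for (i = 0; i < PACKET_SIZE; i++)
        out[i] = exec_buffer[i];
    command_ready = 0;        /* hand the buffer back to the interrupt */
    return 1;
}
```

Copying byte by byte instead of using memcpy keeps the volatile qualifier on every access. Refusing a new packet while the flag is still set is one simple way to avoid overwriting a command that has not been parsed yet.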
Thanks Andy for the prompt reply! I am not using interrupts anywhere. To be sure, I also set GIE = 0 in INTCON and set WDTCON = OFF.
Since I don't load my data and algo information through files, will there still be data corruption? I have declared algo and data as volatile ROM arrays, so they get compiled and programmed into the PIC along with the rest of the code.
Also, in order to see the last commands executed, how do I print to a file while running the code on the PIC in debug mode? The printf function does not work when I am running the code on the PIC. The code also cannot be checked on the simulator, as it expects TDO bits to come from the FPGA and hence bails out at the first TDO verify point in the code.
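One way to record the last few bytes without printf is a small trace ring buffer in RAM, which can be inspected from the debugger after halting. The sketch below is only an illustration: trace_byte, trace_recent and the entry layout are hypothetical, and it assumes you can add a call to trace_byte inside GetByte.

```c
#include <assert.h>
#include <stdint.h>

#define TRACE_DEPTH 16u   /* power of two, so the slot index can be masked */

struct trace_entry {
    uint32_t index;       /* position requested in the full file */
    uint8_t  value;       /* byte that was returned              */
    uint8_t  is_algo;     /* 1 = ALGO stream, 0 = DATA stream    */
};

static struct trace_entry trace_log[TRACE_DEPTH];
static uint32_t trace_total;      /* number of bytes recorded so far */

/* Record one byte; the oldest entry is overwritten once the log is full. */
void trace_byte(uint8_t is_algo, uint32_t index, uint8_t value)
{
    struct trace_entry *e = &trace_log[trace_total & (TRACE_DEPTH - 1u)];

    e->index = index;
    e->value = value;
    e->is_algo = is_algo;
    trace_total++;
}

/* Look up the entry recorded 'age' calls ago (0 = most recent). */
const struct trace_entry *trace_recent(uint32_t age)
{
    assert(age < TRACE_DEPTH && age < trace_total);
    return &trace_log[(trace_total - 1u - age) & (TRACE_DEPTH - 1u)];
}
```

After a failure, the entries can be compared against the original file to check that the indices and values are in sequence.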
Got it ! Had to declare ALL the index variables as unsigned int long. I had only changed MovingDataIndex and MovingAlgoIndex previously.
Glad to hear it's working! That's a good point for anyone else doing this... Check ALL your variable types when converting to the PIC. The PIC has a different size for 'int' than the code expects.
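A failure at around 32KB is consistent with a signed 16 bit index, which tops out at 32767. The sketch below makes that arithmetic explicit; vme_index_t and signed_index_fits are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical typedef: one fixed-width type for every file index, so the
 * width no longer depends on what 'int' means on a given compiler. */
typedef uint32_t vme_index_t;

/* Returns 1 if a signed integer 'bits' wide can index every byte of an
 * array that is 'len' bytes long, otherwise 0. */
int signed_index_fits(unsigned bits, uint32_t len)
{
    uint32_t max_index;

    assert(bits >= 2u && bits <= 32u);
    max_index = (uint32_t)((1ull << (bits - 1u)) - 1u);  /* 32767 for 16 bits */
    return len == 0u || (len - 1u) <= max_index;
}
```

With these definitions, signed_index_fits(16, 46u * 1024u) is 0 while signed_index_fits(32, 46u * 1024u) is 1, so a 46KB data array needs every index that touches it to be wider than 16 bits.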
Hi,
Firstly, thanks for this blog post, it has been very helpful. I am attempting to do something similar but with an Atmel ARM based chip. I have managed to load all the data via USB into an array without having to split it, and this works fine. In practice, however, I will have to be able to split the file. When splitting the file I get error code -1 (Verify fail) returned from the ISP routine during the second pass through the data array (the verify stage). I have checked thoroughly, and the data being loaded into the array and handed back from the getByte function is identical to the unsplit case.
Did you encounter this issue at any point, and if you did how did you go about fixing it?
Hi,
Not precisely... I did have various bugs in my maths, though, that fed in the wrong bytes! If the un-split file works, it would strongly suggest you are not actually loading the data correctly the second time.
Have you checked the data from GetByte for both the Algo and Data files? One of the files is read in a repeating pattern, which could be messing you up when the requested data is 'before' your current position in the file.
The problem has finally been sorted. It turned out that the development board I was using had an LED showing the status of the USB transfer, which happened to be connected to the same pin I was using for TCK. So during the USB transfer at the split, TCK would be toggled, which would advance the FPGA state and ruin the programming. Simply commenting out the LED code from the framework fixed the problem.
Simple problem that took ages to fix.