目录(?)[+]
You use a "JTAG cable" to control a JTAG bus from a PC. The JTAG cable is just a way to control the four JTAG signals from the PC.
The JTAG cable might connect to a PC's
The simplest is a parallel port JTAG cable.
USB and Ethernet JTAG cables are good too. They are faster at streaming large amount of data, but more complex to control and with more overhead (=slower) for small amount of data.
A parallel port can be regarded as a twelve bits output from the PC, and a five bits input to the PC. JTAG just requires a three bits output and a one bit input, so that's no problem. For example, take a look at the schematic of Xilinx's parallel-III cable. You can see on the left the PC's parallel signals, one the right the JTAG connector, and in the middle a few electronic buffers.
From the software point of view, a parallel port is ideal since it is very easy to control. For example, here's C code that shows how to toggle the TCK signal for an Altera ByteBlaster JTAG cable.
#define lpt_addr 0x378 #define TCK 0x01 void toggle_TCK() { outport(lpt_addr, 0); outport(lpt_addr, TCK); outport(lpt_addr, 0); }
One minor problem with Windows 2000/XP/Vista is that the "outport" IO instruction is restricted, but GiveIO and UserPort are free generic drivers that open up the IO space.
We know that a PC is connected to the JTAG bus as illustrated here:
So we have 4 signals (TDI, TDO, TMS, TCK) to take care of.
TCK is the JTAG clock signal. The other JTAG signals (TDI, TDO, TMS) are synchronous to TCK. So TCK has to toggle for anything to happen (usually things happen on TCK's rising edge).
Inside each JTAG IC, there is a JTAG TAP controller. On the figure above, that means that there is a TAP controller in the CPU and another in the FPGA.
The TAP controller is mainly a state machine with 16 states. TMS is the signal that controls the TAP controller. Since TMS is connected to all the JTAG ICs in parallel, all the TAP controllers move together (to the same state).
Here's the TAP controller state machine:
The little numbers ("0" or "1") close to each arrow are the value of TMS to change state. So for example, if a TAP controller is at state "Select DR-Scan" and TMS is "0" and TCK toggles, the state changes to "Capture-DR".
For JTAG to work properly, the tap controllers of a JTAG chain must always be in the same state. After power-up, they may not be in sync, but there is a trick. Look at the state machine and notice that no matter what state you are, if TMS stays at "1" for five clocks, a TAP controller goes back to the state "Test-Logic-Reset". That's used to synchronize the TAP controllers.
So let's say we want to go to "Shift-IR" after power-up:
// first sync everybody to the test-logic-reset state for(i=0; i<5; i++) JTAG_clock(TMS); // now that everybody is in a known and identical state, we can move together to another state // let's go to Shift-IR JTAG_clock(0); JTAG_clock(TMS); JTAG_clock(TMS); JTAG_clock(0); JTAG_clock(0);
Now that we know how to change state, we can use the two most important JTAG states, namely "Shift-DR" and "Shift-IR".
Shift-DR and Shift-IR are used in combination with the TDI and TDO lines.
Let's start with Shift-IR.
There is one register into each IC TAP controller called IR, the "instruction register". You write a value into this register that corresponds to what you want to do with JTAG (each IC has a list of possible instructions).
Also each IC IR register has a specific length. For example, let's say that the CPU's IR is 5 bits long, and the FPGA's IR is 10 bits long. The IR registers form a chain that is loaded through the TDI and TDO pins.
From the PC point of view, the above IR chain is 15 bits long.
To load IR values, the PC has to make sure the TAP controllers are in the Shift-IR state, and send 15 bits through TDI. Once shifted, the first 10 bits actually end up into the FPGA IR, and the last 5 bits into the CPU IR. If the PC was sending more than 15 bits, it would start receiving back what it sent (with a delay of 15 clocks) on TDO.
For example, let's send 00100 into the CPU's IR, and 0000000010 into the FPGA's IR.
// Because the bits are shifted through in a chain, we must start sending the data for the device that is at the end of the chain // so we send the 10 FPGA IR bits first JTAG_clock(0); JTAG_clock(1); JTAG_clock(0); JTAG_clock(0); JTAG_clock(0); JTAG_clock(0); JTAG_clock(0); JTAG_clock(0); JTAG_clock(0); JTAG_clock(0); // then send the 5 CPU IR bits JTAG_clock(0); JTAG_clock(0); JTAG_clock(1); JTAG_clock(0); JTAG_clock(0 | TMS); // last bit needs to have TMS active (to exit shift-IR)
In our example, the CPU's IR is 5 bits long (i.e. can hold values from 0 to 31). That means it can support up to 32 JTAG instructions. In practice, a CPU probably uses 5 to 10 instructions, and the remaining IR values are unused.
Same thing for the FPGA, with a 10 bits long IR, it can hold 1024 instructions, with most of them unused.
JTAG has a few mandatory instructions (what these instructions do will be illustrated later):
To get the list of possible IR values supported by a specific IC, check the IC's datasheet (or BSDL file - more on that later).
Each TAP controller has only one IR register, but has multiple DR registers. The DR registers are similar to the IR registers (they are shifted the same way the IR registers are but using the Shift-DR state instead of the Shift-IR).
Each IR value selects a different DR register. So in our example, the CPU could have up to 32 DR registers (if all IR instructions were implemented).
One important IR value is the "all-ones" value. For the CPU that would be 11111 and for the FPGA, that's 1111111111. This value corresponds to the mandatory IR instruction called BYPASS. In bypass mode, the TAP controller DR register is always a single flip-flop which does nothing besides delaying the TDI input by one clock cycle before outputting to TDO.
One interesting way to use this BYPASS mode is to count the number of ICs that are present in the JTAG chain.
If each JTAG IC delays the TDI-TDO chain by one clock, we can send some data and check by how long it is delayed. That gives us the number of ICs in the chain.
Here's an example:
// go to reset state for(i=0; i<5; i++) JTAG_clock(TMS); // go to Shift-IR JTAG_clock(0); JTAG_clock(TMS); JTAG_clock(TMS); JTAG_clock(0); JTAG_clock(0); // Send plenty of ones into the IR registers // That makes sure all devices are in BYPASS! for(i=0; i<999; i++) JTAG_clock(1); JTAG_clock(1 | TMS); // last bit needs to have TMS active, to exit shift-IR // we are in Exit1-IR, go to Shift-DR JTAG_clock(TMS); JTAG_clock(TMS); JTAG_clock(0); JTAG_clock(0); // Send plenty of zeros into the DR registers to flush them for(i=0; i<1000; i++) JTAG_clock(0); // now send ones until we receive one back for(i=0; i<1000; i++) if(JTAG_clock(1)) break; nbDevices = i; printf("There are %d device(s) in the JTAG chain\n", nbDevices);
Most JTAG ICs support the IDCODE instruction. In IDCODE mode, the DR register is loaded with a 32-bits value that represents the device ID.
Unlike for the BYPASS instruction, the IR value for the IDCODE is not standard. Fortunately, each time the TAP controller goes into Test-Logic-Reset, it goes into IDCODE mode (and loads the IDCODE into DR).
Here's an example:
// go to reset state (that loads IDCODE into IR of all the devices) for(i=0; i<5; i++) JTAG_clock(TMS); // go to Shift-DR JTAG_clock(0); JTAG_clock(TMS); JTAG_clock(0); JTAG_clock(0); // and read the IDCODES for(i=0; i < nbDevices; i++) { printf("IDCODE for device %d is %08X\n", i+1, JTAG_read(32)); }
Now let's ask the TAP controllers to go into boundary-scan mode, where the DR chain goes through each IO block and can read or hijack each pin!
Boundary-scan can be used even while a device is otherwise running. So for example, using JTAG on an FPGA, you can tell the status of each pin while the FPGA is running.
Let's try to read the value of the pins. We use a JTAG instruction called SAMPLE for that. Each IC instruction code list is different. You have to look into the IC datasheet, or the IC BSDL file to get the codes.
A BSDL file is actually a VHDL file that describes the boundary chain of an IC.
Here's an interesting portion of an Altera BSDL file (Cyclone EP1C3 in TQFP 100 pins package).
attribute INSTRUCTION_LENGTH of EP1C3T100 : entity is 10; attribute INSTRUCTION_OPCODE of EP1C3T100 : entity is "BYPASS (1111111111), "& "EXTEST (0000000000), "& "SAMPLE (0000000101), "& "IDCODE (0000000110), "& "USERCODE (0000000111), "& "CLAMP (0000001010), "& "HIGHZ (0000001011), "& "CONFIG_IO (0000001101)"; attribute INSTRUCTION_CAPTURE of EP1C3T100 : entity is "0101010101"; attribute IDCODE_REGISTER of EP1C3T100 : entity is "0000"& --4-bit Version "0010000010000001"& --16-bit Part Number (hex 2081) "00001101110"& --11-bit Manufacturer's Identity "1"; --Mandatory LSB attribute BOUNDARY_LENGTH of EP1C3T100 : entity is 339;
Here's what we learn from this device's BSDL:
The boundary-scan is 339 bits long. That doesn't mean there are 339 pins.
Each pin use an IO pad on the IC die. Some IO pads use one, two or three bits from the chain (depending if the pin is input only, output with tri-state, or both). See the links at the bottom of this page for more details. Also some registers correspond to IO pads that may not be bounded (they exists on the IC die but are not accessible externally). Which explains why a 100 pins device can have a 339 bits boundary-scan chain.
Going back to the BSDL file, we also get this:
attribute BOUNDARY_REGISTER of EP1C3T100 : entity is --BSC group 0 for I/O pin 100 "0 (BC_1, IO100, input, X)," & "1 (BC_1, *, control, 1)," & "2 (BC_1, IO100, output3, X, 1, 1, Z)," & --BSC group 1 for I/O pin 99 "3 (BC_1, IO99, input, X)," & "4 (BC_1, *, control, 1)," & "5 (BC_1, IO99, output3, X, 4, 1, Z)," & ... ... ... --BSC group 112 for I/O pin 1 "336 (BC_1, IO1, input, X)," & "337 (BC_1, *, control, 1)," & "338 (BC_1, IO1, output3, X, 337, 1, Z)" ;
This lists all the 339 bits of the chain, and what they do.
For example, bit 3 is the one that tells us what is the value on pin 99.
Let's read the boundary-scan registers, and print the value on pin 99:
// go to reset state for(i=0; i<5; i++) JTAG_clock(TMS); // go to Shift-IR JTAG_clock(0); JTAG_clock(TMS); JTAG_clock(TMS); JTAG_clock(0); JTAG_clock(0); // Assuming that IR is 10 bits long, // that there is only one device in the chain, // and that SAMPLE code = 0000000101b JTAG_clock(1); JTAG_clock(0); JTAG_clock(1); JTAG_clock(0); JTAG_clock(0); JTAG_clock(0); JTAG_clock(0); JTAG_clock(0); JTAG_clock(0); JTAG_clock(0 or TMS); // last bit needs to have TMS active, to exit shift-IR // we are in Exit1-IR, go to Shift-DR JTAG_clock(TMS); JTAG_clock(TMS); JTAG_clock(0); JTAG_clock(0); // read the boundary-scan chain bits in an array called BSB JTAG_read(BSB, 339); printf("Status of pin 99 = %d\n, BSB[3]);
Easy, right?
Your turn to experiment!