PCI Express - Topology

Point-to-point architecture

At 2.5Gsps, the PCI Express Gen1 line speed is a whopping 75 times faster than the 33MHz legacy PCI speed.
How is that possible? Only because PCI Express is a point-to-point bus.

Remember how PCI is a shared bus?

With PCI, ample time has to be specified to let the signals settle during each clock cycle. That's because each line of the PCI bus is shared along the PCI connectors and boards on the same bus. With PCI Express, each signal is point-to-point, which means there is no more settling time and the line speeds can be much higher.

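The price is pin count. For example, if a motherboard has two 1-lane connectors and one 16-lane connector, that requires 6+6+66=78 pins on the bridge just for the REFCLKs, PERs and PETs (since no sharing is allowed). Each connector needs one REFCLK pair, plus one PER pair and one PET pair per lane, at two pins per pair.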
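The pin count is easy to check with a few lines of Python. This is just a sketch of the arithmetic, assuming one REFCLK pair per connector, one PET pair and one PER pair per lane, and two pins per differential pair. The function names are made up for illustration.

```python
PINS_PER_PAIR = 2  # a differential pair uses two wires


def connector_pins(lanes: int) -> int:
    """Bridge pins for one connector (assumed: 1 REFCLK pair + PET and PER pairs per lane)."""
    refclk_pairs = 1
    lane_pairs = 2 * lanes  # one PET pair and one PER pair for each lane
    return PINS_PER_PAIR * (refclk_pairs + lane_pairs)


def bridge_pins(connectors) -> int:
    """Total bridge pins when nothing is shared between connectors."""
    return sum(connector_pins(lanes) for lanes in connectors)


print([connector_pins(n) for n in (1, 1, 16)])  # [6, 6, 66]
print(bridge_pins([1, 1, 16]))                  # 78
```

Power, ground and sideband pins are left out, just like in the text.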

Clock recovery

At speeds starting at 2.5GHz, the point-to-point architecture is still a challenge to get working because the duration of each bit is so short that timing jitter (the time uncertainty surrounding the arrival of each bit) becomes a problem. And even if each signal pair had an associated clock pair transmitted along with it, the clock pair would also be subject to timing jitter. So instead a new technique called "clock recovery" is used.

Clock recovery is simple. Basically, for each signal pair, the pair receiver looks at the signal transitions (a bit 0 followed by a bit 1, or vice-versa), from which it can infer the position of surrounding bits. One problem is that if many successive bits are transmitted with the same value (like lots of 0's), no signal transition is seen. So extra bits are transmitted to ensure that signal transitions are never too far apart, which keeps the clock recovery mechanism "re-synchronized".
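To see why long runs of identical bits are a problem, here is a small Python sketch that measures how long a bit stream goes without a transition. It is only an illustration of the idea, not a model of a real receiver, and `longest_run` is a hypothetical helper name.

```python
def longest_run(bits: str) -> int:
    """Length of the longest stretch of identical consecutive bits (no transition)."""
    longest = 0
    current = 0
    previous = None
    for b in bits:
        # Extend the current run if the bit repeats, otherwise start a new run.
        current = current + 1 if b == previous else 1
        longest = max(longest, current)
        previous = b
    return longest


print(longest_run("0000000011111111"))  # 8: only one transition to lock onto
print(longest_run("0101011001"))        # 2: plenty of transitions
```

The longer the run, the longer the receiver has to "coast" on its own guess of where the bit boundaries are.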

The extra bits are sent using a scheme called 8b/10b encoding, so that for every 8 bits of useful data, 10 bits are actually transmitted (2 extra bits, so 20% of the transmitted bits are overhead) in a specific way that guarantees enough signal transitions. But that also means that at 2.5GHz, we only get 250MBps of useful bandwidth per pair (instead of the 312MBps we would get without the encoding).
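The bandwidth numbers follow directly from the encoding ratio. A quick Python check, assuming a 2.5 Gbit/s line rate per pair and 8 bits per byte (the function name is only for illustration):

```python
LINE_RATE_BPS = 2.5e9  # Gen1 line rate per pair, in bits per second


def useful_bytes_per_second(line_rate_bps, data_bits=8, coded_bits=10):
    """Payload bandwidth in bytes/s when data_bits are carried in coded_bits on the wire."""
    return line_rate_bps * data_bits / coded_bits / 8


with_8b10b = useful_bytes_per_second(LINE_RATE_BPS)           # 8 data bits in 10 line bits
without_coding = useful_bytes_per_second(LINE_RATE_BPS, 8, 8)  # hypothetical: no extra bits

print(with_8b10b / 1e6, without_coding / 1e6)  # 250.0 312.5
```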

Differential pairs

Now remember the fact that the signals are sent on differential pairs (PER and PET)? That has many advantages:

Differential pairs have one obvious disadvantage: it takes twice as many wires to transmit a signal.






This page was last updated on March 31 2011.