The Problem
Modern GPUs are incredibly powerful but also black boxes. As an ECE student, I wanted to understand graphics at the hardware level. Could I build a functional GPU from scratch using just an FPGA?
Constraints
Working with the Artix-7 FPGA meant dealing with:
- Only 35K logic cells available
- Limited to 225KB of Block RAM
- Strict 60Hz timing requirements for VGA
- No soft CPU - pure hardware implementation
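To make the Block RAM limit concrete, here is a rough budget calculation in Python. The 225KB figure comes from the list above; the resolutions and bit depths tried below are hypothetical examples, not the settings this project actually used.

```python
BRAM_BYTES = 225 * 1024  # Block RAM budget from the constraints list


def framebuffer_bytes(width, height, bits_per_pixel, buffers=1):
    """Bytes needed for `buffers` frame buffers, rounded up to whole bytes."""
    total_bits = width * height * bits_per_pixel * buffers
    return (total_bits + 7) // 8


def fits_in_bram(width, height, bits_per_pixel, buffers=1):
    """True if the frame buffers fit inside the BRAM budget."""
    return framebuffer_bytes(width, height, bits_per_pixel, buffers) <= BRAM_BYTES


# Hypothetical configurations (assumptions for illustration only):
# (width, height, bits per pixel, number of buffers)
for w, h, bpp, n in [(640, 480, 12, 1), (320, 240, 12, 2), (320, 240, 8, 2)]:
    need = framebuffer_bytes(w, h, bpp, n)
    print(f"{w}x{h} @ {bpp}bpp x{n}: {need} bytes, fits={fits_in_bram(w, h, bpp, n)}")
```

A single full-resolution 640×480 buffer at 12 bits per pixel needs 460,800 bytes, roughly twice the budget, so any design on this part has to trade off resolution, color depth, or buffer count.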
Technical Approach
1. Display Controller
Implemented a VGA timing generator driven by a 25.175MHz pixel clock:
    // Active-video flag for 640x480@60Hz
    always @(posedge clk_25mhz) begin
        // Both counters must be in range, so the flag clears during blanking
        vga_active <= (h_count < H_DISPLAY) && (v_count < V_DISPLAY);
    end
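The counter logic can be modeled in software to sanity-check the numbers before synthesis. The sketch below uses the standard 640×480@60Hz porch and sync widths; it assumes the project follows that standard mode, and the `vga_signals` and `refresh_rate_hz` helpers are hypothetical names, not part of the actual RTL.

```python
# Standard 640x480@60Hz timing: pixel clocks (horizontal) and lines (vertical)
H_DISPLAY, H_FRONT, H_SYNC, H_BACK = 640, 16, 96, 48
V_DISPLAY, V_FRONT, V_SYNC, V_BACK = 480, 10, 2, 33

H_TOTAL = H_DISPLAY + H_FRONT + H_SYNC + H_BACK  # 800 clocks per line
V_TOTAL = V_DISPLAY + V_FRONT + V_SYNC + V_BACK  # 525 lines per frame


def vga_signals(h_count, v_count):
    """Return (active, hsync, vsync) for one counter position.

    Sync outputs are modeled as active-low, as in the standard 640x480 mode.
    """
    active = h_count < H_DISPLAY and v_count < V_DISPLAY
    hsync = not (H_DISPLAY + H_FRONT <= h_count < H_DISPLAY + H_FRONT + H_SYNC)
    vsync = not (V_DISPLAY + V_FRONT <= v_count < V_DISPLAY + V_FRONT + V_SYNC)
    return active, hsync, vsync


def refresh_rate_hz(pixel_clock_hz):
    """Frames per second produced by a given pixel clock."""
    return pixel_clock_hz / (H_TOTAL * V_TOTAL)


print(f"{refresh_rate_hz(25_175_000):.2f} Hz")  # prints "59.94 Hz"
```

Each frame takes 800 × 525 = 420,000 pixel clocks, which is why a 25.175MHz clock lands on 59.94Hz rather than exactly 60Hz.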
2. Graphics Pipeline Architecture
- Frame Buffer: Dual-port BRAM with double buffering to prevent tearing
- Rasterizer: Hardware implementation of Bresenham line algorithm
- Sprite Engine: DMA-style sprite blitting with transparency
- Collision Detection: Parallel boundary checking in hardware
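As a reference for the rasterizer, here is a software model of the Bresenham line algorithm in Python. It is a sketch for illustration, not the project's RTL. The algorithm needs only integer adds, subtracts, comparisons, and a doubling (a one-bit shift), which is what makes it a good fit for hardware.

```python
def bresenham_line(x0, y0, x1, y1):
    """Return the integer pixels on a line from (x0, y0) to (x1, y1)."""
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1  # step direction in x
    sy = 1 if y0 < y1 else -1  # step direction in y
    err = dx + dy              # running error term
    points = []
    while True:
        points.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:           # error allows a step in x
            err += dy
            x0 += sx
        if e2 <= dx:           # error allows a step in y
            err += dx
            y0 += sy
    return points


print(bresenham_line(0, 0, 5, 2))
# [(0, 0), (1, 0), (2, 1), (3, 1), (4, 2), (5, 2)]
```

A hardware version would typically hold `x0`, `y0`, and `err` in registers and emit one pixel per clock, with the loop body becoming the next-state logic.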
3. Host Communication
Python toolchain for development and debugging:
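    import struct

    # Stream sprite data via UART
    def upload_sprite(serial_port, sprite_data, width, height):
        # Header: big-endian 16-bit width and height, followed by raw pixel bytes
        header = struct.pack('>HH', width, height)
        serial_port.write(header + sprite_data)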
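The header format can be checked without hardware by round-tripping a packet in memory. In this sketch `io.BytesIO` stands in for the serial port, and the one-byte-per-pixel payload and the helper names `build_sprite_packet` and `parse_sprite_header` are assumptions for illustration; the pixel encoding the FPGA actually expects is not stated here.

```python
import io
import struct


def build_sprite_packet(width, height, sprite_data):
    """Prefix raw sprite bytes with a big-endian (width, height) header."""
    return struct.pack('>HH', width, height) + sprite_data


def parse_sprite_header(packet):
    """Split a packet into (width, height, payload)."""
    width, height = struct.unpack('>HH', packet[:4])
    return width, height, packet[4:]


# io.BytesIO stands in for a serial port (assumption: no hardware attached)
fake_port = io.BytesIO()
pixels = bytes(range(16))  # hypothetical 4x4 sprite, one byte per pixel
fake_port.write(build_sprite_packet(4, 4, pixels))

width, height, payload = parse_sprite_header(fake_port.getvalue())
print(width, height, len(payload))  # prints "4 4 16"
```

Any object with a `write()` method that accepts bytes works in place of `fake_port`, which keeps the packet logic testable on a machine with no board connected.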
Results & Performance
✅ 640×480 @ 60Hz stable VGA output
✅ 12-bit RGB color depth (4096 colors)
✅ 2 Games Running: Pong and Snake fully playable
✅ Less than 50% LUT Usage: Efficient resource utilization
✅ Zero frame drops with hardware double-buffering
Key Learnings
The biggest challenge was getting the VGA timing right. The specification expects precise timing, and being off by even 0.1% can cause display issues. I spent two weeks debugging why my display was shifted until I realized my pixel clock was 25.170MHz instead of 25.175MHz, an error of only about 0.02%.
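The size of that clock error is easy to quantify. This short calculation uses the two frequencies quoted above; the helper name `relative_error_percent` is hypothetical.

```python
TARGET_HZ = 25_175_000  # nominal pixel clock for 640x480@60Hz
ACTUAL_HZ = 25_170_000  # the clock that was generated by mistake


def relative_error_percent(actual_hz, target_hz):
    """Signed deviation of the actual clock from the target, in percent."""
    return (actual_hz - target_hz) / target_hz * 100.0


error = relative_error_percent(ACTUAL_HZ, TARGET_HZ)
print(f"clock error: {error:.3f}% ({error * 1e4:.0f} ppm)")
```

A check like this is worth running against whatever frequency the clock generator reports, since a 5kHz difference is easy to miss when reading the numbers by eye.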
Another surprise was how much BRAM organization matters. Switching from a linear frame buffer to a tile-based approach improved fill rate by 3x.
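To illustrate the difference between the two layouts, the sketch below compares a linear address mapping with a tiled one. The 8×8 tile size and the function names are assumptions for illustration; the tile geometry actually used in the design is not specified here.

```python
def linear_address(x, y, width):
    """Row-major address: neighbouring rows are `width` entries apart."""
    return y * width + x


def tiled_address(x, y, width, tile=8):
    """Address where each tile x tile block of pixels is stored contiguously.

    Assumes `width` is a multiple of `tile`.
    """
    tiles_per_row = width // tile
    tile_index = (y // tile) * tiles_per_row + (x // tile)
    offset = (y % tile) * tile + (x % tile)
    return tile_index * tile * tile + offset


# Address range touched by one aligned 8x8 block on a 640-pixel-wide screen
block = [(x, y) for y in range(8) for x in range(8)]
linear = [linear_address(x, y, 640) for x, y in block]
tiled = [tiled_address(x, y, 640) for x, y in block]
print(max(linear) - min(linear) + 1)  # prints 4488
print(max(tiled) - min(tiled) + 1)    # prints 64
```

In the tiled layout an aligned 8×8 block occupies 64 consecutive addresses instead of being spread across 4,488, so small rectangular fills and sprite blits touch a much more compact region of memory. With a power-of-two tile size, the divisions and modulos reduce to bit slicing in hardware.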
What’s Next?
Currently implementing:
- 3D wireframe rendering
- Texture mapping support
- HDMI output upgrade
- Custom GPU instruction set