Core Concepts
Large Language Models (LLMs) can be leveraged to automate both the design and verification of hardware modules, potentially streamlining the digital design pipeline.
Abstract
The paper explores the capabilities and limitations of state-of-the-art conversational LLMs, such as ChatGPT-4 and ChatGPT-3.5, in producing Verilog code for functional hardware design and verification.
The authors developed a suite of 8 representative hardware benchmarks, including shift registers, sequence generators, finite state machines, and more. They then used a conversational workflow to prompt the LLMs to generate the Verilog designs and accompanying testbenches.
The results show that ChatGPT-4 generated compliant designs and testbenches for the majority of the benchmarks, often requiring only minor tool feedback to fix issues. In contrast, ChatGPT-3.5 struggled, with most conversations yielding failed or non-compliant results. The authors also taped out one of the successful ChatGPT-4 designs on a SkyWater 130 nm shuttle, verifying its functionality in silicon.
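The conversational workflow with tool feedback can be pictured as a simple loop. The following is a minimal sketch, not the authors' actual scripts: `ask_model` and `check_design` are hypothetical stand-ins for an LLM call and a compile/simulate step, and the prompt wording and the `max_rounds` limit are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Conversation:
    """Running transcript of prompts and model replies."""
    messages: List[str] = field(default_factory=list)


def run_conversation(
    spec: str,
    ask_model: Callable[[Conversation, str], str],  # hypothetical LLM call
    check_design: Callable[[str], Optional[str]],   # hypothetical tool run: error text or None
    max_rounds: int = 5,                            # assumed limit, not from the paper
) -> Optional[str]:
    """Ask for a design, then feed tool errors back until it passes or rounds run out."""
    convo = Conversation()
    prompt = f"Write a Verilog module and testbench for: {spec}"
    for _ in range(max_rounds):
        design = ask_model(convo, prompt)
        convo.messages += [prompt, design]
        error = check_design(design)
        if error is None:
            return design  # the tools report no problems
        prompt = f"The tools reported this error, please fix it:\n{error}"
    return None  # no passing design within the round limit


def fake_model(convo: Conversation, prompt: str) -> str:
    # Stand-in for an LLM: the first reply has a bug, the second one fixes it.
    return "module sr; endmodule // fixed" if convo.messages else "module sr endmodule"


def fake_tools(design: str) -> Optional[str]:
    # Stand-in for a compiler/simulator: flags the missing semicolon.
    return None if "sr;" in design else "syntax error near 'endmodule'"


result = run_conversation("8-bit shift register", fake_model, fake_tools)
print(result)
```

With the stubs above, the first reply fails the tool check, the error text is sent back as the next prompt, and the second reply is accepted. A conversation that never passes returns `None`, which corresponds loosely to a failed result.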
The key findings are:
ChatGPT-4 can be effectively used for both hardware design and verification, though it requires some human feedback to address errors.
Testbench generation remains a significant challenge for current LLMs, as they struggle to create comprehensive and self-checking test cases.
Improvements in LLM capabilities, whether through new models or fine-tuning, could lead to tools that simplify hardware design and increase designer productivity.
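To make "self-checking" concrete: a self-checking test compares every observed output against a known expected value and reports mismatches itself, instead of leaving a human to inspect waveforms. The sketch below illustrates the idea in Python rather than Verilog. The `ShiftRegister8` model, its shift direction, and the test vectors are assumptions chosen for illustration and are not taken from the paper's benchmarks.

```python
from typing import List, Tuple


class ShiftRegister8:
    """Behavioural model of an 8-bit shift register.

    Assumed behaviour: shifts left each clock, new bit enters at bit 0.
    """

    def __init__(self) -> None:
        self.value = 0

    def reset(self) -> None:
        self.value = 0

    def clock(self, serial_in: int) -> int:
        self.value = ((self.value << 1) | (serial_in & 1)) & 0xFF
        return self.value


# (serial input bit, expected register value after the clock edge)
VECTORS: List[Tuple[int, int]] = [
    (1, 0b00000001),
    (0, 0b00000010),
    (1, 0b00000101),
    (1, 0b00001011),
    (0, 0b00010110),
    (0, 0b00101100),
    (1, 0b01011001),
    (0, 0b10110010),
    (1, 0b01100101),  # ninth shift: the oldest bit falls off the top
]


def run_self_checking_test(dut: ShiftRegister8) -> List[str]:
    """Drive each vector and collect a message for every mismatch."""
    failures: List[str] = []
    dut.reset()
    for step, (bit, expected) in enumerate(VECTORS):
        got = dut.clock(bit)
        if got != expected:
            failures.append(f"step {step}: expected {expected:08b}, got {got:08b}")
    return failures


failures = run_self_checking_test(ShiftRegister8())
print("PASS" if not failures else "\n".join(failures))
```

The last vector exercises the overflow case, which a test that only checks the first few shifts would miss. That kind of gap is what "comprehensive" refers to in the finding above.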
Stats
51% of development effort (cost) in ASIC and FPGA-based systems is spent on verification.
The Tiny Tapeout 3 platform used in this work has constraints such as a limit of 8 bits of input and 8 bits of output per design.
Quotes
"LLMs have recently been made 'conversational' using instruction-tuning. Rather than guessing the next most likely token in an 'autocomplete' fashion, they ingest whole prompts and formulate complete responses to those prompts."
"While current state of the art LLMs can be used for design tasks, they are still underperforming when it comes to test."