RTL Design: Writing Synthesisable Verilog and VHDL (Complete Guide)

What this article covers: This guide explains what Register Transfer Level (RTL) means, how to write synthesisable Verilog and VHDL, which constructs are synthesisable and which are not, RTL coding best practices, FSM coding styles, reset strategies, clock domain crossing, lint rules, and how RTL quality directly affects synthesis, timing closure and silicon success.

[Figure: RTL design overview – 15 rules for synthesisable Verilog and VHDL, covering FSM design, coding practices, and digital logic implementation]

RTL design is the foundation of virtually every modern digital chip. From the simplest IoT microcontroller to the most powerful AI accelerator, the digital logic in today's chips is described first as RTL code – written in Verilog, SystemVerilog, or VHDL – before any synthesis tool, physical design tool, or foundry process touches it. Getting RTL right is one of the most important factors in a successful chip design: RTL bugs found after tapeout can cost millions of dollars and months of schedule, while RTL bugs found in simulation cost hours.

In the VLSI design flow, the RTL design stage sits right after system specification and architecture definition. It is the stage where the hardware description language (HDL) code is written that defines exactly what the chip will do. This code must be synthesisable – meaning it must be written in a way that logic synthesis tools (Synopsys Design Compiler, Cadence Genus) can translate it correctly into a gate-level netlist. Non-synthesisable code may simulate correctly but will produce wrong or no hardware at all.

This guide teaches you how to write professional, synthesisable RTL in both Verilog and VHDL, from first principles to advanced industry techniques.

1. What is RTL Design?

RTL design stands for Register Transfer Level design. It is a method of describing digital hardware using a Hardware Description Language (HDL), where the description focuses on how data moves between registers (flip-flops) on each clock cycle, and what combinational logic transforms that data between those transfers. The term “register transfer level” captures both concepts: registers (the flip-flops that store state) and transfers (the movement and transformation of data between them, mediated by combinational logic).

The RTL abstraction level sits between the behavioural level above it and the gate level below it. The table lists all four levels commonly used in digital design:

Abstraction Level	Description	Example	Used For
Behavioural Level	Describes what the system does algorithmically – no timing, no registers implied	`if (a > b) max = a; else max = b;`	High-level synthesis input, early validation
RTL Level (this guide)	Describes data transfers between registers per clock cycle, with combinational logic between	`always @(posedge clk) q <= d;`	Logic synthesis input – industry standard
Gate Level	Describes the circuit as actual logic gates (AND, OR, NAND, flip-flops) from a standard cell library	Netlist of cell instances and connections	Physical design input, gate-level simulation
Switch Level	Describes transistor-level connectivity (NMOS/PMOS switches)	SPICE netlist	Analog/custom IC design

RTL design is the dominant design methodology in the semiconductor industry because it provides the right level of abstraction: detailed enough to be synthesised into hardware accurately, but abstract enough to allow designers to think in terms of functionality and architecture rather than transistor-level details. When you write RTL, you are describing hardware – not writing software. Every line of RTL code implies actual physical logic gates that will be manufactured on silicon.

💡 RTL Design Is Hardware Description, Not Programming: The most common mistake beginners make is treating RTL like software code. Unlike software, in RTL design: (1) All always blocks execute concurrently, not sequentially. (2) Every signal assignment implies physical wires and logic gates. (3) Timing is real – setup and hold time violations cause silicon failures. (4) You cannot “allocate memory” – storage is always explicit flip-flops or SRAM.

2. RTL vs Behavioural vs Gate-Level Abstraction

To understand RTL design deeply, you must understand how it differs from behavioural coding and gate-level coding. The same hardware can be described at all three levels, but only RTL is the standard input for logic synthesis tools in the industrial VLSI design flow.

Consider a simple 2:1 multiplexer (MUX). Here is how it looks at all three levels:

Behavioural Level (Not Synthesisable as Written)

Verilog — Behavioural

// Behavioural - uses delay, not synthesisable in this form
module mux_behav (
    input  a, b, sel,
    output reg y
);
always @(a or b or sel) begin
    #5 y = sel ? a : b;   // #5 delay - NOT synthesisable
end
endmodule

RTL Level (Synthesisable – Industry Standard)

Verilog — RTL (Synthesisable)

// RTL — fully synthesisable, no delays, proper sensitivity list
module mux_rtl (
    input  wire a, b, sel,
    output reg  y
);
always @(*) begin         // @(*) = complete sensitivity list
    if (sel)
        y = a;
    else
        y = b;
end
endmodule

Gate Level (Post-Synthesis Netlist)

Verilog – Gate Level (Generated by Synthesis Tool)

// Gate-level - generated by synthesis tool, not written by hand
module mux_gate (a, b, sel, y);
  input a, b, sel;
  output y;
  MX2_HVT U1 (.A(a), .B(b), .S(sel), .Y(y)); // Illustrative standard cell; names and pin functions are library-specific
endmodule

The RTL version is what you write as an engineer. The gate-level version is what the synthesis tool produces. Gate-level is never hand-coded for complex designs – that would take years. This illustrates exactly why RTL design is so powerful: one line of RTL can produce hundreds of gates automatically.

3. Verilog vs VHDL vs SystemVerilog: Which to Use for RTL Design?

There are three main HDLs used for RTL design in the industry today. Choosing the right one depends on your design domain, company preference, and the geographic region you work in.

Language	Year	Primary Use	Region / Domain	Strengths	Weaknesses
Verilog (IEEE 1364)	1984 (IEEE standard 1995)	RTL design, synthesis	USA, Asia (dominant)	Concise syntax, C-like, easy to learn, very widely supported	Weaker type system, easier to write bugs
VHDL (IEEE 1076)	1987	RTL design, simulation	Europe, defence/aerospace	Strong typing, verbose but explicit, DoD/aerospace standard	Verbose, steeper learning curve, slower to write
SystemVerilog (IEEE 1800)	2002 (IEEE standard 2005)	RTL design + Verification	Global (increasingly dominant)	Superset of Verilog + OOP verification features, UVM support	Large language, complex features if misused

💡 Industry Recommendation (2025): For RTL design, SystemVerilog is the modern standard – it gives you all of Verilog’s strengths with improved syntax, explicit data types, interfaces, and better synthesis features. Many leading companies (Qualcomm, Apple, NVIDIA, Google) use SystemVerilog for RTL. VHDL remains dominant in European aerospace and defence (DO-254). Pure Verilog is still widely used in legacy designs and academia.

4. Verilog Basics for RTL Design

Verilog is a Hardware Description Language that forms the backbone of RTL design in most of the semiconductor industry. Understanding its core constructs is essential before writing any synthesisable RTL code.

4.1 Module and Port Declaration

In Verilog, every design unit is a module. A module has ports (inputs and outputs) and internal logic. This is the basic building block of any RTL design.

Verilog – Module Structure

// Module declaration — best practice for RTL design
module adder_8bit (
    input  wire [7:0] a,        // 8-bit input A
    input  wire [7:0] b,        // 8-bit input B
    input  wire        cin,     // Carry input
    output wire [7:0] sum,     // Sum output
    output wire        cout    // Carry out
);
    // Continuous assignment — combinational logic
    assign {cout, sum} = a + b + cin;

endmodule

4.2 Verilog Data Types for RTL Design

Understanding data types is critical for writing correct, synthesisable RTL design in Verilog:

Data Type	Use in RTL	Synthesisable?	Example
`wire`	Connects module ports and continuous assignments – represents physical wires	✅ Yes	`wire [7:0] data_bus;`
`reg`	Holds a value in procedural (always) blocks – does NOT always imply a flip-flop! Can be combinational or sequential depending on usage	✅ Yes	`reg [3:0] count;`
`integer`	32-bit signed – useful as a `for` loop index (generate loops use `genvar` instead)	⚠️ Loop index only – avoid for data signals	`integer i;`
`parameter`	Compile-time constant – essential for parameterisable RTL design	✅ Yes	`parameter WIDTH = 8;`
`localparam`	Local constant within module – cannot be overridden from outside	✅ Yes	`localparam IDLE = 2'b00;`
`real`	Floating-point – for simulation only	❌ No	`real freq = 1.5e9;`
`time`	Simulation time type	❌ No	`time t_start;`

4.3 always Block – The Core of RTL Design

The always block is the most fundamental construct in Verilog RTL design. There are two types used in synthesisable RTL:

Verilog — Sequential always Block (Flip-Flop)

// Sequential logic — infers flip-flops
always @(posedge clk or negedge rst_n) begin
    if (!rst_n)
        q <= 8'h00;     // Non-blocking: always use <= for sequential RTL
    else
        q <= d;
end

Verilog — Combinational always Block

// Combinational logic — @(*) ensures complete sensitivity list
always @(*) begin
    case (sel)
        2'b00: y = a;    // Blocking: always use = for combinational RTL
        2'b01: y = b;
        2'b10: y = c;
        default: y = d; // ALWAYS include default — prevents latch inference
    endcase
end

⚠️ Critical RTL Rule: Blocking vs Non-Blocking Assignments. This is one of the most important rules in Verilog RTL design – and one of the most common sources of bugs:

Sequential logic (always @(posedge clk)): ALWAYS use non-blocking assignments (<=)
Combinational logic (always @(*)): ALWAYS use blocking assignments (=)
NEVER mix both types in the same always block

Using blocking assignments in sequential blocks creates simulation race conditions between always blocks and can produce simulation-synthesis mismatches – the simulation may pass while the synthesised hardware behaves differently.
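To see why the rule matters, here is a minimal sketch of a two-stage register pipeline written both ways. The module and signal names (`pipe_nonblocking`, `pipe_blocking`, `s1`, `s2`) are hypothetical and chosen only for illustration.

```verilog
// Correct: non-blocking assignments. Both right-hand sides are sampled
// before either register updates, so s2 receives the OLD value of s1.
module pipe_nonblocking (
    input  wire clk,
    input  wire d,
    output reg  s1,
    output reg  s2
);
    always @(posedge clk) begin
        s1 <= d;
        s2 <= s1;   // two flip-flops in series: s2 lags d by two cycles
    end
endmodule

// Incorrect: blocking assignments. s1 is updated first, then s2 copies
// the NEW s1 within the same clock edge, so the second stage collapses.
module pipe_blocking (
    input  wire clk,
    input  wire d,
    output reg  s1,
    output reg  s2
);
    always @(posedge clk) begin
        s1 = d;
        s2 = s1;    // s2 follows d after ONE cycle, not two
    end
endmodule
```

In simulation, `s2` of the first module trails `d` by two clock edges, while `s2` of the second trails by only one. The intended pipeline depth is silently lost.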

5. VHDL Basics for RTL Design

VHDL (VHSIC Hardware Description Language) is the second major HDL used for RTL design. It is strongly typed, more verbose than Verilog, and is the dominant language for European aerospace, defence and automotive chip design (many companies following DO-254 or ISO 26262 mandate VHDL). Understanding VHDL RTL coding is essential for engineers working in these domains.

5.1 VHDL Entity and Architecture

In VHDL, every design unit consists of an entity (defines the interface – ports) and an architecture (defines the implementation – internal logic). This separation of interface and implementation is one of VHDL’s strengths for large-team RTL design.

VHDL — Entity and Architecture

-- VHDL RTL Design: 8-bit Register with Synchronous Reset
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Entity: defines ports (interface)
entity reg_8bit is
    port (
        clk   : in  std_logic;
        rst_n : in  std_logic;
        en    : in  std_logic;
        d     : in  std_logic_vector(7 downto 0);
        q     : out std_logic_vector(7 downto 0)
    );
end entity reg_8bit;

-- Architecture: defines the implementation (behaviour)
architecture rtl of reg_8bit is
begin
    -- Sequential process: infers D flip-flops
    reg_proc : process(clk)
    begin
        if rising_edge(clk) then
            if rst_n = '0' then
                q <= (others => '0');   -- Synchronous reset
            elsif en = '1' then
                q <= d;                 -- Load data when enabled
            end if;
        end if;
    end process reg_proc;
end architecture rtl;

5.2 VHDL Data Types for RTL Design

VHDL’s strong type system prevents many common RTL errors that are possible in Verilog. Key data types for synthesisable RTL design in VHDL include:

VHDL Type	Description	RTL Use
`std_logic`	9-value logic type (U, X, 0, 1, Z, W, L, H, -)	Standard for all single-bit signals – must include IEEE.STD_LOGIC_1164
`std_logic_vector`	Array of std_logic – most common multi-bit type	Buses, data paths: `std_logic_vector(7 downto 0)`
`unsigned`	Unsigned integer (from NUMERIC_STD)	Arithmetic operations – use for arithmetic signals
`signed`	Signed two’s complement integer (from NUMERIC_STD)	Signed arithmetic operations
`integer`	32-bit integer – synthesisable with range constraint	`integer range 0 to 255` – defines bit width
`boolean`	True/false – synthesisable	Control signals, enable flags
`enumeration`	User-defined enumerated types	FSM state encoding: `type state_t is (IDLE, ACTIVE, DONE);`

5.3 VHDL Signal vs Variable

One of VHDL’s most commonly misunderstood concepts in RTL design is the difference between signals and variables:

Aspect	Signal	Variable
Scope	Defined in architecture or package – visible everywhere in the architecture	Defined inside a process – local to that process only
Update timing	Updated at the end of the delta cycle (after process suspension)	Updated immediately when assigned
Synthesis result	Always maps to a wire or flip-flop	May or may not produce a register depending on usage context
RTL recommendation	Use for inter-process communication, output ports, and main data paths	Use for local intermediate calculations within a process
Assignment operator	`<=` (signal assignment)	`:=` (variable assignment)

6. Synthesisable vs Non-Synthesisable Constructs

One of the most critical skills in RTL design is knowing exactly which language constructs are synthesisable and which are not. Using non-synthesisable constructs in RTL code is a common beginner mistake that causes the synthesis tool to either error out, produce incorrect hardware, or silently ignore the construct – leading to simulation-synthesis mismatches that are extremely hard to debug.

Verilog: Synthesisable Constructs

Construct	Synthesisable?	Infers
`assign` (continuous assignment)	✅ Yes	Combinational logic (wires and gates)
`always @(posedge clk)`	✅ Yes	Flip-flops (sequential logic)
`always @(*)`	✅ Yes	Combinational logic
`if-else` in always	✅ Yes	MUX (combinational) or conditional register update (sequential)
`case / casez / casex`	✅ Yes	MUX / decoder logic
`for` loop with fixed bounds	✅ Yes	Replicated logic (unrolled at compile time)
`module instantiation`	✅ Yes	Hierarchical structural design
`generate`	✅ Yes	Parameterisable repeated or conditional hardware structures
`parameter / localparam`	✅ Yes	Compile-time constants (no hardware)
Arithmetic operators (+, -, *, /)	✅ Yes (*, / may be large)	Adder, subtractor, multiplier, divider circuits
Bitwise operators (&, |, ^, ~)	✅ Yes	AND, OR, XOR, NOT gates
Reduction operators (&, |, ^)	✅ Yes	Multi-input AND, OR, XOR reduction trees
Shift operators (<<, >>, <<<, >>>)	✅ Yes	Barrel shifters or wire routing (constant shifts)
Concatenation { , }	✅ Yes	Wire concatenation / bus manipulation
Conditional operator (? :)	✅ Yes	2:1 MUX
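As an illustration of the `for` loop row in the table above, the sketch below counts the set bits on a bus. Because the loop bound is a compile-time constant, synthesis can unroll it into a fixed chain or tree of adders. The module name `popcount` and the parameter `WIDTH` are hypothetical, and the example assumes a tool that supports the Verilog-2005 `$clog2` function.

```verilog
module popcount #(
    parameter WIDTH = 8
) (
    input  wire [WIDTH-1:0]       data,
    output reg  [$clog2(WIDTH):0] count   // wide enough to hold the value WIDTH
);
    integer i;

    always @(*) begin
        count = 0;                         // default assignment: no latch
        for (i = 0; i < WIDTH; i = i + 1)  // fixed bound: unrolled by synthesis
            count = count + data[i];
    end
endmodule
```

The loop describes `WIDTH` adders in space, not `WIDTH` iterations in time: all of them exist in hardware at once.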

Verilog: Non-Synthesisable Constructs (Testbench / Simulation Only)

Construct	Why Not Synthesisable	Use Instead
`#delay` (e.g., `#10 clk = ~clk;`)	Time delays have no physical hardware equivalent in combinational/sequential logic	Use timing constraints in SDC
`initial` block	Executes once at time 0 – no hardware equivalent (except FPGA with initialisation support)	Use synchronous reset for initialisation
`$display, $monitor, $finish`	System tasks – simulator functions only	Not applicable in RTL – testbench only
`force / release`	Simulator construct for overriding signals – no hardware equivalent	Testbench only
`fork / join`	Parallel thread simulation – no hardware mapping	Testbench only
Dynamic memory (`new`, queues, mailboxes)	SystemVerilog verification constructs – no hardware equivalent	SystemVerilog classes for testbench only
File I/O (`$fopen, $fread`)	Operating system file access – no hardware equivalent	Testbench only
Unbounded `while` loop	May execute indefinitely – synthesis cannot determine hardware size	Use `for` loop with fixed bounds
`wait` statement	Event-based waiting – not synthesisable	Use synchronous state machines

7. Writing Sequential Logic in RTL Design

Sequential logic is the type of digital logic that has memory – its output depends not only on current inputs but also on past states. In RTL design, sequential logic is implemented using flip-flops, which are inferred whenever a signal is assigned inside a clock-edge-triggered always block.

7.1 D Flip-Flop with Synchronous Reset

Verilog — D Flip-Flop (Synchronous Reset)

module dff_sync_rst (
    input  wire clk, rst_n, d,
    output reg  q
);
    always @(posedge clk) begin
        if (!rst_n)    // Synchronous reset - checked at clock edge
            q <= 1'b0;
        else
            q <= d;
    end
endmodule

7.2 D Flip-Flop with Asynchronous Reset

Verilog – D Flip-Flop (Asynchronous Reset)

module dff_async_rst (
    input  wire clk, rst_n, d,
    output reg  q
);
    // rst_n in sensitivity list = asynchronous reset
    always @(posedge clk or negedge rst_n) begin
        if (!rst_n)    // Asynchronous — fires regardless of clock
            q <= 1'b0;
        else
            q <= d;
    end
endmodule

7.3 Parameterisable Shift Register

Verilog — Parameterisable Shift Register

module shift_reg #(
    parameter WIDTH = 8,
    parameter DEPTH = 4
) (
    input  wire              clk, rst_n, en,
    input  wire [WIDTH-1:0] d_in,
    output wire [WIDTH-1:0] d_out
);
    reg [WIDTH-1:0] shift_mem [DEPTH-1:0];
    integer i;

    always @(posedge clk) begin
        if (!rst_n) begin
            for (i = 0; i < DEPTH; i = i + 1)
                shift_mem[i] <= {WIDTH{1'b0}};
        end else if (en) begin
            shift_mem[0] <= d_in;
            for (i = 1; i < DEPTH; i = i + 1)
                shift_mem[i] <= shift_mem[i-1];
        end
    end
    assign d_out = shift_mem[DEPTH-1];
endmodule

8. Writing Combinational Logic in RTL Design

Combinational logic produces outputs that depend only on the current values of its inputs – there is no memory, no clock. In RTL design, combinational logic is described using continuous assignments (assign statements) or combinational always @(*) blocks. The synthesis tool maps this into the appropriate logic gates from the standard cell library.

8.1 Priority Encoder Using always @(*)

Verilog — Priority Encoder (4-to-2)

module priority_enc_4to2 (
    input  wire [3:0] req,     // 4-bit request bus
    output reg  [1:0] grant,   // 2-bit grant output
    output reg          valid    // Valid output
);
    always @(*) begin
        valid = 1'b1;
        if      (req[3]) grant = 2'b11;  // Highest priority
        else if (req[2]) grant = 2'b10;
        else if (req[1]) grant = 2'b01;
        else if (req[0]) grant = 2'b00;
        else begin
            grant = 2'b00;
            valid = 1'b0;              // No request active
        end
    end
endmodule

8.2 Avoiding Latch Inference in RTL Design

One of the most critical rules in combinational RTL design is to avoid unintentional latch inference. A latch is inferred when a signal in a combinational always block is not assigned in all possible code paths. Latches are timing-analysis nightmares — they are level-sensitive, not edge-triggered, which causes problems with static timing analysis and often indicates a design error.

❌ Latch Inferred (Bug)

always @(*) begin
  if (sel)
    y = a;    // What is y when
              // sel=0? Latch!
end

✅ No Latch – Correct RTL

always @(*) begin
  if (sel)
    y = a;
  else
    y = b;  // All paths covered
end

❌ Incomplete case (Latch Bug)

always @(*) begin
  case (opcode)
    2'b00: result = a + b;
    2'b01: result = a - b;
    // Missing 2'b10, 2'b11!
    // = Latch inferred
  endcase
end

✅ Default covers all paths

always @(*) begin
  case (opcode)
    2'b00: result = a + b;
    2'b01: result = a - b;
    2'b10: result = a & b;
    default: result = 8'h00;
    // Default prevents latch
  endcase
end
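When a block drives several outputs, adding an `else` or `default` for each one becomes error-prone. A common alternative, also used in the FSM output logic later in this guide, is to assign every output a default value at the top of the block. The sketch below shows the idea; the module name `alu_dec` and the `wr_en` output are hypothetical.

```verilog
module alu_dec (
    input  wire [1:0] opcode,
    input  wire [7:0] a, b,
    output reg  [7:0] result,
    output reg        wr_en
);
    always @(*) begin
        // Defaults first: every output has a value on every path
        result = 8'h00;
        wr_en  = 1'b0;
        case (opcode)
            2'b00: begin result = a + b; wr_en = 1'b1; end
            2'b01: begin result = a - b; wr_en = 1'b1; end
            2'b10:       result = a & b;   // wr_en keeps its default of 0
            default: ;                     // nothing to do: defaults apply
        endcase
    end
endmodule
```

Because the defaults cover every path, a branch that forgets one output (like `2'b10` above) still produces pure combinational logic instead of a latch.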

9. FSM (Finite State Machine) RTL Coding

Finite State Machines (FSMs) are one of the most fundamental building blocks in RTL design. Nearly every digital controller, protocol handler, sequencer, and arbiter in a chip is implemented as an FSM. Understanding how to code FSMs correctly in RTL is essential for every VLSI engineer.

9.1 Moore vs Mealy FSMs

Aspect	Moore FSM	Mealy FSM
Output depends on	Current state only	Current state AND current inputs
Output timing	Synchronous – changes only after a state transition (one cycle after the input that caused it)	Combinational (immediate with input change)
Number of states	Typically more states needed	Typically fewer states needed
Glitching	Input glitches cannot reach the outputs; outputs are fully glitch-free only if registered or taken directly from state bits	Possible input glitch → output glitch
Preferred for	Most RTL designs – safer, easier to verify	Protocol interfaces where immediate response needed
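The timing difference in the table can be seen in a small rising-edge detector that produces both kinds of output. This is a sketch under assumed names (`edge_detect`, `pulse_mealy`, `pulse_moore`), not a recommended reusable block.

```verilog
module edge_detect (
    input  wire clk, rst_n,
    input  wire sig,
    output wire pulse_mealy,  // depends on state AND current input
    output reg  pulse_moore   // depends on state only
);
    reg sig_q;   // state: value of sig at the previous clock edge

    always @(posedge clk or negedge rst_n) begin
        if (!rst_n) begin
            sig_q       <= 1'b0;
            pulse_moore <= 1'b0;
        end else begin
            sig_q       <= sig;
            pulse_moore <= sig & ~sig_q;  // registered: appears one cycle later
        end
    end

    // Mealy output: asserted in the same cycle that sig rises
    assign pulse_mealy = sig & ~sig_q;
endmodule
```

`pulse_mealy` responds immediately but passes any glitch on `sig` straight through, whereas `pulse_moore` is one cycle late and changes only at clock edges.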

9.2 Three-Process FSM Coding Style (Recommended for RTL Design)

The three-process FSM coding style is a widely recommended practice for RTL design – it separates state register, next-state logic, and output logic into three distinct always blocks for maximum clarity, maintainability, and synthesis friendliness:

Verilog – Three-Process FSM (Moore – UART TX Controller Example)

module uart_tx_ctrl (
    input  wire       clk, rst_n,
    input  wire       tx_start,   // Start transmission
    input  wire       tx_done,    // Bit-level done
    output reg        tx_en,      // Enable transmitter
    output reg        tx_busy     // Busy flag
);

    // State encoding using localparams
    localparam [1:0]
        IDLE     = 2'b00,
        START    = 2'b01,
        TRANSMIT = 2'b10,
        STOP     = 2'b11;

    reg [1:0] curr_state, next_state;

    // PROCESS 1: State Register (Sequential)
    always @(posedge clk or negedge rst_n) begin
        if (!rst_n)
            curr_state <= IDLE;
        else
            curr_state <= next_state;
    end

    // PROCESS 2: Next-State Logic (Combinational)
    always @(*) begin
        next_state = curr_state; // Default: stay in current state
        case (curr_state)
            IDLE    : if (tx_start) next_state = START;
            START   :               next_state = TRANSMIT;
            TRANSMIT: if (tx_done)  next_state = STOP;
            STOP    :               next_state = IDLE;
            default:               next_state = IDLE;
        endcase
    end

    // PROCESS 3: Output Logic (Moore — based on state only)
    always @(*) begin
        // Default outputs to prevent latch inference
        tx_en   = 1'b0;
        tx_busy = 1'b0;
        case (curr_state)
            IDLE    : begin tx_en = 1'b0; tx_busy = 1'b0; end
            START   : begin tx_en = 1'b1; tx_busy = 1'b1; end
            TRANSMIT: begin tx_en = 1'b1; tx_busy = 1'b1; end
            STOP    : begin tx_en = 1'b0; tx_busy = 1'b1; end
            default : begin tx_en = 1'b0; tx_busy = 1'b0; end
        endcase
    end

endmodule

9.3 FSM State Encoding

Encoding Style	Example (4 states)	Advantages	Disadvantages
Binary	00, 01, 10, 11	Minimum flip-flops (ceil(log2 N))	More complex next-state logic, slower
One-Hot	0001, 0010, 0100, 1000	Fastest next-state logic, used in high-speed designs	More flip-flops (N flip-flops)
Gray Code	00, 01, 11, 10	Only one bit changes per transition when states are visited in sequence – good for CDC	More complex output decoding
Johnson	1000, 1100, 1110, 1111	Power efficient – limited bit switching	Requires special decoding
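As a sketch of the one-hot row in the table, the small FSM below uses one flip-flop per state, so decoding a state is a single-bit test. The module name `onehot_fsm`, its three states, and its ports are hypothetical.

```verilog
module onehot_fsm (
    input  wire clk, rst_n,
    input  wire go, done,
    output wire busy
);
    // One flip-flop per state; exactly one bit is set at any time
    localparam [2:0]
        IDLE = 3'b001,
        RUN  = 3'b010,
        STOP = 3'b100;

    reg [2:0] state, next_state;

    always @(posedge clk or negedge rst_n) begin
        if (!rst_n) state <= IDLE;
        else        state <= next_state;
    end

    always @(*) begin
        next_state = state;            // default: stay in current state
        case (state)
            IDLE:    if (go)   next_state = RUN;
            RUN:     if (done) next_state = STOP;
            STOP:              next_state = IDLE;
            default:           next_state = IDLE;  // recover from illegal codes
        endcase
    end

    // Decoding a one-hot state is a single-bit test, not a comparator
    assign busy = state[1] | state[2];
endmodule
```

In practice many synthesis tools can re-encode an FSM automatically, so the encoding written in the RTL is often only a starting point.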

10. Reset Strategies in RTL Design

Reset strategy is one of the most important architectural decisions in RTL design. The reset strategy affects power consumption, timing analysis, design robustness, and functional safety compliance. There are two fundamental types of reset:

Reset Type	How It Works	Verilog Sensitivity List	Advantages	Disadvantages
Synchronous Reset	Reset is checked only at the active clock edge — reset must be held for at least one clock cycle	`always @(posedge clk)`	Clean STA — reset treated as data; no asynchronous paths; better for timing closure	Reset must be asserted for at least one full clock cycle; risk of missing reset if pulse too short
Asynchronous Reset	Reset immediately forces the flip-flop to its reset state regardless of the clock	`always @(posedge clk or negedge rst_n)`	Instantaneous reset; works even if clock is stopped; required in some power-domain scenarios	Introduces asynchronous timing paths (recovery/removal constraints); deassertion must be synchronised to clock to avoid metastability

Asynchronous Reset with Synchronous Deassertion (Best Practice)

The industry best practice for safety-critical RTL design is to use asynchronous assertion (reset can fire anytime) but synchronous deassertion (release is synchronised to the clock edge). This is implemented using a reset synchroniser:

Verilog – Reset Synchroniser (Best Practice)

// Reset synchroniser — 2-stage synchroniser prevents metastability
// on reset deassertion
module rst_sync (
    input  wire clk,
    input  wire async_rst_n,   // Async reset from power-on / button
    output wire sync_rst_n     // Synchronised reset to logic
);
    reg [1:0] sync_ff;

    always @(posedge clk or negedge async_rst_n) begin
        if (!async_rst_n)
            sync_ff <= 2'b00;   // Async assertion propagates immediately
        else
            sync_ff <= {sync_ff[0], 1'b1}; // Sync deassertion through 2 FFs
    end
    assign sync_rst_n = sync_ff[1];
endmodule

11. Clock Domain Crossing (CDC) in RTL Design

Clock Domain Crossing (CDC) is one of the most critical and error-prone aspects of RTL design. A CDC occurs whenever a signal crosses from logic clocked by one clock domain to logic clocked by a different, unrelated (or asynchronous) clock domain. If not handled correctly, CDC violations cause metastability – a state where a flip-flop’s output is neither a clean logic 0 nor a logic 1 – which leads to unpredictable, intermittent chip failures that are extremely difficult to debug.

Types of CDC Signals and Their Solutions

Signal Type	CDC Technique	Description
Single-bit control signal	2-FF Synchroniser	Two back-to-back flip-flops in the destination domain – the standard synchroniser for single-bit level signals that stay stable long enough to be sampled (more than one destination clock period)
Single-bit pulse	Pulse synchroniser / Toggle synchroniser	Convert pulse to toggle, synchronise toggle, detect edge in destination domain
Multi-bit data bus	Async FIFO (Gray-coded pointers)	FIFO with independent read and write clocks, Gray-coded pointers synchronised across domains – most robust solution
Multi-bit control bus	Handshake protocol	Request/acknowledge handshake with synchronised enable signals
Multi-bit near-static data	MUX synchroniser / Enable pulse	Data changes only when a synchronised enable/sample pulse is asserted
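The async FIFO row relies on Gray-coded pointers because consecutive Gray values differ in exactly one bit: a pointer sampled mid-transition resolves to either the old or the new value, never to an unrelated one. A minimal sketch of the binary-to-Gray conversion follows; the module name `bin2gray` is hypothetical, and a real FIFO would normally register the Gray value in the source domain before synchronising it.

```verilog
module bin2gray #(
    parameter WIDTH = 4
) (
    input  wire [WIDTH-1:0] bin,
    output wire [WIDTH-1:0] gray
);
    // Adjacent binary values map to Gray codes that differ in exactly one bit
    assign gray = bin ^ (bin >> 1);
endmodule
```

For example, binary 3 and 4 (`0011` and `0100`) differ in three bits, while their Gray codes `0010` and `0110` differ in only one.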

Verilog – 2-FF Synchroniser (Single Bit CDC)

module sync_2ff (
    input  wire clk_dst,   // Destination clock
    input  wire rst_n,
    input  wire d_src,    // Signal from source clock domain
    output wire d_dst     // Synchronised output in destination domain
);
    reg ff1, ff2;

    always @(posedge clk_dst or negedge rst_n) begin
        if (!rst_n) begin
            ff1 <= 1'b0;
            ff2 <= 1'b0;
        end else begin
            ff1 <= d_src;    // First FF: may go metastable
            ff2 <= ff1;      // Second FF: gives ff1 a full cycle to settle
        end
    end
    assign d_dst = ff2;   // Clean synchronised signal
endmodule
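A 2-FF synchroniser can miss a single-cycle pulse from a faster source clock. The toggle synchroniser listed in the table above avoids this by turning each pulse into a level change. The sketch below is one possible form; the module and signal names are hypothetical, and it assumes source pulses are spaced at least a few destination clock cycles apart.

```verilog
module pulse_sync (
    input  wire clk_src, rst_src_n,
    input  wire clk_dst, rst_dst_n,
    input  wire pulse_src,   // single-cycle pulse in the source domain
    output wire pulse_dst    // single-cycle pulse in the destination domain
);
    reg       toggle_src;    // flips once per source pulse
    reg [2:0] sync_dst;      // [1:0] = 2-FF synchroniser, [2] = previous sample

    // Source domain: convert the pulse into a level change
    always @(posedge clk_src or negedge rst_src_n) begin
        if (!rst_src_n)     toggle_src <= 1'b0;
        else if (pulse_src) toggle_src <= ~toggle_src;
    end

    // Destination domain: synchronise the level, keep one extra sample
    always @(posedge clk_dst or negedge rst_dst_n) begin
        if (!rst_dst_n) sync_dst <= 3'b000;
        else            sync_dst <= {sync_dst[1:0], toggle_src};
    end

    // Any change of the synchronised level marks one source pulse
    assign pulse_dst = sync_dst[2] ^ sync_dst[1];
endmodule
```

Each source pulse yields exactly one destination-clock-wide pulse, as long as the spacing assumption above holds.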

12. Parameterisation and Scalable RTL Design

Writing parameterisable RTL is a hallmark of professional RTL design. A parameterisable module can be reused across multiple projects, multiple chips, and multiple configurations without rewriting the code. Using parameter and localparam in Verilog (or generic in VHDL) to define bus widths, depths, and other configuration values is a fundamental best practice.

Verilog – Parameterisable FIFO Controller

module fifo_ctrl #(
    parameter DATA_WIDTH = 8,   // Configurable data width
    parameter FIFO_DEPTH = 16,  // Configurable depth
    parameter ADDR_WIDTH = 4    // log2(FIFO_DEPTH)
) (
    input  wire                  clk, rst_n,
    input  wire                  wr_en, rd_en,
    input  wire [DATA_WIDTH-1:0] wr_data,
    output reg  [DATA_WIDTH-1:0] rd_data,
    output wire                  full, empty
);
    reg [DATA_WIDTH-1:0] mem [FIFO_DEPTH-1:0];
    reg [ADDR_WIDTH:0]   wr_ptr, rd_ptr;  // Extra bit for full/empty detect

    assign full  = (wr_ptr == {~rd_ptr[ADDR_WIDTH], rd_ptr[ADDR_WIDTH-1:0]});
    assign empty = (wr_ptr == rd_ptr);

    always @(posedge clk or negedge rst_n) begin
        if (!rst_n) begin
            wr_ptr <= {(ADDR_WIDTH+1){1'b0}};  // plain Verilog; '0 is SystemVerilog-only
            rd_ptr <= {(ADDR_WIDTH+1){1'b0}};
        end else begin
            if (wr_en && !full) begin
                mem[wr_ptr[ADDR_WIDTH-1:0]] <= wr_data;
                wr_ptr <= wr_ptr + 1'b1;
            end
            if (rd_en && !empty) begin
                rd_data <= mem[rd_ptr[ADDR_WIDTH-1:0]];
                rd_ptr  <= rd_ptr + 1'b1;
            end
        end
    end
endmodule
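In the FIFO above, `ADDR_WIDTH` has to be kept consistent with `FIFO_DEPTH` by hand. One option, assuming the tool supports the Verilog-2005 `$clog2` function, is to derive the width from the depth. The sketch below applies the idea to a simple modulo counter; the module name `mod_counter` and its ports are hypothetical.

```verilog
module mod_counter #(
    parameter MODULO = 10                        // counts 0 .. MODULO-1, assumes MODULO >= 2
) (
    input  wire                      clk, rst_n, en,
    output reg  [$clog2(MODULO)-1:0] count,      // width derived, not hand-typed
    output wire                      wrap        // high while count is at its last value
);
    localparam WIDTH = $clog2(MODULO);

    assign wrap = (count == MODULO - 1);

    always @(posedge clk or negedge rst_n) begin
        if (!rst_n)
            count <= {WIDTH{1'b0}};
        else if (en)
            count <= wrap ? {WIDTH{1'b0}} : count + 1'b1;
    end
endmodule
```

An instance such as `mod_counter #(.MODULO(3)) u_cnt (...)` gets a 2-bit counter without any other parameter having to change.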

13. RTL Coding Guidelines and Best Practices

Professional RTL design follows strict coding guidelines that ensure the code is readable, maintainable, synthesisable, and produces predictable hardware. The following guidelines are used by top semiconductor companies and are also enforced by RTL lint tools like Synopsys SpyGlass and Mentor Questa Lint.

#	Guideline	Reason
1	Always use `<=` for sequential and `=` for combinational	Prevents sim-synth mismatches
2	Always use `@(*)` (or `always_comb`) for combinational logic	Prevents simulation-synthesis mismatches caused by incomplete sensitivity lists
3	Always include `default` in case statements	Prevents latch inference
4	Always cover all branches in if-else for combinational	Prevents latch inference
5	Never use delays (#) in RTL	Not synthesisable
6	Never use `initial` blocks in synthesisable RTL	Not synthesisable (except FPGA RAM init)
7	Use `parameter` for all constants – no magic numbers	Readability and reusability
8	One clock domain per always block – never mix clocks	Prevents CDC issues
9	Name all flip-flops and state signals descriptively	Readability and debug
10	Add file header with author, date, description, revision	Documentation standard
11	Separate combinational and sequential logic into different always blocks	Cleaner synthesis, easier debug
12	Use consistent naming conventions (e.g., `_n` suffix for active-low signals)	Team readability
13	Avoid `full_case` / `parallel_case` synthesis pragmas – use an explicit `default`	Avoids simulation-synthesis mismatches and keeps code portable across tools
14	Keep module size manageable – < 500 lines per module	Synthesis and debug efficiency
15	Use hierarchical design – break large functions into sub-modules	Enables incremental synthesis and team collaboration

14. RTL Lint Checks and Static Analysis

Before any RTL design is handed to the verification team or synthesis tool, it must pass lint analysis – an automated static analysis that checks the code for coding guideline violations, potential simulation-synthesis mismatches, and logic correctness issues. Lint is the first quality gate in the VLSI design flow after RTL coding.

Industry RTL Lint Tools

Synopsys SpyGlass: The industry-leading RTL lint tool – checks for hundreds of coding quality rules, CDC issues, and synthesis issues
Mentor Questa Lint (Siemens): Comprehensive lint and CDC analysis
Cadence HAL: Cadence’s RTL analysis tool
Verilator (open source): Fast Verilog linter and simulator – excellent for open-source RTL design

Common RTL Lint Violations

Violation	Severity	Cause	Fix
Latch inferred	🔴 Error	Incomplete always block coverage	Add default assignment or else clause
Incomplete sensitivity list	🔴 Error	`always @(a)` instead of `always @(*)`	Use `always @(*)` or `always_comb`
Blocking in sequential block	🔴 Error	`=` used inside `always @(posedge clk)`	Change to `<=`
Multiple drivers on net	🔴 Error	Two always blocks drive same signal	Merge into single always block
Bit-width mismatch	🟡 Warning	Assigning 8-bit to 4-bit without truncation intent	Explicit width matching or truncation
Undriven output	🟡 Warning	Output port never assigned, or assigned only under some conditions	Drive the output on every path; assign a default value
Unsynthesisable construct	🔴 Error	`#delay` or `initial` in RTL	Remove or move to testbench
CDC violation	🔴 Error	Multi-bit bus crossing clock domain without sync	Add async FIFO or handshake
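
As a rough illustration of how several fixes from the table look together in code, the sketch below uses a default assignment to avoid a latch, `always @(*)` for a complete sensitivity list, explicit part-selects for truncation, and `<=` in the clocked block with a single driver per signal. The module and signal names (`lint_clean_example`, `sel`, `wide_in`, `narrow_q`) are hypothetical and chosen only for this example; real lint rule sets differ between tools.

```verilog
// Hypothetical module illustrating lint-friendly coding; all names are assumptions.
module lint_clean_example (
  input  wire       clk,
  input  wire       rst_n,     // active-low synchronous reset
  input  wire [1:0] sel,
  input  wire [7:0] wide_in,
  output reg  [3:0] narrow_q
);
  reg [3:0] narrow_d;

  // Combinational block: @(*) gives a complete sensitivity list, and the
  // default assignment means no path leaves narrow_d unassigned (no latch).
  always @(*) begin
    narrow_d = 4'd0;
    case (sel)
      2'b00:   narrow_d = wide_in[3:0];  // explicit truncation of the 8-bit input
      2'b01:   narrow_d = wide_in[7:4];
      2'b10:   narrow_d = 4'hF;
      default: narrow_d = 4'd0;
    endcase
  end

  // Sequential block: the only driver of narrow_q, non-blocking assignments only.
  always @(posedge clk) begin
    if (!rst_n)
      narrow_q <= 4'd0;
    else
      narrow_q <= narrow_d;
  end
endmodule
```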

15. Writing RTL That Synthesises Well

Good RTL design is not just about functional correctness – it must also synthesise efficiently to meet timing, power, and area targets. The way you write RTL has a direct impact on the quality of the synthesised hardware. Here are the key principles for writing synthesis-friendly RTL:

Arithmetic Operations and Critical Path Awareness

Every arithmetic operation in RTL maps to physical logic gates. Addition, subtraction and comparison are comparatively cheap, although their delay still grows with operand width. Multiplication maps to dedicated multiplier cells or DSP blocks. Division is very slow and area-intensive – in timing-critical paths, replace it with a right shift (for unsigned division by a constant power of 2) or a lookup-table-based solution wherever possible.
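
A minimal sketch of the power-of-two case, assuming an unsigned operand: dividing by 8 is the same as shifting right by 3, which costs only wiring. The module name `div_by_8` and its port names are hypothetical. For signed operands the two are not interchangeable without a correction, because an arithmetic right shift rounds toward negative infinity whereas `/` truncates toward zero.

```verilog
// Hypothetical example: unsigned divide-by-8 written as a shift.
module div_by_8 (
  input  wire [15:0] dividend,  // treated as unsigned
  output wire [15:0] quotient
);
  // dividend / 8 for unsigned values: drop the three least significant bits.
  assign quotient = dividend >> 3;
endmodule
```

Many synthesis tools perform this reduction on their own when the divisor is a constant power of two, but writing the shift makes the intent explicit to reviewers.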

❌ Slow Critical Path

// Deep chain of operations = long critical path
always @(*) begin
  result = (a + b) * c / d + e - f;
end

✅ Pipelined for Speed

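// Break into pipeline stages, one operation per stage.
// If the inputs change every cycle, c, d, e and f must also be
// delayed through matching registers to stay aligned with the data.
always @(posedge clk) begin
  stage1 <= a + b;           // Cycle 1: add
  stage2 <= stage1 * c;      // Cycle 2: multiply
  stage3 <= stage2 / d;      // Cycle 3: divide (ideally a shift or a pipelined divider)
  result <= stage3 + e - f;  // Cycle 4: final add/subtract
end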

16. Common RTL Design Mistakes and How to Avoid Them

#	Mistake	Consequence	How to Avoid
1	Using `#delay` in synthesisable RTL	Synthesis ignores delay – hardware behaves differently from simulation	Never use `#` in RTL – use SDC constraints for timing
2	Incomplete case / if-else (no default)	Latch inferred – RTL lint error, timing analysis failure	Always include `default` in case, `else` in if for combinational
3	Using blocking assignment in sequential logic	Race conditions – simulation passes but silicon fails	Always use `<=` in sequential always blocks
4	Missing CDC synchroniser on cross-domain signals	Metastability – intermittent failures in silicon	Always use 2-FF synchroniser or async FIFO for CDC
5	Gating the clock in RTL (using logic to control clock)	Clock glitches → data corruption; timing analysis becomes very difficult	Use clock enable signals with ICG (Integrated Clock Gating) cells
6	Reset fan-out too high (one reset driving millions of FFs)	Reset timing violations; flip-flops leave reset on different clock cycles	Let synthesis and place-and-route build a buffered reset tree
7	Combinational loops (output feeds directly back to input with no register)	Simulation glitching, oscillation; timing loops that STA cannot analyse properly	Add a register to break every feedback path
8	Unintentional X propagation (uninitialised signals)	X-optimism – simulation passes but masks real bugs	Initialise all flip-flops via reset; use SVA X-checks
9	Over-complicated RTL in single module	Slow synthesis, hard to debug, poor reusability	Break into hierarchical sub-modules
10	Not running lint before handoff	RTL with latches, CDC issues, coding violations reaches synthesis	Make a clean lint run (e.g., SpyGlass) a mandatory RTL handoff criterion
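
To illustrate mistake 5, here is a minimal sketch of a register that uses a clock enable instead of a hand-gated clock such as `assign gclk = clk & en;`. The module name `enable_reg`, the `WIDTH` parameter and the port names are hypothetical.

```verilog
// Hypothetical register with a clock enable; all names are assumptions.
module enable_reg #(
  parameter WIDTH = 8
) (
  input  wire             clk,
  input  wire             rst_n,  // active-low synchronous reset
  input  wire             en,     // load enable, used instead of gating clk
  input  wire [WIDTH-1:0] d,
  output reg  [WIDTH-1:0] q
);
  // The clock pin always sees the ungated clk; en only selects whether
  // the register loads new data or holds its current value.
  always @(posedge clk) begin
    if (!rst_n)
      q <= {WIDTH{1'b0}};
    else if (en)
      q <= d;
  end
endmodule
```

When clock-gating insertion is enabled, synthesis tools can typically map this enable pattern onto ICG cells, so the power saving is obtained without glitch-prone clock logic in the RTL.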

🎯 Key Takeaways – RTL Design

RTL design describes digital hardware in terms of register transfers per clock cycle – it is hardware description, not software programming
RTL code must be synthesisable – all non-synthesisable constructs (#delay, initial, $display) belong in testbenches only
Use <= (non-blocking) for sequential logic and = (blocking) for combinational – mixing causes simulation-synthesis mismatches
Always include default in case statements and else in combinational if-else to prevent latch inference
Use always @(*) for all combinational logic to ensure complete sensitivity lists
FSM coding with three separate always blocks (state register, next-state, output) is a widely recommended style for readable, synthesis-friendly FSMs
CDC signals must always be synchronised – 2-FF synchroniser for single bits, async FIFO for multi-bit buses
Use parameterised RTL modules for reusability across chips and projects
Always run RTL lint (SpyGlass) before handoff to verification and synthesis
Clean RTL = faster synthesis, better timing closure, fewer silicon respins

17. Frequently Asked Questions (FAQ)

Q1: What is RTL design in VLSI?

RTL design in VLSI stands for Register Transfer Level design. It is the process of describing a digital circuit’s functionality using Hardware Description Languages (Verilog, SystemVerilog, or VHDL) at a level of abstraction where the design is described in terms of data transfers between registers per clock cycle and the combinational logic that transforms that data. RTL design is the primary input for logic synthesis tools in the VLSI design flow and is the standard methodology for all digital IC design.

Q2: What is the difference between synthesisable and non-synthesisable Verilog?

Synthesisable Verilog constructs are those that logic synthesis tools (like Synopsys Design Compiler) can translate into actual hardware gates. These include: always blocks triggered by a clock edge or @(*), assign statements, if-else, case, for loops with fixed bounds, module instantiation, and arithmetic/bitwise operators. Non-synthesisable constructs are those used only for simulation: #delay, initial blocks (ignored by ASIC synthesis, although many FPGA tools accept them for register initialisation), $display/$monitor/$finish system tasks, force/release, fork/join, and SystemVerilog dynamic types such as classes and dynamic arrays. Using non-synthesisable constructs in RTL code causes the synthesis tool either to error out or to ignore the construct silently, so the hardware no longer matches simulation.

Q3: When should I use blocking vs non-blocking assignments in RTL design?

Use non-blocking assignments (<=) in ALL sequential (clock-edge triggered) always blocks – this ensures that all flip-flops update simultaneously at the clock edge, which is the correct hardware behaviour. Use blocking assignments (=) in ALL combinational (always @*) blocks – this ensures proper sequential evaluation of combinational logic within the block. Never mix both types in the same always block. This is arguably the single most important rule in Verilog RTL design.
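
A small sketch shows why the rule matters for sequential logic. The two-stage shift register below relies on non-blocking assignments; the module name `shift2` and its ports are hypothetical.

```verilog
// Hypothetical two-stage shift register; names are assumptions.
module shift2 (
  input  wire clk,
  input  wire d,
  output reg  q1,
  output reg  q2
);
  // Non-blocking: both right-hand sides are sampled before either register
  // updates, so q2 receives the OLD value of q1 (a true two-stage shift).
  always @(posedge clk) begin
    q1 <= d;
    q2 <= q1;
  end
  // With blocking assignments (q1 = d; q2 = q1;) q2 would take the NEW q1,
  // and the two registers would collapse into a single stage.
endmodule
```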

Q4: What causes latch inference in RTL design and how to avoid it?

A latch is inferred in RTL when a signal in a combinational always block is not assigned under all possible input conditions. The synthesis tool assumes that if a signal is not assigned, it must “remember” its previous value – which implies a latch. To avoid latch inference: (1) Always include a default assignment at the beginning of combinational always blocks. (2) Always include a default case in case statements. (3) Always include an else clause in if-else statements within combinational logic. Running RTL lint (SpyGlass) will flag all latch inference warnings.

Q5: What is Clock Domain Crossing (CDC) and why is it important in RTL design?

Clock Domain Crossing (CDC) occurs when a signal must cross from logic driven by one clock to logic driven by a different, asynchronous clock. Without proper CDC handling, the receiving flip-flop may go metastable – its output becomes indeterminate, causing unpredictable chip behaviour. CDC is handled using synchronisers: a 2-FF synchroniser for single-bit signals, and an asynchronous FIFO (with Gray-coded pointers) for multi-bit data buses. CDC violations are a leading cause of first-silicon failures and must be verified using dedicated CDC analysis tools (SpyGlass CDC, Questa CDC) before tapeout.
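
A 2-FF synchroniser for a single-bit signal can be sketched as follows. This is a minimal sketch, not a production cell: the names `sync_2ff`, `clk_dst`, `async_in`, `meta_q` and `sync_q` are hypothetical, and it assumes the reset is released synchronously to `clk_dst` and that `async_in` comes straight from a flip-flop in the source domain (no combinational glitches).

```verilog
// Hypothetical 2-FF synchroniser for ONE bit; all names are assumptions.
module sync_2ff (
  input  wire clk_dst,   // destination-domain clock
  input  wire rst_n,     // active-low reset, asynchronous assertion
  input  wire async_in,  // single-bit signal from another clock domain
  output wire sync_out   // safe to use in the clk_dst domain
);
  reg meta_q;  // first stage: may go metastable
  reg sync_q;  // second stage: gives the first a full cycle to resolve

  always @(posedge clk_dst or negedge rst_n) begin
    if (!rst_n) begin
      meta_q <= 1'b0;
      sync_q <= 1'b0;
    end else begin
      meta_q <= async_in;
      sync_q <= meta_q;
    end
  end

  assign sync_out = sync_q;
endmodule
```

This structure must not be replicated per bit for a multi-bit bus, because each bit may resolve on a different cycle; that is the case where an asynchronous FIFO or a handshake is used instead.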

Q6: What is the three-process FSM coding style?

The three-process FSM coding style separates the FSM implementation into three distinct always blocks: (1) The state register – a sequential always block that updates current_state to next_state on each clock edge. (2) The next-state logic – a combinational always block that determines the next state based on current state and inputs. (3) The output logic – a combinational always block that generates outputs based on the current state (Moore) or current state and inputs (Mealy). This style is widely recommended because it keeps state storage, next-state decisions and output decoding separate, which makes the FSM easier to read, maintain and synthesise predictably.
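
The three blocks can be sketched for a small Moore machine. The state names (`IDLE`, `RUN`, `DONE`), the module name `fsm3` and the ports `start`, `finish`, `busy` and `done` are hypothetical and serve only to show the structure.

```verilog
// Hypothetical three-process Moore FSM; all names are assumptions.
module fsm3 (
  input  wire clk,
  input  wire rst_n,   // active-low asynchronous reset
  input  wire start,
  input  wire finish,
  output reg  busy,
  output reg  done
);
  localparam [1:0] IDLE = 2'd0, RUN = 2'd1, DONE = 2'd2;
  reg [1:0] current_state, next_state;

  // (1) State register: the only sequential block
  always @(posedge clk or negedge rst_n) begin
    if (!rst_n) current_state <= IDLE;
    else        current_state <= next_state;
  end

  // (2) Next-state logic: combinational, default is to hold the state
  always @(*) begin
    next_state = current_state;
    case (current_state)
      IDLE:    if (start)  next_state = RUN;
      RUN:     if (finish) next_state = DONE;
      DONE:    next_state = IDLE;
      default: next_state = IDLE;  // recover from an unused encoding
    endcase
  end

  // (3) Output logic: Moore outputs depend on current_state only
  always @(*) begin
    busy = 1'b0;
    done = 1'b0;
    case (current_state)
      RUN:     busy = 1'b1;
      DONE:    done = 1'b1;
      default: ;  // outputs keep their default values
    endcase
  end
endmodule
```

The default assignments at the top of blocks (2) and (3) are what keep both combinational blocks free of inferred latches.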

18. Conclusion

RTL design is the bedrock of every digital chip manufactured in the world. It is the stage where hardware engineers translate architecture specifications into concrete, synthesisable hardware descriptions using Verilog, SystemVerilog, or VHDL. The quality of the RTL code written at this stage directly determines the success of everything that follows in the VLSI design flow – from how quickly verification achieves coverage closure, to how efficiently logic synthesis produces a quality netlist, to whether the physical design team achieves timing closure without costly design changes.

In this comprehensive guide, we covered all aspects of professional RTL design: the definition and abstraction levels, Verilog and VHDL language fundamentals, synthesisable vs non-synthesisable constructs, sequential and combinational logic coding, FSM coding best practices, reset strategies, CDC handling, parameterisation, coding guidelines, lint checking, and the most common RTL mistakes that cost engineers time and companies money.

The key message is this: RTL design is hardware description. Every line of code you write becomes physical silicon. Write it with the care and precision that hardware demands – not the casualness that software sometimes permits. Clean, lint-free, well-structured, synthesisable RTL code is the foundation of every successful chip.
