Building Hardware Neurons with SystemVerilog: A Parameterizable Approach Guided by an AI Prompt
Part #1
Artificial Intelligence (AI) and Machine Learning (ML) are rapidly transforming various fields. A core component of many successful ML algorithms, especially neural networks, is the neuron. This post explores implementing a basic, yet parameterizable, neuron in SystemVerilog, a hardware description language used for designing digital circuits. This implementation was guided by an AI prompt, demonstrating the potential of AI-assisted hardware design.
Unsupervised Learning and Neural Networks
Unsupervised learning empowers algorithms to find patterns in unlabeled data, autonomously identifying relationships and structures. Neural networks, inspired by biological neurons, are highly effective in unsupervised learning tasks. These networks are built from interconnected neurons, each performing a relatively simple computation.
From Prompt to Hardware: Specifying a Neuron in SystemVerilog
To create a hardware neuron, a precise specification is crucial. We used the following prompt to guide the AI generation of SystemVerilog code:
"Write SystemVerilog parameterizable RTL code for the neuron from the picture. x inputs and theta should have separately parameterized width. x inputs and theta should have a separately parameterized number of inputs. x inputs are unsigned. Internal signals and theta input parameters should be signed integers. Final output value limited to the range of -32768 to 32767, then finally transferred to the range 0 to 2*32767 and represented as a 16-bit wide signal, but the actual output is a 14-bit signal consisting of bits [15:2] of this 16-bit value."
This prompt emphasizes several key requirements:
- Parameterization: The code must be flexible, allowing adjustment of the bit widths of inputs (x) and weights (theta), and the number of inputs for each.
- Data Types: Internal calculations must use signed integers so that negative weights and negative partial sums are handled correctly. The x inputs are explicitly unsigned.
- Output Range and Bit Width: The result must be limited to the range -32768 to 32767, moved into a non-negative range held in 16 bits, and output as the 14-bit slice [15:2] of that 16-bit value.
- Matching Input Numbers: For this specific neuron implementation (performing a simple dot product), the number of x inputs and theta weights must match. 
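As a worked illustration of the output requirement, the arithmetic can be written out step by step. The symbols s, s_lim and u and the two-input example values are chosen here for illustration only. The offset of 32768 is an assumption: the prompt's "0 to 2*32767" could also be read as an offset of 32767.

```latex
\begin{aligned}
s &= \sum_{i} x_i\,\theta_i \\
s_{\text{lim}} &= \min\bigl(\max(s,\,-32768),\,32767\bigr) \\
u &= s_{\text{lim}} + 32768 \qquad \text{(assumed offset, } 0 \le u \le 65535\text{)} \\
h_x &= \lfloor u / 4 \rfloor \qquad \text{(bits [15:2] of the 16-bit } u\text{)} \\[4pt]
\text{Example: } & x = (200,\,100),\ \theta = (50,\,-30) \\
s &= 200\cdot 50 + 100\cdot(-30) = 7000 \\
u &= 7000 + 32768 = 39768,\quad h_x = \lfloor 39768/4 \rfloor = 9942
\end{aligned}
```

Dropping the two least significant bits is the same as an integer division by 4, which is why the 14-bit output tops out at 16383 when the sum saturates at 32767.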
The SystemVerilog Implementation
Based on this prompt, the following SystemVerilog code was generated:
Code snippet
Explanation:
- Parameters: X_WIDTH, THETA_WIDTH, NUM_X_INPUTS, and NUM_THETA_INPUTS provide flexibility. The number of x inputs and the number of theta inputs each have their own parameter, as the prompt requires.
- Input Matching Check: An initial block checks that the number of x inputs and the number of theta weights are equal. This is essential for this dot-product-based neuron.
- Inputs: x and theta are declared with the correct parameterized widths and number of inputs. x is unsigned, while theta is signed.
- Internal Signals: int_sum accumulates the weighted sum, limited_sum holds the sum after limiting, and shifted_sum holds the value moved into the non-negative range.
- Sequential Logic: The always_ff block describes the clocked behavior.
- Dot Product and Sign Extension: The for loop calculates the dot product.
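The signed'(x[i]) cast turns each unsigned x value into a signed operand so that the multiplication with the signed theta value is carried out as signed arithmetic. Note that signed'() only reinterprets the existing bits and does not add any, so an x value with its most significant bit set would be read as negative. Because x is unsigned, it should be zero-extended by one bit before the cast, for example signed'({1'b0, x[i]}), so that its value stays non-negative.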
- Limiting and Shifting: The sum is limited to the range -32768 to 32767 and then shifted into the non-negative range.
- Output: The 14-bit output h_x is taken from bits [15:2] of the shifted 16-bit value.
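The pieces described above could fit together roughly as in the following sketch. It is not the generated listing, only a minimal reconstruction from the explanation. The parameter and signal names follow the text, while the clk and rst_n ports, the default parameter values, the ACC_WIDTH sizing, the offset of 32768, and the split into a combinational datapath with a registered output are all assumptions made for illustration.

```systemverilog
// Hypothetical sketch reconstructed from the description; not the original code.
module neuron #(
    parameter int X_WIDTH          = 8,  // bits per unsigned x input (assumed default)
    parameter int THETA_WIDTH      = 8,  // bits per signed theta weight (assumed default)
    parameter int NUM_X_INPUTS     = 4,  // assumed default
    parameter int NUM_THETA_INPUTS = 4   // assumed default
) (
    input  logic                          clk,    // assumed port
    input  logic                          rst_n,  // assumed active-low synchronous reset
    input  logic        [X_WIDTH-1:0]     x     [NUM_X_INPUTS],
    input  logic signed [THETA_WIDTH-1:0] theta [NUM_THETA_INPUTS],
    output logic        [13:0]            h_x
);

    // Each product needs (X_WIDTH + 1) + THETA_WIDTH bits once x has been
    // zero-extended; summing N products needs $clog2(N) more bits.
    localparam int ACC_WIDTH = X_WIDTH + 1 + THETA_WIDTH + $clog2(NUM_X_INPUTS);

    logic signed [ACC_WIDTH-1:0] int_sum;      // full-precision weighted sum
    logic signed [15:0]          limited_sum;  // saturated to -32768..32767
    logic        [15:0]          shifted_sum;  // offset into 0..65535

    // The dot product only makes sense when both input counts match.
    initial begin
        if (NUM_X_INPUTS != NUM_THETA_INPUTS)
            $fatal(1, "NUM_X_INPUTS (%0d) must equal NUM_THETA_INPUTS (%0d)",
                   NUM_X_INPUTS, NUM_THETA_INPUTS);
    end

    always_comb begin
        // Dot product: prepend a zero bit so the unsigned x value stays
        // non-negative after the signed cast.
        int_sum = '0;
        for (int i = 0; i < NUM_X_INPUTS; i++) begin
            int_sum += signed'({1'b0, x[i]}) * theta[i];
        end

        // Limit (saturate) to the signed 16-bit range.
        if (int_sum > 32767)
            limited_sum = 16'sh7FFF;
        else if (int_sum < -32768)
            limited_sum = 16'sh8000;
        else
            limited_sum = 16'(int_sum);

        // Shift into the non-negative range (assumed offset of 32768).
        shifted_sum = unsigned'(limited_sum) + 16'd32768;
    end

    // Registered 14-bit output: bits [15:2] of the shifted value.
    always_ff @(posedge clk) begin
        if (!rst_n)
            h_x <= '0;
        else
            h_x <= shifted_sum[15:2];
    end

endmodule
```

Keeping the arithmetic in always_comb and registering only h_x avoids mixing blocking and non-blocking assignments in one always_ff block. A design that needs to meet tighter timing could instead pipeline the multiply and accumulate stages.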
