# Functional Bug Hunting Guide

## Goal

When completing lab assignments, you will inevitably encounter many bugs. This is perfectly normal: **"Only those who do nothing make no mistakes" (© Jason Statham)**.

It is important to develop a positive attitude toward finding bugs, since discovering them leads to improvements in your design. If you approach bug detection negatively, you will subconsciously search for bugs half-heartedly, but not seeing bugs does not mean they are not there.

With the right mindset, debugging can turn into an exciting detective investigation, where you have a "crime scene" (an observed behavioral discrepancy, which is usually not the bug itself but its consequence, like ripples on the water) and a "set of clues" (log fragments, source code). Step by step, you will unravel what seems like an impenetrable web, uncovering new clues that lead to the true root cause.

This document is a hands-on guide to finding such bugs in **SystemVerilog** code.

> [!IMPORTANT]
> Note: throughout this guide, the term "waveform" refers to the signal timeline display in the Vivado simulator.

- [Functional Bug Hunting Guide](#functional-bug-hunting-guide)
  - [Goal](#goal)
  - [Bug Hunting Algorithm](#bug-hunting-algorithm)
  - [Working with the Log When Errors Appear](#working-with-the-log-when-errors-appear)
  - [Locating the Bug on the Waveform](#locating-the-bug-on-the-waveform)
  - [Opening the Source File of the Problematic Signal](#opening-the-source-file-of-the-problematic-signal)
  - [Adding Object Signals to the Waveform](#adding-object-signals-to-the-waveform)
  - [Restarting the Simulation and Setting Simulation Time](#restarting-the-simulation-and-setting-simulation-time)
  - [Fixing Signals in the Z-State](#fixing-signals-in-the-z-state)
  - [Tracing the Bug Through the Signals Driving the Problematic Signal](#tracing-the-bug-through-the-signals-driving-the-problematic-signal)
  - [Fixing the Logic of the Problematic Signal](#fixing-the-logic-of-the-problematic-signal)
  - [The Problem of Undeclared Signals](#the-problem-of-undeclared-signals)
  - [Independent Exercise](#independent-exercise)

## Bug Hunting Algorithm

1. The process usually starts with a message in the test log (nobody manually inspects a waveform of a complex project with thousands of signals changing millions of times per second), but in our lab assignments, with their relatively simple modules, this step may sometimes be skipped. The log message typically contains the following key information: the name of the signal that received an incorrect value, and the time at which this occurred. The better the testbench is written, the more useful information the message will contain; writing a good testbench is something of an art.
2. Having the signal name and timestamp, we go to the waveform and investigate the error. How do we do this? We need to determine from the code which signals drive our signal of interest and how. There are several possibilities:
   1. The driving signals have correct values, but the logic by which they drive the target signal is wrong, causing it to receive an incorrect value.
      This is the ideal case: we immediately identify the root cause and fix it.
   2. The driving logic is correct, but one or more of the driving signals has an incorrect value (let's call that signal `X`). This means the observed discrepancy is a symptom of some other error, and we must return to step 2, now investigating the sources of signal `X`. This repeats until we reach case 1.
   3. Both the driving logic and the values of the driving signals are correct. This is the most complex type of error: it implies either a bug in the specification of the device being developed, or a problem in the EDA tool or its components. In the context of this course, you should not be concerned about such errors; if they arise, consult your instructor (after confirming that the error definitely does not fall under cases 1 or 2).
   4. Any combination of the above.
3. Once the root cause is identified, we fix it (possibly extending the test suite or revising the specification) and rerun all tests to verify two things:
   1. The bug is actually fixed.
   2. The fix did not introduce new bugs.

Let us practice these steps by debugging errors in the [project](./vector_abs/) that computes the approximate magnitude of a vector, as described in the "[Project Manager](./03.%20Project%20manager.md)" document.

## Working with the Log When Errors Appear

After running the simulation, we see multiple errors in the log:

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_01.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_01.png)

_Figure 1. Example of test error messages._

When faced with many errors, always start with the very first one, since it may be the cause of all the others. Scroll the log to find the first error:

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_02.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_02.png)

_Figure 2.
Example of a specific test error._

The log states that at time `5ns`, the circuit received vector coordinates of `0` and `0`, the reference model computed a vector magnitude of zero, while the circuit returned `x`.

## Locating the Bug on the Waveform

Let us find this location on the waveform. Immediately after a simulation run, the waveform typically shows the point where the simulation stopped (possibly at an inconvenient zoom level). First, adjust the zoom so that the entire waveform fits in the window. This can be done by right-clicking in the signal display area and selecting "Full View", by clicking the corresponding button on the waveform toolbar (see _Fig. 4_), or by pressing `Ctrl+0`. Then find the approximate location near the time of interest, place the cursor there, and zoom in (scroll with the mouse wheel while holding `Ctrl`), periodically adjusting the cursor position until you reach the location of interest.

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_03.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_03.png)

_Figure 3. Example of the waveform immediately after the simulation stops._

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_04.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_04.png)

_Figure 4. Example of fitting the entire waveform into the current window._

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_05.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_05.png)

_Figure 5. Example of the waveform after adjusting the zoom._

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_06.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_06.png)

_Figure 6. Placing the cursor at the start of simulation so that zooming in converges toward the beginning._

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_07.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_07.png)

_Figure 7. Waveform zoomed in to the error time from Fig. 2._

We see exactly the information the testbench reported.
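
For context, a check that reports this kind of information (time, inputs, expected and actual value) could be sketched roughly as follows. This is only an illustration: the module, function, and task names (`tb_check_sketch`, `ref_abs`, `check_result`) and the message wording are assumptions, and the actual testbench in the project may be written differently. The sketch uses `!==` rather than `!=`, because a `!=` comparison against a value containing `x` evaluates to `x`, which an `if` treats as false, so an output stuck in the X-state would pass unnoticed.

```Verilog
// Hypothetical sketch; not the testbench shipped with the project.
module tb_check_sketch;
  logic [31:0] a, b, res;

  // Reference model of the approximation max + min/2 in integer arithmetic.
  function automatic logic [31:0] ref_abs(input logic [31:0] x,
                                          input logic [31:0] y);
    logic [31:0] mx, mn;
    mx = (x > y) ? x : y;
    mn = (x > y) ? y : x;
    return mx + (mn >> 1);
  endfunction

  // Compares the observed result with the reference and logs a mismatch.
  task automatic check_result();
    logic [31:0] expected;
    expected = ref_abs(a, b);
    // !== also flags x/z bits, which != would let through.
    if (res !== expected)
      $error("time=%0t: a=%0d b=%0d expected=%0d got=%0d",
             $time, a, b, expected, res);
  endtask

  initial begin
    a = 0; b = 0; res = 'x;  // mimic an output stuck in the X-state
    #5 check_result();       // reports a mismatch
    res = 0;
    #5 check_result();       // matches the reference, nothing is logged
    $finish;
  end
endmodule
```
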
Now we need to determine what is causing the X-state. This can happen for several reasons, including:

1. One of the signals driving this signal is in the `X` or `Z` state.
2. Two signals are simultaneously trying to drive the target signal to different values.
3. This signal is a module output but was declared with the `input` keyword.

## Opening the Source File of the Problematic Signal

In any case, the first step is to identify the source that drives the value of signal `res`. Open the source file where this signal is defined. To do so, right-click the signal name in the waveform and select `Go To Source Code`:

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_08.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_08.png)

_Figure 8. Navigating to the declaration of the "problematic" signal._

The code shown in _Listing 1_ will open (with the cursor on the line `logic [31:0] res;`):

```Verilog
module tb_vector_abs();

  logic [31:0] a;
  logic [31:0] b;
  logic [31:0] res;

  vector_abs dut(
    .x(a),
    .y(b),
    .abs(res)
  );
//...
```

_Listing 1. Beginning of the simulated testbench code._

Selecting `res`, we see it is also highlighted in the line `.abs(res)`. This means our wire is connected to the `abs` port of the `dut` instance of module `vector_abs`, and the problem is of the second type: there is no error in the logic of wire `res` itself; it received an incorrect value because that value was passed to it from inside the instance. This can be confirmed by pulling the signals of module `vector_abs` onto the waveform. To do this, switch to the `Scope` window, which shows the hierarchy of simulated objects.

## Adding Object Signals to the Waveform

> [!IMPORTANT]
> Note that the `Scope` window hierarchy shows instance names, not module type names. In the code listing above, we created an instance of module `vector_abs` named `dut`, so inside the top-level module we see the object `dut` (not `vector_abs`) in the `Scope` hierarchy. The same applies to all nested instances.

Select the `dut` object.
The `Objects` window on the right will display all internal signals (ports, internal wires, and registers) of the `dut` instance:

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_09.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_09.png)

_Figure 9. Displaying the internal signals of the module under test._

We can already see that the `abs` output (which is connected to our wire `res`) is in the X-state, but for the sake of practice, let us walk through how to add new signals to the waveform. There are two ways:

1. Add all signals visible in the `Objects` window from the `Scope` window: either drag the desired instance onto the waveform while holding the left mouse button, or right-click the instance and select `Add to Wave Window`.
2. Add individual signals from the `Objects` window: select them (multiple selection via `Shift` or `Ctrl` modifiers), then either drag them onto the waveform or add them via right-click.

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_10.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_10.png)

_Figure 10. Adding module signals to the waveform._

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_11.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_11.png)

_Figure 11. Result of adding module signals to the waveform._

As a project grows in complexity, the number of signals on the waveform increases, which makes signal grouping important. To group signals together, first select them. This can be done in two ways:

1. Left-click each signal of interest while holding `Ctrl`.
2. For a range of signals, click the signal at one end, then hold `Shift` and click the signal at the other end of the range.

After selecting, right-click the highlighted signals and choose `New Group` from the bottom of the context menu.

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_12.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_12.png)

_Figure 12.
Example of creating a signal group (the context menu has been cropped for clarity)._

After creating the group, assign it a name. When all signals belong to the same module, it is convenient to name the group after that module.

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_13.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_13.png)

_Figure 13. Example of a created signal group._

The group can be collapsed and expanded by clicking the corresponding arrow to the left of the group name.

> [!IMPORTANT]
> Notice that some signals display a value (the `abs` signal shows an X-state) while others show nothing. This is because wire `abs` is **continuously connected** to wire `res`. From the simulator's perspective, they are the same entity, and when the simulator recorded values for `res` during simulation, it implicitly recorded them for `abs` as well. This does not apply to the other signals that were not present on the waveform during the simulation run.

## Restarting the Simulation and Setting Simulation Time

To obtain the missing values of the newly added signals, we need to repeat the simulation. To do so, reset the simulation time to 0 and run it again. Click the `Restart` button (`|◀`) on the simulation toolbar, then click `Run all` (`▶`) or `Run for` (`▶t`). The button positions in the Vivado window are shown in _Fig. 14_.

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_14.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_14.png)

_Figure 14. Location of the simulation control buttons in the Vivado window._

Simulation control toolbar buttons:

1. `Restart`, keyboard shortcut: `Ctrl+Shift+F5`;
2. `Run all`, keyboard shortcut: `F3`;
3. `Run for`, keyboard shortcut: `Shift+F2`;
4. `Relaunch Simulation`.

`Run for` runs the simulation for the specified amount of time, after which the simulation pauses. The simulation can also be stopped manually or by calling the appropriate instruction from the test code.
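
In SystemVerilog, the instruction in question is typically one of the system tasks `$stop` (pause the simulation so that it can be resumed) or `$finish` (end the simulation). A minimal sketch, with a hypothetical module name and delays chosen purely for illustration:

```Verilog
// Hypothetical sketch of stopping a simulation from test code.
module tb_stop_sketch;
  initial begin
    #5;
    $display("time=%0t: pausing", $time);
    $stop;    // pauses here; Run all / Run for continues from this point
    #10;
    $display("time=%0t: finishing", $time);
    $finish;  // ends the simulation run
  end
endmodule
```
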
`Run all` differs from `Run for` in that it runs indefinitely and stops only when manually interrupted or when the appropriate instruction is called from the test code.

> [!IMPORTANT]
> To populate the missing values for newly added signals, it is best to follow the procedure described above. A similar result can be achieved by clicking `Relaunch Simulation`, but this command takes longer and is unnecessary if you have not modified any source files.

Additionally, to prevent the cursor and log from jumping far away from the first error, you can specify the desired simulation time before clicking `Run for`: `5ns`.

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_15.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_15.png)

_Figure 15. Example of simulating 5 ns._

_Fig. 16_ shows the simulation result with the new signals.

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_16.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_16.png)

_Figure 16. Result of re-running the simulation after adding new signals to the waveform._

We see two signals in the Z-state and one signal in the X-state. Signals in the Z-state are usually the easiest to fix, as they typically indicate a forgotten or incorrect wire connection. Furthermore, a signal that depends on a Z-state signal may itself end up in an X-state, so fixing the Z-state issue might resolve our problem.

Let us inspect wires `min` and `min_half`. Start with `min` and go to step 2 of our algorithm (right-click and select `Go To Source Code`):

```Verilog
module vector_abs(
  input  logic [31:0] x,
  input  logic [31:0] y,
  output logic [31:0] abs
);

  logic [31:0] min;
  logic [31:0] min_half;

  max_min max_min_unit(
    .a(x),
    .b(y),
    .max(max),
    .min(min)
  );
//...
```

## Fixing Signals in the Z-State

We can see that signal `min` is connected to the `min` output of the `max_min_unit` instance of module `max_min`. Let us add the signals of this module to the waveform.
To do so, expand the list of objects inside the `dut` instance in the `Scope` hierarchy and select the `max_min_unit` object.

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_17.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_17.png)

_Figure 17. Adding signals from a submodule to the waveform._

Add the internal signals to the waveform, group them under the name `max_min`, and re-run the simulation.

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_18.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_18.png)

_Figure 18. Result of adding and grouping signals from the `max_min` submodule._

Something strange happened: all internal signals of the `max_min_unit` instance are "green" (no X or Z states), yet the signal `min` connected to this module's output is in the Z-state. How is that possible?

If you look closely at the `min` signal in the Z-state, you will notice that its least significant digit is not in the Z-state but shows `0`, the same value shown by the `min` signal of the `max_min_unit` instance. Interesting. Looking even more closely at these two signals, you can see that the `min` signal of the `dut` instance is 32 bits wide, while the `min` signal of the `max_min_unit` instance is only 4 bits wide. This is the problem: we connected the 4-bit `min` signal to the lower 4 bits of the 32-bit `min` signal, leaving the remaining 28 bits unconnected.

Apparently, when the `max_min` module was written, the width of the `min` signal was specified incorrectly: `3` was written instead of `31`. Let us fix this and re-run the simulation.

> [!IMPORTANT]
> Note that since we modified the source code, this time we must click `Relaunch Simulation` to trigger recompilation of the project.

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_19.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_19.png)

_Figure 19. Simulation result after fixing the bit-width of signal `min`._

The log now reports 102 errors, exactly one fewer than before.
This does not mean there are 102 bugs remaining; it simply confirms that fixing this particular bug actually changed something, and one test scenario that previously failed now passes. Keep in mind that when a project has many bugs, some bugs may mask the effects of others (two wrongs can sometimes make a right), so be cautious about relying on the error count while it is greater than zero.

Let us look at the waveform again and decide on the next steps:

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_20.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_20.png)

_Figure 20. Waveform after fixing the bit-width of signal `min`._

We see that no signals remain in the X or Z state, meaning we have collected all the "low-hanging fruit" in our investigation. Let us return to the scene of the crime and look for new clues:

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_21.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_21.png)

_Figure 21. First error in the new simulation log._

## Tracing the Bug Through the Signals Driving the Problematic Signal

The first error in the log is now different from before. Previously, the first incorrect result appeared at time `5ns` with inputs `0` and `0`; now the first error occurs at `10ns` with inputs `1` and `1`. Our circuit computes the result as `3`, while the reference model says it should be `1`. Let us compute it manually to verify the model.

To approximate the Euclidean magnitude of a vector (i.e., the hypotenuse of a right triangle, equal to the square root of the sum of the squares of its legs), we use the formula `sqrt(a^2 + b^2) ≈ max + min/2`, where `max` and `min` are the larger and smaller of the pair, respectively [**Richard Lyons: Understanding Digital Signal Processing, p. 475**].

Substituting our values (since both numbers are equal, it does not matter which is `max` and which is `min`):

```text
1 + 1/2 = 1.5
```

So neither the model nor the circuit is correct?
Actually, our device supports only integer arithmetic, so the result is:

```text
1 + 1/2 = 1 + 0 = 1
```

The model correctly accounted for this property of our device and produced the correct result. So we need to look at how the result is computed inside our circuit. Let us inspect the `abs` output in module `vector_abs`:

```Verilog
assign abs = max + min_half;
```

The `abs` output depends on two internal signals: `max` and `min_half`. According to our algorithm, the problem is either in the logic connecting these two signals (the addition operation), in the value of one of them, or in a combination of both.

Examining the module, we conclude that the assignment logic is correct: it implements `max + min/2` by adding the maximum to half the minimum. So the problem must be in the value of one (or both) of these signals. Let us compute the expected values of `max` and `min_half` ourselves (in a complex project, this would be done by the reference model): `1` and `0`. Now let us check the actual values of `max` and `min_half` at time `10ns`.

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_22.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_22.png)

_Figure 22. Values of signals `max` and `min_half` at time `10 ns` (signals of interest highlighted in green)._

> [!IMPORTANT]
> Note: you can change the colors of waveform signals through the context menu of the selected signals.

We see that at time `10 ns`, the values of `max` and `min_half` transition as `1 -> 4` and `2 -> 8` respectively. We are interested in the values `1` and `2`, since at time `10ns` the circuit's output still holds the settled result for the previous inputs (the output for the new inputs has not yet been computed). The value `max=1` matches the expected value, while `min_half=2` clearly does not. We have identified the cause of the incorrect result: indeed, `1+2=3`. Now we need to locate the bug in the computation of signal `min_half`.

As with signal `abs`, we need to identify the signals that drive `min_half`.
This signal is connected to the `quotient` output of the `half_divider` module, so let us inspect its source code:

```Verilog
module half_divider(
  input  [31:0] numerator,
  output [31:0] quotient
);

  assign quotient = numerator << 1'b1;

endmodule
```

What does this module do? It receives a value and divides it by two. The minimum value from our formula is fed to its input.

The output of this module depends on the `numerator` input and a left-shift-by-1 operation. So the problem is either in the logic or in the value being fed to the input. Let us add the `numerator` signal to the waveform and check its value at time `10ns`.

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_23.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_23.png)

_Figure 23. Value of signal `numerator` at time `10 ns`._

We recall that when the circuit started producing an incorrect result, the inputs were `1` and `1`, so `numerator` received the correct value: the minimum of the two numbers is indeed `1`. Let us now check the module's logic.

## Fixing the Logic of the Problematic Signal

Division is a very "expensive" operation in digital logic in terms of resources and critical path, so it is often avoided. In our case, we do not need general-purpose division; we only need to divide by two. In binary arithmetic, dividing a number by two is equivalent to discarding its least significant bit. You routinely do the same in decimal arithmetic when dividing by 10: you simply drop the last digit.

This is exactly why our first manual calculation differed from the model: dividing 1 by 2 gives 0.5, but discarding the last digit rounds the result down (1/2 = 0, 15/10 = 1).

How do we "discard a digit" in digital logic? We use the right-shift operation. The right-shift operator in **SystemVerilog** is `>>`. The number of bits to shift (i.e., digits to discard) is specified to the right of the operator; in our case, it is 1.

But wait: the assignment currently uses the `<<` operator.
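
The difference between the two operators is easy to check in isolation. The following is a small standalone sketch (the module name `shift_sketch` is hypothetical and not part of the project) that prints both shifts for the operand values discussed in this guide:

```Verilog
// Hypothetical sketch comparing right and left shift by one bit.
module shift_sketch;
  logic [31:0] n;

  initial begin
    n = 32'd3;
    $display("%0d >> 1 = %0d", n, n >> 1);  // right shift: 3 / 2 -> 1
    $display("%0d << 1 = %0d", n, n << 1);  // left shift:  3 * 2 -> 6
    n = 32'd1;
    $display("%0d >> 1 = %0d", n, n >> 1);  // right shift: 1 / 2 -> 0
    $display("%0d << 1 = %0d", n, n << 1);  // left shift:  1 * 2 -> 2, the value seen on min_half
  end
endmodule
```
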
That is the bug; let us fix it! Re-run the simulation.

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_24.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_24.png)

_Figure 24. Simulation result after fixing the shift operator._

One fewer error again. Do not be discouraged: the number of bugs in a project is unlikely to exceed the number of non-empty lines in the code. Let us return to the first error:

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_25.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_25.png)

_Figure 25. First error in the re-run simulation._

We have now advanced the error-free simulation time to `15 ns`. Let us start our investigation from the beginning.

Inputs `3` and `4` are applied to the circuit. The circuit thinks the result of `max + min/2` is `2`, but the model says it should be `5`. Let us compute manually:

```text
max=4
min=3
max + min/2 = 4 + 3/2 = 4 + 1 = 5
```

Once again, the model produced the correct result. Let us examine the values of the signals that form the `abs` output.

## The Problem of Undeclared Signals

By this point, the waveform likely has many signals. Remove the unnecessary ones, keeping only the internal signals of module `vector_abs` (select the unwanted signals and delete them with the `Delete` key).

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_26.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_26.png)

_Figure 26. Behavior of internal signals of module `vector_abs` on the waveform._

It is immediately apparent that signal `max` looks different from all the others: it behaves like a 1-bit signal. If all other signals are 32-bit, `max` should be as well. Let us navigate to the declaration of this signal to fix it (right-click and select `Go To Source Code`):

```Verilog
module vector_abs(
  input  logic [31:0] x,
  input  logic [31:0] y,
  output logic [31:0] abs
);

  logic [31:0] min;
  logic [31:0] min_half;

  max_min max_min_unit(
    .a(x),
    .b(y),
    .max(max),
    .min(min)
  );
//...
```

This is strange: the cursor was placed on the line `.max(max)`, whereas previously it was placed on the line where the selected signal was declared. The reason is that if we look through the file carefully, we find no declaration of this signal at all. How did we use an undeclared signal without the EDA tool reporting an error?

The [IEEE 1364-2005](https://ieeexplore.ieee.org/document/1620780) standard for Verilog permits this usage, and **SystemVerilog** inherits the rule. In such a case, the tool implicitly creates a 1-bit net with the same name, which is exactly what happened.

To fix this error, declare the signal `max` with the correct bit-width and re-run the simulation.

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_27.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_27.png)

_Figure 27. Simulation result after declaring the missing signal._

## Independent Exercise

The error count dropped to 40! We are clearly on the right track. Repeat the previous steps, returning to the first error:

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_28.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_28.png)

_Figure 28. First error in the re-run simulation._

This time the first error occurs on the same inputs, except now the circuit computes the result as `6` (previously it returned `2`). We have already confirmed that the model gives the correct result here, so let us go straight to the signals that form the output:

![../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_29.png](../.pic/Vivado%20Basics/05.%20Bug%20hunting/fig_29.png)

_Figure 29. Behavior of internal signals of module `vector_abs` on the waveform._

We can see that the value of signal `min_half`, which contributes to output `abs`, is incorrect (the minimum of `3` and `4` is `3`, and `3/2 = 1`). Looking closely, we also notice that the value of `min`, which drives `min_half`, is incorrect: it is `4`, but should be `3`.

Using the source files of the [project](./vector_abs/), try to identify the last bug we found.
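
As a side note on the undeclared-signal problem discussed earlier: one common safeguard is the `` `default_nettype none `` compiler directive, which disables implicit net creation so that an undeclared name becomes a compile-time error. Nothing in this guide indicates that the project uses it; the sketch below, with a hypothetical module `nettype_sketch`, only illustrates the idea.

```Verilog
`default_nettype none    // implicit nets are now an error

// Hypothetical module, not part of the project.
module nettype_sketch(
  // With default_nettype none, input ports need an explicit net type (wire).
  input  wire logic [31:0] x,
  output logic      [31:0] y
);

  logic [31:0] tmp;      // remove this declaration and compilation fails

  assign tmp = x;
  assign y   = tmp;

endmodule

`default_nettype wire    // restore the default for files compiled afterwards
```

Restoring the default at the end of the file keeps the directive from affecting other source files that are compiled later and may rely on implicit nets.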