
Lean-Agile Principles in IC Design, Part 2 – How to be “Minimal and Small”

In part 1 of this series, we introduced the most popular modern project management methodologies and their relevance to IC design. More importantly, we identified the most common IC design wastes and discussed ways to address them.

Let’s bring the discussion to the “agile” side. To recap, agile principles were originally intended for software development, and they put more emphasis on continuous and incremental additions to the final products. On the IC design side, Prof. Elad Alon and his team spearheaded the agile hardware movement (here is his great talk). So we will start by looking at some details in their work.

Agile equals “codify”?

Let’s address the elephant in the room: it seems “agile hardware” just means finding ways to codify the IC design process, exemplified by Chisel and BAG. Are we becoming programmers?

Chisel’s goal is to further abstract Verilog. “Abstract” perhaps isn’t the best word here. The better analogy might be that Chisel is to Verilog as Python is to C: Chisel is friendlier for building complex neural network SoCs, much as Python with PyTorch is better suited for building neural network models.

BAG offers a new paradigm for analog/mixed-signal designers, in which designers capture their design flows in Python code. The generator also comes with an automatic layout drawer, which to many is the most attractive feature in modern process technologies. A new framework called TED was published recently, an interesting read given the context of BAG. Shown below is an example of how BAG works.

So what’s wrong?

I personally have mixed feelings about all this. While I truly hope we can shorten IC design cycles to that of software, we could be going down another rabbit hole if we simply agree to codify everything.

To play devil’s advocate, here are my challenges:

  1. To be fair, BAG’s goal has always been to capture design flows, i.e., how we design, not what we design. However, there is a huge assumption here: that our current design flow is worth capturing. We might be opening up another can of worms in picking a design flow, which could lead to more up-front cost.
  2. Proposed generators tend to focus more on “well-established” circuits, like bandgaps, op amps, diff pairs, and non-state-of-the-art converters and SerDes. Given the ocean of available IPs within companies, the supposedly huge savings from using generators become diluted quickly in larger cutting-edge projects.
  3. We could be losing the visual (artistic) component, arguably the biggest benefit of the traditional IC design flow. A schematic is worth a thousand lines of code. In my humble opinion, transferring knowledge with well-drawn schematics is orders of magnitude more efficient than doing so through code.

Are we ahead of ourselves?

Here is a good article on the current state of agile hardware development: Does SoC Hardware Development Become Agile by Saying So: A Literature Review and Mapping Study. I believe frameworks like Chisel and BAG are ahead of their time. They could very well be the ultimate form of IC design, but I can’t help but feel there is a missing step in between. Factories become fully automated only after assembly line workers have first followed a manual or semi-automated process consistently. To me, the entire IC design field hasn’t really reached the assembly-line state yet.

Regardless, the agile hardware movement still provides valuable contributions to how we should approach design processes. Codifying IC designs should be the byproduct of the agile hardware movement. Combined with some principles from Eric Ries’s Lean Startup, the following sections outline what we can start doing today towards creating the final IC factories.

Minimum Viable Product (MVP) in IC design

If you haven’t heard of it yet, it’s a term made popular by the Lean Startup methodology. In contrast to launching polished products, MVPs are launched with just enough features to run experiments, test the market, and learn from feedback. A famous example is Dropbox, whose MVP was just a video demonstrating how the product was supposed to work before a working prototype was ready.

“This is madness! How can we ship MVP chips?” Well, we just have to redefine who IC designers’ customers are before applying the MVP concept. Our customer should be whoever takes our block into the next integration level. If you are building a regulator, your customer is the higher level designer who uses your regulator. If you are a mid-block owner, your customer is the integration person who puts the system together and slaps an IO ring around it. In the case of unfortunate graduate students, you are your own customer.

The FFPPA Pyramid

In order to define an IC product, five meta-requirements are usually necessary – function, feature, power, performance and area. People are most familiar with the last three (abbreviated PPA), but I always throw function and feature into the mix. Below is a pyramid depicting what I believe is the typical order of importance for these criteria, with the bottom being the most important.

The key message here is that a pyramid like this exists for every project (perhaps with a slightly different order). We can certainly shuffle around the blocks on top. Some projects might have a killer feature, while others might have non-negotiable power/area constraints. However, FUNCTION should always be the foundation (I once taped out a very energy/area-efficient noise generator when I wanted to build an ADC). After we have identified the pyramid, we can start building the MVP from the bottom up.

MVPs during design cycle

OK, actually I lied. Your MVP doesn’t even need to be functional. Recall from our top-down methodology discussion that symbols themselves already contain enough useful information and could serve as an MVP for your “customer”.

Starting with critical pin definitions is actually the first step toward a functional block. Let’s use an OTA as an example and see how its MVP form can evolve in an agile framework.

Voila! Here is your OTA MVP. You can pass it on to the higher-level block owner for drawing his/her schematics. Note that being an MVP doesn’t mean it’s a low-quality symbol. In fact, we should apply the same standard as if it were the final version.

The MVP’s next version should then have some functions to it, either with ideal models, basic transistor topologies or a mix of both. For a perfectly functional OTA, all we need is an ideal transconductor and a large resistor as its output impedance. The beauty here is that the most critical specifications are already in the model (gm, gain, etc.). MVP here becomes MSP – Minimal Simulatable Product.
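To make this concrete, here is a minimal Verilog-A sketch of such an MSP (the module and parameter names are my own, purely for illustration): an ideal transconductor loaded by an output resistance, with the critical specs exposed as parameters.

`include "disciplines.vams"

// Ideal OTA "MSP" sketch: a transconductor with finite output resistance.
// Module and parameter names are illustrative, not from any standard library.
module ota_msp(inp, inn, out);
  inout inp, inn, out;
  electrical inp, inn, out;
  parameter real gm   = 1e-3;  // transconductance [S]
  parameter real rout = 1e6;   // output resistance [Ohm]; DC gain = gm*rout
  parameter real cl   = 0;     // optional load cap [F] to set the dominant pole

  analog begin
    I(out) <+ -gm * V(inp, inn);  // ideal transconductor
    I(out) <+ V(out) / rout;      // finite output impedance to ground
    I(out) <+ cl * ddt(V(out));   // optional pole for AC/transient realism
  end
endmodule

Dropped behind the MVP symbol, a model like this already lets the higher-level block run meaningful DC, AC and transient simulations before a single transistor is sized.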

Finally, we can start building some real circuits. A natural question, then, is why we don’t jump straight to this real circuit implementation. For a simple OTA, perhaps the MVP process isn’t that necessary, but the benefits are amplified for larger-scale blocks.

The key advantage of using MVPs is that we meet just enough of the “customer’s” needs so that his/her own work can continue. Throughout the process, we maintain a version of the circuit that’s usable in schematics and simulatable to various degrees of accuracy. The same idea applies when layout efforts begin. Layout pin placement is critical for floorplanning; dummy and decap devices can be added incrementally; power-down and over-stress protection features could come in next. All these incremental additions happen according to the FFPPA pyramid that we agreed on earlier.

Tape-in and incremental additions

If you feel the example above is too simple to make a compelling point, here is an excellent project by the Berkeley Agile team for a RISC-V processor: An Agile Approach To Building RISC-V Microprocessors.

The figure below, taken from the paper, summarizes the essential difference in an agile model. Throughout the entire process, each layer creates an MVP for its own customer layer. For instance, the Design layer first delivers only feature F1 to the Implementation layer, and incrementally adds more features down the line.

My favorite idea in this example is the concept of a “tape-in”. In the authors’ own words, it’s “a trivial prototype with a minimal working feature set all the way through the toolflow to a point where it could be taped out for fabrication”. A “tape-in” is perhaps the closest thing to being a real MVP in the IC design context. In fact, I believe the sooner a project reaches the “tape-in” stage, the more likely it will succeed. The positive psychological impact of seeing a potential path to completion is huge.

Even though AMS design cycles are more manual than digital/SoC flows, the “tape-in” concept is equally applicable. We designers all have a tendency to sit behind closed doors and polish our cherished designs until the last second. However, there is no need to finish all features, include programmability, or even meet power/area budgets in one go. Rather, we could constantly push out “releases” of our designs and optimize incrementally. This requires no coding, just a shift in mindset, which is the true essence of the agile design model.

The Power of “Small Batches”

Another lesson in lean thinking is about the power of “small batches”. The line between “lean” and “agile” is a bit muddy on this one for me. Here is the entertaining example Ries used in his book:

A father and two daughters (six and nine) decided to compete on how fast they could stuff envelopes. The daughters went about it the intuitive way – label all envelopes, then stamp all envelopes, then stuff all envelopes. The rationale is sound: repetition at each task should make them more focused and efficient. On the other hand, the father’s strategy was to label, stamp, and stuff each envelope individually before moving on to the next one.

The father ended up winning, not because he is an adult, but because of the often-overlooked costs in the waterfall approach the daughters took. They miscalculated the effort required to move from one big task to the next, as well as the cost of mistakes found later in the process. Sound familiar? These are precisely the challenges we face today.

There is no real data to back up the following cartoon graph, but this could very well be what’s happening here. While the traditional “large batch” approach was good enough for older designs, it’s evident that we have already crossed the “critical N” line in today’s designs. The crossover that favors the “small batch” approach happens when the cost of hand-offs and errors starts to dominate the entire process.

Small batches in AMS design

The agile model in the RISC-V example essentially pushes features into the final design in small batches. When the design process is multi-layered in nature, small batches allow parallelism that creates almost no gap between design tasks. As a result, the same principles should be applicable to AMS designs.

Below is a typical representation of modern AMS design cycles. It has four layers of main tasks, and most of our pain points lie in the layout and post-layout verification steps. As a result, the feedback latency from post-layout results to schematic/layout modifications is too long (the red arrows). Designers’ hands are often “tied” until layout or simulation finishes. Does the code compiling comic ring any bells again?

Now if we apply the small batch principle to redraw this design cycle, it would look like the following diagram. We can break the monstrously large design cycle into several much more manageable smaller design loops.

This is easier said than done. We as designers need to develop a new skillset to create such a collection of small loops. We need to create schematic hierarchies with clean interfaces, understand dependencies among sub-cells, and most importantly have a good feel for layout and verification efforts. Like all skills, designing circuits in small batches requires constant practice, from block to block, from tape-out to tape-out. When done right, the efficiency brought by the small batch approach is so, so sweet.

Final Thoughts

I just finished a tape-out at the time of writing. Did I follow the lean/agile principles I wrote about here? If I’m being honest, no.

We are all creatures of habit, and the bad ones stick around the longest. I plead guilty to sometimes getting carried away making non-MVPs in large batches. Nevertheless, there are other times when lean/agile methodologies helped save huge amounts of effort and resources when the unexpected happened.

Like all of you, I am continuously learning and trying different ways to become a better designer. Lean and agile principles have found their roots in modern design methodologies. They have proven valuable in reducing design waste and handling change. I look forward to hearing about your design methodologies. The day when we all agree on a systematic design flow is the day we can truly start codifying IC designs.

Testbench Templates – How To Reuse and Boost Simulation Efficiency

If my recollection doesn’t fail me, this was about ten years ago. I walked into the lecture hall for EE313, full of excitement. The digital CMOS course was taught by Prof. Mark Horowitz. There has always been some “deity” status attached to the man. As a self-proclaimed analog design student at the time, I was anxious to learn what the “other side” was about.

Halfway into the lecture, Prof. Horowitz began a “sales pitch” of something called CircuitBook. He was candid about us being the guinea pigs for a testbench framework that his Ph.D. student was working on. The goal was to create a reusable analog test solution stack using only Python and SPICE. With a more software-like approach, the unified framework hoped to “hide” all the possible diverse variations in analog test environments. I distinctly remember feeling a bit confused: why would I ever want to code my testbenches for analog?

Figures taken from the CircuitBook thesis. The framework attempts to use Python for unifying almost all components in a test (top).

Years later, after I entered industry, my mind kept circling back to this moment. I obviously didn’t get the full picture as a graduate student then. However, CircuitBook has aged like fine wine the longer I work in IC design. The idea sounds better each day when I need to simulate a new circuit or look at others’ testbenches. So why didn’t it take off? Here are the reasons in the author’s own words:

We have found that one of the main challenges with the CircuitBook test framework has been convincing users to adopt the system. We believe this can be attributed to the initial learning required to be productive. The CircuitBook test framework does not significantly speed up the time required to make a new test for first time when the time to learn the framework is included. The productivity gains come from reusing the resulting test collateral for future variants. Users are often concerned more about the task at hand than future benefits, so our current test framework may not be attractive to time-constrained designers.

James Mao, “CircuitBook: A Framework for Analog Design Reuse”

No truer words have been spoken. The author went on to discuss filling the framework repository “with reusable test components”. The framework could also “provide building blocks that allow users to quickly construct tests for a particular circuit class”. The proposed strategies aim to make adoption easier. I believe what he alluded to could already be realized in schematic capture tools today – in the form of testbench templates.

Creating reusable testbench cells

As I have mentioned in a previous post, a wrapper cell might be worth creating for anything that gets used again and again. For example, supplies are perhaps the most frequently instantiated cells, but we hardly think about creating reusable supply cells. The same applies to input stimuli for DC/AC/transient simulations.

One tip is to parametrize these cells as much as possible (e.g. using pPar) to avoid creating too many variations. It certainly doesn’t provide the full flexibility of coding in Python, but it should be good enough for most testbenches.

Here we discuss a few of the most common testbench cells.

1. Supplies

For any given project, supply domains are typically agreed upon first. Many mixed-signal circuits require multiple supplies nowadays for optimal performance and power. Grounds could be separated on-chip to provide isolation between analog and digital lands. It begins to make more sense to create a dedicated reusable supply cell for all testbenches like the one below.

The cell itself is not that fancy: the simplest form involves just ideal voltage sources. Parametrization is what makes it more interesting and powerful. One example is to parametrize each source’s DC voltage for editing at the testbench’s top level.

You might spot in the figure above that the cell uses 0V voltage sources to create ground nodes. Isn’t this redundant? If the goal is simply to break the ideal ground into different names, ideal 0 Ohm resistors can also do the trick. The key here is to allow more parametrization for simulations like power supply rejection ratio (PSRR). One can parametrize the AC magnitude on the ground net with pPar(“avss_ac”), for example. We can then configure the supply cell to perform such simulations without any new setups. The same applies to other sources.

Simple models for supply impedance are the natural next step. Each supply can have an RLC network in series to mimic bond wires and package traces. Keep it fully parameterized for full flexibility.
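As a rough sketch of the behavior hiding behind such a supply cell, here is a Verilog-A version (module and parameter names are my own assumptions, not a standard cell): the DC value and an AC magnitude for PSRR-style sweeps are parameters, and a simple series R/L mimics the bond wire and package trace.

`include "disciplines.vams"

// Behavioral supply cell sketch (illustrative names only).
module supply_cell(vdd, vss);
  inout vdd, vss;
  electrical vdd, vss, vint;     // vint is the internal "ideal source" node
  parameter real vdc    = 0.9;   // nominal supply voltage [V]
  parameter real ac_mag = 0;     // AC magnitude for PSRR-style sims [V]
  parameter real r_ser  = 50e-3; // series resistance [Ohm]
  parameter real l_ser  = 1e-9;  // series (bond-wire) inductance [H]

  analog begin
    // Ideal source with an AC stimulus riding on top of the DC value
    V(vint, vss) <+ vdc + ac_stim("ac", ac_mag);
    // Simple series R + L between the ideal source and the supply pin
    V(vdd, vint) <+ r_ser * I(vdd, vint) + l_ser * ddt(I(vdd, vint));
  end
endmodule

The same trick applies to the ground sources: expose their AC magnitudes (and impedances) as parameters, and one cell covers supply and ground rejection simulations alike.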

To make it more user-friendly, remember to include default values to save time when instantiating this cell. When the supply cell is first instantiated in a testbench, its property list and default values might look like below. Note that the default values could also be strings, so some parameters can become design variables automatically.

Example parameter list and default values
2. Input stimulus

By the same token, input stimuli can also be parametrized and made more “general”. We are interested in differential inputs in most applications. My preferred way is using ideal baluns (for reasons that Ken Kundert also wrote about here).

Depicted above is one simple example of such a differential stimulus cell. For DC and AC sims, parameterizing the differential and common mode DC/AC source values and impedances should cover most cases.

Building on top of this, different variations are possible for transient simulations. I am a broadband signals guy, so a pulse stimulus (i.e. Vpulse or Vpwl) is often my first choice for simulating pulse responses. If you are a narrowband or converter person, Vsin might be your cup of tea. A differential clock can also be generated by using Vpulse and setting vdiff and vcm correctly. All of these sources still have the DC/AC fields, so they remain compatible with DC/AC simulations.
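If you ever want a single behavioral source instead of the balun-plus-sources arrangement, a sketch of a differential sine stimulus in Verilog-A might look like the following (module and parameter names are mine, purely for illustration); the vcm/vdiff split is exactly what the ideal balun does, and the AC terms keep it usable in AC sweeps.

`include "constants.vams"
`include "disciplines.vams"

// Differential sine stimulus sketch (illustrative names only).
module stimulus_sine(outp, outn);
  inout outp, outn;
  electrical outp, outn;
  parameter real vcm    = 0.5;  // common-mode level [V]
  parameter real vdiff  = 0;    // differential DC offset [V]
  parameter real amp    = 0.1;  // differential sine amplitude [V]
  parameter real freq   = 1e6;  // sine frequency [Hz]
  parameter real ac_mag = 1;    // differential AC magnitude for AC sims

  analog begin
    if (freq > 0)
      $bound_step(0.05 / freq); // keep transient time steps reasonable
    V(outp) <+ vcm + 0.5*(vdiff + amp*sin(`M_TWO_PI*freq*$abstime))
                   + ac_stim("ac", 0.5*ac_mag);
    V(outn) <+ vcm - 0.5*(vdiff + amp*sin(`M_TWO_PI*freq*$abstime))
                   + ac_stim("ac", 0.5*ac_mag, `M_PI);
  end
endmodule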

One can in theory build a much more general stimulus cell with all possible sources and an analog mux for selection, but the return on investment starts to diminish. I recommend simply creating cells for the most common inputs like stimulus_sine, stimulus_pulse, stimulus_clk, etc., and keeping the number of parameters manageable.

Below is a more elaborate version of this stimulus cell. Other features like AC-coupling and external source select can be included. Most of these features can be realized with ideal resistors and math if you don’t wish to write Verilog-A modules.

3. Probing and measuring

Testbenches are the only places where we can build perfect analog computers, so let’s take advantage of this.

Here is another awesome use of ideal baluns: they are “bi-directional” and can measure differential and common mode signals. Instead of post processing simulation results, you can put in these balun based probe elements to calculate differential and common modes during simulations.

(a) Differential and common mode to complementary signal conversion. (b) Differential and common mode measurements from complementary signals

Since impedances are also transparent through baluns (like in stimulus cells), ideal voltage buffers can help isolate the probe signals in case some loads are accidentally attached.
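For those who prefer a behavioral module over the balun-plus-buffers schematic, a minimal Verilog-A equivalent might look like this (an illustrative sketch; the names are mine). Because the probed nodes are only read and never sourced, the DUT sees no extra loading.

`include "disciplines.vams"

// Differential / common-mode probe sketch (illustrative names only).
module diff_probe(vp, vn, vdiff, vcm);
  input  vp, vn;
  output vdiff, vcm;
  electrical vp, vn, vdiff, vcm;

  analog begin
    V(vdiff) <+ V(vp) - V(vn);      // differential component
    V(vcm)   <+ (V(vp) + V(vn))/2;  // common-mode component
  end
endmodule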

So why go through all the trouble of using such a probe cell? Again, this probe element opens up a new idea: pre-process signals in simulation to simplify post-processing expressions. Some post-processing expressions can become really unreadable really fast (parenthesis nightmare, anyone?). I find the measurement expressions easier to follow when I don’t have to jump through several hoops to trace where each signal or variable is defined. Thus, don’t limit your imagination to just this simple probe cell. Start encapsulating some measurements you do repeatedly in reusable cells.

One simple but powerful example is a power meter (do you get the pun?). Try building one using a series voltage source, a current-controlled voltage source (ccvs) and a multiplier. A more complicated example is to build DACs plus a de-interleaver for combining interleaved ADC outputs. Rather than saving all ADC slices’ outputs and post-processing them in MATLAB or Python for an FFT, the combined output is already available as a saved net. Here is another benefit: one could check results during simulation to ensure the signals look healthy, and stop the run if something doesn’t look right instead of waiting until the long simulation finishes.
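As a sketch, the series-source/ccvs/multiplier construction collapses into a few lines of Verilog-A (names are my own for illustration): a 0 V series branch senses the current, and its product with the sensed voltage comes out on a probeable “power” net.

`include "disciplines.vams"

// Power meter sketch: 0 V series sense branch plus a multiplier (illustrative).
// Connect t_in/t_out in series with the DUT supply; ref is the return node.
module power_meter(t_in, t_out, ref, pmeas);
  inout  t_in, t_out, ref;
  output pmeas;
  electrical t_in, t_out, ref, pmeas;

  analog begin
    V(t_in, t_out) <+ 0;                         // ideal series "ammeter"
    V(pmeas) <+ V(t_out, ref) * I(t_in, t_out);  // p(t) = v(t)*i(t), output as a voltage
  end
endmodule

Averaging V(pmeas) over a period (or low-pass filtering it) then gives average power directly on a saved net.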

4. Deep probing

But what about nets that are inside the DUT? Luckily, we can use deepprobe to bring nets deep in the hierarchy up to the top level. Deepprobe also allows you to modify loading on the internal nets of interest. So one way to probe internal differential nets is as follows:

Deepprobes with balun based differential probe cell

Unfortunately, I haven’t found a way to wrap this into a “diff_deepprobe” cell. I am still looking for answers here, but for now this cell group works just as well.

One disclaimer here is that this shouldn’t be a full replacement for your typical save statements. Rather, one should use this probe strategically on critical nets for better readability (for your own or others’ consumption). Of course, there is a personal preference to this, but I find this approach more attractive than reading netlists and save files due to the direct visual feedback.

5. Verilog-A modules

Last but not least, you can perhaps create all of the above (and more) if you are proficient in Verilog-A. In addition to what’s already available in standard libraries, most teams might already have a separate, well-maintained Verilog-A library. They could have countless hidden gems like digital constant cells, frequency meters or bias gen models. Do spend some time studying them and/or create a couple of modules that help with your own simulation flows.
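To give a flavor, here is a sketch of what a simple frequency meter module might look like (my own illustration, not taken from any particular library): it measures the time between rising threshold crossings and outputs the reciprocal as a voltage.

`include "disciplines.vams"

// Frequency meter sketch (illustrative names only).
module freq_meter(in, fout);
  input  in;
  output fout;
  electrical in, fout;
  parameter real vth = 0.5;  // crossing threshold [V]

  real tlast, freq;

  analog begin
    @(initial_step) begin
      tlast = -1;
      freq  = 0;
    end
    @(cross(V(in) - vth, +1)) begin
      if (tlast >= 0) freq = 1.0 / ($abstime - tlast);
      tlast = $abstime;
    end
    V(fout) <+ transition(freq, 0, 1n, 1n);  // smooth the steps for the simulator
  end
endmodule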

Creating testbench templates

With an arsenal of these reusable cells (not that different from the software modules proposed in CircuitBook), we could start building testbench templates. In essence, they are the bases and starting points for certain simulations of particular circuits.

Below is what a general-purpose testbench template might look like. The template contains a supply cell, a sine wave transient input stimulus, a power meter, differential probes (including deepprobes), and a digital attribute cell. The default parameter values for each cell should require minimal changes to start some quick DC/AC simulations. The probe and digital attribute cells are there for quick usage reference, further modifications, or duplications. It’s always easier to delete/copy-paste on the same schematic sheet than to instantiate new cells. A reasonable simulation state should also be available (e.g., corner setups, signal saves and reference measurement expressions). Overall, the template should provide the essentials to shorten the time to hit that green run button.

Generic testbench template

Expanding upon this principle, we can build specialized testbench templates for well known characterizations on certain classes of circuits. Here is a template for simulating the regeneration time constant of a dynamic comparator. Input and clock sources are provided. Probe names are already filled in for the regeneration nodes. What’s cool is that the template can also have instructions for DUT instantiation (just like reading through some code comments). The simulation state should already contain critical measurement expressions based on the template (in this case some exponential time constant calculations).

Example testbench template for dynamic comparator regeneration time constant simulation
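For reference, the time-constant extraction behind those expressions is straightforward if we assume the differential output grows exponentially during regeneration, i.e., Δv(t) ≈ Δv0 · exp(t/τ). Picking two time points t1 and t2 on the growing waveform gives

τ_regen = (t2 − t1) / ln( Δv(t2) / Δv(t1) ),

where Δv is the differential voltage on the regeneration nodes (conveniently available from the probe cells above). The template’s measurement expressions simply encode this calculation on the saved nets.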

To keep the momentum going, here is a rapid-fire round of template ideas: op-amp characterization, amplifier noise, oscillator phase noise, general-purpose feedback loop stability, converter ENOB, periodic steady-state analysis, PSRR, power sequencing, … Can you think of others?

I hope the message has come across loud and clear: if we really think about the day-to-day simulations we run, we can exploit the similarities among testbenches to improve productivity. Testbench templates are pre-built schematics that give us a head start when setting up simulations. Instead of drawing a new schematic each time, the process becomes finding the right template, creating a copy, modifying, and running.

The million-dollar questions

I can already hear the skeptics yelling: is this really any better than frameworks like CircuitBook? Who should build and manage these templates? Do you seriously think we have the time to manage another library for testbench templates?

For industry designers, using testbench templates could become a way to preserve and pass along knowledge. From a student’s perspective, I believe having access to testbench templates speeds up the learning process. These templates allow one to spend more time exploring the design space than fighting for the right setup. They are also more visually direct than reading through code (a schematic is worth a thousand lines of code).

While some might argue that the struggles are part of the learning, we need to look no further than the open-source software community to see the flaws in this thinking. We have enough problems during the design phase as is. I see a direct parallel between building these testbench cells and templates and building open-source packages that are ready for use and modification.

As to who should do all the “dirty work”, the answer is always graduate students and interns 😉. Jokes aside, I think it’s a “survival of the fittest” system, in which the best cells and templates will prevail (not that different from open-source again). Many teams might already require a full library cleanup after each tape-out. The downtime between tape-outs is the perfect gap for designers to massage these templates, explore new ideas for reusable cells, and improve methodologies.

Now that we have our own version of open-source for chip design underway, readable and reusable testbench templates are just as important as the designs themselves. There is no shortage of brilliant testbench tricks and setups by our community’s gurus. We just need a more systematic and straightforward way to democratize them.
