Author: Kevin Zheng (Page 1 of 2)

System for Analog Designers, Pt. 1 – What Comes In and What Goes Out

When we hear “system” in IC design, two types normally pop into our heads – the billion- (or trillion!-) transistor chips or the PCBs that host these SoCs. To be completely honest, I never really liked the term “SoC”. It forces us to think a system must have a processor, memory, a plethora of I/Os and much more to be worthy of the name. In reality, every component inside an “SoC” is a system by itself with many interconnected sub-blocks. This is even more true in the advanced CMOS era, where mixed-signal processing and system-level co-optimization are crucial, even for a simple amplifier.

Tesla Dojo (left), Cerebras Wafer-Scale Engine (middle), 112Gbps receiver (right)

System thinking has never been an emphasis in analog design curriculums (granted, there is just too much to cover). However, this leaves designers stuck in a weird place: they often aren’t sure how the requirements came about or how their blocks fit into the system. And yet, we have all witnessed the huge benefits when a designer understands signal processing and system concepts.

The modern digital-assisted analog or analog-assisted digital paradigms call for more designers who can think deeper about the incoming signals, block interfaces and architectures. These are what I believe to be the top 3 pillars in system thinking for analog designers, which we shall explore in more detail in this post series.

The 3 pillars of system thinking

You can start practicing designing with a system mindset by asking the following 3 questions (and some sub-questions):

  1. Do I understand the nature of the signal coming into my block?
    1. Which signal characteristic is the most important?
    2. What is the worst case signal that I need to handle?
    3. Any signal characteristics that my circuit might be able to exploit?
  2. Do I understand my block’s interface?
    1. Should I use an analog or digital output interface?
    2. Is my load really a resistor or capacitor?
    3. What does my block look like to others?
  3. Do I have the right number of loops?
    1. Should I use sizing or loops?
    2. Too few or too many loops?
    3. Do any loops interfere with any important signal characteristics?

The objective here is to develop a habit of challenging the architecture and circuit requirements, even if we are just “humble block designers”. Let’s dive deeper into the first two questions here (architecture and feedback deserve a post of their own) and learn about some of the key concepts and tools at our disposal.

What am I processing?

One of the first things we are taught is the Nyquist-Shannon sampling theorem. Analog designers have this “2x curse” in the back of our heads – somehow we always need to squeeze out twice the signal bandwidth in the frequency domain. Another trap we tend to fall into is ignoring the lower frequencies (also partly due to the 2x curse). The reality is that more and more applications and architectures simply don’t follow Nyquist sampling anymore.

For example, modern wireline links operate on baud-rate sampling. Sub-Nyquist sampling is paramount in some software-defined radios (SDRs) and other compressive sensing applications. What enables these architectures is understanding the difference between signal and information bandwidths. The goal of our analog circuitry has always been to preserve or condition the information contained in the signal. Reconstructing the entire signal waveform (i.e., Nyquist sampling) is just a superset of preserving information.
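As a toy illustration of signal versus information bandwidth, here is a hedged numpy sketch (the 70 MHz tone and the 100 MS/s rate are made-up numbers, not from any specific system): a narrowband tone sampled below its Nyquist rate folds to a predictable alias frequency, so the information it carries is still recoverable.

```python
import numpy as np

# hypothetical narrowband tone: a 70 MHz signal sampled at only 100 MS/s
fs, f_sig, N = 100e6, 70e6, 1000
t = np.arange(N) / fs
x = np.cos(2 * np.pi * f_sig * t)

# fs < 2 * f_sig violates Nyquist, yet the tone folds to a predictable
# alias at |f_sig - fs| = 30 MHz with its information (amplitude,
# modulation) intact, as long as we know where to look
spec = np.abs(np.fft.rfft(x))
f_peak = np.argmax(spec) * fs / N   # alias lands at 30 MHz
```

Reconstructing the full 70 MHz waveform would demand a faster sampler; preserving the information only demands that we track the alias.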

We should begin viewing our signal conditioning blocks as Analog-to-Information Converters (AICs), a concept inspired by compressed sensing theory. I believe most problems can be reframed in the AIC context. In my own field of wired/optical communication, the overall channel’s inter-symbol interference (ISI), which in the conventional sense is bad for signal bandwidth, actually contains valuable information. A maximum-likelihood sequence estimator (MLSE) desires the right amount of ISI for the decoding algorithm to work.

Getting to know your signal

I encourage all analog designers to first grasp what information their circuits are trying to process. Below are some things to ask about the signal characteristics that impact the incoming information:

  1. Is the information carried in a broadband (e.g. wireline) or narrowband (e.g. wireless) signal?
  2. Is there a huge discrepancy between the signal bandwidth and the information bandwidth? (e.g. we only care about the long delay times between very sharp periodic pulses, like ECG signal)
  3. Is the information in the signal levels, signal transitions, or both? (e.g. level encoded like PAM vs edge encoded like Manchester code)
  4. Is there any low frequency or even DC information? (e.g. any encoding on the signal that impact low frequency content?)
  5. Is the signal/information arriving continuously or sparsely? (e.g. continuous vs. burst mode)

A fun interview question

The discussion above might sound too high-level or even philosophical to some, so let me give an interview-style example (derived from a real-world problem). Let’s say we have a transmit signal that looks like a “train of clocks” as shown below. The signal swing is relatively small and rides on a DC bias on the PCB. A huge DC blocking cap is used on board because the DC bias level is unknown. Your task is to design a receiver circuit for this clock train and preserve the clock duty cycle as much as possible.

The challenge here is a combination of the signal’s burst nature and the board level AC coupling. As a result, the chip’s input signal will have baseline wander, which is always a nuisance.

Our first attempt might be to use a comparator directly. The issue then becomes how to set the reference voltage: there is no single reference voltage that can preserve the clock duty cycle for every pulse. The next natural thought is to google all the baseline wander correction techniques out there to see if we can negate the AC coupling completely (then pull my hair out and cry myself to sleep).

Now, if we realize that the information in the clock actually lies in the edges and not the levels, there can be other possibilities. If the edges are extracted and a spike train is created like below, the new receiving circuit might be able to restore the levels from the spikes.

The simplest edge extraction circuit is actually just another AC coupling network, but the cutoff frequency needs to be high enough relative to the clock frequency. A level restorer could be conceptually a pulse triggered latch (w/ the right dosage of positive feedback). Congratulations, we just practiced Analog-to-Information conversion (high-passing to extract edges) and reconstruction (level restoration) and created a much simpler and more robust solution. In fact, the receiver would work equally well if the burst signal is PRBS like.
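A minimal behavioral sketch of this idea (all time constants, thresholds and burst lengths below are illustrative choices, not from the original problem): model the board AC coupling as a slow discrete-time high-pass to create baseline wander, extract edges with a much faster high-pass (approximated here by a first difference), and restore levels with a thresholded set/reset latch.

```python
import numpy as np

# 1) Burst "train of clocks": square-wave bursts separated by idle gaps
burst = np.tile(np.repeat([1.0, 0.0], 10), 8)
clk = np.concatenate([np.zeros(40), burst, np.zeros(60), burst, np.zeros(40)])

# 2) Board-level AC coupling (big DC block) -> baseline wander.
#    First-order high-pass: y[n] = a * (y[n-1] + x[n] - x[n-1])
a = 0.95
wander = np.zeros_like(clk)
for n in range(1, len(clk)):
    wander[n] = a * (wander[n - 1] + clk[n] - clk[n - 1])

# 3) Edge extraction: another, much faster high-pass, approximated by a
#    discrete difference -> a spike train at the clock edges
spikes = np.diff(wander, prepend=wander[0])

# 4) Level restoration: a set/reset latch with hysteresis.  The slow
#    wander never crosses the thresholds, but every edge spike does.
restored = np.zeros_like(clk)
state = 0.0
for n, s in enumerate(spikes):
    if s > 0.5:
        state = 1.0
    elif s < -0.5:
        state = 0.0
    restored[n] = state

# restored now matches clk sample-for-sample despite the baseline wander
```

Despite the heavy droop on `wander`, the edge spikes stay near ±1 while the inter-edge residue stays small, so a fixed threshold recovers the original duty cycle exactly in this toy model.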

Exploit signal nature

System thinking in analog design often requires “thinking outside the box” and leads to “easier” solutions. The first step is to understand the information that we aim to process and pinpoint what we could exploit. In the example above, we took advantage of the fact that the same information lies in the signal transitions as in the levels. This led to a solution better suited for this particular application. While we should be proud of making complicated circuits work, we should take equal pride in simpler solutions born from a better understanding of the incoming signal.

What am I driving?

After figuring out what’s coming into our blocks, we now shift the focus to where the output signal is going, or more precisely, block interfaces. One major source of frustration is when you tweak a block to perfection but trouble arises when it is plugged into the system. Either the load doesn’t behave as expected or your own block is the problematic load.

Perhaps everyone can relate to the cringe when seeing heavily Figure of Merit (FOM) engineered publications. Some new circuits are extremely power efficient provided that the input source is a $10,000 box with a wall plug. Needless to say, it’s important to fully understand our blocks’ interface so that we can design and simulate accordingly.

The impedance lies

Few lies are greater than “my block looks like/drives a pure resistor or capacitor”. While a block’s input or load impedance might look like a pure resistor/capacitor at certain frequencies, every realistic element has a frequency-dependent impedance (Exhibit A). Over-relying on simplified R/C loads is another reason why we sometimes can’t trust frequency domain simulations too much.

My readers already know my love for inverters, so let’s take a look at the picture below. As a start, let’s say our circuit is driving an ideal inverter. There shouldn’t be any objection to say the input impedance looks like a capacitor. Fair enough.

Now let’s add a Miller capacitor in there. Right away, things become more complicated than meets the eye. If the Miller cap is small relative to the input cap, it gets amplified by the inverter gain, and one might still approximate the input impedance as a capacitor with a Miller-multiplied component. However, if the Miller cap is big enough that it starts to act as an AC short sooner, the load impedance behaves as a resistive component because the inverter becomes diode connected (this is also the intuition behind pole splitting in Miller compensation).
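To see this numerically, here is a small-signal sketch under assumed element values (gm = 10 mS, Ro = 10 kΩ, CL = 50 fF, Cm = 500 fF are arbitrary illustrative numbers): with inverter gain Av, the Miller input impedance is Zin = 1/(jωCm·(1 − Av)), which looks capacitive (phase near −90°) at low frequency but roughly resistive, near (CL + Cm)/(gm·Cm), once the Miller cap acts as an AC short.

```python
import numpy as np

# assumed (illustrative) inverter small-signal values
gm, Ro = 10e-3, 10e3          # transconductance, output resistance
CL, Cm = 50e-15, 500e-15      # output load cap, big Miller cap

def zin(f):
    """Input impedance looking into the Miller cap of a loaded inverter."""
    s = 2j * np.pi * f
    Av = -(gm - s * Cm) / (1 / Ro + s * (CL + Cm))  # gain incl. feedthrough
    return 1 / (s * Cm * (1 - Av))                  # Miller input impedance

z_low, z_mid = zin(1e6), zin(500e6)
# at 1 MHz the phase is ~ -90 deg (capacitive, Miller-multiplied cap);
# at 500 MHz the phase is near 0 and |Zin| sits close to
# (CL + Cm) / (gm * Cm) = 110 ohm (resistive, ~diode-connected)
```

The exact corner frequencies depend on the assumed values, but the capacitive-to-resistive morph is the point: a single “load cap” number cannot capture this block.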

To be the lord of chaos, I will now throw in an LC tank at the inverter’s output, and why not cascade another stage (and another and another). Have you lost track of what the input impedance should be yet? Don’t believe this is a real circuit? Here is the resonant clock distribution circuit for a 224Gb/s transmitter. I would feel very uneasy using simple load capacitors when designing any intermediate stages.

Impedance modeling using n-ports

The habit of resorting to simple RC loads is not unjustified. They could certainly provide order-of-magnitude results and speed up simulations. However, as illustrated above, that doesn’t guarantee the block would act the same when plugged into a real system. As designers, we need to recognize this possible culprit and address it early on.

We don’t need to look far to see a better way to model our block interfaces. Signal and power integrity (SI/PI) experts have long figured out that every trace on a PCB is an n-port network.

We often forget the first thing we learned. Electronics 101 prepared us for n-port modeling with Thevenin/Norton equivalent networks, and even a MOS transistor’s small-signal model is network based. And yet, we rarely think about our own circuits as networks with S-parameters. For some reason, S-parameters are synonymous with RF design, but in reality there is a mathematical equivalence between S-parameters and Y/Z-parameters, making them applicable at all frequencies. S-parameters are popular simply because they are easier to measure in real life. The point is that S-parameters are a great modeling tool for linear circuits, and we should start utilizing them more.

Passing around *.snp files

The idea then is to have a new routine testbench that extracts the n-port model of our own or load circuits. The simulation is as simple as AC analysis, but provides the entire frequency dependent impedance information.

Most simulators have S-parameter analysis (some just as a special case of AC analysis). The interface between designers then becomes “.s2p” files, which could also have best-case/worst-case variants under different PVT conditions. Simulation time remains fast but accuracy improves dramatically. It serves as the perfect balance between using an ideal capacitor and using the extracted netlist of the next block.
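As a sanity check on the math behind these files, here is a tiny numpy sketch (the 1 kΩ ∥ 100 fF input load is a made-up example) that converts a frequency-dependent input impedance into a one-port S-parameter via S11 = (Z − Z0)/(Z + Z0) — the same relationship a simulator bakes into an exported Touchstone file.

```python
import numpy as np

Z0 = 50.0                       # reference impedance
f = np.logspace(6, 11, 201)     # 1 MHz .. 100 GHz
w = 2 * np.pi * f

# hypothetical block input: 1 kohm resistor in parallel with 100 fF
R, C = 1e3, 100e-15
Z = 1 / (1 / R + 1j * w * C)

# one-port S-parameter of the input: S11 = (Z - Z0) / (Z + Z0).
# At low frequency the resistor dominates; at high frequency the cap
# shorts out the input and S11 heads toward -1.
S11 = (Z - Z0) / (Z + Z0)
```

The exported .s2p files themselves are plain Touchstone text; for instance, the scikit-rf Python package reads them directly, which is one easy route into system-level tools.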

In fact, your DUT can also be modeled as a .s3p, .s4p, etc., as long as we are most interested in the circuit’s linear behavior. The same S-parameter files are equally usable in specialized system modeling tools like MATLAB. Modeling active circuits with S-parameters is not something new, but a wheel definitely worth reinventing (check out this 1970 thesis after a simple search).

Limitations of S-parameter models

As you might have guessed, the key limitation to this S-parameter modeling approach is the linear circuit assumption. When nonlinear effects become important (e.g. input amplitude dependent impedance change), small-signal S-parameters could yield different results (but still much better than an ideal capacitor). While there exists a so-called Large-Signal S-Parameter analysis (LSSP), it falls under harmonic balance (HB) or periodic steady state (PSS) analysis, which means it truly focuses more on RF applications. In addition, S-parameters might be limiting when dealing with mixed-signal processing, like sampling circuits.

Nevertheless, I have found that impedance/circuit modeling using S-parameters generally allows fast simulation time, better accuracy and less system-level frustration down the line. In fact, analog designers can also gain system insights when interfacing blocks through S-parameters. Give it a try!

Let’s take a small break

System thinking in analog design is a skill that is increasingly important. Long gone are the days of building “general purpose” devices; a good system solution requires tailored circuits for each application.

First and foremost, we should understand what our circuits are processing and what their interfaces look like. I hope the examples discussed in this post open the door for some aspiring analog designers to adopt a system mentality. In the next post, we will move from the interface to the inside of each block, and talk about perhaps the most important architectural tool for analog designers – feedback. Till next time!

Lean-Agile Principles in IC Design, Part 2 – How to be “Minimal and Small”

In part 1 of this series, we introduced the most popular modern project management methodologies and relevance to IC design. More importantly, we identified the most common IC design wastes and discussed ways to address them.

Let’s bring the discussion to the “agile” side. To recap, agile principles were originally intended for software development, and they put more emphasis on continuous and incremental additions to the final products. On the IC design side, Prof. Elad Alon and his team spearheaded the agile hardware movement (here is his great talk). So we will start by looking at some details in their work.

Agile equals “codify”?

Let’s address the elephant in the room: it seems “agile hardware” just means finding ways to codify IC design process, exemplified by Chisel and BAG. Are we becoming programmers?

Chisel’s goal is to further abstract Verilog. “Abstract” perhaps isn’t the best word here. A better analogy might be that Chisel is to Verilog as Python is to C. Chisel is friendlier for building complex neural network SoCs, just as Python is better at building neural network models with PyTorch.

BAG offers a new paradigm for analog/mixed-signal designers, in which designers capture their design flows in Python code. The generator also comes with an automatic layout drawer, which to many is the most attractive feature in modern process technologies. A new framework called TED was published recently, an interesting read given the context of BAG. Shown below is an example of how BAG works.

So what’s wrong?

I personally have mixed feelings about all this. While I truly hope we can shorten IC design cycles to that of software, we could be going down another rabbit hole if we simply agree to codify everything.

To play devil’s advocate, here are my challenges:

  1. To be fair, BAG’s goal has always been to capture design flows, i.e., how we design, not what we design. However, there is a huge assumption here: our current design flow is worth capturing. We might be opening up another can of worms in picking a design flow, which could lead to more up-front cost.
  2. Proposed generators tend to focus more on “well-established” circuits, like bandgaps, op amps, diff pairs, and non state-of-the-art converters and SerDes. Given the ocean of available IPs within companies, the supposedly huge savings from using generators become diluted quickly in larger cutting-edge projects.
  3. We could be losing the visual (artistic) component, arguably the biggest benefit of the traditional IC design flow. A schematic is worth a thousand lines of code. Transferring knowledge with well-drawn schematics is orders of magnitude more efficient than through code, in my humble opinion.

Are we ahead of ourselves?

Here is a good article on the current state of agile hardware development: Does SoC Hardware Development Become Agile by Saying So: A Literature Review and Mapping Study. I believe frameworks like Chisel and BAG are ahead of their time. They could very well be the ultimate form of IC design, but I can’t help but feel there is a missing step in between. Factories become fully automated after assembly line workers follow a manual or semi-automated process consistently first. To me, the entire IC design field hasn’t really reached the assembly line state yet.

Regardless, the agile hardware movement still provides valuable contributions to how we should approach design processes. Codifying IC designs should be the byproduct of the agile hardware movement. Combined with some principles from Eric Ries’ Lean Startup, the following sections outline what we can start doing today towards creating the final IC factories.

Minimum Viable Product (MVP) in IC design

If you haven’t heard it yet, it’s a term made popular by the Lean Startup methodology. In contrast to launching polished products, MVPs are launched with just enough features to run experiments, test the market, and learn from feedback. A famous example is Dropbox, whose MVP was just a video demonstrating how the product was supposed to work before a working prototype was ready.

“This is madness! How can we ship MVP chips?” Well, we just have to redefine who IC designers’ customers are before applying the MVP concept. Our customer should be whoever takes our block into the next integration level. If you are building a regulator, your customer is the higher level designer who uses your regulator. If you are a mid-block owner, your customer is the integration person who puts the system together and slaps an IO ring around it. In the case of unfortunate graduate students, you are your own customer.

The FFPPA Pyramid

In order to define an IC product, five meta requirements are usually necessary – function, feature, power, performance and area. People are most familiar with the last three (abbreviated PPA), but I always throw function and feature in the mix. Below is a pyramid depicting what I believe is the typical order of importance for these criteria, with the bottom being the most important.

The key message here is that a pyramid like this exists for every project (with a slightly different order). We can certainly shuffle around the blocks on top. Some projects might have a killer feature, while others might have non-negotiable power/area constraints. However, FUNCTION should always be the foundation (I once taped out a very energy/area efficient noise generator when I wanted to build an ADC). After we have identified the pyramid, we can start building the MVP from the bottom up.

MVPs during design cycle

OK, actually I lied. Your MVP doesn’t even need to be functional. Recall from our top-down methodology discussion that symbols themselves already contain enough useful information and could serve as an MVP for your “customer”.

Starting with critical pin definitions is actually the first step toward a functional block. Let’s use an OTA as an example and see how its MVP form can evolve in an agile framework.

Voila! Here is your OTA MVP. You can pass it on to the higher block owner for drawing his/her schematics. Note that being an MVP doesn’t mean it’s a low quality symbol. In fact, we should apply the same standard as if it were the final version.

The MVP’s next version should then have some functions to it, either with ideal models, basic transistor topologies or a mix of both. For a perfectly functional OTA, all we need is an ideal transconductor and a large resistor as its output impedance. The beauty here is that the most critical specifications are already in the model (gm, gain, etc.). MVP here becomes MSP – Minimal Simulatable Product.
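The “Minimal Simulatable Product” stage can literally be a few lines of behavioral math. Here is a hedged sketch (gm = 1 mS, Rout = 1 MΩ, Cload = 1 pF are placeholder specs, not from any real design): an ideal transconductor loaded by its output resistance and load capacitor gives a single-pole model that already carries the key specifications.

```python
import numpy as np

# placeholder starting specs for the OTA's Minimal Simulatable Product
gm, Rout, Cload = 1e-3, 1e6, 1e-12    # 1 mS, 1 Mohm, 1 pF

f = np.logspace(0, 9, 400)            # 1 Hz .. 1 GHz
s = 2j * np.pi * f
Av = gm * Rout / (1 + s * Rout * Cload)   # single-pole open-loop gain

dc_gain_db = 20 * np.log10(abs(Av[0]))    # gm * Rout = 1000 -> 60 dB
f3db = 1 / (2 * np.pi * Rout * Cload)     # dominant pole, ~159 kHz
gbw = gm / (2 * np.pi * Cload)            # gain-bandwidth product, ~159 MHz
```

An ideal VCCS plus a resistor in the schematic editor expresses exactly the same model; the point is that gm, gain and bandwidth are pinned down before a single transistor is drawn.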

Finally, we can start building some real circuits. A natural question to ask is why we don’t jump straight to the real circuit implementation. For a simple OTA, perhaps the MVP process isn’t that necessary, but the benefits are amplified for larger-scale blocks.

The key advantage of using MVPs is that we barely meet the needs of the “customer” so that his/her own work can continue. Throughout the process, we maintain a version of the circuit that’s usable in schematics and simulatable to various degrees of accuracy. The same idea applies when layout efforts begin. Layout pin placement is critical for floorplanning; dummy and decap devices can be added incrementally; power down and over-stress protection features could come in next. All these incremental additions happen according to the FFPPA pyramid that we agreed on earlier.

Tape-in and incremental additions

If you feel the example above is too simple to make a compelling point, here is an excellent project by the Berkeley Agile team for a RISC-V processor: An Agile Approach To Building RISC-V Microprocessors.

The figure below, taken from the paper, summarizes the essential difference in an agile model. Throughout the entire process, each layer creates an MVP for its own customer layer. For instance, the Design layer first delivers only F1 to the Implementation layer, and incrementally adds more features down the line.

My favorite idea in this example is the concept of a “tape-in”. In the authors’ own words, it’s “a trivial prototype with a minimal working feature set all the way through the toolflow to a point where it could be taped out for fabrication”. A “tape-in” is perhaps the closest thing to being a real MVP in the IC design context. In fact, I believe the sooner a project reaches the “tape-in” stage, the more likely it will succeed. The positive psychological impact of seeing a potential path to completion is huge.

Even though AMS design cycles are more manual than digital/SoC flows, the “tape-in” concept is equally applicable. We designers all have a tendency to sit behind closed doors and polish our cherished designs until the last second. However, there is no need to finish all features, include programmability, or even meet power/area budgets in one go. Rather, we could constantly push out “releases” of our design and optimize incrementally. This requires no coding, just a shift in mindset, which is the agile design model’s true essence.

The Power of “Small Batches”

Another lesson in lean thinking is about the power of “small batches”. The line between “lean” and “agile” is a bit muddy on this one for me. Here is the entertaining example Ries used in his book:

A father and two daughters (six and nine) decided to compete on how fast they could stuff envelopes. The daughters went about it the intuitive way – label all envelopes, then stamp all envelopes, then stuff all envelopes. The rationale is sound: repetition at each task should make us more focused and efficient. On the other hand, the father’s strategy was to label, stamp, and stuff each envelope individually before moving on to the next one.

The father ended up winning, not because he is an adult, but because of the often overlooked costs in the waterfall approach the daughters took. They miscalculated the effort required to move from one big task to the next, as well as the cost of mistakes found later in the process. Sound familiar? These are precisely the challenges we face today.

There is no real data to back up the following cartoon graph, but this could very well be what’s happening here. While the traditional “large batch” approach was good enough for older designs, it’s evident that we have already crossed a “critical N” line in today’s designs. The crossover that favors the “small batch” approach happens when the cost of hand-offs and errors starts to dominate the entire process.

Small batches in AMS design

The agile model in the RISC-V example essentially pushes features into the final design in small batches. When the design process is multi-layered in nature, small batches allow parallelism that creates almost no gap between design tasks. As a result, the same principles should be applicable to AMS designs.

Below is a typical representation of modern AMS design cycles. It has four layers of main tasks, and most of our pain points lie in the layout and post-layout verification steps. As a result, the feedback latency from post-layout results to schematic/layout modifications is too long (i.e. red arrows). Designers’ hands are often “tied” until layout or simulation finishes. Does the code compiling comic above ring any bells again?

Now if we apply the small batch principle to redraw this design cycle, it would look like the following diagram. We can break the monstrously large design cycle into several much more manageable smaller design loops.

It is easier said than done. We as designers need to develop a new skillset to create such a collection of small loops. We need to create schematic hierarchies with clean interfaces, understand dependencies among sub-cells, and most importantly have a good feel for layout and verification efforts. Like all skills, designing circuits in small batches requires constant practice, from block to block, from tape-out to tape-out. When done right, the efficiency brought by the small batch approach is so, so sweet.

Final Thoughts

I just finished a tape-out at the time of writing. Did I follow the lean/agile principles I wrote about here? If I’m being honest, no.

We are all creatures of habit, and the bad ones stick around the longest. I plead guilty to sometimes getting carried away making non-MVPs in large batches. Nevertheless, there are other times when lean/agile methodologies helped save huge efforts and resources when the unexpected happened.

Like all of you, I am continuously learning and trying different ways to become a better designer. Lean and agile principles have found their roots in modern design methodologies. They proved to be valuable in reducing design wastes and handling changes. I look forward to hearing about your design methodologies. The day when we all agree on a systematic design flow is the day we can truly start codifying IC designs.

Lean-Agile Principles in Circuit Design, Part 1 – How to Reduce Design Wastes

Working in a startup has forced me to pick up Eric Ries’ “The Lean Startup” again. If you haven’t read it, it’s a book about applying “scientific principles” in startup or entrepreneurial environments. As a hardware guy, you could imagine my slight “disappointment” the first time I read it. “Well, it’s only for software”, “Big companies probably can’t adopt this”, “Written in 2011? That’s so yesterday”.

I now find some ideas more intriguing after my second viewing (actually listening on Audible during commute). I begin to make connections between the underlying principles behind “lean thinking” and IC design practices. Maybe (and just maybe), IC design is primed to adopt lean principles more formally and systematically. So, you are reading this two-part series as a result of my obsession with such principles for the past couple of months.

Ah management jargons, we meet again

To many, management jargon seems just as foreign (and possibly pompous) as engineering abbreviations. Nevertheless, the end goal of either side remains the same: plan, execute and deliver a satisfactory result in a time- and cost-efficient manner. I have come to learn that a good “process” is key to sustainable and predictable results. So let’s first put away our engineering hats and look at the three most popular process improvement methodologies compared to the traditional waterfall approach.

Lean

Lean manufacturing was invented by Toyota to achieve a more efficient production system. In the 1950s, Toyota adopted the just-in-time (JIT) manufacturing principle to focus on waste reduction (seven types identified) in the process flow. The system was later rebranded as “lean” and studied by many business schools. In a nutshell, lean systems aim to remove unnecessary efforts that create minimal value for the final output.

Six Sigma

Who doesn’t love normal distributions? Six Sigma’s originator Bill Smith must secretly be a marketing genius because the name fully captures the methodology’s key principle – reducing variations. As one would imagine, Six Sigma processes heavily rely on data and statistical analysis. Decisions are made with data evidence and not assumptions. This notion shouldn’t be that alien to IC designers – after all, we run Monte Carlo simulations precisely for yield reasons. Modern processes combine Lean and Six Sigma and call it Lean Six Sigma (jargon, right?).

Agile

You might be the most familiar with this term. After the “Manifesto for Agile Software Development” was first published in 2001, it quickly gained steam and almost achieved “Ten Commandments” status in the software world. The biggest difference in Agile is its embrace of constant change and readiness to launch small revisions frequently. Many became fans of Agile during COVID since it proved to be the most resilient system.

Relevance to IC design

It’s easy to classify such process methodologies as “obvious” or “not applicable to hardware”. Some might even falsely generalize Lean as less, Six Sigma as perfect, and Agile as fast. Ironically, “less, fast and perfect” are actually the desirable outcomes from such processes. Acknowledging and studying these ideas can help improve our own design methodologies.

In this post, I want to zoom in on the “waste reduction” aspect of lean (part 1). Not only do we often see over-specifying or over-designing lead to waste, but valuable human and machine time is also not fully utilized when schematics are drawn inefficiently.

It’s also no coincidence that some commonalities exist, which might be applicable to circuit design as well. Lean, Six Sigma and Agile all rely on a constant feedback loop of “build-measure-learn”. The difference lies only in the “complexity” and “latency” in the loop (part 2).

Now let’s try putting this in IC design’s perspective: if we are the managers of the “circuit design factory”, how would we adopt these principles?

Waste in IC design

Lean was first applied in manufacturing systems and later extended to other fields. Fortunately, lean is equally applicable to the engineering process. The table below, taken from an MIT course on lean six sigma methods, shows a mapping between the original manufacturing wastes and their engineering equivalents.

Engineering wastes aligned to the wastes in lean manufacturing [source: MIT OCW 16.660j Lec 2-4]

So how can we extend this further to IC design? Here is my attempt at mapping these wastes. I wager that you must have experienced at least one of these frustrations in the following table. I bet even more that we have all been waste contributors at a certain point.

Waste reduction toolbox

Now that we have identified the waste categories, let’s discuss the top 5 ways to reduce them during design cycles. You could call these personal learnings from the good and the bad examples. Many ideas here have parallels to some programming principles. So without further ado, let’s begin.

1. Finding O(1) or O(log N) in your routines

Targets waste #4, #8

I apologize for my software persona popping out, but there is beauty in finishing a task in constant or logarithmic time (check big-O notation). Examples in circuit design include using hierarchies, array syntax, and bus notations to reduce schematic drawing/modification to O(1) or O(log N) time.

If you are learning to create floorplans, ask your layout partners about groups, array/synchronous copy (for instances), aligning (for pins), and cuts (for routes). I wish someone had told me about these lifesaving shortcuts earlier, because I have spent way too much time doing copy→paste→move.
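To see why these shortcuts matter, here is a toy Python sketch (purely illustrative; no EDA tool is assumed) counting editor operations: pasting instances one at a time is O(N), while selecting everything placed so far and pasting the whole group doubles the count each time, which is O(log N).

```python
def one_by_one(n):
    """Place n instances by pasting one at a time: O(N) editor operations."""
    placed, ops = 1, 0          # the first instance is drawn by hand
    while placed < n:
        placed += 1
        ops += 1
    return ops

def doubling(n):
    """Select everything placed so far and paste the group: O(log N) operations."""
    placed, ops = 1, 0
    while placed < n:
        placed = min(2 * placed, n)  # each paste doubles what's on the sheet
        ops += 1
    return ops

print(one_by_one(64), doubling(64))  # 63 6
```

For a 64-element array, that is 63 operations versus 6 – the same gap you feel when using array syntax instead of repeated copy→paste→move.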

Travelling salesman problem [source: xkcd]

2. Systematic library/cellview management

Targets waste #1, #2, #3

Borrowing from software again, revision control in the library manager is widely used nowadays. While the benefit is huge, it can lead to unintended bad habits. Many designers simply create variation after variation of the same cell without actively “managing” them. This can result in mass confusion later on, especially if no final consolidation happens. In the worst case, you could be checking LVS against one version but taping out another.

If investigations and comparative studies require multiple versions, I recommend using a different cellview instead of creating a completely new cell. Combined with config views in simulations, the entire library becomes cleaner and more flexible. When library consolidation or migration happens, only the relevant final cells survive, leaving a clean database. I plan to discuss how to create a good cellview system in a more detailed future post.

Don’t sweat over what the cellview names on the right mean, but do take some educated guesses

3. Symbol/schematic skeleton before optimization

Targets waste #5, #6, #7

Top-down methodology encourages designers to have a bird’s-eye view of the entire system in addition to the fine details of the cells they are responsible for. One method is to define block- and cell-level pins early in the design phase. This idea is similar to (though not as sophisticated as) abstract classes or interfaces in object-oriented programming languages (e.g., Java, Python). Instead of implementing the specific functions right away, a high-level abstract description first defines the key methods and their interfaces. The IC equivalent would be to appropriately name and assign port directions for each block’s pins. The symbol itself then contains all the information about its interface.

“How can I build a symbol without knowing what’s inside?” The truth is you must know the most critical pins – an amplifier should at least have power, inputs and outputs. You must also know the most basic required features of a block – power down, reset, basic configuration bits. Informative symbols and schematic skeletons are possible with these pins alone. The same concept applies to layout and floorplans, with pins + black boxing.
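The abstract-interface analogy can be made concrete with a small Python sketch. The pin names and the 40 dB placeholder below are hypothetical, not from the post; the point is only that the interface is pinned down before any implementation exists.

```python
from abc import ABC, abstractmethod

class Amplifier(ABC):
    """Interface first: the 'symbol' defines pins before any implementation exists."""
    # Critical pins every amplifier symbol should expose (hypothetical names)
    PINS = {"vdd": "inout", "vss": "inout",
            "inp": "input", "inn": "input",
            "outp": "output", "outn": "output",
            "pd": "input"}  # power-down: a basic required feature

    @abstractmethod
    def gain(self) -> float:
        """Implementation comes later, during device-level design."""

class FiveTransistorOTA(Amplifier):
    def gain(self) -> float:
        return 40.0  # dB, placeholder value

amp = FiveTransistorOTA()
print(amp.PINS["pd"], amp.gain())  # prints: input 40.0
```

Just as Python refuses to instantiate `Amplifier` directly, an empty symbol with well-defined pins cannot be taped out, but everyone around it can already wire it up.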

Since we are only dealing with symbols and pins here, it’s much easier to make modifications when the specification changes or a new feature is requested. This ties into the “minimum viable product” (MVP) concept that we shall discuss in part 2.

A rough frame w/ non-ideal parts is a better starting point towards building a car than a perfectly round and polished wheel

4. Design w/ uncertainties & effort forecast

Targets waste #5, #6, #7

Now that your schematic skeleton looks solid, device-level design begins. You have a clear plan of execution thanks to the symbol creation exercise, but potential pre- and post-layout discrepancies bother you to no end. We have all had that fear: what if this thing completely breaks down after layout?

To address this, designers should (1) estimate parasitics early by floorplanning, (2) use sufficient dummies, and (3) add chicken bits. Depicted below is an example of a tail current source in an amplifier. Before starting layout, designers should have a mental (or real) picture of how the unit current cells are tiled together. There could be an always-on branch (8x), two smaller branches for fine adjustments (4x + 2x), and dummies (6x). A critical parasitic capacitor connects to the output node with a reasonably estimated value.

One could argue the extra programmable branches and dummies are “waste” themselves. Keep in mind that reserving real estate at this stage consumes minimal effort compared to potential changes later in the design process. Swapping dummies and the always-on cells only requires metal+via changes. What if the layout database is frozen during the final stages of the tapeout but some extra juice is required due to a specification change? What if the chip comes back and you realize the PDK models were entirely off? The chicken bits might just save you.
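As a sanity check on the tiling above, here is a minimal Python sketch of the resulting trim range. The 8x/4x/2x branch sizes are the post’s example; the 10 µA unit current is an assumed value for illustration.

```python
# Unit-cell tiling: 8x always-on, binary trim branches (4x, 2x),
# plus 6x dummies reserved as chicken-bit real estate.
UNIT_UA = 10.0          # assumed unit-cell current (uA), not from the post
ALWAYS_ON = 8
TRIM_BRANCHES = (4, 2)  # binary-weighted, individually switchable

def tail_current(code):
    """Bit i of code enables TRIM_BRANCHES[i]; returns total current in uA."""
    units = ALWAYS_ON + sum(b for i, b in enumerate(TRIM_BRANCHES)
                            if code >> i & 1)
    return units * UNIT_UA

settings = sorted({tail_current(c) for c in range(4)})
print(settings)  # [80.0, 100.0, 120.0, 140.0]
```

Two binary-weighted branches give four evenly spaced settings around the nominal point – a cheap insurance policy against a late specification change.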

5. “Ticketing” pipeline between design and layout

Targets waste #3, #5, #8

This last one is my personal system for communicating with my layout partners. I use a poor man’s “ticketing” tool called POWERPOINT. Yes, you read that right – I am suggesting using one more .ppt document to cut IC design waste. My personal experience so far is that this interface document provides better communication and results than any Zoom call, especially when there are time zone differences. Below is what an example slide looks like.

Each slide acts as a ticket for a layout modification request. The slides DO NOT need to be pretty at all. A quick snapshot and description serve the purpose of both conveying and documenting the request. As the design gets more complete, this slide deck will grow in size but all changes are tracked in a visual way. This also allows the designer and layout engineer to prioritize and work at their own pace, almost in a FIFO manner. When periodic checkpoints or project milestones are close, this slide deck becomes extremely helpful for reviewing and further planning.

Till next time

Being lean in our design process never means reducing design complexity or using fewer tools. Rather, it’s the mentality that we should begin with the right design complexity and use the right tools.

I hope some techniques mentioned here can provide insights on how to be lean when designing. As promised, there is more to this topic. Specifically, the IC design process can also embrace the incremental changes of Agile methodology. We can achieve better outcomes by breaking large design cycles into smaller ones. So stay tuned for part 2!

The Frequency Domain Trap – Beware of Your AC Analysis

This man right here arguably changed the course of signal processing and engineering. Sure, let’s also throw names like Euler, Laplace and Cooley-Tukey in there, but the Fourier transform has become the cornerstone of designers’ daily routine. Thanks to its magic, we simulate and measure our designs mostly in the frequency domain.

From AC simulations to FFT measurements, we have almost developed a second nature when looking at frequency responses. We pride ourselves in building the flattest filter responses and knowing the causes for each harmonic. Even so, is this really the whole picture? In this post, we will explore some dangers when we trust and rely on frequency domain too much. Let’s make Fourier proud.

Magnitude isn’t the whole story

Math is hard. We engineers apply what makes intuitive sense to our designs, and hide the complicated, head-scratching stuff behind “approximations”. Our brains understand magnitude very well – large/small, tall/short, cheap/expensive. When it comes to phase and time, we can’t seem to manage (just look at your last project’s schedule).

So naturally, we have developed a preference for the magnitude of a frequency response. That’s why we love sine wave tests: the output is simply a delayed version of the input with different amplitude. It’s easy to measure and makes “intuitive sense”, so what’s the problem?

Sometimes, the phase portion of a frequency response contains as much, if not more, information than the magnitude portion. Here is my favorite example to illustrate this point (it makes a good interview question).

Take a look at this funky transfer function above. It has a left half-plane pole and a RIGHT half-plane zero. Its magnitude response looks absolutely boring – a flat line across all frequencies. In other words, this transfer function processes the signal only in the phase domain. If you only focused on the magnitude response, you would pat yourself on the back for creating an ideal amplifier. Shown below is a circuit that could give such a transfer function. Have a little fun and try deriving its transfer function (reference).
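If you want to verify the flat-magnitude claim numerically, here is a short Python sketch assuming the first-order all-pass form H(s) = (1 − s/ω0)/(1 + s/ω0), i.e., a left half-plane pole and a right half-plane zero at the same (assumed) frequency.

```python
import cmath
import math

w0 = 2 * math.pi * 1e6  # assumed pole/zero frequency (1 MHz)

def H(w):
    """All-pass: LHP pole and RHP zero at the same frequency."""
    s = 1j * w
    return (1 - s / w0) / (1 + s / w0)

for f in (1e3, 1e6, 1e9):
    h = H(2 * math.pi * f)
    print(f"{f:8.0e} Hz  |H| = {abs(h):.4f}  "
          f"phase = {math.degrees(cmath.phase(h)):8.1f} deg")
```

The magnitude stays at exactly 1 at every frequency while the phase sweeps from 0° toward −180°, passing −90° at ω0 – all the signal processing happens in phase.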

But is it even real, or just a made-up example? If you have ever used an inverter, you would recognize the following waveform. Ever wondered where those spikes come from? They come precisely from the feedforward path (right half-plane zero) through the inverter’s Miller capacitor. This RHP zero also contributes to the inverter buffer’s delay. There is no way to predict these spikes from magnitude responses alone.

Magnitude response can still remain a good “indicator” of obvious issues (after all, it’s one of the fastest simulations). However, phase information becomes crucial with the introduction of parasitics and inductors, especially at high frequencies. Sometimes, it’s not the flattest response you should aim for (for those who are interested, look into raised-cosine filters and their applications in communications).

Probability – the third leg in the engineering stool

As mentioned before, we love our sines and cosines, but do we speak on the phone with a C# note? Most real-life signals look more like noise than sine waves. In fact, the transmitter in a wireline link typically encodes data to be “random”, with equal energy at all frequencies. The signal’s frequency content simply looks like high-energy white noise – flat and not that useful.

What’s interesting, however, is the probabilistic and statistical properties of the signal. Other than time and frequency, the probability domain is often overlooked. Let’s study some examples on why we need to pay extra attention to signal statistics.

1. Signals of different distributions

We will begin by clearing the air on one concept: white noise doesn’t mean Gaussian/normally distributed noise. The only criterion for a (discrete) signal to be “white” is that each sample is drawn independently from the same probability distribution. In the continuous domain, this translates to a constant power spectral density in the frequency domain.

We typically associate white noise with Gaussian distributions because of “AWGN” (additive white Gaussian noise), the go-to model for noise. That is certainly not the case when it comes to signals. Here are four special probability distributions:

Again, if independent signal samples are taken from any one of these distributions, the resulting signal is still considered white. A quick FFT of the constructed signal would look identical to “noise”. The implications on the processing circuits’ requirements, however, are completely different.
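A quick way to convince yourself is to check the lag-1 autocorrelation of i.i.d. samples drawn from different distributions – for a white signal it should be near zero regardless of the distribution. A minimal, stdlib-only Python sketch:

```python
import random

random.seed(0)
N = 20000

def autocorr_ratio(x):
    """Lag-1 autocorrelation normalized by lag-0; near 0 means spectrally white."""
    m = sum(x) / len(x)
    x = [v - m for v in x]
    r0 = sum(v * v for v in x)
    r1 = sum(x[i] * x[i + 1] for i in range(len(x) - 1))
    return r1 / r0

gauss = [random.gauss(0, 1) for _ in range(N)]
two_level = [random.choice((-1.0, 1.0)) for _ in range(N)]  # dual-Dirac-like
uniform = [random.uniform(-1, 1) for _ in range(N)]

for name, sig in (("gaussian", gauss), ("two-level", two_level),
                  ("uniform", uniform)):
    print(f"{name:>9}: lag-1 autocorr = {autocorr_ratio(sig):+.4f}")
```

All three come out essentially zero: spectrally they are indistinguishable, even though their amplitude statistics could not be more different.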

Take linearity, for instance. It wouldn’t be wrong to assume that the linearity requirement for processing two digital levels can be much more relaxed than for a uniformly distributed input signal. The figure below shows that the nonlinearity error for a dual-Dirac distribution could effectively become “gain error”, while a uniform input yields a different error distribution. A Gaussian distributed input signal might also require less linearity than a sinusoidal-like distribution because smaller amplitudes are more likely.

By understanding the input signal’s statistical nature, we can gather more insight about our circuits’ requirements than from the frequency domain alone. It is frequently a sin when we design just for the best figure of merit (FOM) using a sine wave stimulus. Such designs are often sub-optimal or, even worse, non-functional when processing real-life signals.

2. Stationary vs non-stationary signals

Before this probability-class jargon scares you away, just imagine yourself speaking on the phone again. Unless you are chatty like me, the microphone should be picking up your voice in intervals. You speak, then pause, then speak again. Congratulations, you are now a non-stationary signal source: the microphone’s input signal statistics (e.g. mean, variance, etc.) CHANGE over time.

When we deal with this kind of signal, frequency domain analysis forces us into an “either-or” mode. We might analyze the circuit assuming we are in either the “speak” or the “pause” phase, but the transition between the two phases might be forgotten.

This becomes especially important for systems where a host and device take turns to send handshake signals on the same line. In these cases, even pseudo-random bit sequences (PRBS) can’t realistically emulate the real signals.

Other scenarios involving baseline wander and switching glitches also fall under this category. Frequency domain analysis works best when signals reach steady state, but offers limited value for such time and statistical domain phenomena. The figure below depicts a handshake signal example from the HDMI standard. Try and convince me that frequency domain simulations help here.

The small signal swamp

Though they are not entirely the same, small signal analysis is associated with frequency domain simulations because both belong to the linear analysis family. Designers are eager to dive into the small signal swamp to do s-domain calculations and run AC simulations. There is nothing wrong with that, but far too often we forget about the land just slightly outside the swamp (let’s call it the “medium signal land”).

Overlooking the medium signal land can lead to real design issues. Examples include slewing, nonlinearity, undesired settling dynamics, and sometimes even divergent behavior under bad initial conditions. Small signal thinking often tells a performance story: gain, bandwidth, etc. Medium/large signals, on the other hand, tell a functional story. Ask yourself: can I even get to the small signal swamp from here? If not, you might have taped out a very high performance brick.

In real life designs, key aspects like biasing, power on sequence, and resets could be more important than the small signal behaviors. And the only way to cover these points is through time domain simulation.

Stand the test of time

My favorite example of why frequency domain measurements can be deceiving is found in this article by Chris Mangelsdorf. Chris’ example demonstrates that errors due to very high harmonics (i.e. code glitches) are often not visible in the frequency domain. In this particular case, they are even difficult to spot in the time domain without some tricks. The article also touches upon similar sentiments mentioned above, including phase information.

While many consider good FFT plots and ENOB numbers the finish line of a project, not understanding time domain errors like glitches can be catastrophic. For example, if an ADC has a code glitch every thousand samples, the glitches alone imply an error rate near 1E-3 (regardless of its perfect ENOB or FOM), so it cannot be used in a communication link targeting a bit error rate (BER) of 1E-6 or below.

Unfortunately, time domain analysis is, well, time-consuming. In large systems, running system-level transient simulations inevitably crashes servers and the human spirit. That’s why adopting a top-down methodology with good behavioral models is of increasing importance. To stand the test of time, we need to be smart about what and how we simulate in the time domain. Below is a list of essential time domain simulations:

  1. Power-on reset
    This is on the top of the list for obvious reasons. This is often not discussed enough for students working on tape-out. A good chip is a live chip first.
  2. Power down to power up transition
    Putting a chip into sleep/low power mode is always desired, but can it wake up properly? Run this simulation (no input stimulus is necessary) to check the circuit biasing between power down/up states.
  3. Input stimulus transition from idle to active state
    In some applications, input signal could go from idle to active continuously (e.g. burst mode communication, audio signals, etc.). Make sure your circuit handles this transition well.
  4. Special input stimulus like step or pulse response
    Instead of sine wave testing, consider using steps or pulses to test your circuit. Step and pulse responses reflect the system’s impulse response, which ultimately contains the magnitude/phase information at all frequencies. Techniques like this are helpful in characterizing dynamic and periodic circuits (see Impulse Sensitivity Function).
  5. Other initial condition sweeps
    Power and input signal transitions are just special cases of different initial conditions. Make sure you try several initial conditions that cover meaningful ground. For example, a feedback circuit might not be fully symmetrical and could have different settling behaviors for high and low initial conditions.
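Item 5 is easy to demonstrate with a behavioral model. The sketch below Euler-integrates a first-order settler with asymmetric slew limits (all values assumed for illustration); starting above or below the target gives visibly different settling times – exactly the kind of asymmetry an initial condition sweep would catch.

```python
def settle_time(v0, vf=0.0, tau=1e-9, sr_up=2e9, sr_dn=0.2e9,
                tol=0.01, dt=1e-12):
    """Euler-integrate a slew-limited first-order settler toward vf.
    The asymmetric slew rates (assumed values) mimic a feedback
    circuit that is not fully symmetrical."""
    v, t = v0, 0.0
    while abs(v - vf) > tol and t < 1e-6:
        dv = (vf - v) / tau                # small-signal settling
        dv = max(min(dv, sr_up), -sr_dn)   # large-signal slew limit
        v += dv * dt
        t += dt
    return t

t_from_above = settle_time(+1.0)  # hits the slower 0.2 V/ns downward limit
t_from_below = settle_time(-1.0)  # never hits the (faster) upward limit
print(t_from_above > t_from_below)  # True: same endpoint, different story
```

A single AC simulation would report the same bandwidth for both cases; only the initial condition sweep exposes the difference.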

To state the obvious, this post is not suggesting you ignore Fourier completely, but rather that you treat it as the first (not last) guiding step in your entire verification process. To build a solid stool on which your design can rest, we need to consider the frequency, time and probability domains together. So the next time you look at another frequency response, think about phase, statistics, time, and hopefully this three-legged stool.

SSCS Webinar for Young Excellence – Become Circuit Artists

Thank you for your support for the SSCS Webinar for Young Excellence, “Become Circuit Artists: The Practical Skill of Schematics Drawing and Its Importance for Designers’ Success”. The video recording can be viewed below:

To access the slides, please subscribe to our newsletter below and a download link will be included in the welcome email. Stay tuned for future updates and contents. We appreciate your continued interest. Please leave any comments and suggestions on what you would love to read about next.

Join Our Newsletter!

Testbench Templates – How To Reuse and Boost Simulation Efficiency

If my recollection doesn’t fail me, this was about ten years ago. I walked into the lecture hall for EE313, full of excitement. The digital CMOS course was taught by Prof. Mark Horowitz. There has always been some “deity” status attached to the man. As a self-proclaimed analog design student at the time, I was anxious to learn what the “other side” was about.

Halfway into the lecture, Prof. Horowitz began a “sales pitch” for something called CircuitBook. He was candid about us being the guinea pigs for a testbench framework that his Ph.D. student was working on. The goal was to create a reusable analog test solution stack using only Python and SPICE. With a more software-oriented approach, the unified framework hoped to “hide” all the possible variations in analog test environments. I distinctly remember feeling a bit confused: why would I ever want to code my testbenches for analog?

Figures taken from the CircuitBook thesis. The framework attempts to use Python for unifying almost all components in a test (top).

Years later, after I entered industry, my mind kept circling back to this moment. I obviously didn’t get the full picture as a graduate student then, but CircuitBook has aged like fine wine the longer I work in IC design. The idea sounds better each day when I need to simulate a new circuit or look at others’ testbenches. So why didn’t it take off? Here are the reasons in the author’s own words:

We have found that one of the main challenges with the CircuitBook test framework has been convincing users to adopt the system. We believe this can be attributed to the initial learning required to be productive. The CircuitBook test framework does not significantly speed up the time required to make a new test for first time when the time to learn the framework is included. The productivity gains come from reusing the resulting test collateral for future variants. Users are often concerned more about the task at hand than future benefits, so our current test framework may not be attractive to time-constrained designers.

James Mao, “CircuitBook: A Framework for Analog Design Reuse”

No truer words have been spoken. The author went on to discuss filling the framework repository “with reusable test components”. The framework could also “provide building blocks that allow users to quickly construct tests for a particular circuit class”. The proposed strategies aim to make adoption easier. I believe what he alluded to could already be realized in schematic capture tools today – in the form of testbench templates.

Creating reusable testbench cells

As I have mentioned in a previous post, a wrapper might be necessary for anything that is used again and again. Supplies, for example, are instantiated more often than almost anything else, yet we hardly think about creating reusable supply cells. The same applies to input stimuli for DC/AC/transient simulations.

One tip is to parametrize these cells as much as possible (e.g. using pPar) to avoid creating too many variations. It certainly doesn’t provide the full flexibility of coding in Python, but it should be good enough for most testbenches.

Here we discuss a few of the most common testbench cells:

1. Supplies

For any given project, supply domains are typically agreed upon first. Many mixed-signal circuits nowadays require multiple supplies for optimal performance and power. Grounds could be separated on-chip to provide isolation between the analog and digital lands. It begins to make sense to create a dedicated reusable supply cell for all testbenches, like below.

The cell itself is not that fancy: the simplest form involves just ideal voltage sources. Parametrization is what makes it more interesting and powerful. One example is to parametrize each source’s DC voltage for editing at the testbench’s top level.

You might spot in the figure above that the cell uses 0V voltage sources to create ground nodes. Isn’t this redundant? If the goal is to simply break the ideal ground into different names, ideal 0 Ohm resistors can also do the trick. The key here is to allow more parametrization for simulations like power supply rejection ratio (PSRR). One can parametrize the AC magnitude on ground net with pPar(“avss_ac”) for example. We can then configure the supply cell to perform such simulations without any new setups. The same applies to other sources.

Simple models for supply impedance are the natural next step. Each supply can have an RLC network in series to mimic bond wires and package traces. Keep it fully parametrizable for full flexibility.
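As a quick feel for the numbers, here is a Python sketch of the magnitude of a series RLC supply path. The 50 mΩ / 1 nH / 100 nF values are assumed for illustration only, not from the post.

```python
import math

# Assumed package parasitics (illustrative): 50 mOhm, 1 nH bond wire, 100 nF cap
R, L, C = 0.05, 1e-9, 100e-9

def z_mag(f):
    """Magnitude of the series R + jwL + 1/(jwC) supply path at frequency f."""
    w = 2 * math.pi * f
    return abs(complex(R, w * L - 1 / (w * C)))

f_res = 1 / (2 * math.pi * math.sqrt(L * C))  # series resonance
print(f"{f_res / 1e6:.1f} MHz, |Z| there = {z_mag(f_res):.3f} Ohm")
```

At the series resonance the reactances cancel and the impedance collapses to just R – exactly the kind of frequency-dependent behavior worth parametrizing into the supply cell instead of assuming an ideal source.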

To make it more user friendly, remember to include default values to save time when instantiating this cell. When the supply cell is first instantiated in a testbench, its property list and default values might look like below. Note that the default values can also be strings, so some parameters become design variables automatically.

Example parameter list and default values
2. Input stimulus

By the same token, input stimuli can also be parametrized and made more “general”. We are interested in differential inputs in most applications. My preferred way is using ideal baluns (for reasons Ken Kundert also wrote about here).

Depicted above is one simple example of such a differential stimulus cell. For DC and AC sims, parameterizing the differential and common mode DC/AC source values and impedances should cover most cases.

Building on top of this, different variations are possible for transient simulations. I am a broadband signals guy, so a pulse stimulus (i.e. Vpulse or Vpwl) is often my first choice for simulating pulse responses. If you are a narrowband or converter person, Vsin might be your cup of tea. A differential clock can also be generated by using Vpulse and setting vdiff and vcm correctly. All of these sources still have the DC/AC fields, so they remain compatible with DC/AC simulations.

One can in theory build a much more general stimulus cell with all possible sources and an analog mux for selection, but the return on investment starts to diminish. I recommend simply creating cells for the most common inputs like stimulus_sine, stimulus_pulse, stimulus_clk, etc., and keeping the number of parameters manageable.

Below is a more elaborate version of this stimulus cell. Other features like AC-coupling and external source select can be included. Most of these features can be realized with ideal resistors and math if you don’t wish to write VerilogA modules.

3. Probing and measuring

Testbenches are the only places where we can build perfect analog computers, so let’s take advantage of this.

Here is another awesome use of ideal baluns: they are “bi-directional” and can measure differential and common mode signals. Instead of post processing simulation results, you can put in these balun based probe elements to calculate differential and common modes during simulations.

(a) Differential and common mode to complementary signal conversion. (b) Differential and common mode measurements from complementary signals

Since impedances are also transparent through baluns (like in stimulus cells), ideal voltage buffers can help isolate the probe signals in case some loads are accidentally attached.
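The balun math itself is just a linear change of coordinates, sketched below in Python. The two directions are exact inverses of each other, which is why the same ideal element works both as a stimulus splitter and as a probe.

```python
def to_diff_cm(vp, vn):
    """Probe direction: complementary pair -> (differential, common mode)."""
    return vp - vn, (vp + vn) / 2

def to_pair(vd, vcm):
    """Stimulus direction: (differential, common mode) -> complementary pair."""
    return vcm + vd / 2, vcm - vd / 2

# Round trip: splitting then probing recovers the original quantities
vp, vn = to_pair(vd=0.4, vcm=0.6)
vd, vcm = to_diff_cm(vp, vn)
print(round(vp, 3), round(vn, 3), round(vd, 3), round(vcm, 3))  # 0.8 0.4 0.4 0.6
```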

So why go through all the trouble of using such a probe cell? This probe element opens up a new idea: pre-process signals in simulation to simplify post-processing expressions. Some post-processing expressions can become really unreadable really fast (parenthesis nightmare, anyone?). I find measurement expressions easier to follow when I don’t have to jump through several hoops to trace where each signal or variable is defined. So don’t limit your imagination to just this simple probe cell. Start encapsulating measurements you do repeatedly into reusable cells.

One simple but powerful example is a power meter (do you get the pun?). Try building one using a series voltage source, a current-controlled voltage source (ccvs) and a multiplier. A more complicated example is to build DACs + a de-interleaver for combining interleaved ADC outputs. Rather than saving all ADC slices’ outputs and post-processing in MATLAB or Python for an FFT, the combined output is already available as a saved net. Here is another benefit: one can check results during the simulation to ensure the signals look healthy, and stop the run if something doesn’t look right instead of waiting until the long simulation finishes.
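The power meter boils down to multiplying the sensed voltage and current and averaging – exactly what the series source, ccvs and multiplier do continuously in simulation. A quick Python sanity check (the amplitudes and the 60° phase lag are assumed values):

```python
import math

def avg_power(v, i):
    """Average of the instantaneous product p[n] = v[n] * i[n]."""
    return sum(a * b for a, b in zip(v, i)) / len(v)

N, phi = 1000, math.pi / 3                   # 60 deg current lag (assumed)
t = [2 * math.pi * n / N for n in range(N)]  # one full period
v = [1.0 * math.sin(x) for x in t]           # 1 V amplitude
i = [0.5 * math.sin(x - phi) for x in t]     # 0.5 A amplitude

# Expected: Vpk * Ipk / 2 * cos(phi) = 0.25 * 0.5 = 0.125 W
print(round(avg_power(v, i), 6))  # 0.125
```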

4. Deep probing

But what about nets that are inside the DUT? Luckily, we can use deepprobe to bring nets deep in the hierarchy up to the top level. Deepprobe also allows you to modify the loading on the internal nets of interest. One way to probe internal differential nets is as follows

Deepprobes with balun based differential probe cell

Unfortunately, I haven’t found a way to wrap this into a “diff_deepprobe” cell. I am still looking for answers here, but for now this cell group works just as well.

One disclaimer: this shouldn’t be a full replacement for your typical save statements. Rather, use this probe strategically on critical nets for better readability (for your own or others’ consumption). Of course, there is personal preference involved, but I find this approach more attractive than reading netlists and save files because of the direct visual feedback.

5. Verilog-A modules

Last but not least, you can perhaps create all of the above (and more) if you are proficient in Verilog-A. In addition to what’s already available in standard libraries, most teams might already have a separate well-maintained Verilog-A library. They could have countless hidden gems like digital constant cells, frequency meter or bias gen models. Do spend some time to study them and/or create a couple modules that help with your own simulation flows.

Creating testbench templates

With an arsenal of these reusable cells (not that different from the software modules proposed in CircuitBook), we could start building testbench templates. In essence, they are the bases and starting points for certain simulations of particular circuits.

Below is what a general purpose testbench template might look like. The template contains a supply cell, a sine wave transient input stimulus, a power meter, differential probes (including deepprobes), and a digital attribute cell. The default parameter values for each cell should require minimal changes to start some quick DC/AC simulations. The probe and digital attribute cells are there for quick usage reference, further modification, or duplication. It’s always easier to delete/copy-paste on the same schematic sheet than to instantiate new cells. A reasonable simulation state should also be available (e.g., corner setups, signal saves and reference measurement expressions). Overall, the template should provide the essentials to shorten the time to hit that green run button.

Generic testbench template

Expanding upon this principle, we can build specialized testbench templates for well known characterizations on certain classes of circuits. Here is a template for simulating the regeneration time constant of a dynamic comparator. Input and clock sources are provided. Probe names are already filled in for the regeneration nodes. What’s cool is that the template can also have instructions for DUT instantiation (just like reading through some code comments). The simulation state should already contain critical measurement expressions based on the template (in this case some exponential time constant calculations).
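The exponential time constant expression baked into such a template is typically just a two-point fit on the regeneration nodes, v(t) ∝ exp(t/τ), so τ = (t2 − t1)/ln(v2/v1). A minimal Python sketch with assumed numbers:

```python
import math

def regen_tau(t1, v1, t2, v2):
    """Two-point fit of v(t) = v0 * exp(t / tau) on the regeneration nodes."""
    return (t2 - t1) / math.log(v2 / v1)

# Synthetic check against a known time constant (values assumed for illustration)
tau = 20e-12                             # 20 ps regeneration time constant
v = lambda t: 1e-3 * math.exp(t / tau)   # 1 mV seed growing exponentially
print(regen_tau(10e-12, v(10e-12), 50e-12, v(50e-12)))  # recovers ~2e-11 s
```

Pick the two sample points while the nodes are still growing exponentially (before the outputs rail out), and the same expression works for any comparator dropped into the template.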

Example testbench template for dynamic comparator regeneration time constant simulation

To keep the momentum going, here is a rapid fire round for template ideas: op-amp characterization, amplifier noise, oscillator phase noise, general purpose feedback loop stability, converter ENOB, periodic steady state analysis, PSRR, power sequencing, … Can you think about others?

I hope the message has come across loud and clear: if we really think about the day-to-day simulations we run, we can exploit the similarities among testbenches to improve productivity. Testbench templates are pre-built schematics that give us a head start when setting up simulations. Instead of drawing a new schematic each time, the process becomes finding the right template, creating a copy, modifying, and running.

The million-dollar questions

I can already hear the skeptics yelling: is this really any better than frameworks like CircuitBook? Who should build and manage these templates? Do you seriously think we have the time to manage another library for testbench templates?

For industry designers, using testbench templates could become a way to preserve and pass along knowledge. From a student’s perspective, I believe having access to testbench templates speeds up the learning process. These templates allow one to spend more time exploring the design space than fighting for the right setup. They are also more visually direct than reading through code (a schematic is worth a thousand lines of code).

While some might argue that the struggles are part of the learning, we need look no further than the open-source software community to see the flaws in this thinking. We have enough problems during the design phase as is. I see a direct parallel between building these testbench cells and templates and publishing open-source packages ready for use and modification.

As to who should do all the “dirty work”, the answer is always graduate students and interns 😉. Jokes aside, I think it’s a “survival of the fittest” system, in which the best cells and templates will prevail (not that different from open-source again). Many teams might already require a full library cleanup after each tape-out. The downtime between tape-outs is the perfect gap for designers to massage these templates, explore new ideas for reusable cells, and improve methodologies.

Now that we have our own version of open-source for chip design underway, readable and reusable testbench templates are just as important as the designs themselves. There is no shortage of brilliant testbench tricks and setups by our community’s gurus. We just need a more systematic and straightforward way to democratize them.

Top Down or Bottom Up – Where Should Designs Begin?

I might not be considered a “seasoned veteran”, but I have experienced some personal design paradigm shifts over the years.

Starting in undergrad, circuit design meant discrete components and breadboards. The equivalent of IC hazing was to read through countless datasheets and choose between a bad and an OK op amp. Moving to graduate studies, shrinking my breadboard designs into GDS was definitely dopamine-inducing. Meanwhile, I began to get a taste of the challenges that come with more complex circuits and systems. Various internships taught me the importance of designing for PVT and not just for the Ph.D. Working full-time opened my eyes to the internal structure of a well-oiled IC design machine (system, design, layout, verification, etc.). I picked up the design-reuse mentality along with a new set of acronyms (DFT, DFM, DFABCD…). Interestingly enough, I need to draw on ALL of these experiences in a startup environment.

What I just described is how my own methodology went from bottom-up to top-down, and today I live mostly in the middle. To get started, I recommend everyone read Ken Kundert’s article on top-down methodology first. Building on what he wrote more than 20 years ago (!), I will then add my take on this topic.

Where’s top? Where’s bottom?

This is an obvious question to ask, but how are “top” and “bottom” really defined? In the good old days, “bottom” meant transistors and “top” meant amplifiers. It was easier to draw the line because there weren’t that many layers. However, the increasing number of hierarchies in SoCs has forced us to rethink what top/bottom means.

It’s easier to define what “top” is: whatever block you are responsible for. “Bottom” becomes trickier. This is where models enter the chat. British statistician George Box famously observed that “all models are wrong, but some are useful,” and that is especially true for IC design. My definition of “bottom” is the layer at which the model is still useful but details become cumbersome for the design of interest.

Digital designers have moved their collective “bottom” to the gate level because transistor details become unnecessary. For a PLL charge pump designer, transistors could mean the bottom, but for the overall PLL owner, the bottom stops at the current source/sink model of the charge pump. My top can be your bottom, as the picture below shows. The hierarchical tree depicted here shows a clean boundary between each owner, but sometimes there could even be overlaps. Therefore, every designer has the opportunity to practice “top-down” methodology and think like a system architect, which I will expand upon in a later section.

The simulation problem

My post won’t be complete without an xkcd reference, so here it is:

Compiling. [credit: xkcd]

Change “compiling” to “simulating” and you get a pretty accurate representation of our daily lives. I am kidding of course, but the underlying message is valid. Large IC systems nowadays are simply impossible to simulate in full at the transistor level. The fact that a billion-transistor chip works at all is nothing short of a miracle.

There are two main ways simulation speed gets dragged down:

1. Netlist is too big

Do I hear a resounding “duh”? In modern PDKs, the transistor models themselves are already becoming more complex. Multiple flags and parameters are included in the models for layout-dependent effects and parasitic estimates. When we add extra transistors to the circuit, we also add more resistors and capacitors. Layout extraction (especially RC extraction) makes the netlist size explode further.

2. Time constant gaps

More and more mixed-signal systems run into this issue. Examples include oversampled converters, digital/hybrid PLLs, TIAs with DC offset cancellation, etc. A block may have signal and loop bandwidths that are orders of magnitude apart. A high-speed TIA processes GHz signals, but the DC offset loop might only have kHz of bandwidth. To fully simulate functionality, a millisecond-long simulation with a picosecond time step might be needed. This becomes a problem regardless of the netlist size.
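To put numbers on it, here is a quick back-of-envelope calculation in Python. The bandwidths are hypothetical but representative of the TIA example above:

```python
import math

# Back-of-envelope: why time-constant gaps blow up transient simulation.
# Assumed (hypothetical) numbers: a 10 GHz TIA signal path, a 10 kHz offset loop.
signal_bw_hz = 10e9
loop_bw_hz = 10e3

# Resolve the fast path with ~100 points per period of the signal bandwidth,
# and run long enough for the slow loop to settle (~5 time constants).
time_step_s = 1.0 / (100 * signal_bw_hz)          # 1 ps
sim_time_s = 5.0 / (2 * math.pi * loop_bw_hz)     # roughly 80 us
n_steps = sim_time_s / time_step_s

print(f"time step : {time_step_s:.1e} s")
print(f"sim length: {sim_time_s:.1e} s")
print(f"steps     : {n_steps:.1e}")
```

Tens of millions of time steps before the netlist itself adds any cost, which is why behavioral models (covered next) are so valuable here.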

To make matters worse, designers are also often relegated to the role of “SPICE monkeys”. Without a good understanding of the top-level requirements and behaviors, many fall into a “tweak, sweep and press run” trap. Perhaps this is why many fear an AI-overlord takeover; after all, computers are way better at loops than we are.

The simulation bottleneck worsens the already long time-to-market for IC products. To address these issues, top-down methodology introduces behavioral models that allow trade-offs among simulation time, accuracy, and insight.

The top-down loop

Behavioral models are the key enablers in a top-down design flow. Top-down design typically requires new modeling languages other than SPICE to describe block behaviors. One can use software programming languages like Python and MATLAB, or hardware description languages (HDL) like Verilog-AMS or SystemVerilog.

When I went through my graduate program, our group had an unwritten rule: no one touched PDKs without at least a year of work in MATLAB. Our daily work revolved around models and algorithms before we could finally put down transistors. Unfortunately, not many circuit design programs require students to pick up a modeling language, and that gap is reflected in the industry today.

With the benefits of behavioral models, I often find myself in a top-down loop at a design’s early phase. Here is what I mean:

  1. Begin at the top level of the system and assume relatively ideal blocks. Verify that your proposed system/architecture works with these assumptions using behavioral models.
  2. Question your assumptions and each block’s ideal nature. Start adding non-idealities into your models and re-evaluate. The key here is to pinpoint the non-idealities that matter most in your system, and keep the ideal approximation for other aspects.
  3. You should have “preliminary specs” for each block at this point. Now question whether these specs are reasonable.
  4. Do order-of-magnitude low-level simulations for a feasibility study. Note that we are already at the “bottom” layer here!
  5. Repeat the process until the specifications converge as more low-level simulation data becomes available.

A simplified illustration of this top-down loop is shown above. If everything goes well, we traverse the green and blue arrows until we reach a final design. Note that the green path signifies the top-down approach and the blue path the bottom-up one. When people refer to the top-down approach today, they are really talking about this loop, not just the green path. It’s the continuous re-evaluation and requirement updates at the model and circuit levels that ensure optimal designs and smooth execution.

Sometimes we might run into the red arrow where a fundamental limit pushes us to rethink the overall system (and worse, our career choice). While it sounds disastrous, a brand new architecture or a neat circuit trick typically comes to life to break this limit. About 80% of my current job happens when I am driving. My mind goes around the loop several times, shuffles around some blocks and plays more mental gymnastics when I believe a fundamental limit is reached. It takes some practice and time, but anyone can grow into a “system architect” after living in this loop long enough.
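To make the loop concrete, here is a minimal Python sketch of steps 1 and 2: a behavioral amplifier that starts ideal, then gets assumed noise and clipping non-idealities added so we can re-evaluate a spec. All numbers are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
fs, n = 1e9, 4096
t = np.arange(n) / fs
sig = 0.1 * np.sin(2 * np.pi * 10e6 * t)      # 100 mV, 10 MHz test tone (assumed)

def amplifier(x, gain=10.0, vn_rms=0.0, vsat=np.inf):
    """Behavioral amplifier: gain, input-referred noise, hard output clipping."""
    y = gain * (x + vn_rms * rng.standard_normal(x.size))
    return np.clip(y, -vsat, vsat)

def sndr_db(y, ref):
    # SNDR of the non-ideal output against the ideal one
    err = y - ref
    return 10 * np.log10(np.mean(ref**2) / np.mean(err**2))

ideal = amplifier(sig)                             # step 1: ideal block
real = amplifier(sig, vn_rms=100e-6, vsat=0.9)     # step 2: add non-idealities
sndr = sndr_db(real, ideal)
print(f"SNDR with noise and clipping: {sndr:.1f} dB")
```

Swapping in different noise and saturation numbers (step 3 onward) immediately shows which non-ideality dominates, long before any transistor exists.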

System architect – the man, the myth, the legend

Ken Kundert specifically wrote about system architects in another similar article. A system architect’s job is to own the top-level schematic, help define block interfaces, develop simulation and modeling plans, work with verification and test engineers, etc. A system architect basically acts as the middle man who speaks different languages to coordinate multiple efforts during a design cycle. They are the go-to person when an issue arises or change is necessary.

Sounds like a talent that’s extremely hard to come by. Yet, every team has to task someone with being this person in a top-down design flow. All too often, the system architect ends up being a guru with models but with minimal circuit design experience, and thus wouldn’t spot a fundamental limitation until it’s too late.

My belief is that every designer can be a system architect to some extent and on different scales. Regardless of how complex your circuit block is, you can adopt the top-down loop methodology as long as you treat it as a system. Here are some ways for you to try and play system architect:

1. Always question specifications

While specifications serve as the first line of interface between designers and the final product, that is really all they are. No requirement is sacred and no sizing is sacred, as my last manager loves to say. One example is the use of effective number of bits (ENOB) for specifying data converters. There has been a shift from this generic figure of merit toward more application-specific ways of defining converter requirements. A noiseless but nonlinear ADC impacts the system differently than a noisy but perfectly linear one. So next time you are handed a specification table, ask WHY.

2. Always question the signal nature

Most circuit requirements come from assuming some signal type going into the circuit. Sinusoidal signals have been the go-to choice because we love Fourier and AC responses. They are easier to simulate and measure. Unfortunately, almost no real application processes only single-tone sine waves. With the system architect’s hat on, you should fully understand the signal’s nature. There might be characteristics in the signal that can be exploited to simplify or improve your circuits. Is the signal DC-balanced? What do its statistics look like? How does it respond to weak and hard nonlinearities?
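As a tiny illustration of how much signal statistics can differ, compare a single tone against a random NRZ stream with the same peak amplitude (a toy Python example, not any specific standard):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Same peak amplitude, very different statistics.
sine = np.sin(2 * np.pi * np.arange(n) * 0.01)   # 100 samples per period
nrz = rng.choice([-1.0, 1.0], size=n)            # random data, DC-balanced on average

for name, x in [("sine", sine), ("nrz ", nrz)]:
    print(f"{name}: mean={x.mean():+.3f}  rms={x.std():.3f}  "
          f"peak/rms={np.abs(x).max() / x.std():.2f}")
```

The sine spends most of its time near its peaks (rms of about 0.71), while the NRZ stream sits exactly at its peaks; a circuit spec derived from one can be badly mismatched for the other.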

3. Create simple simulatable models in schematics

Building models is difficult and requires picking up a new skill set. However, you can build many useful models with ideal analog components without any knowledge of Verilog or MATLAB. More complex primitives are already available, including delays, multipliers, converters, etc. Start building simulatable models with these components first. You will be surprised at how effective they can be in reducing simulation time and providing insights. There are more sophisticated modeling tools, like Xmodel, once you become more comfortable and proficient.

4. Define symbols and pins early before drawing transistors

Lastly, a system architect has a bird’s-eye view of what the finished product looks like. Start with the end product and you will get a better picture of how to get there. Try identifying and naming the most critical pins for each cell first. While you create symbols, your mind is already computing how to connect each block and prioritizing what to do next. Empty schematics with meaningful symbols can still be full of information. Be mindful that these symbols will definitely change later, so nothing needs to be perfect. Treat this exercise just like drawing block diagrams on a scratch pad. Your muscle memory for drawing schematics will put you in design turbo mode and keep you motivated to continue the design effort.

Conclusions

The boundary between “top” and “bottom” is muddier than most think. Top-down design is really a mindset where designers treat the circuit block as a system rather than a soup of transistors. Education and training programs in IC design still tend to produce good designers, but do little to steer them toward becoming architects.

In my personal view, schematics and basic component libraries provide enough tools for anyone to play the role of a system architect at all levels. I encourage all students and designers to start incorporating behavioral models (with ideal components and/or Verilog-AMS) in their schematics, even if there is a separate army for system modeling. The right models can help reduce simulation efforts, assist in debug, and solidify your own understanding of the circuit.

It is no secret that polyglots have huge advantages in the globalized world, and the analogy is equally true for circuit designers. Adopting a top-down design mentality is like learning multiple new languages, which will definitely prove fruitful in the long run.

The Unsung Heroes – Dummies, Decaps, and More

Like most fields, circuit design requires a great deal of “learning on the job”. My first encounters with dummies and decoupling capacitors (decaps) were through internships. In fact, they could be the difference makers in a successful tape-out (analog and digital alike). In this post, we will take a deep dive and discuss the best ways to manage these unsung heroes in schematics.

Smart use of dummies

As the name suggests, dummies are devices that sit in your design doing nothing functionally and looking “dumb”. The use of dummies falls under the category of “Design For Manufacturability” or DFM. They ensure that the real operating devices behave as closely to the simulation models as possible. Below are the three main reasons to include dummies:

1. Reduce layout dependent effects (LDE) for best device characteristics

The two biggest LDEs are the well proximity effect and the length-of-diffusion (LOD) effect, illustrated below. Basically, all FETs like to think they are the center of the universe. The right thing to do is to sacrifice the self-esteem of some dummies to extend the well edge and diffusion length. This is also why multi-finger devices are preferred over single-finger devices despite having the same W/L.

Well proximity and LOD effects (left), and their impact on device threshold voltage (right)
Adding dummies reduces LDEs for active devices in the middle (left); multi-finger devices suffer less LDE than single-finger devices (right)

Every process node’s LDE is different, but a general rule of thumb is to add 1-2um worth of dummies on either side for peace of mind (L0 in the graph above, where Vt plateaus). So before starting your design, study the DFM recommendations, or even better, draw some devices and simulate.

2. Identical device environments for matching

Even when diffusions can’t be shared (for example, compact logic gates or self-heating limitations), dummies are still necessary to ensure device matching. This also applies to other elements like resistors and capacitors. Specifically, the devices of interest should see the same environments, even including metallization. Below are some examples of where to use dummies for device matching:

(a) dummy inverters for consistent diffusion edge environments; (b) dummies around active resistors; (c) dummies next to matching current sources; (d) dummies next to matched MOM fingers

It’s not easy to share diffusion between single-finger inverters without adding extra parasitic loading, as in (a). Dummy inverters can be added on both sides so that the active diffusion edges consistently see another diffusion edge. Similar principles apply to resistors in a ladder, matched current sources, and MOM fingers in DACs. The idea is to create a regular layout pattern with the active cells in the middle of said pattern.

3. Spare devices for easier late-stage design tweaks

Preparing for last-minute design changes is crucial for any project. The worst kind of change is a device-size change, because FEOL space is precious and who knows what new DRC violations such changes can trigger. There is a whole industry built around ECOs (Engineering Change Orders) to handle late-stage design changes, especially for large VLSI systems. By placing dummies (or spare cells) strategically, only metal changes might be necessary for late design changes. My favorite example is the dummy buffers for custom digital timing fixes shown below.

Dummy buffers as spares for potential timing fixes

Take a simple timing interface in a high-speed custom digital path, and let’s say it is setup-time critical. The clock path needs some extra delay to give the flip-flop sufficient setup margin. We won’t know whether the margin is enough until we run post-layout simulation. A good practice is to put down some extra buffer/inverter cells, tied off as dummies, for post-layout modifications. Of course, it takes some experience to spot where these spare cells are needed, so start practicing as soon as possible.

Another quick example is placing spare gates in low-speed combinational logic for fixes late in, or even after, tape-out. You might have heard of people putting spare NAND and NOR gates everywhere for this reason. One tip is to use 4-input NAND/NOR gates, and tie the unused NAND inputs high and the unused NOR inputs low. This way, they can still function as 2- or 3-input gates. Modern synthesis and digital flows already automate this, but analog/mixed-signal designers need to be aware of it as well.
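As a quick sanity check of the tie-off trick, here is a toy Python model (truth-table level, not a real gate netlist) showing that a 4-input spare with the unused inputs tied off behaves exactly like a 2-input gate:

```python
from itertools import product

def nand4(a, b, c, d):
    return 0 if (a and b and c and d) else 1

def nor4(a, b, c, d):
    return 0 if (a or b or c or d) else 1

# Spare NAND4 with two inputs tied high acts as a NAND2;
# spare NOR4 with two inputs tied low acts as a NOR2.
for a, b in product([0, 1], repeat=2):
    assert nand4(a, b, 1, 1) == (0 if (a and b) else 1)
    assert nor4(a, b, 0, 0) == (0 if (a or b) else 1)
print("tied-off 4-input spares behave as 2-input gates")
```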

This idea also applies to analog circuits. Take the dummies that might exist in a CML circuit: bias current dummies, differential pair dummies, and resistor load dummies. They are all available as spares for last-minute tweaks to squeeze out extra gain or bandwidth. The key here is to reserve the real estate so that only metal changes are necessary. Most layout engineers I have worked with are magicians when it comes to quick metal fixes.

The catalog for decaps

There is no such thing as a pure capacitor outside of mathematics land. That is why you have probably run into pictures like the one below at some point (a simple tutorial here). The equivalent series inductance/resistance (ESL/ESR) of a capacitor degrades its high-frequency bypass capability. Even worse, beyond its self-resonant frequency a capacitor actually behaves inductively.

Realistic PCB capacitor model (top) and decoupling network impedance over frequency (bottom)
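To put rough numbers on this, here is a small Python sketch of |Z| versus frequency for a hypothetical 100 nF capacitor; the ESR and ESL values are assumed, not from any datasheet:

```python
import numpy as np

# |Z| of a real capacitor: Z = ESR + j*(w*ESL - 1/(w*C)).
# Assumed (hypothetical) values for a 100 nF part: ESR = 20 mOhm, ESL = 0.5 nH.
C, esr, esl = 100e-9, 0.02, 0.5e-9

f = np.logspace(5, 10, 501)                   # 100 kHz to 10 GHz
w = 2 * np.pi * f
z = np.abs(esr + 1j * (w * esl - 1.0 / (w * C)))

f_res = 1.0 / (2 * np.pi * np.sqrt(esl * C))  # self-resonant frequency
z_1g = z[np.argmin(np.abs(f - 1e9))]          # deep in the inductive region

print(f"self-resonance: {f_res / 1e6:.1f} MHz")
print(f"|Z| at 1 GHz  : {z_1g:.2f} Ohm (inductive, roughly w*ESL)")
```

Above self-resonance the impedance rises with frequency like an inductor, which is exactly why on-chip decaps are needed to take over where board capacitors stop helping.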

This picture continues on-chip. The PCB capacitors rely on in-package or on-die decaps to further suppress the supply impedance rise at higher frequencies. However, on-chip decaps face their own unique challenges: ESD, leakage, lower quality factor, etc. Let’s first go through the possible decap choices.

1. PMOS/NMOS gate decap

This is probably the first thing that comes to mind: connect the gate of a PMOS/NMOS to supply/ground, and connect the source and drain to the other rail. Typically the supply voltage is much larger than the device Vt, so we get a reasonably linear decap. To build a high-Q cap, the gate length is typically made quite long to lower the gate resistance. However, the overall ESR is still considerable once all the vias and metal layers are taken into account. Nevertheless, these decaps have much higher capacitance density than metal caps.

NMOS/PMOS gate decap schematics and example layout

So are we done? Not quite. The biggest issues for these decaps lie in reliability, specifically ESD and leakage performance. In many deep sub-micron nodes, the oxide is thin enough for electrons to tunnel through, leading to gate leakage current. For the same reason, the oxide layer is susceptible to breakdown when a high voltage is present or an ESD event happens. As a result, these decaps can lead to catastrophic failures if not taken care of. For example, if a positive ESD event happens on the supply, which directly connects to the NMOS’s gate, the device would likely break down, causing huge leakage current or even collapsing the supply.

Between the two flavors, PMOS tends to be the more reliable (though not necessarily better-performing) decap choice in most small-geometry processes. A planar PMOS has lower gate leakage than an NMOS. The parasitic diode between the Nwell and substrate provides some extra ESD protection. The extra parasitic capacitance between the Nwell and substrate is another point in the PMOS’s favor.

Cross section of planar PMOS and NMOS
2. Cross-coupled decap

To further improve on-chip decaps’ reliability, the cross-coupled decap structure came onto the scene (here is a nice paper on decaps). The structure does look funny: positive feedback leads to a stable biasing point in this decap. At this operating point, the circuit behaves as two parallel device capacitors, each with a device on-resistance in series. This ESR is much higher than that of the gate decaps, and thus less effective for high-frequency bypassing. However, the increased series resistance provides extra protection during an ESD event by limiting the current through the gate oxide. Most decaps in standard cell libraries today use similar structures to trade some performance for reliability. After all, nothing matters if your chip has a hole burnt through it.

Cross-coupled decap schematic, model and impedance over frequency
3. Thin vs. thick oxide

Another way to trade off reliability and performance is through the use of thick-oxide (TOX) devices. TOX devices have much lower leakage current and are rated for higher voltages, and thus have a better chance of surviving ESD events. The cost, however, is lower capacitance density (less capacitance per area due to the thicker oxide between gate and channel).
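A quick parallel-plate estimate shows the density penalty. The oxide thicknesses below are assumed round numbers for illustration, not any particular process:

```python
# Parallel-plate estimate of gate-cap density vs oxide thickness.
# Assumed effective thicknesses (hypothetical): 2 nm thin oxide, 6 nm thick oxide.
EPS0 = 8.854e-12      # F/m, vacuum permittivity
K_SIO2 = 3.9          # SiO2 relative permittivity

def density_ff_per_um2(t_ox_m):
    # C/A = eps / t_ox, converted from F/m^2 to fF/um^2 (factor of 1e3)
    return K_SIO2 * EPS0 / t_ox_m * 1e3

thin = density_ff_per_um2(2e-9)
thick = density_ff_per_um2(6e-9)
print(f"thin oxide : {thin:.1f} fF/um^2")
print(f"thick oxide: {thick:.1f} fF/um^2 ({thin / thick:.0f}x lower density)")
```

Density scales inversely with oxide thickness, so a 3x thicker oxide costs 3x the area for the same capacitance, which is the price paid for the reliability margin.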

There was an anecdote in my Ph.D. lab that a chip returned with huge off-state currents, and unfortunately nothing worked. The root cause was the large area of thin oxide NMOS decaps, coupled with perhaps improper handling of antenna effects, making the chips dead on arrival. After that incident, “only TOX decaps allowed” was an enforced rule in the group.

Industry and academia environments are certainly different and more rigorous rule checks are available today. Nevertheless, I still make my decap choices carefully because of this horror story.

4. MOM, MIM and power grid

Last but not least, we have the good old metal caps. They typically provide better quality factor, linearity, and reliability than device caps, but at a much lower capacitance density. Below is an illustration of the physical structures of MOM and MIM caps:

Example bird eye view of MOM capacitor (a) and cross section view of MIM capacitor (b)

In most cases, a MOM capacitor can be stacked directly on top of a device decap to increase density and quality factor; roughly 20% cap density improvement is achievable with an optimized layout. MIM caps might seem attractive because they sit between the top two metal layers with better density than MOM caps, but the thin plates’ high resistance is a bummer. I have never used MIM caps for supply decoupling because they disrupt power grids and have mediocre performance at high frequencies. However, don’t let my personal preference deter you from trying them; maybe they are the right fit for you.

One other “freebie” for decaps is the sidewall parasitic capacitances between power straps. Therefore, try to interleave your supply/ground lines whenever possible.

Decoupling signals

Let’s get this out of the way first: your supply is a signal. Sadly, not many people realize this until supply noise becomes a problem. What this really means is that a supply or ground pin in a schematic is not a small-signal ground, so connecting decaps to these nodes requires some thought.

Let’s take a PMOS current bias voltage for instance. Normally a low-pass filter (either C or RC) sits between the current mirror and the destination current source to lower noise. The question now is which decap type we should use.

First of all, since these decaps see a finite impedance to supply/ground, ESD is less of a concern (i.e., using NMOS gate caps is OK). We probably want the highest cap density to save area, so let’s stack as many MOM capacitors as possible. Ground is typically “quieter”, so let’s bypass to ground. Here is our first attempt:

First attempt at decoupling current bias voltage

At first glance, there is nothing wrong with this, considering noise sources from Iref or the diode-connected PMOS. However, as soon as we think about noise from the supply (which we believed to be noisier than ground), we see that at high frequency the supply noise faces a common-gate amplifier on the right side! If this bias current goes to an oscillator, boy would we have some jitter problems. The correct connection is to bypass the bias voltage to the supply, stabilizing the Vgs across the PMOS device. At the same time, a PMOS gate cap would be the better choice in terms of layout.

Supply noise injection comparisons between different decoupling schemes
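A back-of-envelope small-signal estimate illustrates the gap between the two bypass choices. The gm, ro, and noise numbers below are made up for illustration:

```python
# Small-signal sketch: supply noise -> PMOS current-source output noise,
# for a bias node bypassed to ground vs. bypassed to the supply.
# Assumed (hypothetical) device values: gm = 5 mS, ro = 20 kOhm.
gm, ro = 5e-3, 20e3
vn = 10e-3      # 10 mV of high-frequency supply noise (assumed)

# Cap to ground: the gate is an AC ground, so Vgs sees the full supply noise
# and the device acts like a common-gate stage.
i_noise_gnd = gm * vn
# Cap to supply: the gate tracks the supply, Vgs is stabilized;
# only the finite output resistance leaks noise through.
i_noise_vdd = vn / ro

print(f"bypass to ground: {i_noise_gnd * 1e6:.1f} uA of noise current")
print(f"bypass to supply: {i_noise_vdd * 1e6:.1f} uA of noise current")
print(f"improvement     : {i_noise_gnd / i_noise_vdd:.0f}x")
```

The improvement factor is simply the intrinsic gain gm*ro, which is why bypassing the bias to the wrong rail can be such an expensive mistake.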

Decoupling signals is often not as straightforward as it seems. I have dealt with signals that needed a specific ratio of decoupling to supply and ground for optimal performance. Such exercises become even more challenging when area is a constraint as well. This might seem obvious to some of you, but I am sure we have all made similar mistakes somewhere along the way. I hope this little snippet can save new designers some trouble.

Managing dummies

Finally, we get to the schematics part after a crash course on dummies and decaps.

You might already know my stance on who should initiate and manage dummies/decaps. I strongly believe designers should own the decisions on the usage and placement of these devices. As evidenced above, dummies and decaps directly impact circuit performance, and sometimes determine whether we have taped out a resistor or a brick. So start thinking about them as soon as a schematic hierarchy is created.

There are mainly two types of transistor dummies: ones that connect to a circuit node and ones connected to supplies. My recommendation is to try your best to draw the first type in schematics as intended in layout. It’s OK to leave the supply-connected dummies in a corner if you want to keep schematics clean, but definitely create your own floorplan. To illustrate, take the simple diff-pair example below. One version connects the dummies to node isrc explicitly, and the other tucks them away in the corner with net-name connections. Many schematics out there contain dummies like the left example. In bigger and flatter schematics, these can quickly become difficult to trace.

Different dummy drawing styles for example differential pair

The next tip involves aligning dummies in the same row as the active devices to reflect layout. The diff pair example didn’t follow this because it’s a simple circuit. We will use a conventional StrongARM latch as an example for this point.

Aligning dummies to rows of active devices in a StrongARM latch example

Note that the dummies on the vx nodes remain part of the active schematic, similar to the diff-pair example. On the right is a boxed section for supply-connected dummies arranged into rows. This might seem redundant since all the NMOS devices could be combined, but it creates a template for layout engineers and highlights the relative dummy locations. The dummy sizes DON’T need to be accurate when the schematic is first created; they serve as placeholders for layout (or you) to fill in later. Again, dummies are for LDEs, so always keep layout in mind.

If you haven’t already noticed, some PMOS dummies in the top row are connected as decaps. In general, don’t waste the opportunity to turn dummies into decaps (for supply and bias alike) right next to your circuits. They are the first line of defense against switching currents and capacitive feedthrough, as in a dynamic comparator.

Should we create dedicated dummy wrapper cells? My cop-out answer is that it’s a personal choice. However, if you design the schematic hierarchy right, no level should have enough dummies to even warrant a wrapper cell. So my real answer is: if a wrapper cell is ever needed, your schematic is probably too flat. Start wrapping active and dummy devices together.

Managing decaps

Most teams probably already have reusable decap cells. If you don’t have them, make them now!

For my first Ph.D. tape-out, the unit decap cell was the biggest time saver toward the end of the project. With mosaic instantiation, the empty areas around the core circuits were filled up in no time. My first chip didn’t work for other reasons, but I was very proud of the decaps I taped out (can you hear me choking up?).

Cartoon chip layout, with decap mosaics for different supply domains (orange & yellow)

Many details go into making these reusable decaps. Schematic-wise, they are a collection of unit decap cells of different flavors pulled from the catalog. In modern CMOS designs, these decaps’ unit layout area fits within a power or standard cell grid. Standard cell decaps are excellent examples; we just take that concept and apply it to higher-level custom decaps.
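As a rough illustration of the budgeting involved, here is a quick Python estimate with made-up numbers for the unit cell size, density, and fill region:

```python
# Rough decap budgeting for a mosaic fill. All numbers are assumed:
# a 200um x 150um empty region, a 2um x 1um unit decap at ~15 fF/um^2,
# and an 80% fill factor for routing keep-outs and grid alignment.
region_um2 = 200 * 150
unit_w, unit_h = 2.0, 1.0
density_ff_um2 = 15.0
fill_factor = 0.8

n_units = int(fill_factor * region_um2 / (unit_w * unit_h))
total_nf = n_units * unit_w * unit_h * density_ff_um2 / 1e6
print(f"{n_units} unit decaps, ~{total_nf:.2f} nF total")
```

Running this kind of estimate early tells you whether the white space around a block can realistically meet the decoupling budget before any layout starts.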

The first piece of advice might sound silly: make reasonably small symbols for unit decap cells. Decaps are important, but they are not the star of the show. Therefore, the real estate they take up in a schematic sheet should still be small. Case in point: a decap cell symbol in a standard library is most likely smaller than an inverter symbol. Along the same line of thinking, your custom decap cell’s symbol could be slightly bigger to include information about the decap type, but not by much.

Below are some example custom decap symbols, compared to the sizes of a typical standard cell decap and MOS symbols. By making them small but still informative, you can tuck these decaps away in a corner where they are less distracting.

Example custom unit decap symbols compared to standard cell decap and MOS symbols
Example StrongARM latch schematic with dummies and decaps

Moving up the schematic hierarchy, different decap types are necessary for the multiple supplies: for example, thick oxide for the IO voltage, a combination of thin and thick oxide for the core voltage, etc. The advice here is to ALWAYS make a dedicated wrapper cell for all the higher-level decaps. The example below is not really drawn to scale; one can imagine the decap wrapper symbol being significantly smaller than the rest of the core circuits. The key is again to put the cell away in a corner where it’s still easily accessible.

Decap wrapper example at higher level schematics

So what’s the big deal? Aside from a more modular schematic, there are two other main benefits.

  1. This creates a clean interface between design and layout engineers. The layout engineer can update the decap count inside the wrapper cell without interfering with ongoing changes in the core circuits. This will save everyone some effort during crunch time.
  2. The magic of black-boxing makes this schematic more simulatable. Fully extracted decaps come with millions of parasitic capacitances and resistances, which is one of the reasons why post-extraction simulations of higher-level schematics are almost impossible. With this schematic, we can mix and match the extraction outputs for all blocks. The decap wrapper can stay as a schematic or use C-only extraction. In the opposite case, the core circuit stays as a schematic while the decaps and power grids get a full RC extraction.

The decap wrapper cell doesn’t have to live only at the topmost level. In fact, I recommend putting these cells in at almost all mid-level blocks. Once you get used to it, it just becomes a copy/paste habit.

Conclusions

Dummies and decaps are not the sexiest things to talk about (I have tried very hard here). They are nevertheless the key elements that ensure our circuits operate as intended. Here is a quote about decaps by Kent Lundberg (my circuit and feedback class instructor during undergrad): “Decoupling capacitors are like seat belts. You have to use them every time, whether you think you’re going to need them or not.” The same applies to dummies in today’s process nodes.

Subjects like dummies and decaps are often learned on the job or from expensive mistakes. There are many other “boring” but critical elements that deserve more of our attention in a design process (mostly DFM-related). Oftentimes, fresh grads are overwhelmed with new terminologies, methodologies, and productization concepts that weren’t taught in school. To address this, grading the correct usage of dummies/decaps and overall schematic quality in a class project might be a good starting point.

Mistakes in chip design are expensive. Ironically, the hard truth is that sometimes people learn best from expensive mistakes. The best tradeoff, then, might be to share and openly discuss more “horror stories” in order to save younger designers from these million-dollar downfalls.

Metal Resistors – Your Unexpected Friend In Wire Management

Yes, you read the title right. If you haven’t seen or used metal resistors (a.k.a. metres, rm, etc.) in your schematics, I hope this post opens a new door. Most modern PDKs already include metal resistor cells in the library. If not, you could create your own with your CAD team’s help (if you have access to one). Normally, we work hard to avoid metres because they show up uninvited and mess up everything after extraction. However, they can be extremely helpful when placed deliberately, especially for simulation, testing and documentation purposes. In this post, I will explain in more detail how to effectively utilize metres in these areas.

Some wires deserve a name

Metal resistors have been around for a long time. I only began using them more heavily when finFETs came about. As explained in another post, layout and parasitics are now dominant factors in a design’s success. Therefore, many routing wires need to be scrutinized just like devices, and they deserve to have dedicated cells.

The easy example we can all agree on is on-chip inductors. Although many PDKs come equipped with inductor pcells, we probably still end up drawing our own. There are many methods to deal with inductor LVS (black boxing, creating pcells, etc.), but my current favorite is to use metal resistors. These schematics are boring to say the least (the resistance is negligible, often << 1 Ohm and essentially a short), but they will pass LVS as-is without any funky setups. To simulate, you replace the metres with another schematic view generated from your favorite EM solver, be it an inductor model or an n-port model. The possibilities are endless: a similar schematic can apply to transmission lines, as another example.

metres for inductor LVS
metres for transmission lines

Perhaps my favorite use case is creating a standalone routing cell for a critical net. This happens most often when a signal needs to branch out and reach multiple destinations. Metal resistors can help define this design intent early on (especially if you have already experimented with the floorplan). This is just another aspect of the “draw it like you see it” mentality. The example shown below is for a simple clock route, but you can easily expand this to a distributed or tree structure. Note that the schematic could be “less boring” now that I added some parasitic capacitors to both supply and ground.

metres for an example clock route

Let’s compare the two schematics below. On the top is a straightforward one showing a clock buffer driving the signal to the top and bottom buffers. Although not drawn here, one can imagine an improved version including annotated routing information and a cartoon floorplan in the corner. So how can we further improve upon that? That’s where the bottom schematic comes in, with a routing cell created with metal resistors.

Schematics improvement with routing cell

Here are some of the biggest benefits of the bottom schematic:

  1. It forces you to treat routing plans seriously and acknowledge that it’s part of your design. Heck, it makes everyone who looks at this say that routing cell must be very important.
  2. There are two more unique and critical nodes (i.e. clk_top & clk_bot) for easy probing during simulation. There might be some who are fluent in netlist and know exactly where the signal of interest is, but that’s not me. With this schematic I can easily probe these two nodes and obtain useful information right away (e.g. delay matching).
  3. This schematic intrinsically separates the design problem into two parts: driver fan-out sizing and parasitics. So if the post-layout simulation results aren’t as desired, we have a better plan of attack for debug. Is the circuit limited by routing parasitics or by fanout? Maybe I should try a C-only extraction for the routing cell to see if it’s resistance dominated. Maybe there is some layout issue in the buffer instead of the wire routes, so let’s use the extracted view only for the routing cell. I hope you can see this is a more efficient scheme to help designers isolate layout issues.
  4. Let’s talk about the supply/ground pins. The obvious reason is to give the extracted capacitors a better reference than “0”. The more important reason is that these pins will remind you to include power grids surrounding the wires in layout. Many designers find out much later that top level integration slapped a dense power grid over their critical signals. This can lead to yelling, hair pulling and sometimes redesign. Putting power pins on routing cells lowers the chance of such “surprises”.

Despite the examples focusing on high speed wires, metal resistors could be equally important for lower speed applications. When resistance matching is critical (e.g. summing node for a current DAC), segmenting a net with metal resistors can work wonders.

On-chip probe points

Now let’s go to the other extreme of the speed spectrum: DC test voltages. For the uninitiated, real world designs often require the ability to measure critical on-chip signals. For digital blocks, an internal mux and a protocol of your choice (I2C, SPI, monitor bus, etc.) are sufficient to select and send signals off chip. The principle is the same for analog signals, except you have to decide the exact locations to probe in the physical layout.

There are mainly two categories of test signals: performance critical and location critical. Performance critical signals are ones that you don’t wish to disturb when you look at them. For example, you don’t wish to add extra capacitive loading on a high speed net or you want to make sure no extra noise can be injected into the VCO control voltage through the test path. The typical solution is to use a large isolation resistor (could be ~100k) locally before sending the voltage to a far-away analog mux. In this case, the resistor is an actual device like a poly resistor.
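To build intuition for why a ~100k isolation resistor barely disturbs a DC measurement, here is a rough back-of-the-envelope sketch. The resistor and mux capacitance values below are assumed for illustration, not taken from any particular PDK:

```python
import math

# Assumed, illustrative numbers -- not from a specific process or design.
R_iso = 100e3      # isolation resistor, ohms (~100k as mentioned above)
C_mux = 5e-12      # assumed mux + routing capacitance on the test path, farads

tau = R_iso * C_mux                # RC time constant of the test path
f_3db = 1 / (2 * math.pi * tau)    # -3 dB bandwidth seen by the probed signal
t_settle = 7 * tau                 # ~0.1% settling (7 time constants)

print(f"tau     = {tau * 1e9:.0f} ns")
print(f"f_3dB   = {f_3db / 1e3:.0f} kHz")
print(f"settle  = {t_settle * 1e6:.1f} us")
```

With these numbers the test path becomes a few-hundred-kHz low-pass filter: effectively invisible to a DC measurement, while keeping high-frequency disturbances and noise on the test path from coupling back into the probed node.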

In other cases, extra loading is not problematic, but you are specific about the location and metal layer where the signal is probed. Supply and ground networks are the best example of this use. Our friendly metal resistor can be the perfect fit here. My suggestion is to create a corner in the schematic that summarizes the probe signals and their respective metals, like below. This little corner gives layout engineers enough initial information (fine tuning is certainly required), and also serves as documentation.

Metres for sense voltage connections

To those who are pcell savvy or wish to improve upon this, you can create a wrapper cell with custom symbols that have the metal information written on them. The size can also be adjusted for more compact drawings (schematic real estate is valuable too). Depending on your appetite and the scale of your design, this might be overkill. However, there is a similar use case in the digital land that might make more sense for some.

Digital mapper

Let’s take the diagram above and flip it left and right. Then you have a bus coming in on the left, branched out to new unique pins on the right. Remember the configuration section of the symbol here? This list can grow quickly for a larger block, and propagating all these pins to a higher level can become troublesome. Somewhere in the schematic hierarchy one needs to connect these meaningfully named pins to the dull digital buses. Perhaps you have seen something like this before:

Digital bits distribution by net names

Ah, the good old connect by net name crime. The noConn cells are there to eliminate the warnings due to a few unused bits, but now the whole bus is “not connected”. There is no structure in how the digital bits are connected to their destinations. No amount of “dynamic net highlighting” is gonna save your eyes when you need to debug a wrong connection. Your layout partner is probably also making a voodoo doll and sharpening some needles. Introducing the digital mapper cell, sponsored by metal resistors

Digital bits distribution by mapper cell

The magic happens inside the mapper, like below. Luckily, tools today recognize buses that are tapped off a bigger bus without complaining. This results in a much cleaner look for the schematic, and nothing is connected by net name. Right away, it conveys more information about each bit’s usage, expected metal layer and even relative location in the layout. For example, the noConns signify routing tracks reserved for shielding around critical signals, like power down and reset.

Unit mapper group example

Building upon this unit metres mapper group, a complete mapper cell can contain much more information about the circuit’s programmability. You guessed it – this can be the go-to place for all the default values if you annotate them here. What’s better, you can see which configurations share the same register map address. You can even read off the combined value in hex for digital and verification folks. This is just another example of schematic as documentation, made possible by metal resistors. From layout’s point of view, the initial effort might be similar, but any incremental change becomes easier to track with this cell.

Example of complete mapper schematic
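To make the “read off the combined value in hex” idea concrete, here is a minimal sketch of the bookkeeping a mapper cell documents. The field names, widths and defaults below are made up for illustration:

```python
# Hypothetical configuration fields sharing one register address,
# listed LSB first as (name, width in bits, default value).
fields = [
    ("pd",        1, 0b1),     # power down, default asserted
    ("bias_sel",  3, 0b010),   # bias current setting
    ("cap_trim",  4, 0b0111),  # tuning cap trim code
]

def pack(fields):
    """Combine LSB-first bit fields into a single register value."""
    value, shift = 0, 0
    for name, width, default in fields:
        assert default < (1 << width), f"{name} default overflows its width"
        value |= default << shift
        shift += width
    return value, shift

value, total_bits = pack(fields)
# Hex string that digital/verification folks can read off directly.
print(f"register default = 0x{value:0{(total_bits + 3) // 4}X}")
```

For these made-up fields the combined default reads 0x75 – exactly the kind of number you would annotate next to the mapper group so the register map and the schematic never drift apart.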

I have one final note about the digital mapper cell. Depending on your team’s own methodology, the mapper inputs could potentially be lumped into a single wider bus. This can help simplify the symbol of a mid-level block and makes higher level schematics easier to read and draw. But again, it’s up to your personal taste.

High level schematic symbol style flexibility with mapper cell

Dear Santa

As a build up to my personal wish list, here is my bad attempt at a Christmas poem:

‘Twas the night before Christmas when an unexpected friend showed up in the PDK,

Her name was Metres, who smiled and said “your wires are going to be OK”.

Forgive me Santa for being so greedy,

but I still wish Metres can be a bit more handy.

Don’t you know a special elf named CAD?

Perhaps he can help, I heard he’s a good lad.

I know he is busy all season long,

but here is the list for what I want

  1. As mentioned above, the metres symbols should display metal layer info directly.
  2. Currently, pins can be assigned a “type” (power, signal, analog, etc.), but I personally have never used them or understood their purpose. Is it possible to create a “digital” pin type and give me a field to input “default values”? It would be nice if the default value could show up on the symbol pin automatically.
  3. Is it possible to read in a digital mapper cell and generate a spreadsheet for the configuration table? This probably requires #2 to happen first.
  4. To expand upon #3, perhaps the process of creating configuration spreadsheets can be fully automated if special metres are recognized when tracing an entire netlist. Now designers only need to make sure their schematics contain the configuration details, and never have to touch Excel.
  5. A similar methodology might also work for analog test signals, just need another special flavor of metres pcell.

These might still be pipe dreams, but dreams do come true if we wish hard enough. The bigger point, however, is that we need to keep thinking about ways to enhance productivity, improve design scalability and reduce chances of error. An effective use of a tiny element like metres can translate to huge gain in efficiency. You never know what would be the next gem you find or create in the PDK.

Draw It Like You See It – Schematic and Layout Duality

The verdict is final: layout IS the design. The images below, taken from Alvin Loke’s CICC slides (watch his recent talk here), summarize the key challenges in modern CMOS design. Layout effort grows linearly (I think it’s more than that) due to more stringent DRC rules and longer design iterations. As a result, it has forced changes in most of our design mentalities (although we designers are a stubborn breed). I began to re-evaluate the way I draw schematics because I know that, in the end, a good layout is king. I still hold the personal belief that a design is more likely to work if it looks good. This applies to both schematics and layout.

Design and layout complexity in modern CMOS process [credit: Alvin Loke]

Design in reverse

I was fortunate enough to have early access to finFET processes (16nm to be exact) during my Ph.D. years. Funny story: when I first stared at a finFET layout, I assumed all gates were shorted due to the continuous poly lines. You can imagine my facial expressions when I learned about poly and MD cuts (😲+ 🤬 + 😭). It took me about 3-4 circuit blocks to fully understand the “evil” that is parasitics in finFETs. The LVS-ignorable parasitic caps that double your circuit’s power seem innocent next to the parasitic resistors that smirk at you as they make your netlist size explode. Eventually, I decided to do my designs in reverse: never begin schematics/simulations before layout placement. It might sound counterintuitive, but here are the biggest reasons and benefits for adopting this methodology:

  1. Save time for everyone in the design cycle
    Put yourself in the shoes of layout engineers. They need to tell you that your circuit is impossible to draw due to some DRC rule. They probably spent a whole day trying to figure this out, but the bad news was inevitable. Frustrations and tension fill the room. These scenarios can be avoided if the designers already understand the floorplan/DRC limitations. So instead of “optimizing” the circuits only in SPICE land, start with layout.
  2. Layout can determine your sizing, not simulations
    Designers tend to “overdesign” a cell when assigned a small puzzle piece in a big block. We run hundreds of simulations to squeeze out that 0.1dB of extra performance, only to find out later that it makes no difference after post-layout extraction. Nature is kind to us: we often deal with shallow optima, and a single transistor’s size won’t matter too much in the grand scheme of things. So instead of running tons of simulations, let your design choices be informed by what works better in layout. One example: increasing a device’s finger count by one can reduce the cell’s width thanks to abutment.
  3. Begin thinking about creating hierarchies because of layout “tediousness”
    A good schematics hierarchy could also help increase layout’s efficiency. To fully understand the importance of good hierarchical schematics, you need to experience the painful repetitive tasks in layout firsthand.
  4. Embed your design intent for parasitics into the floorplan
    No matter how good your cartoon floorplans in schematics are, they don’t come close to real layout floorplans. You might gain new insights after just laying down your tiny transistors and some wires. You might also want to break the OD diffusion to squeeze in more contacts. So, you change an NMOS from a single 10-finger device to 10x single-finger devices. You can then draw schematics with design intents for parasitics, but you need to OWN the layout floorplan for that to happen.

The design/layout Venn diagram

I have this mental Venn diagram for design and layout. I remind our team’s designers that a majority of their design effort should be in layout, with awareness of floorplan, DRC and parasitics at a minimum. On the other hand, a good layout engineer should be an electrical engineer at heart, knowing when to trade off parasitic capacitance against resistance, capable of suggesting better floorplans, and just a wizard with those hot keys.

It is certainly easier said than done, and I believe designers should take the initiative to reach across the aisle. DO NOT think you are doing the layout engineers’ job; rather, you are helping yourself down the line. I promise you that your design’s “time to completion” will drop significantly. Everyone in the layout team will shower you with appreciation if you are just a little more layout savvy.

Design and layout Venn diagram

Layout-like schematics

Enough of my ranting about how designers should learn layout; let’s discuss how, at the very least, we can draw schematics with layout in mind.

Different companies have different ways of managing their schematics at various levels. For large SoC companies, there might be a transition from a more manual, analog way of managing schematics to more digital-like methodologies (i.e. netlist, Verilog, etc.) somewhere in the hierarchy. In these cases, the higher level schematics are mostly auto-generated and human-unreadable. Sometimes this makes sense because the chip becomes just a combination of macros, and functional verification can be more efficient with the help of digital tools. Nevertheless, drawing good schematics that reflect layout is still a good idea at the mid/low levels. It is really a personal choice at which level, or for which cell, to draw a layout-like schematic, but it is a practice that can fit any hierarchy level.

It’s time to get our hands dirty. The biggest hurdle we need to jump over first is the default rectangular symbol shape handed to us by the EDA tools. Its partner in crime is the selection box, the invisible border that defines the symbol boundary and limits your creativity. The conventional wisdom says inputs on the left and outputs on the right. We have been going with the flow for a while, and to be fair, these conventions certainly get the job done. To break from them, here is the corner selection box that allows you to draw symbols of any shape.

Corner selection box

This allows you to create very layout-like symbols, yet provides a clear entry point to descend down the hierarchy. To illustrate, below is a top level schematic with a pad ring. The boring rectangle symbols will result in a schematic that looks like this (I didn’t put all the pin names on there for simplicity):

Boring pad ring symbol at top level

Now, if I draw the pad ring symbol as a real ring with the corner selection box, the schematic can turn into something like below:

An actual chip drawn in schematics w/ a ring symbol

Let’s detail the ways this is better:

  • The pad locations are explicit, so you can get a lot of information from the schematic alone. You can already visualize signal/current flows. You know exactly how many spare pads there are and where they sit, just in case. You know how to floorplan the internal blocks’ I/O pins. The list goes on.
  • It makes more sense to have duplicate pins on this pad ring symbol because it reflects the physical layout. Thus, you have an immediate idea of how many pads are available for the same signals, especially important for supply/ground.
  • Although I didn’t draw it here, you can imagine how one can expand this symbol to annotate each pad type (e.g. ESD strength, I/O pad type, etc.), adopting the schematics as documentation principle.
  • The sense of pride that comes with drawing and admiring this finished schematic, which you treasure almost as much as your kids (ok, maybe not that much).

Another dummy example

Now let’s move to a lower level dummy block for another example. I want to emphasize that this is not a real design and probably doesn’t even work. However, it’s a great example of how to draw layout-like schematics. Take a digital LDO (since we did an analog one before); we will focus on the custom digital controller and the PMOS array. The block diagram is shown below:

Dummy digital LDO core block diagram

As you can see, this block diagram serves as a pseudo floorplan for the layout as well. I will show you the final schematics first, then go into each sub-block.

Digitally controlled PMOS array schematics

We will dive into the PMOS array (or, more precisely, matrix) first. This cell embodies the notion that the layout is the design. It’s quite straightforward schematic-wise, but the nuances are all in layout. My preferred way to draw this cell for layout is to create row and column hierarchies like below:

Row and column schematics for the PMOS matrix

Note that I purposely make the csel bus come from the bottom to match the layout. The vin/vout pin directions are more conventional since there is no easy way to indicate a 3D structure (i.e. VIAs going up and down) in schematics.

Those eagle-eyed among you may have already noticed that the schematics can be simplified and made more scaling friendly using bus notations. When the matrix size is large (e.g. >256 elements or 8 bits), the bus notation makes sense. Otherwise, I think 16 rows + 16 columns can still be drawn out explicitly to reflect layout (that’s roughly 2 × log2(16) = 8 Ctrl-C/Ctrl-V operations if you duplicate by doubling, so not that bad). Together with a cartoon floorplan and more notes in the schematics, you can confidently hand this off to your layout partner.

Simplified row and column schematics for PMOS matrix
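The copy/paste count above assumes you duplicate by doubling the selection: each paste doubles what you already have, so N identical instances need about log2(N) pastes, and rows plus columns roughly twice that. A quick sanity check of the arithmetic:

```python
import math

def paste_ops(n):
    """Pastes needed to go from 1 copy to n copies by doubling the selection."""
    return math.ceil(math.log2(n))

rows, cols = 16, 16
total = paste_ops(rows) + paste_ops(cols)  # one doubling chain per dimension
print(total)  # 4 + 4 = 8 for a 16x16 matrix
```

The count grows logarithmically, which is why explicit drawing stays manageable far longer than a naive one-paste-per-element estimate would suggest.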

Now we will move on to the custom digital block. The more interesting subcell here is the shift register, so I will expand on it further. For the digital block itself, you can clearly see the three subcells and their relative positions w.r.t. each other. They can be placed in an L-shape with standard cells, fillers and decaps, just like in the schematics. Of course, I didn’t forget to draw a note box to indicate the higher level connections to the PMOS matrix. One benefit of this style of schematics (which might not be obvious until you try it yourself) is that you rarely have to connect by net names, because the signal directions are preserved like in the layout.

Digital controller schematics

If we descend into the shift register cell, I would draw something like the following. The example design intent here is to run the clock signal against the data path to lower the hold time violation risk. Thus the data input needs to be buffered to the other side of the register chain. The extra buffer delay will act as hold time margin for the first DFF.

Note that I also space out the data buffers and annotate the approximate distance between them. The data buffer size is already chosen appropriately because I know the metal usage and wire lengths in advance. The clock signal has the same annotations, along with a note symbol for shielding. It’s all possible because I played around with the layout floorplan before drawing this schematic. Again, we can simplify this schematic if the shift register gets too long. It might lose some layout information, so you can add a cartoon floorplan in the corner as a supplement.

Shift register schematics, explicit (top) vs. simplified (bottom)
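To see why routing the clock against the data direction helps hold time, here is a back-of-the-envelope slack calculation for one register stage. All delay numbers are assumed for illustration, not from any real standard cell library:

```python
# Back-of-the-envelope hold check for one shift-register stage.
# All numbers are assumed, illustrative values.
t_clk2q  = 20e-12   # launching DFF clock-to-q delay
t_wire   = 5e-12    # data wire delay between adjacent stages
t_hold   = 10e-12   # capture DFF hold requirement
clk_skew = -8e-12   # capture clock arrival minus launch clock arrival;
                    # negative because the clock runs against the data path,
                    # so the capture flop is clocked first

# Hold slack: the shortest data path must outlast the hold window,
# shifted by clock skew. Positive slack = safe.
hold_slack = (t_clk2q + t_wire) - t_hold - clk_skew
print(f"hold slack = {hold_slack * 1e12:.0f} ps")
```

With the clock running against the data, the skew term adds to the slack instead of eating it; the input buffer mentioned above plays the same role for the first DFF, where no upstream clock-to-q delay exists to help.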

Final points

In the interest of keeping this post at a reasonable length, I won’t include any more specific examples. However, here is a list of layout information that can be explicitly shown in schematics:

  1. Routing path features including 45/90 degree turns and branching for critical signals, especially if distributed long distance (e.g. clocks).
  2. Directionality between critical signals (e.g. show if data and clock paths are parallel or orthogonal).
  3. Special routing plans like a tree structure for matching or star connection for power.
  4. Inductor coil placements relative to other cells.
  5. Higher level block symmetry (for instance, replicated I/Q mixers in an RF signal path).
  6. Common centroid placements and connections for first order gradient cancellation (differential pairs, binary/segmented DACs, etc.).
  7. The list can go on as you start to draw it like you see it…

As a closing thought, I started this post with a focus on modern CMOS and finFETs, but the principles of designing in reverse and drawing layout-like schematics are equally applicable to older process technologies. Designers have to evolve and understand that bottlenecks and constraints often lie in other aspects, especially layout. By the same token, I also encourage designers to learn about new ideas in signal processing and systems.

In an ideal world, the Venn diagram described above would have a third circle for system design. Work flows and available talent nowadays force most teams to operate like the diagram on the left. Each circle will expand over time thanks to new technology and tools, but it’s the right overlaps that push innovation forward and ensure execution. We should all aspire to be in the middle of the intersections, and younger generation engineers should be trained as such. So gauge yourself against this picture, and move towards the center 1dB each day, one day at a time.


© 2024 Circuit Artists
