Category: Crash course

System for Analog Designers, Pt. 1 – What Comes In and What Goes Out

When we hear “system” in IC design, two types normally pop into our heads – the billion- (or trillion!-) transistor chips, or the PCBs that host these SoCs. To be completely honest, I never really liked the term “SoC”. It forces us to think a system must have a processor, memory, a plethora of I/Os and much more to be worthy of the name. In reality, every component inside an “SoC” is a system by itself with many interconnected sub-blocks. This is even more true in the advanced CMOS era, where mixed-signal processing and system-level co-optimization are crucial, even for a simple amplifier.

Tesla Dojo (left), Cerebras Wafer-Scale Engine (middle), 112Gbps receiver (right)

System thinking has never been an emphasis in analog design curriculums (granted, there is just too much to cover). However, this often leaves designers stuck in a weird place: they aren’t sure how the requirements came about or how their blocks fit into the system. And yet, we have all witnessed the huge benefits when a designer understands signal processing and system concepts.

The modern digital-assisted analog or analog-assisted digital paradigms call for more designers who can think more deeply about incoming signals, block interfaces and architectures. These are what I believe to be the top 3 pillars of system thinking for analog designers, which we shall explore in more detail in this post series.

The 3 pillars of system thinking

You can start practicing designing with a system mindset by asking the following 3 questions (and some sub-questions):

  1. Do I understand the nature of the signal coming into my block?
    1. Which signal characteristic is the most important?
    2. What is the worst case signal that I need to handle?
    3. Any signal characteristics that my circuit might be able to exploit?
  2. Do I understand my block’s interface?
    1. Should I use an analog or digital output interface?
    2. Is my load really a resistor or capacitor?
    3. What does my block look like to others?
  3. Do I have the right number of loops?
    1. Should I use sizing or loops?
    2. Too few or too many loops?
    3. Do any loops interfere with important signal characteristics?

The objective here is to develop a habit of challenging the architecture and circuit requirements, even if we are just “humble block designers”. Let’s dive deeper into the first two questions here (architecture and feedback deserve a post of their own) and learn about some of the key concepts and tools at our disposal.

What am I processing?

One of the first things we are taught is the Nyquist-Shannon sampling theorem. Analog designers have this “2x curse” in the back of our heads – somehow we always need to squeeze out twice the signal bandwidth in the frequency domain. Another trap we tend to fall into is ignoring the lower frequencies (also partly due to the 2x curse). The reality is that increasingly more applications and architectures simply don’t follow Nyquist sampling anymore.

For example, modern wireline links operate on baud-rate sampling. Sub-Nyquist sampling is paramount in some software-defined radios (SDR) and other compressive sensing applications. What enables these architectures is understanding the difference between signal and information bandwidths. The goal of our analog circuitry has always been to preserve or condition the information contained in the signal. Reconstructing the entire signal waveform (i.e. Nyquist sampling) is just a superset of preserving information.

We should begin viewing all our signal conditioning blocks as Analog-to-Information Converters (AIC), a concept inspired by compressed sensing theory. I believe most of the problems can be reframed in the AIC context. In my own field of wired/optical communication, the overall channel’s inter-symbol interference (ISI), which in the conventional sense is bad for signal bandwidth, actually contains valuable information. A maximum-likelihood sequence estimator (MLSE) desires the right amount of ISI for the decoding algorithm to work.

Getting to know your signal

I encourage all analog designers to first grasp what information their circuits are trying to process. Below are some things to ask about the signal characteristics that impact the incoming information:

  1. Is the information carried in a broadband (e.g. wireline) or narrowband (e.g. wireless) signal?
  2. Is there a huge discrepancy between the signal bandwidth and the information bandwidth? (e.g. we only care about the long delay times between very sharp periodic pulses, like an ECG signal)
  3. Is the information in the signal levels, signal transitions, or both? (e.g. level encoded like PAM vs edge encoded like Manchester code)
  4. Is there any low frequency or even DC information? (e.g. any encoding on the signal that impact low frequency content?)
  5. Is the signal/information arriving continuously or sparsely? (e.g. continuous vs. burst mode)
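Question 2 above is worth a quick numerical sketch. All values below are assumptions for illustration: a train of very sharp pulses has a huge signal bandwidth, yet if the only information we care about is the pulse intervals, the information bandwidth is tiny.

```python
import numpy as np

# Assumed toy waveform: 1-sample-wide pulses every 10 ms at a 1 MHz rate
fs = 1_000_000                  # sample rate (Hz)
n = 100_000                     # 100 ms of signal
pulse_period = 10_000           # one sharp pulse every 10 ms
x = np.zeros(n)
x[::pulse_period] = 1.0

# The spectrum spreads across the entire band ("wideband" signal):
spectrum = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(n, d=1 / fs)
upper = spectrum[freqs > fs / 4].sum()
lower = spectrum[(freqs > 0) & (freqs <= fs / 4)].sum()
print(round(upper / lower, 2))  # 1.0: as much energy up high as down low

# Yet the information itself is just a handful of slowly varying numbers:
intervals = np.diff(np.flatnonzero(x > 0.5)) / fs
print(intervals)                # the actual "message": 10 ms gaps
```

A circuit that only needs to preserve those intervals has a very different job from one that must reconstruct the full pulse shapes.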

A fun interview question

The discussion above might sound too high-level or even philosophical to some, so let me give an interview question type example (derived from a real world problem). Let’s say we have a transmit signal that looks like a “train of clocks” as shown below. The signal swing is relatively small and rides on a DC bias on the PCB. A huge DC blocking cap is used on board because the DC bias level is unknown. Your task is to design a receiver circuit for this clock train and preserve the clock duty cycle as much as possible.

The challenge here is a combination of the signal’s burst nature and the board level AC coupling. As a result, the chip’s input signal will have baseline wander, which is always a nuisance.

Our first attempt might be to use a comparator directly. The issue becomes how to set the reference voltage. There is no one reference voltage that can preserve the clock duty cycle for every pulse. The next natural thought is to google all baseline wander techniques out there to see if we can negate the AC coupling completely (then pull my hair out and cry myself to sleep).

Now, if we realize that the information in the clock actually lies in the edges and not the levels, there can be other possibilities. If the edges are extracted and a spike train is created like below, the new receiving circuit might be able to restore the levels from the spikes.

The simplest edge extraction circuit is actually just another AC coupling network, but the cutoff frequency needs to be high enough relative to the clock frequency. A level restorer could be conceptually a pulse triggered latch (w/ the right dosage of positive feedback). Congratulations, we just practiced Analog-to-Information conversion (high-passing to extract edges) and reconstruction (level restoration) and created a much simpler and more robust solution. In fact, the receiver would work equally well if the burst signal is PRBS-like.
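Here is a behavioral sketch of that solution (the waveform parameters are arbitrary assumptions): an ideal differentiator stands in for the second, high-cutoff AC coupling network, and a conceptual set/reset latch stands in for the pulse-triggered level restorer.

```python
import numpy as np

# Assumed burst "train of clocks": 20 cycles, a long gap, 20 more cycles
half = 10
clk = np.tile(np.r_[np.zeros(half), np.ones(half)], 20)
burst = np.r_[clk, np.zeros(400), clk]

# Edge extraction: each transition becomes a signed spike
spikes = np.diff(burst, prepend=burst[0])

# Level restoration: a rising spike sets the latch, a falling spike
# resets it (a real circuit would add hysteresis / positive feedback)
restored = np.zeros_like(burst)
state = 0.0
for i, s in enumerate(spikes):
    if s > 0.5:
        state = 1.0
    elif s < -0.5:
        state = 0.0
    restored[i] = state

print(np.array_equal(restored, burst))   # True: the edges carried it all
```

No baseline wander correction loop needed – the information survived the trip through the edges alone.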

Exploit signal nature

System thinking in analog design often requires “thinking outside the box” and leads to “easier” solutions. The first step is to understand the information that we aim to process and pinpoint what we could exploit. In the example above, we took advantage of the fact that the same information lies in the signal transitions as in the levels. This led to a solution better suited to this particular application. While we should be proud of making complicated circuits work, we should take equal pride in simpler solutions born from a better understanding of the incoming signal.

What am I driving?

After figuring out what’s coming into our blocks, we now shift the focus to where the output signal is going, or more precisely block interfaces. One major source of frustration is when you tweak a block to perfection but trouble arises when plugged into the system. Either the load doesn’t behave as expected or your own block is the problematic load.

Perhaps everyone can relate to the cringe when seeing heavily Figure of Merit (FOM) engineered publications. Some new circuits are extremely power efficient provided that the input source is a $10,000 box with a wall plug. Needless to say, it’s important to fully understand our blocks’ interface so that we can design and simulate accordingly.

The impedance lies

There aren’t that many lies greater than “my block looks like/drives a pure resistor or capacitor”. While a block’s input or load impedance might look like a pure resistor/capacitor at certain frequencies, every realistic element has a frequency-dependent impedance (Exhibit A). Over-relying on simplified R/C loads is another reason we sometimes can’t trust frequency domain simulations too much.

My readers already know my love for inverters, so let’s take a look at the picture below. As a start, let’s say our circuit is driving an ideal inverter. There shouldn’t be any objection to saying the input impedance looks like a capacitor. Fair enough.

Now let’s add a Miller capacitor in there. Right away, things become more complicated than meets the eye. If the Miller cap is small relative to the input cap, it gets amplified by the inverter gain and one might still approximate the input impedance as a capacitor with a Miller multiplied component. However, if the Miller cap is big enough that it begins to act as an AC short sooner, the load impedance now behaves as a resistive component because the inverter becomes diode connected (this is also the intuition behind pole splitting in Miller compensation).
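A small numerical sketch of this capacitive-to-resistive transition, using a single-pole inverter model (the gain, pole and capacitor values here are arbitrary assumptions, not a real inverter):

```python
import numpy as np

# Assumed toy model: inverter with DC gain A0 and pole fp, input cap Cin,
# Miller cap Cm from input to output. Yin = s*Cin + s*Cm*(1 - A(s)).
A0, fp = 20.0, 1e8              # gain of 20, pole at 100 MHz
Cin, Cm = 10e-15, 5e-15         # 10 fF input cap, 5 fF Miller cap
f = np.logspace(6, 11, 501)     # 1 MHz .. 100 GHz
s = 2j * np.pi * f
A = -A0 / (1 + s / (2 * np.pi * fp))
Yin = s * Cin + s * Cm * (1 - A)
Zin = 1 / Yin

phase = np.degrees(np.angle(Zin))
# Well below the pole, Zin is capacitive (phase near -90 deg) with the
# Miller-multiplied value Cin + Cm*(1 + A0). Around and above the pole,
# the s*Cm*A(s) term flattens into a real conductance ~ A0*wp*Cm, and
# the phase climbs well away from -90 deg: a partly resistive load.
print(round(phase[0]), round(phase.max()))
```

The exact numbers don’t matter; the point is that the “capacitor” a neighboring designer hands you is only a capacitor over part of the band.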

To be the lord of chaos, I will now throw in an LC tank at the inverter’s output, and why not cascade another stage (and another and another). Have you lost track of what the input impedance should be yet? Don’t believe this is a real circuit? Here is the resonant clock distribution circuit for a 224Gb/s transmitter. I would feel very uneasy using simple load capacitors when designing any intermediate stages.

Impedance modeling using n-ports

The habit of resorting to simple RC loads is not unjustified. They could certainly provide order-of-magnitude results and speed up simulations. However, as illustrated above, that doesn’t guarantee the block would act the same when plugged into a real system. As designers, we need to recognize this possible culprit and address it early on.

We don’t need to look far to see a better way to model our block interfaces. Signal and power integrity (SI/PI) experts have long figured out that every trace on a PCB is an n-port network.

We often forget the first things we learned. Electronics 101 prepared us for n-port modeling with Thevenin/Norton equivalent networks, and even a MOS transistor’s small signal model is network based. And yet, we rarely think about our own circuits as networks with S-parameters. For some reason, S-parameters are synonymous with RF design, but in reality there is a mathematical equivalence between S-parameters and Y/Z-parameters, making them applicable at all frequencies. S-parameters are popular simply because they are easier to measure in real life. The point is that S-parameters are a great modeling tool for linear circuits, and we should start utilizing them more.
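That S ↔ Z equivalence takes only a couple of lines to check. A sketch with a single reference impedance Z0; the assumed DUT is just a 100 Ω shunt resistor, whose 2-port Z-parameters are all equal to R:

```python
import numpy as np

# S <-> Z conversion for a common reference impedance Z0:
#   S = (Z - Z0*I) @ inv(Z + Z0*I),   Z = Z0 * (I + S) @ inv(I - S)
Z0, R = 50.0, 100.0
I = np.eye(2)
Z = np.full((2, 2), R)                   # Z-params of a shunt resistor

S = (Z - Z0 * I) @ np.linalg.inv(Z + Z0 * I)       # Z -> S
Z_back = Z0 * (I + S) @ np.linalg.inv(I - S)       # S -> Z round trip

print(S.round(3))               # [[-0.2  0.8] [ 0.8 -0.2]]: textbook values
print(np.allclose(Z, Z_back))   # True: same network, two descriptions
```

Nothing about this algebra is frequency- or RF-specific, which is exactly the point.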

Passing around *.snp files

The idea then is to have a new routine testbench that extracts the n-port model of our own or load circuits. The simulation is as simple as AC analysis, but provides the entire frequency dependent impedance information.

Most simulators have S-parameter analysis (some just as a special case of AC analysis). The interface between designers then becomes “.s2p” files, which could also have best-case/worst-case variants under different PVT conditions. Simulation time remains fast but accuracy improves dramatically. It serves as the perfect balance between using an ideal capacitor and using the extracted netlist of the next block.

In fact, your DUT can also be modelled as a .s3p, .s4p, etc., as long as we are most interested in the circuit’s linear behavior. The same S-parameter files are equally usable in specialized system modelling tools like MATLAB. Modeling active circuits with S-parameters is not new, but it is a wheel definitely worth reinventing (check out this 1970 thesis after a simple search).

Limitations of S-parameter models

As you might have guessed, the key limitation of this S-parameter modeling approach is the linear circuit assumption. When nonlinear effects become important (e.g. input amplitude dependent impedance change), small-signal S-parameters could yield different results (but still much better than an ideal capacitor). While there exists a so-called Large-Signal S-Parameter analysis (LSSP), it falls under harmonic balance (HB) or periodic steady state (PSS) analysis, which means it is geared more toward RF applications. In addition, S-parameters might be limiting when dealing with mixed-signal processing, like sampling circuits.

Nevertheless, I have found that impedance/circuit modeling using S-parameters generally allows fast simulation time, better accuracy and less system-level frustration down the line. In fact, analog designers can also gain system insights when interfacing blocks through S-parameters. Give it a try!

Let’s take a small break

System thinking in analog design is a skill that is increasingly important. Long gone are the days of building “general purpose” devices, and a good system solution requires tailored circuits for each application.

First and foremost, we should understand what our circuits are processing and their interfaces. I hope the examples discussed in this post open the door for some aspiring analog designers to adopt system mentalities. In the next post, we will move from the interface to the inside of each block, and talk about perhaps the most important architectural tool for analog designers – feedback. Till next time!

Lean-Agile Principles in Circuit Design, Part 1 – How to Reduce Design Wastes

Working in a startup has forced me to pick up Eric Ries’ “The Lean Startup” again. If you haven’t read it, it’s a book about applying “scientific principles” in startup or entrepreneurial environments. As a hardware guy, you could imagine my slight “disappointment” the first time I read it. “Well, it’s only for software”, “Big companies probably can’t adopt this”, “Written in 2011? That’s so yesterday”.

I now find some ideas more intriguing after my second viewing (actually listening on Audible during commute). I begin to make connections between the underlying principles behind “lean thinking” and IC design practices. Maybe (and just maybe), IC design is primed to adopt lean principles more formally and systematically. So, you are reading this two-part series as a result of my obsession with such principles for the past couple of months.

Ah management jargons, we meet again

To many, management jargon seems just as foreign (and possibly pompous) as engineering abbreviations. Nevertheless, the end goal on either side remains the same: plan, execute and deliver a satisfactory result in a time- and cost-efficient manner. I have come to learn that a good “process” is key to sustainable and predictable results. So let’s first put away our engineering hats and look at the three most popular process improvement methodologies compared to the traditional waterfall approach.

Lean

Lean manufacturing was invented by Toyota to achieve a more efficient production system. In the 1950s, Toyota adopted the just-in-time (JIT) manufacturing principle to focus on waste reduction (seven types identified) in the process flow. The system was later rebranded as “lean” and studied by many business schools. In a nutshell, lean systems aim to remove unnecessary efforts that create minimal value for the final output.

Six Sigma

Who doesn’t love normal distributions? Six Sigma’s originator Bill Smith must secretly be a marketing genius because the name fully captures the methodology’s key principle – reducing variations. As one would imagine, Six Sigma processes heavily rely on data and statistical analysis. Decisions are made with data evidence and not assumptions. This notion shouldn’t be that alien to IC designers – after all, we run Monte Carlo simulations precisely for yield reasons. Modern processes combine Lean and Six Sigma and call it Lean Six Sigma (jargons right?).

Agile

You might be the most familiar with this term. After the “Manifesto for Agile Software Development” was first published in 2001, it quickly gained steam and almost achieved “Ten Commandments” status in the software world. The biggest difference in Agile is its embrace of constant change and readiness to launch small revisions frequently. Many became fans of Agile during COVID since it proved to be the most resilient system.

Relevance to IC design

It’s easy to classify such process methodologies as “obvious” or “not applicable to hardware”. Some might even falsely generalize Lean as less, Six Sigma as perfect, and Agile as fast. Ironically, “less, fast and perfect” are actually the desirable outcomes from such processes. Acknowledging and studying these ideas can help improve our own design methodologies.

In this post, I want to zoom in on the “waste reduction” aspect of lean (part 1). Not only does over-specifying or over-designing often lead to waste, but valuable human and machine time is also left underutilized when schematics are drawn inefficiently.

It’s also no coincidence that some commonalities exist, which might be applicable to circuit design as well. Lean, Six Sigma and Agile all rely on a constant feedback loop of “build-measure-learn”. The difference lies only in the “complexity” and “latency” in the loop (part 2).

Now let’s try putting this in IC design’s perspective: if we are the managers of the “circuit design factory”, how would we adopt these principles?

Waste in IC design

Lean was first applied in manufacturing systems and later extended to other fields. Fortunately, lean is equally applicable to the engineering process. The table below, taken from an MIT course on lean six sigma methods, shows a mapping between the original manufacturing wastes and their engineering equivalents.

Engineering wastes aligned to the wastes in lean manufacturing [source: MIT OCW 16.660j Lec 2-4]

So how can we extend this further to IC design? Here is my attempt at mapping these wastes. I wager that you must have experienced at least one of these frustrations in the following table. I bet even more that we have all been waste contributors at a certain point.

Waste reduction toolbox

Now that we have identified the waste categories, let’s discuss the top 5 ways to reduce them during design cycles. You could call these personal learnings from the good and the bad examples. Many ideas here have parallels to some programming principles. So without further ado, let’s begin.

1. Finding O(1) or O(log N) in your routines

Targets waste #4, #8

I apologize for my software persona popping out, but there is beauty in finishing a task in constant or logarithmic time (check big-O notation). Examples in circuit design include using hierarchies, array syntax, and bus notations to reduce schematic drawing/modification to O(1) or O(log N) time.

If you are learning to create floorplans, ask your layout partners about groups, array/synchronous copy (for instances), aligning (for pins), and cuts (for routes). I wish someone had told me these lifesaving shortcuts earlier because I have spent way too much time doing copy→paste→move.

Travelling salesman problem [source: xkcd]

2. Systematic library/cellview management

Targets waste #1, #2, #3

Borrowing from software again, revision control in library managers is widely used nowadays. While the benefit is huge, it can lead to unintended bad habits. Many designers simply create as many variations of the same cell as possible without actively “managing” them. This can result in mass confusion later on, especially if no final consolidation happens. Worst case scenario, you could be checking LVS against one version, but taping out another.

If investigations and comparative studies require multiple versions, I recommend using a different cellview instead of creating a completely new cell. Combined with config views in simulations, the entire library becomes cleaner and more flexible. When library consolidation or migration happens, only the relevant final cells survive, leaving a clean database. I plan to discuss how to create a good cellview system in a more detailed future post.

Don’t sweat over what the cellview names on the right mean, but do take some educated guesses

3. Symbol/schematic skeleton before optimization

Targets waste #5, #6, #7

Top-down methodology encourages designers to have a bird’s-eye view of the entire system in addition to the fine details of their own cells. One method is to define block- and cell-level pins earlier in the design phase. This idea is similar (though not as sophisticated) to abstract classes or interfaces in object-oriented programming languages (e.g. Java, Python, etc.). Instead of implementing the specific functions right away, a high-level abstract description first defines the key methods and their interfaces. The IC equivalent would be to appropriately name and assign port directions for each block’s pins. The symbol itself then contains all the information about its interface.

“How can I build a symbol without knowing what’s inside?” The truth is you must know the most critical pins – an amplifier should at least have power, inputs and outputs. You must also know the most basic required features on a block – power down, reset, basic configuration bits. Informative symbols and schematic skeletons should be possible with these pins alone. The same concept is applicable to layout and floorplans, with pins + black boxing.

Since we are only dealing with symbols and pins here, it’s much easier to modify if specifications change or a new feature is requested. This ties into the “minimum viable product” (MVP) concept that we shall discuss in part 2.

A rough frame w/ non-ideal parts is a better starting point towards building a car than a perfectly round and polished wheel

4. Design w/ uncertainties & effort forecast

Targets waste #5, #6, #7

Now that your schematic skeleton looks solid, device-level design begins. You have a clear plan of execution thanks to the symbol creation exercise, but potential pre- and post-layout discrepancies bother you to no end. We have all had that fear: what if this thing completely breaks down after layout?

To address this, designers should 1. estimate parasitics early by floorplanning, 2. use sufficient dummies, and 3. add chicken bits. Depicted below is an example of a tail current source in an amplifier. Before starting layout, designers should have a mental (or real) picture of how the unit current cells are tiled together. There could be an always-on branch (8x), two smaller branches for fine adjustments (4x + 2x), and dummies (6x). A critical parasitic capacitor connects to the output node w/ a reasonably estimated value.

One could argue the extra programmable branches and dummies are “waste” themselves. Keep in mind that reserving real estate at this stage consumes minimal effort compared to potential changes later in the design process. Swapping dummies and the always-on cells only requires metal+via changes. What if the layout database is frozen during the final stages of the tapeout but some extra juice is required due to a specification change? What if the chip comes back and you realize the PDK models were entirely off? The chicken bits might just save you.

5. “Ticketing” pipeline between design and layout

Targets waste #3, #5, #8

This last one is my personal system for communicating with my layout partners. I use a poor man’s “ticketing” tool called POWERPOINT. Yes, you read that right – I am suggesting using one more .ppt document to cut IC design waste. My personal experience so far is that this interface document provides better communication and results than any zoom calls, especially if there are time zone differences. Below is what an example slide looks like.

Each slide acts as a ticket for a layout modification request. The slides DO NOT need to be pretty at all. A quick snapshot and description serve the purpose of both conveying and documenting the request. As the design gets more complete, this slide deck will grow in size but all changes are tracked in a visual way. This also allows the designer and layout engineer to prioritize and work at their own pace, almost in a FIFO manner. When periodic checkpoints or project milestones are close, this slide deck becomes extremely helpful for reviewing and further planning.

Till next time

Being lean in our design process never means reducing design complexity or using fewer tools. Rather, it’s the mentality that we should begin with the right design complexity and use the right tools.

I hope some techniques mentioned here can provide insights on how to be lean when designing. As promised, there is more to this topic. Specifically, the IC design process can also embrace the incremental changes of the Agile methodology. We can achieve better outcomes by breaking large design cycles into smaller ones. So stay tuned for part 2!

The Frequency Domain Trap – Beware of Your AC Analysis

This man right here arguably changed the course of signal processing and engineering. Sure, let’s also throw names like Euler, Laplace and Cooley-Tukey in there, but the Fourier transform has become the cornerstone of designers’ daily routine. Thanks to its magic, we simulate and measure our designs mostly in the frequency domain.

From AC simulations to FFT measurements, we have almost developed a second nature when looking at frequency responses. We pride ourselves in building the flattest filter responses and knowing the causes for each harmonic. Even so, is this really the whole picture? In this post, we will explore some dangers when we trust and rely on frequency domain too much. Let’s make Fourier proud.

Magnitude isn’t the whole story

Math is hard. We engineers apply what makes intuitive sense into our designs, and hide the complicated and head-scratching stuff behind “approximations”. Our brains can understand magnitude very well – large/small, tall/short, cheap/expensive. When it comes to phase and time, we can’t seem to manage (just look at your last project’s schedule).

So naturally, we have developed a preference for the magnitude of a frequency response. That’s why we love sine wave tests: the output is simply a delayed version of the input with different amplitude. It’s easy to measure and makes “intuitive sense”, so what’s the problem?

Sometimes, the phase portion of the frequency response contains as much information as, if not more than, the magnitude part. Here is my favorite example to illustrate this point (it makes a good interview question).

Take a look at this funky transfer function above. It has a left half-plane pole and a RIGHT half-plane zero. Its magnitude response looks absolutely boring – a flat line across all frequencies. In other words, this transfer function processes the signal only in the phase domain. If you focused only on the magnitude response, you would pat yourself on the back for creating an ideal amplifier. Shown below is a circuit that could give such a transfer function. Have a little fun and try deriving its transfer function (reference)
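A quick numerical sketch of that transfer function, with the pole and zero both placed at an assumed w0 = 2π·1 GHz: H(s) = (1 − s/w0)/(1 + s/w0), a first-order all-pass.

```python
import numpy as np

# LHP pole at w0, RHP zero at w0 (1 GHz is an arbitrary assumption)
w0 = 2 * np.pi * 1e9
f = np.logspace(7, 12, 200)          # 10 MHz .. 1 THz
s = 2j * np.pi * f
H = (1 - s / w0) / (1 + s / w0)

mag_db = 20 * np.log10(np.abs(H))
phase_deg = np.degrees(np.angle(H))

# The magnitude is a perfectly flat 0 dB line...
print(round(float(np.abs(mag_db).max()), 6))     # 0.0
# ...while the phase swings a full 180 degrees: all of the signal
# "processing" happens in the phase domain.
print(round(phase_deg[0]), round(phase_deg[-1]))
```

An AC magnitude plot alone would declare this an ideal buffer; the phase (and hence delay) plot tells the real story.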

But is it even real, or just a made-up example? If you have ever used an inverter, you would recognize the following waveform. Ever wondered where those spikes come from? They come precisely from the feedforward path (right half-plane zero) through the inverter’s Miller capacitor. This RHP zero also contributes to the inverter buffer’s delay. There is no way to predict these spikes from the magnitude response alone.

Magnitude response can still remain a good “indicator” of obvious issues (after all, it’s one of the fastest simulations). However, phase information becomes crucial with the introduction of parasitics and inductors, especially at high frequencies. Sometimes, it’s not the flattest response you should aim for (for those who are interested, look into raised-cosine filters and their applications in communications).

Probability – the third leg in the engineering stool

As mentioned before, we love our sines and cosines, but do we speak on the phone with a C# note? Most real life signals look more like noise than sine waves. In fact, the transmitter in a wireline link typically encodes data to be “random” and have equal energy for all frequencies. The signal’s frequency content simply looks like high energy white noise – flat and not that useful.

What’s interesting, however, is the probabilistic and statistical properties of the signal. Other than time and frequency, the probability domain is often overlooked. Let’s study some examples on why we need to pay extra attention to signal statistics.

1. Signals of different distributions

We will begin by clearing the air on one concept: white noise doesn’t imply a Gaussian/normal distribution. The only criterion for a (discrete) signal to be “white” is for each sample to be independently drawn from the same probability distribution. In the continuous domain, this translates to having a constant power spectral density in the frequency domain.

We typically associate white noise with Gaussian distributions because of “AWGN” (additive white Gaussian noise), which is the go-to model for noise. That is certainly not always the case when it comes to signals. Here are four special probability distributions

Again, if independent signal samples are taken from any one of these distributions, the resulting signal is still considered white. A quick FFT of the constructed signal would look identical to “noise”. The implications on the processing circuits’ requirements, however, are completely different.
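A sketch of that claim: three i.i.d. signals drawn from very different distributions (binary, uniform, Gaussian – scaled to unit variance, with lengths that are arbitrary assumptions) are equally “white”, as their autocorrelations are all essentially a delta at lag zero.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1 << 16

signals = {
    "dual-Dirac": rng.choice([-1.0, 1.0], n),
    "uniform": rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), n),
    "Gaussian": rng.normal(0.0, 1.0, n),
}

whiteness = {}
for name, x in signals.items():
    x = x - x.mean()
    X = np.fft.rfft(x)
    ac = np.fft.irfft(np.abs(X) ** 2) / n       # circular autocorrelation
    whiteness[name] = np.abs(ac[1:]).max() / ac[0]
    print(f"{name}: var={ac[0]:.2f}, worst off-peak autocorr={whiteness[name]:.3f}")
```

All three report unit variance and near-zero off-peak correlation, so their spectra are equally flat – yet, as the next paragraph argues, the circuit-level implications differ completely.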

Take linearity for instance. It wouldn’t be wrong to assume the linearity requirement for processing two digital levels should be much more relaxed than for a uniformly distributed input signal. The figure below shows that the nonlinearity error for a dual-Dirac distribution could effectively become “gain error”, while a uniform input yields a different error distribution. A Gaussian distributed input signal might also require less linearity than one with a sinusoidal-like distribution because smaller amplitudes are more likely.
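The dual-Dirac point is easy to verify numerically with a toy compressive nonlinearity y = x − a·x³ (the coefficient and signal lengths are assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
a = 0.1                               # assumed third-order coefficient

def residual_after_gain_fit(x):
    """RMS error left after absorbing the best-fit linear gain."""
    y = x - a * x**3                  # toy compressive nonlinearity
    g = (x @ y) / (x @ x)             # least-squares gain fit
    return np.sqrt(np.mean((y - g * x) ** 2))

binary = rng.choice([-1.0, 1.0], 10_000)     # dual-Dirac (PAM-2-like)
uniform = rng.uniform(-1.0, 1.0, 10_000)     # uniformly distributed

# For a two-level input, the cubic term merely rescales the levels, so
# all the "distortion" collapses into a benign gain error:
print(residual_after_gain_fit(binary) < 1e-9)      # True
# For the uniform input, a real residual error remains (~1.5% RMS here):
print(round(residual_after_gain_fit(uniform), 3))
```

Same nonlinearity, same coefficient – but only one of the two input distributions actually suffers from it.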

By understanding the input signal’s statistical nature, we can gather more insights about our circuits’ requirements than from the frequency domain alone. It is frequently a sin to design just for the best figure of merit (FOM) using a sine wave stimulus. Such designs are often sub-optimal or, even worse, non-functional when processing real life signals.

2. Stationary vs non-stationary signals

Before these distant probability class jargons scare you away, let’s imagine yourself speaking on the phone again. Unless you are chatty like me, the microphone should be picking up your voice in intervals. You speak, then pause, then speak again. Congratulations, you are now a non-stationary signal source: the microphone’s input signal statistics (e.g. mean, variance, etc.) CHANGE over time.

When we deal with this kind of signal, frequency domain analysis forces us into an “either-or” mode. We would perhaps analyze the circuit assuming we are in either the “speak” or the “pause” phase. However, the transition between the two phases might be forgotten.

This becomes especially important for systems where a host and device take turns to send handshake signals on the same line. In these cases, even pseudo-random bit sequences (PRBS) can’t realistically emulate the real signals.

Other scenarios involving baseline wander and switching glitches also fall under this category. Frequency domain analysis works best when signals reach steady state, but offers limited value for such time and statistical domain phenomena. The figure below depicts a handshake signal example from the HDMI standards. Try and convince me that frequency domain simulations help here.
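Non-stationarity is trivial to see with windowed statistics; a minimal sketch with arbitrary, assumed burst lengths for the “speak / pause / speak” source:

```python
import numpy as np

rng = np.random.default_rng(0)
win = 1000                                  # assumed analysis window length
signal = np.r_[rng.normal(0, 1, win),       # speak
               np.zeros(win),               # pause
               rng.normal(0, 1, win)]       # speak again

# Per-window variance changes over time - something a single FFT over
# the whole record would simply average away.
variances = [signal[i:i + win].var() for i in range(0, len(signal), win)]
print([round(v, 2) for v in variances])     # roughly [1, 0, 1]
```

A single PSD of the full record would show a diluted noise floor and tell you nothing about the silent interval or the transitions into and out of it.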

The small signal swamp

Though they are not entirely the same, small signal analysis is associated with frequency domain simulations because both belong to the linear analysis family. Designers are eager to dive into the small signal swamp to do s-domain calculations and run AC simulations. There is nothing wrong with that, but far too often we forget about the land that’s just slightly outside the swamp (let’s call it the “medium signal land”).

Overlooking the medium signal land can lead to design issues. Examples include slewing, nonlinearity, undesired settling dynamics, and sometimes even divergent behavior with bad initial conditions. Small signal thinking often tells a performance story: gain, bandwidth, etc. Medium/large signals, on the other hand, tell a functional story. Ask yourself: can I get to the small signal swamp from here at all? If not, you might have taped out a very high performance brick.
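Slewing is the classic medium-signal effect that small signal analysis misses. Below is a minimal Euler-integration sketch of a single-pole settling behavior with an output slew-rate limit (all values are hypothetical): a small step settles exponentially, while a large step spends most of its time slewing.

```python
import numpy as np

def settle(step, tau=1e-9, slew=2e8, t_end=12e-9, dt=1e-12):
    """Single-pole settling with an output slew-rate limit (Euler sketch)."""
    v, out = 0.0, []
    for _ in range(int(t_end / dt)):
        dv = (step - v) / tau               # rate the linear model wants
        dv = max(-slew, min(slew, dv))      # large steps hit the slew limit
        v += dv * dt
        out.append(v)
    return np.array(out)

def t99(resp, target, dt=1e-12):
    """Time to reach 99% of the target value."""
    return np.argmax(resp >= 0.99 * target) * dt

t_small = t99(settle(0.01), 0.01)   # 10 mV step: purely exponential, ~4.6*tau
t_large = t99(settle(1.0), 1.0)     # 1 V step: slews first, settles much later
print(f"t99 small step: {t_small*1e9:.1f} ns, large step: {t_large*1e9:.1f} ns")
```

An AC simulation would predict the same normalized settling for both steps; only the time domain view reveals the slew-limited behavior of the large one.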

In real life designs, key aspects like biasing, power-on sequence, and resets can be more important than the small signal behaviors. And the only way to cover these points is through time domain simulation.

Stand the test of time

My favorite example of why frequency domain measurements can be deceiving is found in this article by Chris Mangelsdorf. Chris’ example demonstrates that errors due to very high harmonics (i.e. code glitches) are often not visible in the frequency domain. In this particular case, they are difficult to spot even in the time domain without some tricks. This article also touches upon similar sentiments mentioned above, including phase information.

While many consider good FFT plots and ENOB numbers the finish line of a project, not understanding time domain errors like glitches can be catastrophic. For example, if an ADC has a code glitch every thousand samples (regardless of its perfect ENOB or FOM), it cannot be used in a communication link targeting a bit error rate (BER) of 1E-6 or below.
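A small sketch shows how little a rare glitch moves the headline numbers. Here a 10-bit quantizer gets a 4-LSB glitch every thousand samples; the SNDR/ENOB (computed from time-domain error power, which approximates what an FFT-based measurement would report) barely budges, while the per-sample error rate is a thousand times worse than a 1E-6 BER target:

```python
import numpy as np

bits, n = 10, 2**16
lsb = 2 / 2**bits
x = np.sin(2 * np.pi * np.arange(n) * 997 / n)   # full-scale sine, +/-1
code = np.round(x / lsb) * lsb                   # ideal quantizer
glitchy = code.copy()
glitchy[::1000] += 4 * lsb                       # a 4-LSB glitch every 1000 samples

def enob(y, x):
    """ENOB from signal power over error power (SNDR in dB, then the usual formula)."""
    sndr = 10 * np.log10(np.mean(x**2) / np.mean((y - x)**2))
    return (sndr - 1.76) / 6.02

print(f"ENOB clean:   {enob(code, x):.2f} bits")
print(f"ENOB glitchy: {enob(glitchy, x):.2f} bits")
print(f"sample error rate from glitches: {1/1000:.0e} (vs 1e-6 BER target)")
```

The glitchy converter still looks nearly perfect by FOM standards, yet it would fail any serious link budget.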

Unfortunately, time domain analysis is, well, time-consuming. Running large system level transient simulations inevitably crashes both servers and the human spirit. That’s why adopting a top-down methodology with good behavioral models is of increasing importance. To stand the test of time, we need to be smart about what and how to simulate in the time domain. Below is a list of essential time domain simulations:

  1. Power-on reset
    This tops the list for obvious reasons, yet it is often not discussed enough with students working on a tape-out. A good chip is a live chip first.
  2. Power down to power up transition
    Putting a chip into sleep/low power mode is always desired, but can it wake up properly? Run this simulation (no input stimulus is necessary) to check the circuit biasing between power down/up states.
  3. Input stimulus transition from idle to active state
    In some applications, input signal could go from idle to active continuously (e.g. burst mode communication, audio signals, etc.). Make sure your circuit handles this transition well.
  4. Special input stimulus like step or pulse response
    Instead of sine wave testing, consider using steps or pulses to test your circuit. Step and pulse responses reflect the system’s impulse response, which ultimately contains all frequencies’ magnitude/phase information. Techniques like this are helpful in characterizing dynamic and periodic circuits (see Impulse Sensitivity Function)
  5. Other initial condition sweeps
    Power and input signal transitions are just special cases for different initial conditions. Make sure you try several initial conditions that could cover some ground. For example, a feedback circuit might not be fully symmetrical. It could have different settling behaviors for high and low initial conditions.
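Item 4 above rests on a standard signal-processing fact: the derivative of the step response is the impulse response, whose Fourier transform carries the full magnitude/phase information. A quick sketch with a first-order RC response (hypothetical time constant) recovers the -3 dB frequency from a step alone:

```python
import numpy as np

# First-order RC step response, sampled (tau is a made-up example value)
tau, dt, n = 1e-6, 1e-9, 65536
t = np.arange(n) * dt
step = 1 - np.exp(-t / tau)

h = np.diff(step, prepend=0.0) / dt     # impulse response ~ d/dt of step response
H = np.fft.rfft(h) * dt                 # frequency response estimate
f = np.fft.rfftfreq(n, dt)

# Find where |H| drops to 1/sqrt(2) of its DC value
f3db = f[np.argmin(np.abs(np.abs(H) - np.abs(H[0]) / np.sqrt(2)))]
print(f"estimated -3 dB: {f3db/1e3:.0f} kHz (analytic: {1/(2*np.pi*tau)/1e3:.0f} kHz)")
```

One transient step simulation therefore characterizes the whole linear frequency response, which is exactly why step/pulse stimuli are so efficient for dynamic and periodic circuits.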

To state the obvious, this post is not suggesting that you ignore Fourier completely, but rather that you treat it as the first (not last) guiding step in your verification process. To build a solid stool on which your design can rest, we need to consider the frequency, time and probability domains together. So next time you look at another frequency response, think about phase, statistics, time, and hopefully this three-legged stool.

The Unsung Heroes – Dummies, Decaps, and More

Like most fields, circuit design requires a great deal of “learning on the job”. My first encounters with dummies and decoupling capacitors (decaps) were through internships. In fact, they could be the difference makers in a successful tape-out (analog and digital alike). In this post, we will take a deep dive and discuss the best ways to manage these unsung heroes in schematics.

Smart use of dummies

As the name suggests, dummies are devices that sit in your designs doing nothing functionally and looking “dumb”. The use of dummies falls under the category of “Design For Manufacturability” (DFM). They ensure that the real operating devices behave as closely to the simulation models as possible. Below are the three main reasons to include dummies:

1. Reduce layout dependent effects (LDE) for best device characteristics

The two biggest LDEs are the well proximity and length of diffusion (LOD) effects illustrated below. Basically, all FETs like to think they are the center of the universe. The right thing to do is to sacrifice the self-esteem of some dummies to extend the well edge and diffusion length. This is also why multi-finger devices are preferred over single-finger devices despite having the same W/L.

Well proximity and LOD effects (left), and their impact on device threshold voltage (right)
Adding dummies reduce LDEs for active devices in the middle (left); multi-finger devices suffer less LDE than single-finger devices (right)

Every process node’s LDE is different, but a general rule of thumb is to add 1-2um worth of dummies on either side for peace of mind (L0 in the graph above where Vt plateaus). So before starting your design, study the DFM recommendations, or even better, draw some devices and simulate.
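The rule of thumb translates directly into a dummy finger count once you know your poly pitch. Both numbers below are hypothetical placeholders; pull the real values from your PDK’s DFM documentation:

```python
import math

# Hypothetical numbers - check your PDK's DFM guide for the real ones
poly_pitch_um = 0.09     # assumed contacted poly pitch
target_um = 1.5          # somewhere inside the 1-2 um rule of thumb
n_dummy = math.ceil(target_um / poly_pitch_um)
print(f"~{n_dummy} dummy fingers per side for {target_um} um of edge coverage")
```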

2. Identical device environments for matching

Even when diffusions can’t be shared (for example, compact logic gates or self-heating limitations), dummies are still necessary to ensure device matching. This also applies to other elements like resistors and capacitors. Specifically, the devices of interest should have the same environments, even including metallization. Below are some examples of where to use dummies for device matching:

(a) dummy inverters for consistent diffusion edge environments; (b) dummies around active resistors; (c) dummies next to matching current sources; (d) dummies next to matched MOM fingers

It’s not easy to share diffusion for single-finger inverters without adding extra parasitic loading like in (a). Dummy inverters can be added on both sides to ensure that at least the diffusion edges consistently see another diffusion edge. Similar principles apply to resistors in a ladder, matching current sources, or MOM fingers in DACs. The idea is to create a regular layout pattern with the active cells in the middle of said pattern.

3. Spare devices for easier late-stage design tweaks

Preparing for last minute design changes is crucial for any project. The worst kind of change is a device size change, because FEOL space is precious and who knows what new DRCs these changes can trigger. There is a whole industry built around ECOs (Engineering Change Orders) to handle late-stage design changes, especially for large VLSI systems. By placing dummies (or spare cells) strategically, only metal changes might be necessary for late design changes. My favorite example is the dummy buffers for custom digital timing fixes shown below.

Dummy buffers as spares for potential timing fixes

Take a simple timing interface, and let’s say it’s setup-time critical in a high-speed custom digital path. The clock path needs some extra delay to give the flip flop sufficient setup margin. We won’t know whether the margin is enough until we do post layout simulation. A good practice is to put down some extra buffer/inverter cells, tied off as dummies, for post layout modifications. Of course, it takes some experience to spot where these spare cells are needed, so start practicing as soon as possible.

Another quick example is putting down spare gates for low speed combinational logic fixes late in, or even after, tape-out. You might have heard of people putting NAND and NOR spare gates everywhere for this reason. One tip is to use 4-input NAND/NOR gates, and tie the unused NAND inputs high and NOR inputs low. This way, they can still be used as 2- or 3-input gates functionally. Modern synthesis and digital flows already automate this, but analog/mixed-signal designers need to be aware of it as well.
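The tie-off trick is easy to verify with a truth table. A NAND4 with two inputs tied high reduces exactly to a NAND2, and a NOR4 with two inputs tied low reduces to a NOR2:

```python
from itertools import product

def nand4(a, b, c, d):
    return 1 - (a & b & c & d)

def nor4(a, b, c, d):
    return 1 - (a | b | c | d)

# Spare NAND4 with two inputs tied high acts as a NAND2;
# spare NOR4 with two inputs tied low acts as a NOR2.
for a, b in product([0, 1], repeat=2):
    assert nand4(a, b, 1, 1) == 1 - (a & b)
    assert nor4(a, b, 0, 0) == 1 - (a | b)
print("spare NAND4/NOR4 reduce to 2-input gates when tied off")
```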

This idea also applies to analog circuits. Take the dummies that might exist in a CML circuit: bias current dummies, differential pair dummies and resistor load dummies. They are all available as spares for last minute tweaks in order to squeeze out extra gain or bandwidth. The key here is to reserve the real estate so that only metal changes are necessary. Most layout engineers I worked with are magicians when it comes to quick metal fixes.

The catalog for decaps

There is no such thing as a pure capacitor outside of mathematics land. That is why you have probably run into pictures like the one below at some point (a simple tutorial here). The effective series inductance/resistance (ESL/ESR) of a capacitor degrades its high frequency bypass capability. Even worse, a capacitor behaves inductively above its self-resonance frequency.

Realistic PCB capacitor model (top) and decoupling network impedance over frequency (bottom)
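The series R-L-C model is easy to evaluate numerically. With hypothetical but plausible values for a PCB-level capacitor, the impedance minimum sits at the self-resonance frequency, above which the ESL takes over and |Z| rises:

```python
import numpy as np

# Hypothetical PCB cap: 100 nF with 20 mOhm ESR and 1 nH ESL
C, esr, esl = 100e-9, 20e-3, 1e-9
f = np.logspace(5, 9, 401)                   # 100 kHz to 1 GHz
w = 2 * np.pi * f
z = esr + 1j * w * esl + 1 / (1j * w * C)    # series R-L-C model
f_res = f[np.argmin(np.abs(z))]
print(f"self-resonance near {f_res/1e6:.1f} MHz; |Z| rises (inductive) above it")
```

At resonance |Z| bottoms out at the ESR, which is exactly why the ESR sets the best-case bypass impedance of a real capacitor.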

This picture continues on chip. The PCB capacitors rely on in-package or on-die decaps to further suppress the supply impedance rise at higher frequencies. However, on-chip decaps face their own unique challenges, like ESD, leakage, lower quality factor, etc. Let’s first detail the possible decap choices.

1. PMOS/NMOS gate decap

This is probably the first thing that comes to mind: connect the gate of a PMOS/NMOS to supply/ground, and connect the source and drain to the other rail. Typically the supply voltage is much larger than the device Vt, so we get a reasonably linear decap. To build a high-Q cap, the gate length is typically made quite long to reduce the gate resistance. However, the overall ESR is still considerable when taking all the layers of vias and metals into account. Nevertheless, these decaps have much higher capacitance density than metal capacitors.

NMOS/PMOS gate decap schematics and example layout

So are we done? Not quite. The biggest issues with these decaps lie in reliability, specifically ESD and leakage performance. For many deep sub-micron nodes, the oxide is thin enough for electrons to tunnel through, leading to gate leakage current. For the same reason, the oxide layer is susceptible to breakdown when high voltage is present or an ESD event happens. As a result, these decaps can lead to catastrophic failures if not taken care of. For example, if a positive ESD event happens on a supply that directly connects to an NMOS gate, the device would likely break down, causing huge leakage current or even collapsing the supply.

Between the two flavors, PMOS tends to be the more reliable (though not necessarily better performing) decap choice for most small geometry processes. Planar PMOS has lower gate leakage than NMOS. The parasitic diode between the Nwell and substrate provides some extra ESD protection. The extra parasitic capacitance between the Nwell and substrate is another point in PMOS’ favor.

Cross section of planar PMOS and NMOS
2. Cross-coupled decap

To further improve on-chip decaps’ reliability, the cross-coupled decap structure came onto the scene (here is a nice paper on decaps). The structure does look funny: a positive feedback loop leads to a stable biasing point in this decap. Under this operating point, the circuit behaves as two parallel device capacitors, each with a device on-resistance in series. This ESR is much higher than that of the gate decaps, and thus less effective for high frequency bypassing. However, the increased series resistance provides extra protection during an ESD event by limiting the current through the gate oxide. Most decaps in standard cell libraries today use similar structures to trade off performance for reliability. After all, nothing matters if your chip has a hole burnt through it.

Cross-coupled decap schematic, model and impedance over frequency
3. Thin vs. thick oxide

Another way to trade off reliability and performance is through the use of thick oxide (TOX) devices. TOX devices have much lower leakage current and are rated for higher voltages, and thus have a better chance of surviving ESD events. The cost, however, is lower capacitance density (smaller capacitance due to the larger distance between gate and channel).

There was an anecdote in my Ph.D. lab about a chip that returned with huge off-state currents; unfortunately nothing worked. The root cause was the large area of thin oxide NMOS decaps, coupled with perhaps improper handling of antenna effects, which made the chips dead on arrival. After that incident, “only TOX decaps allowed” became an enforced rule in the group.

Industry and academia environments are certainly different, and more rigorous rule checks are available today. Nevertheless, I still make my decap choices carefully because of this horror story.

4. MOM, MIM and power grid

Last but not least, we have the good old metal caps. They typically provide better quality factor, linearity and reliability than device caps, but at much lower capacitance density. Below is an illustration of the physical structures of MOM and MIM caps:

Example bird eye view of MOM capacitor (a) and cross section view of MIM capacitor (b)

In most cases, a MOM capacitor can be stacked directly on top of a device decap to effectively increase density and quality factor. Roughly 20% capacitance density improvement is achievable with an optimized layout. MIM caps might seem attractive because they sit between the top two metal layers with better density than MOM caps, but the thin plates’ high resistance is a bummer. I never use MIM caps for supply decoupling because they disrupt power grids and have mediocre performance at high frequencies. However, don’t let my personal preference deter you from trying them out; maybe they are the right fit for you.

One other “freebie” for decaps is the sidewall parasitic capacitance between power straps. Therefore, try to interleave your supply/ground lines whenever possible.

Decoupling signals

Let’s get this out of the way first: your supply is a signal. Sadly, not many people realize this until supply noise becomes a problem. What it really means is that a supply or ground pin in a schematic is not a small-signal ground, so connecting decaps to these nodes requires some thought.

Let’s take a PMOS current bias voltage for instance. Normally a low pass filter (either C or RC) exists between the current mirror and the destination current source to lower noise. The question now is which decap type we should use.

First of all, since the decaps see a finite impedance to supply/ground, ESD is less of a concern (i.e. NMOS gate caps are OK). We probably want the highest capacitance density to save area, so let’s stack as many MOM capacitors as possible. Ground is typically “quieter”, so let’s bypass to ground. Here is our first attempt:

First attempt at decoupling current bias voltage

At first glance, there is nothing wrong with this considering the noise sources from Iref or the diode-connected PMOS. However, as soon as we think about noise from the supply (which we believed to be noisier than ground), we see a common gate amplifier on the right side at high frequency! If this bias current goes to an oscillator, boy would we have some jitter problems. The correct connection is to bypass the bias voltage to the supply, stabilizing Vgs across the PMOS device. At the same time, a PMOS gate cap would be the better choice in terms of layout.

Supply noise injection comparisons between different decoupling schemes
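A back-of-the-envelope phasor calculation shows why the cap-to-ground connection hurts. Modeling the diode-connected PMOS as roughly 1/gm to the supply (gm and C values below are hypothetical), the decap-to-ground case lets supply noise develop across Vgs, and the resulting noise current grows with frequency toward the full gm:

```python
import numpy as np

gm, C = 5e-3, 10e-12          # hypothetical mirror gm and decap value
f = np.logspace(6, 10, 5)     # 1 MHz to 10 GHz
zc = 1 / (2j * np.pi * f * C)
r_diode = 1 / gm              # diode-connected PMOS looks like ~1/gm to the supply

# Decap to ground: the gate node is a divider between the noisy supply and ground
vg = zc / (zc + r_diode)              # gate voltage per 1 V of supply noise
i_gnd = gm * np.abs(vg - 1)           # source rides the supply, so vgs = vg - 1
# Decap to supply: the gate tracks the supply, vgs is fixed -> ~0 noise current
i_vdd = np.zeros_like(f)              # ignoring ro and other parasitics

for fi, ig in zip(f, i_gnd):
    print(f"f = {fi:8.0e} Hz: cap-to-ground noise current = {ig*1e6:8.1f} uA per V")
```

Bypassing to the supply instead keeps vgs fixed, so the injected noise current stays near zero (up to ro and parasitics, which this sketch ignores).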

Decoupling signals is often not as straightforward as it seems. I have dealt with signals that needed a specific ratio of decoupling to supply and ground for optimal performance. Such exercises become more challenging when area is a constraint as well. This might seem obvious to some of you, but I am sure we have all made similar mistakes somewhere along the way. I hope this little snippet can save new designers some trouble.

Managing dummies

Finally, we get to the schematics part after a crash course on dummies and decaps.

You might already know my stance on who should initiate and manage dummies/decaps. I strongly believe designers should own the decisions on the usage and placement of these devices. As evidenced above, dummies and decaps directly impact circuit performance, and sometimes determine whether we have taped out a resistor or a brick. So start thinking about them as soon as a schematic hierarchy is created.

There are mainly two types of transistor dummies: ones that connect to a circuit node and ones connected to supplies. My recommendation is to try your best to draw the first type in schematics as intended in layout. It’s OK to leave the supply-connected dummies in a corner if you want to make the schematic look cleaner, but definitely create your own floorplan. To illustrate, take the simple diff pair example below. One style connects the dummies to node isrc explicitly, and the other tucks them away in a corner with net name connections. Many schematics out there contain dummies like the left example; in bigger and flatter schematics, these can quickly become difficult to trace.

Different dummy drawing styles for example differential pair

The next tip involves aligning dummies in the same row as the active devices to reflect layout. The diff pair example didn’t follow this because it’s a simple circuit. We will use a conventional StrongARM latch as an example for this point.

Aligning dummies to rows of active devices in a StrongARM latch example

Note that the dummies on the vx nodes remain part of the active schematic similar to the diff pair example. On the right is a boxed section for supply connected dummies put into rows. This might seem redundant since all NMOS devices could be combined, but it creates a template for layout engineers and highlights the relative dummy locations. The dummy sizes DON’T need to be accurate when the schematic is first created. They serve as placeholders for either layout or you to fill in later. Again, dummies are for LDEs, so always keep layout in mind.

If you haven’t already noticed, some PMOS dummies in the top row are connected as decaps. In general, don’t waste opportunities to turn dummies into decaps (for supply and bias alike) right next to your circuits. They are the first line of defense against switching currents or capacitive feedthroughs like in a dynamic comparator.

Should we create dedicated dummy wrapper cells? My cop-out answer is that it’s a personal choice. However, if you design the schematic hierarchy right, no level should have enough dummies to even warrant a wrapper cell. So my real answer is: if a wrapper cell is ever needed, it could just mean your schematic is too flat. Start wrapping active and dummy devices together.

Managing decaps

Most teams probably already have reusable decap cells. If you don’t have them, make them now!

For my first Ph.D. tapeout, the unit decap cell was the biggest time saver towards the end of the project. By using mosaic instantiation, the empty areas around the core circuits were filled up in no time. My first chip didn’t work for other reasons, but I was very proud of the decaps I taped out (can you hear me choking up?).

Cartoon chip layout, with decap mosaics for different supply domains (orange & yellow)

There are many details that go into making these reusable decaps. Schematic-wise, they are a collection of unit decap cells of different flavors pulled from the catalog. In modern CMOS designs, these decaps’ unit layout area fits within a power or standard cell grid. Standard cell decaps are excellent examples. We now just take that concept and apply it to higher level custom decaps.

The first piece of advice might sound silly: make reasonably small symbols for unit decap cells. Decaps are important, but they are not the star of the show, so the real estate they take up on a schematic sheet should be small. Case in point: a decap cell symbol in a standard library is most likely smaller than an inverter symbol. Along the same line of thinking, your custom decap cell’s symbol can be slightly bigger to include information about the decap type, but not by much.

Below are some example custom decap symbols, compared to the sizes of a typical standard cell decap and MOS symbols. By making them small but still informative, we can tuck these decaps away in a corner where they are less distracting.

Example custom unit decap symbols compared to standard cell decap and MOS symbols
Example StrongARM latch schematic with dummies and decaps

Moving up the schematic hierarchy, different decap types are necessary for multiple supplies, e.g. thick oxide for the IO voltage, a combination of thin and thick oxide for the core voltage, etc. The advice here is to ALWAYS make a dedicated wrapper cell for all the higher level decaps. The example below is not really drawn to scale; one can imagine the decap wrapper cell symbol being significantly smaller than the rest of the core circuits. The key is again to put the cell away in a corner while keeping it easily accessible.

Decap wrapper example at higher level schematics

So what’s the big deal? Aside from a more modular schematic, there are two other main benefits.

  1. This creates a clean interface between design and layout engineers. The layout engineer can update the decap count inside the wrapper cell without interfering with ongoing changes in the core circuits. This will save everyone some effort during crunch time.
  2. The magic of black boxing makes this schematic more simulatable. A fully extracted decap network brings along millions of parasitic capacitances and resistances. That’s one of the reasons why post-extraction simulations of higher level schematics are almost impossible. With this schematic, we can mix and match the extraction outputs for all blocks. The decap wrapper can stay as schematic or use C-only extraction. The opposite case could be to keep the core circuit as schematic, but with a full RC extraction of the decaps and power grids.

The decap wrapper cell doesn’t have to live only at the top level. In fact, I would recommend putting these cells in at almost all mid-level blocks. Once you get used to it, it just becomes a habit of copy/paste.

Conclusions

Dummies and decaps are not the sexiest things to talk about (I have tried very hard here). They are nevertheless the key elements that ensure our circuits operate as intended. Here is a quote about decaps by Kent Lundberg (my circuit and feedback class instructor during undergrad): “Decoupling capacitors are like seat belts. You have to use them every time, whether you think you’re going to need them or not.” The same applies to dummies in today’s process nodes.

Subjects like dummies and decaps are often learned on the job or from expensive mistakes. There are many other “boring” but critical elements that deserve more of our attention in a design process (mostly DFM related). Oftentimes, fresh grads are overwhelmed by new terminologies, methodologies and productization concepts that weren’t taught in school. To address this, grading the correct usage of dummies/decaps and overall schematic quality in a class project might be a good starting point.

Mistakes in chip design are expensive. Ironically, the hard truth is that sometimes people learn best from expensive mistakes. The best tradeoff, then, might be to share and openly discuss more “horror stories” in order to save younger designers from these million-dollar downfalls.

© 2024 Circuit Artists
