Introduction to Clock Tree Synthesis (CTS)
When it comes to clock tree synthesis, we need to think in terms of 3W and 1H, that is, what, why, when and how. I coined this summary myself, so you may not have heard it before. What does each one mean?

What is CTS?

Why do you want to do CTS?

When is CTS done?

How to do CTS? How to ensure the quality of CTS results?

The above questions seem simple, but not many engineers can really answer them well. CTS is a favorite topic for interviewers hiring for digital IC back-end positions, because the breadth and depth of the answers usually reveal the candidate's level.


So I hope you will think more about the above questions, especially the last one: how do you do CTS well? Never answer the interviewer by simply listing which parameters you set and which commands you ran. A good answer covers the goals below.

Shortest clock insertion delay (clock tree latency)

The longer the clock tree, the more levels it has, and the more levels it has, the more power the tree consumes. In addition, because of OCV effects, timing becomes harder to meet.

Take the above picture as an example. If the PLL is located in the lower right corner and a flip-flop is located in the upper left corner, that flip-flop has the longest physical clock path and therefore the longest clock tree delay. Because it must be balanced with the other flip-flops, the clock paths to the other registers are lengthened as well.

Question: suppose there is a cluster of flip-flops in the middle area of the core. What impact will it have on the clock tree?

If the PLL is placed in the middle and part of the upper left corner is covered with a soft blockage, as shown in the figure below, the delay of the whole clock tree is greatly reduced.
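The effect of PLL position can be illustrated with a rough sketch. It uses Manhattan distance from the clock root to the farthest flip-flop as a stand-in for the longest physical clock path; the core size and flip-flop coordinates are made-up values, not taken from the figure, and the soft blockage is not modeled.

```python
# Hypothetical coordinates in micrometres; not taken from the article's figure.

def manhattan(a, b):
    """Manhattan distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def worst_route_length(root, sinks):
    """Longest Manhattan distance from the clock root to any sink.

    Used here only as a rough proxy for the longest clock path, since the
    whole tree has to be balanced against its farthest sink.
    """
    return max(manhattan(root, sink) for sink in sinks)


CORE = 1000  # assumed 1000 um x 1000 um core
flops = [(50, 950), (500, 500), (900, 100), (100, 100), (950, 950)]

corner_root = (CORE, 0)                 # PLL in the lower right corner
center_root = (CORE // 2, CORE // 2)    # PLL in the middle of the core

print("PLL in corner:", worst_route_length(corner_root, flops))
print("PLL in center:", worst_route_length(center_root, flops))
```

With these assumed coordinates the farthest sink is 1900 um from the corner PLL but only 900 um from the centered PLL, which is the intuition behind moving the clock source toward the middle.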

Recently, a member of my Knowledge Planet group asked whether registers can be placed in the narrow channels between memories. In theory, registers can be placed in any channel, but when the memory channels are deep, it is best not to place registers there from a CTS point of view, because registers in the channels may drag down the whole clock tree.

Therefore, a CTS-friendly floorplan is very important; it directly determines the quality of CTS.

Minimum clock skew

For the concept of clock skew, see the earlier articles: Why should we care about clock skew? How do we reduce clock skew? How does clock skew affect setup and hold?

In most cases, we want the clock skew to be as small as possible, because that benefits both setup and hold.
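Why small skew helps both checks can be seen from the usual single-cycle slack formulas. The sketch below is only an illustration: all delay values are made-up integers in picoseconds, and OCV and clock uncertainty are ignored.

```python
# All numbers are hypothetical, in picoseconds.

def setup_slack(period, launch_lat, capture_lat, clk_to_q, data_max, t_setup):
    """Setup slack: required time at the capture flop minus latest data arrival."""
    required = period + capture_lat - t_setup
    arrival = launch_lat + clk_to_q + data_max
    return required - arrival


def hold_slack(launch_lat, capture_lat, clk_to_q, data_min, t_hold):
    """Hold slack: earliest data arrival minus the hold requirement."""
    arrival = launch_lat + clk_to_q + data_min
    required = capture_lat + t_hold
    return arrival - required


PERIOD, CLK_TO_Q, T_SETUP, T_HOLD = 1000, 100, 50, 20
DATA_MAX, DATA_MIN = 800, 30

for launch, capture in [(500, 500), (500, 650), (650, 500)]:
    s = setup_slack(PERIOD, launch, capture, CLK_TO_Q, DATA_MAX, T_SETUP)
    h = hold_slack(launch, capture, CLK_TO_Q, DATA_MIN, T_HOLD)
    print(f"launch={launch} capture={capture} setup={s} hold={h}")
```

With balanced latencies both checks pass. A late capture clock helps setup but breaks hold, and a late launch clock helps hold but breaks setup. Since the sign of the skew differs from path to path, keeping its magnitude small protects both.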

But sometimes we deliberately introduce a certain amount of clock skew. For example, CCD (concurrent clock and data optimization) in Synopsys tools borrows timing margin from the preceding and following stages to improve timing. Likewise, from the perspective of IR drop, we do not want all registers to toggle at the same instant.
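The idea of borrowing margin from a neighboring stage can be shown with a little arithmetic. This is a hand calculation on a made-up three-flop pipeline, not a description of how any tool's CCD engine works; all numbers are hypothetical picosecond values, and hold checks are left out.

```python
# Hypothetical three-flop pipeline: FF1 -> logic A -> FF2 -> logic B -> FF3.

def pipeline_setup_slacks(latencies, data_delays, period, clk_to_q, t_setup):
    """Setup slack of each stage, given the clock latency at every flop.

    latencies[i] is the clock latency at flop i; data_delays[i] is the
    worst-case logic delay between flop i and flop i + 1.
    """
    slacks = []
    for i, data in enumerate(data_delays):
        required = period + latencies[i + 1] - t_setup
        arrival = latencies[i] + clk_to_q + data
        slacks.append(required - arrival)
    return slacks


PERIOD, CLK_TO_Q, T_SETUP = 1000, 100, 50
DATA_DELAYS = [900, 500]  # logic A is slow, logic B has plenty of margin

balanced = pipeline_setup_slacks([0, 0, 0], DATA_DELAYS, PERIOD, CLK_TO_Q, T_SETUP)
skewed = pipeline_setup_slacks([0, 100, 0], DATA_DELAYS, PERIOD, CLK_TO_Q, T_SETUP)

print("zero skew   :", balanced)
print("FF2 +100 ps :", skewed)
```

Delaying the clock of the middle flop by 100 ps moves margin from the fast stage to the slow one, so both stages end up with positive setup slack. A real flow would also have to recheck hold on both stages.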

Design rule violations (DRV)

DRV mainly refers to max_transition, max_cap and max_fanout. The first two are hard constraints that must be met at the signoff stage.
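As a small illustration, the three checks amount to comparing each clock net against its limits. The net names, measured values and limits below are hypothetical; in a real flow these numbers come from the tool's constraint reports and the library.

```python
from dataclasses import dataclass


@dataclass
class ClockNet:
    name: str
    transition_ns: float  # worst transition time seen on the net
    cap_pf: float         # total capacitance driven by the net
    fanout: int           # number of driven pins


def find_drvs(nets, max_transition_ns, max_cap_pf, max_fanout):
    """Return (net name, violated rule) pairs for every limit that is exceeded."""
    violations = []
    for net in nets:
        if net.transition_ns > max_transition_ns:
            violations.append((net.name, "max_transition"))
        if net.cap_pf > max_cap_pf:
            violations.append((net.name, "max_cap"))
        if net.fanout > max_fanout:
            violations.append((net.name, "max_fanout"))
    return violations


# Hypothetical clock nets and limits, for illustration only.
nets = [
    ClockNet("clk_root", transition_ns=0.05, cap_pf=0.10, fanout=4),
    ClockNet("clk_leaf_a", transition_ns=0.20, cap_pf=0.12, fanout=40),
    ClockNet("clk_leaf_b", transition_ns=0.08, cap_pf=0.30, fanout=16),
]

for name, rule in find_drvs(nets, max_transition_ns=0.15, max_cap_pf=0.25, max_fanout=32):
    print(f"{name}: {rule} violated")
```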

Common clock path as long as possible

For a fixed clock tree latency, the common clock path should be as long as possible: the longer it is, the more pessimism CRPR removes and the better the timing.
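A simplified calculation shows why. The sketch assumes flat OCV derates of +10% on the launch clock and -10% on the capture clock, and the same total latency on both branches; the derates and delays are made-up values, and real tools compute CRPR per path with far more detail.

```python
# Hypothetical delays in picoseconds and derates in percent.

def ocv_clock_gap(common, branch, late_pct=110, early_pct=90):
    """Return (raw gap, CRPR credit, gap after CRPR) for one launch/capture pair.

    The launch clock is derated late and the capture clock early. The shared
    part of the path cannot really be both fast and slow at once, so CRPR
    gives back the derate difference accumulated on the common segment.
    """
    total = common + branch
    launch_late = total * late_pct // 100
    capture_early = total * early_pct // 100
    raw_gap = launch_late - capture_early
    crpr_credit = common * (late_pct - early_pct) // 100
    return raw_gap, crpr_credit, raw_gap - crpr_credit


# Same 1000 ps latency, split differently between common and non-common parts.
print("short common path:", ocv_clock_gap(common=200, branch=800))
print("long common path :", ocv_clock_gap(common=800, branch=200))
```

Both trees have the same latency, but the one with the longer common path keeps only 40 ps of OCV pessimism instead of 160 ps. The same numbers show why the non-common part should be kept short.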

The non-common clock path should be as short as possible.

Multistage clock gating

From the point of view of power consumption, we want the clock gating cell (ICG) to be as close to the clock root as possible, and most registers should be controlled by a clock gating cell. However, when the ICG is placed near the root, setup timing at the ICG enable pin is prone to violations.

Clock gate cloning / clock gate splitting

During CTS, the P&R tool can not only merge clock gates but also clone and split them. We discussed above that the non-common clock path should be as short as possible; in many cases this can be achieved by cloning clock gates.

Clock duty cycle

The main cause of duty cycle problems is the imbalance between the rise delay and the fall delay of the cells. That is why we often build the clock tree with clock inverters.
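The reasoning can be checked with a toy model. It assumes every cell has one fixed output-rise delay and one fixed output-fall delay (made-up values in picoseconds) and ignores slew and load effects.

```python
# Toy model: each cell has a fixed output-rise and output-fall delay (ps).

def edge_delay(cells, starts_rising, t_rise, t_fall):
    """Total delay of one source clock edge through a chain of cells.

    cells is a list of "buf" or "inv". An inverter flips the edge polarity,
    and each stage adds the delay of whichever transition appears at its output.
    """
    rising = starts_rising
    total = 0
    for cell in cells:
        if cell == "inv":
            rising = not rising
        total += t_rise if rising else t_fall
    return total


def high_pulse_distortion(cells, t_rise, t_fall):
    """Change in the high pulse width at the end of the chain."""
    rise_edge = edge_delay(cells, True, t_rise, t_fall)
    fall_edge = edge_delay(cells, False, t_rise, t_fall)
    return fall_edge - rise_edge


T_RISE, T_FALL = 50, 55  # hypothetical, slightly unbalanced cell

print("8 buffers  :", high_pulse_distortion(["buf"] * 8, T_RISE, T_FALL))
print("8 inverters:", high_pulse_distortion(["inv"] * 8, T_RISE, T_FALL))
```

In the buffer chain the rising edge always sees the rise delay and the falling edge always sees the fall delay, so the 5 ps mismatch accumulates to 40 ps over eight stages. In the inverter chain each edge alternates between the two delays, so with an even number of stages the mismatch cancels.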

Clock signal integrity

To ensure the quality of the clock signal, the clock nets are routed before the signal nets, with NDRs (non-default routing rules) applied to them. When necessary, the clock nets are also shielded.

Custom clock tree synthesis

For a simple design, clock_opt -cts or ccopt_design -cts may well build a good tree. But for an SoC design with a complex clock structure, can we simply run the command and get a good tree?

Obviously not.

In general, clock tree synthesis for SoC chips of medium scale and above requires a hand-written clock tree constraint file. With so many clocks, the tool struggles, especially when a group of clocks passes through muxes, and it often builds very long trees. The main reason the tool cannot build a good tree is that the clock structure is complex, beyond what the tool can handle on its own.

Writing method of CTS constraint

If we split the clock structure and describe it clearly to the tool, the tool can still do a very good job. The premise, of course, is that you understand the clock structure of the whole chip.

Drawing the clock structure diagram and writing the clock constraint file are both essential, core skills for digital IC back-end engineers. Once you have mastered them, will the rest of the digital IC back-end implementation still be difficult?

Here, using the case shown in the above figure, we analyze how to write a clock constraint file.

First of all, the clock path of the whole chip can be divided into three parts. The first part is crystal oscillator -> PLL, the second part is PLL -> clock gen, and the third part is divider outputs -> each functional module.

Secondly, note that the clock gen module is only used to generate the various divided clocks and does not interact directly with other logic. Therefore, the registers in this module are asynchronous to the other registers.

Finally, sort out the endpoints and clock exceptions of each clock path, such as float pins, exclude pins and non-stop pins, and disconnect clocks that do not need to pass through a mux, and so on.
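To make the exception types more concrete, here is a small sketch of how they change the latency a sink is balanced to. It assumes the common meaning of the terms: a stop pin is a normal balancing endpoint, a float pin is an endpoint with extra clock delay behind it (for example inside a macro), and an exclude pin is left out of balancing. The pin names and delays are hypothetical, and this is not the constraint syntax of any particular tool.

```python
# Pin names and delays (ps) are hypothetical, for illustration only.

def latency_targets(pins, tree_target_ps):
    """Map each clock endpoint to the latency CTS should balance it to.

    pins maps a pin name to (kind, internal_delay_ps):
      "stop"    -> balanced to the common target
      "float"   -> balanced earlier by the delay hidden behind the pin
      "exclude" -> not balanced at all (None)
    """
    targets = {}
    for name, (kind, internal_ps) in pins.items():
        if kind == "stop":
            targets[name] = tree_target_ps
        elif kind == "float":
            targets[name] = tree_target_ps - internal_ps
        elif kind == "exclude":
            targets[name] = None
        else:
            raise ValueError(f"unknown pin kind: {kind}")
    return targets


pins = {
    "u_core/reg_a/CK": ("stop", 0),
    "u_sram0/CLK": ("float", 120),      # macro with 120 ps of internal clock delay
    "u_clkgen/div_reg/CK": ("exclude", 0),
}

for pin, target in latency_targets(pins, tree_target_ps=600).items():
    print(pin, "->", target)
```

Here the macro clock pin is balanced to 480 ps so that the clock reaches the flops inside the macro at about the same time as the ordinary registers, while the divider register in the clock gen module is left out of balancing.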

Reposted from: Digital IC back-end clock tree synthesis (CTS) experience sharing (a must for a high salary!) - Zhihu (zhihu.com)