Design Methodologies for DSM ASIC designs
The latest process technologies require interactions among tools and
the various aspects of the design to ensure successful silicon.
by Ravi Thummarukudy
The advent of deep-submicron
process geometries and compact wiring pitches allows the
integration of millions of gates of logic on a single die--true
systems on a chip. To adequately address the resulting design
size and complexity requires that designers use ever higher
levels of abstraction. At the same time, the design implementation
in a deep-submicron process requires much more attention
and detailed analysis in the areas of timing, signal integrity,
electromigration, and a host of other physical areas. Designers
must ensure that the EDA tools and methodologies they use
address not only the physical DSM effects but also the capacity,
speed, and repeatability required to produce successful
designs.
The new methodologies come in a variety of flavors, all
working toward a more effective and streamlined design process.
EDA tools attempt to eliminate uncertainty caused by DSM
interconnect delay by creating tighter links between the
logical and physical domains, thus reducing the number of
times the ASIC vendor must pass the design back to the design
team. Such new methods rely on feedback from the floorplanner,
as well as the logic synthesizer's analysis of timing constraints,
to achieve greater timing convergence. Other new methodologies
take advantage of some sophisticated postlayout optimization
features offered by recent tools. Using a hierarchical rather
than flat approach to partitioning speeds the entire process
by localizing problems that arise during design and implementation.
Finally, static timing analysis tools guarantee sign-off
and work faster than traditional simulators; raising sign-off
to the RT level reduces the number of transactions between
the silicon vendor and the designer.
Figure 1 - DSM interconnect effects
Processes larger than DSM
use short, wide interconnect (a); DSM processes, on the
other hand, use tall and narrow wires, increasing capacitance
and noise effects (b).
Of course, not everyone has to undertake
major methodology changes, since many DSM effects cause
major problems only in large (high-gate-count) designs.
Designers should thus evaluate their design environments
and future requirements before investing time and money
in newer tools and methodologies. But for those designers
struggling with large designs, the new methodologies can
pay off handsomely.
Until the time comes when every EDA tool
shares the same formats, libraries, and delay calculation
tools, we're left with only one choice: both to work toward
integrating those tools and to put them together--all by
ourselves--with a good methodology. Designers should start
with a basic flow from one of the major EDA vendors, building
a methodology around tools and filling in the gaps with
tools of their own or from other companies. Of course, that
process takes time and effort, but a sound DSM design methodology
is as good an intellectual property as the design itself.
Designers can only achieve success through an integrated
approach to ongoing methodology development combined with
documentation and training. From what we now know, it's
safe to assume that automation tools will continue to lag
behind the technology; only the designer's customized tools
and utilities can bridge the gap between a design challenge
and the limits of automation.
Shrinking tradition
As process geometries shrink to 0.25- and
0.18-µm technologies, several physical characteristics
of the IC fabrication process begin to predominate. The
biggest standout is the dominance of interconnect delays
(wire delays) over the intrinsic delays of logic gates.
When the technology moves to smaller geometries, the interconnects
scale differently from the logic gates. Process engineers reduce the
wiring pitches to achieve higher packing densities while
increasing the metal height to limit the rise in resistance of
the wires. Such nonuniform scaling produces tall thin wires
rather than short broad ones (see Figure 1). The resulting
geometry, however, still means an increase in wire resistance.
Higher densities and smaller wiring pitches also increase
the coupling capacitance between wires. The taller, thinner
wires tend to produce more parallel-plate capacitance between
neighboring wires on the same layer than capacitance between
layers--lateral capacitance that in fact dominates the total
capacitance in DSM designs. Together,
the greater resistance and capacitance lead to dramatic
increases in interconnect delays. Various studies of 0.18-µm
processes attribute more than 70 percent of the critical
path delays to the interconnect delay.
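The scaling argument can be made concrete with a first-order model.
The sketch below is only an illustration: it assumes a uniform
rectangular wire, treats the sidewall coupling to each of two
neighbors as an ideal parallel plate, and uses the Elmore estimate
(half of R times C) for a distributed RC line. The function names
and every dimension are assumptions chosen for the example, not
data from any real process.

```python
# First-order wire model. All numeric values are illustrative
# assumptions, not figures for any particular process.

EPS_OXIDE = 3.9 * 8.854e-12   # F/m, approximate permittivity of SiO2
RHO_METAL = 2.7e-8            # ohm*m, roughly that of aluminum


def wire_resistance(length, width, height, rho=RHO_METAL):
    """Resistance of a rectangular wire: R = rho * L / (W * H)."""
    return rho * length / (width * height)


def coupling_capacitance(length, height, spacing, eps=EPS_OXIDE):
    """Sidewall capacitance to two same-layer neighbors.

    Each sidewall is treated as a parallel plate of area H * L
    separated from its neighbor by the wire spacing.
    """
    return 2.0 * eps * height * length / spacing


def distributed_rc_delay(resistance, capacitance):
    """Elmore delay of a uniformly distributed RC line."""
    return 0.5 * resistance * capacitance


def wire_delay(length, width, height, spacing):
    """Delay of one wire from its geometry (all lengths in meters)."""
    r = wire_resistance(length, width, height)
    c = coupling_capacitance(length, height, spacing)
    return distributed_rc_delay(r, c)


if __name__ == "__main__":
    um = 1e-6
    # Same 1 mm route drawn two ways: short and wide versus tall and narrow.
    wide = wire_delay(1000 * um, width=0.8 * um, height=0.5 * um, spacing=0.8 * um)
    tall = wire_delay(1000 * um, width=0.25 * um, height=0.6 * um, spacing=0.25 * um)
    print(f"delay ratio (tall narrow / short wide): {tall / wide:.1f}")
```

Even though the narrow wire in this example is made taller, its
cross-section still shrinks and its sidewalls sit closer to their
neighbors, so both terms of the RC product grow.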
Figure 2 - Timing-driven DSM flow with floorplanning
To allow timing and design
convergence, the DSM design flow must contain very tight
links from synthesis to place and route by way of the floorplanner.
Similarly, the narrow lines and increased
metal line resistance can cause an IR drop that drives the
supply voltage below the required voltage. Narrow lines
can also break over time because of metal fatigue--an effect
known as electromigration. Improved design reliability requires
attention to electromigration issues, which are particularly
prominent in the signals that switch frequently. Designers
need to address increased power densities as well.
The current design methodology, which simply
can't handle the new technologies efficiently, is as follows:
The design team first specifies and verifies the design
in either Verilog or VHDL at the RT level. After ensuring
that the design meets the specifications, the team begins
its synthesis, targeting it to a specific technology library
from its ASIC vendor. The process transfers the technology-independent
design information to a technology-specific netlist. The
design is again verified for functionality and analyzed
for its timing characteristics. Once the netlist passes
the gate-level analysis, the design team hands it over to
the ASIC vendor in what's typically known as the first sign-off.
The vendor ensures that the netlist is clean (contains no
naming rule violations, for example), meets the design rules,
and can survive the physical design process.
During physical design, the design moves
through a series of steps: floorplanning, placement,
clock tree synthesis, and routing. After placement and routing,
the vendor extracts the layout parasitics (wire resistance
and capacitance) from the design, and a delay calculator
writes the RC delays for each of the gates
and for the interconnect in the Standard Delay Format
(SDF). The vendor then returns the SDF to the designer,
who verifies the timing of the design at the gate level
for its specs with back-annotated delays. If the design
meets the timing and functionality requirements, the designer
hands the design over to the vendor for prototypes (known
as second sign-off). If the design doesn't meet the specs
with the back-annotated SDF, it may require further changes.
Clearly, the current methodology--at least as it applies
to DSM designs--offers too many opportunities for reiteration
and delay.
DSM ASIC methodologies
Interconnect delays can wreak havoc on the
traditional design flow, because we can't determine the
exact wire lengths in the chip--hence the resistance and
capacitance of the interconnect--until routing is complete,
and by then it's too late. Any ensuing design changes from
new information would thus cause time-consuming reiteration
of the synthesis and place and route. The synthesis tools
at the front end optimize the design for timing and area
based on the logic gates and estimated interconnect (wire
load models) available in the target library. The size and
aspect ratio of blocks of logic determine the ASIC vendor
library's estimated wireload models, which in turn depend
on estimated capacitance per fanout. The actual interconnect
characteristics of the block under design may thus differ
from the wireload model used by the synthesis tool.
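To see why a wireload model is only an estimate, it helps to look
at how little information it uses. The following is a minimal
sketch, assuming a model that stores one capacitance per fanout
count and extends the table with a fixed slope beyond its last
entry. The table values, the slope, and the function name are
hypothetical and do not come from any vendor library.

```python
def wireload_capacitance(table_pf, slope_pf, fanout):
    """Estimate net capacitance (pF) from fanout alone.

    table_pf -- list where table_pf[i] is the estimate for fanout i + 1
    slope_pf -- extra capacitance per additional fanout past the table
    fanout   -- number of loads driven by the net (must be >= 1)
    """
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    if fanout <= len(table_pf):
        return table_pf[fanout - 1]
    # Past the end of the table, extrapolate linearly from the last entry.
    extra = fanout - len(table_pf)
    return table_pf[-1] + slope_pf * extra


if __name__ == "__main__":
    # Hypothetical model for a block of some assumed size.
    model = [0.010, 0.018, 0.025]
    for fanout in (1, 3, 5):
        print(fanout, wireload_capacitance(model, 0.008, fanout))
```

Note that the lookup never sees where the driver and its loads are
placed, which is exactly why two nets with the same fanout can end
up with very different real capacitance after routing.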
Inaccurate estimations of the interconnect
length in the first place, combined with the synthesis tool's
inability to optimize the design for both cell area and
interconnect, cause a working design at the logic design
level to fail to meet timing specs where it counts: at the
physical design level. Current procedures call for multiple
repetitions of the synthesis and placement and routing until
the physical design meets the timing specs. Such a process,
however, is itself unstable because a potential fix for
one timing violation can result in a violation in another
path. Designs can end up highly congested and impossible
to route.
The solution to the timing convergence issue
is to estimate the interconnect characteristics of the block
under design and feed that information to the synthesis
tool as early as possible in the design flow. Specifically,
using floorplanning tools early on and setting up a dataflow
between the floorplanning and synthesis tools can significantly
reduce the number of necessary iterations.
A timing-driven approach
Several currently available methodologies
improve the chances of timing convergence for large designs.
The first stage of the timing-driven flow uses a floorplanning
tool to obtain a more accurate estimate of the interconnect
characteristic of the block under design (see Figure 2).
Today's floorplanning tools generally require a gate-level
netlist to estimate the size of the block, though initial
versions of some recent tools work at the RT level. In either
case the floorplanning tool can assign the pin locations,
plan the location of megacells and memories, and determine
the size and aspect ratio of groups of logic. The tool thus
groups the logic to ensure that the logic within each group
exhibits high interconnectivity while the logic between
groups exhibits low interconnectivity.
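One simple way to judge a grouping against that goal is to count
the nets that cross group boundaries. The sketch below is an
illustration only: it assumes each net is a collection of cell
names and each cell has been assigned to exactly one group. The
netlist, cell names, and group labels are hypothetical, and real
floorplanners weigh many more factors than this single count.

```python
def count_cut_nets(nets, group_of):
    """Count nets whose cells fall in more than one group.

    nets     -- iterable of nets, each an iterable of cell names
    group_of -- dict mapping every cell name to its group label
    """
    cut = 0
    for net in nets:
        groups_touched = {group_of[cell] for cell in net}
        if len(groups_touched) > 1:
            cut += 1
    return cut


if __name__ == "__main__":
    nets = [("u1", "u2"), ("u2", "u3"), ("u1", "u3"),
            ("u4", "u5"), ("u3", "u4")]
    # Grouping that keeps tightly connected cells together.
    tight = {"u1": "A", "u2": "A", "u3": "A", "u4": "B", "u5": "B"}
    # Grouping that scatters them.
    loose = {"u1": "A", "u4": "A", "u2": "B", "u3": "B", "u5": "B"}
    print("tight grouping, cut nets:", count_cut_nets(nets, tight))
    print("loose grouping, cut nets:", count_cut_nets(nets, loose))
```

A lower count means fewer long wires between groups, which is the
property the floorplanner is trying to achieve.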
Figure 3 - Static timing sign-off
The relatively small, short
iterations in the timing-driven layout cycle facilitate
thorough coverage of static timing analysis, simplifying
the timing verification for ASIC sign-off.
The floorplanning tools are interactive,
allowing designers to move one group or megacell with respect
to the I/O connectivity and adjust the connectivity with
respect to other groups or megacells. Once the designers
partition the logic into groups and determine the locations
of groups and megacells within the die, they can estimate
the interconnect characteristics of the blocks more accurately.
Such design-specific wireloads, known as custom wireload
models, are more accurate than the estimated wireload models
in the vendor libraries. After the synthesizer uses the
custom wireload models to synthesize the blocks, it passes
the resulting netlist to the floorplanning tool. The floorplanner
can also analyze the congestion and routability of the design.
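A custom wireload model can be thought of as the same kind of
fanout table as the vendor's, but fitted to this design's own
floorplan. As a rough sketch, assuming the floorplanner can report
an estimated capacitance for each net in a block, averaging those
estimates by fanout yields a design-specific table. The input
format and function name are assumptions made for the example;
actual tools use their own formats and statistics.

```python
from collections import defaultdict
from statistics import mean


def build_custom_wireload(net_estimates):
    """Average estimated net capacitance per fanout.

    net_estimates -- iterable of (fanout, capacitance_pf) pairs, one
                     per net, taken from floorplan-based estimation
    Returns a dict mapping fanout to mean capacitance in pF, with
    keys in ascending fanout order.
    """
    by_fanout = defaultdict(list)
    for fanout, cap_pf in net_estimates:
        by_fanout[fanout].append(cap_pf)
    return {fanout: mean(caps) for fanout, caps in sorted(by_fanout.items())}


if __name__ == "__main__":
    # Hypothetical per-net estimates for one floorplanned block.
    estimates = [(1, 0.010), (1, 0.014), (2, 0.020), (2, 0.026), (3, 0.031)]
    for fanout, cap_pf in build_custom_wireload(estimates).items():
        print(f"fanout {fanout}: {cap_pf:.3f} pF")
```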
The second stage of this approach uses
the logic synthesis tool to identify the design's timing
constraints, which the tool annotates forward to the place-and-route
engine. The engine can then complete the placement and routing
based on the path delay constraints. The success of that
approach depends on the early estimation of parasitics,
the forward annotation of design constraints, and the two-way
flow of information from physical design to logic design--each
loop progressively reducing the timing divergence. If implemented
well, that approach can substantially reduce the uncertainty
of the DSM design process. The trick is to make the tools
work well together under a friendly user interface.
A high percentage of an SOC design--for
instance, memories, hard IP, and macros whose interconnect
characteristics are already known--is predefined, so only
part of a design will be totally new. The floorplanners
can use the information in analyzing the designs for connectivity,
congestion, and routability. Such early estimation could
yield a more accurate representation of the interconnect
characteristics, allowing the synthesis tool to achieve better
results in the first pass. Tools that such a flow
could integrate include Synopsys's Design Compiler and Floorplan
Manager and Cadence's Design Planner, Qplace, CTgen, and
Warp Router from the Silicon Ensemble.
The layout of the land
Postlayout optimization features available
within the Synopsys tools can return the progressively accurate
capacitance and delay files to Design Compiler to re-optimize
the design. The user can control the re-optimization, initiating
changes as simple as replacing low drive cells with higher
drive cells, or as complex as adding altogether new buffers
to reduce timing violations. Once the netlist absorbs the
changes, the user can integrate them into the physical design
database at the place or route level, depending on the number
of changes made to the previously routed design. Fewer changes
to the netlist could allow an engineering change order without
changing existing placement and clock tree buffers. More
changes might mandate removing the wires and clock tree
and redoing the placement, clock tree synthesis, and routing.
That process at least offers an alternative to redoing the
entire layout--which would mean a return to the original
unpredictability and timing convergence issues. Though it
does require an additional tool license, the alternate optimization
method successfully fixes timing violations after placement
and routing, especially when enhanced by tighter integration
with the physical design tools.
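The simplest of those changes, swapping a low-drive cell for a
higher-drive one, can be sketched as a small selection problem.
The example below assumes a linear cell delay model (intrinsic
delay plus a load-dependent term) and picks the smallest variant
that meets a delay budget for the back-annotated load. The cell
names, delay figures, and areas are hypothetical, and this is not
a description of how any particular tool makes the choice.

```python
from typing import NamedTuple, Optional, Sequence


class CellVariant(NamedTuple):
    name: str
    intrinsic_ns: float   # delay with no load
    ns_per_pf: float      # added delay per pF of load
    area: float           # relative cell area


def cell_delay(cell: CellVariant, load_pf: float) -> float:
    """Linear delay model: intrinsic delay plus load-dependent delay."""
    return cell.intrinsic_ns + cell.ns_per_pf * load_pf


def pick_drive_strength(variants: Sequence[CellVariant],
                        load_pf: float,
                        max_delay_ns: float) -> Optional[CellVariant]:
    """Return the smallest-area variant that meets the delay budget.

    Returns None when no variant is fast enough, which in practice
    would point toward a larger change such as inserting a buffer.
    """
    feasible = [c for c in variants if cell_delay(c, load_pf) <= max_delay_ns]
    return min(feasible, key=lambda c: c.area, default=None)


if __name__ == "__main__":
    inverters = [
        CellVariant("INV_X1", 0.10, 2.0, 1.0),
        CellVariant("INV_X2", 0.11, 1.0, 1.6),
        CellVariant("INV_X4", 0.12, 0.5, 2.8),
    ]
    print(pick_drive_strength(inverters, load_pf=0.2, max_delay_ns=0.35))
    print(pick_drive_strength(inverters, load_pf=1.0, max_delay_ns=0.35))
```

Choosing the smallest cell that works, rather than the fastest,
keeps the change to area and placement as small as possible, which
matters when the goal is an engineering change order rather than a
new layout.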
The traditional physical design methodology
flattens the design, as the physical partitioning differs
from the logical hierarchy. For most small designs, that
approach is still the easiest. But the increasing complexity
and size are making timing convergence using a flat physical
design ever more difficult to accomplish. Likewise, DSM
process technologies are challenging the capacity and speed
of established EDA tools, which are breaking as gate counts
exceed the tools' capacity limitations.
To overcome those limitations, designers
need to partition their designs into chunks that tools can
handle easily at the block level, reassembling them at the
top level. Such a maneuver isn't as easy as it sounds, since
it must address many issues concerning timing convergence,
clock tree balancing, and routability across block boundaries.
The partitions should produce the shortest interblock wiring
and contain the fewest subblocks possible. Clock-balancing
across multiple blocks is tricky, perhaps requiring a different
set of tools altogether. The tools in the methodology must
also address other features, such as special routing (shielding
for key signals, for example).
Shifting sign-offs
The traditional ASIC sign-off process checks
for the design rules, test vector rules, and sign-off simulation
results. The ASIC vendors normally qualify a few popular
EDA tools for that purpose. The sign-off usually occurs
at the netlist level, with the ASIC designer performing the simulation
and synthesis and the silicon vendor the physical design. Changes
to the sign-off paradigm are under way because the SDF files
are too large to simulate at the gate level. The event-driven
simulators tend to slow significantly at the gate level,
especially with the back-annotated SDF and all the timing
checks turned on. Since simulation results are only as good
as the vectors used to test the design, the sign-off is
now shifting from event-driven simulation to static timing
analysis (see Figure 3). Static timing analysis tools guarantee
sign-off based on their timing reports, while providing
much faster chip-level timing analysis than the simulators.
A subset of the test vectors used in the RTL simulation
can verify the gate-level functionality. Another possible
combination might take the form of a static timing analysis
tool and a formal verification tool that can compare the
design between the RT and gate levels or between successive
gate-level netlists. One drawback to this methodology is that STA tools
somewhat restrict the design style.
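The core idea behind static timing analysis can be shown in a few
lines. This is a deliberately simplified sketch, not how any
commercial STA tool is implemented: it assumes the logic between
two registers forms an acyclic graph with one fixed delay per
connection, and it ignores rise/fall differences, clock skew, and
hold checks. The node names and delays are hypothetical. It relies
on graphlib from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def arrival_times(edge_delays, launch_time=0.0):
    """Latest signal arrival time at every node of a timing graph.

    edge_delays -- dict mapping (source, destination) to delay in ns
    launch_time -- arrival time assigned to nodes with no inputs
    """
    # Collect, for each node, the list of (predecessor, delay) pairs.
    preds = {}
    for (src, dst), delay in edge_delays.items():
        preds.setdefault(src, [])
        preds.setdefault(dst, []).append((src, delay))

    # Visit nodes so that every predecessor is handled first.
    order = TopologicalSorter(
        {node: [src for src, _ in inputs] for node, inputs in preds.items()}
    ).static_order()

    arrival = {}
    for node in order:
        arrival[node] = max(
            (arrival[src] + delay for src, delay in preds[node]),
            default=launch_time,
        )
    return arrival


def setup_slack(arrival, endpoint, clock_period, setup_time=0.0):
    """Positive slack means the endpoint meets its setup requirement."""
    return clock_period - setup_time - arrival[endpoint]


if __name__ == "__main__":
    delays = {
        ("ff1", "g1"): 0.5,
        ("g1", "g2"): 0.7,
        ("ff1", "g2"): 0.3,
        ("g2", "ff2"): 0.4,
    }
    at = arrival_times(delays)
    print("arrival at ff2:", at["ff2"])
    print("setup slack:", setup_slack(at, "ff2", clock_period=2.0, setup_time=0.1))
```

Because the analysis walks every path in the graph once, it needs
no test vectors, which is the property that lets it replace
vector-dependent gate-level simulation for timing checks.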
Many transactions and hand-off procedures--particularly
the exchange of design and delay files--occur between the
ASIC vendor and customer during the sign-off process. The
uncertainty of timing convergence in DSM can result in an
even larger number of interactions at the hand-off level.
One option is to raise sign-off to the RT level: Once the
design team has verified the design at the RT level and
well documented the constraints for the designs, it can
hand the design over to the vendor, who implements the design
according to the design specifications and constraints.
The benefit of that approach arises from a reduction in
the number of transactions between the silicon vendor and
designer. The drawback is that the silicon vendor ends up
having to learn both the design and the implementation processes--a
responsibility the vendor may find highly challenging.
To solve timing convergence problems, the
EDA tools in design flows must support compatible design
and constraint formats, libraries, and delay calculation
and timing analysis tools--since the tools must understand
the design and constraints as well as analyze and report
the timing based on the same calculation. A look at the
EDA landscape and the core competencies of its companies
makes it quite clear that a working solution demands the
integration of tools from many vendors; better yet, it needs
an integrated flow from leading vendors bolstered by high-performance
point tools from exciting start-ups. EDA tools from a single
company often don't support the same libraries, delay calculators,
and timing analysis engines--a problem that becomes more
acute when the tools are from different companies. As always,
though, the effectiveness of the tool ultimately depends
on the skill and care of the carpenter.