System-level test (SLT), once used largely as a stopgap measure to catch issues missed by automated test equipment (ATE), has evolved into a necessary test insertion for high-performance processors, chiplets, and other advanced computational devices. Today, SLT is critical for ensuring that chips function correctly in real-world conditions, and all major CPUs, APUs, and GPUs now go through an SLT insertion before shipment.
Adding SLT in production is also being considered for network processors and automotive driver-assistance processors. However, implementing SLT effectively at scale poses key challenges in managing cost, test time, and manufacturers' expectations.
One of the biggest misconceptions about SLT is that it functions like ATE. ATE primarily uses pre-defined test patterns to stimulate circuit paths and check expected responses within individual cores or circuit blocks. On the other hand, SLT focuses on system interactions that occur between those cores or outside the chip.
That includes software, power management, sensor integration, and communication between internal cores and peripheral devices. Since SLT is often used to test cutting-edge chips, the test environment needs to be flexible so that it can handle application-specific conditions and different interface protocols.
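To make that distinction concrete, here is a minimal sketch of what an SLT sequence might orchestrate. The `dut` object and every method on it are hypothetical placeholders rather than a real tester API; the point is the shape of the flow: boot real software, enumerate real ports, run a real workload.

```python
# Hypothetical sketch of an SLT sequence: the "dut" object and all of its
# methods are illustrative placeholders, not a real tester API.

def run_slt_sequence(dut):
    dut.power_on()                          # bring up rails via power management
    dut.boot_firmware("slt_image.bin")      # load a mission-mode software image
    # Exercise interactions between cores and peripherals, not isolated blocks.
    assert dut.enumerate_port("usb3"), "USB port failed to enumerate"
    assert dut.enumerate_port("pcie_gen5"), "PCIe port failed to enumerate"
    # Run an application-level workload while monitoring system behavior.
    result = dut.run_workload("inference_benchmark", duration_s=60)
    telemetry = dut.read_telemetry()        # power, temperature, link errors
    dut.power_off()
    return result.passed and telemetry["max_temp_c"] < 105
```

Unlike an ATE pattern, nothing here targets an individual circuit block; each step depends on software, power management, and peripheral communication working together.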
This distinction is particularly relevant as the industry shifts toward chiplet-based architectures. With chiplets, manufacturers need to test how signals propagate across multiple interconnected dies, rather than just validating individual components in isolation.
Traditional ATE test patterns, applied at the chip package level, offer limited access to the internal interactions within a multi-chip package. By contrast, SLT can exercise how data flows between chiplets and how that flow influences performance, power consumption, and overall system functionality.
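As a rough illustration, an inter-chiplet data-flow check under SLT might look like the sketch below. The die-to-die calls are invented for the example, since real access depends on the design's own fabric and DFT hooks.

```python
# Illustrative inter-chiplet data-flow check; the die-to-die calls are
# invented, since real access depends on the design's own fabric and DFT.

import os

def check_die_to_die_link(dut, src_die, dst_die, payload_bytes=1 << 20):
    payload = os.urandom(payload_bytes)       # random traffic pattern
    dut.write_buffer(src_die, payload)        # stage data on the source die
    dut.dma_transfer(src_die, dst_die)        # push it across the interconnect
    echoed = dut.read_buffer(dst_die, payload_bytes)
    stats = dut.link_stats(src_die, dst_die)  # bandwidth, retries, CRC errors
    return echoed == payload and stats["crc_errors"] == 0
```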
However, this approach comes with its own complications, especially since many SLT methodologies are still implemented manually.
Test coverage challenges
Using conventional design-for-test (DFT) techniques to generate test patterns ahead of the production ramp, chip designers are lucky to achieve 99% coverage of all transistors. For a device with 100 billion transistors, such as today's advanced artificial intelligence (AI) processors, that still leaves 1 billion transistors untested. Using purely ATE test methods, achieving that last 1% of coverage could take months of development and significant tester time.
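The scale of that gap is simple to quantify; the calculation below just restates the figures from the text.

```python
# Back-of-envelope math for untested logic at 99% structural coverage.
total_transistors = 100e9            # today's largest AI processors
dft_coverage = 0.99                  # an optimistic figure at production ramp
untested = total_transistors * (1 - dft_coverage)
print(f"Untested transistors: {untested:.0e}")   # 1e+09, i.e., one billion
```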
Moreover, the complexity of integrating heterogeneous chiplets into one large package challenges the stability and repeatability of the electromechanical stack-up in a high-volume test environment. The package exposes only a limited number of external test access points, and those points must stimulate pathways through multiple dies.
Because the packages are large, warpage can occur, and the mechanical compression points available for actuating the device-under-test (DUT) connections in the socket are restricted. When processors and memories inside the same package are exercised under extreme test conditions, hot spots inevitably develop and must be managed to prevent damage to the device.
To provide a durable automated system-level tester with high availability in manufacturing, the customer test content must be tightly integrated with socket actuation and thermal control, along with power management and test sequencing.
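A simplified sketch of that integration follows; the handler, thermal, and power controller objects and their setpoints are hypothetical, standing in for equipment-specific interfaces.

```python
# Sketch of one SLT insertion: socket actuation, thermal control, power
# management, and test sequencing run as one integrated loop. All controller
# objects and setpoints here are hypothetical.

def execute_insertion(handler, thermal, power, dut, test_plan):
    handler.actuate_socket(force_newtons=900)   # compress DUT into the socket
    thermal.set_setpoint_c(85)                  # hold the target junction temp
    power.enable_rails(sequence="standard")     # sequenced rail bring-up
    try:
        for step in test_plan:                  # the customer's test content
            thermal.track_hotspots(dut.sensor_map())  # guard against damage
            step.run(dut)
    finally:
        power.disable_rails()
        thermal.set_setpoint_c(25)
        handler.release_socket()
```

The `try/finally` structure reflects the availability requirement: the station must release the socket and return to a safe state even when test content fails mid-sequence.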
Compounding the complexity of test-content development is the number of parties involved in optimizing the SLT insertion. Vendors of SLT equipment, sockets, and design and test IP must collaborate with the silicon designer/integrator, custom ASIC end users, outsourced semiconductor assembly and test providers (OSATs), board designers, and even end customers, such as data-center operators, computer vendors, and cellphone makers, to make sure the test station represents the real-world application it's intended to test.
As the demand for processing power increases, chip designs have evolved to meet market requirements, and that added processing power brings higher energy consumption and heat generation. Test time for a typical SLT insertion, meanwhile, can be a half hour or more, requiring many test stations running in parallel to meet monthly volume demands.
The facilities built to test these parts need special provisions for electrical power and thermal control, so they aim to maximize their investment by testing as many devices as possible in the smallest floor space. The devices and their test application boards, however, keep getting bigger and consuming more power; a rough capacity calculation below shows how quickly station counts add up.
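In this sketch, only the half-hour test time comes from the text; the monthly volume and utilization figures are assumptions chosen for illustration.

```python
# Rough sizing of an SLT test floor. Only the half-hour test time comes from
# the text; monthly volume and utilization are assumptions for illustration.
monthly_volume = 500_000             # devices per month (assumed)
test_time_min = 30                   # "a half hour or more" per insertion
utilization = 0.85                   # station availability (assumed)

minutes_per_month = 30 * 24 * 60
devices_per_station = minutes_per_month * utilization / test_time_min
stations_needed = monthly_volume / devices_per_station
print(f"Stations needed: {stations_needed:.0f}")   # roughly 408 stations
```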
Emerging developments in chip testing
Chip designers and EDA vendors have introduced new DFT techniques that deliver structural test content as packetized data over standard high-speed serial ports such as USB and PCIe. During SLT, these ports must first be enumerated at the application level so that each port operates as intended.
Once this connection is made, the test program can switch the device into a test mode that uses a small number of high-speed pins to run structural test patterns or other built-in self-test functions. With these serial data ports working, the test content can be reused and correlated either to ATE with similar test stations (such as Link Scale) or to post-silicon validation test stations (such as SiConic), improving time to market.
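Conceptually, the flow might look like the sketch below. The device and link calls are hypothetical stand-ins for vendor- and design-specific mechanisms, not an actual API.

```python
# Sketch of packetized structural test over a functional serial port. Every
# device and link call is a hypothetical stand-in for design-specific tooling.

def run_packetized_scan(dut, pattern_file):
    # 1. Enumerate the port at the application level, as a normal host would.
    link = dut.enumerate_port("usb3")
    assert link.is_up(), "functional enumeration failed"
    # 2. Switch into test mode over the same small set of high-speed pins.
    dut.enter_test_mode(via=link)
    # 3. Stream structural patterns as packets; collect compressed responses.
    failures = []
    with open(pattern_file, "rb") as patterns:
        for packet in iter(lambda: patterns.read(4096), b""):
            response = link.transact(packet)
            if not dut.check_signature(response):
                failures.append(response)
    dut.exit_test_mode()
    return failures
```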
Managing the heat dissipation of these high-power devices under extreme workloads is a ubiquitous problem, addressed at the engineering bench, at ATE and SLT test insertions, and even in data-center-wide operation. Air, liquid, and refrigerant cooling are all used, with an eye on environmental sustainability. Production test handlers face the added challenge of cycling heat and mechanical engagement multiple times per day.
AI and machine learning (ML) are also being applied to semiconductor testing. Sharing test result data between different insertions, including ATE, burn-in, and SLT, feeds AI and ML tools that improve yield, accelerate test-program development, and optimize test times.
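As a simple illustration of that data sharing, results from multiple insertions can be joined on a device ID and used to train a model that predicts SLT outcomes. The file names and columns below are invented, and any tabular ML stack would serve.

```python
# Sketch of cross-insertion data sharing feeding an ML yield model. File
# names and columns are invented; features are assumed numeric for brevity.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier

ate = pd.read_csv("ate_results.csv")          # parametric/structural outcomes
burnin = pd.read_csv("burnin_results.csv")    # stress and drift measurements
slt = pd.read_csv("slt_results.csv")          # system-level pass/fail

data = ate.merge(burnin, on="device_id").merge(slt, on="device_id")
features = data.drop(columns=["device_id", "slt_pass"])
model = RandomForestClassifier(n_estimators=200)
model.fit(features, data["slt_pass"])

# Devices the model flags as near-certain passes become candidates for
# shortened SLT content, trimming test time without sacrificing coverage.
```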
Looking ahead
As semiconductor manufacturing becomes more complex, SLT will continue to grow in importance. For it to be truly effective, companies must integrate it into their overall testing strategy rather than treating it as a separate, isolated step. And the next generation of system-level testers must focus on addressing the challenges cited above. Success will require collaboration across design, test, and high-volume manufacturing teams, as well as a willingness to rethink traditional approaches to validation.
In an era defined by multi-chip packages, heterogeneous integration, and ever-tightening performance demands, SLT will remain a crucial tool for ensuring that cutting-edge chips perform as expected in real-world applications.
Davette Berry is senior director of Customer Programs & Business Development at Advantest.