In the test and measurement industry, faster processor clock rates have traditionally reduced test time and cost. Though many companies, especially those in semiconductor and consumer electronics, have benefited from upgrading the PCs that control test hardware, the days of depending on faster clock rates for computational performance gains are numbered.
Faster clock rates increase thermal dissipation and reduce power efficiency. Over the last decade, therefore, the computing industry has focused on integrating multiple parallel processing elements, or cores, rather than raising clock rates to increase CPU performance. Moore's law states that transistor counts double roughly every two years, and processor vendors use those additional transistors to fabricate more cores. Today, dual- and quad-core processors are common in the desktop, mobile, and ultra-mobile computing segments, and servers typically have 10 or more cores.
Traditionally, the test and measurement industry has relied on computers with desktop- and/or server-class processors for higher performance. As recent sales trends indicate, the desktop segment of the computing industry is shrinking. From Q3 2013 to Q3 2014, Intel saw a 21 percent increase in notebook platform volumes but only a 6 percent increase in desktop volumes. This trend reveals that casual consumers are moving toward more portable yet powerful platforms such as ultrabooks, tablets, and all-in-ones. To better address the demands of this faster-growing market segment, the computing industry is focusing on improving the graphics performance and power efficiency of the ultra-mobile, mobile, and desktop classes of processors. Increasing computational performance for these processor categories is generally a tertiary consideration. High-end mobile and desktop processors will continue to offer adequate computational performance for test and measurement applications, but only limited improvements in raw processing capability should be expected between newer generations of these processors.
For the server class of processors, the main applications are IT systems, data centers, cloud computing, and high-performance computing for commercial and academic research. These applications are significantly more computationally intensive and are pushing the computing industry to keep investing in the raw computational capabilities of server-class processors. They are also inherently parallel, typically spawning numerous virtual machines or software processes based on user demand. This usage model, along with processor power efficiency concerns, is driving the computing industry to add more processing cores rather than increase core frequencies. As an example, in late 2014 Intel released its Xeon E5-2699 v3 processor with 18 general-purpose processing cores.
Over the last five years, single-threaded applications have achieved performance gains when moving to the next generation of processor by leveraging innovations such as Intel Turbo Boost Technology. But as computing industry trends make clear, it is highly unlikely these applications will continue to see such benefits. Applications designed for parallel computing, by contrast, can leverage the added cores to realize impressive performance gains. Engineers who use NI PXI embedded controllers, which are modular PCs, have become accustomed to a 15 to 40 percent computing performance increase by making their applications multithreaded and upgrading to the next generation of processor. Without proper care in designing an application for parallelism, however, these gains will be minimal.
Many-Core
More cores are being packed into smaller, lower power footprints. Processors are becoming "many-core" as core counts soar past the 10 cores common in server-class processors today. Supercomputers provide an idea of what the processors of tomorrow will look like. The last five years of the TOP500 list, which ranks computers by their performance on the LINPACK benchmark, show that the world's fastest computers are built with millions of cores. Though these multimegawatt beasts are not well suited to running a test station on the production floor, the fact that more functionality is being packed into ever smaller spaces means many-core designs will scale down to other processor classes as their power footprint shrinks. An example of this trend is the Intel Xeon Phi class of coprocessors, which offers up to 61 cores and can concurrently execute 244 threads.
Clearly, some cores are being devoted to special functions instead of solely to general computing. Graphics processing engines are a good example, with video displays at high resolutions showing more realistic 3D rendering. Other special-purpose cores include security engines that perform root-of-trust and encryption/decryption operations and manageability engines that allow for out-of-band management if the processor is hung, in reset, or otherwise unreachable. However, for these many-core processors, the majority of cores will be available for general computing.
Leveraging Many-Core
With the general-purpose computing capabilities of high-end mobile and desktop processors plateauing, engineers who want to maximize test application performance, lower test times, and thereby reduce overall cost of ownership will need to start adopting server-class processors with many-core architectures.
Software architectures that divide computing work and can scale to more than 10 processor cores will be required. When designing new applications, consider from the beginning which tasks can be implemented in parallel. For example, in an end-of-line test scenario, measure and analyze multiple units at once and, for each unit, perform more than one test at a time. Thinking further ahead, split data analysis into multiple parallel tasks by processing chunks of data at a time or reordering parts of the algorithm so that more tasks are available for computation at once. Though completing more work in parallel means measurements and analyses must be carefully correlated and collected to achieve a coherent overall test result, the reward is worth the effort.
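As a rough illustration of the end-of-line scenario above, here is a minimal Python sketch; it is not NI or vendor-specific code, and the measurement functions are hypothetical placeholders standing in for calls through thread-safe instrument drivers. It tests several units at once and, within each unit, runs independent measurements concurrently, then collects the results per unit so they can be correlated into one coherent test record.

```python
# Minimal sketch: test multiple units in parallel, and run each unit's
# independent measurements in parallel as well. Measurement functions are
# hypothetical placeholders for thread-safe driver calls.
from concurrent.futures import ThreadPoolExecutor, as_completed

def measure_voltage(unit_id):
    # Placeholder for an instrument measurement on this unit.
    return {"test": "voltage", "unit": unit_id, "passed": True}

def measure_distortion(unit_id):
    # Placeholder for a second, independent measurement on the same unit.
    return {"test": "distortion", "unit": unit_id, "passed": True}

def test_unit(unit_id):
    # Run this unit's independent tests at the same time.
    with ThreadPoolExecutor(max_workers=2) as tests:
        futures = [tests.submit(measure_voltage, unit_id),
                   tests.submit(measure_distortion, unit_id)]
        return [f.result() for f in futures]

def test_line(unit_ids):
    # Measure multiple units at once; gather results keyed by unit
    # so they can be correlated into an overall test report.
    results = {}
    with ThreadPoolExecutor(max_workers=len(unit_ids)) as units:
        future_to_unit = {units.submit(test_unit, uid): uid for uid in unit_ids}
        for future in as_completed(future_to_unit):
            results[future_to_unit[future]] = future.result()
    return results

if __name__ == "__main__":
    print(test_line(["DUT-1", "DUT-2", "DUT-3", "DUT-4"]))
```

The same pattern scales with core count: adding units or per-unit tests simply adds more independent tasks for the scheduler to spread across available cores.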
When considering implementation, choose tools that allow a user to maximize the parallelism in an application. Selecting an optimizing compiler, multithreaded analysis routines, and thread-safe drivers is a good starting point. Also make sure that implementation languages offer strong support for threading and an appropriate level of abstraction so that the increased software complexity does not negatively affect developer efficiency.
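To show how the chunk-at-a-time analysis mentioned earlier can map onto many cores, the following sketch splits a CPU-bound computation (an RMS calculation, chosen here purely as an example) across worker processes and then merges the partial results. It assumes only the Python standard library and is not tied to any particular vendor toolchain.

```python
# Minimal sketch: split a large analysis into chunks so each chunk can run
# on its own core, then merge the partial results. The RMS calculation is
# an illustrative stand-in for any CPU-bound analysis routine.
from concurrent.futures import ProcessPoolExecutor
import math

def chunk_sum_of_squares(samples):
    # CPU-bound work on one chunk; runs in a separate process (its own core).
    return sum(x * x for x in samples)

def parallel_rms(samples, n_chunks=8):
    size = max(1, len(samples) // n_chunks)
    chunks = [samples[i:i + size] for i in range(0, len(samples), size)]
    with ProcessPoolExecutor() as pool:
        partial_sums = pool.map(chunk_sum_of_squares, chunks)
    return math.sqrt(sum(partial_sums) / len(samples))

if __name__ == "__main__":
    waveform = [math.sin(2 * math.pi * i / 1000) for i in range(1_000_000)]
    print(parallel_rms(waveform))   # ~0.707 for a full-scale sine wave
```

The key design choice is keeping each chunk independent so no locking is needed; the only coordination is the final merge, which is cheap relative to the per-chunk work.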
Ignoring parallelism, at best, will result in tepid performance gains as processors evolve. The market is pushing for graphics improvements and higher core counts. Though test and measurement applications most likely won’t use the graphics features, newer processors with higher core counts offer valuable performance gains to test applications designed to benefit from the upward trend in core count.
This article was contributed by Gabriel Narus, Principal Hardware Engineer, National Instruments.
Disclaimer: All the views and opinions expressed in this article are those of the author.