VME and Critical Systems

Spring 2010

General-purpose GPUs breathe new life into high-performance embedded computing

By Anne Mascarin and Scott Thieret, Mercury Computer Systems

GFLOPS/Watt is now seen as an essential metric for embedded image processing applications in defense programs. General-Purpose GPUs (GPGPUs) are currently being used instead of CPUs in high-performance embedded computing applications where this metric is paramount. Before deciding on the CPU or GPGPU path, though, it’s important to understand the differences between GPUs and GPGPUs, and to see why GPGPUs (as opposed to CPUs) are an ideal fit for high-performance embedded computing applications.

Many advanced applications for high-performance embedded computing demand very large amounts of computing power. Real-time imaging systems for persistent surveillance and electronic warfare, among other applications, require the highest possible GFLOPS/Watt to meet performance requirements without exceeding the power budget. Traditional CPU-based boards simply can’t deliver this performance within the power budget.

General-Purpose GPUs (GPGPUs) are currently being used in high-performance embedded computing applications where GFLOPS/Watt metrics are paramount. Before deciding whether to embark on the CPU or GPGPU route, however, it’s important to explore the differences between GPUs and GPGPUs – and to understand how these GPGPUs (as opposed to CPUs) are a natural fit for high-performance embedded computing applications.

The rise of the GPU versus CPU equation

Government programs are putting the squeeze on prime contractors to develop more warfighting capability faster. At the same time, the needs of embedded defense computing platforms are accelerating: to acquire more data and arrange and process it more quickly, with the goal of extracting actionable information immediately and making it available in real time to the warfighter. The need for creative and innovative solutions to the “actionable information” problem has never been stronger.

Government agency mandates and the requirement for actionable information aren’t the only pressures that affect prime contractors. Consider Size, Weight, and Power (SWaP) and historical constraints that greatly impact the adoption and performance of deployed platforms. Together, these issues force prime contractors to turn to innovative solutions in order to squeeze every ounce of performance out of their subsystems.

GFLOPS/Watt matters

Real-time imaging systems in deployed applications such as persistent surveillance, onboard exploitation, and electronic warfare require the highest possible GFLOPS/Watt to meet deployed performance requirements. Frequently, these subsystems are the last to be added to the airframe and are consequently allotted the smallest portion of the platform power budget.

Many CPU-based boards can’t keep up with the stringent GFLOPS/Watt requirements. For example, the peak theoretical figure for the IBM Cell processor is 1 GFLOPS/Watt, while the peak theoretical figure for AMD’s ATI RV770 GPGPU is 9.23 GFLOPS/Watt. Graphics Processing Units (GPUs), first introduced by NVIDIA in 1999, have always had very high GFLOPS/Watt metrics. Although the GFLOPS/Watt value increases with every new chip revision, GPUs traditionally performed best in the application for which they were designed – graphics processing in desktop systems.
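To make the comparison concrete, the short sketch below works through what the two peak figures quoted above imply for a fixed power budget and for a fixed workload. The 100 W budget and the 500 GFLOPS workload are hypothetical values chosen only for illustration. The 1 and 9.23 GFLOPS/Watt figures are the peak theoretical numbers from the text, and sustained performance in a real application would be lower.

```python
# Peak theoretical efficiency figures quoted in the article (GFLOPS per watt).
CELL_GFLOPS_PER_WATT = 1.0
RV770_GFLOPS_PER_WATT = 9.23

# Hypothetical values, for illustration only (not from the article).
POWER_BUDGET_W = 100.0
WORKLOAD_GFLOPS = 500.0


def peak_gflops(power_budget_watts: float, gflops_per_watt: float) -> float:
    """Peak throughput available within a given power budget."""
    return power_budget_watts * gflops_per_watt


def watts_needed(workload_gflops: float, gflops_per_watt: float) -> float:
    """Power required to reach a given peak throughput."""
    return workload_gflops / gflops_per_watt


for name, efficiency in (
    ("IBM Cell", CELL_GFLOPS_PER_WATT),
    ("ATI RV770", RV770_GFLOPS_PER_WATT),
):
    available = peak_gflops(POWER_BUDGET_W, efficiency)
    required = watts_needed(WORKLOAD_GFLOPS, efficiency)
    print(f"{name}: {available:.0f} GFLOPS peak in {POWER_BUDGET_W:.0f} W, "
          f"{required:.1f} W for {WORKLOAD_GFLOPS:.0f} GFLOPS")
```

Under these assumptions, the same peak workload needs roughly one-ninth of the power on the GPGPU, which is the margin that matters when the subsystem receives the smallest share of the platform power budget.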

Serial or parallel processing?

GPUs are architected to maximize arithmetic and logic performance on a single type of highly self-similar data. CPUs, on the other hand, offer more support for control and data handling, such as flow control and caching. In general, CPUs operate on data serially; for example, even when performing a matrix operation, the CPU must carry out the overhead task of loading each input element sequentially. CPUs were architected to support flow control for comparisons and decision making, in addition to calculations. Conversely, GPUs process data in parallel because they contain an array of many simple Arithmetic-Logic Units (ALUs) that rapidly perform simple calculations at the same time. This high degree of parallelism is what makes GPUs efficient and fast image processing engines for high-performance military applications.
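The structural difference described above can be sketched in a few lines of Python. This is only an illustration of the two processing styles, not GPU code: the per-pixel operation `scale_pixel`, its gain and clamp values, and the use of a thread pool as a stand-in for an array of ALUs are all assumptions made for the example, and Python threads will not deliver GPU-like speedups.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def scale_pixel(value: int, gain: int = 2, limit: int = 255) -> int:
    """Hypothetical per-pixel operation: apply a gain, clamp to 8 bits."""
    return min(value * gain, limit)


def process_serial(pixels: List[int]) -> List[int]:
    """CPU-style: visit one element after another in a single loop."""
    out = []
    for p in pixels:
        out.append(scale_pixel(p))
    return out


def process_data_parallel(pixels: List[int], workers: int = 4) -> List[int]:
    """GPU-style structure: map the same operation over every element.

    Each output depends only on its own input, so the elements can be
    handed to independent workers. Threads stand in for ALUs here.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order.
        return list(pool.map(scale_pixel, pixels))


image_row = [0, 10, 100, 127, 128, 200, 255]
print(process_serial(image_row))
print(process_data_parallel(image_row))
```

Both functions produce the same result. The point is that the second form expresses no ordering between elements, which is the property that lets hardware with many simple ALUs work on all of them at once.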

Programmability versus upgradeability

So, given their parallel performance potential and low power consumption, why haven’t GPUs been utilized much for high-performance embedded computing? Programmability is one reason, and upgradeability is another.

The software environment for GPUs is notoriously non-intuitive, even to proficient embedded programmers. The environment is based on graphics primitives – not high-level language constructs or even CPU assembly variants. And the basic structure of programming tools for GPUs does not offer the optimizations that programming languages for CPUs do. GPGPU computing is a relatively recent development in GPU computing that offers a developer-friendly software environment. Software developers can now program GPGPUs with familiar constructs such as well-defined APIs and indexed matrix operations.
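As a rough illustration of what "indexed matrix operations" look like to the programmer, the sketch below writes a matrix multiply as a kernel that computes one output element from its (row, column) index. It is plain Python and does not use any real GPGPU API: `matmul_kernel` and `launch` are hypothetical names, and `launch` simply loops over the index grid where a real device would run the invocations in parallel.

```python
from typing import Callable, List

Matrix = List[List[float]]


def matmul_kernel(i: int, j: int, a: Matrix, b: Matrix) -> float:
    """Compute one output element C[i][j], independent of all others."""
    return sum(a[i][k] * b[k][j] for k in range(len(b)))


def launch(kernel: Callable[..., float], rows: int, cols: int, *args) -> Matrix:
    """Hypothetical stand-in for a GPGPU launch: one kernel call per index.

    A real device would execute these calls in parallel; this loop runs
    them one at a time.
    """
    return [[kernel(i, j, *args) for j in range(cols)] for i in range(rows)]


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
c = launch(matmul_kernel, len(a), len(b[0]), a, b)
print(c)
```

The programmer describes the work for a single index and leaves scheduling across the hardware to the runtime, which is far closer to ordinary array code than to expressing the same computation through graphics primitives.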

Historically, GPUs have not been easily upgradeable; they have been discrete components soldered directly onto the printed circuit board. Upgrading the chip as new versions become available would require a complete board respin. Many of today’s GPGPUs (from ATI, NVIDIA, and others), however, are available as a Mobile PCI Express Module (MXM): an easy-to-insert format that facilitates upgrades when new, faster GPGPUs are available. Adherence to the MXM specification, developed by NVIDIA and now a stand-alone specification, ensures easy upgradeability for technology updates.

GPGPUs: A natural fit for high-performance embedded applications

A high GFLOPS/Watt ratio, parallel processing capabilities, a programmable software environment, and upgradeability are all now available with today’s GPGPUs. The application space in high-performance embedded computing for defense is clearly defined.

As mentioned, several applications in the high-performance embedded computing space could benefit from the use of GPGPUs. Persistent surveillance – an unmanned aerial vehicle application characterized by long mission duration and onboard sensor data exploitation – is a particularly good example. The long mission duration aspect of persistent surveillance demands minimal power consumption. Meanwhile, the intense computational aspect of onboard exploitation, including image stabilization and geo-registration, requires parallel processing – such as that provided by GPGPUs – to provide real-time, actionable information to the warfighter.

The missing link is a platform or environment that can support experimentation and algorithm tradeoffs. One such link is Mercury’s Sensor Stream Computing Platform (SSCP, see Figure 1), a 6U VXS development chassis that is the size of a piece of carry-on luggage, weighs 32 pounds, draws less than 600 W from a standard wall outlet, and achieves 3.84 TFLOPS (see Figure 2). The SSCP’s tunable power/performance operation allows the user to dial down the GPU clock speed to minimize power consumption during periods of inactivity, as is required for persistent surveillance and similar applications.

Figure 1: The Sensor Stream Computing Platform

Figure 2: Sensor Stream Computing Platform: Peak FLOPS versus GPU clock rate
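The SSCP figures quoted above allow a quick back-of-the-envelope check. Because the chassis draws less than 600 W and reaches 3.84 TFLOPS at peak, its system-level efficiency is at least 6.4 GFLOPS/Watt. The sketch below computes that bound and also models dialing down the clock. The linear relationship between peak FLOPS and clock rate is an assumption (a common first-order model for peak throughput), as are the clock fractions used; neither is taken from Figure 2.

```python
SSCP_PEAK_TFLOPS = 3.84    # from the article
SSCP_MAX_POWER_W = 600.0   # article says "less than 600 W": an upper bound


def min_gflops_per_watt(peak_tflops: float, max_power_w: float) -> float:
    """Lower bound on peak GFLOPS/Watt when power is known only as a ceiling."""
    return peak_tflops * 1000.0 / max_power_w


def peak_tflops_at_clock(full_peak_tflops: float, clock_fraction: float) -> float:
    """ASSUMPTION: peak FLOPS scales linearly with GPU clock rate."""
    if not 0.0 <= clock_fraction <= 1.0:
        raise ValueError("clock_fraction must be between 0 and 1")
    return full_peak_tflops * clock_fraction


bound = min_gflops_per_watt(SSCP_PEAK_TFLOPS, SSCP_MAX_POWER_W)
print(f"SSCP efficiency: at least {bound:.1f} GFLOPS/Watt")

# Hypothetical clock settings, for illustration only.
for fraction in (1.0, 0.75, 0.5):
    scaled = peak_tflops_at_clock(SSCP_PEAK_TFLOPS, fraction)
    print(f"clock at {fraction:.0%}: {scaled:.2f} TFLOPS peak")
```

How much power is saved at each clock setting depends on the hardware and is not modeled here.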

Anne Mascarin is a Product Marketing Manager at Mercury Computer Systems, where she has been employed for five years. Previously, she worked at The MathWorks and Analog Devices, Inc. Anne holds a Master of Science in Electrical Engineering from Northeastern University and a Bachelor of Arts in Economics from Boston University. She can be contacted at amascari@mc.com.

Scott Thieret is the Technical Director for GPU Computing at Mercury Computer Systems, where he has been employed for 10 years in various positions dedicated to GPU development. Prior to Mercury, he worked at Avid, MITRE, and IBM. Scott holds a Bachelor of Science in Computer Engineering from the University of Vermont. He can be contacted at sthieret@mc.com.

Mercury Computer Systems 866-627-6951 www.mc.com




©MMIX VME and Critical Systems. An OpenSystems Media, LLC publication.