The sea of transistors available on one chip opens the door to many new image and video processing applications for the security and transportation industries, with different trade-offs between raw performance, power and cost. Large reconfigurable chips (FPGAs) such as Virtex, or even embedded FPGA IP from M2000, and new design tools let designers use higher-level descriptions of their target architecture to explore the design space and find interesting trade-offs to tackle external memory latency, energy use and, last but not least, programming complexity, which must be further reduced by providing a software development environment. Such trade-offs are key to the success of many hardware initiatives such as the Ter@ops project, launched by the System@tic competitiveness cluster (pôle de compétitivité), or various start-ups dealing with MP-SoC or stream computing.
The FREIA project, submitted by ARMINES/CMM, THALES/TRT, ARMINES/CRI and Institut TELECOM/TELECOM_Bretagne, intends to merge at the application level, and to improve, two different image processing accelerator architectures, called SPoC and Ter@pix, by using common high-level and low-level interfaces. The goals are to address a larger set of applications, to support application portability to future accelerators, and to capitalize on the associated development environments, which will be based on the common interfaces. A set of applications will be used to benchmark SPoC, Ter@pix and the use of both of them via dynamic reconfiguration. These architectures will also be compared to the IBM Cell in terms of speed, cost, energy and programmability by developing software libraries implementing the same APIs.
The SPoC architecture developed by CMM is based on a large-grain approach. Each instruction is applied to a whole image in a pipelined way and performance gains stem from chaining instructions. The current architecture will be improved by adding new objects, new instructions, and deeper chaining. The software development environment, PIPS, provided by CRI, will automatically detect coarse-grain parallelism and chain the elementary accelerator instructions used in the benchmark applications to demonstrate the usability of the accelerator by any image application developer.
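The benefit of chaining whole-image instructions can be illustrated with a minimal Python sketch. It is not the SPoC instruction set: the operators (add_const, threshold, invert) and the "round trip" counter are hypothetical names chosen for illustration only. The sketch assumes that each unchained instruction reads and writes a full image in memory, whereas chained instructions let pixels stream through every stage before anything is written back.

```python
from functools import partial

# Hypothetical elementary instructions, each consuming a pixel stream.
def add_const(pixels, c):
    for p in pixels:
        yield min(255, p + c)

def threshold(pixels, t):
    for p in pixels:
        yield 255 if p >= t else 0

def invert(pixels):
    for p in pixels:
        yield 255 - p

def run_unchained(image, stages):
    """Apply one instruction at a time; every stage materializes a whole image."""
    current = list(image)
    round_trips = 0
    for stage in stages:
        current = list(stage(current))  # intermediate image stored in memory
        round_trips += 1
    return current, round_trips

def run_chained(image, stages):
    """Chain the stages; pixels flow through all of them before being stored."""
    stream = iter(image)
    for stage in stages:
        stream = stage(stream)  # lazy: nothing is computed yet
    return list(stream), 1  # a single pass over the image

stages = [partial(add_const, c=10), partial(threshold, t=128), invert]
image = [0, 100, 120, 200, 250]  # a flattened toy image

unchained_result, unchained_trips = run_unchained(image, stages)
chained_result, chained_trips = run_chained(image, stages)
```

Both functions compute the same image; only the number of full-image passes differs (three against one in this toy case), which is the kind of saving a tool that chains instructions automatically would aim for.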
The Ter@pix architecture developed by TRT is based on a medium-grain approach. The whole image cannot be stored on the accelerator, so the accelerator programming model is based on operators applied to sub-images. Two programming environments, SPEAR-DE, provided by TRT, and PIPS, will improve the usability of this accelerator by supporting the two optimization steps: tiling, to accommodate the on-chip memory, and loop fusion, to re-use on-chip data. The Ter@pix architecture can also be used as a fine-grain SIMD machine, providing more opportunities for optimization, but at the cost of a much greater complexity of the programming environment, which must include a full compiler. This approach will be pursued by TELECOM_Bretagne within the PIPS framework, the core of the project.
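The two optimization steps, tiling and loop fusion, can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the Ter@pix programming model: the function names, the tile sizes and the transfer counter are hypothetical, and the operators are pointwise, so tiles need no overlapping borders (neighborhood operators would).

```python
def tiles(height, width, tile_h, tile_w):
    """Yield (row_start, row_end, col_start, col_end) for each sub-image."""
    for r0 in range(0, height, tile_h):
        for c0 in range(0, width, tile_w):
            yield r0, min(r0 + tile_h, height), c0, min(c0 + tile_w, width)

def process_tiled_fused(image, ops, tile_h, tile_w):
    """Tiling: load one sub-image at a time into a small buffer.
    Fusion: apply every operator while the sub-image is still in the buffer."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    transfers = 0
    for r0, r1, c0, c1 in tiles(height, width, tile_h, tile_w):
        # Copy the sub-image into the (simulated) on-chip buffer.
        buf = [row[c0:c1] for row in image[r0:r1]]
        transfers += 1
        # Fused operators: the tile is not written back between them.
        for op in ops:
            buf = [[op(p) for p in row] for row in buf]
        # Write the finished tile back to the output image.
        for i, row in enumerate(buf):
            out[r0 + i][c0:c1] = row
    return out, transfers

# Toy 4x6 image processed in 2x4 tiles with two pointwise operators.
image = [[r * 6 + c for c in range(6)] for r in range(4)]
ops = [lambda p: p + 1, lambda p: p * 2]
result, transfers = process_tiled_fused(image, ops, tile_h=2, tile_w=4)
```

Without fusion, each operator would reload every tile, so the number of transfers would be multiplied by the number of operators; with fusion each tile is loaded once.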
Finally, the Cell architecture will be used to implement all of the above, including both the SPoC and the Ter@pix APIs. It will be used to measure the performance improvements due to specialized image processing accelerator architectures.
The expected results of the FREIA project are, first of all, a new image processing platform based on a common interface for two improved image processing accelerator architectures, implemented in FPGA, together with four different development environments to reduce the application development cost by hiding the target architecture without sacrificing performance. The improved SPoC and Ter@pix architectures will share common accelerator interfaces to make applications portable and to make partial dynamic reconfiguration of the accelerator possible when some parts of an application execute faster on SPoC and others on Ter@pix. Extensive benchmarking results, using existing image processing applications, will be presented for these two architectures and for the Cell processor, and also for the different application development environments. The global architecture, the architecture comparison, the common interfaces and the specific development tools will be used by TRT to guide, enrich and support the Ter@ops project. The developed IP, the development tools and the whole platform will be used by CMM in industrial application projects with real-time constraints. CMM is also considering selling IP. By working closely with hardware specialists, CRI expects to better understand where future compilation challenges lie and how past investments in HPC can be re-used for MP-SoC. It also expects the standardization efforts to be re-used for hardware accelerators beyond the image processing field. In the same way, TELECOM_Bretagne will re-invest expertise and results in its own application fields such as turbo-codes, which were invented at TELECOM_Bretagne.