Easy and efficient high-performance computing on FPGAs

Date and Time:

January 28, 2011 - 2:10pm - 2:30pm

Presentation Abstract:

The use of multicore systems in all domains of computing has emphasized the benefits of heterogeneous multiprocessors, in which processors with different compute characteristics are combined to boost the performance of different application kernels. GPUs and FPGAs have become very popular in heterogeneous systems for speeding up compute-intensive kernels in scientific, imaging and simulation applications. FPGAs offer a highly versatile compute platform that embeds CPUs, DSPs, memories and various hard communication IP blocks alongside a reconfigurable fabric, which makes it possible to tailor the hardware to the parallelism inherent in compute-intensive applications. Nevertheless, mapping application parallelism onto reconfigurable hardware parallelism is, in most cases, not a push-button process. In this presentation we will talk about the FCUDA project, which provides a framework for efficient and automated mapping of CUDA kernels onto high-performance accelerators on the FPGA. FCUDA comprises a source-to-source compilation engine that works in tandem with a High-Level Synthesis tool to map compute-intensive kernels into parallel hardware compute engines. In the proposed framework we leverage different granularities of parallelism and employ an efficient design space exploration engine to determine a maximal-performance configuration without iterating through the time-consuming feedback loop of synthesis and place-and-route. We will discuss experimental results showing that the FCUDA framework transforms CUDA kernels efficiently into FPGA accelerators that can offer competitive performance compared to GPU execution.
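
As a rough illustration of what a source-to-source step from CUDA to C can look like, the sketch below rewrites a standard CUDA vector-addition kernel as plain C with the implicit threads and thread blocks turned into explicit loops. This is an assumption made for illustration only, not FCUDA's actual output: the function names `vec_add_block` and `vec_add_grid` and the parameters `block_idx` and `block_dim` are hypothetical, and a real High-Level Synthesis flow would also need tool-specific directives that are omitted here.

```c
#include <assert.h>

/* CUDA original, shown for reference (standard vector addition):
 *
 *   __global__ void vec_add(const float *a, const float *b, float *c, int n) {
 *       int i = blockIdx.x * blockDim.x + threadIdx.x;
 *       if (i < n) c[i] = a[i] + b[i];
 *   }
 */

/* One thread block, with the implicit CUDA threads made explicit as a loop.
 * block_idx and block_dim are hypothetical stand-ins for blockIdx.x and
 * blockDim.x. The iterations are independent of each other, which is the
 * fine-grained parallelism a synthesis tool could exploit by unrolling. */
static void vec_add_block(const float *a, const float *b, float *c, int n,
                          int block_idx, int block_dim)
{
    for (int thread_idx = 0; thread_idx < block_dim; ++thread_idx) {
        int i = block_idx * block_dim + thread_idx;
        if (i < n)            /* same bounds guard as the CUDA kernel */
            c[i] = a[i] + b[i];
    }
}

/* The whole grid as a loop over blocks. Blocks do not depend on each other,
 * so this is the coarse-grained parallelism that could be spread across
 * several hardware compute engines. */
void vec_add_grid(const float *a, const float *b, float *c, int n,
                  int block_dim)
{
    assert(block_dim > 0 && n >= 0);
    int grid_dim = (n + block_dim - 1) / block_dim;  /* round up */
    for (int block_idx = 0; block_idx < grid_dim; ++block_idx)
        vec_add_block(a, b, c, n, block_idx, block_dim);
}
```

Calling `vec_add_grid(a, b, c, n, block_dim)` on a CPU computes the same element-wise sum as the CUDA kernel. The two loop levels make the two granularities of parallelism visible: how far each loop is unrolled or replicated is the kind of choice a design space exploration would have to make.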