Technical Session 1: GPU-based computing
Ray Bittner, Erik Ruf: Direct GPU/FPGA Communication Via PCI Express
Parallel processing has hit mainstream
computing in the form of CPUs, GPUs and FPGAs. While explorations
proceed with all three platforms individually and with the CPU-GPU pair,
little exploration has been performed with the synergy of GPU-FPGA.
This is due in part to the cumbersome nature of communication between
the two. This paper presents a mechanism for direct GPU-FPGA
communication and characterizes its performance in a full hardware
implementation.
Brad Suchoski, Caleb Severn, Manu Shantharam and Padma Raghavan: Adapting Sparse Triangular Solution to GPUs
High performance computing systems are
increasingly incorporating hybrid CPU/GPU nodes to accelerate the rate
at which floating point calculations can be performed for scientific
applications. Currently, a key challenge is adapting scientific
applications to such systems when the underlying computations are
sparse, such as sparse linear solvers for the simulation of partial
differential equation models using semi-implicit methods. A key
bottleneck is the sparse triangular solution step in solvers such as
preconditioned conjugate gradients (PCG). We show that sparse triangular
solution can be effectively mapped to GPUs by extracting very large
degrees of fine-grained parallelism using graph coloring. We develop
simple performance models to predict these effects at the intersection of
the data and hardware attributes, and we evaluate our scheme on an NVIDIA
Tesla M2090 GPU relative to the level set scheme developed at NVIDIA.
Our results indicate that our approach significantly increases the
available fine-grained parallelism, reducing execution time relative to
the NVIDIA scheme by a factor with a geometric mean of 5.41 on a single
GPU, with speedups as high as 63 in some cases.
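To illustrate the contrast between level set scheduling and graph coloring, here is a minimal sketch. It is not the authors' implementation: the helper names (level_sets, greedy_coloring) and the 8-row tridiagonal example are assumptions chosen for illustration. Rows in the same level of a triangular solve can be processed in parallel, so fewer levels means more parallel work per step.

```python
def level_sets(lower_deps):
    """Level of each row in a lower-triangular solve.

    lower_deps[i] lists the columns j < i where row i has a nonzero,
    i.e. the unknowns that must be solved before row i.
    """
    level = [0] * len(lower_deps)
    for i, deps in enumerate(lower_deps):
        level[i] = 1 + max((level[j] for j in deps), default=-1)
    return level


def greedy_coloring(adj):
    """Greedy vertex coloring: neighbors never share a color."""
    color = [-1] * len(adj)
    for v in range(len(adj)):
        used = {color[u] for u in adj[v] if color[u] >= 0}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


# Hypothetical example: an 8x8 tridiagonal matrix (a 1D chain graph).
n = 8
adj = [[j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)]

# Natural ordering: row i depends on row i - 1, so every row is its own level.
natural_deps = [[j for j in adj[i] if j < i] for i in range(n)]
num_levels_natural = max(level_sets(natural_deps)) + 1

# Color the graph, then renumber the rows so that each color is contiguous.
colors = greedy_coloring(adj)
num_colors = max(colors) + 1
order = sorted(range(n), key=lambda v: (colors[v], v))
new_index = {v: k for k, v in enumerate(order)}
reordered_deps = [
    sorted(new_index[u] for u in adj[v] if new_index[u] < new_index[v])
    for v in order
]
num_levels_colored = max(level_sets(reordered_deps)) + 1

print(num_levels_natural, num_colors, num_levels_colored)
```

In this toy case the natural ordering yields one row per level, while the color-based reordering yields one level per color, each holding half of the rows. Reordering changes the triangular factor itself, so its effect on the convergence of a preconditioned solver would need to be assessed separately.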
Presentation
Pedro Valero-Lara: MRF satellite image classification on GPU
One stage of satellite image analysis is classification based on the
Markov Random Fields (MRF) method. Several packages described in the
literature carry out this analysis, including the classification task;
one of them is the Orfeo ToolBox (OTB). Satellite image analysis is a
computationally expensive task that requires real-time execution or
automation. Parallelism can be used to reduce its execution time. Currently, Graphics
Processing Units (GPUs) are becoming a good choice to reduce the
execution time of several applications at a low cost. In this paper, the
author presents a GPU-based MRF classification derived from the
sequential algorithm in the OTB package. The experimental results show
that the GPU-based algorithm runs up to 225 times faster than the
sequential algorithm included in the OTB package. Total power
consumption is also reduced by a significant amount.
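As a rough illustration of why MRF classification suits parallel hardware, here is a minimal sketch. The abstract does not state which energy function or optimizer is used, so everything here is an assumption: a Potts smoothness prior, a squared-distance data term, iterated conditional modes (ICM) updates, and the hypothetical names icm_checkerboard, means and beta. Pixels of the same checkerboard parity share no 4-neighbors, so each half-sweep could be computed by independent threads.

```python
def icm_checkerboard(image, means, beta=0.5, sweeps=5):
    """Label each pixel with a class index using ICM on a 4-connected grid.

    Assumed energy per pixel: (value - class mean)^2 plus beta times the
    number of neighbors holding a different label (Potts prior).
    """
    rows, cols = len(image), len(image[0])
    classes = range(len(means))
    # Initial labels: nearest class mean, ignoring neighbors.
    labels = [
        [min(classes, key=lambda k: (image[r][c] - means[k]) ** 2)
         for c in range(cols)]
        for r in range(rows)
    ]
    for _ in range(sweeps):
        for parity in (0, 1):
            updates = {}
            for r in range(rows):
                for c in range(cols):
                    if (r + c) % 2 != parity:
                        continue
                    neighbors = [
                        labels[rr][cc]
                        for rr, cc in ((r - 1, c), (r + 1, c),
                                       (r, c - 1), (r, c + 1))
                        if 0 <= rr < rows and 0 <= cc < cols
                    ]

                    def energy(k):
                        data = (image[r][c] - means[k]) ** 2
                        smooth = beta * sum(1 for m in neighbors if m != k)
                        return data + smooth

                    # Same-parity pixels are never neighbors, so these
                    # updates are independent of one another.
                    updates[(r, c)] = min(classes, key=energy)
            for (r, c), k in updates.items():
                labels[r][c] = k
    return labels


# Hypothetical 3x4 image: dark left half, bright right half, one noisy pixel.
image = [
    [0.1, 0.0, 0.9, 1.0],
    [0.0, 0.6, 1.0, 0.9],
    [0.1, 0.0, 1.0, 1.0],
]
labels = icm_checkerboard(image, means=[0.0, 1.0], beta=0.5)
for row in labels:
    print(row)
```

The noisy pixel at row 1, column 1 starts closer to the bright class, but the smoothness term pulls it to the label of its mostly dark neighbors.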
Presentation
