GPU-Graphics Processing Unit IEEE PAPER 2017





APUNet: Revitalizing GPU as Packet Processing Accelerator.

Abstract Many research works have recently experimented with GPU to accelerate packet processing in network applications. Most works have shown that GPU brings a significant performance boost when it is compared to the CPU-only approach, thanks to its highly-

Compiler techniques to reduce the synchronization overhead of gpu redundant multithreading

ABSTRACT Redundant Multi-Threading (RMT) provides a potentially low cost mechanism to increase GPU reliability by replicating computation at the thread level. Prior work has shown that RMT's high performance overhead stems not only from executing redundant threads,

GPU Taint Tracking

Without address space layout randomization, an attacker can predict where GPU data is stored. [Patterson, ISU thesis 2013] Without process isolation, an attacker can peek into another GPU process and steal encryption keys. [Pietro+, TECS 2016] Without page protection

Gravel: Fine-Grain GPU-Initiated Network Messages

ABSTRACT Distributed systems incorporate GPUs because they provide massive parallelism in an energy-efficient manner. Unfortunately, existing programming models make it difficult to route a GPU-initiated network message. The traditional coprocessor model

Analyzing memory management methods on integrated CPU-GPU systems

Abstract Heterogeneous systems that integrate a multicore CPU and a GPU on the same die are ubiquitous. On these systems, both the CPU and GPU share the same physical memory as opposed to using separate memory dies. Although integration eliminates the need to

GPU Multisplit: an extended study of a parallel algorithm

This paper is an extended version of initial results published at PPoPP 2016 [3]. The source code is available at https://github.com/owensgroup/GpuMultisplit. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted

Moving Object Detection by Connected Component Labeling of Point Cloud Registration Outliers on the GPU.

Abstract: Using a depth camera, the KinectFusion with Moving Objects Tracking (KinFu MOT) algorithm permits tracking the camera poses and building a dense 3D reconstruction of the environment which can also contain moving objects. The GPU processing pipeline

A GPU deep learning metaheuristic based model for time series forecasting

Abstract As the new generation of smart sensors is evolving towards high sampling acquisitions systems, the amount of information to be handled by learning algorithms has been increasing. The Graphics Processing Unit (GPU) architecture provides a greener

Computational Power Optimization of 3D Virtual Audio Techniques on DSP and GPU Platforms

3D virtual audio techniques are used in many spatial audio applications such as home theater entertainment, gaming, teleconference and remote control. With binaural loudspeakers, these techniques are able to offer virtual surround sound effects to the listener

GPU Based Face Recognition System for Authentication

ABSTRACT The face has a significant role in identifying a person for authentication purposes in public places such as airport security. Face recognition has many real-world applications including surveillance and authentication. Due to complex and multidimensional structure of

Computing delaunay refinement using the GPU.

Abstract We propose the first working GPU algorithm for the 2D Delaunay refinement problem. Our algorithm adds Steiner points to an input planar straight line graph (PSLG) to generate a constrained Delaunay mesh with triangles having no angle smaller than an input

Statistical Pattern Based Modeling of GPU Memory Access Streams

ABSTRACT Recent research studies have shown that modern GPU performance is often limited by the memory system performance. Optimizing memory hierarchy performance requires GPU designers to draw design insights based on the cachememory behavior of

Energy Efficient Real-time Task Scheduling on CPU-GPU Hybrid Clusters

Abstract Conserving the energy consumption of large data centers is of critical significance, where a few percent in consumption reduction translates into millions-dollar savings. This work studies energy conservation on emerging CPU-GPU hybrid clusters

Achieving Portable Performance for GTC-P with OpenACC on GPU, multi-core CPU, and Sunway Many-core Processor



GPU Parallel Program for the Bin Packing Problem

The purpose of this paper is to explore the use of GPU computing for solving the famous bin packing problem. Specifically, a massively parallel seesaw search program was constructed using Nvidia's CUDA API and the Parallel Java 2 Library. The speed and quality of the

High-Throughput Subset Matching on Commodity GPU-Based Systems

Abstract Large-scale information processing often relies on subset matching for data classification and routing. Examples are publish/subscribe and stream processing systems, database systems, social media, and information-centric networking. For instance, an

Strategies for Regular Segmented Reductions on GPU

Abstract We present and evaluate an implementation technique for regular segmented reductions on GPUs. Existing techniques tend to be either consistent in performance but relatively inefficient in absolute terms, or optimised for specific workloads and thereby

Abstract In this contribution, an advanced numerical regression approach based on graphics processing unit (GPU) is introduced. The approach has been applied for real-time terahertz thickness measurements of individual layers within multi-layered structures for a

Parallel continuous collision detection for high-performance GPU cluster.

Abstract Continuous collision detection (CCD) is a process to interpolate the trajectory of polygons and detect collisions between successive time steps. However, primitive-level CCD is a very time-consuming process especially for a large number of moving polygons.

Response-Time Bounds for Concurrent GPU Scheduling

Abstract Graphics processing units (GPUs) have been receiving increasing attention in the real-time systems community as a potential solution for hosting workloads like those found in autonomous-driving use cases that require significant computational capacity. Allowing

Mini-Gunrock: A Lightweight Graph Analytics Framework on the GPU

Abstract: Existing GPU graph analytics frameworks are typically built from specialized, bottom-up implementations of graph operators that are customized to graph computation. In this work we describe Mini-Gunrock, a lightweight graph analytics framework on the GPU.

CPU and GPU Behaviour Modelling Versus Sequential and Parallel Bias Field Correction Fuzzy C-means Algorithm Implementations

Abstract The correction of images corrupted by bias field artefact is still a challenging task, both at the accuracy level and on the computational plane. The work in this paper focuses on the second constraint by giving mathematical models of experimental execution time per iteration (ETPI)

Corolla: GPU-Accelerated FPGA Routing Based on Subgraph Dynamic Expansion.

Slides: Corolla: GPU-Accelerated FPGA Routing Based on Subgraph Dynamic Expansion. Minghua Shen and Guojie Luo, Peking University. FPGA, February 23, 2017. Slide topics: Motivation; Background; Search Space Reduction for Routing; Dynamic Parallelism

MCM-GPU: Multi-Chip-Module GPUs for Continued Performance Scalability

ABSTRACT Historically, improvements in GPU-based high performance computing have been tightly coupled to transistor scaling. As Moore's law slows down, and the number of transistors per die no longer grows at historical rates, the performance curve of single

gNUFFTW: Auto-Tuning for High-Performance GPU-Accelerated Non-Uniform Fast Fourier Transforms

Abstract Non-uniform sampling of the Fourier transform appears in many important applications such as magnetic resonance imaging (MRI), optics, tomography and radio interferometry. Computing the inverse often requires fast application of the non-uniform

Cooperative kernels: GPU multitasking for blocking algorithms

ABSTRACT There is growing interest in accelerating irregular data-parallel algorithms on GPUs. These algorithms are typically blocking, so they require fair scheduling. But GPU programming models (e.g., OpenCL) do not mandate fair scheduling, and GPU schedulers

Simulations of Coherent Synchrotron Radiation on Parallel Hybrid GPU/CPU Platform

Abstract Coherent synchrotron radiation (CSR) is an effect of self-interaction of an electron bunch as it traverses a curved path. It can cause a significant emittance degradation, as well as fragmentation and microbunching. Numerical simulations of the 2D/3D CSR effects have

Continuous and discrete models of melanoma progression simulated in multi-GPU environment

Abstract. Existing computational models of cancer evolution mostly represent very general approaches for studying tumor dynamics in a homogeneous tissue. Here we present two very different models, continuous and discrete ones, of a specific cancer type, melanoma

Towards Composable GPU Programming: Programming GPUs with Eager Actions and Lazy Views.

Abstract In this paper, we advocate a composable approach to programming systems with Graphics Processing Units (GPU): programs are developed as compositions of generic, reusable patterns. Current GPU programming approaches either rely on low-level,

Software Puzzle for GPU Inflated DoS Attack

Abstract: Denial-of-service (DoS) and distributed DoS (DDoS) are among the major threats to cyber-security, and client puzzle, which demands a client to perform computationally expensive operations before being granted services from a server, is a well-known
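
The client puzzle idea described in this abstract can be illustrated with a generic hash-based puzzle. This is only a minimal sketch of the general concept, not the software puzzle scheme of the listed paper: the function names, the use of SHA-256, the 8-byte nonce encoding, and the difficulty measure (leading zero bits) are all assumptions made for illustration.

```python
import hashlib


def leading_zero_bits(digest: bytes) -> int:
    """Count the number of leading zero bits in a byte string."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def puzzle_digest(challenge: bytes, nonce: int) -> bytes:
    """Hash the server challenge together with a candidate nonce (assumed encoding)."""
    return hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()


def solve(challenge: bytes, difficulty: int) -> int:
    """Client side: brute-force a nonce; expected cost grows as 2**difficulty hashes."""
    nonce = 0
    while leading_zero_bits(puzzle_digest(challenge, nonce)) < difficulty:
        nonce += 1
    return nonce


def verify(challenge: bytes, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash is enough to check the client's work."""
    return leading_zero_bits(puzzle_digest(challenge, nonce)) >= difficulty
```

The asymmetry is the point of the design: the client performs many hash evaluations to find a nonce, while the server verifies the answer with one.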

Modular array-based GPU computing in a dynamically-typed language

Abstract Nowadays, GPU accelerators are widely used in areas with large data-parallel computations such as scientific computations or neural networks. Programmers can either write code in low-level CUDA/OpenCL code or use a GPU extension for a high-level

Just-In-Time GPU Compilation for Interpreted Languages with Partial Evaluation.

Abstract Computer systems are increasingly featuring powerful parallel devices with the advent of many-core CPUs and GPUs. This offers the opportunity to solve computationally-intensive problems at a fraction of the time traditional CPUs need. However, exploiting

Towards Efficient Graph Traversal using a Multi-GPU Cluster

Abstract Graph processing has always been a challenge, as there are inherent complexities in it. These include scalability to larger data sets and clusters, dependencies between vertices in the graph, irregular memory accesses during processing and traversals,

Detecting Bank Conflict of GPU Programs Using Symbolic Execution: Case Study

Abstract The GPU (Graphics Processing Unit) is used in various areas. Therefore, the demand for the verification of GPU programs is increasing. In this paper, we suggest a method to detect bank conflicts by using symbolic execution. Bank conflict is one of the bugs happening
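
To make the notion of a bank conflict concrete, the following sketch counts how many distinct shared-memory words fall into the same bank for one warp. It uses plain enumeration, not the symbolic execution approach of the listed paper. The bank model is an assumption: 32 banks, consecutive 4-byte words mapped to consecutive banks, and threads reading the same word served by a broadcast. The function name is hypothetical.

```python
from collections import defaultdict


def bank_conflict_degree(word_indices, num_banks=32):
    """Return the worst-case number of distinct words that map to one bank.

    word_indices holds, for each thread of a warp, the index of the 4-byte
    shared-memory word it accesses. A result of 1 means conflict-free;
    a result of k means the access is serialized into k steps (assumed model).
    """
    words_per_bank = defaultdict(set)
    for word in word_indices:
        words_per_bank[word % num_banks].add(word)
    return max(len(words) for words in words_per_bank.values())


# Access patterns for a 32-thread warp with different strides (in words).
unit_stride = [t for t in range(32)]
stride_two = [2 * t for t in range(32)]
stride_32 = [32 * t for t in range(32)]
padded = [33 * t for t in range(32)]  # padding a 32-wide row by one word
```

Under this model a stride of 32 words sends every thread to the same bank, while padding the row to 33 words spreads the accesses over all banks again.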

GPU-GIST: a case of generalized database indexing on modern hardware

Abstract: A lot of different indexes have been developed for accelerating search operations on large data sets. Search trees, representing the most prominent class, are ubiquitous in database management systems but are also widely used in non-DBMS applications. An

A GPU Variant of Mbtrack and Its Application in SLS-2

Abstract Mbtrack is a widely used multi-bunch tracking code for modeling collective instabilities in electron storage rings. It has been applied to the Swiss Light Source upgrade proposal (SLS-2) for the study of single bunch instabilities. However, an n-bunch simulation

Visual Analytics of Millions of GPU Threads

Abstract Although the GPGPU has been widely used in various fields for algorithms acceleration, it is notorious for its programming difficulties because of many different concepts from general CPU programming and issues from the huge number of concurrent

Study of Parallel Image Processing with the Implementation of vHGW Algorithm using CUDA on NVIDIA's GPU Framework

Abstract This paper provides an effective study of the implementation of parallel image processing techniques using CUDA on the NVIDIA GPU framework. It also discusses the major requirements of parallelism in medical image processing techniques. Additional

Evaluation Of The Performance Of GPU Global Memory Coalescing

Abstract Nowadays, the GPU is widely used for graphics and general-purpose parallel computations. In GPU software development, memory coalescing is one of the most important optimization techniques, which reduces the number of memory transactions. In this
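
The effect of coalescing on the number of memory transactions can be approximated with a small model. This is a sketch under stated assumptions, not the evaluation method of the listed paper: a warp of 32 threads, 4-byte elements, and global memory served in aligned 128-byte segments, with one transaction per distinct segment touched. Function names and parameters are hypothetical.

```python
def warp_addresses(stride_elems, elem_bytes=4, warp_size=32, base=0):
    """Byte address accessed by each thread of a warp for a strided access."""
    return [base + t * stride_elems * elem_bytes for t in range(warp_size)]


def memory_transactions(byte_addresses, segment_bytes=128):
    """Count the distinct aligned segments touched (one transaction each, assumed)."""
    return len({addr // segment_bytes for addr in byte_addresses})


coalesced = memory_transactions(warp_addresses(1))            # adjacent elements
strided = memory_transactions(warp_addresses(32))             # one segment per thread
misaligned = memory_transactions(warp_addresses(1, base=64))  # straddles two segments
```

In this model adjacent accesses from a warp collapse into a single transaction, a stride of 32 elements needs one transaction per thread, and a misaligned base address doubles the count for an otherwise coalesced access.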

GPU-accelerated Video Transcoding Unit for Multi-access Edge Computing Scenarios

Abstract The exponential growth of video traffic and the outburst of novel video-based services are revealing the inadequacy of the traditional mobile network infrastructure. To respond to this and to many other demands coming from today's society, the 5G and the

Large Integer Arithmetic in GPU for Cryptography

ABSTRACT Most computers nowadays support 32-bit or 64-bit data types in various programming languages, and these are sufficient for most use cases. However, in cryptography, the required range and precision are more than 64 bits which are

The Raspberry Pi was created to meet a need to help younger people become involved in the IT field. As a low-cost computer, it can be used, experimented with, broken, and replaced. Initially expected to sell perhaps a few thousand, it has now sold more than 10

GPU-Centered Font Rendering Directly from Glyph Outlines

Abstract This paper describes a method for rendering antialiased text directly from glyph outline data on the GPU without the use of any precomputed texture images or distance fields. This capability is valuable for text displayed inside a 3D scene because, in addition to

ANALYSIS OF RAY BATCHING ON THE GPU

Abstract Due to the large amount of scene data in production renderers that use Monte Carlo techniques, efficient and fast ray tracing means batching up rays in some arbitrary amount. Hyperion, Disney's renderer, pools 33 million rays for any scene to be rendered, and this

Real-time 3D integral imaging system using a faster elemental image generation method using GPU parallel processing

A novel method of faster computation of Elemental Image generation for real time integral imaging 3D display system, with the implementation of GPU parallel processing is proposed. Previous experiments were conducted to generate Real Time Integral Image and resulting

GPU Parallelization of Back-Propagation Neural Network

Abstract: The Graphics Processing Unit (GPU) can provide remarkable performance gains when compared to the Central Processing Unit (CPU) for computationally intensive applications. The GPU has acquired programmability to perform general purpose computation fast by running ten

INTELLIGENT SCHEDULING FOR SIMULTANEOUS CPU-GPU APPLICATIONS

ABSTRACT Heterogeneous computing systems with both general purpose multicore central processing units (CPUs) and specialized accelerators have emerged recently. The graphics processing unit (GPU) is the most widely used accelerator. To fully utilize such a

Accelerating GPU Hardware Transactional Memory with Snapshot Isolation

ABSTRACT Snapshot Isolation (SI) is an established model in the database community, which permits write-read conflicts to pass and aborts transactions only on write-write conflicts. With the Write Skew anomaly correctly eliminated, SI can reduce the occurrence of
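
The commit rule described in this abstract (write-read conflicts pass, only write-write conflicts abort) can be sketched in software with a first-committer-wins check. This is a toy model, not the hardware transactional memory design of the listed paper; the class names `Store` and `Txn` are hypothetical, and the sketch does not address the Write Skew anomaly that the abstract mentions.

```python
class Store:
    """A tiny key-value store that records the commit time of each key's last write."""

    def __init__(self):
        self.data = {}      # key -> committed value
        self.version = {}   # key -> logical commit timestamp of the last write
        self.clock = 0      # logical time, advanced on every successful commit

    def begin(self):
        # Each transaction reads from a private copy taken at its start time.
        return Txn(self, self.clock, dict(self.data))


class Txn:
    def __init__(self, store, start_ts, snapshot):
        self.store = store
        self.start_ts = start_ts
        self.snapshot = snapshot
        self.writes = {}

    def read(self, key):
        # Reads see the transaction's own writes first, then its snapshot.
        return self.writes.get(key, self.snapshot.get(key))

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Abort only on a write-write conflict: some key in our write set
        # was committed by another transaction after our snapshot was taken.
        for key in self.writes:
            if self.store.version.get(key, 0) > self.start_ts:
                return False
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.version[key] = self.store.clock
        return True
```

Two concurrent transactions that each read what the other writes both commit under this rule, whereas two that write the same key cannot both commit.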

Inferring Scheduling Policies of an Embedded CUDA GPU

Abstract Embedded systems augmented with graphics processing units (GPUs) are seeing increased use in safety-critical real-time systems such as autonomous vehicles. Due to monetary cost requirements along with size, weight, and power (SWaP) constraints,

Evaluating a CPU/GPU Implementation for Real-Time Ray Tracing

Abstract Animated movies, CGI, and video games are commonplace in every day life. Virtual- and Augmented Reality are becoming more pervasive in society, and with them, the role of Computer Graphics becomes even more important. Part of creating these experiences is

Parallel Execution Optimization of GPU-aware Components in Embedded Systems

Abstract Many embedded systems process huge amounts of data that come from the interaction with the environment. The Graphics Processing Unit (GPU) is a modern embedded solution that tackles the efficiency challenge when processing a lot of data. GPU

A unified GPU-CPU aeroelastic compressible URANS solver for aeronautical, turbomachinery and open rotors applications

English abstract: For the aerodynamic design of aeronautical components Computational Fluid Dynamics (CFD) plays a fundamental role. Pure CFD analyses are usually sufficiently accurate for a wide range of problems. However, when the deformability of the structure

A Survey of Power Consumption Modeling for GPU Architecture

Abstract: GPUs are of increasing interest in the multi-core era due to their high computing power. However, the power consumption caused by the rising performance of GPUs has been a general concern. As a consequence, it is becoming an imperative demand to

GPU-Based Acceleration for 3D OCT Imaging

ABSTRACT We designed a graphics processing unit (GPU)-based acceleration to reconstruct the optical coherence tomography (OCT) images as sub-micrometer resolution with the spectral domain OCT (SD-OCT) system. GPU-based acceleration is the use of

RLAGPU: High-performance Out-of-Core Randomized Singular Value Decomposition on GPU

Randomized Singular Value Decomposition (SVD)[1] is gaining attention in finding structure in scientific data. However, processing large-scale data is not easy due to the limited capacity of GPU memory. To deal with this issue, we propose RLAGPU, an out-of-core

Development of GPU-based fast reconstruction algorithm for Gamma ray imaging with insufficient conditions

The purpose of this study is to develop a graphic processing unit (GPU)-based fast reconstruction algorithm for nuclear medicine image under insufficient conditions, and verification of the developed algorithm is carried out to achieve the purpose. Simple-pattern

GPU Scripting using PyCUDA

Slides: GPU Scripting using PyCUDA, by Kushagra Trivedi. Slide topics: Content and the Learning Process; Introduction of PyCUDA. PyCUDA is a package available for Python that exposes the power of CUDA-compatible GPU processors.

Exposing Hidden Performance Opportunities in High Performance GPU Applications

Abstract The emergence of leadership class systems with nodes containing many-core accelerators, such as GPUs, has the potential to vastly increase the performance of distributed applications. Exploiting the additional parallelism that manycore accelerators

Efficient Semantic Search over Structured Web Data: A GPU Approach

Abstract. Semantic search is an advanced topic in information retrieval which has attracted increasing attention in recent years. The growing availability of structured semantic data offers opportunities for semantic search engines, which can support more expressive

GPU-Accelerated SVM Training Algorithm Based on PC and Mobile Device

(Support Vector Machine) which is suitable for Android operating system. SVM is widely used in the health-related applications. The SVM provides a potential classification technology based on the pattern recognition method and statistical learning theory. This

GPU accelerated atmospheric chemical kinetics in the ECHAM/MESSy (EMAC) Earth system model (version 2.52)

Abstract. This paper presents an application of GPU accelerators in Earth system modelling. We focus on atmospheric chemical kinetics, one of the most computationally intensive tasks in climate-chemistry model simulations. We developed a software package that

Patch-Based Recursive Catmull-Clark Subdivision on the GPU

Abstract Catmull-Clark subdivision is an algorithm that takes a coarse mesh of a 3D model as input and outputs a smooth mesh. It has many different applications from level of detail rendering to feature film production. Starting from the coarse control mesh a series of

DeepSpotCloud: Leveraging Cross-Region GPU Spot Instances for Deep Learning

Abstract Cloud computing resources that are equipped with GPU devices are widely used for applications that require extensive parallelism, such as deep learning. When the demand for cloud computing instances is low, the surplus of resources is provided at a lower price in

Enabling Asynchronous Coupled Data Intensive Analysis Workflows on GPU-accelerated Platforms via Data Staging

ABSTRACT Enabled by advanced network techniques such as InfiniBand and RDMA, data staging and in-situ/in-transit techniques are emerging as an attractive approach for large scale data intensive workflows. At the same time, accelerator based heterogeneous platforms

NUFFT: Fast Auto-Tuned GPU-Based Library

Synopsis We present a fast auto-tuned library for computing non-uniform fast Fourier Transform (NUFFT) on GPU. The library includes forward and adjoint NUFFT using precomputation-free and fully-precomputed methods, as well as Toeplitz-based operation

Technical report: Crane-Fast and Migratable GPU Passthrough for OpenCL applications

ABSTRACT General purpose GPU (GPGPU) computing in virtualized environments leverages PCI passthrough to achieve GPU performance comparable to bare-metal execution. However, GPU passthrough prevents service administrators from performing

Abstract The increasing need for computing power today justifies the continuous search for techniques that decrease the time to answer usual computational problems. To take advantage of new hybrid parallel architectures composed of multithreading and

Dynamic performance prediction for chunk-wise parallelization on heterogeneous CPU/GPU systems

Abstract Many aspects of heterogeneity in multicores, such as performance variation, may affect the overall execution time and core efficiency. An effective mapping should support this variation. A complex challenge is load balancing across cores to minimize the program

Accelerate Local Tone Mapping for High Dynamic Range Images Using OpenCL with GPU

Abstract Tone mapping has been used to transfer HDR (high dynamic range) images to low dynamic range. This paper describes an algorithm to display high dynamic range images. Although the local tone-mapping operator is better than the global operator in reproducing images

A modular GPU raytracer using OpenCL for non-interactive graphics

ABSTRACT We describe the development of a modular plugin based raytracer renderer called RenderGirl suitable for running inside the OpenCL framework. We aim to take advantage of heterogeneous computing devices such as GPUs and many-core CPUs,

CUDA Optimized dynamic programming search for automatic speech recognition on a GPU platform

Abstract In a typical recognition process, there are substantial parallelization challenges in concurrently assessing thousands of alternative interpretations of a speech utterance to find the most probable interpretation. During this process, input signals are converted into

GPU accelerated investigation of a dual-frequency driven nonlinear oscillator

Summary. The bifurcation structure of a dual-frequency driven, second order nonlinear oscillator (Keller-Miksis equation) is investigated by exploiting the high computational resources of professional GPUs. The numerical scheme of the applied initial value problem

GPU Based Text Analytics

This is the documentation for the project "GPU Based Text Analytics" of the Webis group. In this project we installed, configured and tested a new deep learning cluster for the group. The second part was to use the new cluster with deep learning software to get familiar with

GPU Scheduling on the NVIDIA TX2: Hidden Details Revealed

Abstract The push towards fielding autonomous-driving capabilities in vehicles is happening at breakneck speed. Semi-autonomous features are becoming increasingly common, and fully autonomous vehicles are optimistically forecast to be widely available in just a few

CUDA compatible GPU as an efficient hardware accelerator for Automatic Subtitle Generation

Abstract: As avid audiences, we always face the need to find the right subtitle file for a particular video or audio file and these subtitles can be very helpful for deaf and hearing impaired persons as it allows them to perceive acoustic information in an alternative way.

GPU computations and memory access model based on Petri nets

Abstract. In modern systems CPUs as well as GPUs are equipped with multi-level memory architectures, where different levels of the hierarchy vary in latency and capacity. Therefore, various memory access models were studied. Such a model can be seen as an interface

Reducing GPU Address Translation Overhead with Virtual Caching

ABSTRACT Heterogeneous computing on tightly-integrated CPU-GPU systems is ubiquitous, and to increase programmability, many of these systems support virtual address accesses from GPU hardware. However, there is no free lunch. Supporting virtual memory

Computation of Synchrotron Radiation on Arbitrary Geometries in 3D with Modern GPU, Multi-Core, and Grid Computing

Abstract Open Source Code for Advanced Radiation Simulation (OSCARS) is an open source project developed at Brookhaven National Laboratory for the computation of synchrotron radiation from arbitrary particle beams in arbitrary magnetic (and electric) fields

REGION GROWING IMAGE SEGMENTATION ON LARGE DATASETS USING GPU

ABSTRACT Image segmentation is an important image processing task, and it appears everywhere when we want to analyze what is inside an image. There are a variety of applications of image segmentation, such as filtering noise from images, medical imaging, and

GPU Simulations of Violent Flows with Smooth Particle Hydrodynamics (SPH) Method

Abstract Graphics processing unit (GPU) accelerated supercomputers have proved to be very powerful and energy-efficient for accelerating compute-intensive applications and have become the new standard for high performance computing (HPC) and a critical ingredient in
