Computers & Internet Books:

Run-Time Loop Parallelization with Efficient Dependency Checking on Gpu-Accelerated Platforms



Customer rating

Click to share your rating 0 ratings (0.0/5.0 average) Thanks for your vote!

Share this product

Run-Time Loop Parallelization with Efficient Dependency Checking on Gpu-Accelerated Platforms by Chenggang Zhang
Sorry, this product is not currently available to order


This dissertation, "Run-time Loop Parallelization With Efficient Dependency Checking on GPU-accelerated Platforms" by Chenggang, Zhang, 张呈刚, was obtained from The University of Hong Kong (Pokfulam, Hong Kong) and is being sold pursuant to Creative Commons: Attribution 3.0 Hong Kong License. The content of this dissertation has not been altered in any way. We have altered the formatting in order to facilitate the ease of printing and reading of the dissertation. All rights not granted by the above license are retained by the author. Abstract: General-Purpose computing on Graphics Processing Units (GPGPU) has attracted a lot of attention recently. Exciting results have been reported in using GPUs to accelerate applications in various domains such as scientific simulations, data mining, bio-informatics and computational finance. However, up to now GPUs can only accelerate data-parallel loops with statically analyzable parallelism. Loops with dynamic parallelism (e.g., with array accesses through subscripted subscripts), an important pattern in many general-purpose applications, cannot be parallelized on GPUs using existing technologies. Run-time loop parallelization using Thread Level Speculation (TLS) has been proposed in the literatures to parallelize loops with statically un-analyzable dependencies. However, most of the existing TLS systems are designed for multiprocessor/multi-core CPUs. GPUs have fundamental differences with CPUs in both hardware architecture and execution model, making the previous TLS designs not work or inefficient when ported to GPUs. This thesis presents GPUTLS, a runtime system designed to support speculative loop parallelization on GPUs. The design of GPU-TLS addresses several key problems encountered when adapting TLS to GPUs: (1) To reduce the possibility of mis-speculation, deferred-update memory versioning scheme is adopted to avoid mis-speculations caused by inter-iteration WAR and WAW dependencies. A technique named intra-warp value forwarding is proposed to respect some inter-iteration RAW dependencies, which further reduces the mis-speculation possibility. (2) An incremental speculative execution scheme is designed to exploit partial parallelism within loops. This avoids excessive re-executions and reduces the mis-speculation penalty. (3) The dependency checking among thousands of speculative GPU threads poses large overhead and can easily become the performance bottleneck. To lower the overhead, we design several e_cient dependency checking schemes named PRW+BDC, SW, SR, SRW+EDC, and SRW+LDC respectively. (4) We devise a novel parallel commit scheme to avoid the overhead incurred by the serial commit phase in most existing TLS designs. We have carried out extensive experiments on two platforms with different NVIDIA GPUs, using both a synthetic loop that can simulate loops with different characteristics and several loops from real-life applications. Testing results show that the proposed intra-warp value forwarding and eager dependency checking techniques can improve the performance for almost all kinds of loop patterns. We observe that compared with other dependency checking schemes, SR and SW can achieve better performance in most cases. It is also shown that the proposed parallel commit scheme is especially useful for loops with large write set size and small number of inter-iteration WAW dependencies. Overall, GPU-TLS can achieve speedups ranging from 5 to 105 for loops with dynamic parallelism. DOI: 10.5353/th_b4716765 Subjects: Graphics processing unitsParallel processing (Electronic computers)Threads (Computer programs)
Release date NZ
January 26th, 2017
Created by
colour illustrations
Country of Publication
United States
Open Dissertation Press
Product ID

Customer reviews

Nobody has reviewed this product yet. You could be the first!

Write a Review

Marketplace listings

There are no Marketplace listings available for this product currently.
Already own it? Create a free listing and pay just 9% commission when it sells!

Sell Yours Here

Help & options

  • If you think we've made a mistake or omitted details, please send us your feedback. Send Feedback
  • If you have a question or problem with this product, visit our Help section. Get Help
Filed under...

Buy this and earn 691 Banana Points