Runtime performance enhancement of nondeterministic applications on partailly [i.e. partially] reconfigurable architectures using task agglomeration
Date of Award
Doctor of Philosophy (PhD)
B. Earl Wells
Adaptive computing systems., Field programmable gate arrays., Embedded computer systems--Design and construction.
Reconfigurable hardware architectures have the potential to dramatically improve run-time performance by altering their low-level hardware to match the inherent structure of a given application. Partially-reconfigurable architectures extend this capability by allowing their reconfigurable resources to be partitioned into multiple subregions which are independently configurable. This permits the run-time reconfiguration of a single hardware partition while other partitions continue their operation unabated. The major limiting factor of these architectures is the relatively long time required to perform this fine-grained hardware modification. Previous research on the Single-Chip Multiprocessor and Dynamically Reconfigurable Hybrid Architecture have developed an application framework and a partially reconfigurable hardware architecture that can be applied to both deterministic and bounded nondeterministic applications. In this work, both models are extended to account for the limited resources available within each hardware partition. This resource constraint model was developed consistent with implementation in modern commercially available programmable logic devices. To balance widely varying task resource requirements, two new compile-time task agglomeration methods have been developed. These methods are designed to identify tasks and combine them into common partial bitstreams that do not exceed the resource limitations of the reconfigurable partition. By more efficiently using the partition resources the total number of required reconfigurations can be reduced. Both methodologies have as their core element a static scheduler that is used to estimate the resultant execution time of a task grouping. The first method applies this static scheduler once and then performs greedy local optimization to minimize the number of such groupings. The second method utilizes simulated annealing to actively optimize the estimated execution time. The effectiveness of both of these techniques was evaluated using the same synthetically generated task systems and discrete event simulator from previous research. In 86% of the cases, the Simulated Annealing algorithm produced significantly shorter run times while the greedy method produced significantly shorter run times in 70% of the cases when compared to the case where task agglomeration is not applied.
Austin, David, "Runtime performance enhancement of nondeterministic applications on partailly [i.e. partially] reconfigurable architectures using task agglomeration" (2016). Dissertations. 94.