Study of the impact of non-determinism on numerical reproducibility and debugging at the exascale

Author(s): Chapp, Dylan
Date Accessioned: 2019-10-18T14:31:12Z
Date Available: 2019-10-18T14:31:12Z
Publication Date: 2017
SWORD Update: 2017-09-05T16:30:54Z
Abstract: Non-determinism in high performance scientific applications has severe detrimental impacts on numerical reproducibility and accuracy, as well as on debugging. As scientific simulations are migrated to extreme-scale platforms, the increase in platform concurrency and the attendant increase in non-determinism are likely to exacerbate both of these problems. In this thesis, we address the dual challenges of non-determinism's impact on numerical reproducibility and on debugging.

To address the numerical challenge, our work investigates the power of mathematical methods to mitigate error propagation at the exascale. We focus on floating-point error accumulation over global summations where enforcing any reduction order is expensive or impossible. We model parallel summations with reduction trees and identify those parameters that can be used to estimate the reduction's sensitivity to variability in the reduction tree. We assess the impact of these parameters on the ability of different reduction methods to successfully mitigate errors. Our results illustrate the pressing need for intelligent runtime selection of reduction operators that ensure a given degree of reproducible accuracy.

To address the debugging challenge, our work examines the impact of logical clock ticking policies on the Clock-Delta Compression record-and-replay technique. We assess three logical clock ticking policies in terms of the number of out-of-order messages that result during recording executions under these policies. We examine the performance of Clock-Delta Compression when using the three ticking policies in four distinct application scenarios to probe the impact of floating-point workload and communication intensity on recording performance. Our results illustrate the pressing need for fine-grained logical clock ticking policies that reduce the out-of-order message rate of the Clock-Delta Compression record-and-replay technique.
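
The sensitivity of a floating-point summation to the shape of its reduction tree, as described in the abstract, can be illustrated with a minimal sketch. This is not the thesis's code or method. The function names `sequential_sum` and `balanced_sum` and the four input values are assumptions chosen for illustration, and `math.fsum` from the Python standard library serves only as a correctly rounded reference.

```python
import math
from functools import reduce


def sequential_sum(values):
    """Left-deep reduction tree: ((v0 + v1) + v2) + ... (hypothetical helper)."""
    return reduce(lambda acc, v: acc + v, values, 0.0)


def balanced_sum(values):
    """Balanced binary reduction tree: sum each half, then combine (hypothetical helper)."""
    if not values:
        return 0.0
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return balanced_sum(values[:mid]) + balanced_sum(values[mid:])


# Illustrative operands (an assumption, not data from the thesis).
# Their exact mathematical sum is 2.
values = [1e16, 1.0, -1e16, 1.0]
# The same operands in a different arrival order.
reordered = [1e16, -1e16, 1.0, 1.0]

results = {
    "sequential": sequential_sum(values),
    "balanced": balanced_sum(values),
    "sequential, reordered": sequential_sum(reordered),
    "reference (math.fsum)": math.fsum(values),
}

for name, value in results.items():
    print(f"{name}: {value}")
```

With IEEE 754 double precision, the two tree shapes and the reordered input give three different answers for the same set of operands, and only the reordered input agrees with the reference. On a parallel machine the tree shape and operand order can vary from run to run, which is the source of the irreproducibility the abstract refers to.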
Advisor: Taufer, Michela
Degree: M.S.
Department: University of Delaware, Department of Computer and Information Sciences
Unique Identifier: 1124074949
URL: http://udspace.udel.edu/handle/19716/24492
Language: en
Publisher: University of Delaware
URI: https://search.proquest.com/docview/1957944576?accountid=10457
Title: Study of the impact of non-determinism on numerical reproducibility and debugging at the exascale
Type: Thesis
Files
Original bundle
Name:
Chapp_udel_0060M_12864.pdf
Size:
3.2 MB
Format:
Adobe Portable Document Format