College of Computing and Digital Media Dissertations

Date of Award

Winter 3-28-2025

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

School

School of Computing

First Advisor

Tanu Malik, PhD

Second Advisor

Ashish Gehani, PhD

Third Advisor

Alexander Rasin, PhD

Fourth Advisor

Dong Jae Kim, PhD

Fifth Advisor

Iyad Kanj, PhD

Abstract

Debugging and understanding system behavior pose technical challenges, often necessitating the comparison of two audited execution traces. Although provenance systems audit execution traces, the audited traces at most enable causal analysis within a single known execution. As a result, using provenance systems for differential analysis, and thus for debugging and reasoning, is a challenging task. This thesis addresses the challenge of using provenance in debugging by developing accurate and scalable methods for differential analysis of system provenance. Our approach emphasizes the importance of knowing the application’s provenance graph structure and embedding this structural information within traces to conduct a precise differential analysis of system provenance. We develop algorithms that precisely report all differences between two execution traces generated from the same application’s provenance graph structure. Our four research questions target both accuracy and scalability: (1) finding the exact locations in the source code where our tool logged traces, (2) recovering the loop iteration context in our traces, (3) reducing the tracing overhead, and (4) making our tool robust for distributed applications. Research questions 1 and 2 mainly address accuracy, and research questions 3 and 4 address scalability. In each chapter, we show theoretically and empirically why our tool yields a more accurate differential analysis while remaining feasible and scalable. Our framework shows that current provenance systems must audit at a finer granularity to report differential analysis results accurately. We show that the resulting overheads can be offset by statistically analyzing the application’s provenance graph structure. Finally, we outline the challenges of performing differential analysis on real distributed execution traces.

Available for download on Friday, April 24, 2026
