Runtime predictive analysis
Runtime predictive analysis (or predictive analysis) is a runtime verification technique in computer science for detecting property violations in program executions inferred from an observed execution. An important class of predictive analysis methods has been developed for detecting concurrency errors (such as data races) in concurrent programs, where a runtime monitor is used to predict errors which did not happen in the observed run, but can happen in an alternative execution of the same program. The predictive capability comes from the fact that the analysis is performed on an abstract model extracted online from the observed execution, which admits a class of executions beyond the observed one.[1]
Overview
[edit]Informally, given an execution , predictive analysis checks errors in a reordered trace of . is called feasible from (alternatively a correct reordering[2] of ) if any program that can generate can also generate .
In the context of concurrent programs, a predictive technique is sound if it only predicts concurrency errors in feasible executions of the causal model of the observed trace. Assuming the analysis has no knowledge about the source code of the program, the analysis is complete (also called maximal[3][4]) if the inferred class of executions contains all executions that have the same program order and communication order prefix of the observed trace.
Applications
[edit]Predictive analysis has been applied to detect a wide class of concurrency errors, including:
- Data races
- Deadlocks[5][6][7]
- Atomicity violations[8]
- Order violations, e.g., use-after-free errors[9]
Implementation
[edit]As is typical with dynamic program analysis, predictive analysis first instruments the source program. At runtime, the analysis can be performed online, in order to detect errors on the fly. Alternatively, the instrumentation can simply dump the execution trace for offline analysis. The latter approach is preferred for expensive refined predictive analyses that require random access to the execution trace or take more than linear time.
Incorporating data and control-flow analysis
[edit]Static analysis can be first conducted to gather data and control-flow dependence information about the source program, which can help construct the causal model during online executions. This allows predictive analysis to infer a larger class of executions based on the observed execution. Intuitively, a feasible reordering can change the last writer of a memory read (data dependence) if the read, in turn, cannot affect whether any accesses execute (control dependence).[10][11]
Approaches
[edit]Partial order based techniques
[edit]Partial order based techniques are most often employed for online race detection. At runtime, a partial order over the events in the trace is constructed, and any unordered pairs of critical events are reported as races. Many predictive techniques for race detection are based on the happens-before relation or a weakened version of it. Such techniques can typically be implemented efficiently with vector clock algorithms, allowing only one pass of the whole input trace as it is being generated, and are thus suitable for online deployment. [2] [12] [13]
SMT-based techniques
[edit]SMT encodings allow the analysis to extract a refined causal model from an execution trace, as a (possibly very large) mathematical formula. Furthermore, control-flow information can be incorporated into the model. SMT-based techniques can achieve soundness and completeness (also called maximal causality[4] [3]), but has exponential-time complexity with respect to the trace size. In practice, the analysis is typically deployed to bounded segments of an execution trace, thus trading completeness for scalability. [10] [14] [15] [16]
Lockset-based approaches
[edit]In the context of data race detection for programs using lock based synchronization, lockset-based[17] techniques provide an unsound, yet lightweight mechanism for detecting data races. These techniques primarily detect violations of the lockset principle. which says that all accesses of a given memory location must be protected by a common lock. Such techniques are also used to filter out candidate race reports in more expensive analyses.[4]
Graph-based techniques
[edit]In the context of data race detection, sound polynomial-time predictive analyses have been developed, with good, close to maximal predictive capability based on a graphs. [18]
Computational Complexity
[edit]Given an input trace of size executed by threads, general race prediction is NP-complete and even W[1]-hard parameterized by , but admits a polynomial-time algorithm when the communication topology is acyclic.[19] Happens-before races are detected in time, and this bound is optimal.[20] Lockset races over variables are detected in time, and this bound is also optimal.[20]
Tools
[edit]Here is a partial list of tools that use predictive analyses to detect concurrency errors, sorted alphabetically.
- "Rapid". GitHub.: a lightweight framework for implementing dynamic race detection engines.
- "RoadRunner". GitHub.: a dynamic analysis framework designed to facilitate rapid prototyping and experimentation with dynamic analyses for concurrent Java programs.
- "RV-Predict".: SMT-based predictive race detection.
- "UFO". GitHub.: SMT-based predictive use-after-free detection.
See also
[edit]References
[edit]- ^ "Runtime Predictive Analysis". November 10, 2008.
- ^ a b Smaragdakis, Yannis; Evans, Jacob; Sadowski, Caitlin; Yi, Jaeheon; Flanagan, Cormac (2012). "Sound predictive race detection in polynomial time". ACM SIGPLAN Notices. 47 (1): 387. doi:10.1145/2103621.2103702. ISSN 0362-1340.
- ^ a b Şerbănuţă, Traian Florin; Chen, Feng; Roşu, Grigore (2013). "Maximal Causal Models for Sequentially Consistent Systems". Runtime Verification. Lecture Notes in Computer Science. Vol. 7687. pp. 136–150. doi:10.1007/978-3-642-35632-2_16. hdl:2142/27708. ISBN 978-3-642-35631-5. ISSN 0302-9743.
- ^ a b c Huang, Jeff (2015). "Stateless model checking concurrent programs with maximal causality reduction". Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation. pp. 165–174. doi:10.1145/2737924.2737975. ISBN 9781450334686. S2CID 15196006.
- ^ Kalhauge, Christian Gram; Palsberg, Jens (2018). "Sound deadlock prediction". Proceedings of the ACM on Programming Languages. 2 (OOPSLA): 1–29. doi:10.1145/3276516. ISSN 2475-1421.
- ^ "Sound Dynamic Deadlock Prediction in Linear Time" (PDF).
- ^ Tunç, Hünkar Can; Mathur, Umang; Pavlogiannis, Andreas; Viswanathan, Mahesh (2023), "Sound Dynamic Deadlock Prediction in Linear Time", Proceedings of the ACM on Programming Languages, 7: 1733–1758, arXiv:2304.03692, doi:10.1145/3591291, retrieved 29 September 2023
- ^ "Atomicity Checking in Linear Time using Vector Clocks" (PDF).
- ^ Huang, Jeff (2018). "UFO". Proceedings of the 40th International Conference on Software Engineering. pp. 609–619. doi:10.1145/3180155.3180225. ISBN 9781450356381. S2CID 3730958.
- ^ a b Huang, Jeff; Meredith, Patrick O'Neil; Rosu, Grigore (2013). "Maximal sound predictive race detection with control flow abstraction". Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation. pp. 337–348. doi:10.1145/2594291.2594315. ISBN 9781450327848. S2CID 8883624.
- ^ Genç, Kaan; Roemer, Jake; Xu, Yufan; Bond, Michael D. (2019). "Dependence-aware, unbounded sound predictive race detection". Proceedings of the ACM on Programming Languages. 3 (OOPSLA): 1–30. arXiv:1904.13088. doi:10.1145/3360605. ISSN 2475-1421.
- ^ Kini, Dileep; Mathur, Umang; Viswanathan, Mahesh (2017). "Dynamic race prediction in linear time". Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation. pp. 157–170. arXiv:1704.02432. doi:10.1145/3062341.3062374. ISBN 9781450349888. S2CID 6855894.
- ^ Roemer, Jake; Genç, Kaan; Bond, Michael D. (2018). "High-coverage, unbounded sound predictive race detection". Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation. pp. 374–389. doi:10.1145/3192366.3192385. ISBN 9781450356985. S2CID 47018225.
- ^ Liu, Peng; Tripp, Omer; Zhang, Xiangyu (2016). "IPA: improving predictive analysis with pointer analysis". Proceedings of the 25th International Symposium on Software Testing and Analysis. pp. 59–69. doi:10.1145/2931037.2931046. ISBN 9781450343909.
- ^ Wang, Chao; Kundu, Sudipta; Ganai, Malay; Gupta, Aarti (2009). "Symbolic Predictive Analysis for Concurrent Programs". FM 2009: Formal Methods. Lecture Notes in Computer Science. Vol. 5850. pp. 256–272. doi:10.1007/978-3-642-05089-3_17. ISBN 978-3-642-05088-6. ISSN 0302-9743.
- ^ Said, Mahmoud; Wang, Chao; Yang, Zijiang; Sakallah, Karem (2011). "Generating Data Race Witnesses by an SMT-Based Analysis". NASA Formal Methods. Lecture Notes in Computer Science. Vol. 6617. pp. 313–327. doi:10.1007/978-3-642-20398-5_23. ISBN 978-3-642-20397-8. ISSN 0302-9743.
- ^ Savage, Stefan; Burrows, Michael; Nelson, Greg; Sobalvarro, Patrick; Anderson, Thomas (1997). "Eraser". ACM Transactions on Computer Systems. 15 (4). Association for Computing Machinery (ACM): 391–411. doi:10.1145/265924.265927. ISSN 0734-2071. S2CID 1492924.
- ^ Pavlogiannis, Andreas (2020). "Fast, sound, and effectively complete dynamic race prediction". Proceedings of the ACM on Programming Languages. 4 (POPL): 1–29. arXiv:1901.08857. doi:10.1145/3371085. ISSN 2475-1421.
- ^ Mathur, Umang; Pavlogiannis, Andreas; Viswanathan, Mahesh (2020). "The Complexity of Dynamic Data Race Prediction". Proceedings of the 35th Annual ACM/IEEE Symposium on Logic in Computer Science. Association for Computing Machinery. pp. 713–727. arXiv:2004.14931. doi:10.1145/3373718.3394783. ISBN 9781450371049. S2CID 210171694.
- ^ a b Kulkarni, Rucha; Mathur, Umang; Pavlogiannis, Andreas (2021). "Dynamic Data-Race Detection Through the Fine-Grained Lens". 32nd International Conference on Concurrency Theory (CONCUR 2021). Leibniz International Proceedings in Informatics (LIPIcs). 203. Schloss Dagstuhl -- Leibniz-Zentrum fur Informatik: 16:1–16:23. doi:10.4230/LIPIcs.CONCUR.2021.16. ISBN 9783959772037. S2CID 235765773.