Publications
- Demystifying the Fight Against Complexity: A Comprehensive Study of Live Debugging Activities in Production Cloud Systems. [pdf][data/code][slides]. Panchapakesan Chitra Sruthi, Zinan Guo, Deming Chu, Zhengyan Chen, and Yongle Zhang. In Proceedings of the 15th ACM Symposium on Cloud Computing (SoCC’24).
- Vicious Cycles in Distributed Software Systems. [pdf][data/code]. Shangshu Qian, Wen Fan, Lin Tan, and Yongle Zhang. In Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering (ASE’23).
- Fail through the Cracks: Cross-System Interaction Failures in Modern Cloud Systems [pdf][data/code]. Lilia Tang, Chaitanya Bhandari, Yongle Zhang, Anna Karanika, Shuyang Ji, Indranil Gupta, and Tianyin Xu. In Proceedings of the 18th European Conference on Computer Systems (EuroSys’23).
- Efficiently Detecting Concurrency Bugs in Persistent Memory Programs [pdf][code]. Zhangyu Chen, Yu Hua, Yongle Zhang, Luochangqi Ding. The 2022 Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’22).
- Understanding and Detecting Software Upgrade Failures in Distributed Systems [pdf][code]. Yongle Zhang, Junwen Yang, Zhuqi Jin, Utsav Sethi, Shan Lu, Ding Yuan. The 28th ACM Symposium on Operating Systems Principles (SOSP’21), October 2021.
- Automating failure diagnosis for distributed systems [pdf]. Yongle Zhang, March 2021.
- The Inflection Point Hypothesis: A Principled Debugging Approach for Locating the Root Cause of a Failure [pdf]. Yongle Zhang, Kirk Rodrigues, Yu Luo, Michael Stumm, Ding Yuan. The 27th ACM Symposium on Operating Systems Principles (SOSP’19), Oct 2019.
- Pensieve: Non-Intrusive Failure Reproduction for Distributed Systems using the Event Chaining Approach [pdf]. Yongle Zhang, Serguei Makarov, Xiang Ren, David Lion, Ding Yuan. The 26th ACM Symposium on Operating Systems Principles (SOSP’17), Oct 2017.
- lprof: A Non-intrusive Request Flow Profiler for Distributed Systems [pdf]. Xu Zhao*, Yongle Zhang*, David Lion, Muhammad Faizan Ullah, Yu Luo, Ding Yuan, and Michael Stumm. Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI’14).
- Simple Testing Can Prevent Most Critical Failures: An Analysis of Production Failures in Distributed Data-intensive Systems [pdf]. Ding Yuan, Yu Luo, Xin Zhuang, Guilherme Rodrigues, Xu Zhao, Yongle Zhang, Pranay U. Jain, and Michael Stumm. Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI’14), Oct 2014.
- SAC: Exploiting Stable Set Model to Enhance CacheFiles [pdf]. Jian-Liang Liu, Yongle Zhang, Lin Yang, Mingyang Guo, Zhenjun Liu, Lu Xu. Journal of Computer Science and Technology, 29(2): 293-302 (2014).
- Stable Set Model Based Methods for Large-capacity Client Cache Management [pdf]. Mingyang Guo, Liu Liu, Yongle Zhang, Zhenjun Liu, Lu Xu. Proceedings of the 14th IEEE International Conference on High Performance Computing and Communications (HPCC 2012), June 2012.
Patents
- M. Faizan Ullah, D. Lion, Y. Luo, M. Stumm, D. Yuan, X. Zhao, Y. Zhang. Systems and processes for computer log analysis. US Patent 9,729,671, 2017.