Biography

Yongle Zhang is a tenure-track assistant professor in the Computer Science Department at Purdue University. He received his Ph.D. from the University of Toronto, where he worked with Dr. Ding Yuan. After receiving his Bachelor’s degree from Shandong University, he was recommended for admission to the Master’s program at the Institute of Computing Technologies.

His Chinese given name is 永乐 (Yong Le), pronounced 永 /juŋ/ 乐 /lə/.

I am looking for highly motivated students at all levels (research intern, Master’s, Ph.D.) to work with me on software systems. If you are interested, please contact me with your resume and transcripts.

Research

My research interest is in systems software, with a focus on improving the reliability and availability of complex, real-world systems. In particular, we are currently working on failure detection and diagnosis in production cloud systems, as well as the design and implementation of diagnosable software systems.

News

  • [Mar. 2023] Shangshu Qian will intern at Microsoft Research Redmond in summer 2023!
  • [Jan. 2023] Our paper on root cause diagnosis was invited for publication in USENIX ;login:.
  • [Nov. 2022] I was fortunate to receive the SIGOPS Dennis M. Ritchie Thesis Award. Many thanks to my advisor, Ding Yuan, and all my collaborators!
  • [Aug. 2022] We received a Meta 2022 Systems Research Award! Many thanks to Meta and Meta Research.
  • [Dec. 2021] We received an NSF Core grant! Many thanks to NSF and Pedro Fonseca.
  • [Dec. 2021] Min-Ju Li will intern at Cloudera in summer 2022!
  • [Nov. 2021] Our paper about concurrency bugs in persistent memory applications was accepted to ASPLOS 2022!
  • [Oct. 2021] Our paper about upgrade failures in distributed systems appeared at SOSP 2021!
  • [Jan. 2021] I will join Purdue CS as a tenure-track assistant professor.

Recent Publications

  • Demystifying the Fight Against Complexity: A Comprehensive Study of Live Debugging Activities in Production Cloud Systems. [pdf][data/code]. Panchapakesan Chitra Sruthi, Zinan Guo, Deming Chu, Zhengyan Chen, and Yongle Zhang. In Proceedings of the 15th ACM Symposium on Cloud Computing (SoCC’24).
  • Vicious Cycles in Distributed Software Systems. [pdf][data/code]. Shangshu Qian, Wen Fan, Lin Tan, and Yongle Zhang. In Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering (ASE’23).
  • Fail through the Cracks: Cross-System Interaction Failures in Modern Cloud Systems [pdf][data/code]. Lilia Tang, Chaitanya Bhandari, Yongle Zhang, Anna Karanika, Shuyang Ji, Indranil Gupta, and Tianyin Xu. In Proceedings of the 18th European Conference on Computer Systems (EuroSys’23).
  • Efficiently Detecting Concurrency Bugs in Persistent Memory Programs [pdf][code]. Zhangyu Chen, Yu Hua, Yongle Zhang, Luochangqi Ding. The 2022 Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’22).
  • Understanding and Detecting Software Upgrade Failures in Distributed Systems [pdf][code]. Yongle Zhang, Junwen Yang, Zhuqi Jin, Utsav Sethi, Shan Lu, Ding Yuan. The 28th ACM Symposium on Operating Systems Principles (SOSP’21), October 2021.
  • The Inflection Point Hypothesis: A Principled Debugging Approach for Locating the Root Cause of a Failure [pdf]. Yongle Zhang, Kirk Rodrigues, Yu Luo, Michael Stumm, Ding Yuan. The 27th ACM Symposium on Operating Systems Principles (SOSP’19), Oct 2019.
  • Pensieve: Non-Intrusive Failure Reproduction for Distributed Systems using the Event Chaining Approach [pdf]. Yongle Zhang, Serguei Makarov, Xiang Ren, David Lion, Ding Yuan. The 26th ACM Symposium on Operating Systems Principles (SOSP’17), Oct 2017.
  • lprof: A Non-intrusive Request Flow Profiler for Distributed Systems [pdf]. Xu Zhao*, Yongle Zhang*, David Lion, Muhammad FaizanUllah, Yu Luo, Ding Yuan, and Michael Stumm. Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI’14).
  • Simple Testing Can Prevent Most Critical Failures: An Analysis of Production Failures in Distributed Data-intensive Systems [pdf]. Ding Yuan, Yu Luo, Xin Zhuang, Guilherme Rodrigues, Xu Zhao, Yongle Zhang, Pranay U. Jain, and Michael Stumm. Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI’14), Oct 2014.

Full publication list…