OSDI 2021 Accepted Papers


Program Co-Chairs: Angela Demke Brown, University of Toronto, and Jay Lorch, Microsoft Research. Lifting predicates and crash framing make the specification easy to use for developers, and logically atomic crash specifications allow for modular reasoning in GoJournal, making the proof tractable despite complex concurrency and crash interleavings. Attaching supplementary material is optional; if your paper says that you have source code or formal proofs, you need not attach them to convince the PC of their existence. However, a plethora of recent data breaches shows that even widely trusted service providers can be compromised. In contrast, CLP achieves a significantly higher compression ratio than all commonly used compressors, yet delivers fast search performance that is comparable to or even better than Elasticsearch and Splunk Enterprise. OSDI brings together professionals from academic and industrial backgrounds in a premier forum for discussing the design, implementation, and implications of systems software. We focus on NVMe storage devices and show that it is natural to express these semantics in the kernel and the application and only requires a modest two-bit change to the device interface. Federated Learning (FL) is an emerging direction in distributed machine learning (ML) that enables in-situ model training and testing on edge data. The OSDI Symposium emphasizes innovative research as well as quantified or insightful experiences in systems design and implementation. The conference papers and full proceedings are available to registered attendees now and will be available to everyone beginning Wednesday, July 14, 2021. For realistic workloads, KEVIN improves throughput by 68% on average. Only two types of supplementary material are permitted: source code described in the paper and formal proofs sketched in the paper.
Pollux promotes fairness among DL jobs competing for resources based on a more meaningful measure of useful job progress, and reveals a new opportunity for reducing DL cost in cloud environments. Based on this observation, P3 proposes a new approach for distributed GNN training. Session Chairs: Gennady Pekhimenko, University of Toronto / Vector Institute, and Shivaram Venkataraman, University of Wisconsin–Madison. Aurick Qiao, Petuum, Inc. and Carnegie Mellon University; Sang Keun Choe and Suhas Jayaram Subramanya, Carnegie Mellon University; Willie Neiswanger, Petuum, Inc. and Carnegie Mellon University; Qirong Ho, Petuum, Inc.; Hao Zhang, Petuum, Inc. and UC Berkeley; Gregory R. Ganger, Carnegie Mellon University; Eric P. Xing, MBZUAI, Petuum, Inc., and Carnegie Mellon University. Authors are required to register abstracts by 3:00 p.m. PST on December 3, 2020, and to submit full papers by 3:00 p.m. PST on December 10, 2020. A scientific paper consists of a constellation of artifacts that extend beyond the document itself: software, hardware, evaluation data and documentation, raw survey results, mechanized proofs, models, test suites, benchmarks, and so on. Welcome to the 2021 USENIX Annual Technical Conference (ATC '21) submissions site! Researchers from the Software Systems Laboratory bagged a Best Paper Award at the 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2021). As a result, data characteristics and device capabilities vary widely across clients. If you have any questions about conflicts, please contact the program co-chairs. Pollux simultaneously considers both aspects.
Sijie Shen, Rong Chen, Haibo Chen, and Binyu Zang, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai Artificial Intelligence Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China. Amy Tai, VMware Research; Igor Smolyar, Technion Israel Institute of Technology; Michael Wei, VMware Research; Dan Tsafrir, Technion Israel Institute of Technology and VMware Research. Accepted papers will be allowed 14 pages in the proceedings, plus references. Welcome to the 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI '22) submissions site. How can we design systems that will be reliable despite misbehaving participants? NrOS is primarily constructed as a simple, sequential kernel with no concurrency, making it easier to develop and reason about its correctness. Yet, existing efforts randomly select FL participants, which leads to poor model and system efficiency. Fan Lai, Xiangfeng Zhu, Harsha V. Madhyastha, and Mosharaf Chowdhury, University of Michigan. Prior or concurrent workshop publication does not preclude publishing a related paper in OSDI. OSDI will provide an opportunity for authors to respond to reviews prior to final consideration of the papers at the program committee meeting. We build Polyjuice based on our learning framework and evaluate it against several existing algorithms. We particularly encourage contributions containing highly original ideas, new approaches, and/or groundbreaking results. This motivates the need for a new approach to data privacy that can provide strong assurance and control to users. They collectively make the backup fresh, columnar, and fault-tolerant, even facing millions of concurrent transactions per second. Jaehyun Hwang and Midhul Vuppalapati, Cornell University; Simon Peter, UT Austin; Rachit Agarwal, Cornell University. 
We first introduce two new hardware primitives: 1) Guarded Page Table (GPT), which protects page table pages to support page-level secure memory isolation; 2) Mountable Merkle Tree (MMT), which supports scalable integrity protection for secure memory. The 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI '21) will take place as a virtual event on July 14–16, 2021. The key insight in blk-switch is that Linux's multi-queue storage design, along with multi-queue network and storage hardware, makes the storage stack conceptually similar to a network switch. Therefore, developers typically find data locality issues via dynamic profiling and repair them manually. Petuum Awarded OSDI 2021 Best Paper for Goodput-Optimized Deep Learning. Welcome to the 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI '21) submissions site. The biennial ACM Symposium on Operating Systems Principles is the world's premier forum for researchers, developers, programmers, and teachers of computer systems technology. We present TEMERAIRE, a hugepage-aware enhancement of TCMALLOC to reduce CPU overheads in the application's code. The device then "calibrates" its interrupts to completions of latency-sensitive requests. In this paper, we propose a software-hardware co-design to support dynamic, fine-grained, large-scale secure memory as well as fast initialization. Currently, for large graphs, CPU servers offer the best performance-per-dollar over GPU servers. We present case studies and end-to-end applications that show how Storm lets developers specify diverse policies while centralizing the trusted code to under 1% of the application, and statically enforces security with modest type annotation overhead and no run-time cost. DistAI: Data-Driven Automated Invariant Learning for Distributed Protocols. Jianan Yao, Runzhou Tao, Ronghui Gu, Jason Nieh.
We describe Fluffy, a multi-transaction differential fuzzer for finding consensus bugs in Ethereum. She is the author of the textbook Interconnections (about network layers 2 and 3) and coauthor of Network Security. You must not improperly identify a PC member as a conflict if none of these three circumstances applies, even if for some other reason you want to avoid them reviewing your paper. Here, we focus on hugepage coverage. Although SSDs can be simplified under the current ZNS interface, its counterpart LFS must bear segment compaction overhead. The chairs may reject abstracts or papers on the basis of egregious missing or extraneous conflicts. Third, GNNAdvisor capitalizes on the GPU memory hierarchy for acceleration by gracefully coordinating the execution of GNNs according to the characteristics of the GPU memory structure and GNN workloads. We compare Marius against two state-of-the-art industrial systems on a diverse array of benchmarks. His work has included the Barrelfish multikernel research OS, as well as work on distributed stream processors, and using formal specifications to describe the hardware/software interfaces of modern computer systems. We observe that scalability challenges in training GNNs are fundamentally different from those in training classical deep neural networks and distributed graph processing, and that commonly used techniques, such as intelligent partitioning of the graph, do not yield desired results. AI enables principled representation of knowledge, complex strategy optimization, learning from data, and support to human decision making. The co-chairs may then share that paper with the workshop's organizers and discuss it with them. Each new model trained with DP increases the bound on data leakage and can be seen as consuming part of a global privacy budget that should not be exceeded.
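The idea of a global privacy budget that each DP-trained model consumes can be sketched in a few lines. This is a minimal illustration only, assuming basic sequential composition (the epsilons of successive training runs simply add up); the class `PrivacyBudget` and its methods are hypothetical names chosen for this example and are not taken from any system described here.

```python
from dataclasses import dataclass, field


@dataclass
class PrivacyBudget:
    """Tracks a global epsilon budget (hypothetical helper).

    Assumes basic sequential composition: training runs with
    epsilon_1, ..., epsilon_k cost epsilon_1 + ... + epsilon_k in total.
    """
    total_epsilon: float
    spent: float = 0.0
    log: list = field(default_factory=list)

    def remaining(self) -> float:
        return self.total_epsilon - self.spent

    def try_consume(self, job: str, epsilon: float) -> bool:
        """Charge `epsilon` to the budget if it fits; refuse the job otherwise."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining() + 1e-12:  # small tolerance for float error
            return False
        self.spent += epsilon
        self.log.append((job, epsilon))
        return True


budget = PrivacyBudget(total_epsilon=1.0)
accepted = [budget.try_consume(name, eps)
            for name, eps in [("model-a", 0.4), ("model-b", 0.5), ("model-c", 0.2)]]
```

In this run the third job is refused because only about 0.1 of the budget is left. The budget never refills, which is why the order in which jobs are admitted matters.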
OSDI'21 accepted 31 papers and 26 papers participated in the AE, a significant increase in the participation ratio: 84%, compared to OSDI'20 (70%) and SOSP'19 (61%). With the help of thousands of Lambda threads, Dorylus scales GNN training to billion-edge graphs. blk-switch evaluation over a variety of scenarios shows that it consistently achieves µs-scale average and tail latency (at both 99th and 99.9th percentiles), while allowing applications to near-perfectly utilize the hardware capacity. Kirk Rodrigues, Yu Luo, and Ding Yuan, University of Toronto and YScope Inc. Penglai also reduces the latency of secure memory initialization by three orders of magnitude and gains 3.6x speedup for real-world applications (e.g., MapReduce). We discuss the design and implementation of TEMERAIRE including strategies for hugepage-aware memory layouts to maximize hugepage coverage and to minimize fragmentation overheads. Finding the inductive invariant of the distributed protocol is a critical step in verifying the correctness of distributed systems, but takes a long time to do even for simple protocols. Manuela M. Veloso is the Head of J.P. Morgan AI Research, which pursues fundamental research in areas of core relevance to financial services, including data mining and cryptography, machine learning, explainability, and human-AI interaction. Consensus bugs are extremely rare but can be exploited for network split and theft, which cause reliability and security-critical issues in the Ethereum ecosystem. Commonly used log archival and compression tools like Gzip provide a high compression ratio, yet searching archived logs is a slow and painful process as it first requires decompressing the logs. We present the results of a 1% experiment at fleet scale as well as the longitudinal rollout in Google's warehouse scale computers.
USENIX Security '21 has three submission deadlines. Memory allocation represents significant compute cost at the warehouse scale and its optimization can yield considerable cost savings. Authors must limit their responses to (a) correcting factual errors in the reviews or (b) directly addressing questions posed by reviewers. Further, Vegito can recover from cascading machine failures by using the columnar backup in less than 60 ms. For instance, the following are not sufficient grounds to specify a conflict with a PC member: they have reviewed the work before, they are employed by your competitor, they are your personal friend, they were your post-doc advisor or advisee, or they had the same advisor as you. Weak Links in Authentication Chains: A Large-scale Analysis of Email Sender Spoofing Attacks. Session Chairs: Moshe Gabel, University of Toronto, and Joseph Gonzalez, University of California, Berkeley. John Thorpe, Yifan Qiao, Jonathan Eyolfson, and Shen Teng, UCLA; Guanzhou Hu, UCLA and University of Wisconsin, Madison; Zhihao Jia, CMU; Jinliang Wei, Google Brain; Keval Vora, Simon Fraser; Ravi Netravali, Princeton University; Miryung Kim and Guoqing Harry Xu, UCLA. Our evaluation shows that, compared to existing participant selection mechanisms, Oort improves time-to-accuracy performance by 1.2X-14.1X and final model accuracy by 1.3%-9.8%, while efficiently enforcing developer-specified model testing criteria at the scale of millions of clients. P3 exposes a simple API that captures many different classes of GNN architectures for generality. Reviews will be available for response on Wednesday, March 3, 2021. We present the nanoPU, a new NIC-CPU co-design to accelerate an increasingly pervasive class of datacenter applications: those that utilize many small Remote Procedure Calls (RPCs) with very short (µs-scale) processing times.
We demonstrate the above using design, implementation and evaluation of blk-switch, a new Linux kernel storage stack architecture. Such centralized engines are in a perfect position to censor content and violate users' privacy, undermining some of the key tenets behind decentralization. There is no explicit limit to the response, but authors are strongly encouraged to keep it under 500 words; reviewers are neither required nor expected to read excessively long responses. Authors may submit a response to those reviews until Friday, March 5, 2021. Mothy's current research centers on Enzian, a powerful hybrid CPU/FPGA machine designed for research into systems software. He joined Intel Research at Berkeley in April 2002 as a principal architect of PlanetLab, an open, shared platform for developing and deploying planetary-scale services. As the emerging trend of graph-based deep learning, Graph Neural Networks (GNNs) excel for their capability to generate high-quality node feature vectors (embeddings). Instead, we propose addressing the root cause of the heuristics problem by allowing software to explicitly specify to the device if submitted requests are latency-sensitive. We evaluate PrivateKube and DPF on microbenchmarks and an ML workload on Amazon Reviews data. These scripts often make pages slow to load, partly due to a fundamental inefficiency in how browsers process JavaScript content: browsers make it easy for web developers to reason about page state by serially executing all scripts on any frame in a page, but as a result, fail to leverage the multiple CPU cores that are readily available even on low-end phones. After request completion, an I/O device must decide either to minimize latency by immediately firing an interrupt or to optimize for throughput by delaying the interrupt, anticipating that more requests will complete soon and help amortize the interrupt cost.
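The latency-versus-throughput decision described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the `latency_sensitive` flag, the `InterruptCoalescer` class, and the fixed batch threshold are hypothetical and do not reproduce any real device interface. A real device would also use a timeout so that batched completions are never delayed indefinitely, which is omitted here.

```python
from dataclasses import dataclass


@dataclass
class Completion:
    request_id: int
    latency_sensitive: bool  # hypothetical hint set by software at submission time


class InterruptCoalescer:
    """Toy model: fire at once for flagged requests, otherwise batch."""

    def __init__(self, batch_threshold: int = 4):
        self.batch_threshold = batch_threshold
        self.pending = []  # completed request ids not yet signalled to the host

    def on_completion(self, c: Completion) -> list:
        """Return the ids delivered by an interrupt fired now, or [] if delayed."""
        self.pending.append(c.request_id)
        if c.latency_sensitive or len(self.pending) >= self.batch_threshold:
            delivered, self.pending = self.pending, []
            return delivered
        return []


coalescer = InterruptCoalescer(batch_threshold=3)
events = [
    coalescer.on_completion(Completion(1, False)),  # delayed
    coalescer.on_completion(Completion(2, True)),   # flagged: fires, flushes 1 and 2
    coalescer.on_completion(Completion(3, False)),  # delayed
    coalescer.on_completion(Completion(4, False)),  # delayed
    coalescer.on_completion(Completion(5, False)),  # threshold reached: fires
]
```

With an explicit flag the device does not have to guess from traffic patterns. One flagged completion also flushes every earlier unflagged completion in the same interrupt.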
SanRazor adopts a novel hybrid approach: it captures both dynamic code coverage and static data dependencies of checks, and uses the extracted information to perform a redundant check analysis. To help more profitably utilize sanitizers, we introduce SanRazor, a practical tool aiming to effectively detect and remove redundant sanitizer checks. Foreshadow was chosen as an IEEE Micro Top Pick. One important reason for the high cost is, as we observe in this paper, that many sanitizer checks are redundant: the same safety property is repeatedly checked, leading to unnecessarily wasted computing resources. In this paper, we present P3, a system that focuses on scaling GNN model training to large real-world graphs in a distributed setting. While verifying GoJournal, we found one serious concurrency bug, even though GoJournal has many unit tests. Authors may use this for content that may be of interest to some readers but is peripheral to the main technical contributions of the paper. At a high level, Addra follows a template in which callers and callees deposit and retrieve messages from private mailboxes hosted at an untrusted server. In this paper, we present Vegito, a distributed in-memory HTAP system that embraces freshness and performance with the following three techniques: (1) a lightweight gossip-style scheme to apply logs on backups consistently; (2) a block-based design for multi-version columnar backups; (3) a two-phase concurrent updating mechanism for the tree-based index of backups. Machine learning (ML) models trained on personal data have been shown to leak information about users. blk-switch uses this insight to adapt techniques from the computer networking literature (e.g., multiple egress queues, prioritized processing of individual requests, load balancing, and switch scheduling) to the Linux kernel storage stack.
Leveraging this information, Pollux dynamically (re-)assigns resources to improve cluster-wide goodput, while respecting fairness and continually optimizing each DL job to better utilize those resources. The symposium emphasizes innovative research as well as quantified or insightful experiences in systems design and implementation. Paper abstracts and proceedings front matter are available to everyone now. DistAI generates data by simulating the distributed protocol at different instance sizes and recording states as samples. Using selective profiling, we build DMon, a system that can automatically locate data locality problems in production, identify access patterns that hurt locality, and repair such patterns using targeted optimizations. If in doubt about whether your submission to OSDI 2021 and your upcoming submission to SOSP are the same paper or not, please contact the PC chairs by email. Storm ensures security using a Security Typed ORM that refines the (type) abstractions of each layer of the MVC API with logical assertions that describe the data produced and consumed by the underlying operation and the users allowed access to that data. We introduce a hybrid cryptographic protocol for privacy-adhering transformations of encrypted data. Lukas Burkhalter, Nicolas Küchler, Alexander Viand, Hossein Shafagh, and Anwar Hithnawi, ETH Zürich. Mothy received a PhD in 1995 from the Computer Laboratory of the University of Cambridge, where he was a principal designer and builder of the Nemesis OS. NrOS replicates kernel state on each NUMA node and uses operation logs to maintain strong consistency between replicas. Additionally, there is no assurance that data processing and handling comply with the claimed privacy policies. Secure Computation (SC) is a family of cryptographic primitives for computing on encrypted data in single-party and multi-party settings.
To achieve low overhead, selective profiling gathers runtime execution information selectively and incrementally. As increasingly more sensitive data is being collected to gain valuable insights, the need to natively integrate privacy controls in data analytics frameworks is growing in importance. This budget is a scarce resource that must be carefully managed to maximize the number of successfully trained models. Zeph enforces privacy policies cryptographically and ensures that data available to third-party applications complies with users' privacy policies. Researchers from the Software Systems Laboratory bagged Best Paper Awards at the 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2021) and the 2021 USENIX Annual Technical Conference (USENIX ATC 2021). Jay Lepreau Best Paper Award, OSDI'21. Submissions may include as many additional pages as needed for references but not for appendices. Under different configurations of TPC-C and TPC-E, Polyjuice can achieve throughput numbers higher than the best of existing algorithms by 15% to 56%. We have made Fluffy publicly available at https://github.com/snuspl/fluffy to contribute to the security of Ethereum. Some recent schedulers choose job resources for users, but do so without awareness of how DL training can be re-optimized to better utilize the provided resources. As has been standard practice in OSDI and SOSP in recent years, we will allow authors to submit quick responses to PC reviews: they will be made available to the PC before the final online discussion and PC meeting. Our approach effectively eliminates high communication and partitioning overheads, and couples it with a new pipelined push-pull parallelism based execution strategy for fast model training. The abstractions we design for the privacy resource mirror those defined by Kubernetes for traditional resources, but there are also major differences.
Most existing schedulers expect users to specify the number of resources for each job, often leading to inefficient resource use. To enable FL developers to interpret their results in model testing, Oort enforces their requirements on the distribution of participant data while improving the duration of federated testing by cherry-picking clients. All deadline times are 23:59 hrs UTC. Pollux improves scheduling performance in deep learning (DL) clusters by adaptively co-optimizing inter-dependent factors both at the per-job level and at the cluster-wide level. MAGE outperforms the OS virtual memory system by up to an order of magnitude, and in many cases, runs SC computations that do not fit in memory at nearly the same speed as if the underlying machines had unbounded physical memory to fit the entire computation. Paper Submission Information: All submissions must be received by 11:59 PM AoE (UTC-12) on the day of the corresponding deadline. In particular, I'll argue for re-engaging with what computer hardware really is today and give two suggestions (among many) about how the OS research community can usefully do this, and exploit what is actually a tremendous opportunity. Differential privacy (DP) enables model training with a guaranteed bound on this leakage. Professor Veloso is on leave from Carnegie Mellon University as the Herbert A. Simon University Professor in the School of Computer Science, and the past Head of the Machine Learning Department. It then feeds those invariants and the desired safety properties to an SMT solver to check if the conjunction of the invariants and the safety properties is inductive. 23 artifacts received the Artifacts Functional badge (88%). The experimental results show that Penglai can support 1,000s of enclave instances running concurrently and scale up to 512GB secure memory with both encryption and integrity protection.
An evaluation of Addra on a cluster of 80 machines on AWS demonstrates that it can serve 32K users with a 99-th percentile message latency of 726 ms, a 7× improvement over a prior system for text messaging in the same threat model. We also welcome work that explores the interface to related areas such as computer architecture, networking, programming languages, analytics, and databases. We propose a learning-based framework that instead explicitly optimizes concurrency control via offline training to maximize performance. GoJournal is implemented in Go, and Perennial is implemented in the Coq proof assistant. Concretely, Dorylus is 1.22× faster and 4.83× cheaper than GPU servers for massive sparse graphs. In experiments with real DL jobs and with trace-driven simulations, Pollux reduces average job completion times by 37-50% relative to state-of-the-art DL schedulers, even when they are provided with ideal resource and training configurations for every job. A graph neural network (GNN) enables deep learning on structured graph data. This is especially true for DPF over Rényi DP, a highly composable form of DP. Moreover, as of October 2020, a review of the 50 most cited empirical papers that list personality as a keyword indicates that all 50 papers were authored by people with institutional affiliations in the United States, Canada, Germany, the UK, and New Zealand, and only three papers included samples outside of these regions (see Supplementary ...). She has been recognized with many industry honors including induction into the National Academy of Engineering, the Inventor Hall of Fame, The Internet Hall of Fame, Washington State Academy of Science, and lifetime achievement awards from USENIX and SIGCOMM. However, Addra improves message latency in this architecture, which is a key performance metric for voice calls.
We present DPF (Dominant Private Block Fairness), a variant of the popular Dominant Resource Fairness (DRF) algorithm that is geared toward the non-replenishable privacy resource but enjoys similar theoretical properties as DRF. Submitted papers must be no longer than 12 single-spaced 8.5" x 11" pages, including figures and tables, plus as many pages as needed for references, using 10-point type on 12-point (single-spaced) leading, two-column format, Times Roman or a similar font, within a text block 7" wide x 9" deep.