TACL Seminar Fall '05
|Sep 22||Daniel S Dantas||PolyAML: A Polymorphic Aspect-oriented Functional Programmming Language||Abstract|
|Oct 6||Frances Spalding||SpacerX: Detecting Deadlocks Using Sequential Analysis||Abstract|
|Oct 13||David Walker]||PADS/ML|
|Oct 27||Jay Ligatti||Policy Enforcement via Program Monitoring||Abstract|
|Nov 3||Guilherme Ottoni||Automatic Thread Extraction with Decoupled Software Pipelining||Abstract|
|Nov 15||Anupam Datta - Stanford||Security Analysis of Network Protocols|
|Nov 22||Matthew Bridges||Predicate Instruction Fusion||Abstract|
|Dec 15||Yitzhak Mandelbaum||The Next 700 Data Description Languages||Abstract|
Daniel S Dantas
PolyAML: A Polymorphic Aspect-oriented Functional Programmming Language
This talk defines PolyAML, a typed functional, aspect-oriented programming language. The main contribution of PolyAML is the seamless integration of polymorphism, run-time type analysis and aspect-oriented programming language features. In particular, PolyAML allows programmers to define type-safe polymorphic advice using pointcuts constructed from a collection of polymorphic join points. PolyAML also comes equipped with a type inference algorithm that conservatively extends Hindley-Milner type inference. To support first-class polymorphic point-cut designators, a crucial feature for developing aspect-oriented profiling or logging libraries, the algorithm blends the conventional Hindley-Milner type inference algorithm with a simple form of local type inference.
We give our language operational meaning via a type-directed translation into an expressive type-safe intermediate language. Many complexities of the source language are eliminated in this translation, leading to a modular specification of its semantics. One of the novelties of the intermediate language is the definition of polymorphic labels for marking control-flow points. These labels are organized in a tree structure such that a parent in the tree serves as a representative for all of its children. Type safety requires that the type of each child is less polymorphic than its parent type. Similarly, when a set of labels is assembled as a pointcut, the type of each label is an instance of the type of the pointcut.
SpacerX: Detecting Deadlocks Using Sequential Analysis
This is work I did this summer as an intern at Microsoft in the Program Analysis Group.
Deadlocks are a very common problem in concurrent code and are notoriously hard to debug. SpacerX is a prototype tool that detects potential deadlocks using sequential analysis.
This is accomplished in two phases. In phase one, we compute the sequence of locks held at each program point using a dataflow analysis. This analysis is built to leverage existing tools from the Program Ananlysis Group, such as a value alias analysis, a path feasibility simulator, and a graphical interface for viewing traces through the code. In phase two, we compute the ordering constraints between locks and look for cycles in these ordering constraints.
After running SpacerX on a very small subset of the Windows code base, we have already found three confirmed deadlocks. These deadlocks could not have been found by hand (the reasoning is too complicated), by model checking (the code involved is too large), or by testing (one deadlock only occurs under specific conditions at precisely midnight).
Policy Enforcement via Program Monitoring
Program monitors intercept security-relevant actions of a target application at run time and take remedial steps whenever the target attempts to execute a potentially dangerous action. Although it is well known that monitors with simple remedial options can enforce only safety properties, we will describe more sophisticated monitors that can enforce any of what we call the 'infinite renewal' properties. We will compare the set of infinite renewal properties with the standard sets of safety and liveness properties and find that infinite renewal properties include all safety properties, some liveness properties, and some properties that are neither safety nor liveness.
In practice, the security policies enforced by program monitors grow more complex both as the monitored software is given new capabilities and as policies are refined in response to attacks and user feedback. We propose dealing with policy complexity by organizing policies so that they are composeable. Security architects can then specify complex policies more simply as compositions of smaller 'subpolicies'. We will describe a language and system called Polymer that implements and supports composeable policies for Java applications.
This is joint work with Lujo Bauer (Carnegie Mellon University) and David Walker (Princeton University).
Automatic Thread Extraction with Decoupled Software Pipelining
This is a presentation I'll be giving at the 38th IEEE/ACM International Symposium on Microarchitecture.
Until recently, a steadily rising clock rate and other uniprocessor microarchitectural improvements could be relied upon to consistently deliver increasing performance for a wide range of applications. Current difficulties in maintaining this trend have lead microprocessor manufacturers to add value by incorporating multiple processors on a chip. Unfortunately, since decades of compiler research have not succeeded in delivering automatic threading for prevalent code properties, this approach demonstrates no improvement for a large class of existing codes.
To find useful work for chip multiprocessors, we propose an automatic approach to thread extraction, called Decoupled Software Pipelining (DSWP). DSWP exploits the fine-grained pipeline parallelism lurking in most applications to extract long-running, concurrently executing threads. Use of the non-speculative and truly decoupled threads produced by DSWP can increase execution efficiency and provide significant latency tolerance, mitigating design complexity by reducing inter-core communication and per-core resource requirements. Using our initial fully automatic compiler implementation and a validated processor model, we prove the concept by demonstrating significant gains for dual-core chip multiprocessor models running a variety of codes. We then explore simple opportunities missed by our initial compiler implementation which suggest a promising future for this approach.
Predication is an optimization that allows compilers to eliminate branches by turning control flow into data flow. This allows the compiler to reduce the number of branch mispredictions in a program, but can lead to longer schedules if the region is heavily resource bound afterward. This talk presents Predicate Instruction Fusion (PIF), an optimization to reduce the resource requirements of a predicated schedule by merging redundant instructions. PIF builds upon and extends the traditional compiler techniques of common subexpression elimination and code hoisting to find and exploit redundancy. We enumerate several cases of possible instruction fusion and show how PIF captures these opportunities. Our results show that PIF can increase the range of regions that can be effectively predicated or software pipelined. Additionally, reducing the resource requirements of already predicated regions leads to better schedules and better performance.
In the spirit of Landin, we present a calculus of dependent types to serve as the semantic foundation for a family of languages called data description languages. Such languages, which include PADS,DATASCRIPT, and PACKETTYPES, are designed to facilitate programming with ad hoc data, i.e., data not in well-behaved relational or XML formats. In the calculus, each type describes the physical layout and semantic properties of a data source. In the semantics, we interpret types simultaneously as the in-memory representation of the data described and as parsers for the data source. The parsing functions are robust, automatically detecting and recording errors in the data stream without halting parsing. We show the parsers are type-correct, returning data whose type matches the simple-type interpretation of the specification. We also prove the parsers are “error-correct,” accurately reporting the number of physical and semantic errors that occur in the returned data. We use the calculus to describe the features of various data description languages, and we discuss how we have used the calculus to improve PADS.