Invited Speakers
- Serge Abiteboul(Inria)
Issues in Ethical Data Management
Data science holds incredible promise of improving people's lives, accelerating scientific discovery and innovation, and bringing about positive societal change. Yet, if not used responsibly, this technology can generate economic inequality, destabilize global markets and worsen systemic bias. We consider issues such as bias and violation of data privacy in data analysis. We discuss desirable properties of data analysis such as fairness, transparency, neutrality, and diversity. Our goal is to draw the attention of the computer science community to the important emerging subject of responsible data management and analysis. We present our perspective on the issue, and motivate research directions.
- Sumit Gulwani (Microsoft):
Programming by Examples: Applications, Algorithms, and Ambiguity Resolution
99% of computer users do not know programming and hence struggle with repetitive tasks. Programming by Examples (PBE) can revolutionize this landscape by enabling users to synthesize intended programs from example based specifications. A key technical challenge in PBE is to search for programs that are consistent with the examples provided by the user. Our efficient search methodology is based on two key ideas: (i) Restriction of the search space to an appropriate domain-specific language (ii) A divide-and-conquer based search paradigm that inductively reduces the problem of synthesizing a program with a certain top-level operator to simpler synthesis problems over its sub-programs by leveraging the operator's inverse semantics. Another challenge in PBE is to resolve the ambiguity in the example based specification. Our ambiguity resolution methodology leverages two complementary approaches: (a) machine learning based ranking techniques that can pick an intended program from among those that satisfy the specification, and (b) active-learning based user interaction models. I will illustrate these various concepts using Flash Fill, FlashExtract, and FlashRelate---PBE technologies for data manipulation domains. These technologies, which have been released inside various Microsoft products, are useful for data scientists who spend 80% of their time wrangling with data. The Microsoft PROSE SDK allows easy construction of such technologies.
- Marieke Huisman (University of Twente):
A Verification Technique for Deterministic Parallel Programs
A commonly used approach to develop parallel programs is to augment a sequential program with compiler directives that indicate which program blocks may potentially be executed in parallel. This talk presents how we use our verification technique for concurrent software, as supported by the VerCors tool set, to prove correctness of compiler directives combined with functional correctness of the program. We propose syntax and semantics for a simple core language, capturing the main forms of deterministic parallel programs. This language distinguishes three kinds of basic blocks: parallel, vectorized and sequential blocks, which can be composed using three different composition operators: sequential, parallel and fusion composition. We show that it is sufficient to have contracts for the basic blocks to prove correctness of the compiler directives, and moreover that functional correctness of the sequential program implies correctness of the parallelized program. We also show how a widely-used subset of OpenMP can be encoded into our core language, thus effectively enabling the verification of OpenMP compiler directives, and we discuss automated tool support for this verification process.