Preview: Lambda the Ultimate - Programming Languages Weblog

Contextual isomorphisms

Fri, 09 Dec 2016 21:37:50 +0000

Contextual Isomorphisms Paul Blain Levy 2017 What is the right notion of "isomorphism" between types, in a simple type theory? The traditional answer is: a pair of terms that are inverse, up to a specified congruence. We firstly argue that, in the presence of effects, this answer is too liberal and needs to be restricted, using Führmann’s notion of thunkability in the case of value types (as in call-by-value), or using Munch-Maccagnoni’s notion of linearity in the case of computation types (as in call-by-name). Yet that leaves us with different notions of isomorphism for different kinds of type. This situation is resolved by means of a new notion of “contextual” isomorphism (or morphism), analogous at the level of types to contextual equivalence of terms. A contextual morphism is a way of replacing one type with the other wherever it may occur in a judgement, in a way that is preserved by the action of any term with holes. For types of pure λ-calculus, we show that a contextual morphism corresponds to a traditional isomorphism. For value types, a contextual morphism corresponds to a thunkable isomorphism, and for computation types, to a linear isomorphism. This paper is based on a very simple idea that everyone familiar with type-systems can enjoy. It then applies it in a technical setting in which it brings a useful contribution. I suspect that many readers will find that second part too technical, but they may still enjoy the idea itself. To facilitate this, I will rephrase the abstract above in a way that I hope makes it accessible to a larger audience. The problem that the paper solves is: how do we know what it means for two types to be equivalent? For many languages they are reasonable definitions of equivalence (such that: there exists a bijection between these two types in the language), but for more advanced languages these definitions break down. For example, in presence of hidden mutable state, one can build a pair of functions from the one-element type unit to the two-element type bool and back that are the identity when composed together -- the usual definition of bijection -- while these two types should probably not be considered "the same". Those two functions share some hidden state, so they "cheat". Other, more complex notions of type equivalence have been given in the literature, but how do we know whether they are the "right" notions, or whether they may disappoint us in the same way? To define what it means for two program fragments to be equivalent, we have a "gold standard", which is contextual equivalence: two program fragments are contextually equivalent if we can replace one for the other in any complete program without changing its behavior. This is simple to state, it is usually clear how to instantiate this definition for a new system, and it gives you a satisfying notion of equivalent. It may not be the most convenient one to work with, so people define others, more specific notions of equivalence (typically beta-eta-equivalence or logical relations); it is fine if they are more sophisticated, and their definiton harder to justify or understand, because they can always be compared to this simple definition to gain confidence. The simple idea in the paper above is to use this exact same trick to define what it means for two types to be equivalent. Naively, one could say that two types are equivalent if, in any well-typed program, one can replace some occurrences of the first type by occurrences of the second type, all other things being unchanged. This does not quite work, as changing the types that appear in a program without changing its terms would create ill-typed terms. So instead, the paper proposes that two types are equivalent when we are told how to transform any program using the first type into a program using the second type, in a way that is bijective (invertible) and compositional -- see the paper for details. Then, the author can validate this definition by showing that, when instantiated to languages (simple or complex) where existing notions of equival[...]
Joe Armstrong Interviews Alan Kay

Fri, 25 Nov 2016 15:25:27 +0000

Youtube video (via HN)

By far not the best presentation of Kay's ideas but surely a must watch for fans. Otherwise, skip until the last third of the interview which might add to what most people here already know.

It is interesting that in this talk Kay rather explicitly talks about programming languages as abstraction layers. He also mentions some specifics that may not be as well known as others, yet played a role in his trajectory, such as META.

I fully sympathize with his irritation with the lack of attention to and appreciation of fundamental concepts and theoretical frameworks in CS. On the other hand, I find his allusions to biology unconvincing.

An oh, he is certainly right about Minsky's book (my first introduction to theoretical CS) and in his deep appreciation of John McCarthy.

Polymorphism, subtyping and type inference in MLsub

Wed, 23 Nov 2016 16:54:15 +0000

I am very enthusiastic about the following paper: it brings new ideas and solves a problem that I did not expect to be solvable, namely usable type inference when both polymorphism and subtyping are implicit. (By "usable" here I mean that the inferred types are both compact and principal, while previous work generally had only one of those properties.)

Polymorphism, Subtyping, and Type Inference in MLsub

Stephen Dolan and Alan Mycroft

2017

We present a type system combining subtyping and ML-style parametric polymorphism. Unlike previous work, our system supports type inference and has compact principal types. We demonstrate this system in the minimal language MLsub, which types a strict superset of core ML programs.

This is made possible by keeping a strict separation between the types used to describe inputs and those used to describe outputs, and extending the classical unification algorithm to handle subtyping constraints between these input and output types. Principal types are kept compact by type simplification, which exploits deep connections between subtyping and the algebra of regular languages. An implementation is available online.

The paper is full of interesting ideas. For example, one idea is that adding type variables to the base grammar of types -- instead of defining them by their substitution -- forces us to look at our type systems in ways that are more open to extension with new features. I would recommend looking at this paper even if you are interested in ML and type inference, but not subtyping, or in polymorphism and subtyping, but not type inference, or in subtyping and type inference, but not functional languages.

This paper is also a teaser for the first's author PhD thesis, *Algebraic Subtyping*, currently not available online -- I suppose the author is not very good at the internet.

(If you are looking for interesting work on inference of polymorphism and subtyping in object-oriented languages, I would recommend Getting F-Bounded Polymorphism into Shape by Ben Greenman, Fabian Muehlboeck and Ross Tate, 2014.)

Salon des Refusés -- Dialectics for new computer science

Wed, 23 Nov 2016 16:20:16 +0000

In the first decade of the twenty-first century, the Feyerabend Project organized several workshops to discuss and develop new ways to think of programming languages and computing in general. A new event in this direction is a new workshop that will take place in Brussels, in April, co-located with the new

Salon des Refusés -- Dialectics for new computer science

April 2017, Brussels

Salon des Refusés (“exhibition of rejects”) was an 1863 exhibition of artworks rejected from the official Paris Salon. The jury of Paris Salon required near-photographic realism and classified works according to a strict genre hierarchy. Paintings by many, later famous, modernists such as Édouard Manet were rejected and appeared in what became known as the Salon des Refusés. This workshop aims to be the programming language research equivalent of Salon des Refusés. We provide venue for exploring new ideas and new ways of doing computer science.

Many interesting ideas about programming might struggle to find space in the modern programming language research community, often because they are difficult to evaluate using established evaluation methods (be it proofs, measurements or controlled user studies). As a result, new ideas are often seen as “unscientific”.

This workshop provides a venue where such interesting and thought provoking ideas can be exposed to critical evaluation. Submissions that provoke interesting discussion among the program committee members will be published together with an attributed review that presents an alternative position, develops additional context or summarizes discussion from the workshop. This means of engaging with papers not just enables explorations of novel programming ideas, but also enourages new ways of doing computer science.

The workshop's webpage also contains descriptions of of some formats that could "make it possible to think about programming in a new way", including: Thought experiments, Experimentation, Paradigms, Metaphors, myths and analogies, and From jokes to science fiction.

For writings on similar questions about formalism, paradigms or method in programming language research, see Richard Gabriel's work, especially The Structure of a Programming Language Revolution (2012) and Writers’ Workshops As Scientific Methodology (?)), Thomas Petricek's work, especially Against a Universal Definition of 'Type' (2015) and Programming language theory Thinking the unthinkable (2016)), and Jonathan Edwards' blog: Alarmind Development.

For programs of events of similar inspiration in the past, you may be interested in the Future of Programming workshops: program of 2014, program of September 2015, program of October 2015. Other events that are somewhat similar in spirit -- but maybe less radical in content -- are Onward!, NOOL and OBT.

Proving Programs Correct Using Plain Old Java Types

Fri, 04 Nov 2016 14:27:32 +0000

Proving Programs Correct Using Plain Old Java Types, by Radha Jagadeesan, Alan Jeffrey, Corin Pitcher, James Riely:

Tools for constructing proofs of correctness of programs have a long history of development in the research community, but have often faced difficulty in being widely deployed in software development tools. In this paper, we demonstrate that the off-the-shelf Java type system is already powerful enough to encode non-trivial proofs of correctness using propositional Hoare preconditions and postconditions.

We illustrate the power of this method by adapting Fähndrich and Leino’s work on monotone typestates and Myers and Qi’s closely related work on object initialization. Our approach is expressive enough to address phased initialization protocols and the creation of cyclic data structures, thus allowing for the elimination of null and the special status of constructors. To our knowledge, our system is the first that is able to statically validate standard one-pass traversal algorithms for cyclic graphs, such as those that underlie object deserialization. Our proof of correctness is mechanized using the Java type system, without any extensions to the Java language.

Not a new paper, but it provides a lightweight verification technique for some program properties that you can use right now, without waiting for integrated theorem provers or SMT solvers. Properties that require only monotone typestates can be verified, ie. those that operations can only move the typestate "forwards".

In order to achieve this, they require programmers to follow a few simple rules to avoid Java's pervasive nulls. These are roughly: don't assign null explicitly, be sure to initialize all fields when constructing objects.

Automating Ad hoc Data Representation Transformations

Thu, 22 Sep 2016 18:29:03 +0000

Automating Ad hoc Data Representation Transformations by Vlad Ureche, Aggelos Biboudis, Yannis Smaragdakis, and Martin Odersky:

To maximize run-time performance, programmers often specialize their code by hand, replacing library collections and containers by custom objects in which data is restructured for efficient access. However, changing the data representation is a tedious and error-prone process that makes it hard to test, maintain and evolve the source code.

We present an automated and composable mechanism that allows programmers to safely change the data representation in delimited scopes containing anything from expressions to entire class definitions. To achieve this, programmers define a transformation and our mechanism automatically and transparently applies it during compilation, eliminating the need to manually change the source code.

Our technique leverages the type system in order to offer correctness guarantees on the transformation and its interaction with object-oriented language features, such as dynamic dispatch, inheritance and generics.

We have embedded this technique in a Scala compiler plugin and used it in four very different transformations, ranging from improving the data layout and encoding, to

retrofitting specialization and value class status, and all the way to collection deforestation. On our benchmarks, the technique obtained speedups between 1.8x and 24.5x.

This is a realization of an idea that has been briefly discussed here on LtU a few times, whereby a program is written using high-level representations, and the user has the option to provide a lowering to a more efficient representation after the fact.

This contrasts with the typical approach of providing efficient primitives, like primitive unboxed values, and leaving it to the programmer to compose them efficiently up front.

NOOL 2016

Wed, 07 Sep 2016 08:54:58 +0000

At this year's SPLASH there will be a follow-up to last year's NOOL 2015 workshop (discussed here on LtU). Objects may not be the trendiest thing in PL circles, but I think they have a bright future. (Example: session types are essentially concurrent objects in disguise, and they *are* quite trendy.)

Please consider submitting! (Disclaimer: I'm one of the organisers :)

PowerShell is open sourced and is available on Linux

Fri, 19 Aug 2016 09:23:21 +0000

Long HN thread ensues. Many of the comments discuss the benefits/costs of basing pipes on typed objects rather than text streams. As someone who should be inclined in favor of the typed object approach I have to say that I think the text-only folks have the upper hand at the moment. Primary reason is that text as a lingua franca between programs ensures interoperability (and insurance against future changes to underlying object models) and self-documenting code. Clearly the Achilles' heel is parsing/unparsing.

As happens often, one is reminded of the discussions of DSLs and pipelines in Jon Bentley's Programming Pearls...

Philip Wadler: Category Theory for the Working Hacker

Sun, 07 Aug 2016 17:26:26 +0000

Nothing you don't already know, if you are inteo this sort of thing (and many if not most LtU-ers are), but a quick way to get the basic idea if you are not. Wadler has papers that explain Curry-Howard better, and the category theory content here is very basic -- but it's an easy listen that will give you the fundamental points if you still wonder what this category thing is all about.

To make this a bit more fun for those already in the know: what is totally missing from the talk (understandable given time constraints) is why this should interest the "working hacker". So how about pointing out a few cool uses/ideas that discerning hackers will appreciate? Go for it!

Fully Abstract Compilation via Universal Embedding

Wed, 27 Jul 2016 15:57:02 +0000

Fully Abstract Compilation via Universal Embedding by Max S. New, William J. Bowman, and Amal Ahmed:

A

fully abstractcompiler guarantees that two source components are observationally equivalent in the source language if and only if their translations are observationally equivalent in the target. Full abstraction implies the translation is secure: target-language attackers can make no more observations of a compiled component than a source-language attacker interacting with the original source component. Proving full abstraction for realistic compilers is challenging because realistic target languages contain features (such as control effects) unavailable in the source, while proofs of full abstraction require showing that every target context to which a compiled component may be linked can be back-translated to a behaviorally equivalent source context.We prove the first full abstraction result for a translation whose target language contains exceptions, but the source does not. Our translation—specifically, closure conversion of simply typed λ-calculus with recursive types—uses types at the target level to ensure that a compiled component is never linked with attackers that have more distinguishing power than source-level attackers. We present a new back-translation technique based on a deep embedding of the target language into the source language at a dynamic type. Then boundaries are inserted that mediate terms between the untyped embedding and the strongly-typed source. This technique allows back-translating non-terminating programs, target features that are untypeable in the source, and well-bracketed effects.

Potentially a promising step forward to secure multilanguage runtimes. We've previously discussed security vulnerabilities caused by full abstraction failures here and here. The paper also provides a comprehensive review of associated literature, like various means of protection, back translations, embeddings, etc.

Admin

Sun, 03 Jul 2016 05:16:11 +0000

As many of you know, the email functionality of the website has not been working for a very, very long time. In addition, all new users are still being approved by me, to combat spam. All this means manual work by me, prompted by frustrated emails by new members. Alas, given other commitments, I find that the backlog is growing and I simply cannot find the time to handle these emails (i.e., approve the user, set an initial password, let them know and ask them to change it). If any member want to help me with this, I would be grateful. This will involve getting extra admin privileges on the site, after which I can forward the requests in the pipe line to you to approve.

Thanks!

Dynamic Witnesses for Static Type Errors (or, ill-typed programs usually go wrong)

Sat, 18 Jun 2016 12:51:35 +0000

Dynamic Witnesses for Static Type Errors (or, ill-typed programs usually go wrong) by Eric L. Seidel, Ranjit Jhala, Westley Weimer:

Static type errors are a common stumbling block for newcomers to typed functional languages. We present a dynamic approach to explaining type errors by generating counterexample witness inputs that illustrate how an ill-typed program goes wrong. First, given an ill-typed function, we symbolically execute the body to dynamically synthesize witness values that can make the program go wrong. We prove that our procedure synthesizes general witnesses in that if a witness is found, then for all inhabited input types, there exist values that can make the function go wrong. Second, we show how to extend the above procedure to produce a reduction graph that can be used to interactively visualize and debug witness executions. Third, we evaluate our approach on two data sets comprising over 4,500 ill-typed student programs. Our technique is able to generate witnesses for 88% of the programs, and our reduction graph yields small counterexamples for 81% of the witnesses.

Sounds like a great idea to make type systems more accessible, particularly for beginners. The current limitations are described the discussion, section 54:

There are, of course, drawbacks to our approach. Four that stand out are: (1) coverage limits due to random generation, (2) the inability to handle certain instances of infinite types, (3) dealing with an explosion in the size of generated traces, and (4) handling ad-hoc polymorphism.

It's also not clear whether this would produce proper witnesses for violations of higher kinded types or other more sophisticated uses of type systems. There are plenty of examples where invariants are encoded in types, eg. lightweight static capabilities.

Set-Theoretic Types for Polymorphic Variants

Thu, 09 Jun 2016 13:38:24 +0000

Set-Theoretic Types for Polymorphic Variants by Giuseppe Castagna, Tommaso Petrucciani, and Kim Nguyễn:

Polymorphic variants are a useful feature of the OCaml language whose current definition and implementation rely on kinding constraints to simulate a subtyping relation via unification. This yields an awkward formalization and results in a type system whose behaviour is in some cases unintuitive and/or unduly restrictive.

In this work, we present an alternative formalization of polymorphic variants, based on set-theoretic types and subtyping, that yields a cleaner and more streamlined system. Our formalization is more expressive than the current one (it types more programs while preserving type safety), it can internalize some meta-theoretic properties, and it removes some pathological cases of the current implementation resulting in a more intuitive and, thus, predictable type system. More generally, this work shows how to add full-fledged union types to functional languages of the ML family that usually rely on the Hindley-Milner type system. As an aside, our system also improves the theory of semantic subtyping, notably by proving completeness for the type reconstruction algorithm.

Looks like a nice result. They integrate union types and restricted intersection types for complete type inference, which prior work on CDuce could not do. The disadvantage is that it does not admit principal types, and so inference is non-deterministic (see section 5.3.2).

How to Build Static Checking Systems Using Orders of Magnitude Less Code

Thu, 02 Jun 2016 13:44:58 +0000

How to Build Static Checking Systems Using Orders of Magnitude Less Code, by Fraser Brown Andres Notzli Dawson Engler:

Modern static bug finding tools are complex. They typically consist of hundreds of thousands of lines of code, and most of them are wedded to one language (or even one compiler). This complexity makes the systems hard to understand, hard to debug, and hard to retarget to new languages, thereby dramatically limiting their scope.

This paper reduces the complexity of the checking system by addressing a fundamental assumption, the assumption that checkers must depend on a full-blown language specification and compiler front end. Instead, our program checkers are based on drastically incomplete language grammars (“micro-grammars”) that describe only portions of a language relevant to a checker. As a result, our implementation is tiny—roughly 2500 lines of code, about two orders of magnitude smaller than a typical system. We hope that this dramatic increase in simplicity will allow developers to use more checkers on more systems in more languages.

We implement our approach in µchex, a language-agnostic framework for writing static bug checkers. We use it to build micro-grammar based checkers for six languages (C, the C preprocessor, C++, Java, JavaScript, and Dart) and find over 700 errors in real-world projects.

Looks like an interesting approach with some compelling results, and will make a good tool for the toolbox. See also the Reddit thread for further discussion.

No value restriction is needed for algebraic effects and handlers

Wed, 25 May 2016 13:54:59 +0000

No value restriction is needed for algebraic effects and handlers, by Ohad Kammar and Matija Pretnar:

We present a straightforward, sound Hindley-Milner polymorphic type system for algebraic effects and handlers in a call-by-value calculus, which allows type variable generalisation of arbitrary computations, not just values. This result is surprising. On the one hand, the soundness of unrestricted call-by-value Hindley-Milner polymorphism is known to fail in the presence of computational effects such as reference cells and continuations. On the other hand, many programming examples can be recast to use effect handlers instead of these effects. Analysing the expressive power of effect handlers with respect to state effects, we claim handlers cannot express reference cells, and show they can simulate dynamically scoped state.

Looks like a nice integration of algebraic effects with simple Hindly-Milner, but which yields some unintuitive conclusions. At least I certainly found the possibility of supporting dynamically scoped state but not reference cells surprising!

It highlights the need for some future work to support true reference cells, namely a polymorphic type and effect system to generate fresh instances.