Thu, 22 Sep 2016 18:29:03 +0000
Automating Ad hoc Data Representation Transformations by Vlad Ureche, Aggelos Biboudis, Yannis Smaragdakis, and Martin Odersky:
To maximize run-time performance, programmers often specialize their code by hand, replacing library collections and containers by custom objects in which data is restructured for efficient access. However, changing the data representation is a tedious and error-prone process that makes it hard to test, maintain and evolve the source code.
We present an automated and composable mechanism that allows programmers to safely change the data representation in delimited scopes containing anything from expressions to entire class definitions. To achieve this, programmers define a transformation and our mechanism automatically and transparently applies it during compilation, eliminating the need to manually change the source code.
Our technique leverages the type system in order to offer correctness guarantees on the transformation and its interaction with object-oriented language features, such as dynamic dispatch, inheritance and generics.
We have embedded this technique in a Scala compiler plugin and used it in four very different transformations, ranging from improving the data layout and encoding, to
retrofitting specialization and value class status, and all the way to collection deforestation. On our benchmarks, the technique obtained speedups between 1.8x and 24.5x.
This is a realization of an idea that has been briefly discussed here on LtU a few times, whereby a program is written using high-level representations, and the user has the option to provide a lowering to a more efficient representation after the fact.
This contrasts with the typical approach of providing efficient primitives, like primitive unboxed values, and leaving it to the programmer to compose them efficiently up front.
Wed, 07 Sep 2016 08:54:58 +0000
At this year's SPLASH there will be a follow-up to last year's NOOL 2015 workshop (discussed here on LtU). Objects may not be the trendiest thing in PL circles, but I think they have a bright future. (Example: session types are essentially concurrent objects in disguise, and they are quite trendy.)
Please consider submitting! (Disclaimer: I'm one of the organisers :)
Fri, 19 Aug 2016 09:23:21 +0000
Long HN thread ensues. Many of the comments discuss the benefits/costs of basing pipes on typed objects rather than text streams. As someone who should be inclined in favor of the typed object approach I have to say that I think the text-only folks have the upper hand at the moment. Primary reason is that text as a lingua franca between programs ensures interoperability (and insurance against future changes to underlying object models) and self-documenting code. Clearly the Achilles' heel is parsing/unparsing.
As happens often, one is reminded of the discussions of DSLs and pipelines in Jon Bentley's Programming Pearls...
Sun, 07 Aug 2016 17:26:26 +0000
Nothing you don't already know, if you are inteo this sort of thing (and many if not most LtU-ers are), but a quick way to get the basic idea if you are not. Wadler has papers that explain Curry-Howard better, and the category theory content here is very basic -- but it's an easy listen that will give you the fundamental points if you still wonder what this category thing is all about.
To make this a bit more fun for those already in the know: what is totally missing from the talk (understandable given time constraints) is why this should interest the "working hacker". So how about pointing out a few cool uses/ideas that discerning hackers will appreciate? Go for it!
Wed, 27 Jul 2016 15:57:02 +0000
Fully Abstract Compilation via Universal Embedding by Max S. New, William J. Bowman, and Amal Ahmed:
A fully abstract compiler guarantees that two source components are observationally equivalent in the source language if and only if their translations are observationally equivalent in the target. Full abstraction implies the translation is secure: target-language attackers can make no more observations of a compiled component than a source-language attacker interacting with the original source component. Proving full abstraction for realistic compilers is challenging because realistic target languages contain features (such as control effects) unavailable in the source, while proofs of full abstraction require showing that every target context to which a compiled component may be linked can be back-translated to a behaviorally equivalent source context.
We prove the first full abstraction result for a translation whose target language contains exceptions, but the source does not. Our translation—specifically, closure conversion of simply typed λ-calculus with recursive types—uses types at the target level to ensure that a compiled component is never linked with attackers that have more distinguishing power than source-level attackers. We present a new back-translation technique based on a deep embedding of the target language into the source language at a dynamic type. Then boundaries are inserted that mediate terms between the untyped embedding and the strongly-typed source. This technique allows back-translating non-terminating programs, target features that are untypeable in the source, and well-bracketed effects.
Potentially a promising step forward to secure multilanguage runtimes. We've previously discussed security vulnerabilities caused by full abstraction failures here and here. The paper also provides a comprehensive review of associated literature, like various means of protection, back translations, embeddings, etc.
Sun, 03 Jul 2016 05:16:11 +0000
As many of you know, the email functionality of the website has not been working for a very, very long time. In addition, all new users are still being approved by me, to combat spam. All this means manual work by me, prompted by frustrated emails by new members. Alas, given other commitments, I find that the backlog is growing and I simply cannot find the time to handle these emails (i.e., approve the user, set an initial password, let them know and ask them to change it). If any member want to help me with this, I would be grateful. This will involve getting extra admin privileges on the site, after which I can forward the requests in the pipe line to you to approve.
Sat, 18 Jun 2016 12:51:35 +0000
Dynamic Witnesses for Static Type Errors (or, ill-typed programs usually go wrong) by Eric L. Seidel, Ranjit Jhala, Westley Weimer:
Static type errors are a common stumbling block for newcomers to typed functional languages. We present a dynamic approach to explaining type errors by generating counterexample witness inputs that illustrate how an ill-typed program goes wrong. First, given an ill-typed function, we symbolically execute the body to dynamically synthesize witness values that can make the program go wrong. We prove that our procedure synthesizes general witnesses in that if a witness is found, then for all inhabited input types, there exist values that can make the function go wrong. Second, we show how to extend the above procedure to produce a reduction graph that can be used to interactively visualize and debug witness executions. Third, we evaluate our approach on two data sets comprising over 4,500 ill-typed student programs. Our technique is able to generate witnesses for 88% of the programs, and our reduction graph yields small counterexamples for 81% of the witnesses.
Sounds like a great idea to make type systems more accessible, particularly for beginners. The current limitations are described the discussion, section 54:
There are, of course, drawbacks to our approach. Four that stand out are: (1) coverage limits due to random generation, (2) the inability to handle certain instances of infinite types, (3) dealing with an explosion in the size of generated traces, and (4) handling ad-hoc polymorphism.
It's also not clear whether this would produce proper witnesses for violations of higher kinded types or other more sophisticated uses of type systems. There are plenty of examples where invariants are encoded in types, eg. lightweight static capabilities.
Thu, 09 Jun 2016 13:38:24 +0000
Set-Theoretic Types for Polymorphic Variants by Giuseppe Castagna, Tommaso Petrucciani, and Kim Nguyễn:
Polymorphic variants are a useful feature of the OCaml language whose current definition and implementation rely on kinding constraints to simulate a subtyping relation via unification. This yields an awkward formalization and results in a type system whose behaviour is in some cases unintuitive and/or unduly restrictive.
In this work, we present an alternative formalization of polymorphic variants, based on set-theoretic types and subtyping, that yields a cleaner and more streamlined system. Our formalization is more expressive than the current one (it types more programs while preserving type safety), it can internalize some meta-theoretic properties, and it removes some pathological cases of the current implementation resulting in a more intuitive and, thus, predictable type system. More generally, this work shows how to add full-fledged union types to functional languages of the ML family that usually rely on the Hindley-Milner type system. As an aside, our system also improves the theory of semantic subtyping, notably by proving completeness for the type reconstruction algorithm.
Looks like a nice result. They integrate union types and restricted intersection types for complete type inference, which prior work on CDuce could not do. The disadvantage is that it does not admit principal types, and so inference is non-deterministic (see section 5.3.2).
Thu, 02 Jun 2016 13:44:58 +0000
How to Build Static Checking Systems Using Orders of Magnitude Less Code, by Fraser Brown Andres Notzli Dawson Engler:
Modern static bug finding tools are complex. They typically consist of hundreds of thousands of lines of code, and most of them are wedded to one language (or even one compiler). This complexity makes the systems hard to understand, hard to debug, and hard to retarget to new languages, thereby dramatically limiting their scope.
This paper reduces the complexity of the checking system by addressing a fundamental assumption, the assumption that checkers must depend on a full-blown language specification and compiler front end. Instead, our program checkers are based on drastically incomplete language grammars (“micro-grammars”) that describe only portions of a language relevant to a checker. As a result, our implementation is tiny—roughly 2500 lines of code, about two orders of magnitude smaller than a typical system. We hope that this dramatic increase in simplicity will allow developers to use more checkers on more systems in more languages.
Looks like an interesting approach with some compelling results, and will make a good tool for the toolbox. See also the Reddit thread for further discussion.
Wed, 25 May 2016 13:54:59 +0000
No value restriction is needed for algebraic effects and handlers, by Ohad Kammar and Matija Pretnar:
We present a straightforward, sound Hindley-Milner polymorphic type system for algebraic effects and handlers in a call-by-value calculus, which allows type variable generalisation of arbitrary computations, not just values. This result is surprising. On the one hand, the soundness of unrestricted call-by-value Hindley-Milner polymorphism is known to fail in the presence of computational effects such as reference cells and continuations. On the other hand, many programming examples can be recast to use effect handlers instead of these effects. Analysing the expressive power of effect handlers with respect to state effects, we claim handlers cannot express reference cells, and show they can simulate dynamically scoped state.
Looks like a nice integration of algebraic effects with simple Hindly-Milner, but which yields some unintuitive conclusions. At least I certainly found the possibility of supporting dynamically scoped state but not reference cells surprising!
It highlights the need for some future work to support true reference cells, namely a polymorphic type and effect system to generate fresh instances.
Tue, 10 May 2016 15:24:33 +0000
...the big benefit is that Elm is now significantly easier to learn and use. As the design of subscriptions emerged, we saw that all the toughest concepts in Elm (signals, addresses, and ports) could collapse into simpler concepts in this new world. Elm is designed for ease-of-use, so I was delighted to stumble upon a path that would take us farther with fewer concepts. To put this in more alarmist terms, everything related to signals has been replaced with something simpler and nicer.
Sat, 30 Apr 2016 19:44:24 +0000
Simon's main research interest is in functional programming languages, their implementation, and their application. He was a key contributor to the design of the now-standard functional language Haskell, and is the lead designer of the widely-used Glasgow Haskell Compiler (GHC). He has written two textbooks about the implementation of functional languages.
More generally, Simon is interested in language design, rich type systems, compiler technology, code generation, runtime systems, virtual machines, and garbage collection. He is particularly motivated by direct use of principled theory to practical language design and implementation -- that is one reason he loves functional programming so much.
Simon is also chair of Computing at School, the grass-roots organisation that was at the epicentre of the 2014 reform of the English computing curriculum.
Wed, 27 Apr 2016 17:32:27 +0000
Kent Dybvig (Cadence Research, Cisco Systems) has released the commercial scheme compiler Chez Scheme (scheme.com) as open source on GitHub. Chez Scheme is a native code generating optimizing compiler for R6RS focusing on performance and productivity. It supports cross-compilation, threading, and many other extensions. Current version is 9.4.
I'm excited to see what the community will build with this great tool.
Fri, 08 Apr 2016 17:56:21 +0000The paper: An Array-Oriented Language with Static Rank Polymorphism, Justin Slepak, Olin Shivers, and Panagiotis Manolios, Northeastern University, 2014.
Abstract. The array-computational model pioneered by Iverson’s lan- guages APL and J offers a simple and expressive solution to the “von Neumann bottleneck.” It includes a form of rank, or dimensional, poly- morphism, which renders much of a program’s control structure im- plicit by lifting base operators to higher-dimensional array structures. We present the first formal semantics for this model, along with the first static type system that captures the full power of the core language.
The formal dynamic semantics of our core language, Remora, illu- minates several of the murkier corners of the model. This allows us to resolve some of the model’s ad hoc elements in more general, regular ways. Among these, we can generalise the model from SIMD to MIMD computations, by extending the semantics to permit functions to be lifted to higher-dimensional arrays in the same way as their arguments.
Our static semantics, a dependent type system of carefully restricted power, is capable of describing array computations whose dimensions cannot be determined statically. The type-checking problem is decidable and the type system is accompanied by the usual soundness theorems. Our type system’s principal contribution is that it serves to extract the implicit control structure that provides so much of the language’s expres- sive power, making this structure explicitly apparent at compile time.
Fri, 01 Apr 2016 01:25:22 +0000
Type Checking Modular Multiple Dispatch with Parametric Polymorphism and Multiple Inheritance by Eric Allen, Justin Hilburn, Scott Kilpatrick, Victor Luchangco, Sukyoung Ryu, David Chase, Guy L. Steele Jr.:
In previous work, we presented rules for defining overloaded functions that ensure type safety under symmetric multiple dispatch in an object-oriented language with multiple inheritance, and we showed how to check these rules without requiring the entire type hierarchy to be known, thus supporting modularity and extensibility. In this work, we extend these rules to a language that supports parametric polymorphism on both classes and functions.
In a multiple-inheritance language in which any type may be extended by types in other modules, some overloaded functions that might seem valid are correctly rejected by our rules. We explain how these functions can be permitted in a language that additionally supports an exclusion relation among types, allowing programmers to declare “nominal exclusions” and also implicitly imposing exclusion among different instances of each polymorphic type. We give rules for computing the exclusion relation, deriving many type exclusions from declared and implicit ones.
We also show how to check our rules for ensuring the safety of overloaded functions. In particular, we reduce the problem of handling parametric polymorphism to one of determining subtyping relationships among universal and existential types. Our system has been implemented as part of the open-source Fortress compiler.
Fortress was briefly covered here a couple of times, as were multimethods and multiple dispatch, but this paper really generalizes and nicely summarizes previous work on statically typed modular multimethods, and does a good job explaining the typing rules in an accessible way. The integration with parametric polymorphism I think is key to applying multimethods in other domains which may want modular multimethods, but not multiple inheritance.
The Formalization in COQ might also be of interest to some.
Also, another interesting point is Fortress' use of second-class intersection and union types to simplify type checking.