Terse Systems
http://tersesystems.com/rss.xml

Published: Mon, 11 Jul 2016 19:34:25 -0700

Last Build Date: Mon, 11 Jul 2016 19:34:25 -0700

 



Redefining java.lang.System with Byte Buddy

Tue, 19 Jan 2016 10:37:00 -0800

The previous post talked about using Java’s SecurityManager to prevent attackers from gaining access to sensitive resources. This is complicated by the fact that if the wrong permissions are granted, or a type confusion attack in the JVM is used, it’s possible to turn off SecurityManager by calling System.setSecurityManager(null). There should be a way to tell the JVM that once a SecurityManager is set, it should never be unset. But this isn’t in the JVM itself right now, and adding it would mean redefining java.lang.System itself. So let’s go do that. The example project is at https://github.com/wsargent/securityfixer.

The first step is to use the Java Instrumentation API. This will allow us to install a Java agent before the main program starts. In the Java agent, we’ll intercept the setSecurityManager method, and throw an exception if the security manager is already set.

The second step is Byte Buddy, a code generation tool that will create new bytecode representing the System class. Byte Buddy comes with an AgentBuilder that can be attached to the instrumentation instance. Byte Buddy uses ASM under the hood, but doesn’t require raw manipulation of byte code and class files the way that ASM does — instead, you write interceptors and Byte Buddy will generate the corresponding byte code. From there, an interceptor appended to the bootstrap class path will be loaded before the actual JVM System class.

```java
public class SecurityFixerAgent {

    public static void premain(String arg, Instrumentation inst) {
        install(arg, inst);
    }

    public static void agentmain(String arg, Instrumentation inst) {
        install(arg, inst);
    }

    /**
     * Installs the agent builder to the instrumentation API.
     *
     * @param arg the path to the interceptor JAR file.
     * @param inst instrumentation instance.
     */
    static void install(String arg, Instrumentation inst) {
        appendInterceptorToBootstrap(arg, inst);
        AgentBuilder agentBuilder = createAgentBuilder(inst);
        agentBuilder.installOn(inst);
    }
}
```

The interceptor class lives in its own package and is relatively simple:

```java
public class MySystemInterceptor {

    private static SecurityManager securityManager;

    public static void setSecurityManager(SecurityManager s) {
        if (securityManager != null) {
            throw new IllegalStateException("SecurityManager cannot be reset!");
        }
        securityManager = s;
    }
}
```

The interesting bit is the configuration of the AgentBuilder.
Byte Buddy is set up out of the box for class transformation and adding new methods and dynamic classes, not redefinition of static methods, so we have to flip a bunch of switches to get the behavior we want:

```java
private static AgentBuilder createAgentBuilder(Instrumentation inst) {
    // Find me a class called "java.lang.System"
    final ElementMatcher.Junction systemType = ElementMatchers.named("java.lang.System");

    // And then find a method called setSecurityManager and tell MySystemInterceptor to
    // intercept it (the method binding is smart enough to take it from there)
    final AgentBuilder.Transformer transformer =
        (b, typeDescription) -> b.method(ElementMatchers.named("setSecurityManager"))
                                 .intercept(MethodDelegation.to(MySystemInterceptor.class));

    // Disable a bunch of stuff and turn on redefine as the only option
    final ByteBuddy byteBuddy = new ByteBuddy().with(Implementation.Context.Disabled.Factory.INSTANCE);
    final AgentBuilder agentBuilder = new AgentBuilder.Default()
        .withByteBuddy(byteBuddy)
        .withInitializationStrategy(AgentBuilder.InitializationStrategy.NoOp.INSTANCE)
        .withRedefinitionStrategy(AgentBuilder.RedefinitionStrategy.REDEFINITION)
        .withTypeStrategy(A[...]
```
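Assuming the agent JAR is wired up with -javaagent on startup, a quick smoke test shows the redefined behavior. This is a minimal sketch (the test class name is hypothetical), consistent with the interceptor shown above:

```java
public class SecurityFixerSmokeTest {
    public static void main(String[] args) {
        // The first call goes through MySystemInterceptor and records the manager.
        System.setSecurityManager(new SecurityManager());
        try {
            // The second call should now be rejected by the interceptor.
            System.setSecurityManager(null);
            System.out.println("BUG: the SecurityManager was reset!");
        } catch (IllegalStateException e) {
            System.out.println("OK: " + e.getMessage());
        }
    }
}
```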



Self-Protecting Sandbox using SecurityManager

Tue, 29 Dec 2015 14:18:00 -0800

TL;DR: Most sandbox mechanisms involving Java’s SecurityManager do not contain mechanisms to prevent the SecurityManager itself from being disabled, and are therefore “defenseless” against malicious code. Use a SecurityManager and a security policy as a system property on startup to cover the entire JVM, or use an “orthodox” sandbox as described below.

Background

Since looking at the Java Serialization vulnerability, I’ve been thinking about mitigations and solutions in the JVM. I started with a Java agent, notsoserial, as a way to disable Java serialization itself. But that just opened up a larger question — why should the JVM let you call runtime.exec to begin with? Poking into that question led me to Pro-Grade and looking at blacklists of security manager permissions as a solution.

The problem with blacklists is that there’s always a way to work around them. Simply disabling the “execute” file permission in the JVM didn’t mean anything by itself — what’s to say that it couldn’t simply be re-enabled? What prevents malicious code from turning off the SecurityManager? I thought it was possible, but I didn’t know exactly how it could happen.

I’ve never worked on a project that worked with the SecurityManager, or added custom security manager checks. Most of the time, the Java security manager is completely disabled on server side applications, or the application is given AllPermission, essentially giving it root access to the JVM.

Broken Sandboxes

All my suspicions were confirmed in a recent paper from a team at CMU, Evaluating the Flexibility of the Java Sandbox:

“[D]evelopers regularly misunderstand or misuse Java security mechanisms, that benign programs do not use all of the vast flexibility afforded by the Java security model, and that there are clear differences between the ways benign and exploit programs interact with the security manager.”

The team found that most of the policies that programmers put in place to “sandbox” code so that it did not have permissions to do things like execute files were not effective. In particular, they call out some security manager idioms as “defenseless” — they cannot prevent themselves from being sidestepped. In a nutshell, while sandboxed applications may not execute a script on the filesystem, they can still modify or disable the security manager.

The team looked through 36 applications that used the SecurityManager, to see how application programmers work with the security architecture. Every single one of them failed.

“All of these applications ran afoul of the Java sandbox’s flexibility even though they attempted to use it for its intended purpose. [...] While Java does provide the building blocks for constraining a subset of an application with a policy that is stricter than what is imposed on the rest of the application, it is clear that it is too easy to get this wrong: we’ve seen no case where this goal was achieved in a way that is known to be free of vulnerabilities.”

I think there are times when the disconnect between security professionals and application developers turns into a gulf. At the same time that the Oracle Secure Coding Guide has a section on access control, and the CERT guide talks about protecting sensitive operations with security manager checks, there’s very little about how to set up a secure environment that can use SecurityManager effectively at all.
Sandbox Defeating Permissions

Here are the permissions that make up a defenseless sandbox:

RuntimePermission("createClassLoader")
RuntimePermission("accessClassInPackage.sun")
RuntimePermission("setSecurityManager")
ReflectPermission("suppressAccessChecks")
FilePermission("<<ALL FILES>>", "write, execute")
SecurityPermission("setPolicy")
SecurityPermission("setProperty.package.access")

Given any one of these, sandboxed code can break out of the sandbox and work its way down the stack. The A[...]
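To make “defenseless” concrete, here is a minimal sketch (not from the paper) of what escape looks like when sandboxed code happens to hold RuntimePermission("setSecurityManager"):

```java
// Hypothetical malicious payload running inside a "sandbox" whose policy
// grants RuntimePermission("setSecurityManager").
public class Escape {
    public static void main(String[] args) throws Exception {
        // The security manager permits this call because of the grant...
        System.setSecurityManager(null);
        // ...and now nothing checks anything: remote code execution.
        Runtime.getRuntime().exec("/bin/sh");
    }
}
```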



An Easy Way to Secure Java Applications

Tue, 22 Dec 2015 14:18:00 -0800

One of the things that stands out in the Java Serialization exploit is that once a server side Java application is compromised, the next step is to gain shell access on the host machine. This is known as a Remote Code Execution, or RCE for short.

The interesting thing is that Java has had a way to restrict execution and prevent RCE almost since Java 1.1: the SecurityManager. With the SecurityManager enabled, Java code operates inside a far more secure sandbox that prevents RCE.

```
java -Djava.security.manager com.example.Hello
```

This runs with the default security policy in $JAVA_HOME/jre/lib/security/java.policy, which in JDK 1.8 is:

```
// Standard extensions get all permissions by default

grant codeBase "file:${{java.ext.dirs}}/*" {
        permission java.security.AllPermission;
};

// default permissions granted to all domains

grant {
        // Allows any thread to stop itself using the java.lang.Thread.stop()
        // method that takes no argument.
        // Note that this permission is granted by default only to remain
        // backwards compatible.
        // It is strongly recommended that you either remove this permission
        // from this policy file or further restrict it to code sources
        // that you specify, because Thread.stop() is potentially unsafe.
        // See the API specification of java.lang.Thread.stop() for more
        // information.
        permission java.lang.RuntimePermission "stopThread";

        // allows anyone to listen on dynamic ports
        permission java.net.SocketPermission "localhost:0", "listen";

        // "standard" properties that can be read by anyone
        permission java.util.PropertyPermission "java.version", "read";
        permission java.util.PropertyPermission "java.vendor", "read";
        permission java.util.PropertyPermission "java.vendor.url", "read";
        permission java.util.PropertyPermission "java.class.version", "read";
        permission java.util.PropertyPermission "os.name", "read";
        permission java.util.PropertyPermission "os.version", "read";
        permission java.util.PropertyPermission "os.arch", "read";
        permission java.util.PropertyPermission "file.separator", "read";
        permission java.util.PropertyPermission "path.separator", "read";
        permission java.util.PropertyPermission "line.separator", "read";
        permission java.util.PropertyPermission "java.specification.version", "read";
        permission java.util.PropertyPermission "java.specification.vendor", "read";
        permission java.util.PropertyPermission "java.specification.name", "read";
        permission java.util.PropertyPermission "java.vm.specification.version", "read";
        permission java.util.PropertyPermission "java.vm.specification.vendor", "read";
        permission java.util.PropertyPermission "java.vm.specification.name", "read";
        permission java.util.PropertyPermission "java.vm.version", "read";
        permission java.util.PropertyPermission "java.vm.vendor", "read";
        permission java.util.PropertyPermission "java.vm.name", "read";
};
```

Take code like this, for example:

```scala
package com.example

object Hello {
  def main(args: Array[String]): Unit = {
    val runtime = Runtime.getRuntime
    val cwd = System.getProperty("user.dir")
    val process = runtime.exec(s"$cwd/testscript.sh")
    println("Process executed without security manager interference!")
  }
}
```

With the security manager enabled and using an additional policy file, it’s possible to enable or disable execute privileges cleanly:

```
grant {
  // You can read user.dir
  permission java.util.PropertyPermission "user.dir", "read";

  // Gets access to the current user directory script
  permission java.io.FilePermission "${user.dir}/testscript.sh", "execute";

  permis[...]
```
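With the pieces above in place, the whole thing is started with both flags, along the lines of the earlier command (the policy file name here is an assumption):

```
java -Djava.security.manager -Djava.security.policy=security.policy com.example.Hello
```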



The Right Way to Use SecureRandom

Thu, 17 Dec 2015 16:58:00 -0800

How do you generate a secure random number in JDK 1.8? It depends. The default:

```java
SecureRandom random = new SecureRandom();
byte[] values = new byte[20];
random.nextBytes(values);
```

If you’re okay with blocking the thread:

```java
SecureRandom random = SecureRandom.getInstanceStrong();
byte[] values = new byte[20];
random.nextBytes(values);
```

That’s really it.

Details

The difference between the first use case and the second: the first instance uses /dev/urandom. The second instance uses /dev/random. /dev/random blocks the thread if there isn’t enough randomness available, but /dev/urandom will never block.

Believe it or not, there is no advantage in using /dev/random over /dev/urandom. They use the same pool of randomness under the hood. They are equally secure. If you want to safely generate random numbers, you should use /dev/urandom.

The only time you would want to call /dev/random is when the machine is first booting, and entropy has not yet accumulated. Most systems will save off entropy before shutting down so that some is available when booting, so this is not an issue if you run directly on hardware.

However, it might be an issue if you don’t run directly on hardware. If you are using a container based solution like Docker or CoreOS, you may start off from an initial image, and so may not be able to save state between reboots — additionally, in a multi-tenant container solution, there is only one shared /dev/random which may block horribly. However, the work around in these cases is to seed /dev/random with a userspace solution, either using an entropy server for pollinate, or a CPU time stamp counter for haveged. Either way, by the time the JVM starts, the system’s entropy pool should already be up to the job.

Some people have a cryptographic habit of using /dev/random for seed generation, so there are some cases where it’s easier to use getInstanceStrong just to avoid argument or the hassle of a code review. However, that’s a workaround for a personnel issue, not a cryptographic argument.

How the default works

There is a full list of SecureRandom implementations available, which lists the preferences for the “default” SecureRandom. For Linux and MacOS, the list is:

NativePRNG** (Sun)
SHA1PRNG** (Sun)
NativePRNGBlocking (Sun)
NativePRNGNonBlocking (Sun)

There is an asterisk saying “On Solaris, Linux, and OS X, if the entropy gathering device in java.security is set to file:/dev/urandom or file:/dev/random, then NativePRNG is preferred to SHA1PRNG. Otherwise, SHA1PRNG is preferred.” However, this doesn’t affect the list. When they say “entropy gathering device”, they mean “securerandom.source”, and grepping through java.security shows:

```
$ grep securerandom.source $JAVA_HOME/jre/lib/security/java.security
# specified by the "securerandom.source" Security property.  If an
# "securerandom.source" Security property.
securerandom.source=file:/dev/random
```

Yep, the line exists, so “NativePRNG” is preferred to “SHA1PRNG”. So, what does that mean?
There’s an entry in Standard Names, but there’s also a more specific note of what each algorithm does in the Sun Providers section:

SHA1PRNG: initial seeding is currently done via a combination of system attributes and the java.security entropy gathering device.
NativePRNG: nextBytes() uses /dev/urandom, generateSeed() uses /dev/random.
NativePRNGBlocking: nextBytes() and generateSeed() use /dev/random.
NativePRNGNonBlocking: nextBytes() and generateSeed() use /dev/urandom.

The nextBytes method is the base method: when you call nextInt or nextLong, etc., it will call down to nextBytes under the hood. The generateSeed method is not needed for a Native PRNG of any type, but it IS useful to seed a user space PRNG such as SHA1PRNG. You can call setSeed on a NativePRNG, and it will use an int[...]
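If you want a guaranteed non-blocking generator rather than whatever the preference list picks, you can also ask for the algorithm by name. A minimal sketch, assuming a Linux or OS X JDK where NativePRNGNonBlocking is registered (it is not available on Windows):

```java
import java.security.SecureRandom;

public class NonBlockingRandom {
    public static void main(String[] args) throws Exception {
        // Both nextBytes() and generateSeed() will use /dev/urandom.
        SecureRandom random = SecureRandom.getInstance("NativePRNGNonBlocking");
        byte[] values = new byte[20];
        random.nextBytes(values);
        System.out.println("generated with " + random.getAlgorithm());
    }
}
```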



Closing the Open Door of Java Object Serialization

Sun, 08 Nov 2015 10:46:00 -0800

TL;DR

This is a long blog post, so please read carefully and all the way through before you come up with objections as to why it’s not so serious. Here’s the short version.

Java Serialization is insecure, and is deeply intertwingled into Java monitoring (JMX) and remoting (RMI). The assumption was that placing JMX/RMI servers behind a firewall was sufficient protection, but attackers use a technique known as pivoting or island hopping to compromise a host and send attacks through an established and trusted channel. SSL/TLS is not a protection against pivoting. This means that if a compromised host can send a serialized object to your JVM, your JVM could also be compromised, or at least suffer a denial of service attack. And because serialization is so intertwingled with Java, you may be using serialization without realizing it, in an underlying library that you cannot modify.

To combat an attacker who has penetrated or bypassed initial layers of security, you need a technique called defense in depth. Ideally, you should disable serialization completely using a JVM agent called notsoserial. This will give you a security bulkhead and you can add network monitoring to see if an attacker starts testing ports with serialized objects. If you can’t disable serialization, then there are options for limiting your exposure until you can remove those dependencies. Please talk to your developers and vendors about using a different serialization format.

The Exploit

If you can communicate with a JVM using Java object serialization using java.io.ObjectInputStream, then you can send a class (technically, bytes that cause instantiation of a class already on the classpath) that can execute commands against the OS from inside of the readObject method, and thereby get shell access. Once you have shell access, you can modify the Java server however you feel like.

This is a class of exploit called “deserialization of untrusted data”, aka CWE-502. It’s a class of bug that has been encountered in Python, PHP, and Rails. Chris Frohoff and Gabriel Lawrence presented a talk called Marshalling Pickles that talked about some exploits that are possible once you have access to Java object serialization.

Practical Attacks

A blog post by FoxGlove Security took the Marshalling Pickles talk and pointed out that it’s common for application servers to run ports with either RMI, or JMX, a management protocol that runs on top of RMI. An attacker with access to those ports could compromise the JVM. The proposed fix was to identify all app servers containing the commons-collections JAR and remove them.

The problem is, you don’t need to have commons-collections running — there are a number of different pathways in. The ysoserial tool shows four different ways into the JVM using object serialization, and that’s only with the known libraries. There are any number of Java libraries which could have viable exploits. It isn’t over. Matthias Kaiser of Code White is doing further research in Exploiting Java Serialization, and says that more exploits are coming.

So, fixing this particular exploit doesn’t fix the real problem, nor does it explain why it exists.

The Real Problem

The real problem is that “deserialization of untrusted input” happens automatically, at the ObjectInputStream level, when readObject is called. You need to check untrusted input first before deserializing it, a process called “validation” or “recognition” if you’re up on language security.
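A common mitigation along these lines is a look-ahead ObjectInputStream that rejects unexpected classes before they are instantiated. A minimal sketch (the whitelisted class name is hypothetical):

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectStreamClass;
import java.util.Collections;
import java.util.Set;

// Rejects any class not explicitly whitelisted, *before* deserialization
// instantiates it and its readObject method runs.
public class WhitelistObjectInputStream extends ObjectInputStream {
    private static final Set<String> ALLOWED =
        Collections.singleton("com.example.SafeMessage");

    public WhitelistObjectInputStream(InputStream in) throws IOException {
        super(in);
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        if (!ALLOWED.contains(desc.getName())) {
            throw new SecurityException("Unauthorized class: " + desc.getName());
        }
        return super.resolveClass(desc);
    }
}
```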
But the specification is so powerful and complex that there isn’t a good way to securely validate Java serialized objects. Isolation — only having a management port open inside your “secure” data center — isn’t enough. Cryptography — using message authentication or encryption — isn’t enough. Obscurit[...]



Effective Cryptography in the JVM

Mon, 05 Oct 2015 13:09:00 -0700

Cryptography is hard. But it doesn’t have to be. Most cryptographic errors are in construction, in the joints of putting together well known primitives. From an effective cryptography standpoint, you are the weakest link.

There are libraries out there that make all the right decisions, and minimize your exposure to incorrect constructions, meaning you never have to type “AES” and build a broken system out of working primitives. The most well known tools are NaCl and cryptlib, but these are C based tools — not useful for the JVM. (There are Java bindings like Kalium, but they require dynamically linked libraries to work.)

However, there is a library out there that builds on top of Java’s native cryptography (JCA) and doesn’t require OS level integration: Keyczar. Keyczar’s philosophy is easier cryptography for software developers, and this post is mostly about how to use Keyczar.

Keyczar isn’t a new project — Google came out with Keyczar in 2008. It is, however, a brutally undervalued project. Cryptographers hold their software to a high standard, and Keyczar is one of the few which has held up over the years (barring Nate Lawson’s timing attack). But Keyczar doesn’t break any new ground technologically, and uses some older (although still perfectly good) algorithms. To cryptographers, it’s boring.

To developers, however, Keyczar is a series of bafflingly arbitrary decisions. Why are RSA keys specified as either encrypt/decrypt or sign/verify? (Because “one key per purpose.”) Why such a complicated system for adding and removing keys? (Because key management is important.) Why all the metadata? (Because key management isn’t free.) The fact that these are all cryptographically correct decisions is beside the point — Keyczar does very little to explain itself to developers.

So. Let’s start explaining Keyczar.

Explaining Keyczar

Keyczar works on several levels. The first level is making sure you don’t make elementary crypto mistakes in encryption algorithms, like reusing an IV or using ECB mode. The second level is to minimize the possibility of key misuse, by setting up a key rotation system and ensuring keys are defined with a particular purpose. The third level is to set up cryptographic systems on top of the usual primitives that JCA or Bouncycastle will give you. Using Keyczar, you can effectively use public keys to add “PGP like” features to your application, or define signatures that are invalid after an expiration date.

Note that the Java version of Keyczar does not cover password hashing or password based encryption (PBE). However, Keyczar is cross platform — there are .NET, Python, C# and Java libraries available, all designed to use the same underlying formats and protocols.

Example Use Cases

Keyczar exposes operations, but doesn’t talk much about what you can do with it. You can use Keyczar for storing application secrets securely. You can use Keyczar for PCI compliance. You can use Keyczar for secure storage of access tokens in an email application. More to the point, you don’t have to tremble in fear every time someone sends you a cryptography link. Even if you only have simple use cases, Keyczar will do what you want.

Documenting Keyczar

The current documentation is all on the Github wiki at https://github.com/google/keyczar/wiki — there is a PDF, but it’s from 0.5, and is horribly out of date. Instead, you should download the wiki to your local machine.
```
git clone https://github.com/google/keyczar.wiki.git
```

Installing Keyczar

The first step is to get source, binaries and documentation. Keyczar’s source is on Github at https://github.com/google/keyczar and currently the only way I know to get current binaries is to build it from source. The latest version is 0.7.1g, so th[...]
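To give a feel for the API before the installation details, here is a minimal sketch of symmetric encryption with Keyczar’s Crypter. It assumes a key set already created on disk with the KeyczarTool; the directory path is hypothetical:

```java
import org.keyczar.Crypter;

public class KeyczarExample {
    public static void main(String[] args) throws Exception {
        // Reads key material and metadata from a KeyczarTool-managed directory.
        Crypter crypter = new Crypter("/path/to/keys");

        // encrypt returns web-safe Base64 ciphertext; decrypt reverses it.
        String ciphertext = crypter.encrypt("Secret message");
        String plaintext = crypter.decrypt(ciphertext);
        System.out.println(plaintext);
    }
}
```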



Exposing Akka Actor State with JMX

Tue, 19 Aug 2014 08:51:00 -0700

I’ve published an activator template of how to integrate JMX into your Akka Actors. Using this method, you can look inside a running Akka application and see exactly what sort of state your actors are in. Thanks to Jamie Allen for the idea in his book, Effective Akka.

Running

Start up Activator with the following options:

```
export JAVA_OPTS="-XX:+UnlockCommercialFeatures -XX:+FlightRecorder -XX:FlightRecorderOptions=samplethreads=true -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=9191 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=localhost"
export java_opts=$JAVA_OPTS
activator "runMain jmxexample.Main"
```

Then in another console, start up your JMX tool. In this example, we are using Java Mission Control:

```
jmc
```

Using Java Mission Control, connect to the NettyServer application listed in the right tree view. Go to the “MBean Server” item in the tree view on the right. Click on “MBean Browser” in the second tab at the bottom. Open up the “jmxexample” tree folder, then “GreeterMXBean”, then “/user/master”. You’ll see the attributes on the right. Hit F5 a lot to refresh.

Creating an MXBean with an External View Class

Exposing state through JMX is easy, as long as you play by the rules: always use an MXBean (which does not require JAR downloads over RMI), always think about thread safety when exposing internal variables, and always create a custom class that provides a view that the MXBean is happy with.

Here’s a trait that exposes some state, GreetingHistory. As long as the trait ends in “MXBean”, JMX is happy. It will display the properties defined in that trait.

```scala
/**
 * MXBean interface: this determines what the JMX tool will see.
 */
trait GreeterMXBean {

  /**
   * Uses a composite data view to show the greeting history.
   */
  def getGreetingHistory: GreetingHistory

  /**
   * Uses a JMX mapping to show the greeting history.
   */
  def getGreetingHistoryMXView: GreetingHistoryMXView
}
```

Here’s the JMX actor that implements the GreeterMXBean interface. Note that the only thing it does is receive a GreetingHistory case class, and then render it. There is a catch, however: because the greetingHistory variable is accessed both through Akka and through a JMX thread, it must be declared as volatile so that memory access is atomic.

```scala
/**
 * The JMX view into the Greeter.
 */
class GreeterMXBeanActor extends ActorWithJMX with GreeterMXBean {

  // @volatile needed because JMX and the actor model access from different threads
  @volatile private[this] var greetingHistory: Option[GreetingHistory] = None

  def receive = {
    case gh: GreetingHistory =>
      greetingHistory = Some(gh)
  }

  def getGreetingHistory: GreetingHistory = greetingHistory.orNull

  def getGreetingHistoryMXView: GreetingHistoryMXView =
    greetingHistory.map(GreetingHistoryMXView(_)).orNull

  // Maps the MX type to this actor.
  override def getMXTypeName: String = "GreeterMXBean"
}
```

The actor which generates the GreetingHistory case class — the state that you want to expose — should be a parent of the JMX bean, and have a supervisor strategy that can handle JMX exceptions:

```scala
trait ActorJMXSupervisor extends Actor with ActorLogging {

  import akka.actor.OneForOneStrategy
  import akka.actor.SupervisorStrategy._
  import scala.concurrent.duration._

  override val supervisorStrategy =
    OneForOneStrategy(maxNrOfRetries = 10, withinTimeRange = 1 minute) {
      case e: JMRuntimeException =>
        log.error(e, "Supervisor strategy STOPPING actor from errors during JM[...]
```



Composing Dependent Futures

Thu, 10 Jul 2014 13:44:00 -0700

This blog post is adapted from a lightning talk I gave at NetflixOSS, Season 2, Episode 2.

I’ve noticed that when the word “reactive” is mentioned, it tends not to be associated with any code. One of the things that “reactive” means is “non-blocking” code. “Non-blocking” means the idea that you can make a call, and then go on and do something else in your program until you get a notification that something happened. There are a number of frameworks which handle the notification — the idea that a response may not happen immediately — in different ways. Scala has the option of using a couple of different non-blocking mechanisms, and I’m going to go over how they’re used and some interesting wrinkles when they are composed together.

Futures

Scala uses scala.concurrent.Future as the basic unit of non-blocking access. The best way I’ve found to think of a Future is a box that will, at some point, contain the thing that you want. The key thing with a Future is that you never open the box. Trying to force open the box will lead you to blocking and grief. Instead, you put the Future in another, larger box, typically using the map method.

Here’s an example of a Future that contains a String. When the Future completes, then Console.println is called:

```scala
object Main {
  import scala.concurrent.Future
  import scala.concurrent.ExecutionContext.Implicits.global

  def main(args: Array[String]): Unit = {
    val stringFuture: Future[String] = Future.successful("hello world!")
    stringFuture.map { someString =>
      // if you use .foreach you avoid creating an extra Future, but we are proving
      // the concept here...
      Console.println(someString)
    }
  }
}
```

Note that in this case, we’re calling the main method and then... finishing. The string’s Future, provided by the global ExecutionContext, does the work of calling Console.println. This is great, because when we give up control over when someString is going to be there and when Console.println is going to be called, we let the system manage itself.

In contrast, look what happens when we try to force the box open:

```scala
import scala.concurrent.Await
import scala.concurrent.duration._

val stringFuture: Future[String] = Future.successful("hello world!")
val someString = Await.result(stringFuture, 5.seconds)
```

In this case, we have to wait — keep a thread twiddling its thumbs — until we get someString back. We’ve opened the box, but we’ve had to commandeer the system’s resources to get at it.

Event Based Systems with Akka

When we talk about reactive systems in Scala, we’re talking about event driven systems, which typically means Akka. When we want to get a result out of Akka, there are two ways we can do it. We can tell — fire off a message to an actor:

```scala
fooActor ! GetFoo(1)
```

and then rely on fooActor to send us back a message:

```scala
def receive = {
  case GetFoo(id) =>
    sender() ! Foo(id)
}
```

This has the advantage of being very simple and straightforward. You also have the option of using ask, which will generate a Future (ask returns a Future[Any], so mapTo recovers the type):

```scala
val fooFuture: Future[Foo] = (fooActor ? GetFoo(1)).mapTo[Foo]
```

When the actor’s receive method sends back Foo(id) then the Future will complete. If you want to go the other way, from a Future to an Actor, then you can use pipeTo:

```scala
Future.successful {
  Foo(1)
} pipeTo actorRef
```

tell is usually better than ask, but there are nuances to Akka message processing. I recommend Three flavours of request-response pattern in Akka and Ask, Tell and Per Request Actors for a more detailed analysis of messages, and see the Akka documentation. The important caveat is that not all systems are Akka-based.
If you’re talking to a NoSQL store like Redis or Cassandra, odds are that you are using a non-blocking driver th[...]



Play TLS Example With Client Authentication

Mon, 07 Jul 2014 22:38:00 -0700

This is part of a series of posts about setting up Play WS as a TLS client for a “secure by default” setup. Previous posts are:

Fixing The Most Dangerous Code In The World (MITM, Protocols, Cipher Suites, Cert Stores)
Fixing X.509 Certificates (General PKI, Weak Signature and Key Algorithms)
Fixing Certificate Revocation (CRL, OCSP)
Fixing Hostname Verification (HTTPS server identity checks)
Testing Hostname Verification (DNS spoofing your client for educational purposes)

This post is where the rubber meets the road — an actual, demonstrable activator template that shows off WS SSL, provides the scripts for certificate generation, and provides people with an out of the box TLS 1.2 setup using ECDSA certificates.

Want to download it? Go to https://github.com/typesafehub/activator-play-tls-example or clone it directly:

```
git clone https://github.com/typesafehub/activator-play-tls-example.git
```

It’s an activator template, so you can also install it from inside Typesafe Activator by searching for “TLS”. Be sure to read the README. This project is as lightweight as possible, but takes a little configuration to get started.

Certificate Generation

The biggest part of any demo application is setting up the scripts. I didn’t find anything that was really hands free, so I wrote my own. They are exactly the same as the ones described in the Certificate Generation section of the manual. There are various shortcuts that you can use for defining X.509 certificates, but I found it a lot more useful to go through the work of setting up the root CA certificate, defining the server certificate as having an EKU of “serverAuth” and so on.

Play Script

The actual script to run Play with all the required JVM options is... large. Part of this is the documentation on every possible feature, but sadly, there are far too many lines which are “best practices” that are very rarely practiced. Also, the note about rh.secure is a reference to the RequestHeader class in Play itself. Ironically, even when we set HTTPS up on the server, Play itself can’t tell the protocol it’s running on without help.

I will admit to being gleefully happy at setting disabledAlgorithms.properties on startup, so that at last AlgorithmConstraints is enabled on the server:

```
jdk.tls.disabledAlgorithms=RSA keySize < 2048, DSA keySize < 2048, EC keySize < 224
jdk.certpath.disabledAlgorithms=MD2, MD4, MD5, RSA keySize < 2048, DSA keySize < 2048, EC keySize < 224
```

CustomSSLEngineProvider

The CustomSSLEngineProvider is responsible for Play’s HTTPS server. More details can be found in Configuring HTTPS. Setting up an SSLEngineProvider with client authentication is pretty easy, once you know the magic incantations needed to get the trust managers and the key managers set up. After that, it’s a question of ensuring that the SSLEngine knows how trusting it should be.

```scala
override def createSSLEngine(): SSLEngine = {
  val sslContext = createSSLContext(appProvider)

  // Start off with a clone of the default SSL parameters...
  val sslParameters = sslContext.getDefaultSSLParameters

  // Tells the server to ignore the client's cipher suite preference.
  // http://docs.oracle.com/javase/8/docs/technotes/guides/security/jsse/JSSERefGuide.html#cipher_suite_preference
  sslParameters.setUseCipherSuitesOrder(true)

  // http://docs.oracle.com/javase/8/docs/technotes/guides/security/jsse/JSSERefGuide.html#SSLParameters
  val needClientAuth = java.lang.System.getProperty("play.ssl.needClientAuth")
  sslParameters.setNeedClientAuth(java.lang.Boolean.parseBoolean(needClientAuth))

  val engine = sslContext.createSSLEngine
  engine.setSSLP[...]
```



Akka Clustering, Step by Step

Wed, 25 Jun 2014 09:22:00 -0700

This blog post shows how an Akka cluster works by walking through an example application in detail.

Introduction

Akka is an excellent toolkit for handling concurrency. The core concept behind Akka is the Actor model: loosely stated, instead of creating an instance of a class and invoking methods on it, i.e.

```scala
val foo = new Foo()
foo.doStuff(args)
```

you create an Actor, and send the actor immutable messages. Those messages get queued through a mailbox, and the actor processes the messages one by one, in the order they were received:

```scala
val fooActor: ActorRef = actorSystem.actorOf(FooActor.props, "fooActor")
fooActor ! DoStuffMessage(args)
```

Sending messages to an actor is much better than invoking a method for a number of reasons. First, you can pass around an ActorRef anywhere. You can pass them between threads. You can keep them in static methods. No matter what you do, you can’t get the actor into an inconsistent state or call methods in the “wrong” order, because the Actor manages its own state in response to those messages. Second, the actor can send messages back to you:

FooActor.scala:

```scala
def receive = {
  case DoStuffMessage(args) =>
    sender() ! Success("all good!")
  case _ =>
    sender() ! Failure("not so good.")
}
```

Message passing means that the usual binary of method calls — either return a value or throw an exception — gets opened up considerably. When you can send the actor any message you like, and the actor can send you any message back (and can send you that message when it gets around to processing it, instead of immediately), then you are not bound by locality any more. The actor that you’re sending a message to doesn’t have to live on the same JVM that you’re on. It doesn’t even have to live on the same physical machine. As long as there’s a transport capable of getting the message to the actor, it can live anywhere. This brings us to Akka remoting.

Remoting

Akka remoting works by saying to the actor system either “I want you to create an actor on this remote host”:

```scala
val ref = system.actorOf(FooActor.props.withDeploy(Deploy(scope = RemoteScope(address))))
```

or “I want a reference to an existing actor on the remote host”:

```scala
val remoteFooActor = context.actorSelection("akka.tcp://actorSystemName@10.0.0.1:2552/user/fooActor")
```

After calling the actor, messages are sent to the remote server using Protocol Buffers for serialization, and reconstituted on the other end. This is great for peer to peer communication (it already beats RMI), but remoting is too specific in some ways — it points to a unique IP address, and really we’d like actors to just live out there “in the cloud”. This is where Akka clustering comes in. Clustering allows you to create an actor somewhere on a cluster consisting of nodes which all share the same actor system, without knowing exactly which node it is on. Other machines can join and leave the cluster at run time.

We’ll use the akka-sample-cluster-app Activator template as a reference, and walk through each step of the TransformationApp application, showing how to run it and how it communicates.

Installation

Ideally, you should download Typesafe Activator, start it up with “activator ui” and search for “Akka Cluster Samples with Scala” in the field marked “Filter Templates”. From there, Activator can download the template and provide you with a friendly UI and tutorial. If not, all the code snippets have links back to the source code on Github, so you can clone or copy the files directly.

Clustering

The first step in Akka clustering is the library dependencies and the akka configuration. The build.sb[...]



Writing an SBT Plugin

Tue, 24 Jun 2014 21:17:00 -0700

One of the things I like about SBT is that it’s interactive. SBT stays up as a long running process, and you interact with it many times, while it manages your project and compiles code for you. Because SBT is interactive and runs on the JVM, you can use it for more than just builds. You can use it for communication. Specifically, you can use it to make HTTP requests out to things you’re interested in communicating with.

Unfortunately, I knew very little about SBT plugins. So, I talked to Christopher Hunt and Josh Suereth, downloaded eigengo’s sbt-mdrw project, read the activator blog post on markdown and then worked it out on the plane back from Germany. I made a 0.13 SBT plugin that uses the ROME RSS library to display titles from a list of RSS feeds. It’s available from https://github.com/wsargent/sbt-rss and has lots of comments.

The SBT RSS plugin adds a single command to SBT. You type rss at the console, and it displays the feed:

```
> rss
[info] Showing http://typesafe.com/blog/rss.xml
[info] Title = The Typesafe Blog
[info] Published = null
[info] Most recent entry = Scala Days Presentation Roundup
[info] Entry updated = null
[info] Showing http://letitcrash.com/rss
[info] Title = Let it crash
[info] Published = null
[info] Most recent entry = Reactive Queue with Akka Reactive Streams
[info] Entry updated = null
[info] Showing https://github.com/akka/akka.github.com/commits/master/news/_posts.atom
[info] Title = Recent Commits to akka.github.com:master
[info] Published = Thu May 22 05:51:21 EDT 2014
[info] Most recent entry = Fix fixed issue list.
[info] Entry updated = Thu May 22 05:51:21 EDT 2014
```

Let’s show how it does that. First, the build file. This looks like a normal build.sbt file, except that there’s a sbtPlugin setting in it:

build.sbt:

```scala
// this bit is important
sbtPlugin := true

organization := "com.typesafe.sbt"

name := "sbt-rss"

version := "1.0.0-SNAPSHOT"

scalaVersion := "2.10.4"

scalacOptions ++= Seq("-deprecation", "-feature")

resolvers += Resolver.sonatypeRepo("snapshots")

libraryDependencies ++= Seq(
  // RSS fetcher (note: the website is horribly outdated)
  "com.rometools" % "rome-fetcher" % "1.5.0"
)

publishMavenStyle := false

/** Console */
initialCommands in console := "import com.typesafe.sbt.rss._"
```

Next, there’s the plugin Scala code itself.

SbtRss.scala:

```scala
object SbtRss extends AutoPlugin {
  // stuff
}
```

So, the first thing to note is the AutoPlugin class. The Plugins page talks about AutoPlugin — all you really need to know is if you define an autoImport object with your setting keys and then import it into an AutoPlugin, you will make the settingKey available to SBT.

The next bit is the globalSettings entry:

SbtRss.scala:

```scala
override def globalSettings: Seq[Setting[_]] =
  super.globalSettings ++ Seq(
    Keys.commands += rssCommand
  )
```

Here, we’re saying we’re going to add a command to SBT’s global settings, by merging it with super.globalSettings. The next two bits detail how to create the RSS command in SBT style.

SbtRss.scala:

```scala
/** Allows the RSS command to take string arguments. */
private val args = (Space ~> StringBasic).*

/** The RSS command, mapped into sbt as "rss [args]" */
private lazy val rssCommand = Command("rss")(_ => args)(doRssCommand)
```

Finally, there’s the command itself.

SbtRss.scala:

```scala
def doRssCommand(state: State, args: Seq[String]): State = {
  // do stuff
  st[...]
```



Testing Hostname Verification

Mon, 31 Mar 2014 18:35:00 -0700

This is part of a series of posts about setting up Play WS as a TLS client for a “secure by default” setup and configuration through text files, along with the research and thinking behind the setup. I recommend The Most Dangerous Code in the World for more background. And thanks to Jon for the shoutout in Techcrunch. Previous posts are:

Fixing The Most Dangerous Code In The World (MITM, Protocols, Cipher Suites, Cert Stores)
Fixing X.509 Certificates (General PKI, Weak Signature and Key Algorithms)
Fixing Certificate Revocation (CRL, OCSP)
Fixing Hostname Verification (HTTPS server identity checks)

The last talked about implementing hostname verification, which was a particular concern in TMDCitW. This post shows how you can test that your TLS client implements hostname verification correctly, by staging an attack. We’re going to use dnschef, a DNS proxy server, to confuse the client into talking to the wrong server.

To keep things simple, I’m going to assume you’re on Mac OS X Mavericks at this point. (If you’re on Linux, this is old hat. If you’re on Windows, it’s probably easier to use a VM like Virtualbox to set up a Linux environment.)

The first step to installing dnschef is to install a decent Python. The Python Guide suggests Homebrew, and Homebrew requires XCode be installed, so let’s start there.

Install XCode

Install XCode from the App Store and also install the command line tools:

```
xcode-select --install
```

Install Homebrew itself:

```
ruby -e "$(curl -fsSL https://raw.github.com/Homebrew/homebrew/go/install)"
```

Homebrew has some notes about Python, so we set up the command line environment:

```
export ARCHFLAGS="-arch x86_64"
export PATH=/usr/local/bin:/usr/local/sbin:~/bin:$PATH
```

Now (if you already have homebrew installed):

```
brew update
brew install openssl
brew install python --with-brewed-openssl --framework
```

You should see:

```
$ python --version
Python 2.7.6
$ which python
/usr/local/bin/python
```

If you run into trouble, then brew doctor or brew link --overwrite python should sort things out.

Now upgrade the various package tools for Python:

```
pip install --upgrade setuptools
pip install --upgrade pip
```

Now that we’ve got Python installed, we can install dnschef:

```
wget https://thesprawl.org/media/projects/dnschef-0.2.1.tar.gz
tar xvzf dnschef-0.2.1.tar.gz
cd dnschef-0.2.1
```

Then, we need to use dnschef as a nameserver. An attacker would use rogue DHCP or ARP spoofing to fool your computer into accepting this, but we can just add it directly:

OS X: Open System Preferences and click on the Network icon. Select the active interface and fill in the DNS Server field. If you are using Airport then you will have to click on the Advanced... button and edit DNS servers from there. Don’t forget to click “Apply” after making the changes!

Now, we’re going to use DNS to redirect https://www.howsmyssl.com to https://playframework.com.

```
$ host playframework.com
playframework.com has address 54.243.50.169
```

We need to specify the IP address 54.243.50.169 as the fakeip argument:

```
$ sudo /usr/local/bin/python ./dnschef.py --fakedomains www.howsmyssl.com --fakeip 54.243.50.169
[*] DNSChef started on interface: 127.0.0.1
[*] Using the following nameservers: 8.8.8.8
[*] Cooking A replies to point to 54.243.50.169 matching: www.howsmyssl.com
```

Now that we’ve got[...]



Fixing Hostname Verification

Sun, 23 Mar 2014 14:33:00 -0700

This is the fourth in a series of posts about setting up Play WS as a TLS client for a “secure by default” setup and configuration through text files, along with the research and thinking behind the setup. I recommend The Most Dangerous Code in the World for more background. Previous posts are:

Fixing The Most Dangerous Code In The World (MITM, Protocols, Cipher Suites, Cert Stores)
Fixing X.509 Certificates (General PKI, Weak Signature and Key Algorithms)
Fixing Certificate Revocation (CRL, OCSP)

The Attack: Man in the Middle

The scenario that requires hostname verification is when an attacker is on your local network, and can subvert DNS or ARP, and somehow redirect traffic through his own machine. When you make the call to https://example.com, the attacker can make the response come back to a local IP address, and then send you a TLS handshake with a certificate chain.

The attacker needs you to accept a public key that it owns so that you will continue the conversation with it, so it can’t simply hand you the certificate chain that belongs to example.com — that has a different public key, and the attacker can’t use it. Also, the attacker can’t give you a certificate chain that points to example.com and has the attacker’s public key — the CA should (in theory) refuse to sign the certificate, since the domain belongs to someone else. However... if the attacker gets a CA to sign a certificate for a site that it does have control of, then the attack works like this:

In the example, DNS is compromised, but an attacker could just as well proxy the request to another server and return the result from a different server. The key to any kind of check for server identity is that the check of the hostname must happen on the client end, and must be tied to the original request coming in. It must happen out of band, and cannot rely on any response from the server.

The Defense: Hostname Verification

In theory, hostname verification in HTTPS sounds simple enough. You call “https://example.com”, save off the “example.com” bit, and then check it against the X.509 certificate from the server. If the names don’t match, you terminate the connection.

So where do you look in the certificate? According to RFC 6125, hostname verification should be done against the certificate’s subjectAlternativeName’s dNSName field. In some legacy implementations, the check is done against the certificate’s commonName field, but commonName is deprecated and has been deprecated for quite a while now.

You generate a certificate with the right name by using keytool with the -ext flag to say the certificate has example.com as the DNS record in the subjectAltName field:

```
keytool -genkeypair \
  -keystore keystore.jks \
  -dname "CN=example.com, OU=Sun Java System Application Server, O=Sun Microsystems, L=Santa Clara, ST=California, C=US" \
  -keypass changeit \
  -storepass changeit \
  -keyalg RSA \
  -keysize 2048 \
  -alias example \
  -ext SAN=DNS:example.com \
  -validity 9999
```

And to view the certificate:

```
keytool -list -v -alias example -storepass changeit -keystore keystore.jks
```

Is it really that simple? Yes. HTTPS is very specific about verifying server identity. You make an HTTPS request, then you check that the certificate that comes back matches the hostname of the request. There are some bits added on about wildcards, but for the most part it’s not complicated. In fact (and this is part of the problem), you can say that HTTPS is defined by the hostname verification requirement for HTTP on top of TLS.
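In JSSE terms, a raw SSLEngine or SSLSocket does not do this check by itself; the client opts in with the endpoint identification algorithm. A minimal sketch (HttpsURLConnection already does this for you):

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLParameters;

public class HostnameVerificationSketch {
    public static SSLEngine clientEngine() throws Exception {
        SSLContext sslContext = SSLContext.getDefault();

        // Give the engine the peer's host and port so it knows what to verify.
        SSLEngine engine = sslContext.createSSLEngine("example.com", 443);
        engine.setUseClientMode(true);

        SSLParameters params = engine.getSSLParameters();
        // "HTTPS" turns on RFC 2818-style checks against the subjectAltName dNSName.
        params.setEndpointIdentificationAlgorithm("HTTPS");
        engine.setSSLParameters(params);
        return engine;
    }
}
```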
The reason why HTTPS exists as a distinct RFC apart from TLS i[...]



Fixing Certificate Revocation

Sat, 22 Mar 2014 15:17:00 -0700

This is the third in a series of posts about setting up Play WS as a TLS client for a “secure by default” setup and configuration through text files, along with the research and thinking behind the setup. (TL;DR — if you’re looking for a better revocation protocol, you may be happier reading Fixing Revocation for Web Browsers on the Internet and PKI: It’s Not Dead, Only Resting.) Previous posts are:

Fixing The Most Dangerous Code In The World (Protocols, Cipher Suites, Cert Stores)
Fixing X.509 Certificates (General PKI, Weak Signature and Key Algorithms)

This post is all about certificate revocation using OCSP and CRL, what it is, how useful it is, and how to configure it in JSSE.

Certificate Revocation (and its Discontents)

The previous post talked about X.509 certificates that had been compromised in some way. Compromised certificates can be a big problem, especially if those certificates have the ability to sign other certificates. If certificates have been broken or forged, then in theory it should be possible for a certificate authority to let a client know as soon as possible which certificates are invalid and should not be used.

There have been two attempts to do certificate revocation. The first is Certificate Revocation Lists (CRLs): lists of bad certificates, which turned out to be huge and hard to manage. As an answer to CRLs, the Online Certificate Status Protocol (OCSP) was invented. OCSP involves contacting the remote CA server and going through verification of the certificate there before the client will start talking to the server. According to Cloudflare, this can make TLS up to 33% slower. Part of it may be because OCSP responders are slow, but it’s clear that OCSP is not well loved. In fact, most browsers don’t even bother with OCSP. Adam Langley explains why OCSP is disabled in Chrome:

“While the benefits of online revocation checking are hard to find, the costs are clear: online revocation checks are slow and compromise privacy. The median time for a successful OCSP check is ~300ms and the mean is nearly a second. This delays page loading and discourages sites from using HTTPS. They are also a privacy concern because the CA learns the IP address of users and which sites they’re visiting. On this basis, we’re currently planning on disabling online revocation checks in a future version of Chrome. (There is a class of higher-security certificate, called an EV certificate, where we haven’t made a decision about what to do yet.)”
— “Revocation checking and Chrome’s CRL”

Adding insult to injury, OCSP also has security issues:

“Alas, there was a problem — and not just ‘the only value people are adding is republishing the data from the CA’. No, this concept doesn’t work at all, because OCSP assumes a CA never loses control of its keys. I repeat, the system in place to handle a CA losing its keys, assumes the CA never loses the keys.”
— Dan Kaminsky

And:

“OCSP is actually much, much worse than you describe. The status values are, as you point out, broken. Even if you fix that (as some CAs have proposed, after being surprised to find out how OCSP really worked — yes, some of the folks charged with running OCSP don’t actually know how it really works) it doesn’t help, given OCSP’s broken IDs an attacker can trivially work around this. And if you fix those, given the replay-attack-enabled ‘high-performance’ optimisation an attacker can work around that. And if you fix that, given that half the response is unauthenticated, an attacker can go for that. To paraphrase Lucky Green, OCSP is multiple-redundant broken, by design.”
If you remove the bits that don’t work (the res[...]
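Despite all this, JSSE can do revocation checking if you ask for it. A minimal sketch of the relevant knobs (these system and security properties exist in Oracle/OpenJDK, but treat the exact combination as an assumption to verify against your JDK version):

```java
import java.security.Security;

public class RevocationSettings {
    public static void main(String[] args) {
        // Ask PKIX certificate path validation to check revocation at all.
        System.setProperty("com.sun.net.ssl.checkRevocation", "true");

        // Enable OCSP checking during PKIX validation...
        Security.setProperty("ocsp.enable", "true");

        // ...and allow fallback to the CRL Distribution Points in the certificate.
        System.setProperty("com.sun.security.enableCRLDP", "true");
    }
}
```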



Fixing X.509 Certificates

Thu, 20 Mar 2014 09:32:00 -0700

This is a continuation in a series of posts about how to correctly configure a TLS client using JSSE, using The Most Dangerous Code in the World as a guide. This post is about X.509 certificates in TLS, and has some videos to show both what the vulnerabilities are, and how to fix them. I highly recommend the videos, as they do an excellent job of describing problems that TLS faces in general.

Also, JDK 1.8 just came out and has much better encryption. Now would be a good time to upgrade.

Table of Contents

Part One: we talk about how to correctly use and verify X.509 certificates.

What X.509 Certificates Do
Understanding Chain of Trust
Understanding Certificate Signature Forgery
Understanding Signature Public Key Cracking

Part Two: we discuss how to check X.509 certificates.

Validating a X.509 Certificate in JSSE
Validating Key Size and Signature Algorithm

What X.509 Certificates Do

The previous post talked about using secure ciphers and algorithms. This alone is enough to set up a secure connection, but there’s no guarantee that you are talking to the server that you think you are talking to. Without some means to verify the identity of a remote server, an attacker could still present itself as the remote server and then forward the secure connection onto the remote server. This is the problem that Netscape had.

As it turned out, another organization had come up with a solution. The ITU-T had some directory services that needed authentication, and set up a system of public key certificates in a format called X.509 in a binary encoding known as ASN.1 DER. That entire system was copied wholesale for use in SSL, and X.509 certificates became the way to verify the identity of a server.

The best way to think about public key certificates is as a passport system. Certificates are used to establish information about the bearer of that information in a way that is difficult to forge. This is why certificate verification is so important: accepting any certificate means that an attacker’s certificate will be blindly accepted.

X.509 certificates contain a public key (typically RSA based), and a digest algorithm (typically in the SHA-2 family, i.e. SHA512) which provides a cryptographic hash. Together these are known as the signature algorithm (i.e. “RSAWithSHA512”). One certificate can sign another by taking all the DER encoded bits of a new certificate (basically everything except “SignatureAlgorithm”) and passing it through the digest algorithm to create a cryptographic hash. That hash is then signed by the private key of the organization owning the issuing certificate, and the result is stuck onto the end of the new certificate in a new “SignatureValue” field. Because the issuer’s public key is available, and the hash could have only been generated by the certificate that was given as input, we can treat it as “signed” by the issuer. (A short code sketch of this check follows the field list below.)

So far, so good. Unfortunately, X.509 certificates are complex. Very few people understand (or agree on) the various fields that can be involved in X.509 certificates, and even fewer understand ASN.1 DER, the binary format that X.509 is encoded in (which has led to some interesting attacks on the format). So much of the original X.509 specification was vague that PKIX was created to nail down some of the extensions. Currently, these seem to be the important ones:

The “basic fields” that every certificate has.
subjectAltName, where ‘dNSName’ is the official hostname of the server.
basicConstraints, used to establish chain of trust.
keyUsage, used to define a CA certificate.

There are other fields i[...]
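As promised above, here is a minimal sketch of verifying one certificate against its issuer with the plain java.security.cert API (the file names are hypothetical):

```java
import java.io.FileInputStream;
import java.security.cert.CertificateFactory;
import java.security.cert.X509Certificate;

public class VerifySignature {
    public static void main(String[] args) throws Exception {
        CertificateFactory cf = CertificateFactory.getInstance("X.509");
        X509Certificate server =
            (X509Certificate) cf.generateCertificate(new FileInputStream("server.crt"));
        X509Certificate issuer =
            (X509Certificate) cf.generateCertificate(new FileInputStream("ca.crt"));

        // Recomputes the hash over the DER encoded bits and checks the
        // SignatureValue against the issuer's public key; throws on mismatch.
        server.verify(issuer.getPublicKey());
        System.out.println("signed by " + issuer.getSubjectX500Principal());
    }
}
```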



Monkeypatching Java Classes

Sun, 02 Mar 2014 14:56:00 -0800

So, remember that thing I said last post about adding some debug features to JSSE? Turns out that didn’t work. Sometimes debugging would work. Sometimes it wouldn’t. I couldn’t get it to work reliably.

The debugging feature in JSSE is enabled by setting -Djavax.net.debug=ssl on the command line. The class that reads from the system property is sun.security.ssl.Debug, and reading it explained everything:

```java
public class Debug {

    private String prefix;

    private static String args;

    static {
        args = java.security.AccessController.doPrivileged(
            new GetPropertyAction("javax.net.debug", ""));
        args = args.toLowerCase(Locale.ENGLISH);
        if (args.equals("help")) {
            Help();
        }
    }

    public static Debug getInstance(String option) {
        return getInstance(option, option);
    }

    public static Debug getInstance(String option, String prefix) {
        if (isOn(option)) {
            Debug d = new Debug();
            d.prefix = prefix;
            return d;
        } else {
            return null;
        }
    }

    public static boolean isOn(String option) {
        if (args == null) {
            return false;
        } else {
            int n = 0;
            option = option.toLowerCase(Locale.ENGLISH);
            if (args.indexOf("all") != -1) {
                return true;
            } else if ((n = args.indexOf("ssl")) != -1) {
                if (args.indexOf("sslctx", n) == -1) {
                    // don't enable data and plaintext options by default
                    if (!(option.equals("data")
                            || option.equals("packet")
                            || option.equals("plaintext"))) {
                        return true;
                    }
                }
            }
            return (args.indexOf(option) != -1);
        }
    }
}
```

This explained why I was seeing problems with my debug code. The args field of Debug was being set in a static initialization block, when the class was first loaded into the JVM. Due to some race conditions, my code could call System.setProperty("javax.net.debug", options) before Debug was loaded, but there was no way of ensuring that. And of course, args was private static, so there was no way to modify it from outside the class after that point.

But it was even worse than that. The static methods isOn and getInstance were used by the JSSE internal classes to determine whether debug information should be logged or not, and those fields were also marked private static final, as in sun.security.ssl.SSLContextImpl:

```java
public abstract class SSLContextImpl extends SSLContextSpi {

    private static final Debug debug = Debug.getInstance("ssl");

    private X509ExtendedKeyManager chooseKeyManager(KeyManager[] kms)
            throws KeyManagementException {
        for (int i = 0; kms != null && i < kms.length; i++) {
            if (debug != null && Debug.isOn("sslctx")) {
                System.out.println(
                    "X509KeyManager passed to " +
                    "SSLContext.init(): need an " +
                    "X509ExtendedKeyManager for SSLEngine use");
            }
            return new AbstractKeyManagerWrapper((X509KeyManager) km);
        }
        [...]
```



Fixing The Most Dangerous Code In The World

Mon, 13 Jan 2014 14:44:00 -0800

TL;DR Most non-browser HTTP clients do SSL / TLS wrong. Part of why clients do TLS wrong is because crypto libraries have unintuitive APIs. In this post, I’m going to write about my experience extending an HTTP client to configure Java’s Secure Socket library correctly, and what to look for when implementing your own client.

Introduction

I volunteered to implement a configurable TLS solution for Play’s web services client (aka WS). WS is a Scala based wrapper on top of AsyncHttpClient that provides asynchronous mechanisms like Future and Iteratee, and allows a developer to make GET and POST calls to a web service in just a couple of lines of code (see the sketch at the end of this post). However, WS did not contain any way to configure TLS. It was technically possible to configure TLS through the use of system properties (i.e. “javax.net.ssl.keyStore”), but that brought up more messiness — what if you wanted more than one keystore? What if you needed clients with different ciphers? Sadly, WS isn’t alone in this: most frameworks don’t provide configuration for the finer points of TLS.

Added to that was the awareness that SSL client libraries have been dubbed The Most Dangerous Code in the World (FAQ). The problem is real and serious, and I want to fix it. There is also a long and well known gulf between the security community and the developer community in the level of knowledge about TLS and the current state of the HTTPS certificate ecosystem. I want to fix that as well, and this blog post should be a good start. So. Here’s what I did.

Table of Contents

For the sake of readability (i.e. avoiding TL;DR), I’m breaking this across several blog posts. The pull request is on Github and you are invited to review the code and comment as you see fit. In this blog post, I’m just going to cover the setup.

First, the problems that make TLS necessary:

The First Problem: Programmers do not get security
The Second Problem: Wifi / Ethernet is not secure
The Third Problem: Man In the Middle
Digression: Mitigation and General Security
More Videos and Talks

Then, the implementation in WS:

The Use Cases for WS
Understanding TLS
Understanding JSSE
Configuring a client
Debugging a client
Choosing a protocol
Choosing a cipher suite

Future posts will discuss certificates in more detail, but this gives us somewhere to start.

Problems

The First Problem: Programmers do not get security

The first problem is the assumption that TLS is overkill, built by researchers to protect against an abstract threat. Unfortunately, this is not the case. TLS has real attacks against it, and they exist in the wild. Even worse, there are very serious, real world implications to breaking a TLS connection. Some people trust TLS in situations which could mean imprisonment or death.

But. Programmers work with bugs. Programmers get bugs. Programmers do not get security. Programmers understand how bad input can ruin a programmer’s day. Programmers understand how corrupt data can completely ruin any hope of a functioning program. Programmers know that working with concurrency is so dangerous that it should only be done with special concurrency primitives and rules. Human users may be incompetent, but they are mostly benevolent: the forces working against the programmer are entropy and loose requirements. Programmers don’t usually write programs that have to defend against an attacker. Most programmers have never even seen an attacker. Even the concept of a human deliberately trying to break or subvert a program is foreign. QA usuall[...]
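Back to the sketch promised above: here’s roughly what a WS call looks like from Scala, against the Play 2.2-era API, with a made-up URL:

import play.api.libs.ws.WS
import play.api.libs.concurrent.Execution.Implicits.defaultContext
import scala.concurrent.Future

object WsExample {
  // A GET call in a couple of lines; get() returns a Future of the response.
  def fetchStatus(): Future[Int] =
    WS.url("https://example.com/api/status").get().map(_.status)
}

Notice that nothing here says anything about which protocol or cipher suite the underlying connection will negotiate; that’s exactly the configuration gap this series is about.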



Building a Development Environment with Docker

Wed, 20 Nov 2013 22:22:00 -0800

TL;DR I’ve written a cheat sheet for Docker, and I have a github project for it. Here’s the thinking that went into why Docker, and how best to use it.

The problem

You want to build your own development environment from scratch, and you want it to be as close to a production environment as possible.

Solutions

Development environments usually just… evolve. There have been a bunch of tries at producing a consistent development environment, even between developers. Eventually, through trial and error, a common set of configuration files and install instructions turns into something that resembles a scaled down and testable version of the production environment, managed through version control and a set of bash scripts. But even when it gets to that point, it’s not over, because modern environments can involve dozens of different components, all with their own configuration, often communicating with each other through TCP/IP or, even worse, talking to a third party API like S3. To replicate the production environment, these lines of communication must be drawn — but they can’t be squashed into one single machine. Something has to give.

Solution #1: Shared dev environment

The first solution is to set up an environment with exactly the same machines in the same way as production, only scaled down for development. Then, everyone uses it. This works only if there is no conflict between developers, resource use and contention is not a problem, and no team ever wants to swap out one of those components. If you need to access the environment from outside the office, you’ll need a VPN. And if you’re on a flaky network or on a plane, you’re out of luck.

Solution #2: Virtual Machines

The second solution is to put as much of the environment as possible onto the developer’s laptop. Virtual machines such as VirtualBox will allow you to create an isolated dev environment. You can package VMs into boxes with Vagrant, and create fresh VMs from a template as needed. They each have their own IP address, and you can get them to share filesystems. However, VMs are not small. You can chew up gigabytes very easily providing the OS and packages for each VM, and those VMs do not share CPU or memory when running together. If you have a complex environment, you will reach a point where you either run out of disk space or memory, or you break down and start packaging multiple components inside a single VM, producing an environment which may not reflect production and is far more fragile and prone to complexity.

Solution #3: Docker

Docker solves the isolation problem. Docker provides (consistent, reproducible, disposable) containers that make components appear to be running on different machines, while sharing CPU and memory underneath, and provides TCP/IP forwarding and filesystems that can be shared between containers. So, here’s how you build a development environment in Docker.

Docker Best Practices

Build from Dockerfile

The only sane way to put together a dev environment in Docker is to use raw Dockerfiles and a private repository. Pull from the central docker registry only if you must, and keep everything local.

Chef recipes are slow

You might think to yourself, “self, I don’t feel like reinventing the wheel. Let’s just use chef recipes for everything.” The problem is that creating new containers is something that you’ll do lots. Every time you create a container, seconds will count, and minutes will be totally unacceptab[...]



Play in Practice

Sat, 20 Apr 2013 14:20:00 -0700

I gave a talk on Play in Practice at the SF Scala meetup recently. Thanks to Stackmob for hosting us and providing pizza. I went into describing how to implement CQRS in Play, but there was a fairly long question and answer section about Play as well. I couldn’t go into detail on some of the answers and missed some others, so I’ll fill in the details here.

Video: http://www.youtube.com/embed/s2GOZpzBwVE

Slides: http://www.slideshare.net/slideshow/embed_code/19312123

Core API

The core API is Action, which takes in a Request and returns a Result (see the sketch at the end of this post). The Request is immutable, but you can wrap it with extra information, which you’ll typically do with action composition. 2.1.1 introduced EssentialAction, which uses (RequestHeader => Iteratee[Array[Byte], Result]) instead of Action’s (Request => Result) and makes building Filters easier. Again, Play’s core is simple. About as simple as you can get.

Streaming

Streaming is handled by Iteratees, which can be a confusing topic for many people. There are good writeups here and here. lila is the best application to look at for streaming, especially for sockets and hubs. Having good streaming primitives is something that I didn’t get into that much in the talk, but is still vitally important to “real time web” stuff.

Filters

If you want to do anything that you’d consider part of a “servlet pipeline”, you use Filters, which are designed to work with streams. An example of a good Filter is one that automatically uncompresses a gzipped request body — here’s an example that uses an Enumeratee:

class GunzipFilter extends EssentialFilter {
  def apply(next: EssentialAction) = new EssentialAction {
    def apply(request: RequestHeader) = {
      if (request.headers.get("Content-Encoding").exists(_ == "gzip")) {
        Gzip.gunzip() &>> next(request)
      } else {
        next(request)
      }
    }
  }
}

Note that this only does uncompression: automatic streaming gzip compression of templates is not available “out of the box” in 2.1.2, but it should be available in Play 2.2.

Templating

Play comes packaged with its own template language, Twirl, but you’re not required to use it. There is an integration with Scalate that gives you Mustache, Jade, Scaml and SSP. There’s also an example project that shows how to integrate Play with Freemarker. One thing that Play doesn’t address directly is how to set up a structure for page layouts. Play provides you with index.scala.html and main.scala.html, but doesn’t provide you with any more structure than that. If you set up a header and footer and allow for subdirectories to use their own templates, you can minimize the amount of confusion in the views. There’s an example in RememberMe, and this is the approach that lila takes as well. Another thing is that Play’s default project template is intentionally minimal. If you use Backbone and HTML5 templates, then a custom giter8 template like mprihoda/play-scala may suit you better.

JSON

Play’s JSON API is very well done, and is a great way to pass data around without getting into the weeds or having to resort to XML. It goes ver[...]
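Here’s the sketch promised above: a minimal controller against the Play 2.1-era Scala API.

import play.api.mvc._

object HelloController extends Controller {
  // An Action is, at heart, a function from Request to Result.
  def index = Action { request =>
    Ok("Got request [" + request + "]")
  }
}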



Error Handling in Scala

Thu, 27 Dec 2012 15:38:00 -0800

The previous post was mostly about programming “in the small”, where the primary concern is making sure the body of code in the method does what it’s supposed to and doesn’t do anything else. This blog post is about what to do when code doesn’t work — how Scala signals failure and how to recover from it, based on some insightful discussions.

First, let’s define what we mean by failure.

Unexpected internal failure: the operation fails as the result of an unfulfilled expectation, such as a null pointer reference, violated assertions, or simply bad state.
Expected internal failure: the operation fails deliberately as a result of internal state, i.e. a blacklist or circuit breaker.
Expected external failure: the operation fails because it is told to process some raw input, and will fail if the raw input cannot be processed.
Unexpected external failure: the operation fails because a resource that the system depends on is not there: there’s a loose file handle, the database connection fails, or the network is down.

Java has one explicit construct for handling failure: Exception. There’s some difference of usage in Java throughout the years — IO and JDBC use checked exceptions throughout, while other APIs like org.w3c.dom rely on unchecked exceptions. According to Clean Code, the best practice is to use unchecked exceptions in preference to checked exceptions, but there’s still debate over whether unchecked exceptions are always appropriate.

Exceptions

Scala makes “checked vs unchecked” very simple: it doesn’t have checked exceptions. All exceptions are unchecked in Scala, even SQLException and IOException. The way you catch an exception in Scala is by defining a PartialFunction on it:

val input = new BufferedReader(new FileReader(file))
try {
  try {
    for (line <- Iterator.continually(input.readLine()).takeWhile(_ != null)) {
      Console.println(line)
    }
  } finally {
    input.close()
  }
} catch {
  case e: IOException => errorHandler(e)
}

Or you can use control.Exception, which provides some interesting building blocks. The docs say it “focuses on composing exception handlers”, which means that this set of classes supplies most of the logic you would put into a catch or finally block:

Exception.handling(classOf[RuntimeException], classOf[IOException]) by println apply {
  throw new IOException("foo")
}

Using the control.Exception methods is fun, and you can string together exception handling logic to create automatic resource management (see the sketch at the end of this post) or an automated exception logger. On the other hand, it’s full of sharp things like allCatch. Leave it alone unless you really need it.

Another important caveat is to make sure that you are catching the exceptions that you think you’re catching. A common mistake (mentioned in Effective Scala) is to use a default case in the partial function:

try {
  operation()
} catch {
  case e => errorHandler(e)
}

This will catch absolutely everything, including OutOfMemoryError and other errors that would normally terminate the JVM. If you want to catch “everything” that would normally happen, then use NonFatal:

import scala.util.control.NonFatal

try {
  operation()
} catch {
  case NonFatal(exc) => errorHandler(exc)
}

Exceptions don’t get mentioned very much in Scala, but they’re still the bedrock for dealing with unexpected failure. For unexpected in[...]
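And the sketch promised above: automatic resource management with control.Exception’s ultimately, which behaves like a finally block (the file-reading example is hypothetical).

import java.io.{BufferedReader, FileReader}
import scala.util.control.Exception.ultimately

object FirstLine {
  // close() is guaranteed to run whether or not readLine() throws.
  def firstLine(path: String): Option[String] = {
    val input = new BufferedReader(new FileReader(path))
    ultimately(input.close()) {
      Option(input.readLine())
    }
  }
}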



Problems Scala Fixes

Sun, 16 Dec 2012 14:47:00 -0800

When I tell people I write code in Scala, a typical question is well, why? When it comes to writing code, most of my work is straightforward: SQL database on the backend, some architectural glue, CRUD, some exception handling, transaction handlers and an HTML or JSON front end. The tools have changed, but the problems are usually the same: you could get a website up in 5 minutes with Rails or Dropwizard. So why pick Scala?

It’s a tough question to answer off the bat. If I point to the language features, it doesn’t get the experience across. It’s like explaining why I like English by reading from a grammar book. I don’t like Scala because of its functional aspects or its higher kinded type system. I like Scala because it solves practical, real world problems for me. You can think of Scala as Java with all the rough edges filed off, with new features that make it easier to write correct code and harder to create bugs. Scala is not a purist’s language — it goes out of its way to make it easy for Java programmers to dip their toes in the pool. You can literally take your Java code and hit a key to create working Scala code. So what problems does Scala solve? Let’s start with the single biggest problem in programming, the design flaw that’s caused more errors than anything else combined. Null references.

Solving for Null

Scala avoids null pointer references by providing a special type called Option. Methods that return Option[A] (where A is the type that you want, e.g. Option[String]) will give you an object that is either a wrapper object called Some around your type, or None. There are a number of different ways you can use Option, but I’ll just mention the ones I use most.

You can chain Options together in Scala using for comprehensions:

for {
  foo <- request.params("foo")
  bar <- request.params("bar")
} yield myService.process(foo, bar)

or through a map:

request.params("foo").map { foo =>
  logger.debug(foo)
}

or through pattern matching:

request.params("foo") match {
  case Some(foo) => logger.debug(foo)
  case None => logger.debug("no foo :-(")
}

Not only is this easy, but it’s also safer. You can flirt with NPE by calling myOption.get, but if you do that, you deserve what you get. Not having to deal with NPE is a pleasure.

Right Type in the Right Place

What’s the second biggest problem in programming? It’s a huge issue in security and in proving program correctness: invalid, unchecked input. Take the humble String. The work of manipulating strings is one of the biggest hairballs in programming — they’re pulled in from the environment or embedded in the code itself, and then programs try to figure out how best to deal with them. In one case, a string is displayed to the user and it’s done. In another case, an SQL query is embedded as a query parameter on a web page and passed straight through to the database. To the compiler, they’re just strings and there is no difference between them. But there are some types of strings that are suitable to pass to databases, and some which are not. Ideally, we’d like to tell the compiler that SQL and query parameters have different types. Scala makes this easy (see the sketch at the end of this post). With the Shapeless library, you can add distinguishing type information to objects and ensure that you can’t pass random input in:

import[...]
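And the sketch promised above. The post itself uses Shapeless for this, but the basic idea works with nothing more than plain Scala 2.10 value classes (hypothetical names, and not what the Shapeless approach looks like):

// Zero-overhead wrapper types: the compiler tracks the distinction,
// but at runtime these are mostly just Strings.
case class SqlFragment(underlying: String) extends AnyVal
case class UserInput(underlying: String) extends AnyVal

object Queries {
  def run(sql: SqlFragment, param: UserInput): Unit = {
    // run(UserInput("..."), SqlFragment("...")) no longer compiles:
    // the two kinds of string can't be mixed up by accident.
  }
}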



Remember Me Cookies for Play 2.0

Sat, 07 Jul 2012 15:19:00 -0700

I’ve been working with Play 2.0 for a while now, and in many ways it’s the ideal web framework for me: it’s a light framework that gets a request, puts together a result (either at once or in chunks, using an iteratee pattern), and provides some HTML templates and form processors for ease of use. It lets you change code and templates while the server is running, and gives you an asset pipeline for compiling LESS and CoffeeScript into minified CSS and Javascript out of the box.

That being said, it’s a new web framework, and the biggest issue right now is all the boring infrastructure that goes on top of it to make a framework deal with authentication, authorization, and even boring things like resetting a password.

On the Java side, Yvonnick Esnault has a good starter application (disclaimer: I contributed some code), or you can use Play Authenticate.

On the Scala side, play20-auth is a good starting point for an authentication system. However, it didn’t do token based authentication, aka “Remember Me” cookies. Adding this feature turns out to be tricky if you’re new to Scala, because extending the request pipeline in Play 2.0 Scala requires that you know a functional style of programming called “action composition”.
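For a flavor of what action composition looks like, here’s a minimal, hypothetical composed action in the Play 2.0 Scala style. The cookie name and the token check are made up; a real implementation needs signed, expiring, single-use tokens.

import play.api.mvc._

object Remembered {
  // Wraps an ordinary action body with a remember-me cookie check.
  def apply(f: Request[AnyContent] => Result): Action[AnyContent] =
    Action { request =>
      request.cookies.get("rememberme") match {
        case Some(token) => f(request) // validate the token, then proceed
        case None => Results.Unauthorized("no remember-me token")
      }
    }
}

// Usage: the wrapped block only runs if the cookie is present.
// def index = Remembered { request => Results.Ok("welcome back") }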

So here’s a boilerplate project play20-rememberme that does authentication with remember me functionality (although it doesn’t have the password reset or confirm features added to Play20StartApp).

UPDATE: Now works with Play 2.1.




The User Illusion

Sat, 26 May 2012 15:51:00 -0700

Slides from the May 17th Five Minutes of Fame. This one is on consciousness and a book called The User Illusion. I read the book and thought it interesting, but it was only after reading Blindsight and following Peter Watts’s blog that it clicked as something that happened outside of science experiments. That, and I’d really been hankering for a good science talk. There’s nothing quite like science; even when it comes to something as wishy-washy as consciousness, it can still give you surprising answers.

And video! http://www.youtube.com/embed/TYXKN0DSOdg

The short version of the slides:

When you see, you see what’s already been processed and filtered. Illusions are when the system doesn’t work; you don’t see when it does work. In other words, we see “car accident” as presented to our consciousness — we don’t consciously put it together from our visual input. I’ll spare you the customary link to the “You wouldn’t know if a Gorilla showed up” study, but it’s fairly clear the brain only passes on the Cliff Notes version to the executive layer.

Consciousness lags well behind. When scientists measure the movement of a finger, the electrical potential rises a full second before the finger moves. But we report making the decision to move half a second before the finger moves (Libet). We become aware of making the decision after it’s already happened. Some scientists conjecture that consciousness may simply be unnecessary (Rosenthal). Others think that consciousness may be a result of conflicting subconscious systems (Morsella, Hofstadter, Metzinger), and the Watts commentary points out that consciousness seems to be strongly associated with inner conflict and/or pain, although I’m not spoiling his punchline.

Despite what Heph says, I don’t think the talk is depressing. When you think about consciousness, you assume that it’s a good thing, but realistically we’re far happier and more productive in flow, without that nagging voice inside our heads. Rather than life being suffering, suffering itself is the act of consciousness.

The talk itself went down well, with the coveted seal of approval. The 5MoF itself was surprisingly wide ranging — Eclair Bandersnatch showed up in a barbie mask and wig to talk about art, Danny O’Brien gave a talk on The Cosmopolitan Anarchist and recapped the news on Byron Sonne, Liz Henry read poetry from her new book, and Josh Juran presented FORGE, a GUI based on manipulating what appeared to be symbolic files on the filesystem — a programming paradigm that apparently came from Plan 9 and hurts my brain every time I think about it.

We’re doing the same thing next month, and I’ll probably be talking about Transcranial Direct Current Stimulation (if I’m not, y’know, drooling in a corner). So! If you have a thought that’s been burning a hole in some mental sidepocket, you should sign up. [...]



Systemantics

Sat, 17 Mar 2012 12:29:00 -0700

New Five Minutes of Fame presentation. This one’s a presentation about a little known book called Systemantics (a.k.a. The Systems Bible).


This is a hard book to get hold of, but a worthwhile one. Systemantics doesn’t make a lot of sense without the context of Systems Theory, which is responsible for the word “cybernetics” and a whole bunch else, mostly talking about systems in the context of the complex feedback loop of a nuclear power plant.

Systemantics is a little bit different: it talks about the feedback loop involved in organizations, and how the system has an independent life (and will to live) outside of any of its participants. It’s a book about how systems actually behave, and how what an observer may consider to be a bug looks like appropriate behavior to the system. It’s about the system as you know it at 2 am, the system complete unto itself in all its ineffable complexity.

That being said, much of it is applicable to complex computer systems as well — in fact I’d say that Systems Theory is far more applicable to my day job than most CS Theory is, and if there’s ever going to be a Software Engineering curriculum then I’d want it to include this book.




Interviews without Puzzles

Wed, 22 Feb 2012 19:12:00 -0800

Technical interviews have their own particular lore, and their own history. Over the years, some interview practices have sunk into the group subconscious of engineers, to the point where they’re used so commonly we don’t even question why. Puzzles, for example.

There are a few famous interview puzzles out there. Microsoft has “Why are Manhole Covers Round?” Google has “You are shrunk to the height of a nickel and your mass is proportionally reduced so as to maintain your original density. You are then thrown into an empty glass blender. The blades will start moving in 60 seconds. What do you do?” The standard answer to the first puzzle is “So they don’t fall in.” The answer to the second is that, assuming you have the same proportionate strength, you can jump out. (There’s an inverse square root effect between muscle and body length, so density is not a factor.)

But here’s the thing. The first answer is incorrect. Manhole covers are round because they can be round. They could be square or triangular just as easily. So as an interviewer, if you pick random interview puzzles out of a book and you think you know the “right answer” to the puzzle, you run the chance of not hiring someone because the answer he gave was actually the correct one.

A more common problem is that a clever interview puzzle is usually well-known. As soon as you figure it out, you’ll tell all your friends, and they tell their friends. Eventually, you can google for the answer. Back in the day, Zen schools had the same problem. They were initially looking for insight and flashes of realization, and wanted a way to test for this. Some people thought up excellent questions that could test the subtle understanding of self, reality and perception required of Zen students. More and more, the koans were used as the standard by which students’ understanding could be measured. Eventually, someone had the bright idea to put together a book of koans — complete with “acceptable responses” — and through years of formalization, the age old question “What is the sound of one hand clapping” turned into a meaningless ritual.

Even if you don’t use well known puzzles, interviewing with logic puzzles in the long run optimizes for them. Through discussion, shared experiences and research, people will generally know that they should study for a general class of logic puzzle. And the company will start getting more people that do well at those puzzles… but that doesn’t mean they know programming any better.

Fermi questions, those “How many elevators are in New York City?” questions that are popular in interviews, have a clearer structural weakness. They have no right answer at all, and can be completely circumvented with the right training. Once you know the rules, you can come up with completely the wrong answer and still be “correct” according to the law of the game. I also have a philosophical problem with Fermi questions. Yes, they’re pointless, but it’s not just that they’re pointless. Asking a Fermi question says that you don’t really care what the answer is. Asking a Fermi question tells your candidate that you want them to guess. Engineers are trained out of guessing. Engineers are train[...]



Failing with Passwords

Fri, 17 Feb 2012 11:08:00 -0800

Did a talk about implementing password security right last night at Five Minutes of Fame.

Slides: https://docs.google.com/presentation/embed?id=1PMGgO_bjMhPaCdE5MrcF-6lwuLY1lNvgsS-dx62flKY

If you don’t want to go through the slides, here’s the TL;DR version:

TL;DR User Security

Use a password manager like LastPass or 1Password (with Dropbox) and use their password generation.
If no manager is available (routers, OS logins, etc), use pass phrases with non-English words or acronyms (see xkcd).
Assume sites get compromised all the time and you never hear about it. NEVER reuse a password.
If you’re at a coffee shop or hackerspace, use a public VPN service.
OAuth / Twitter / Facebook based authentication is putting your auth credentials in their hands.

TL;DR Encryption Security

Use bcrypt. Bounce up the factor every few years.
Do not limit password field length. (bcrypt takes up to 55 bytes of input.)
Run a JS password tester to reject weak passwords.
Run a password cracker regularly to test your security.
Suggest to your users that they use passphrases with acronyms, punctuation or LOLspeak.
Generate random passwords for your users.
Consider removing password masking.

TL;DR Operational Security

Use HTTPS for both rendering and submitting the login page.
Show the Cain and Abel video to everyone you work with.
Use HSTS headers with HTTPS.
Use a synchronizer token to prevent CSRF attacks (or use a decent web framework).
Use a captcha / throttle on password attempts.
Use double validation for registering accounts (register sends email, clicking the email link heads back to the site).
Use one-time-use password reset links.
Send email notifications on password change attempts.

Extra Credit

Add honeypot logins.
Use login token IDs with hidden check bits and math invariants that indicate tampering.
Implement a secret in the session management system to keep state on the client and verify it on server interaction for better session authentication.

OWASP also has cheat sheets which look useful if you’re putting a site together. It still disturbs me how freaking MANUAL so much of this is, but I suppose web frameworks can’t do everything for you. There are some options if you’re on Rails.

It was a surprisingly tough talk to give. At first I was like, “lol, look at all the companies with crappy security”, but it’s a murky field in general. For example, the XKCD cartoon about passphrases misses the problem that most people type passphrases in standard English, and only use about two thousand words in general conversation. It may look like there’s more entropy generated, but if your attackers know that your customers use passphrases, you may have just made their jobs much easier. Also, brute force cracking is surprisingly effective. MD5 and the SHA-* algorithms are inappropriate because GPUs chew through them very quickly, and even bcrypt isn’t untouchable: newer FPGA chips can do a reasonable implementation of bcrypt in hardware. It’s an issue that computers are fast, but a bigger problem is that they just keep getting faster. The biggest thing has to be to not let your users pick crappy passwords. Even if you have bcrypt with all the factors, if your users are[...]
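To make the “use bcrypt” advice concrete, here’s a sketch in Scala using the jBCrypt library (an assumption on my part; any maintained bcrypt binding works the same way):

import org.mindrot.jbcrypt.BCrypt

object Passwords {
  // Work factor 12 today; bounce it up every few years as hardware gets faster.
  def hash(plaintext: String): String =
    BCrypt.hashpw(plaintext, BCrypt.gensalt(12))

  // checkpw re-derives the hash using the salt and cost embedded in `stored`.
  def verify(candidate: String, stored: String): Boolean =
    BCrypt.checkpw(candidate, stored)
}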



Heuristics in Mate Search

Tue, 17 Jan 2012 15:00:00 -0800

Five Minutes of Fame talk about dating heuristics. This went much better than expected because the pictures and subject matter helped balance out the math.

src="https://docs.google.com/presentation/embed?id=1vywPmRpHKE6QwjaLvyXKBINzOAbZ1fiZZ3Bxpw5LFYQ&start=false&loop=false&delayms=3000" frameborder="0" width="529" height="426" allowfullscreen="true" webkitallowfullscreen="true">

Although there were a number of people afterwards who were like “too unrealistic” and I was like “yeah, this works better for interviews and college placement but whatchagonnado.”




Five Minutes of Web Frameworks

Fri, 18 Nov 2011 15:00:00 -0800

New 5MOF presentation, wherein I talk about why web applications are complicated. In five minutes.

I really need to write this up as an essay, as I think presentation objects vs domain objects are a much bigger deal than we realize.

src="https://docs.google.com/presentation/embed?id=1GWA93mRUbzX2x-uC5CluojrFsoMu4Awha5OgFflvhkc&start=false&loop=false&delayms=3000" frameborder="0" width="529" height="426" allowfullscreen="true" webkitallowfullscreen="true">



How We Make Decisions

Fri, 21 Oct 2011 14:04:00 -0700

Five Minutes of Fame presentation on how we make decisions. This one was a lot more dry and technical, but it was a nice change of pace after bronies melted my brain.

src="https://docs.google.com/presentation/embed?id=1pNN9NHaAMCuCKyxfMVpe58vIgcSwOT_gpt4FP1r-nFU&start=false&loop=false&delayms=3000" frameborder="0" width="529" height="426" allowfullscreen="true" webkitallowfullscreen="true">

Part of what makes this so fascinating to me is that you can actually see the algorithm that tells us “hey, we should do more of this.” That’s a huge step in knowing our blind spots.




Happiness Lecture at Noisebridge

Fri, 15 Jul 2011 15:11:00 -0700

Ty put up a video of me presenting Happiness @ Noisebridge.

It has kittens.

src="http://www.youtube.com/embed/MCjiCvG1DuY" frameborder="0" width="529" height="426" allowfullscreen="true" webkitallowfullscreen="true">

Updated: Now with slides!

src="https://docs.google.com/presentation/embed?id=16IiPcJKEoDi-_0d3IMcC7MR_Vx-092NZA3Ep5igoF1E&start=false&loop=false&delayms=3000" frameborder="0" width="529" height="426" allowfullscreen="true" mozallowfullscreen="true" webkitallowfullscreen="true">



The Core of Agile

Sun, 12 Jun 2011 00:00:00 -0700

I've been thinking lately about Agile. Again. The first thing I've been thinking about is the people who say "You're doing Agile Wrong." There's always been a dichotomy for me between the theory of Agile and the practice. It's a common problem with any dream; it's always cleaner, brighter, simpler, better than the reality. Reality is messy. It is imprecise. It is never seen directly, always filtered through recollections to make each participant the protagonist of their own play.

If you try pair programming, then you're going to find that "you should never pair program 100% of the time." Or that "you should only pair program between people with equal skill sets." Or that "you should practice pair programming ping pong". There will always be a special case. There will always be something that works for you that doesn't work for someone else. There will always be something that doesn't work for you that works for someone else. Not only do we do Agile "wrong", but we will always do Agile "wrong". We won't ever do anything "right" – we will do imperfect jobs, come home to imperfect relationships, have imperfect children and live imperfect lives. This is what happens when you measure yourself against an ideal.

But why believe in Agile then, if the only way you can do it is wrong? Something came to mind about that statement. If you're trying to do Agile, and it's not working for you… then you're doing it wrong. Another way of phrasing that statement is that Agile is Doing It Right. In fact, almost by definition, Agile is Doing It Right.

"Agile teams produce a continuous stream of value, at a sustainable pace, while adapting to the changing needs of business." – Elizabeth Hendrickson.

"Agile development uses feedback to make constant adjustments in a highly collaborative environment." – Practices of an Agile Developer.

"Agile has no definition. […] There's no standards board, there's no test, there's no approved workbook, there's no checklist. […] It's based on three things: 1) principles not practices, 2) attention to people, and 3) always be adapting." – Daniel Markham.

Three definitions of Agile. Nothing about practices, or even methodology. What they agree on is a feedback cycle that can respond to changing input and produce useful output. It's Dorner's model of problem solving. Or Deming's PDCA cycle. Or the military's OODA cycle. Or the Scientific Method. Or Kaizen. It's continuous process improvement, in all its forms.

If you're following a "best practice" and that "best practice" isn't working for you, then it's not a case of "You're Doing Agile Wrong." You're doing something that isn't providing a benefit for you. By following that "best practice", yo[...]



The Logic of Failure

Fri, 10 Jun 2011 00:00:00 -0700

Another talk, this time on The Logic of Failure: Recognizing and Avoiding Error in Complex Situations, a book by Dietrich Dorner. The book’s been a favorite of mine for years, not just for the set up, but for the detailed, unsparing look it provides on how human beings fail to get things right. Too often in psychology, there’s an emphasis on either seeing how people feel about a situation, or how well or how poorly they perform at a given task. Dorner goes further, and tries to understand not just how, but why they fail.

The Slides

Slides: https://docs.google.com/present/embed?id=dcxrsgwk_137cncdckf4

The Setup

The setup was simple. Dorner set up a computer simulation of an African village called Tanaland. This book was written in 1990, and so Sim City was not widely known, but it’s the same concept. The players were given dictatorial powers, given the goal to “improve the wellbeing of the people”, and had six opportunities over 10 years to review (and possibly change) their policies.

The Experiment

Given the tools the players had at hand, they went to work improving what they could. They improved the food supply (using artificial fertilizer) and increased medical care. There were more children and fewer deaths, and life expectancy was higher. For the first three sessions, everything went well. But unknown to the players, they’d set up an unsustainable situation. Famine typically broke out in the 88th month. The agrarian population dropped dramatically, below what it had been initially. Sheep, goats and cows died off in their herds, and the land was left barren by the end. Given a free hand, most players engineered a wasteland. One player, by the end of the simulation, had a stable population and significantly better quality of life for the villagers. Failure was the rule, but somehow he had found an exception.

The Breakdown

The litany of possible errors was a long one, and so immediately recognisable that it's hard to suppress a wince of empathy on reading. The players who did badly tended not to ask "why" things happened. They tended to jump from one subject to another, switching aimlessly, without focus. They proposed hypotheses without testing them. If they did test their hypotheses, they did so on an ad hoc basis, testing success cases without testing possible failure cases. In some cases they had tunnel vision: focussing on irrelevancies at the expense of the larger picture. In other cases, they attempted to "delegate" intractable problems to the villagers themselves or refused to deal with the issue at all. Finally, and most tellingly, most players dealt with the problems that they saw "on the spot" without thinking of the larger, longer term problems that they were setting up with that immediate short term solution.

These results were not a surprise. They were just what Dorner's team was looking for. Where many scientists would have looked at the succes[...]



Life in Fifty Years

Sat, 23 Apr 2011 00:00:00 -0700

Another presentation at Five Minutes of Fame. Christina took video of the talk, so I might be able to put that up as well.

Slides: https://docs.google.com/present/embed?id=dcxrsgwk_136gnmhrjcq

This one took me a while to go through. I picked up a number of books on the subject when Borders shut down, and the talk gave me the impetus to crank through them. The most relevant books were Plan B 4.0, The Ecotechnic Future, and Hot: Living Through The Next Fifty Years. I also recommend The Windup Girl or World Made By Hand for a much better sense of what the future may feel like.

Part of the reason I went through the books was to see how likely the Singularity is. As far as I can make out, it's very dependent on how much free energy is available in fifty years. With a strong global economy, all the parts and dependencies sorted, and the sheer power requirements it would take to jam a human-equivalent neural network down through silicon pathways, it's technically possible to have AI. But then you've still got the supply chain management to work through before you can develop more advanced AI, and the overall "cost friction" means that, in practice, AI is only going to be as intelligent as is economically reasonable. Fundamentally, silicon AI is a luxury for the rich. As such, it doesn't show up much in the slides, but I didn't have time to do it justice. (This doesn't go into the biological AI depicted in Starfish, but that's another talk for another time.)

The reaction from Noisebridge wasn't quite what I'd hoped for with this one: most people found it really depressing. I was a bit surprised, because I had made a concerted effort to scale down some of the more apocalyptic predictions and pointed out that the US makes out way better than most other countries (for a pretty simple reason: the countries that get hit the worst are the poorest). Still, it was worth it to do the research, and I got a couple of compliments and a fun conversation afterwards. The future is a large and complicated subject, so I'm going to be going back to the presentation and filling out bits as I find out more.

EDIT: The videos of Hot: The Next 50 Years and A Really Inconvenient Truth are worth watching as well. [...]



Best Practices Crypto Algorithms

Fri, 08 Oct 2010 00:00:00 -0700

I've always wanted just a nice simple list of what crypto algorithms go where. The cryptographic right answers go a long way to answering the question, but are very long.

Updated for 2016: In general, you should use libsodium, or a language wrapper on top of it.




Simplest Possible Acceptance Test

Tue, 05 Oct 2010 00:00:00 -0700

If you are fed up with walking through the same flow over and over again and want to have some code drive the browser around (what is sometimes known as acceptance, system, or end-to-end testing) and don't know quite how: this is for you.

I remember when I first wanted to automate my tests. It was painful. Not only did I have to read through lots and lots of documentation, but so much of it was out of date, or badly written, or just incomplete. So let me say up front: I've read through a bunch of different options. The best option right now (October 2010) is Selenium Webdriver. There are a variety of options for setting up Webdriver. If you want to do it using raw Java, then the 5 minute guide is your best option. The way I do it normally is to use Capybara and Steak, and tie them together with rspec – I find writing tests easier this way, both because Capybara tries very hard to make the API intuitive, and because Ruby tends to be very concise. What this looks like in practice is:

# spec/acceptance/checkout_spec.rb
require File.dirname(__FILE__) + '/acceptance_helper'

# You may find the following useful:
#
#   http://rspec.info/documentation/
#   http://jeffkreeftmeijer.com/2010/steak-because-cucumber-is-for-vegetarians/
#   http://richardconroy.blogspot.com/2010/08/capybara-reference.html
#
feature "Feature name", %q{
  In order to check out
  As a user
  I want to submit an order
} do

  scenario "Checkout" do
    email = Time.now.to_i.to_s + '@example.com'
    visit "/orders"
    within("//form[@id='commitOrderForm']") do
      fill_in "login-email", :with => email
      fill_in "login-pass", :with => 'password1'
      fill_in "login-pass-confirm", :with => 'password1'
      find("//input[@type='submit']").click
    end
    page.should have_content("Your order is complete")
  end
end

There is some stuff in the acceptance_helper.rb you have to worry about:

# spec/acceptance/acceptance_helper.rb
require File.dirname(__FILE__) + "/../spec_helper"
require "steak"
require 'capybara'
require 'capybara/dsl'

Capybara.default_driver = :selenium         # Uses webdriver by default.
Capybara.app_host = 'http://localhost:8080' # used for relative URLs.
Capybara.run_server = false                 # turn off the internal rack server.
Capybara.default_wait_time = 5

Spec::Runner.configure do |config|
  config.include Capybara
end

# Put your acceptance spec helpers inside /spec/acceptance/support
Dir["#{File.dirname(__FILE__)}/support/**/*.rb"].each {|f| require f}

And then there's the spec helper itself:

# spec/spec_helper.rb
require 'spec'

# Requires supporting files with custom matchers and macros, etc,
# in ./support/ and its subdirectories.
Dir[File.expand_path(File.join(File.dirname(__FILE__),'support','**','*.rb'))].each {|f| require f}

Spec::Runner.configure do |config|
  en[...]



Information Radiator with Toodledo

Mon, 04 Oct 2010 00:00:00 -0700

In something that has been nearly two years in the making, I wrote a Toodledo gem for Ruby, bought a USB Betabrite machine, and finally got the two talking to one another.

Here is what goes into Toodledo: http://twitpic.com/2uh58x

Here is what comes out of the Betabrite: http://twitpic.com/2uh5ft

It’s based on a Thoughtworks project called Radiator (which, in turn, uses some of Aaron Patterson’s code, the inspiration for this): I forked the code and then pounded on it until it did what I wanted. The fork is here: http://github.com/wsargent/radiator and I’ve tried to include as clear step-by-step instructions as I could. I’ve also tried to write it so I can add other plugins (twitter, itunes, bugs, build failures) without changing the core rendering server.

As a management tool, it is surprisingly effective and unobtrusive. If I’m working on something and it scrolls by, it’s a good feeling. If I’m doing something else and it scrolls by, then it’s a context shift rather than an alarm — I can choose to do something else, but I am at least reminded of it. And if I’m in a space where I can easily do it, then it’s a win.




Why you should be using PostgreSQL with Rails

Thu, 30 Sep 2010 00:00:00 -0700

Rails has a problem. The standard way of dealing with scale in Rails is to scale horizontally: add more servers. This is commonly called “shared nothing”, although it’s really more “shared database” — the usual pattern is that no state is kept in individual application servers. That’s not the case in a stateful application that uses the database directly. As soon as you have several servers running in parallel against the database, you are essentially running concurrently. There are many subtle problems in dealing with concurrency, but it turns out that there are some very nasty bugs that only show up with multiple servers: depending on the libraries you’re using, your collections can get out of sync. Xavier Shay has done an excellent job of documenting this, so I’ll wait while you go read the following links:

Acts As State Machine is not concurrent
Acts As list will break in production
Massively corrupted nested set using awesome_nested_set

You have some choices when you have concurrency and you need to maintain state immediately. You can use a central lock server. Or you can use the locking that comes with the database, in the form of transactions or optimistic locking. And, in fact, the “quick fixes” for using acts as list or acts as state machine involve wrapping it in a SERIALIZABLE transaction (see the sketch at the end of this post).

So. Why don’t Rails developers use transactions? More broadly, why the aversion to databases in general? Why do Rails developers tend to treat the database as a raw, dumb datastore and shun even simple database constraints? Frankly, I think the reason is MySQL. From what I can tell, MySQL got where it was mostly by having better documentation than PostgreSQL when it really mattered, working on Windows with IIS, and better marketing. It came in by default with Perl and PHP websites, and never left. For a long time, MySQL didn’t support transactions, let alone subselects — and since it was the first database that most developers had been introduced to, they didn’t know it worked any other way.

Now MySQL has transactions. Sort of. Most of the time, they work. But sometimes, they deadlock. To a developer, it can seem like deadlock can happen for pretty much any reason at all. A single insert into an InnoDB table can cause a deadlock, because InnoDB uses row level locks internally. Foreign key references use row level locks internally. You can also get silent deadlocks — you’ll just get lock wait timeouts when you have row level locking on an InnoDB table contending with a table-level-locked MyISAM table, and it won’t trigger the InnoDB deadlock detection. The documentation mentions that you should be prepared to reissue a transaction if it fails: “Always be prepared to re-issue a transaction if it fails due to deadlock. Deadlocks are not dangerous. Just try again.” This is not something you can do when you’ve submitted a credit card authorization to the payment gateway. But MySQL doesn’[...]
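The sketch promised above: stripped of the ORM, serializable isolation looks like this in raw JDBC (Scala, with a hypothetical connection URL; ActiveRecord exposes the same thing through its transaction options):

import java.sql.{Connection, DriverManager, SQLException}

object SerializableTx {
  def withSerializable[A](url: String)(work: Connection => A): A = {
    val conn = DriverManager.getConnection(url)
    try {
      conn.setAutoCommit(false)
      conn.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE)
      val result = work(conn) // reads and writes that must not interleave
      conn.commit()
      result
    } catch {
      case e: SQLException =>
        // PostgreSQL may abort one of two conflicting serializable
        // transactions; the caller is expected to retry.
        conn.rollback()
        throw e
    } finally {
      conn.close()
    }
  }
}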



What I Believe About Writing Software

Tue, 28 Sep 2010 00:00:00 -0700

So. I left ATG consulting so I could get a better understanding of what different environments are like. I’ve worked at a variety of startups, and written some Ruby, some Scala, even some Flex and Actionscript, and done a fair amount of fixing, writing and rewriting different systems and architectures. The good news is that all in all, I don’t suck. I’m very much of the “Domain Driven Design” school of development: I think that the domain and the concerns of the business team are a (if not THE) priority and I believe that it’s possible to write code in such a way that makes it hard to write bugs. I’ve tried different styles, and while the style of code I prefer has been charitably described as “verbose and up front,” I think it is clear and leaves little room for ambiguity. If there are bugs, it’s not a mystery where they are – I don’t like metaprogramming and I don’t like using any more tricks than I have to, so I can live with the occasional verbosity. In fact, some of the code is actually better than I remember. It’s a nice feeling to read code and be impressed by the writer… and then realize the writer was you. It turns out that the biggest problem I have is not a technical problem. It’s that I believe different things than other people. Some people believe different things than me, and are surprised and concerned when they find that I have a different background than them and don’t believe the same things they do. I’ve been told that some of the things on the list are “not agile” (not XP, technically) and that “you aren’t going to need this.” I don’t believe XP prescribes or proscribes any programming technique: I don’t believe that YouAintGonnaNeedIt is an automatic veto, just as TheSimplestThingThatCouldPossiblyWork isn’t a green light to do what you want: I’ve heard “simplest thing” used as a justification for everything from never using transactions to checking in over a gigabyte of binaries into a project (including both Linux and MacOS binaries of Apache and MySQL) to “simplify” the developer’s build environment. So. Here’s what I believe. I believe that loosely coupled, encapsulated systems are the way to go, for many reasons. I believe that they are easier to mock, easier to debug, and easier to use. I believe that good strong interfaces make for good neighbors. I believe in separation of concerns. Specifically, I believe in separation of content from presentation. I believe in the SOLID principles. I believe that programming languages have their strengths and weaknesses. Arguing which is better is like asking if a hammer is better than a trowel. I believe that operating systems have their strengths and weaknesses. Given the right toolset, I’m as happy programming in Windows as I am in Linux or MacOS. I believe that editors have their strengths and weaknesses. On a [...]



Making Avatar Make Sense

Sun, 03 Jan 2010 00:00:00 -0800

Okay, so. I’m going to assume that everyone has now seen Avatar, and hence I’ll assume you’re all up for massive spoilers, etc. The movie we all just watched was not the movie we thought we watched. And this is not just some George “hahaha, I destroyed the hopes and dreams of a generation” Lucas screwing with the fanbase… this is lurking beneath the surface of Avatar from the beginning.

The existence of the Na’vi. The Na’vi have no common morphology with the rest of the planet. Sig even comments on it in a Youtube article. Why do they exist? How is it that they speak a recognizable language, and have genes close enough to human that it’s possible to MIX IN HUMAN DNA with the Na’vi? How is it that the Na’vi have built in neural interfacing equipment that can instantly domesticate the larger animals and even predators? Wouldn’t evolution make such a thing impossible?

The answer is that the Na’vi aren’t a natural race. Eywa made them. They’re close enough to human that the humans can communicate with them and think they look cute and cuddly (Eywa may have been slightly confused here), and alien enough that they can survive in the local environment. If that wasn’t enough, Eywa provided them with some elevated sudo privileges, so they could take advantage of the local fauna without Eywa being directly involved.

The how is easy. Eywa’s a worldmind capable of transferring human neural networks into Na’vi clones on the second try, and it’s entirely feasible for it to create a race and set it up with false memories. And something that’s smart enough to create room temperature superconductors in bulk (what? You think Eywa runs on plants alone, when there’s massively complex electromagnetic flux happening around the tree and the Na’vi just happen to be sitting on giant deposits of unobtainium?) will have no problem reading our electromagnetic communications. Eywa’s on Alpha Centauri; it’s been listening in since the first radio broadcasts.

The why is a bit harder to explain. Why would a worldmind play dumb? Well, probably because it has a very good understanding of what happens to things that look like a threat to humanity. If we had any conception of what Eywa is, we’d be terrified, and we’d probably sterilize the entire planet before we even set foot on it. As it is, Eywa looks harmless. It looks beautiful. It’s not exactly friendly, but it’s the kind of environment that keeps humans focused on the trees instead of on the forest. Eywa can afford to watch and wait until it has to act. It also makes a hell of a way to see human capabilities up close and personal. Eywa is well capable of doing a vacuum cleaner impression on every single EM communication on the planet. And when Eywa saw a human built Na’vi run out of the container and disregard orders, it could lay bets that this was someone stupid and romantic enough to provide[...]



What Makes People Happy

Mon, 28 Dec 2009 00:00:00 -0800

If you want to know what makes you happy, you have to be willing to think hard about what happiness is, and pay attention to what makes you happy.

“Happiness comes in small doses, folks. It’s a cigarette butt, or a chocolate chip cookie, or a five second orgasm. You come, you smoke the butt, you eat the cookie, you go to sleep, wake up and go back to fucking work the next morning, THAT’S IT! End of fucking list!” – Denis Leary

So after much time and experience, here’s the list of things that make me happy.

1) Direct sunlight.
2) 8 hours of sleep.
3) Movement outside.
4) Social interaction.
5) Regular meals.
6) Satisfying work.

The upshot from this list is that I’m not all that complicated. Also, what actually makes me happy may not be what I spend most of my time thinking about. I don’t think about sunlight all that much. But I can tell the difference between when I have it and when I don’t. Biking has a huge effect on my mood. Not eating has a huge effect on my mood. And I’m going to take a leap in logic and say that this list is globally applicable to all humans.

The complicated part of this is social interaction and satisfying work. But even the satisfying work is simpler than you might expect.

But hang on a sec. This list doesn’t just apply to humans. Look at dogs, for example.

Dogs need sunlight. Just ask one.
Dogs need sleep. They get cranky if they don’t.
Dogs need walkies.
Dogs need to meet other dogs and sniff each others butts. And humans.
Dogs need regular meals. And they’ll eat anything you give them.
Dogs need to do something. Pointers need to point, bloodhounds need to sniff.

You break it down and you’ll conclude that human beings are social animals, and have the same needs as social animals. I used to think that the people who get up at 8 am, eat breakfast and then jog for an hour were obnoxiously happy people who were just naturally gifted. They’re not. They’re happy because doing those things will make you a happy human. Likewise, staying up until 3 am, not getting outside, getting crappy sleep and reading existentialist philosophy will make you pretty damn unhappy. It doesn’t matter what your brain thinks about the activities you’re doing — do these things and you’ll look and act like an unhappy person the next day.

Again, this goes back to how to get the most milk out of cows or the most work out of programmers.

The Dog Hypothesis: Human beings need walkies.




Getting work out of Programmers, Part 2

Mon, 20 Aug 2007 00:00:00 -0700

So here’s how I think you get the most work out of programmers. This is a follow on from part 1.

Morale

Programmers have good morale when they are treated well, and they are given a problem that they know they can solve. Thank programmers whenever they do something. Let them know you not only know how much they work, but that you care. Keep track of morale through weekly one on ones with each and every programmer. Keep track of commitments you make to your programmers (including the verbal ones) and follow through on them. Make it clear that you are looking out for their interests. (Creating a Software Engineering Culture, Chapter 3.)

Money and stock options only have a limited effect on morale, and may have a negative effect (Agile Development, p63). Most often, a gift of money or stock options is a substitute. (Rapid Development, p262.) Morale events can be fun, but don’t actually raise morale — they just allow for a different kind of interaction with people. (How to avoid Lame Morale Events.)

And there are all kinds of things that can hurt morale. I think the major one is the Broken Window Theory (Pragmatic Programmer, p4). Under the Broken Window Theory, any neglect or rot in a system that is not directly addressed and countered is a drag on morale. People wonder why it is that they have to write good code and do things right, when they’re not allowed to fix the crappy code. Management assertions that “we don’t have time right now” or “we’ll do it later” start to sound empty and hollow as project after project goes by, and the crappy code festers and rots as hack after hack is piled on top of it. The Psychology of Computer Programming, Chapter 10 deals specifically with Morale and Motivation. Rapid Development, Chapter 11 goes into typical developer motivations. They are both very much worth reading.

Sleep

You can’t make people sleep, and you can’t do much about sleeping arrangements. But you can tell them that you want them to get 8 hours of sleep a night, and you can bring up lack of sleep as an issue. Anyone who looks sleep deprived needs help; either they have been trying to sneak in work late at night, or they’re suffering in other ways. Give them all the help you can. The number of work hours will be an issue there. And if someone loads up on caffeine and junk food late at night… well, I’d point out that there may be a connection (How to Sleep Better). Exhausted employees are easy to spot. They’re the people who frighten small children and spouses. They are not fit for work. They barely even know they are at work. They should be sent home until they know what they’re doing. Just that simple act of humanity will raise morale.

Focus

The best way to ensure focus is to ensure transparency and feedback. If programmers have a public, physical way to see what needs to be done at a granular level, whether in a tod[...]



Getting work out of Programmers, Part 1

Thu, 16 Aug 2007 00:00:00 -0700

I was recently asked the question: what is the best way to get the most work out of my programmers? I’ve thought about this for a while. In some ways, it’s the story of my career — every manager I’ve ever had has wrestled with this question. I’ve worked with enough teams and individuals that I think I have a good understanding of what programmers can and can’t do, and how work gets done. And how work doesn’t get done, for that matter. I believe I have an answer, although certainly not the answer.

The first thing to do is to look at the assumptions behind the way the question is phrased. The assumption here is that there are veins of untapped work, hidden inside of programmers somewhere. And the manager’s job is to extract it. I think that this is the wrong place to start from. I believe that most people would rather do a good job than a bad one, and will do the best work they can do given what they have to work with. (Software Creativity 2.0, p22.) The best way to phrase this question would be: “What can I do to help an individual programmer be most productive?”

So, let’s construct a hypothetical employee. What is known to create productivity? Let’s start from the simplest level and work up.

Morale

An employee who feels safe and secure in his position and his ability to raise issues will do more work. You can argue whether this is the cause or the effect, but it’s well recognized that morale is extraordinarily important. Someone who believes in the team, the company and the work he is doing is likely to do more of it. I believe motivation and morale are equivalent, but morale feels like a better word to me.

Sleep

An employee who has slept 8 hours a day will be more productive than one who has slept 5 hours. I have yet to hear anyone argue with this in principle. Studies have been done that compare the effects of sleep deprivation to alcohol intoxication. The outcome: a programmer awake for 21 hours is effectively too drunk to drive. Think about that. There is a second stage of incapacitation, which is exhaustion, AKA burnout. Exhausted employees are, by definition, not productive.

Focus

An employee who knows what he is supposed to be working on will be more productive than one who is not sure which programming task takes priority. An employee who knows that he can complete the work he is doing without having his priorities changed will be more productive than one who does not know what’s coming next. Programmers dislike surprises as much as anyone else. But this also covers the programmer’s inherent self-discipline. Given a gap in work, will the programmer ask what he should be working on, or will he create his own scripting language?

Background

An employee who knows the domain, the history and the subtle interplay of forces at work in the codebase will be more productive than a programmer who has no prev[...]