
Sean McGrath



Sean McGrath's Weblog.



Last Build Date: Tue, 23 May 2017 08:33:54 +0000

 



What is law? - part 12

Tue, 16 May 2017 10:14:00 +0000

Previously: What is law? - part 11

There are a few odds and ends that I would like to bundle up before proceeding. These are items that have occurred to me since I wrote the first What is Law? post back in March. Items I would have written about earlier in this series, if they had occurred to me. Since I am writing this series as I go, this sort of thing is inevitable I guess. Perhaps if I revisit the material to turn it into an essay at some point, I will fold this new material in at the appropriate places.

Firstly, in the discussion about the complexity of the amendatory cycle in legislation, I neglected to mention that it is also possible for a new item of primary legislation to contain amendments to itself. In other words, it may be that as soon as a bill becomes an act and is in force, it is immediately necessary to modify it using modifications spelled out in the act itself. Looking at it another way, a single Act can be both a container for new law and a container for amendatory instructions, all in one legal artifact. Why does this happen? Legislation can be crafted over long periods of time and consensus building may proceed piece by piece. In a large piece of legislation, rather than continually amending the whole thing – perhaps thousands of pages – sometimes amendments are treated as additional material tacked on the end so as to avoid re-opening debate – and editorial work – on material already processed through the legislative process. It is a bit of a mind bender. Basically, if an Act becomes law at time T then it may instantaneously need to be codified in itself before we can proceed to codify it into the broader corpus.

Secondly, I mentioned that there is no central authority that controls the production of law. This complicates matters for sure, but it also has some significant benefits that I would like to touch on briefly. Perhaps the biggest benefit of the de-centralized nature of law making is that it does not have a single point of failure. In this respect, it is reminiscent of the distributed packet routing protocol used on the internet. Various parts of the whole system are autonomic, resulting in an overall system that is very resilient, as there is no easy way to interrupt the entire process. This distribution-based resilience also extends into the semantic realm, where it combines with the textual nature of law to yield a system that is resilient to the presence of errors. Mistakes happen. For example, a law might be passed that requires train passengers to be packaged in wooden crates. (Yes, this happened.) Two laws might be passed in parallel that contradict each other. (Yes, this has happened many times.) When this sort of thing happens, the law has a way of rectifying itself, leveraging the “common sense” you can get with human decision making. Humans can make logical errors but they have a wonderful ability to process contradictory information in order to fix up inconsistent logic. Also, humans possess an inherent, individual interpretation of equity/fairness/justice and the system of law incorporates that, allowing all participants to evaluate the same material in different ways.

Thirdly, I would like to return briefly to the main distinction I see between legal deductive logic and the deductive logic computer science people are more familiar with.
When deductive logic is being used in law (remembering always that it is just one form of legal reasoning and rarely used on its own), the classic “if this then that” form can be identified, as well as classical syllogistic logic. However, legal reasoning involves weighing up the various applicable deductive statements using the same sort of dialectic/debate-centric reasoning mentioned earlier. Put another way, deductive logic in law very rarely proceeds from facts to conclusion in some nice tidy decision tree. Given the set of relevant facts (which have themselves to be argued as “the relevant facts”) there may well be multiple applicable deductive logic forms in the corpus of law w[...]
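To make the self-amendment twist concrete, here is a minimal Python sketch. The act, its sections and the data shapes are all invented purely for illustration (real legislative data models are far richer): the act's own amendments are applied to it the moment it is in force, before it is codified into the broader corpus.

    # Toy sketch: apply an act's self-amendments before codifying it
    # into the wider corpus. All names and shapes are invented.

    def apply_amendment(sections, amendment):
        """Apply one amendatory instruction to a dict of section texts."""
        if amendment["op"] == "substitute":
            sections[amendment["target"]] = amendment["text"]
        elif amendment["op"] == "repeal":
            sections.pop(amendment["target"], None)
        return sections

    def codify(act):
        """An act is both new law and amendatory instructions, possibly
        aimed at itself. Self-amendments are applied first."""
        sections = dict(act["sections"])
        for amendment in act["amendments"]:
            if amendment["target_act"] == act["id"]:   # self-amendment
                apply_amendment(sections, amendment)
        return sections

    act = {
        "id": "Act 2017/1",
        "sections": {"s1": "Trains may carry livestock.",
                     "s2": "Section 1 applies only on weekdays."},
        "amendments": [
            {"target_act": "Act 2017/1", "op": "repeal", "target": "s2"},
        ],
    }
    print(codify(act))   # {'s1': 'Trains may carry livestock.'}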



What is law? - part 11

Thu, 04 May 2017 16:39:00 +0000

Previously: What is law? - part 10

Gliding gracefully over all the challenges alluded to earlier with respect to extracting the text-level meaning out of the corpus of law at time T, we now turn to thinking about how it is actually interpreted and utilized by practitioners. To do that, we will continue with our useful invention of an infinitely patient person who has somehow found all of the primary corpus and read it all from the master sources, internalized it, and can now answer our questions about it and feed it back to us on demand.

The first order of business is: where to start reading? There are two immediate issues here. Firstly, the corpus is not chronologically accretive. That is, there is no "start date" to the corpus we can work from, even if, in terms of historical events, a foundation date for a state can be identified. The reasons for this have already been discussed. Laws get modified. Laws get repealed. Caselaw gets added. Caselaw gets repealed. New laws get added. I think of it like a vast stormy ocean, constantly ebbing and flowing, constantly adding new content (rainfall, rivers) and constantly losing content (evaporation) - in an endless cycle. It has no "start point" per se.

In the absence of an obvious start point, some of you may be thinking "the index", which brings us to the second issue. There is no index! There is no master taxonomy that classifies everything into a nice tidy hierarchy. There are some excellent indexes/taxonomies in the secondary corpus produced by legal publishers, but not in the primary corpus. Why so? Well, if you remember back to the Unbounded Opinion Requirement mentioned previously, creating an index/taxonomy is, necessarily, the creation of an opinion on the "about-ness" of a text in the corpus. This is something the corpus of law stays really quite vague about - on purpose - in order to leave room for interpretation of the circumstances and facts about any individual legal question. Just because a law was originally passed to do with electricity usage in phone lines does not mean it is not applicable to computer hacking. Just because a law was passed relating to manufacturing processes does not mean it has no relevance to ripening bananas. (Two examples based on real-world situations I have come across, by the way.)

So, we have a vast, constantly changing, constantly growing corpus. So big it is literally humanly impossible to read, regardless of the size of your legal team, and there are no finding aids in the primary corpus to help us navigate our way through it...

...Well actually, there is one, and it is an incredibly powerful finding aid. The corpus of legal materials is woven together by an amazingly intricate web of citations. Laws invariably cite other laws. Regulations cite laws. Regulations cite regulations. Caselaw cites law and regulations and other caselaw... creating a layer that computer people would call a network graph[1]. Understanding the network graph is key to understanding how practitioners navigate the corpus of law. They don't go page-by-page, or date-by-date, they go citation-by-citation.

The usefulness of this citation network in law cannot be overstated. The citation network helps practitioners to find related materials, acting as a human-generated recommender algorithm for practitioners. The citation network not only establishes related-ness, it also establishes meaning, especially in the caselaw corpus. We talked earlier about the open-textured nature of the legal corpus.
It is not big on black and white definitions of things. Everything related to meaning is fluid on purpose. The closest thing in law to true meaning is arguably established in the caselaw. In a sense, the caselaw is the only source of information on meaning that really matters because, at the end of the day, it does not matter what you or I or anyone else might think a part of the corpus means. What really matters is what the courts say it means. Caselaw is the place you go[...]
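To make the citation-by-citation navigation concrete, here is a toy Python sketch of a citation network and a bounded traversal of it. The items and their citations are invented; a real citation graph runs to millions of edges.

    # Toy citation graph: laws cite laws, regulations cite laws,
    # caselaw cites everything. Names are invented.
    from collections import deque

    citations = {
        "Act A s.5":   ["Act B s.2", "Reg 12/2001"],
        "Reg 12/2001": ["Act B s.2"],
        "Case X v Y":  ["Act A s.5", "Case P v Q"],
        "Case P v Q":  ["Act B s.2"],
        "Act B s.2":   [],
    }

    def related(start, max_hops=2):
        """Breadth-first walk: the practitioner's citation-by-citation
        navigation, bounded by hop count."""
        seen, queue = {start}, deque([(start, 0)])
        while queue:
            item, hops = queue.popleft()
            if hops == max_hops:
                continue
            for cited in citations.get(item, []):
                if cited not in seen:
                    seen.add(cited)
                    queue.append((cited, hops + 1))
        return seen - {start}

    print(related("Case X v Y"))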



Zen and the art of motorcycle....manuals

Wed, 26 Apr 2017 10:21:00 +0000

I heard the sad news about Robert Pirsig passing.

His book, Zen and the Art of Motorcycle Maintenance, was a big influence on me and piqued my interest in philosophy.

While writing the book, his day job was writing computer manuals.

About 15 years ago, I wrote an article for ITWorld about data modelling with XML called Zen and the art of motorcycle manuals, inspired in part by Pirsig's book and his meditations on how the qualities in objects such as motorcycles are more than just the sum of the parts that make up the motorcycle.

So it is with data modelling. For any given modelling problem there are many ways to do it that are all "correct" at some level. Endlessly seeking to bottom out the search and find the "correct" model is a pointless exercise. At the end of the day, "correctness" for any data model is not a function of the data itself. It is a function of what you are planning to do with the data.

This makes some folks uncomfortable. Especially proponents of top-down software development methodologies who like to conceptualize analysis as an activity that starts and ends before any prototyping/coding begins.

Maybe somewhere out there Robert Pirsig is talking with Bill Kent, author of another big influence on my thinking: Data and Reality.

Maybe they are discussing how best to model a bishop :-)




What is law? - Part 10

Fri, 21 Apr 2017 15:52:00 +0000

Previously: What is Law? - Part 9

Earlier on in this series, we imagined an infinitely patient and efficient person who has somehow managed to acquire the entire corpus of law at time T and has read it all for us and can now "replay" it to us on demand. We mentioned previously that the corpus is not a closed world and that meaning cannot really be locked down inside the corpus itself. It is not a corpus of mathematical truths, dependent only on a handful of axioms. This is not a bug to be fixed. It is a feature to be preserved. We know we need to add a layer of interpretation and we recognize from the outset that different people (or different software algorithms) could take this same corpus and interpret it differently. This is ok because, as we have seen, it is (a) necessary and (b) part of the way law actually works.

Interpreters differ in the opinions they arrive at in reading the corpus. Opinions get weighed against each other, opinions can be over-ruled by higher courts. Some courts can even over-rule their own previous opinions. Strongly established opinions may then end up appearing directly in primary law or regulations, new primary legislation might be created to clarify meaning... and the whole opinion generation/adjudication/synthesis loop goes round and round forever... In law, all interpretation is contemporaneous, tentative and defeasible. There are some mathematical truths in there but not many.

It is tempting - but incorrect in my opinion - to imagine that the interpretation process works with the stream of words coming into our brains off of the pages, that then get assembled into sentences and paragraphs and sections and so on in a straightforward way. The main reason it is not so easy may be surprising. Tables! The legal corpus is awash with complex table layouts. I included some examples in a previous post about the complexities of law[1]. The upshot of the ubiquitous use of tables is that reading law is not just about reading the words. It is about seeing the visual layout of the words and associating meaning with the layout. Tables are such a common tool in legal documents that we tend to forget just how powerful they are at encoding semantics. So powerful, that we have yet to figure out a good way of extracting back out the semantics that our brains can readily see in law, using machines to do the "reading". Compared to, say, detecting the presence of headings or cross-references or definitions, correctly detecting the meaning implicit in the tables is a much bigger problem. Ironically, perhaps, much bigger than dealing with highly visual items such as maps in redistricting legislation[2], because the actual redistricting laws are generally expressed purely in words using, for example, eastings and northings[3] to encode the geography.

If I could wave a magic wand just once at the problem of digital representation of the legal corpus, I would wave it at the tables. An explicit semantic representation of tables, combined with some controlled natural language forms[4], would be, I believe, as good a serialization format as we could reasonably hope for, for digital law. It would still have the Closed World of Knowledge problem of course. It would also still have the Unbounded Opinion Requirement, but at least we would be in a position to remove most of the need for a visual cortex in this first layer of interpreting and reasoning about the legal corpus. The benefits to computational law would be immense.
We could imagine a digital representation of the corpus of law as an enormous abstract syntax tree[5], which we could begin to traverse to get to the central question about how humans traverse this tree to reason about it, form opinions about it, and create legal arguments in support of their opinions. Next up: What is law? - Part 11.

[1] http://seanmcgrath.blogspot.ie/2010/06/xml-in-legislatureparliament_04.html
[2] https://ballotpedia.org/Redistricting
[3] https://en.wikipedia.org/wiki/Easting_and_northin[...]
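Purely to illustrate what an explicit semantic representation of a table might look like, here is a sketch in Python. The fee schedule is invented; the point is that the row and column semantics a human reads off the printed layout become explicit in the data, so no visual cortex is needed to recover them.

    # Toy semantic encoding of a tabular fee schedule.
    from dataclasses import dataclass

    @dataclass
    class FeeRule:
        vehicle_class: str   # was a row heading in the printed table
        axle_count: int      # was a column heading in the printed table
        fee_usd: float       # was a cell value

    schedule = [
        FeeRule("commercial", 2, 50.0),
        FeeRule("commercial", 3, 75.0),
        FeeRule("private",    2, 20.0),
    ]

    def fee(vehicle_class, axle_count):
        """Look up a fee from the explicit semantics, not the layout."""
        for rule in schedule:
            if (rule.vehicle_class, rule.axle_count) == (vehicle_class, axle_count):
                return rule.fee_usd
        raise LookupError("no applicable rule")

    print(fee("commercial", 3))   # 75.0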



What is law? - Part 9

Wed, 19 Apr 2017 13:24:00 +0000

Previously: What is law? - Part 8

For the last while, we have been thinking about the issues involved in interpreting the corpus of legal materials that is produced by the various branches of government in US/UK style environments. As we have seen, it is not a trivial exercise because of the ways the material is produced and because the corpus - by design - is open to different interpretations and open to interpretation changing with respect to time. Moreover, it is not an exaggeration to say that it is a full time job - even within highly specialized sub-topics of law - to keep track of all the changes and synthesize the effects of these changes into contemporaneous interpretations.

For quite some time now - centuries in some cases - a second legal corpus has evolved in the private sector. This secondary corpus serves to consolidate and package and interpret the primary corpus, so that lawyers can focus on the actual practice of law. Much of this secondary corpus started out as paper publications, often with so-called loose-leaf update cycles. These days most of this secondary corpus is in the form of digital subscription services. The vast majority of lawyers utilize these secondary sources from legal publishers. So much so that over the long history of law, a number of interesting side-effects have accrued.

Firstly, for most day-to-day practical purposes, the secondary corpus provides de-facto consolidations and interpretations of the primary corpus. I.e. although the secondary sources are not "the law", they effectively are. The secondary sources that are most popular with lawyers are very high quality and have earned a lot of trust over the years from the legal community. In this respect, the digital secondary corpus of legal materials is similar to modern day digital abstractions of currency such as bank account balances and credit cards etc. I.e. we trust that there are underlying paper dollars that correspond to the numbers moving around digital bank accounts. We trust that the numbers moving around digital bank accounts could be redeemed for real paper dollars if we wished. We trust that the real paper dollars can be used in value exchanges. So much so, that we move numbers around bank accounts to achieve value exchange without ever looking to inspect the underlying paper dollars. The digital approach to money works because it is trusted. Without the trust, it cannot work. The same is true for the digital secondary corpus of law: it works because it is trusted.

A second interesting side-effect of trust in the secondary corpus is that parts of it have become, for all intents and purposes, the primary source. If enough of the world's legal community is using secondary corpus X, then even if that secondary corpus differs from the primary underlying corpus for some reason, it may not matter in practice because everybody is looking at the secondary corpus.

A third interesting side-effect of the digital secondary corpus is that it has become indispensable. The emergence of a high quality inter-mediating layer between primary legal materials and legal practitioners has made it possible for the world of law to manage greater volumes and greater change rates in the primary legal corpus. Computer systems have greatly extended this ability to cope with volume and change. So much so, that law as it is today would collapse if it were not for the inter-mediating layer and the computers.

The classic image of a lawyer's office involves shelves upon shelves of law books.
For a very long time now, those shelves have featured a mix of primary legal materials and secondary materials from third party publishers. For a very long time now, the secondary materials have been the day-to-day "go to" volumes for legal practitioners - not the primary volumes. Over the last 50 years, the usage level of these paper volumes has dwindled year on year to the point where today, the beautiful paper volumes have become pri[...]



What is law? - part 8

Fri, 14 Apr 2017 16:27:00 +0000

Previously: What is law? - Part 7

A good place to start in exploring the Closed World of Knowledge (CWoK) problem in legal knowledge representation is to consider the case of a spherical cow in a vacuum... Say what? The spherical cow in a vacuum[1] is a well known humorous metaphor for a very important fact about the physical world. Namely, any model we make of something in the physical world, any representation of it we make inside a mathematical formula or a computer program, is necessarily based on simplifications (a "closed world") to make the representation tractable. The statistician George Box once said that "all models are wrong, but some are useful." Although this mantra is generally applied in the context of applied math and physics, this concept is incredibly important in the world of law in my opinion.

Law can usefully be thought of as an attempt at steering the future direction of the physical world in a particular direction. It does this by attempting to pick out key features of the real world (e.g. people, objects, actions, events) and making statements about how these things ought to inter-relate (e.g. if event E happens, person P must perform action A with object O).

Back to cows now. Given that the law may want to steer the behavior of the world with respect to cows, for example, tax them, regulate how they are treated, incentivize cow breeding programs etc. etc., how does law actually speak about cows? Well, we can start digging through legislative texts to find out, but what we will find is not the raw material from which to craft a good definition of a cow for the purposes of a digital representation of it. Instead, we will find some or all of the following:

- Statements about cows that do not define cows at all but proceed to make statements about them as if we all know exactly what is a cow and what is not a cow
- Statements that "zoom in" on cow-ness without actually saying "cow" explicitly, e.g. "animals kept on farms", "milk producers" etc.
- Statements that punt on the definition of a cow by referencing the definition in some outside authority, e.g. an agricultural taxonomy
- Statements that "zoom in" on cow-ness by analogies to other animals, e.g. "similar in size to horses, bison and camels."
- Statements that define cows to be things other than cows(!) e.g. "For the purposes of this section, a cow is any four legged animal that eats grass."

What you will not find anywhere in the legislative corpus is a nice tidy, self contained mathematical object denoting a cow, fully encapsulated in a digital form. Why? Well, the only way we could possibly do that would be to make a whole bunch of simplifications on "cow-ness" and we know where that ends up. It ends up with spherical objects in vacuums, just as it does in the world of physics! There is simply no closed world model of a cow that captures everything we might want to capture about cows in laws about cows. Sure, we could keep adding to the model of a cow, refining it, getting it closer and closer to cow-ness. However, we know from the experience of the world of physics that we reach the point where we have to stop, because it is a bottomless refinement process.

This might sound overly pessimistic or pedantic, and in the case of cows for legislative purposes it clearly is, but I am doing it to make a point. Even everyday concepts in law such as aviation, interest rates and theft are too complex (in the mathematical sense of complex) to be defined inside self-contained models. Again, fractals spring to mind.
We can keep digging down into the fractal boundary that splits the world into cow and not-cow. Refining our definitions until the cows come home (sorry, could not resist) and we will never reach the end of the refinement process. Moreover, many of the real world phenomena law wants to talk about exhibit a phenomenon known as "se[...]
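The closed-world problem can be shown in a few lines of Python, taking the quoted toy definition ("a cow is any four legged animal that eats grass") literally. Each counterexample invites another refinement, and the refinement has no bottom.

    # Toy closed-world model of cow-ness, taken literally.
    from dataclasses import dataclass

    @dataclass
    class Animal:
        name: str
        legs: int
        eats_grass: bool

    def is_cow(animal):
        """The closed-world statutory definition, verbatim."""
        return animal.legs == 4 and animal.eats_grass

    print(is_cow(Animal("Daisy", 4, True)))              # True
    print(is_cow(Animal("three-legged cow", 3, True)))   # False, yet a cow
    print(is_cow(Animal("sheep", 4, True)))              # True, yet not a cow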



What is law? - Part 7

Fri, 07 Apr 2017 12:34:00 +0000

Previously: What is law? - Part 6

Last time we ended with the question: “Given a corpus of law at time T, how can we determine what it all means?” There is a real risk of disappearing down a philosophical rabbit hole about how meaning is encoded in the corpus of law. Now I really like that particular rabbit hole but I propose that we not go down it here. This whole area is best perused, in my experience, with comfy chairs, time to kill and a libation or two (semiotics, epistemology and mereotopology anyone?). Instead, we will simply state that because the corpus of law is mostly written human language, it inherits some fascinating and deep issues to do with how written text establishes shared meaning, and move on.

For our purposes, we will imagine an infinitely patient person with infinite stamina, armed with a normal adult's grasp of English, who is going to read the corpus and explain it back to us, so that we computer people can turn it into something else inside a computer system. The goal of that “something else” being to capture the meaning but be easier to work with inside a computer than a big collection of “unstructured” documents. This little conceptual trick of employing a fantastic human to read the current corpus and explain it all back to us allows us to split the problem of meaning into two parts. The first part relates to how we could read it in its current form and extract its meaning. The second part relates to how we would encode the extracted meaning in something other than a big collection of unstructured documents. Exploring this second question will, I believe, help us tease out the issues in determining meaning in the corpus of law in general, without getting bogged down in trying to get machines to understand the current format (lots and lots of unstructured documents!) right off the bat. I hope that makes sense? Basically, we are going to skip over how we would parse it all out of its current myriad document-form into a human brain and instead look at how we would extract it from said brain and store it again – but into something more useful than a big collection of documents.

Assuming we can find a representation that is good enough, the reading of the current corpus should be a one-off exercise because as the corpus of law gets updated, we would update our bright shiny new digital representation of the corpus and never have to re-process all the documents ever again. So what options do we have for this digital knowledge representation? Surely there is something better than just unstructured document text? Text, after all, is what you get if you use computers as typewriters. Computers do also give us search, which is a wonderful addition to typesetting, but understanding is a very different thing again. In order to have machines understand the corpus of law we need a way to represent the knowledge present in the law - not just what words are present (search) or how the words look on the page (formatting).

This is the point where some of you are likely hoping/expecting that I am about to suggest some wonderful combination of XML and Lisp or some such that will fit the bill as a legal corpus knowledge representation alternative to documents... It would be great if that were possible but in my opinion, the textual/document-centric nature of a significant part of the legal corpus is unavoidable for reasons I will hopefully explain. Note that I said “significant part”. There are absolutely components of the corpus that do not have to be documents.
In fact, some of the corpus has already transitioned out of documents but, if anything, this has actually increased the interpretation complexities - of establishing meaning - not reduced them. I will hopefully explain that too:-) I think the best way of explaining why I think some form of electronic documents is as good as we can hope for, for large p[...]
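To give a feel for the trade-off, here is a hedged sketch of one artificially simple rule in document form and in one possible structured form. The rule and its encoding are invented; search works fine on the document form, but machine reasoning needs something like the structured form - and, as argued here, most of the corpus resists this treatment.

    # One invented rule, twice: as text and as a structured record.
    document_form = ("A person must not park a vehicle on a public road "
                     "between 2am and 5am.")

    structured_form = {
        "modality": "prohibition",      # the deontic flavour
        "actor": "person",
        "action": "park",
        "object": "vehicle",
        "location": "public road",
        "time_window": ("02:00", "05:00"),
    }

    def applies_at(rule, hhmm):
        """A machine can reason over the structured form directly."""
        start, end = rule["time_window"]
        return start <= hhmm < end

    print(applies_at(structured_form, "03:30"))   # True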



What is law? - Part 6

Fri, 31 Mar 2017 12:33:00 +0000

Previously: What is law? - Part 5

To wrap up our high level coverage of the sources of law, we need to add a few items to the “big 3” (Statutes/Acts, Regulations/Statutory Instruments and Case law) covered so far.

Many jurisdictions have a foundational written document called a constitution which essentially "bootstraps" a jurisdiction by setting out its core principles, how its government is to be organized, how law will be administered etc. The core principles expressed in constitutions are, in many respects, the exact opposite of detailed rules/regulations. They tend to be deontic[1] in nature, that is, they express what ought to be true. They tend to be heavily open-textured[2], meaning that they refer to concepts that are necessarily abstract/imprecise (e.g. concepts such as "fairness", "freedom" etc.). Although they only make up a tiny, tiny fraction of the corpus of law in terms of word count, they are immensely important, as essentially everything that follows on from the constitution in terms of Statutes/Acts, Regulations/Statutory Instruments and case law has to be compatible with the constitution. Like everything else, the constitution can be changed and thus all the usual "at time T" qualifiers apply to constitutionality.

Next up is international law, such as international conventions/treaties, which cover everything from aviation to cross-border criminal investigation to intellectual property to doping in sport. Next up, at local community level, residents of specific areas may have rules/ordinances/bye-laws which are essentially Acts that apply to a specific geographic area. There may be a compendium of these, often referred to as a "Municipal Code" in the case of cities.

I think that just about wraps up the sources of law. It would be possible to fill many blog posts with more variations on these (inter-state compacts, federations/unions, executive orders, private members bills etc.). It would also be possible to fill many blog posts with how these all overlap differently in different situations (e.g. what law applies when there are different jurisdictions involved in an inter-jurisdictional contract). I don't think it would be very helpful to do that however. Even scratching the surface as we have done here will hopefully serve to adequately illustrate the key point I would like to make, which is this: the corpus of law applicable to any event E which occurred at time T is a textually complex, organizationally distributed, vast corpus of constantly changing material. Moreover, there is no central authority that manages it. It is not necessarily available as it was at time T - even if money is no object.

To wrap up, let us summarize the potential access issues we have seen related to accessing the corpus of law at time T:

- Textual codification at time T might not be available (lack of codification, use of amendatory language in Acts, etc.)
- Practical access at time T may not be available (e.g. it is not practical to gather the paper versions of all court reports for all the caselaw, even if theoretically freely available.)
- Access rights at time T may not be available (e.g. incorporated-by-reference rulebooks referenced in regulations)

All three access issues can apply up and down the scale of location specificity from municipal codes/bye-laws, regulations/statutory instruments, Acts/Statutes, case law, union/federation law to international law and, most recently, space law[3].
We are going to glide serenely over the top of these access issues as the solutions to them are not technical in nature. Next we turn to this key question: Given a corpus of law at time T, how can we determine what it all means? [...]



What is law? - Part 5

Wed, 29 Mar 2017 11:02:00 +0000

Previously: What is law? - Part 4

The Judicial Branch is where the laws and regulations created by the legislative and executive branches make contact with the world at large. The most common way to think of the judiciary is as the public forum where sentences/fines for not abiding by the law are handed down and as the public forum where disputes between private parties can be adjudicated by a neutral third party. This is certainly a major part of it, but it is also the place where law gets clarified with finer and finer detail over time, in USA-style and UK-style "common law" legal systems.

I like to think of the judicial branch as being a boundary determinator for legal matters. Any given incident, e.g. a purported incident of illegal parking, brings with it a set of circumstances unique to that particular incident. Perhaps the circumstances in question are such that the illegal parking charge gets thrown out, perhaps not. Think of illegal parking as being – at the highest level – a straight line, splitting a two dimensional plane into two parts. Circumstances to the left of the line make the assertion of illegal parking true, circumstances to the right of the line make the assertion false. In the vast majority of legal matters, the dividing line is not that simple. I think of the dividing line as a Koch Snowflake[1]. The separation between legal and illegal starts out as a simple Euclidean boundary but over time the boundary becomes more and more complex, as with each new "probe" of the boundary (a case before the courts), more detail is added to the boundary. Simply put, the law is a fractal[2]. Even if a boundary starts out as a simple line segment separating true/false, it can become more complex with every new case that comes to the courts. Moreover, between any two sets of circumstances for a case A and B, there are an infinity of circumstances that are, in some sense, in between A and B. Thus an infinity of new data points that can be added between A and B over time.

Courts record their judgments in documents known collectively as “case law”. The most important thing about case law in our focus areas of USA-style and UK-style legal systems is that it is actually law. It is not just a housekeeping exercise, recording the activity of the courts. Each new piece of case law produced at time T serves as an interpretation of the legal corpus at time T. That corpus consists of the Acts/Statutes in force, Regulations/Statutory Instruments in force *and* all other caselaw in force at time T. This is the legal concept of precedent, also known as stare decisis[3]. The courts strive, first and foremost, for consistency with precedents. A lot of weight is attached to arriving at judgments in new cases that are consistent with the judgments in previous cases. The importance of this cannot be over-estimated in understanding law from a computational perspective.

Where is the true meaning of law to be found in common law jurisdictions? It is found in the case law! - not the Acts or the regulations/Statutory Instruments. If you are reading an Act or a regulation and are wondering what it actually means, the place to go is the case law. The case law, in a very real sense, is the place where the actual meaning of law is spelled out. From a linguistics perspective you can think of this in terms of the pragmatics counterpart to grammar/syntax. Wittgenstein fans can think of it as “language is use”, i.e. the true meaning of language can be found in how it is actually used in the real world.
Logical Positivists might think of it as a behaviorist approach to meaning. That is, meaning comes from behavior. To understand what a law means – watch what the courts interpret it to mean. The meaning of the law comes from how it is used in practice and that use comes from the empi[...]
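The Koch snowflake metaphor can be made concrete in a few lines of Python: each round of caselaw refines every segment of the legal/illegal boundary into four, so detail accumulates without limit while the endpoints never move. This is the standard Koch construction, used here purely as the metaphor described above.

    # Each "case" refines the boundary between legal and illegal.

    def koch_step(points):
        """One refinement pass over a polyline of complex numbers."""
        out = []
        for a, b in zip(points, points[1:]):
            d = (b - a) / 3
            peak = a + d + d * complex(0.5, 0.866)   # bump out a vertex
            out.extend([a, a + d, peak, a + 2 * d])
        out.append(points[-1])
        return out

    boundary = [complex(0, 0), complex(1, 0)]   # the initial simple line
    for case in range(4):                       # four rounds of new caselaw
        boundary = koch_step(boundary)
    print(len(boundary) - 1, "segments")        # 256 segments from 1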



What is law? - Part 4

Mon, 27 Mar 2017 15:45:00 +0000

Previously: What is law? - Part 3

Now we will turn our attention to the second part of the legal corpus, namely regulations/statutory instruments. I think of this material as fleshing out the material coming in the form of Acts from the Legislature/Parliament. Acts can be super-specific and self contained, but they can also be very high level and delegate finer detail to government agencies to work out and fill in the details. Acts that do this delegation are known as "enabling acts" and the fine detail work takes the form of regulations (USA terminology) or Statutory Instruments (UK terminology).

The powers delegated to executive branch agencies by enabling Acts can be quite extensive and the amount of review done by the Legislature/Parliament differs a lot across different jurisdictions. In some jurisdictions, there is no feedback loop back to the Legislature/Parliament at all. In others, all regulations/statutory instruments must pass a final approval phase back in the Legislature/Parliament. As with the Acts, the regulations go through a formal promulgation process - typically being activated by public notice in a Government gazette/register publication. As with the Acts, an official compendium of regulations may or may not be produced by Government itself and, if it exists, it may lag behind the production of new Regulations/Statutory Instruments by months or even years. As with Acts, third party publishers often add value by keeping a corpus of regulations/SIs up to date with each register/gazette publication (often a weekly publication).

One useful rough approximation is to conceptualize the Regulations/Statutory Instruments as appendices to Acts. Just as with any other type of publication, a full understanding of the text at time T requires a full understanding of the appendices at time T. In other words, to understand the Act at time T you need the Regulations/Statutory Instruments at time T. This brings us to the first significant complication. The workflows and publication cycles for the Acts and the Regulations/Statutory Instruments are different, and the organizations doing the work are different, resulting in a work synchronization and tracking challenge. Tracking Acts is not enough to understand the Acts. You need to track Regulations/Statutory Instruments too, and keep the two in sync with each other.

The next complication comes from the nature of the Regulations/Statutory Instruments themselves. When the need arises for very detailed knowledge about some regulated activity, there is often a separate association/guild/institute of specialists in that regulated activity. Sometimes, the rules/guidelines in use by the separate entity can become part of the law by being incorporated-by-reference into the regulations/statutory instruments[1]. Sometimes, the separate association/guild/institute is formally given powers to regulate and becomes what is known as a Self Regulatory Organization (SRO)[2]. The difficulty this presents for the legal decision-making box we are creating in our conceptual model of law is that this incorporated-by-reference material may not be freely available in the same way that the Acts and Regulations/Statutory Instruments are generally freely available (at least in unconsolidated forms). In Part 1, reference was made to the legal concept that "ignorance of the law is no defense". Well, you can see the potential problem here with material that is incorporated-by-reference.
If I can only read the incorporated-by-reference aspects of the legal corpus at time T by paying money to access them, then the corpus of law itself (however complex and difficult to interpret it might be) is not actually fully available to me. The important distinction here is between fee-based acces[...]
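Here is a minimal Python sketch of that synchronization challenge: the same point-in-time lookup must be run against two separately maintained version histories, one for the Act and one for the Regulations made under it. The versions and dates are invented.

    # Two separately maintained version histories, one query date.
    from datetime import date

    acts = {date(2001, 1, 1): "Act v1", date(2005, 6, 1): "Act v2"}
    regulations = {date(2001, 3, 1): "Reg v1", date(2006, 2, 1): "Reg v2"}

    def as_at(versions, t):
        """Latest version in force on date t, else None."""
        in_force = [d for d in versions if d <= t]
        return versions[max(in_force)] if in_force else None

    t = date(2005, 12, 31)
    print(as_at(acts, t))          # Act v2
    print(as_at(regulations, t))   # Reg v1 - lagging behind the Act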



What is law? - Part 3

Thu, 23 Mar 2017 16:15:00 +0000

Previously: What is Law? - Part 2

The corpus of law - the stuff we all, in principle, have access to and all need to comply with - is not, unfortunately, a nice tidy bundle of materials managed by a single entity. Moreover, the nature of the bundle itself differs between jurisdictions. Ireland is quite different from Idaho. Scotland is quite different from the Seychelles. Jersey is quite different from Japan, and so on. I will focus here on US and UK (Westminster)-style legal corpora to keep the discussion manageable in terms of the diversity. Even then, there are many differences in practice and terminology all the way up and down the line from street ordinances to central government to international treaties and everything in between. I will use some common terminology, but bear in mind that actual terminology and practice in your particular part of the world will very likely be different in various ways, but hopefully not in ways that invalidate the conceptual model we are seeking to establish.

In general, at the level of countries/states, there are three main sources of law that make up the legal corpus. These are the judiciary, the government agencies and the legislature/parliament.

Let us start with the Legislature/Parliament. This is the source of new laws and amendments to the law in the form of Acts. These start out as draft documents that go through a consideration, amendment and voting process before they become actual law. In the USA, it is common for these Acts to be consolidated into a "compendium", typically referred to as "The Statutes" or "The Code". The Statutes are typically organized according to some thematic breakdown into separate "titles" e.g. Company Law, Environmental Law and so on. In the UK/Westminster-type of Parliament, the government itself does not produce thematic compendia. Instead, the Acts are a cumulative corpus. So, to understand, for example, criminal law, it may be necessary to look at many different Acts, going back perhaps centuries, to get the full picture of the "Act" actually in force. In UK-style systems, areas of law may get consolidated periodically through the creation of so-called "consolidations"/"re-statements". These essentially take an existing set of Acts that are in force, repeal them all and replace them with a single text that is a summation of the individual Acts that it repeals.[1]

It is common for third party publishers to step in and help practitioners of particular areas of law by doing unofficial consolidations to make the job of finding the law in a jurisdiction easier. Depending on how volatile the area of law is in terms of change, the publisher might produce an update every month, every quarter, every year etc. In the USA, most US states do a consolidation in-house in the legislature when they produce The Statutes/Code. In a similar manner to third party publishers, this corpus is updated according to a cycle, but it is typically a longer cycle - every year or two years.

So here we get to our first interesting complication with respect to being able to access the law emanating from Legislatures/Parliaments that is in force at any time T. It is very likely that no existing compendium produced by the government itself is fully up to date with respect to time T. There are a number of distinct reasons for this. Firstly, for Parliaments that do not produce compendia, there may not be an available consolidation/re-statement at time T.
Therefore, it is necessary to find a set of Acts that were in force at time T, which then need to be read together to understand what the law was at time T. Secondly, for Legislatures that produce compendia in the form of Statutes, these typically lag behind the Acts by anything from months to years. Typically, when a Legislature[...]
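A UK-style consolidation can be sketched as a single operation over the corpus: repeal a set of in-force Acts and enact one new Act that is the summation of their texts. The act identifiers and texts below are invented for illustration.

    # Toy consolidation: repeal-and-replace with a summation text.
    corpus = {
        "Act 1900/3": {"in_force": True, "text": "Taking a thing is theft."},
        "Act 1955/9": {"in_force": True, "text": "Theft includes services."},
    }

    def consolidate(corpus, targets, new_id):
        """Repeal each target Act; enact one Act summing their texts."""
        summation = []
        for act_id in targets:
            summation.append(corpus[act_id]["text"])
            corpus[act_id]["in_force"] = False   # repealed
        corpus[new_id] = {"in_force": True, "text": " ".join(summation)}

    consolidate(corpus, ["Act 1900/3", "Act 1955/9"], "Theft Act 1970/1")
    print([k for k, v in corpus.items() if v["in_force"]])
    # ['Theft Act 1970/1']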



What is law? - Part 2

Wed, 22 Mar 2017 15:04:00 +0000

Previously: What is law? - Part 1

The virtual legal reasoning box we are imagining will clearly need to either contain the data it needs, or be able to reach outside of the box and access whatever data it needs for its legal analysis. In other words, we can imagine the box having the ability to pro-actively reach out and grab legal data from the outside world when it needs it. And/or we can also imagine the box directly storing data so that it does not need to reach out and get it.

This brings us to the first little conceptual maneuver we are going to make in order to make reasoning about this whole thing a bit easier. Namely, we are going to treat all legal data that ends up inside the box for the legal analysis as having arrived there from somewhere else. In other words, we don't have to split our thinking into stored-versus-retrieved legal data. All data leveraged by the legal reasoning box is, ultimately, retrieved from somewhere else. It may be that for convenience, some of the retrieved data is also stored inside the box but that is really just an optimization - a form of data caching - that we are not going to concern ourselves with at an architectural level as it does not impact the conceptual model.

A nice side effect of this all-data-is-external conceptualization is that it mirrors how the real world of legal decision making in a democracy is supposed to work. That is, the law itself does not have any private data component. The law itself is a corpus of materials available (more on this availability point later!) to all those who must obey the law. Ignorance of the law is no defense.[1] The law is a body of knowledge that is "out there" and we all, in principle, have access to the laws we must obey. When a human being is working on a legal analysis, they do so by getting the law from "out there" into their brains for consideration. In other words, the human brain acts as a cache for legal materials during the analysis process. If the brain forgets, the material can be refreshed and nothing is lost. If my brain and your brain are both reaching out to find the law at time T, we both - in principle - are looking at exactly the same corpus of knowledge.

I am reminded of John Adams' statement that government should be "a government of laws, not of men."[2] I.e. I might have a notion of what is legal and you might have a different notion of what is legal but because the law is "out there" - external to both of us - we can both be satisfied that we are both looking at the same corpus of law which is fully external to both of us. We may well interpret it differently, but that is another matter we will be returning to later. I am also reminded of Ronald Dworkin's Law as Integrity[3] which conceptualizes law as a corpus that is shared by, and interpreted for, the community that creates it. Again, the word "interpretation" comes up, but that is another day's work. One thing at a time...

So what actually lives purely inside the box if the law itself does not? Well, I conceptualize it as the legal analysis apparatus itself, as opposed to any materials consumed by that apparatus. Why do I think of this as being inside and not outside the box? Primarily because it reflects how the real world of law actually works. A key point, indeed a feature, of the world of law, is that it is not based on one analysis box. It is, in fact, lots and lots of boxes. One for each lawyer and each judge and each court in a jurisdiction...
Legal systems are structured so that these analysis boxes can be chained together in an escalation chain (e.g. district courts, appeal courts, supreme courts, etc.). The decision issued by one box can be appealed to a higher box in the decision-making hierarchy. Two boxes at the same level in the hierarchy [...]
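The many-boxes idea and its escalation chain can be sketched in Python. The courts and their opinions here are invented; the point is only the structure - the same question can be pushed up the chain, and the last box consulted has the final word.

    # Toy escalation chain of legal analysis boxes.
    def district_court(question):
        return {"court": "district", "answer": "illegal"}

    def appeals_court(question):
        return {"court": "appeals", "answer": "legal"}

    def supreme_court(question):
        return {"court": "supreme", "answer": "legal"}

    CHAIN = [district_court, appeals_court, supreme_court]

    def litigate(question, appeals=0):
        """Consult boxes up the chain; the last opinion supersedes."""
        opinion = None
        for box in CHAIN[: 1 + appeals]:
            opinion = box(question)
        return opinion

    print(litigate("Was parking action X legal at time T?", appeals=1))
    # {'court': 'appeals', 'answer': 'legal'}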



What is law? - Part 1

Wed, 15 Mar 2017 18:55:00 +0000

Just about seven years ago now, I embarked on a series of blog posts concerning the nature of legislatures/parliaments. Back then, my goal was to outline a conceptual model of what goes on inside a legislature/parliament in order to inform the architecture of computer systems to support their operation.

The goal of this series of posts is to outline a conceptual model of what law actually is and how it works when it gets outside of the legislatures/parliaments and is used in the world at large.

I think now is a good time to do this because there is growing interest around automation "downstream" of the legislative bodies. One example is GRC - Governance, Risk & Compliance - and all the issues that surround taking legislation/rules/regulations/guidance and instantiating it inside computer systems. Another example is Smart Contracts - turning legal language into executable computer code. Another example is Chatbots such as DoNotPay, which encode/interpret legal material in a "consultation" mode with the aid of Artificial Intelligence and Natural Language Processing. Another example is TurboTax and programs like it, which have become de-facto sources of interpretation of legal language in the tax field.

There are numerous other fascinating areas where automation is having a big impact in the world of law. Everything from predicting litigation costs to automating discovery to automating contract assembly. I propose to skip over these for now, and just concentrate on a single question which is this:
      If a virtual "box" existed that could be asked questions about the legality of an action X at some time T, what would need to be inside that box in order for it to reflect the real world equivalent of asking a human authority the same question?
If this thought experiment reminds you of John Searle's Chinese Room Argument then good:-) We are going to go inside that box. We are taking with us Niklaus Wirth's famous aphorism that Algorithms + Data Structures = Programs. We will need a mix of computation (algorithmics) and data structures, but let us start with the data sources because it is the easier of the two.

What data (and thus data structures) do we need to have inside the box? That is the subject of the next post in this series.
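With a programmer's hat on, the box's interface might be sketched like this in Python. Every name here is invented, and both the data fetch and the analysis are stubs; filling them in is, in effect, the question this series is asking.

    # A sketch of the box: algorithms + data structures.
    from datetime import datetime

    def is_it_legal(action: str, t: datetime) -> dict:
        """Ask the box: was action X legal at time t?"""
        corpus = fetch_corpus_as_at(t)   # data: the law "out there" at t
        return analyze(action, corpus)   # algorithms: the legal analysis

    def fetch_corpus_as_at(t):
        """Stub: the corpus of law as it stood at time t."""
        return {"acts": [], "regulations": [], "caselaw": []}

    def analyze(action, corpus):
        """Stub: the legal reasoning apparatus."""
        return {"action": action, "opinion": "unknown", "authorities": []}

    print(is_it_legal("park on Main St overnight", datetime(2017, 3, 15)))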

What is law? - Part 2.





Custom semantics inside HTML containers

Mon, 27 Feb 2017 13:58:00 +0000

This article of mine from 2006 (I had to dig it out of the Wayback Machine!) Master Foo's Taxation Theory of Microformats came back to mind today when I read this piece Beyond XML: Making Books with HTML. It is gratifying to see this pattern start to take hold. I.e. leveraging an existing author/edit toolchain rather than building a new one. We do this all the time in Propylon, leveraging off-the-shelf toolsets supporting flexible XML document models (XHTML, .docx, .odt) but encoding the semantics and the business rules we need in QA/QC pipelines. Admittedly, we are mostly dealing with complex, messy document types like legislation, professional guidance, policies, contracts etc., but then again, if your data set is not messy, you might be better off using a relational database to model your data and using the relational model to drive your author/edit sub-system in the classic record/field-oriented style.
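As a hedged sketch of the pattern: plain XHTML as the container, custom semantics carried in class attributes, and a QA/QC rule enforced in a pipeline check rather than in a bespoke schema. The class vocabulary ("section", "definition") and the rule are invented for illustration.

    # Stock XHTML container, custom semantics, pipeline QC check.
    from xml.etree import ElementTree as ET

    doc = ET.fromstring("""
    <html xmlns="http://www.w3.org/1999/xhtml"><body>
      <div class="section">
        <p class="definition">"cow" means any bovine animal.</p>
      </div>
      <div class="section"><p>Cows must be registered.</p></div>
    </body></html>
    """)

    def check(doc):
        """QC rule: every definition must sit inside a section div."""
        ns = "{http://www.w3.org/1999/xhtml}"
        inside = set()
        for div in doc.iter(ns + "div"):
            if div.get("class") == "section":
                for p in div.iter(ns + "p"):
                    if p.get("class") == "definition":
                        inside.add(id(p))
        all_defs = [p for p in doc.iter(ns + "p")
                    if p.get("class") == "definition"]
        return all(id(p) in inside for p in all_defs)

    print(check(doc))   # True: semantics validated in a stock toolchain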



Paper Comp Sci Classics

Mon, 20 Feb 2017 10:02:00 +0000

Being a programmer/systems architect/whatever brings with it a big reading load just to stay current. It used to be the case that this, for me, involved consuming lots of physical books and periodicals. Nowadays, less so, because there is so much good stuff online. The glory days of paper-based publications are never coming back, so I think it's worth taking a moment to give a shout out to some of the classics.

My top three comp sci books, the ones I will never throw out are:
- The C Programming Language by Kernighan and Ritchie
- Structure and Interpretation of Computer Programs, Abelson and Sussman
- Gödel, Escher, Bach, Hofstadter

Sadly, I did dump a lot of classic magazines:-/ Byte, Dr Dobbs, PCW....

Your turn:-)





ChatOps, DevOps, Pipes and Chomsky

Fri, 27 Jan 2017 12:39:00 +0000

ChatOps is an area I am watching closely, not because I have a core focus on DevOps per se, but because Conversational User Interfaces are a very interesting area to me and ChatOps is part of that.

Developers - as a gene pool - have a habit of developing very interesting tools and techniques for doing things that save time down in the "plumbing". Deep down the stack where no end-user ever treads.

Some of these tools and techniques stay there forever. Others bubble up and become important parts of the end-user-facing feature sets of applications and/or important parts of the application architecture, one level beneath the surface.

Unix is full of programs, patterns etc. that followed this path. This is from Doug McIlroy in *1964*:

"We should have some ways of coupling programs like garden hose--screw in another segment when it becomes necessary to massage data in another way."

That became the Unix concept of a bounded buffer "pipe" and the now legendary "|" command line operator.

For a long time, the Unix concept of pipes stayed beneath the surface. Today, it is finding its way into front ends (graphics editing pipelines, audio pipelines) and into applications architectures (think Google/Amazon/Microsoft cloud-hosted pipelines.)
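Here is the pipe idea in miniature, recast in Python: generator stages coupled like garden hose, the moral equivalent of `cat log | grep POST | wc -l`. The log lines are invented.

    # Generator stages chained the way "|" chains processes.
    def cat(lines):
        for line in lines:
            yield line

    def grep(pattern, lines):
        for line in lines:
            if pattern in line:
                yield line

    def wc_l(lines):
        return sum(1 for _ in lines)

    log = ["GET /", "POST /login", "GET /about", "POST /logout"]

    print(wc_l(grep("POST", cat(log))))   # 2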

Something similar may happen with Conversational User Interfaces. Some tough nuts might end up being cracked down in the plumbing layers by DevOps people, for their own internal use, and then bubble up....

The one that springs to mind is that we will need to get to the point where hooking in new sources/sinks into ChatBots doesn't involve breaking out the programming tools and the API documentation. The CUI paradigm itself might prove to be part of the solution to the integration problem.

For example, what if a "zeroconf" for any given component was that you could be guaranteed to be able to chat to it - not with a fully fledged set of application-specific dialog commands, but with a basis set of dialog components from which a richer dialog could be bootstrapped.

Unix bootstrapped a phenomenal amount of integration power from the beautifully simple concept of standard streams for input, output and error. A built-in linguistic layer on top of that, for chatting about how to chat, is an interesting idea. Meta chat. Talks-about-talks. That sort of thing.

Dang, just as Chomsky's universal grammar seems to be gathering dissenters...:-)





The new Cobol, the new Bash

Wed, 21 Dec 2016 11:46:00 +0000

Musing, as I do periodically, on what the Next Big Thing in programming will be, I landed on a new (to me) thought.

One of the original design goals of Cobol was English-like nontechnical readability. As access to NLP and AI continues to improve, I suspect we will see a fresh interest in "executable pseudo-code" approaches to programming languages.

In parallel with this, I think we will see a lot of interest in leveraging NLP/AI from chat-bot CUIs in programming command line environments such as the venerable bash shell.

It is a short step from there, I think, to a read-eval-print loop for an English-like programming environment that is both the programming language and the operating system shell.
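Here is a toy sketch of such a loop in Python, with a hand-rolled pattern match standing in for the NLP/AI layer. The phrasings it understands are invented; a real system would map much freer English onto shell actions.

    # A toy English-like shell REPL.
    import os

    def evaluate(sentence):
        words = sentence.lower().rstrip(".!?").split()
        if words == ["show", "me", "the", "files"]:
            return "\n".join(sorted(os.listdir(".")))
        if words == ["where", "am", "i"]:
            return os.getcwd()
        return "I don't understand that yet."

    def repl():
        while True:
            line = input("> ")
            if line in ("quit", "exit"):
                break
            print(evaluate(line))

    # repl()   # e.g.  > show me the files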

Hmmm....




Recommender algorithms R Us

Fri, 25 Nov 2016 10:16:00 +0000

Tomorrow, at congregation.ie, my topic is recommender algorithms, although, at first blush, it might look like my topic is the role of augmented reality in hamster consumption.

A Pokemon ate my hamster.




J2EE revisited

Fri, 04 Nov 2016 15:31:00 +0000

The sheer complexity of the JavaScript ecosystem at present is eerily reminiscent of the complexity that caused many folks to balk at J2EE/DCOM back in the day.

Just sayin'.




Nameless things within nameless things

Wed, 19 Oct 2016 13:32:00 +0000

So, I got to thinking again about one of my pet notions - names/identifiers - and the unreasonable amount of time IT people spend naming things, then mapping them to other names, then putting the names into categories that are .... named.... aliasing, mapping, binding, bundling, unbundling, currying, lambdaizing, serializing, reifying, templating, substituting, duck-typing, shimming, wrapping...

We do it for all forms of data. We do it for all forms of algorithmic expression. We name everything. We name 'em over and over again. And we keep changing the names as our ideas change, and the task to be accomplished changes, and the state of the data changes....

It gets overwhelming. And when it does, we have a tendency to make matters worse by adding another layer of names. A new data description language. A new DSL. A new pre-processor.

Adding a new layer of names often *feels* like progress. But it often is not, in my experience.

Removing the need for layers of names is one of the great skills in IT, in my opinion. It is so undervalued, the skill doesn't have, um, a name.

I am torn between thinking that this is just *perfect* and thinking it is unfortunate.




Semantic CODECs

Wed, 05 Oct 2016 16:16:00 +0000

It occurred to me today that the time-honored mathematical technique of taking a problem you cannot solve and re-formulating it as a problem (perhaps in a completely different domain) that you can solve is undergoing a sort of Cambrian explosion.

For example, using big data sets and deep learning, machines are getting really good at parsing images of things like cats.

The more general capability is to use a zillion images of things-like-X to properly classify a new image as being either like-an-X or not-like-an-X, for any X you like.

But X is not limited to things we can take pictures of. Images don't have to come from cameras. We can create images from any abstraction we like. All we need is an encoding strategy....a Semantic CODEC if you will.

We seem to be hurtling towards large infrastructure that is specifically optimized for image classification. It follows, I think, that if you can re-cast a problem into an image recognition problem - even if it has nothing to do with images - you get to piggy-back on that infrastructure.
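A minimal sketch of the encoding-strategy idea, assuming numpy is available: pack an arbitrary byte stream into a grayscale "image" so that image-oriented infrastructure can be pointed at data that never saw a camera. The side length and sample payload are arbitrary choices.

    # A toy Semantic CODEC: bytes -> grayscale image array.
    import numpy as np

    def encode_as_image(data: bytes, side: int = 16) -> np.ndarray:
        """Pack bytes row-major into a side x side array, zero-padded."""
        buf = np.zeros(side * side, dtype=np.uint8)
        clipped = data[: side * side]
        buf[: len(clipped)] = np.frombuffer(clipped, dtype=np.uint8)
        return buf.reshape(side, side)

    img = encode_as_image(b"GET /index.html HTTP/1.1\r\nHost: example.org")
    print(img.shape)   # (16, 16) - ready for an image classifier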

Hmmmmm.



The next big exponential step in AI

Wed, 21 Sep 2016 13:11:00 +0000

Assuming, for the moment, that the current machine learning bootstrap pans out, the next big multiplier is already on the horizon.

As more computing is expressed in forms that require super-fast, super-scalable linear algebra algorithms (a *lot* of machine learning techniques do this), it becomes very appealing to find ways to execute them on quantum computers. Reason being, exponential increases are possible in terms of parallel execution of certain operations.

There is a fine tradition in computing of scientists getting ahead of what today's technology can actually do. Charles Babbage, Ada Lovelace, Alan Turing, Doug Engelbart and Vannevar Bush all worked out computing stuff that was way ahead of the reality curve, and then reality caught up with their work.

If/when quantum computing gets out of the labs, the algorithms will already be sitting in the Machine Learning libraries ready to take advantage of them, because forward looking researchers are working them out, now.

In other words, it won't be a case of "Ah, cool! We have access to a quantum computer! Let's spend a few years working out how best to use them." Instead it will be "Ah, cool! We have access to a quantum computer! Let's deploy all the stuff we have already worked out and implemented, in anticipation of this day."

It reminds me of the old adage (attributable to Pólya, I think) about "solving for N". If I write an algorithm that can leverage N compute nodes, then it does not matter that I might only be able to deploy it with N = 1 because of current limitations. As soon as new compute nodes become available, I can immediately set N = 2 or 2000 or 20000000000 and run stuff.
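"Solving for N" in sketch form, with Python's multiprocessing standing in for whatever compute nodes the future provides: the algorithm is written against N workers and N is just a parameter.

    # Written for N workers; N = 1 today, N = 20000 tomorrow.
    from multiprocessing import Pool

    def work(chunk):
        return sum(x * x for x in chunk)

    def solve_for_n(data, n):
        chunks = [data[i::n] for i in range(n)]   # N-way split
        with Pool(n) as pool:
            return sum(pool.map(work, chunks))

    if __name__ == "__main__":
        print(solve_for_n(list(range(1_000_000)), n=4))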

With the abstractions being crafted around ML libraries today, the "N" is being prepped for some very large potential values of N.




Deep learning, Doug Engelbart and Jimi Hendrix

Thu, 15 Sep 2016 10:49:00 +0000

The late great Doug Engelbart did foundational work in many areas of computing and was particularly interested in the relationship between human intelligence and machine intelligence.

Even a not-so-smart machine can augment human productivity if even simple cognitive tasks can be handled by the machine. Reason being, machines are super fast. Super fast can compensate for "not-so-smart" in many useful domains. Simple totting up of figures, printing lots of copies of a report, shunting lots of data around, whatever.

How do you move a machine from "not-so-smart" to "smarter" for any given problem? The obvious way is to get the humans to do the hard thinking and come up with a smarter way. It is hard work because the humans have to be able to oscillate between smart thinking and thinking like not-so-smart machines because ultimately the smarts have to be fed to the not-so-smart machine in grindingly meticulous instructions written in computer-friendly (read "not-so-smart") programs. Simple language because machines can only grok simple language.

The not-so-obvious approach is to create a feedback loop where the machine can change its behavior over time by feeding outputs back into inputs. How to do that? Well, you got to start somewhere so get the human engineers to create feedback loops and teach them to the computer. You need to do that to get the thing going - to bootstrap it....

then stand back....

Things escalate pretty fast when you create feedback loops! If the result you get is a good one, it is likely to be *a lot* better than your previous best because feedback loops are exponential.

Engelbart's insight was to recognize that the intelligent, purposeful creation of feedback loops can be a massive multiplier: both for human intellect at the species level, and at the level of machines. When it works, it can move the state of the art of any problem domain forward, not by a little bit, but by *a lot*.

A human example would be the invention of writing. All of a sudden knowledge could survive through generations and could spread exponentially better than it could by oral transmission.

The hope and expectation around Deep Learning is that it is basically a Doug Engelbart Bootstrap for machine intelligence. A smart new feedback loop in which the machines can now do a vital machine intelligence step ("feature identification") that previously required humans. This can/should/will move things forward *a lot* relative to the last big brouhaha around machine intelligence in the Eighties.

The debates about whether or not this is really "intelligence" or just "a smarter form of dumb" will rage on in parallel, perhaps forever.

Relevance to Jimi Hendrix? See https://www.youtube.com/watch?v=JMyoT3kQMTg




The scourge of easily accessible abstraction

Thu, 11 Aug 2016 11:57:00 +0000

In software, we are swimming in abstractions. We also have amazingly abstract tools that greatly enhance our ability to create even more abstractions.

"William of Ockham admonished philosophers to avoid multiplying entities, but computers multiply them faster than his razor can shave." -- John F. Sowa, Knowledge Representation.

Remember that the next time you are de-referencing a URL to get the address of a pointer to a factory that instantiates an instance of a meta-class for monad constructors...






Sebastian Rahtz, RIP

Wed, 03 Aug 2016 10:57:00 +0000

It has just now come to my attention that Sebastian Rahtz passed away earlier this year.
RIP. Fond memories of conversations on the xml-dev mailing list.

https://en.wikipedia.org/wiki/Sebastian_Rahtz