Subscribe: /~colmmacc/
http://www.stdlib.net/~colmmacc/feed/atom/
Language: English
Tags:
atheist ireland  atheist  element  elements  good  ireland  mac  number  numbers  people  random  runs  set  tasks  time
Rate this Feed

Feed Details and Statistics
Preview: /~colmmacc/

# /~colmmacc/

## An Irishman's Fiery

Updated: 2011-05-15T17:01:26Z

Weighty matters

2011-05-15T17:01:26Z

One of my favourite programming exercises is to write a text generator using Markov Chains. It usually takes a relatively small amount of code and is especially useful for learning new programming languages. A useful goal is to able to have a tool that ingests a large body of text, breaks it into N-tuples (where [...]One of my favourite programming exercises is to write a text generator using Markov Chains. It usually takes a relatively small amount of code and is especially useful for learning new programming languages. A useful goal is to able to have a tool that ingests a large body of text, breaks it into N-tuples (where N is chosen by the operator), and then emits a random text of length L where that text uses those tuples at the same frequency that the original body did. On its own, that task can produce some fun results, but it’s also a very repurpose-able technique. You can use this kind of statistical generation to simulate realistic network jitter (record some N-tuples of observed RTT with ping), or awesome simulated-user fuzz tests (record some N-tuples of observed user inputs). It’s surprising that it isn’t more common. But when approaching these problems, from experience of working with newcomers, what seems to be a common first tripping point is how to do weighted selection at all. Put most simply, if we have a table of elements; element weight A 2 B 1 C 1 how do we write a function that will choose A about half the time, and B and C about a quarter each? It also turns out that this is a really interesting design problem. We can choose to implement a random solution, a non-random solution, a solution that runs in constant time, a solution that runs in linear time and a solution that runs in logarithmic time. This post is about those potential solutions. Non-random solutions When coming to the problem, the first thing to decide is whether we really want the selection to be random or not. One could imagine a function that tries to keep track of previous selections, for example; 1234567891011121314151617elements = [ { 'name' : 'A' , 'weight' : 2 },              { 'name' : 'B' , 'weight' : 1 },              { 'name' : 'C' , 'weight' : 1 } ] def select_element(elements, count):   total_weight = 0   for element in elements:     total_weight += element['weight']     if count < total_weight:          return  ( element ,                    (count + 1) % len(elements) ) # select some elements c = 0 (i , c) = select_element(elements, c) (j , c) = select_element(elements, c) (k , c) = select_element(elements, c) This function uses the “count” parameter to remember where it left off last time, and runs in time proportionate to the total number of elements. Used correctly, the first two times it’s called this function will return A, followed by B, followed by C, followed by A twice again and so on. Hopefully that’s obvious. There’s a very simple optimisation possible to make it run in constant time, just sacrifice memory proportionate to the total sum of weights; 12345678elements = [ { 'name' : 'A' },              { 'name' : 'A' },              { 'name' : 'B' },              { 'name' : 'C' } ] def select_element(elements, count):   return  ( element[count] ,             (count + 1) % len(elements) ) This basic approach is surprisingly common, it's how a lot of networking devices implement weighted path selection. Random solutions But if we're feeding into some kind of simulation, or a fuzz-test, then randomised sel[...]

Prime and Proper

2010-09-02T06:24:59Z

A common software engineering pattern is the need to test elements for set membership. Practical questions like “is this item in my cache” arise surprisingly often. Bloom Filters are a great mechanism for performing this test, and if you haven’t encountered them before – they’re well worth knowing about. But over the last year, I’ve [...]A common software engineering pattern is the need to test elements for set membership. Practical questions like “is this item in my cache” arise surprisingly often. Bloom Filters are a great mechanism for performing this test, and if you haven’t encountered them before – they’re well worth knowing about. But over the last year, I’ve been experimenting with a different method for these kinds of membership tests on enumerated sets, and thought it was worth sharing. It’s definitely not the first time the technique has been used, Cian found some papers on its use. But it’s not a common pattern, which is a pity – because it’s incredibly simple, and lends itself very well to modern hardware. First things first, some huge disclaimers; The approach only works on enumerated sets. Each element in the problem space has to be assigned a unique counted number. Hashes won’t do. The method is constrained by the size of product of the total number of elements and the maximum size of any set. So, it’s very simple; Take each element in the problem space, assign it a unique prime number, and then represent sets as the products of those primes. This works because of the fundamental theorem of arithmetic. Here’s a summary of operations; Computing a set: Compute the product of all of the elemental primes in the set. s = a * b * c Testing for element membership: Test the modulus of the set number. (s % a) == 0 Adding to the set: Multiply the existing set product by the new element’s prime number. s *= d Removing from the set: Divide the existing set product by the element’s prime number. s /= a Now at this point, you’re probably skeptical about the usefulness of the technique, given the constraints. Obviously other operations like unions and intersections between sets are possible, but they require factorisation – and so are not particularly efficient (though you’d be surprised how quickly they do run). But look at the benefits; Unlike Bloom filters, there are zero false positives, and zero false negatives. The method is 100% precise. Due to their use in cryptography; libraries, CPUs and other hardware have arisen that are highly efficient at computing the products and modulus of very large numbers. As a case-in-point, let’s use the internet as an example. Suppose you want to model the internet as a system of interconnected autonomous systems and paths. A really common test when using such models is to determine whether or not a particular system is on a particular path. If we take the internet as 100,000 autonomous systems, and the longest practical path on the internet as containing 20 such systems (longer paths exist, but are oddities – in real terms these numbers are both over-estimates) the largest product we would ever need to compute is smaller than 2410. That’s much smaller than even modern cryptographic primes themselves, and much much smaller than their products. There is a lot more room. Actually, that was my first real-world use-case for this technique – modeling the internet in a few hundred lines of python, on a laptop – as a prototype. Surprisingly, a Macbook Air could perform well over one million such membership tests per second without skipping a beat – after reordering to assign the most popular AS numbers the lowest prime-numbers. And incidentally, if that kind of problem interests you – Cloudfront are hiring. Now a question, for the information theorists; How is it that this method can seemingly perfectly encode N bits of information in fewer than N bits? For exa[...]

Fact-checking Ireland Offline

2010-03-29T19:43:33Z

Holy Orders

2010-01-03T15:14:14Z

Point Break

2009-12-28T21:12:52Z

So, about 2 minutes after taking this photo; I slipped on the paraffin-laden solid granite pavement, everything went flying, and I ended up getting a very different kind of photo entirely; That’s my right arm, and somewhere in there a minimally-deforming radial fracture that I don’t remotely have the training to actually see. Though I [...]

So, about 2 minutes after taking this photo;

I slipped on the paraffin-laden solid granite pavement, everything went flying, and I ended up getting a very different kind of photo entirely;

(image)

That’s my right arm, and somewhere in there a minimally-deforming radial fracture that I don’t remotely have the training to actually see. Though I definitely don’t like the look of that suspiciously lumpy bit of bone. Here’s another look;

(image)

As fractures go, I think I got the best kind. It’s now 3 weeks later and I’m almost back to a full range of movement and strength is starting to return. I can write, type, take photos and most importantly, play music once again. The people at St. James’s Hospital have been very good, and if you ever plan to break anything I can recommend doing it near them.

Science is really really cool. Randomised control trials have shown that a simple millennia-old sling (not a cast) is the best treatment, so that’s what I got. About a century of rigourous bio-chemical engineering has led to little pills I could buy in a pharmacy that magically suppress complex sources of pain with minimal side-effects.

Over that same century we’ve progressed from a naive understanding of “Röntgen radiation” as mysterious emanations from a vacuum tube, to a complex quantum-mechanical model of X-Ray interactions that allows us to record these highly-ionising photon jet-streams on a semi-conductor, convert the impression into a digital image and then have it pop up on my clinicians desktop. Right now, reading a random blog on the internet, you can see inside my body – as easily as you might look out the window.

And social science is used too. Because people frequently forget their appointments, there are text message reminders; a few days, and one day before anything that’s scheduled. Again, a trial has shown that this is both cost-effective and clinically beneficial. And when I go to the physio-therapy clinic, I get given a simple set of tried and tested exercises that have been shown to lead to improvements.

Cooler still is that despite all of the science, and care and attention involved in the whole process – really it’s the body doing the work. Through some magic – mostly unknown – system of DNA signaling, controlled protein unfolding, stem-cells and “stuff”, tiny microscopic organic “bits” somehow communicate and coordinate the building of whole new bone, mostly in the right place. So without having to do all that much, I can play Fussball and write dumb blog entries again.

It’s almost worth doing just for the experience.

Con-science

2009-12-16T17:43:38Z

I’m sitting on a train, the Enterprise, moving along at about 140 kilometers per hour between Drogheda and Dublin. Something is causing practically every particle in my body to spark into a new life at a different position in space one instant to the next. For each one of those particles, there’s some chance it [...]I’m sitting on a train, the Enterprise, moving along at about 140 kilometers per hour between Drogheda and Dublin. Something is causing practically every particle in my body to spark into a new life at a different position in space one instant to the next. For each one of those particles, there’s some chance it could just pop into life on the far side of the moon, or even the universe. But something, I guess “motion” is averaging everything out and look, there the sum of me pops, every tiny instant getting slightly closer to Dublin. If the train was in the vaccuum of space – or somehow free from the effects of friction – it wouldn’t even take anything to keep this atomic leap-frogging going. Each small micro-part of me would just keep on dancing. And somehow at the same time, it’s exactly as if I were staying still all along – there’s no real difference. This has been tested. But if for any reason a change of tempo was required, to move from a waltz to a samba, something intangible and ephemeral – mysteriously lumped into the word “energy” – is required. It’s weird and mystical and it doesn’t make a whole lot of sense, but it can be measured and described – and relied upon. And as I sit here and write, using the word “I” like that, it feels like there’s a “me”. It seems as if there’s a real sense of ownership over my thinking, I can’t tell where my thoughts come from, but they are mine. I feel like I’m free to choose things, when I get off of the train I could get straight into a taxi or walk to a tram. I don’t know which I’ll do yet, there are a lot of factors to consider, but neither choice seems pre-ordained, and once it happens it will feel like “I” owned that choice. “Free will” seems as real to me as the forward motion of this train, it’s an inexplicable dance, but it’s my tune. And that makes less sense. Reality is describable by all of these symbols and equations, and we can make predictions with them. This has been tested. There are even divisions between what is predictable in detail, and what is predictable only in the aggregate, random and unknowable at the finer grain. Here in my head, it doesn’t seem like my thoughts are predictable, or could be. If they are just an inevitable cadence, then that “I” is merely an illusion. It also doesn’t seem like thoughts are random or unknowable, to me anyway. I have patterns of thinking, recurring themes, and a detectable personality. It can’t be the same dance. How many miracles are there on this train? Motion is miraculous enough, that “I” can think, that the universe even exists seems miraculous too. But most astonishing, is that these simple realities are seemingly contradictory. I’m still on that train, I haven’t gone anywhere, there haven’t been any angels visiting me, or deitious interventions. These thoughts haven’t led anyone to complicated dogmas, wars or ceremonies. Some say science is cold. Maybe the dance is a myth. And if this mental dance should one day just end, where are the real love-songs? Songs that speak to the real meaning, the genuine warmth, of spending a fleeting, passing, bittersweet whirl around the floor with someone. Not an infinitesimal slice of eternity living in hope of a better dance; that’s cold. This has been tested. [...]

Role Models

2009-12-13T20:24:14Z

Period Pain 3

2009-11-26T22:54:36Z

Period Pain part 2

2009-09-27T12:52:45Z