thinking

Location: Mountain View, California, United States

thinking := [life, games, movies, philosophy, math, coding, pizza, &c.]

Monday, March 26, 2007

angle between subspaces

Someone asked me this question today, which turns out to have a very elegant solution if you know a "decent amount" of linear algebra (say, at least one advanced class about matrix computations):

Given two subspaces of R^n, what is the angle between them?

To be rigorous, we could state this as: how can you efficiently find the minimum angle between any two nonzero vectors, one in subspace A, and the other in subspace B? I choose the minimum here because I feel that it corresponds most closely with our intuition in 3D in the case of a plane vs. a line. Given a line that intersects a plane at a single point (maybe picture a sundial in your mind), we usually think of "the angle" between the two as the smallest angle. It also seems more distance-like than the maximum angle since, for example, we expect the "distance" between a subspace and itself to be zero, which is true for the min angle, but not necessarily the max.
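For the sundial picture only (a line meeting a plane in 3D), the minimum angle can be sketched directly. This is just the special case, not a solution to the general question; the function name and the choice to describe the plane by a normal vector are assumptions made for illustration.

```python
import math

def line_plane_angle(direction, normal):
    """Smallest angle (in radians) between a line and a plane through the origin.

    `direction` spans the line; `normal` is perpendicular to the plane.
    Both are hypothetical inputs given as 3-element sequences.
    """
    dot = sum(d * n for d, n in zip(direction, normal))
    norm_d = math.sqrt(sum(d * d for d in direction))
    norm_n = math.sqrt(sum(n * n for n in normal))
    # The line makes angle t with the normal, so it makes 90 degrees - t with
    # the plane. abs() treats v and -v as the same line; min() guards rounding.
    return math.asin(min(1.0, abs(dot) / (norm_d * norm_n)))

# A gnomon tilted halfway between the ground (z = 0) and straight up.
angle = math.degrees(line_plane_angle((1, 0, 1), (0, 0, 1)))
```

A line lying inside the plane gives 0 here, which matches the expectation that the "distance" should vanish when one subspace sits inside the other.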

I'll post what I think is a nice solution in a little bit. In the meantime, can anyone think of how to answer this?

Friday, March 16, 2007

Strings and knots

A friend of mine at work recently asked me the following problem, which was supposedly given as an interview question (not at my company):

Suppose n strings (physical strings, not character arrays!) are placed in a bowl so that we only see the ends of each string, and do not know which ends belong to the same string. Next we proceed by picking two ends at random and tying them together, and continue to do so until all the ends are tied. On average, how many loops will you end up with?

As an example of how it might work: if n=2, then there are 4 string ends left hanging out of the bowl. When we pick the first two ends and tie them together, either we are making a loop out of a single string, or we are tying two different strings together. Next we tie the last ends together (there are only two ends left so there is no choice). If our first knot was a self-knot that made a loop out of a single string, then there will end up being two loops in total. But if our first knot was between two different strings, then the remaining ends joined will result in a single long loop containing both strings. So the average in this case is somewhere between 1 and 2 (it's a fraction).
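If you would rather experiment than reason it out, the tying process can be simulated literally. The following is a minimal sketch, assuming that "picking two ends at random" means every way of pairing up the ends is equally likely; the function names and the end-numbering scheme are mine, not part of the original question.

```python
import random

def count_loops(n, rng):
    """Tie the 2n ends of n strings together at random and count the loops."""
    ends = list(range(2 * n))      # ends 2i and 2i+1 belong to string i
    rng.shuffle(ends)              # a uniformly random order of the ends...
    knot = {}
    for a, b in zip(ends[0::2], ends[1::2]):   # ...tied together in pairs
        knot[a] = b
        knot[b] = a

    seen = set()
    loops = 0
    for start in range(2 * n):
        if start in seen:
            continue
        loops += 1
        end = start
        while end not in seen:
            seen.add(end)
            other = end ^ 1        # travel along the string to its other end
            seen.add(other)
            end = knot[other]      # cross the knot onto the next string
    return loops

def average_loops(n, trials=10000, seed=None):
    """Monte Carlo estimate of the average number of loops for n strings."""
    rng = random.Random(seed)
    return sum(count_loops(n, rng) for _ in range(trials)) / trials
```

Calling `average_loops(2)` should land between 1 and 2, as argued above. The value is deliberately not printed here, so the puzzle stays unspoiled.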

I will not give away the answer just yet! If anyone would like to know, just post a comment, and I'll put it up.

Update: I've written up the answer. If you've thought at least a little bit about the problem, then you have permission to check out the answer :)

Wednesday, March 14, 2007

happy pi day!

:)

Monday, March 12, 2007

the subjective nature of causality

I would like to argue that the idea of something "causing" another thing is not an intrinsic property of the universe -- rather, it is a subjective property which is somewhat artificially assigned to the world by our own minds. It is a "useful fiction", much like the idea of a center of gravity in physics. So I do not mean to say that causality is nothing, but rather that it is an idea we use to simplify our understanding of the world, and which does not inherently exist outside our interpretations.

This is why: we feel that event x causes event y when
  1. y would not have happened without x, and
  2. we feel that "x almost didn't happen".
Both of these statements are purposefully vague, because the idea of causality is itself vague. The first point means that, if we imagine the world exactly the same, except that x did not happen, then it seems to us that y would also have (most likely) not happened. The second point means that, along the chain of events leading up to y, x was somehow among the least predictable. If the second point holds for several events as a possible cause (I'll give an example in a second), then we often feel that each of these events was also a cause of the ultimate event y.

In a moment I will argue that these points depend in a fundamental way on the ignorance of the one making the causality judgement, because if we "knew everything" then it would be hard (perhaps impossible) to imagine a world in which everything else was the same except that x didn't happen, and, further, it would be truly impossible to single out some event x which "almost didn't happen" more than any other past event. (I am assuming a deterministic world here -- and that is a matter for a different posting!)

Here are a couple of examples to help get our minds around this intuitive "definition" of causality:
Example 1. Suppose someone named Boberta shoots an apple. In this very straightforward scenario, we might say that the bullet rather speedily touring the apple's interior is a cause of the apple's demise. But what is the "real reason" the apple was destroyed? I think most people would be more satisfied with saying that "Boberta shot the apple" is more of a "real" cause. Why is this a better answer? By the second point above, it is easier to imagine that Boberta didn't shoot the apple than it is to imagine that Boberta did shoot the apple yet the bullet did not destroy it. (Let's think of the shooting as being at very close range, so that aim is not a question.) Out of those two events, it would be less surprising if Boberta never shot the gun than if Boberta did shoot the gun and the bullet did not destroy the apple. This illustrates how the second point is important for our idea of causality. (By the way, if you are wondering why Boberta shot the apple, it's because the apple killed Boberta's father. It was a bad apple.)

Example 2. Now let us suppose that both Alfonzo and Boberta, with a merciless thirst for applejuice, simultaneously shoot the apple. In this case it intuitively feels as if neither "Alfonzo shot the apple" nor "Boberta shot the apple" alone is quite the cause of the apple's demise. In some sense, we feel we are cheating a bit if we do not mention both shootings. This is meant to illustrate the importance of point one -- if only one of these events didn't happen, the apple would still be maimed. Only when we imagine both shootings not taking place do we imagine the apple remaining, at least for a short while, peacefully intact.

So far I have offered some defense of my definition of causality. Now let me proffer the notion that "x caused y" is inevitably a subjective idea. This follows because our ability to imagine a slightly different world, in which x did not happen and all else remained the same, and further our ability to estimate the probability with which x almost didn't happen, could never exist in the presence of a perfect knowledge of the world. That is, our ignorance is an integral part of our causality judgement.

Example 3. This one is a little more far-fetched in order to really capture ignorance as an almost essential property of someone in the situation. We've all seen The Matrix, our modern pop-culture version of Plato's allegory of the cave. Suppose that in the future we are running a highly complex simulation of an entire world on a vast system of computers. This world is completely deterministic. And the creators of this simulated world have set it up so that at some point in time, the simulated planet of all these simulated people will be hit by a (simulated) comet, leading to the deaths of many (simulated) people. The question is: what is the cause of all these deaths? Within the simulated world, as far as these people ever know, the comet has caused this disaster. Certainly if the comet were not present (or had missed the planet), it would not have happened (point one of causality). And, for most people, it seems very easy to imagine a world in which a particular comet did not exist, or at least whose trajectory was a little bit different (point two of causality). Now let us recall that we know the entire world to be a simulation. In this example, we could also pretend that the entire simulation was set up by a rather sadistic fellow who enjoyed the idea of a planet being devastated by a comet. Now that we know more about the world, we can give a better-informed "real cause" of the destruction: this programmer's decision to set up the simulated world in such a way.

The point of this example is to illustrate the role of ignorance in causality. No one in the simulated world could ever possibly know about the sadistic programmer unless someone in the real, non-simulated world decided to somehow interfere (and for the sake of the thought experiment we may suppose this does not happen). Hence within the simulated world, there is simply no way to ever know a better "cause" of the comet hit beyond the mere laws of physics within that world.

As far as we may ever know, we might as well be these simulated people, never knowing the complete story, but filling in those gaps in our minds by guessing at the small differences in which the world may evolve, and basing our causal links on our own internal model of many possibilities and our own idea of their probabilities. Yet these possibilities and probabilities could never exist with complete knowledge of the world, for there is but one world; no other possibilities, and no question of probability.

By the way, I suspect there's a lot of pre-existing philosophical work discussing similar ideas, and I'd love to hear about it! Also, thanks to my friend Lara for some useful brainstorming while thinking about this, and Rebecca for reminding me of Plato's somewhat-Matrixy allegory :)

Friday, March 09, 2007

the (un)expected value

the mean and the nice

The term "expected value" has a very precise meaning in probability theory, but I think it often clashes with what we as everyday humans would really "expect" from certain probabilistic situations.

To avoid confusion, I will refer to the traditional definition of expected value as the mean (this is also standard terminology [wiki]) and contrast it with a new idea which I will here refer to as the nice.

Let's see why the mean is not always what's expected. Consider a game of chance which you can play for free. Here's how it works: you flip 20 coins. If any of them are heads, you win a million dollars. If they're all tails, you owe 10 trillion dollars (yes, you're really screwed if you lose). Would you play this game? I would, and I think that many people would happily do the same. The chances you'll lose are less than one in a million (since (1/2)^20 = 1/1,048,576 < 1/1,000,000). Yet the mean value is negative (roughly -8.5 million dollars), since the loss of 10 trillion outweighs the far-more-likely win of a million. Intuitively speaking, what value do you really expect? I think most people would agree that they are least surprised by the outcome of winning a million dollars.
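The arithmetic behind that game is easy to check exactly. Here is a small sketch using Python's `fractions` module; the variable names are just illustrative.

```python
from fractions import Fraction

p_lose = Fraction(1, 2**20)        # all 20 coins come up tails
p_win = 1 - p_lose                 # at least one head

win = 10**6                        # win a million dollars
loss = -10**13                     # owe 10 trillion dollars

mean = p_win * win + p_lose * loss

# Compare the two outcomes by probability to find the least surprising one.
most_likely = win if p_win > p_lose else loss

print(float(p_lose))               # a bit under one in a million
print(float(mean))                 # about -8.54 million dollars
print(most_likely)                 # the million-dollar win
```

So the mean sits near -8.5 million dollars, a value the game can never actually produce, while the single most likely outcome is the million-dollar win.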

Hence I am hoping to mathematically capture this notion of the "least surprising" outcome, which we will call the nice value. Here are some properties the nice value should have:
  • The values close to the nice value should be far more likely than those far away from the nice value.
  • The nice value should be an actually possible value in the probability distribution.
  • It should be understood that some nice values are "expected" with much less confidence than others.

I mention the last point because the above game is relatively easy to predict. Consider rolling a six-sided die. What should be the nice value? I think in that case, intuitively speaking, there is no best value. We expect any of the 6 numbers to show up with equal probability. So there are many probability distributions in which we have low confidence about the outcome. The motivation for a nice value is targeted toward probability distributions which are concentrated around a particular value.

a gambling strategy

Part of my motivation for thinking about the nice value is the following old "guaranteed-to-work" gambling strategy. Suppose you're playing a double-or-nothing game with 50/50 odds. In other words, you bet a certain amount, flip a coin, and if it's heads you get back twice your bet, while if it's tails you lose your bet. (So the mean value is zero, since half the time you lose x dollars and the other half you gain as much.) Here is the strategy: first you bet x. If you win, you stop, and you've successfully won x dollars. If you lose, you bet 2x. If you win, your net gain is now 2x-x = x, and you stop. If you lose, you're down by 3x. So bet 4x. If you win, you're up by x, and you stop. Et cetera. You just keep doubling your bet until you win, and you're "guaranteed" to end up winning x dollars.

What's the catch? The problem is that you could run out of money before you ever win. Suppose you start with 6x dollars. Then there's a 25% chance that you'll lose twice in a row, in which case you won't be able to afford doubling your bet again. If you play this strategy long enough, you're essentially guaranteed to go broke.

However, what this strategy does achieve is a way to effectively turn this essentially unpredictable game into a (possibly longer) game with a single very likely outcome -- that you'll end up winning x dollars. Specifically, let's say you start with 63x dollars. Then you can afford to lose up to 5 times in a row and still ultimately win overall in the last game (with a bet of 32x on that last game). The only way to fail is to lose all 6 games, which happens with probability (1/2)^6 = 1/64. This means you'll net x dollars profit with probability 63/64 = 98.4375%. So I would call x the nice value of that game.
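The bankroll arithmetic can be worked out mechanically. The sketch below measures everything in units of x and assumes you stop as soon as you can no longer cover the next doubled bet; the function names are hypothetical.

```python
from fractions import Fraction

def doubling_outcome(bankroll):
    """Return (rounds you can afford, P(net win of 1 unit), loss if every round is lost)."""
    bet, lost, rounds = 1, 0, 0
    while bankroll - lost >= bet:  # can we still cover the next bet?
        rounds += 1
        lost += bet                # worst case: this round is lost too
        bet *= 2
    p_fail = Fraction(1, 2**rounds)   # every affordable round comes up tails
    return rounds, 1 - p_fail, lost

def mean_profit(bankroll):
    """Mean profit of the whole doubling strategy, in units of x."""
    _, p_win, loss = doubling_outcome(bankroll)
    return p_win * 1 - (1 - p_win) * loss

print(doubling_outcome(63))        # 6 rounds, win with probability 63/64, else lose 63
print(doubling_outcome(6))         # 2 rounds, win with probability 3/4, else lose 3
print(mean_profit(63))             # 0
```

The mean stays at zero in both cases. The strategy does not change the mean; it only concentrates the probability on the outcome "win x".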

By the way, you may be wondering why people don't use this strategy all the time if it sounds so tempting. Well, in the long run, it's actually not a great strategy. I think it's a good one-time strategy, but not something you'd like to repeat. For example, suppose you go to a casino and use this strategy once a month, starting with 63x each time. By the end of 4 years, it's more likely than not (about a 53% chance) that you'll have lost at least once. Even if you only lose once, and win the other 47 times, you've still lost overall, since a single loss means losing 6 times in a row, or a loss of 63x against 47x in winnings. This perspective shows that the mean value is the "right value" for long-term additive behavior, while the nice value is meant to express a one-time value which is the least surprising outcome.
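A quick check of the four-year scenario, as a sketch assuming a fresh bankroll of 63x on each of 48 independent monthly visits (all amounts in units of x):

```python
p_win_visit = 63 / 64              # chance a single visit nets +1
visits = 4 * 12                    # once a month for four years

# Probability that at least one of the 48 visits ends in six straight losses.
p_some_loss = 1 - p_win_visit ** visits

# Net result if exactly one visit fails and the other 47 succeed.
net_with_one_loss = (visits - 1) * 1 - 63

print(round(p_some_loss, 3))       # a little over one half
print(net_with_one_loss)           # -16
```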

toward a rigorous definition of the nice

Suppose we have a metric space X and a probability distribution P() on X. (I also assume that the measurable sets of X include all open balls generated by the distance function of the metric.) Given any x in X, we can measure its niceness via the function f_x:R → [0,1] defined by

f_x(z) = P({y: dist(x,y) < z}).

Basically, f_x() is a nondecreasing function which (if no distance in X is infinite) asymptotically approaches 1. The faster f_x() approaches 1, the nicer x is -- that is, the less surprising it would be that x is the outcome. The trick is that functions are not easily put into a linear ordering, so it is hard to say which value of x is the nicest. Beyond that, we can easily imagine situations in which many values of x have the same function f_x() -- for example, rolling a die. But when there is a value x in X which gives the greatest function f_x(), meaning f_x(z) ≥ f_y(z) for every y in X and every z, we can safely say that that value is the nicest.
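For a finite distribution the definition can be tried out directly. The following is a minimal sketch, not a general implementation. It reads "greatest" as pointwise, i.e. f_x(z) >= f_y(z) for every z, which is my interpretation. It also treats the die faces as unordered labels (the discrete metric), which is an assumption. All function names are made up for illustration.

```python
from fractions import Fraction

def absolute(a, b):
    """Usual distance on the real line."""
    return abs(a - b)

def discrete(a, b):
    """Discrete metric: outcomes are either identical or simply different."""
    return 0 if a == b else 1

def f(dist, x, z, metric=absolute):
    """f_x(z) = P({y : dist(x, y) < z}) for a finite distribution {outcome: probability}."""
    return sum(p for y, p in dist.items() if metric(x, y) < z)

def at_least_as_nice(dist, x, other, metric=absolute):
    """True if f_x(z) >= f_other(z) for every z."""
    def mass_within(centre, d):
        return sum(p for y, p in dist.items() if metric(centre, y) <= d)
    # Both functions are step functions that only change just past a distance d
    # from a centre to an outcome, and just past d they equal mass_within(., d),
    # so comparing at those finitely many radii covers every z.
    radii = {metric(c, y) for c in (x, other) for y in dist}
    return all(mass_within(x, d) >= mass_within(other, d) for d in radii)

# The 20-coin game: win a million, or (very rarely) owe 10 trillion.
game = {10**6: 1 - Fraction(1, 2**20), -10**13: Fraction(1, 2**20)}
# A fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}

print(at_least_as_nice(game, 10**6, -10**13))    # True
print(at_least_as_nice(game, -10**13, 10**6))    # False
print(all(at_least_as_nice(die, a, b, discrete) for a in die for b in die))  # True
```

In the coin game the million-dollar outcome dominates, so it is the nicest value. For the die under the discrete metric every face has the same f_x(), so no face is nicer than any other.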

Future work: try to define the nicest value in other cases!

Sunday, March 04, 2007

corner bistro

To those in search of an excellent New York City cheeseburger: may I recommend to you the Corner Bistro, happily located a mere two blocks from my place of abode (not that this proximity affects in any way the taste of said burger).

Vegetarians, lovers of healthifulnessity, deniers of hedonism: avert your fair untainted eyes.



Meandering the serpentine streets of the West Village, your first hint of grilled-bovine salvation is a concise neon appellation: "Corner Bistro", hold the serifs. Once inside, this sans-frills style saturates the scene. A lazy haze of smoke once permeated the very soul of this cozy joint -- haze illegal now, its ghost remains in the nostalgic distance which, in your mind alone, separates your suddenly-anachronistic chic from the grimy aged authenticity of the regulars around you. The bar's cash register, a metallic curve of plastic mushroom buttons, is easily a century old if a day. A shaded fluorescent bulb dimly illuminates a ten-line "menu" on the wall, consisting primarily of four varieties of burger. The missing 'i' in "chiliburger" imparts a lacunal singularity of character. An impressively laconic waitstaff waves your thankfully small party down a narrow coat-racked hall, adorned most curiously by a row of Heinz ketchup bottles -- each, label facing out, standing at attention, patiently awaiting its ephemeral tabletop role.

Be seated, or rather scrunched, at your table, and choose your fare; a 'bistro' means baconcheeseburger, or hold the pork, or the cheese or both or add some chili. Curiously baconic fries? Grilled cheese? A drink or two or so away, and you may wonder for a moment if that last fleeting gesticulation of your server was acknowledgement of an order soon fulfilled or was it just your hope assigning undue meaning to a twitch as they flurried away, order unheeded. But here! Yes, they heard! Half a pound of freshly-grilled, juice-imbued, cheese-melting, tomato-strewn, barely-bun-encompassed, beautiful beef sirloin sits steaming snugly under your overwhelmed olfactory-orifacing salivation-inducing proboscis dinostrilonimus (don't worry I completely made up that last word). Now here we find a burger which, viewed in any cross section of significance, is still beef by vast majority: in volume, density, height, diameter, and sheer incorrigible flavorocity. Every bite produces from its opposite edge another egress of warm oily beef-nectar, which in turn deposits itself, filtered lingeringly along your fingertips, into a congealing puddle of wisely ignored goop atop your paper plate.

Upon completion of this iniquitous consumption, it will eventually occur to you that you must now open your eyes, pay the bill, and peacefully resume your normally scheduled life.