Accuracy of Wikipedia

2008-03-24

Wikipedia is too often treated monolithically when its accuracy is being assessed. It shouldn't be, as Wikipedia is a common technical and regulatory system for encyclopaedic articles, and the accuracy of any given article is a function both of the behaviour of the Wikipedia system and the article's own subject matter.

As with the Linux kernel, it's possible for an individual working alone to make a small incremental improvement to Wikipedia, and for the beneficial improvements to be detected and preserved by others, such that they preponderate. This is what Yochai Benkler calls commons-based peer production in Coase's Penguin.

Wikipedia's regulatory system has a policy and an enforcement mechanism.

Wikipedia's policy is called NPOV, Neutral Point of View. This may sound like a sincere attempt to be objective on matters of fact, but in principle it means being agnostic as between the truth and its alternative. I have no conclusive evidence for the following smear, but the reason for this agnosticism is that Wikipedia's creator, Jimbo Wales, is or was an "Objectivist", an adherent of an anti-authoritarian anti-collectivist pseudo-philosophy invented by novelist Ayn Rand, with whose works I am unfamiliar, as to read them would see me stripped of my degree faster than throwing a brick through the Senate House window or watching rugby league.

Wales doesn't want an authoritarianism about what is true, so Wikipedia remains neutral as to "point of view". With minimal funding, of course, Wikipedia can't actually afford a truth-reckoning authority, but that is beside the point, because in practice, NPOV is pretty benign.

Wikipedia enforces its policies; there's a dispute-resolution procedure. I've used it. It sucks. Material can be forcibly removed from Wikipedia on grounds of NPOV-violation, falsehood or non-notability. The cost of contesting a strongly-fought dispute of this character is many hours of one's time. It's not worth it.

The relative costs and benefits of truth and falsehood in a Wikipedia article depend on the subject matter. The better our understanding of something, or the simpler it is, the lower the cost of maintaining an accurate Wikipedia article about it. (Simple and well-understood are not the same: quantum gravity is doubtless trivial; turbulence is complex but we know a huge amount about it; biology or economics on the other hand presently have neither characteristic.) There are few benefits to anyone of inaccurate Wikipedia articles about the physical sciences or mathematics.

Quite the opposite obtains in the case of history, politics, religion, linguistics, etc. Linguistics? I like to tell people that the point of historical linguistics is rape and pillage, and certainly in areas of the world where ethnic tensions are high, linguistic/onomastic arguments (e.g., your village's name is actually from the language of my ancestors, ergo ...) are used to inflame passions. It is easier to come up with a plausible historical linguistic argument than it is to plausibly refute one. People sufficiently motivated by contrarianism or a desire to fit in with the local racists, who can combine this fact with the general, non-domain-specific tools of intellectual dishonesty, are not going to be stopped by Wikipedia's wrist-slapping dispute resolution procedures.

This is why hard sciences and maths coverage on Wikipedia is very good, and some linguistic articles are an affront to the human capacity to perceive and reason about the external world. The good reputation of Wikipedia's physical science articles will attach to the lies of the ethnic cleansers, and vice-versa.

Unpacking "identity"

2008-03-24

The Big Issue recently carried a so-called "debate" on ID cards between Mike Parker of NO2ID and Meg Hillier, the relevant Government minister. Much of Hillier's writing merely states what the Government is threatening to do, rather than attempting to justify it, so it's not a debate in the proper sense.

Hillier is actually "Parliamentary Under-Secretary of State for Identity", which is horrifying to me, as I don't believe that "identity" exists. I'll admit that it's a useful concept in maths and logic, but that's not what is meant here. What people mean by the word seems to be something like "race", "culture", "ethnicity", "nationality", "gender", "sexuality", "class" or "government-issued documentation making assertions about a particular human being". The latter is sometimes a development of the all-too-common mistake that truth is what the government says it is (people don't really think this; it's what happens when they fail to think), or that the definition of words and concepts is properly the responsibility of the government (some people really desperately want this to be true, and for some countries it is).

Since the minister's title precedes her argument on p4 of the 17 March edition of the paper in question, it's possible to disagree with her premises before she's even started writing. She begins:

It's simply inconceivable that in a modern society, people do not have a single, simple, safe way of securing and verifying their identity. We all need to be able to prove who we are---when travelling, opening a bank account, getting a new mobile phone or applying for a job.

How many times has human thought gone wrong in those 51 words? (The whole article is about 750 words long.) I don't think it's worth counting, but just as an exercise, I shall try to identify some of them.

First, "inconceivable" is hyperbolic. We're already in the situation she describes, so if she cannot conceive of that situation, she cannot conceive of the material universe as it currently exists, and something is badly wrong with her brain and she should pursue a different occupation such as painting. What she means is that a means of verifying identity is necessary for some people and that it is wrong, in an egregious way, if for some reason this isn't possible. That might not actually be unreasonable. I just don't think it's as necessary as she does.

Whenever I read the word "modern" I always mentally substitute the phrase "post-Renaissance" and see what I get. People try to use the word to mean whatever they want it to mean, and so it has come roughly to mean "in the last five or ten years, or at least since the election of the Blair or Attlee governments".

So, on to "single, simple". Why should Hillier's constituents get ripped off by a "single" public or private identity-verification monopolist? We are never told.

Why should any such system be simple? Take all the pairs of principals who wish to participate in a transaction in which "identity" is "verified". This is quite hard for me as I can't think of any legitimate examples, but let's take Hillier's example of opening a bank account. The only things the bank really cares about (when not giving credit) are your address and some means of working out that future withdrawals are authorised by you. It may be forced to care about "who" you are, e.g., what your name is, by external factors such as regulatory punishments. So, consider the set of all instances where a particular bank and a particular person wish to open an account, together with the set of all analogous circumstances in other situations. (It should be noted that the banks are mainly after proof of residence rather than proof of identity, which is what all those utility bills are examined for.) We're already in complex territory, and only dealing concretely with bank accounts. Why is the degree of complexity in verification not properly a matter to be decided on a transaction by transaction basis? Some banks and some customers might want to use much more complex means of checking the bona fides of the other party in certain circumstances. Then again, the minister probably didn't even think as she typed the word "simple". It'll be there just because some better alternative to the Government's scheme is more complex.

What on earth does it mean to "secure" an identity? To get to the bottom of this, one needs to come up with a concrete description of what these people who believe in "identity" mean by it, if that's possible. I think what is meant is a sort of metaphysical object, outside physical reality, which is or represents a unique human individual. It's the sum of all the things like one's name, address, parents, serial numbers issued by various bodies. A soul in an age of unbelievers, composed of information. I note in passing that before Plato, atomists thought the soul was made up of matter like the body. This "identity" concept seems to be a set of pieces of information, some of them intimately concerned with physical material (biometric data). Ultimately, an identity in this sense is reducible to a single, very large, unique number. Is it not just an abuse of the word "secure" to apply it to such a thing? What is presumably meant is to impose access control on the physical objects known to contain copies of this data. "Security" is sometimes treated as a black and white, boolean concept; either something is secure or insecure. This is nonsense. One must ask "secure against what?" and calculate, in respect of each threat under consideration, whether the cost of countering the threat exceeds the harm otherwise obtaining, taking into account the probability of such harm, which is often a function of the costs and benefits of causing it.
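To make the "secure against what?" calculation concrete, here is a minimal sketch in Python. The threat names, probabilities and sums of money are all invented for illustration, as are the names Threat and worth_countering; nothing here comes from the Government's scheme or any real bank. It only compares the expected harm (probability times harm) with the cost of the countermeasure, and does not attempt to model how the probability itself depends on the attacker's own costs and benefits.

```python
from dataclasses import dataclass


@dataclass
class Threat:
    """One threat under consideration. All figures are hypothetical."""
    name: str
    probability: float          # chance the harm occurs if nothing is done (0..1)
    harm: float                 # cost of the harm if it does occur
    countermeasure_cost: float  # cost of countering the threat


def worth_countering(threat: Threat) -> bool:
    """True when the expected harm exceeds the cost of the countermeasure."""
    return threat.probability * threat.harm > threat.countermeasure_cost


# Invented examples: security is judged per threat, not as a single boolean.
threats = [
    Threat("forged utility bill", probability=0.01, harm=5_000.0,
           countermeasure_cost=200.0),
    Threat("stolen cheque book", probability=0.05, harm=10_000.0,
           countermeasure_cost=100.0),
]

decisions = {t.name: worth_countering(t) for t in threats}
```

With these made-up numbers the first threat is not worth countering and the second is, which is the point: the answer differs threat by threat and depends entirely on the figures one plugs in.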

It is just pathetically vague to say "We all need to be able to prove who we are". The three or four examples given of circumstances in which this need supposedly arises are ones in which the "need" is imposed by the Government, and in some cases imposed partially with the intention of increasing the demand for identity-verification services! We don't need to prove "who we are", whatever that means. We just need to satisfy the other party in a situation of whatever it is he wants to know, often name and address, or length of residence, or creditworthiness, not "identity".