Every man is entitled to his opinion ...

2008-09-25

... taken down and used against him.

A species of awfulness seldom considered in the excitement about social networking is what may happen to online discussion as it becomes cheaper to associate people with their previous Internet activity.

Firstly, if online discussion sites require people to authenticate themselves against operating web-of-trust systems such as Facebook, certain types of irritating behaviour will be discouraged: if your employer can read that you're running a campaign to fill Wikipedia with false or opinionated drivel, you're less likely to do it. Depseudonymisation won't change the rules of the game, but it will change the score.

More disturbingly, discussion site operators may try to use stylostatistic profiling and heuristics to moderate or exclude access. The possibilities are broad: statistically, anyone who has used any of the phrases "Ponzi", "fiat money", "superstate", "an ethics of", "a politics of", indeed any plural abstract noun ending in "ics" preceded by the indefinite article, is unlikely to be worth conversing with. I don't have the figures, but I'm prepared to bet serious money that anyone who uses the phrase "fiat money" wants lower taxes and has very particular views on monetary policy, anyone who says "superstate" is an old-school Eurosceptic, and anyone who says "a politics of" is or was a lefty. The use of active verbs with inanimate or even abstract subjects, e.g., "democracy claims that ...", "capitalism claims that ...", is probably correlated with a holistic point of view, and thereby with collectivism.
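To show how cheap this sort of phrase-spotting would be, here is a minimal sketch in Python. The phrase list is lifted from the examples above; the function name `find_markers` and the regular expression are my own assumptions, not anything a real site is known to run. It only reports which markers appear and draws no conclusions from them.

```python
import re

# Marker phrases taken from the examples in the text (illustrative only).
MARKER_PHRASES = ["ponzi", "fiat money", "superstate"]

# Indefinite article followed by a word ending in "ics",
# e.g. "an ethics of", "a politics of". This is an assumed pattern.
ARTICLE_ICS = re.compile(r"\ban?\s+\w+ics\b", re.IGNORECASE)


def find_markers(text):
    """Return the marker phrases and article-plus-"ics" constructions found in text."""
    lowered = text.lower()
    # Whole-word, case-insensitive match for each fixed phrase.
    hits = [
        phrase
        for phrase in MARKER_PHRASES
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered)
    ]
    # Add every "a/an ...ics" construction, lower-cased.
    hits += [m.group(0).lower() for m in ARTICLE_ICS.finditer(text)]
    return hits


print(find_markers("We need a politics of hope, not fiat money."))
```

A pattern this crude will also flag "a physics lecture" or "a statistics course", which is rather the point: the cost of a false positive falls on the person being profiled, not on the site running the filter.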

More abstractly, it will be possible to take stylistic elements, such as the conformance of punctuation to the literary standard, number of subclauses, etc., to generate an overall linguistic competence score. A potential application of this sort of analysis would be for screening CVs: if you don't want to employ someone likely to be sympathetic to unionisation, then you type their name into some website, and it could give you an estimate of how closely their writing conforms to the traditional literary standard (a proxy for private secondary educational background, which is itself correlated with political views) before any analysis of the substance of their writing is necessary. We could be facing "credit score"-like metrics for many aspects of linguistic expression: eventually it may be a legal requirement for reference agencies to comply with demands that writing misattributed to individuals be dissociated from their records, as credit reference agencies currently must in relation to financial data.
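The raw ingredients of such a score are not hard to extract. The sketch below, again in Python, counts two of the stylistic elements mentioned above: how many sentences are conventionally capitalised and terminated, and how often subordinating words appear. The word list, the sentence splitter and the name `style_features` are all assumptions made for illustration; how a real service would weight these into a single score is left open, since any weighting I gave would be invented.

```python
import re

# A small, assumed list of words that often introduce subclauses.
SUBORDINATORS = {"because", "although", "which", "while", "whereas", "unless", "since"}


def style_features(text):
    """Return crude stylistic counts for text (a sketch, not a validated metric)."""
    # Naive sentence split: break on whitespace that follows ., ! or ?
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if not sentences:
        return {"sentences": 0, "well_punctuated": 0.0, "subordinators_per_sentence": 0.0}

    # A sentence counts as "well punctuated" if it starts with a capital
    # letter and ends with terminal punctuation.
    well = sum(1 for s in sentences if s[0].isupper() and s[-1] in ".!?")

    # Count occurrences of the assumed subordinating words.
    words = re.findall(r"[a-z']+", text.lower())
    subs = sum(1 for w in words if w in SUBORDINATORS)

    return {
        "sentences": len(sentences),
        "well_punctuated": well / len(sentences),
        "subordinators_per_sentence": subs / len(sentences),
    }


print(style_features("I left because it rained. it was cold"))
```

Nothing here looks at what the writer actually said, which is exactly what makes the screening scenario plausible.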