.. while this two hour query runs..
2012
unknownj
The thing with analysis is that you have to appreciate cause and effect.. On that theme, I did a profile comparison report on People Called James versus People Called Naomi, to prove my point..

The following root demographics apply:
People called Naomi are more likely to be aged 0-34
People called James are more likely to be aged 45+

People called Naomi are more likely to be female
People called James are more likely to be male
By "root demographics", I mean unchangeable facts about these customers that are set in stone from birth. Most of the time, these demographics override every other property that an individual can have.. Being called James doesn't make you male - being male was what caused you to be called James. Similarly, being called Naomi doesn't cause you to have been born in the 70s or later. When you were born, and which names were common at the time, is why you're called Naomi.

Then, looking at the profiles, some more facts come out:
People called Naomi are more likely to be single
People called James are more likely to be married

People called Naomi are more likely to be high risk
People called James are more likely to be low risk

People called Naomi are likely to be low earners
People called James are likely to be high earners
So there we have it, people called James earn more, hooray! And we're more likely to get married!

Except, that's not the case at all.. People called James are more likely to be married, because they're found in the 'Males over 45' group, which tends to be married. People called Naomi are more likely to be single because they're 'Females under 35', not because they're called Naomi.

Similarly, the risk and earnings statistics are driven not by the property that I'm differentiating on (first name), but by age and gender.

So really, age and gender drive choice of name. Age and gender drive earnings. Using conditional probability, you can make some assumptions about the earnings of somebody based on their name, but there is no cause & effect link between the two. They are both effects of a separate cause.
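
To make that concrete, here is a minimal sketch in Python. The counts are made up purely for illustration (they are not figures from my report), and for simplicity it splits on age band only; gender would just be another layer of the same split. Within each age band the two names have identical high-earner rates, yet the overall rates differ because the names are spread unevenly across the bands.

```python
# Hypothetical counts, invented for illustration only:
# (name, age_band) -> (number of people, number of high earners)
COUNTS = {
    ("James", "45+"):      (800, 480),  # 60% high earners
    ("James", "under 35"): (200, 40),   # 20% high earners
    ("Naomi", "45+"):      (200, 120),  # 60% high earners
    ("Naomi", "under 35"): (800, 160),  # 20% high earners
}


def overall_rate(name):
    """High-earner rate for a name, ignoring age band."""
    people = sum(n for (nm, _), (n, _h) in COUNTS.items() if nm == name)
    high = sum(h for (nm, _), (_n, h) in COUNTS.items() if nm == name)
    return high / people


def rate_within(name, age_band):
    """High-earner rate for a name inside a single age band."""
    people, high = COUNTS[(name, age_band)]
    return high / people


if __name__ == "__main__":
    for name in ("James", "Naomi"):
        print(name, "overall:", overall_rate(name))
        for band in ("45+", "under 35"):
            print("  ", band, rate_within(name, band))
```

Looked at overall, James appears to out-earn Naomi. Looked at band by band, the name makes no difference at all, which is what you'd expect if both name and earnings are effects of the same underlying cause.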


This comes into play a lot in my job.. Customers using a certain product seem to be higher risk customers. But when you look at it again, you see that it's just that they're younger customers, which skews the risk statistics. The product is not attracting people based on risk - it is attracting a certain type of customer, and they're bringing their other characteristics with them. These characteristics are purely coincidental, and play no part in their choice of product.


This sort of analysis has interesting applications in racial profiling. As far as I'm concerned, it's a perfectly valid initial technique, but it is open to drastic misuse and can build up feedback loops. Imagine the following scenario:

- The computer tells you that 80% of arrests for car theft involve people in ethnic group A
- You concentrate your efforts in pulling over people in ethnic group A, for best results
- You catch lots of people from A who have stolen cars, thus validating the model

It doesn't matter how many people in ethnic group B are stealing cars, as soon as the model is used to predict future behaviour, it gets into a loop of self-fulfillment. Add to that a potential for willful misinterpretation of results based on political or racially motivated agendas, and it's a bad thing..
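
A toy version of that loop, as a sketch under loudly stated assumptions: both groups offend at exactly the same rate, each round's stops are split in proportion to the previous round's arrest share, and every number and name here (`simulate`, `offence_rate`, the 80% starting point) is hypothetical. The point is only that the recorded share never drifts back towards the truth, because the data reflects where you looked.

```python
def simulate(rounds=5, stops_per_round=1000, offence_rate=0.05,
             initial_share_a=0.8):
    """Return group A's share of arrests after each round.

    Assumption: groups A and B have the SAME true offence rate.
    Stops are allocated in proportion to the previous arrest share.
    """
    share_a = initial_share_a
    history = []
    for _ in range(rounds):
        stops_a = stops_per_round * share_a
        stops_b = stops_per_round - stops_a
        # Expected arrests: same rate per stop in both groups.
        arrests_a = stops_a * offence_rate
        arrests_b = stops_b * offence_rate
        # The new "evidence" simply mirrors how the stops were split.
        share_a = arrests_a / (arrests_a + arrests_b)
        history.append(share_a)
    return history


if __name__ == "__main__":
    for round_number, share in enumerate(simulate(), start=1):
        print(f"round {round_number}: group A share of arrests = {share:.2f}")
```

Even though the two groups behave identically in this sketch, the arrest share for group A stays at its starting value round after round, so the model looks validated every time it is checked against its own output.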

Nevertheless, take this example: You know a young man has stolen a car. You know that there is a direct correlation between car theft and poverty levels, ergo it is more likely that the criminal grew up in poverty. Statistics show that there is also a direct correlation between ethnicity and poverty levels. It's hard to see a purely logical argument for why you wouldn't initially suspect somebody with a specific ethnic background..

And yet, ethnicity is not a direct cause in any of this. Ethnicity lends itself to certain more probable backgrounds, which themselves lead to predictable patterns of behaviour. To draw any conclusions about somebody based on race would be to generalise and stereotype, reducing them to a statistic rather than an individual. Which is perhaps the human argument against the whole thing - to do this is to pre-judge somebody based on circumstances beyond their control.

In any case, it's reasonably true that race implies circumstances, and circumstances imply behaviours. In the case of crime, surely prevention is better than a cure. To cure it, I guess you possibly should target people of certain races. But to prevent it, and to fix the problem that causes it, you need to target those same people, and improve their quality of life.

I guess when you put it like that, it's fairly simple.. Either wait for people to commit a crime, then punish them, or get to them in advance and give them a better standard of living, so that the crime doesn't happen.. If only they'd bother to use racial profiling when looking at things like local funding initiatives...

  • 1
I almost felt compelled to respond in a mature fashion to your well reasoned and thought provoking post. Alas, that would require the strenuous effort of typing it. Besides, instead of thinking about it you should have just talked to Naomi - she could have told you that black people all need rounding up and gassing, none of this pre-emptive targeting bollocks, which is clearly a massive WOFTAM.

  • 1
?
