Posts Tagged ‘biases’

People Hearing Without Listening

August 21, 2013

I’ve seen several links to, and discussions of, this paper about judging classical music competitions today.

The experimenter had people observe clips of musicians in competitions, then guess how well the musicians placed. Subjects guessed better when given video-only clips as compared to audio clips or audio+video clips. Conclusion: people care about looks far more than they think or admit they do.

But I think we can’t jump to such a conclusion based on this paper for a few reasons.

First, the clips were taken from the top three places at prestigious international competitions. These people are already the very best; there was probably very little variation between them. If we rate the auditory quality of the music they played out of 100, maybe they’re at 94, 95, and 96, or something. It’s not surprising that experts didn’t accurately judge who would win based on sound.

The failure of audio clips to predict competition placement is similar to how SAT scores aren’t very good predictors of the performance of Caltech students. If you took randomly-selected students from everyone applying to college and admitted them to Caltech, SAT score would be an excellent predictor of their success. But Caltech only admits people with very high SAT scores to begin with, so there’s not that much variation available to do the predicting.
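Here is a minimal sketch of that restriction-of-range effect. It uses an entirely made-up model, not real SAT or Caltech data: a standardized "score", a "success" measure equal to the score plus independent noise, and an admission cutoff two standard deviations above the mean. The function names and all parameters are assumptions for illustration.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate(n=20000, cutoff=2.0, seed=0):
    """Compare score/success correlation for all applicants vs. admitted ones."""
    rng = random.Random(seed)
    # Hypothetical model: score ~ N(0, 1), success = score + N(0, 1) noise.
    scores = [rng.gauss(0, 1) for _ in range(n)]
    success = [s + rng.gauss(0, 1) for s in scores]
    r_all = pearson(scores, success)
    # Keep only applicants whose score clears the (assumed) cutoff.
    admitted = [(s, y) for s, y in zip(scores, success) if s > cutoff]
    r_admitted = pearson([s for s, _ in admitted], [y for _, y in admitted])
    return r_all, r_admitted, len(admitted)

r_all, r_admitted, n_admitted = simulate()
print(f"all applicants:  r = {r_all:.2f}")
print(f"admitted only:   r = {r_admitted:.2f}  (n = {n_admitted})")
```

Under these assumptions the score predicts success well across everyone, and much less well among the admitted group, even though the underlying relationship is identical.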

Meanwhile, the variation in how the musicians move and express themselves physically could potentially be large – 50, 70, 90, for example. So even if judges base their scores mostly on the quality of playing, the visual aspect can still dominate the final rankings. The data don’t support the author’s claim “the findings demonstrate that people actually depend primarily on visual information when making judgments about music performance.” To show that, you’d need to show that visual information still trumps auditory information even when the players are not at about the same level. And it’s not like people with visual information did very well – they got to roughly 50% accurate. If you go from a distribution of 1/3-1/3-1/3 to 1/2-1/4-1/4 you’ve reduced your entropy by about five percent.
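The entropy figure is easy to check. This short sketch computes the Shannon entropy (in bits) of the two distributions mentioned above; the helper name `entropy_bits` is just my own label.

```python
from math import log2

def entropy_bits(probs):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

before = entropy_bits([1/3, 1/3, 1/3])  # uniform guess over three places
after = entropy_bits([1/2, 1/4, 1/4])   # 50% on the right answer, rest split
reduction = (before - after) / before

print(f"{before:.3f} bits -> {after:.3f} bits, a {reduction:.1%} reduction")
# prints: 1.585 bits -> 1.500 bits, a 5.4% reduction
```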

Additionally, the clips used in this paper were six seconds long. So what we’ve shown is that you get a better quick, gut-instinct impression from visual information than from auditory, but this doesn’t say a whole lot about the judges who were watching and listening to the entire performance. (Edit: as a commenter pointed out, the paper contains a vague description of the results holding with clips of up to one minute.)

Perhaps visual aspects of the performance are correlated with auditory aspects. Further, maybe six seconds is enough time to get a good feel for the visual aspects, but not the audio aspects (six seconds might not even be one entire phrase of the music). In that case, expert judgments during competitions could be based almost entirely on the audio aspect, but people would still predict those judgments better from videos.
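A toy simulation makes this scenario concrete. Everything in it is an assumption chosen for illustration: three players per contest, judges who pick the winner purely on audio quality, visual quality correlated with audio at 0.8, and a short clip that gives a very noisy read on audio but a fairly clean read on visuals. None of these numbers come from the paper.

```python
import random

def trial(rng, rho=0.8, audio_noise=3.0, visual_noise=0.3, n_players=3):
    """One contest: return (audio guess correct, visual guess correct)."""
    audio = [rng.gauss(0, 1) for _ in range(n_players)]
    # Visual quality is correlated with audio quality (correlation rho).
    visual = [rho * a + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1) for a in audio]
    # Judges hear the full performance and rank on audio alone.
    winner = max(range(n_players), key=lambda i: audio[i])
    # A short clip: noisy impression of audio, cleaner impression of visuals.
    heard = [a + rng.gauss(0, audio_noise) for a in audio]
    seen = [v + rng.gauss(0, visual_noise) for v in visual]
    guess_audio = max(range(n_players), key=lambda i: heard[i])
    guess_visual = max(range(n_players), key=lambda i: seen[i])
    return guess_audio == winner, guess_visual == winner

def accuracy(n_trials=20000, seed=0):
    """Fraction of contests where each kind of clip identifies the winner."""
    rng = random.Random(seed)
    hits_audio = hits_visual = 0
    for _ in range(n_trials):
        audio_ok, visual_ok = trial(rng)
        hits_audio += audio_ok
        hits_visual += visual_ok
    return hits_audio / n_trials, hits_visual / n_trials

acc_audio, acc_visual = accuracy()
print(f"audio-only clips:  {acc_audio:.2f}")
print(f"video-only clips:  {acc_visual:.2f}")
```

With these made-up parameters, viewers of silent video pick the winner more often than listeners do, even though the simulated judges never look at the stage.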

It’s interesting that people were bad at predicting which choice (audio, visual, audio+visual) would give them the best results, but people have very little experience with this contrived task, so it’s not especially surprising. Further, I think the conclusions of the paper are probably true – visual impressions matter a lot in music performance – but I hold that belief based on my general model of how people work. The evidence in this paper is somewhat lacking, and it’s disappointing that a news source like NPR fails to state the important fact that the clips were not complete recordings, but very short, six-second impressions.

Elsewhere:

NPR

John Baez

Robin Hanson


On the Height of a Field

January 1, 2013

This is a short story about belief and evidence, and it starts with the GPS watch I use when I go for a run. Here’s the plot of my elevation today:

[Image: runElevation, the elevation plot for the run]

It looks a little odd until I show you this map of the run:

[Image: runMap, the map of the run]

Each bump on the elevation plot is one lap of the field. In the middle, I changed directions, giving the elevation chart an approximate mirror-image symmetry. (I don’t know what causes the aberrant spikes, but my friend reports seeing the same thing on his watch.)

According to the GPS data, the field is sloped, with a max height of 260 feet near the center field wall and 245 feet near home plate. It’s insistent on this point, reiterating these numbers each time I do the run (except once when the tracking data was clearly off, showing me running across parking lots and through nearby buildings). I disagreed, though. The field looked flat, not sloped at 3 degrees. I was disappointed to have found a systematic bias in the GPS data.
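For reference, here is the arithmetic linking a 15-foot rise to a slope angle. The post never states the distance from home plate to the center field wall, so the 300 ft figure below is an assumed value for illustration; the first calculation instead works backward from the 3-degree figure to the horizontal distance it implies.

```python
from math import atan2, degrees, radians, tan

def slope_degrees(rise_ft, run_ft):
    """Slope angle in degrees for a given rise over a horizontal run."""
    return degrees(atan2(rise_ft, run_ft))

def run_for_slope(rise_ft, angle_deg):
    """Horizontal run in feet that gives this rise at this slope angle."""
    return rise_ft / tan(radians(angle_deg))

rise = 260 - 245  # feet, from the GPS readings

# Distance implied by a 3 degree slope: roughly 286 ft.
print(f"{run_for_slope(rise, 3):.0f} ft")

# Slope if the run were 300 ft (an assumed distance, not a measurement).
print(f"{slope_degrees(rise, 300):.2f} degrees")
```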

But I occasionally thought of some minor consideration that impacted my belief. I remembered that when I went biking, I often found that roads that look flat are actually uphill, as can be verified by changing directions and feeling how much easier it becomes to go a given pace. I Googled for the accuracy of GPS elevation data, and found that it’s only good to about 10 meters. But I didn’t care about absolute elevation, only change across the field, and I couldn’t find any answers on the accuracy of that. (Quora failed me.) I checked Google Earth, and it corroborated the GPS, saying the ground was 241 ft behind home plate and 259 in deep center field. But then I read that the GPS calibrated its elevation reading by comparing latitude/longitude coordinates with a database, and so may have been drawing from the same source as Google Earth.

People wouldn’t make a sloped baseball field, would they? That would dramatically change the way it plays, since with a 15-foot gain, what was once a solid home run becomes a catch on the warning track. Googling some more, I found that baseball fields can be pretty sloped; the requirements are fairly lax, and in fact they are typically sloped to allow drainage.

I was starting to doubt my initial judgment, and with this in mind, when I looked at the field, it made more and more sense that it’s sloped. Along the right field fence, there’s a short, steep hill leading up to the street. It’s about five feet high and at least a 30-degree slope. It’s completely unnatural, as if it exists because the field as a whole used to be considerably more sloped, but was dug out and flattened. The high edge of the field was then below street level, so there’s that short, steep hill leading up. And if the field was dug out and flattened, maybe they didn’t flatten it all the way. The entire campus is certainly sloped the same general direction as the GPS claimed for the field. It drops about 70 feet from north to south, and it’s frequently noticeable as you walk or bike around. There’s another field I run on with essentially the same deal, and I found that when I knew what to look for, I could indeed see the slope there.

Eventually, the speculation built up enough to warrant a little effort to make a measurement. I asked a wise man what to do, and he suggested I find a protractor, hang a string down to detect gravity, and sight from one side of the field to the other. I did so, expecting to feel the boldness of an impartial, truth-seeking scientific investigator as I strode across the grass. That wasn’t what I got at all.

First, I felt continuous fluctuations in my confidence. “I’m 60% confident I’ll find the field is sloped,” I told myself, then immediately changed it to 75, not wanting to be timid, then felt afraid of being wrong, and went back to 50. I’ve played The Calibration Game and learned what beliefs mean, and mostly what it’s done is give me the ability to not only be uncertain about things, but to be meta-uncertain as well – not sure just how uncertain I am, since I don’t want to be wrong about that!

Second, I felt conflicting desires. I couldn’t decide what I wanted the result to be. I wanted the field to be flat to validate my initial intuition, not the stupid GPS, but I also wanted the field to be sloped so I could prove to myself my ability to change my beliefs when the evidence comes in, even if it goes against my ego. (A strange side-effect of wanting to believe true things is that you find yourself wanting to do things not because they help you believe the truth, but because you perceive them to be the sort of things that truth-seekers would do.) I recalled a video I had seen years ago about Gravity Probe B, and the main thing I remembered from it was a scientist with long, gray hair and huge unblinking eyeballs explaining in perfect monotone that he didn’t have a desire for the experiment to confirm or refute general relativity; he only wanted it to show what reality was like.

On top of all this, there was the sense of irony at so much mental gymnastics over a triviality like the slope of a baseball field, and the self-consciousness at the absurdity of standing around in the cold pointing jury-rigged protractors at things. So at last I crossed the field and lined up my protractor for the moment of truth.

It didn’t work. I had placed my shoes down on the grass as a target to sight, but from center field they were hidden behind the pitcher’s mound. I recrossed the field and adjusted them, and went back. I still couldn’t see the shoes; they were too small and hidden in the grass. I could see my backpack, though, so I sighted off that. But it still didn’t really work. I didn’t have a protractor on hand, so I had printed out the image of one from Wikipedia and stapled it to a piece of cardboard, but the cardboard wasn’t very flat, making sighting along it to good accuracy essentially impossible.

I scrapped that, and after a few days went to Walgreens and found a cheap plastic protractor and some twine that I used to tie on my water bottle as a plumb bob. Returning to the field, I finally found the device to be, well, marginal. Holding it up to my eye, it was impossible to focus along the entire top of the protractor at once, and difficult to establish unambiguous criteria for when the protractor was accurately aimed. I was also holding the entire thing up with my hands, and trying to keep the string in place between sighting along the protractor and moving my head around to get the reading.

Nonetheless, my reading came to 87 degrees from center field to home plate and 90 degrees from home plate back to center field. This three-degree difference seemed like pretty good confirmation of the GPS data. In a final attempt to confirm my readings, I repeated the experiment in a hallway outside my office, which I hope is essentially flat. It’s 90 strides long (and I’m about two strides tall), and I found 88 degrees from each side, roughly confirming that the protractor readings matched my expectations. (I’d have used the swimming pool, which I know is flat, but it’s closed at the moment.)

I’m now strongly confident that the baseball field is sloped – something around 95% after considering all the points in this post. That’s enough that I don’t care to keep investigating further with better devices, unless maybe someone I know turns out to have one sitting around.

Still, there is some doubt. Couldn’t I have subconsciously adjusted my protractor to find what I expected? There were plenty of ways to mess it up. What if I had found no slope with the protractor? Would I have accepted it as settling the issue, or would I have been more likely to doubt my readings? It’s perfectly rational to doubt an instrument more when it gives results you don’t expect – you certainly shouldn’t trust a thermometer that says your temperature is 130 degrees – but it still feels intuitively a bit wrong to say the protractor is more likely to be a good tool when it confirms what I already suspected.

The story of how belief is supposed to work is that for each bit of evidence, you consider its likelihood under all the various hypotheses; then, multiplying these likelihoods together (along with your prior), you find your final result, and it tells you exactly how confident you should be. If I can estimate how likely it is for Google Earth and my GPS to corroborate each other given that they are wrong, and how likely it is given that they are right, and then answer the same question for every other bit of evidence available to me, I don’t need to estimate my final beliefs – I calculate them. But even in this simple testbed of the matter of a sloped baseball field, I could feel my biases coming to bear on what evidence I considered, and how strong and relevant that evidence seemed to me. The more I believed the baseball field was sloped, the more relevant (higher likelihood ratio) it seemed that there was that short steep hill on the side, and the less relevant that my intuition claimed the field was flat. The field even began looking more sloped to me as time went on, and I sometimes thought I could feel the slope as I ran, even though I never had before.
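As a sketch of what that calculation would look like: start from prior odds, multiply by one likelihood ratio per piece of evidence, and convert back to a probability. The prior and every likelihood ratio below are numbers I made up for illustration, not estimates from the story, and multiplying them assumes the pieces of evidence are independent given each hypothesis (which the GPS and Google Earth may well not be, if they share a source).

```python
from math import prod

def posterior_probability(prior, likelihood_ratios):
    """Update a prior probability using independent likelihood ratios.

    Each ratio is P(evidence | sloped) / P(evidence | flat).
    """
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * prod(likelihood_ratios)
    return posterior_odds / (1 + posterior_odds)

# Hypothetical likelihood ratios, chosen only to show the mechanics.
evidence = {
    "GPS shows the same 15 ft drop on each run": 3.0,
    "Google Earth agrees (possibly the same source)": 1.5,
    "baseball fields are often sloped for drainage": 2.0,
    "short steep hill along the right field fence": 2.0,
    "protractor readings differ by three degrees": 4.0,
}

prior = 0.2  # assumed starting belief, since the field looked flat
p = posterior_probability(prior, evidence.values())
print(f"posterior probability the field is sloped: {p:.3f}")
# prints: posterior probability the field is sloped: 0.947
```

The biases described above enter through the inputs: nudging each ratio a little toward the conclusion you already favor shifts the product a lot.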

That’s what I was interested in here. I wanted to know more about the way my feelings and beliefs interacted with the evidence and with my methods of collecting it. It is common knowledge that people are likely to find what they’re looking for whatever the facts, but what does it feel like when you’re in the middle of doing this, and can recognizing that feeling lead you to stop?