I always find these very interesting; they allow me to see things much more clearly. Two piping judges just are not enough for high-stakes competitions. I say make band competitions at LEAST four piping, three ensemble and three drumming.
I am more than somewhat interested in judging and the validity (in the technical sense) of judges' decisions, and I was looking forward to Tim's analysis of the 2007 Worlds. Even with Tim's caveats as a statistician, the data implies that things are actually quite good in pipe band judging.
As Tim said ".... I think that, to many people, providing evidence is nowhere near as much fun as just downing several beers and confidently proclaiming [judges] idiots."
Fortunately, data always trumps feelings. The picture painted by Tim's analysis suggests that most decisions were agreed upon by the two judges and thus are likely to be right. In Grade 1 this seems especially so, where the correlations for the qualifier, MSR and Medley were .846, .749 and .886, and those, by any standards, are bloody good levels of agreement. Of course, if one's starting point is that all the judges are incompetent (as some writers in another thread seem to believe), then this argument fails.
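Out of curiosity I sketched the sort of calculation I assume sits behind those figures - a rank correlation between the two piping judges' placings. A minimal Python example with invented placings (Tim may well have computed it differently):

```python
# Hypothetical placings from two piping judges for the same eight bands.
# Assumes the quoted figures are rank correlations (e.g. Spearman) between judges.
from scipy.stats import spearmanr

judge_a = [1, 2, 3, 4, 5, 6, 7, 8]
judge_b = [2, 1, 3, 5, 4, 6, 8, 7]  # broadly agrees, with a few minor swaps

rho, p_value = spearmanr(judge_a, judge_b)
print(f"rank correlation = {rho:.3f}")  # close agreement gives a value near 1
```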
But my main reason for writing is that I suspect, looking at the data, that there is a difference in the performance of judges depending on their experience (of judging, not playing), and that is a testable hypothesis, but it needs more data than is currently available. I think this analysis is worth doing because it would help us understand more about the variables which affect judging performance.
Many thanks Tim, I enjoyed the read.
Great stuff, Tim! Thanks for taking the time to do this again.
Grade 3 is obviously all across the board, but I couldn't help but notice the disparity in Grade 2 (8-1 and 1-4?)
I also wonder if there is a way to factor in ensemble rankings? Perhaps you could do a full regression analysis that factors in how far a band travels, what chanters they play, etc.!
Or we could just ask Catherine Watt for her predictions.
While it's always interesting to see the results of studies like this, remember that pipe bands are not an exact science and, at the end of the day, results are all about opinions - yes, even those of the judges!
One might think that (particularly in the case of two piping judges) listening to an entire grade would always yield a similar result. However, the fact that judges move around the circle is just the first of many variables that will create so-called anomalous results (particularly in the case of Grades 1 & 2, where bands are now massive).
For example - Judge A may choose to stand for the majority of the performance around the PM's side and hear quite a good performance - meanwhile Judge B starts at the PS's side and hears a completely different performance. Or perhaps as one judge moves they just happen to always be in the 'wrong place' from the band's perspective as minor slips occur.
Add to that that Judge A may have decided to 'hammer' any band with poor drones, while Judge B may be willing to let a slightly poorer drone sound go if the fingering is good but is less generous with bands displaying rushed tempos or a chanter sound which doesn't hold.
So yes, sometimes when a band is judged to be 1st and 8th questions should be asked of the judges - but perhaps, just perhaps questions should be asked within the band!
And that is why we love this 'game'
Kevin's hit the nail on the head (IMO). Referring specifically to the final paragraph, it's easy for us to look at results and give bad press where, for example, one judge has a band 1st and another has them 10th. However, we're being damning without knowing what's written on the sheets for such bands.
Questions should only be asked of the judges if the questions have not already been answered in the crit sheet.
And that's exactly why judges should be allowed to compare notes at the end of the event to share what they heard, appreciate and consider their respected colleague's opinion, and adjust their result if they wish.
In Brittany the Grade 1 competitions (15 bagads) have 13 judges who sit amongst the audience (indoor auditorium, Brest, in February). Their result compares accurately with general audience reaction and predictions. Individual bias by a judge is lost in the crowd. Much fairer. Expensive, but there again so is the travel to the World Championships.
Year after year this subject raises its ugly head and the same drum beating goes on. The suggestion that judges should be allowed to compare notes is certainly not the way to go. This would in all probability create confusion, indecisiveness, second-guessing - call it what you like. If the judge(s) hear something, regardless of their respected colleague's omission in the crit sheet, then that is what is expected and accepted. We don't want sheep; we want people who have the courage and conviction to state what is on "their" mind and to express "their" views, not to base a decision on which judge has the strongest character or is given some sort of edge because of his or her history or connections. Judges should be given their due and be allowed to express themselves without the need for mediation - the next thing we would be reading about is the suggestion or accusation of collusion, nepotism and the like.
The PPBSO has used a consultative judging process for three years now, and it has been extremely successful. A quick look at the results from Maxville - http://www.pipesdrums.com/Contents.aspx?M=Articles&AT=Results&AID=13947 - and you'll see that by no means are judges always placing bands in the same order. Any adjudicator unduly trying to influence or bully a colleague would be going against the organization's Judges' Code of Conduct. To my knowledge, this has never happened. Nor has there been a formal complaint from a band about the consultative process. It works if you want it to. It's all about tolerance and respect.
Forgot to add: the PPBSO membership made a formal rule in the 1990s that prevents immediate family from judging family. This has completely stopped accusations of nepotism that were previously rampant.
To Jim and the two anonymous posters, thanks for your kind words.
Jamie and Kevin, your points are well taken. I originally did the analysis of the results from the Worlds last year in response to a series of posts about the inconsistency in the judging, so I decided to test it statistically, and there were two contests that really stood out as odd.
Anomalies will occur, for the reasons suggested (personal preference and/or location outside the circle) as well as others and these anomalies will (in fact should) occur, hopefully for the latter reason more than the former. This may be especially true in lower grade bands where the consistency around the circle may not be as high as in higher grade bands. So, anomalies will happen, and are likely justified in most cases, but if the set of results are considered as a whole then the general consistency underlying those few anomalies should prevail. It is only when there are more anomalies than consistent judgments (if you will excuse the oxymoron) that an overall result becomes suspect. One strength (and admittedly also a weakness in some ways) of correlation is that it looks at general, overall, linear patterns. Examination of anomalies (e.g. how a band can get a first and an eighth) is worth discussing, but as has been pointed out, perhaps that should be a discussion between the judges.
Also, to Jim, a regression would be fun, but there is too little data available. I did take all the predictions from the contest and calculated an aggregate prediction from all 79 entries. Each band got a 6 for a first place prediction, 5 for second etc. As a group the “average” prediction for order of finish was SFU (5.1), FMM (4.7), Strathclyde (4.3), Shotts (2.9), SL78th (1.6), SLOT (1.4). In other words, as a group we got the top six correct (something the panel of experts cannot claim :-) ), or as you pointed out, we could just rely on Catherine. :-)
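For anyone who wants to check the arithmetic, the points scheme is easy to reproduce. A minimal sketch, with a couple of invented entries standing in for the 79 real predictions:

```python
# Aggregate reader predictions: 6 points for a predicted 1st, 5 for 2nd, ... 1 for 6th,
# then average each band's points across all entries. The entries below are invented.
from collections import defaultdict

entries = [
    ["SFU", "FMM", "Strathclyde", "Shotts", "SL78th", "SLOT"],
    ["FMM", "SFU", "Strathclyde", "SL78th", "Shotts", "SLOT"],
    # ... one list per prediction entry
]

points = defaultdict(float)
for entry in entries:
    for position, band in enumerate(entry):
        points[band] += 6 - position  # a predicted 1st earns 6 points, 2nd earns 5, etc.

averages = {band: total / len(entries) for band, total in points.items()}
for band, avg in sorted(averages.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{band}: {avg:.1f}")
```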
One last point. . . The “should we let judges talk” question is a testable issue. Ask the judges to submit their initial placements and then allow them to discuss the result and make any adjustments as they see fit. Then the effect of the discussions could be examined. If one judge has any undue influence, it will show up eventually in the changes from his/her judging partner's original placements. Tom is correct: if we allowed the judges to discuss the result there could be a whole new set of arguments, but if these changes are examined in a systematic way, then there will at least be some objective evidence to support the arguments.
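The bookkeeping for such a test would be trivial. A sketch, with purely illustrative judges, bands and placements, of how the pre- and post-discussion placings could be compared per judge:

```python
# Compare each judge's initial placings with their post-discussion placings.
# A judge who routinely shifts toward a particular colleague would show a large
# total shift over time. All names and numbers here are illustrative only.
initial = {"Judge A": {"Band 1": 1, "Band 2": 2, "Band 3": 3},
           "Judge B": {"Band 1": 3, "Band 2": 1, "Band 3": 2}}
final =   {"Judge A": {"Band 1": 1, "Band 2": 3, "Band 3": 2},
           "Judge B": {"Band 1": 3, "Band 2": 1, "Band 3": 2}}

for judge, before in initial.items():
    shift = sum(abs(before[band] - final[judge][band]) for band in before)
    print(f"{judge}: total placement shift after discussion = {shift}")
```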
I am not at all certain it would create that much confusion or indecisiveness (or another way to think of it is. . . even if it did make a judge confused, would that be any worse than the confusion felt by the band that gets the 1st and 8th and has to try and piece together the reason for that from two separate score sheets?).
There is a large amount of agreement among judges already, so I imagine that the only real discussion would be around the anomalies, and they should be discussed anyway. What better place/time than while the competition is fresh in the judges' minds?
I would agree that if (in a more perfect world) there were the resources to have 6, 8, 10 piping judges and several ensemble and drumming judges as well then discussion among the judges would not be an issue. The overall average would very likely correct the anomalies (or high and low scores could be discarded etc), but in the realistic situation of the RSPBA putting a high proportion of the qualified available judges on the field that day, then unless bands and individuals are willing to accept that every so often you will get a 1st and an 8th (for whatever reason), allowing the judges to discuss the result may be a way to reduce these discrepancies and actually lessen the confusion.
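To illustrate what a bigger panel buys you, here is a toy example (invented placings) of how one wayward score gets washed out by averaging, or removed entirely by discarding the high and low placings:

```python
# Invented placings for one band from a hypothetical panel of six piping judges.
placings = [2, 1, 2, 3, 2, 8]  # one judge is well out of line with the rest

plain_average = sum(placings) / len(placings)
trimmed = sorted(placings)[1:-1]           # discard the highest and lowest placings
trimmed_average = sum(trimmed) / len(trimmed)

print(f"plain average:   {plain_average:.2f}")    # the 8th still drags this up a bit
print(f"trimmed average: {trimmed_average:.2f}")  # the anomaly is discarded
```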
Andrew, nothing is perfect and that includes the PPBSO. The "Tolerance and Respect" that you mention has rarely been shown towards the RSPBA. Instead it gets thrown in their face that this, that or the next individual or association knows better. What a load of rubbish! Instead of working together and trying to make improvements we work in opposite directions.
The "Code of Conduct" that you speak of isn't worth the paper it's written on and it's naive of you to think otherwise. It's impossible to monitor or enforce. You mention that the PPBSO in the 1990's made a formal ruling regarding immediate family judging family - Did they also include the same ruling for instructors and pupils? If they did, then I take my hat of to them or do I? Tolerance and respect also means that we accept and appreciate that our selected judges no matter their bloodline or their relationship with the competitor has an "Ethical Compass" and therfor their true commitment to the art is sincere. If we think of our judges with tolerance and respect, then we should allow them to judge as individuals and not as some puppet. Giving them the opportunity to rank, offer praise and constructive criticism based on his/her own thoughts and not thrown in as an accumulation and being manipulated by a stronger personality.
"Tom" - I was only pointing out to you that the consultative process that you seem to think is impossible to have work well in fact appears to work very well in Ontario. I believe other associations are planning to adopt it, too. Not sure why the RSPBA abandoned it in thew 1990s, but perhaps the membership itself struck it down. Fair enough.
While the PPBSO asks judges to declare students and competitors to declare teachers, there is no rule preventing teachers judging students. Maybe some day, but for now it would be impossible to enforce without having to cancel events for lack of adjudicators.
I am a 100% supporter of consultative judging and would easily go on the record to say that the very best solo event (a major piob contest with 20 playing) that I ever judged was with John Wilson a few years back.
We had agreed on first place easily, and in a very short time we agreed on fourth place. But, for over half an hour, we sat and talked about all the merits of the two other players. Were we giving 2nd prize to player A because he played differently than we normally would and we didn't want to seem biased? Or were we right not to give him 2nd because we just felt it wasn't as good as the other? Back and forth we went, discussing the merits of each tune - their tone, technicality and music. We even sat for a few minutes and went completely through each other's sheets. This was a tremendous learning experience, and in the end we both walked out of that room very confident that we had exhausted every possible nuance of the two tunes and had made the right decision.
I'm not saying you have to sit for half an hour after every band contest, but in these days of 20+ piper bands, I would hate to award a good prize to band A while not being informed that a player out of my hearing or vision made a serious error. You just cannot be everywhere all the time and cannot see everything. As well, if you are really judging an event and you give a band first, then I would expect them to be right there on top or very close, and likewise, if you give a band 10th in piping (I am speaking as a piping judge), are you really happy if they end up in 3rd place?
The whole "that's the way I felt today and that's that" is good but does it really help to determine the best band? I say no. Even in the major piob events with a panel of three, there is a fellow in the middle that reads the written score to make sure that there is not a major technical error so the other judges can concentrate solely on the music. You may not always agree with the Gold medal result or that of the clasp but you do know in the end that the three adjudicators all listened, made comments or notes and discussed why or why not the players got the prizes and they usually (to my knowledge) leave the room with confidence that the correct prize list has been established.
I cannot comment regarding consultative judging as I have not experienced it. I will say that it has not been uncommon for me or one of my fellow piping judges to hear something going on in the circle that the other did not because of where we were standing at the time. Depending on the seriousness or frequency of the "event", this could cause a major discrepancy in a final score, in which case consultative judging would clearly have its merits.
In my opinion such discrepancies will become more common with the bigger pipe sections we see in contests today. It amazes me that the statistics show this apparently didn't happen at this year's Worlds. I'm pretty impressed with the job the various judging panels did. For once it seems to speak more to their competence than the opposite.