Re: interassessor consistency data on TREC 06 Legal track ad hoc topics



>> assessors. The sample consisted of 25 documents judged relevant by
>> the first assessor (or all such documents if fewer than 25), and
>> enough nonrelevant to bring the sample to 50 documents (49 in one
>> case due to a glitch).
>
> I realize this is difficult when the sample is drawn this way, but have
> you tried measuring the runs using this data, and seeing if they rank
> differently?

Ian - No, but if anyone is interested in trying that, I'm happy to
make the data available.  (Gordon, I just wrote you separately about
your offer - thanks!)

As you say, it would take some thought to come up with a sensible  
measure based on this strangely drawn sample.

Dave



