Re: trecvid - general and known-item search evaluation details



Ioannis, 

Thanks for your comment and suggestion regarding the incorporation
of some approximation of recall in the general search evaluation.

> my comment is on the evaluation for the general items. If I
> understand it well, the proposed measure (precision given that at most
> 20 clips/shots will be returned) does not take "recall" into
> consideration. Since the number of returned clips/shots is not fixed,
> this seems to favor systems that return as few clips/shots
> as possible.

> Of course it is difficult to make a ground-truth, since that would
> require to go through a lot of video material. But wouldn't it be
> feasible (and meaningful) to build a semi-ground-truth as the union of
> the "correct" results that each system returns? It seems to me that
> this is achievable without any additional human effort in the
> evaluation procedure.

I agree that the proposal does not measure the completeness of the set of
shots returned. I would like to hear others' opinions, but I think your
suggestion is a good one and that Ramazan and I should look at implementing
it. We would, however, have to deal with the fact that the units submitted as
relevant are not predefined, so clips from different systems that cover the
same relevant material are unlikely to match each other exactly.
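
To make the matching problem concrete, here is a minimal sketch of one way
the pooled "semi-ground-truth" could be built. It is only an illustration
under assumptions of my own, not a decided procedure: clips are assumed to be
(video_id, start, end) triples in seconds, clips judged correct that overlap
in time are assumed to cover the same relevant segment, and the names
pool_relevant and approx_recall are hypothetical.

```python
from collections import defaultdict


def pool_relevant(judged_correct):
    """Merge the correct clips of all systems into pooled relevant segments.

    judged_correct: dict mapping system name -> list of (video_id, start, end)
    clips that the assessors judged correct (assumed input format).
    Returns a dict mapping video_id -> sorted list of merged (start, end).
    """
    by_video = defaultdict(list)
    for clips in judged_correct.values():
        for video_id, start, end in clips:
            by_video[video_id].append((start, end))

    pooled = {}
    for video_id, spans in by_video.items():
        spans.sort()
        merged = [spans[0]]
        for start, end in spans[1:]:
            last_start, last_end = merged[-1]
            if start <= last_end:
                # Overlaps (or touches) the previous segment: treat as the
                # same relevant unit. This merge rule is an assumption.
                merged[-1] = (last_start, max(last_end, end))
            else:
                merged.append((start, end))
        pooled[video_id] = merged
    return pooled


def approx_recall(system_clips, pooled):
    """Fraction of pooled segments overlapped by a system's correct clips."""
    total = sum(len(segments) for segments in pooled.values())
    if total == 0:
        return 0.0
    found = 0
    for video_id, segments in pooled.items():
        mine = [(s, e) for v, s, e in system_clips if v == video_id]
        for seg_start, seg_end in segments:
            if any(s < seg_end and e > seg_start for s, e in mine):
                found += 1
    return found / total


# Hypothetical judged-correct clips for one topic from two systems.
judged = {
    "A": [("v1", 0, 10), ("v1", 50, 60)],
    "B": [("v1", 5, 15)],
}
pooled = pool_relevant(judged)
for name, clips in judged.items():
    print(name, approx_recall(clips, pooled))
```

Note that a recall figure computed this way is relative to the pool only:
relevant material that no system returned is never counted, and the result
depends on how overlapping clips are merged.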

Other opinions?

- Paul
-- 
Paul Over - Retrieval Group
	    Information Access Division
	    Information Technology Laboratory
	    National Institute of Standards and Technology
	    Bldg. 225  Rm. A211  (Mailstop 8940)
	    Gaithersburg, MD  20899-8940   USA
	    Voice: 301 975-6784    Fax: 301 975-5287