Re: evaluation of known item searches
Hi,
There's a small bug in the evaluation script I sent around half an hour
ago: start and stop times were miscalculated when they were only stated
in milliseconds.
Here's a new version.
Thijs
#!/Utils/bin/perl
#
# Name: trecvideval
#
# Author: Thijs Westerveld
# Date: 25/07/2001
#
# To evaluate results on Known Item topics for TREC's video track
#
# usage: cat <results.xml> | trecvideval <topics.xml>
#
# trecvideval takes one or more trecvid topics and processes a set of results.
#
# Both topics and results should be in the official video track xml formats:
# http://www-nlpir.nist.gov/projects/t01v/topics/videoTopic.dtd
# http://www-nlpir.nist.gov/projects/t01v/topics/videoTopics.dtd
# http://www-nlpir.nist.gov/projects/t01v/videoSearchResults.dtd
#
# The output of the script is a filtered version of <results.xml>, in which for each topic only the
# relevant results are shown (i.e. the results that match a known item).
#
# In addition, for each topic the average precision is reported (the mean of the precision obtained after each known item is retrieved, using zero as the precision for known items not retrieved).
# Finally, at the end, these precision values are averaged over all known-item topics.
#
# A fragment is regarded as relevant if its start point lies between the start and stop times of one of the known items.
# In addition, we allow fragments to start just before the known item (at most deltaT seconds earlier).
#
#
# --------------------|------------------------------|----------------
#                 KI-start                        KI-stop
#
#          <--deltaT-->
#          |_________________________________________|
#          <-------- relevant starting points ------->
#
# The tolerance at the starting point can be varied by changing the value of deltaT.
# The matching function itself can also be changed (see the comment above the comparison in the main loop).
#
# Note: Stop times of fragments are disregarded, because we think someone using a video search system will not
# stop watching a retrieved fragment just because the system thinks it's no longer relevant.
# A user will stop watching either when he/she has found what was searched for or
# when he/she gets bored and thinks there's nothing there.
#
die "usage: cat <results> | trecvideval <topics.xml>\n" unless ($#ARGV == 0);
$deltaT = 1; # tolerance at starting point (in seconds)
%knownItems=();
$numKItopic=0; # number of topics that have known items (this is the name used in the rest of the script)
$sumavgprec=0;
open(TOPICS, "$ARGV[0]")|| die "Cannot open $ARGV[0]\n";
# Read known items from topics descriptions
while(<TOPICS>){
if (/videoTopic num/){
($topic) = /num=\"(\d+)\"/;
}
if(/knownItem.*src/){
$knownItems{$topic}.="$_";
}
}
while(<STDIN>){
if(/VideoSearchResult tNum/){
print;
# This is a new Topic, so get KnownItems
@ki=();
($topic)=/tNum=\"(\d+)\"/;
if(exists $knownItems{$topic}){
# If there are known items for this topic, then read them into a list
$i=0;
foreach $item (split (/\n/, $knownItems{$topic})){
$hour=$min=$sec=$milsec=0;
# known item source
($ki[$i]{src}) = ($item =~ m/src=\"([^\"]+)(.mpe?g)?\"/i);
$ki[$i]{src} =~ s/\..*$//;
# known item start time
($start)=($item =~ m/start=\"([^\"]+)\"/);
($houropt,$hour,$minopt,$min,$secopt,$sec,$milsecopt,$milsec) = ($start =~ m/((\d+)h)?((\d+)m)?((\d+)s)?((\d+)ms)?$/);
$start=3600*$hour+60*$min+$sec+0.001*$milsec;
$ki[$i]{start} = $start;
# known item stop time
($stop)=($item =~ m/stop=\"([^\"]+)\"/);
($houropt,$hour,$minopt,$min,$secopt,$sec,$milsecopt,$milsec) = ($stop =~ m/((\d+)h)?((\d+)m)?((\d+)s)?((\d+)ms)?$/);
$stop=3600*$hour+60*$min+$sec+0.001*$milsec;
$ki[$i]{stop} = $stop;
# next item
$i++;
}
print "<!-- $i known-Items for this Topic -->\n";
# Initialise number of found Known Items and score
$numfound=0;
$score=0;
}
else {
# No known Items for this topic, so clear knownitem list
@ki=();
print "<!-- No known-Items for this Topic -->\n";
}
}
elsif(/item seqNum/){
# This is a result
if (@ki){
# If there are known items,
# then extract result info and compare to known items
$result=$_;
$hour=$min=$sec=$milsec=0;
#result rank
($rank) = ($result =~ m/seqNum=\"([^\"]+)\"/);
# result source
($src) = ($result =~ m/src=\"([^\"]+)(.mpe?g)?\"/i);
$src =~ s/\..*$//;
# result start time
($start)=($result =~ m/start=\"([^\"]+)\"/);
($houropt,$hour,$minopt,$min,$secopt,$sec,$milsecopt,$milsec) = ($start =~ m/((\d+)h)?((\d+)m)?((\d+)s)?((\d+)ms)?$/);
$start=3600*$hour+60*$min+$sec+0.001*$milsec;
# result stop time
($stop)=($result =~ m/stop=\"([^\"]+)\"/);
($houropt,$hour,$minopt,$min,$secopt,$sec,$milsecopt,$milsec) = ($stop =~ m/((\d+)h)?((\d+)m)?((\d+)s)?((\d+)ms)?$/);
$stop=3600*$hour+60*$min+$sec+0.001*$milsec;
# compare to each known item
foreach $i (0..$#ki){
#
# Below is the matching function: compare sources, then check the result's start time against the known item's start and stop times
#
# Change the next line if you want to experiment with different matching functions
#
if(uc($ki[$i]{src}) eq uc($src) && ($start >= ($ki[$i]{start}-$deltaT)) && ($start < ($ki[$i]{stop}))){
# MATCH
$numfound++;
# Add precision to total score
$score+=$numfound/$rank;
# make sure we don't find the same known item twice
$ki[$i]{src}="thisitemisalreadyfound";
print;
# break;
}
}
}
}
elsif (/\<\/VideoSearchResult\>/){
# end of topic
if (@ki){
# If there are known items, print average precision for this topic
# avg. prec. is the mean of the precisions after each known item is retrieved (using 0 as prec. for K-I not retrieved)
print "<!-- average precision = ",$score/($#ki+1)," -->\n";
#add average prec. to sum over all topics:
$sumavgprec+=$score/($#ki+1);
# ...and increase number of topics for which Known Items exist
$numKItopic++;
}
print;
}
else{
print;
}
}
# Guard against division by zero when no topic had known items
print "<!-- MEAN AVG. PREC. over $numKItopic Known-Item topics: ", ($numKItopic ? $sumavgprec/$numKItopic : 0), " -->\n";
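
For anyone who wants to check the time conversion and the scoring by hand, here is a minimal standalone sketch. It is not part of trecvideval itself. The helper names (to_seconds, is_match, average_precision), the source name "clip01" and the toy data are made up for illustration, and it assumes Perl 5.10 or later for the // operator. The matching rule is the same as in the script above: same source, and the start time lies between KI-start minus deltaT and KI-stop.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: convert "1h2m3s400ms", "45s" or "500ms" to seconds.
# Same pattern as in trecvideval, but anchored at both ends so that
# malformed strings are rejected instead of silently becoming 0.
sub to_seconds {
    my ($t) = @_;
    return unless $t =~ /^(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?(?:(\d+)ms)?$/;
    return 3600 * ($1 // 0) + 60 * ($2 // 0) + ($3 // 0) + ($4 // 0) / 1000;
}

# Hypothetical helper: the matching function from the script.
# A result matches if the sources agree (case-insensitively) and its
# start time lies in [KI-start - deltaT, KI-stop).
sub is_match {
    my ($ki, $src, $start, $delta_t) = @_;
    return uc($ki->{src}) eq uc($src)
        && $start >= $ki->{start} - $delta_t
        && $start <  $ki->{stop};
}

# Hypothetical helper: average precision for one topic, i.e. the mean of
# the precision after each known item is retrieved, with 0 for known
# items that are never retrieved. Each known item is counted at most once.
sub average_precision {
    my ($known, $results, $delta_t) = @_;
    my @found = (0) x @$known;
    my ($numfound, $score) = (0, 0);
    for my $r (@$results) {
        for my $i (0 .. $#$known) {
            next if $found[$i];
            next unless is_match($known->[$i], $r->{src}, $r->{start}, $delta_t);
            $found[$i] = 1;
            $numfound++;
            $score += $numfound / $r->{rank};
        }
    }
    return @$known ? $score / scalar(@$known) : 0;
}

# Toy data (made up): two known items and three ranked results.
my @known = (
    { src => "clip01", start => to_seconds("1m30s"), stop => to_seconds("2m")  },
    { src => "clip01", start => to_seconds("500ms"), stop => to_seconds("10s") },
);
my @results = (
    { rank => 1, src => "CLIP01", start => to_seconds("1m29s500ms") }, # within deltaT of item 1
    { rank => 2, src => "clip01", start => to_seconds("5m")         }, # no match
    { rank => 3, src => "clip01", start => to_seconds("2s")         }, # inside item 2
);

my $ap = average_precision(\@known, \@results, 1);
printf "average precision = %.4f\n", $ap;   # (1/1 + 2/3) / 2 = 0.8333
```

The first result starts half a second before the known item and is still accepted because deltaT is 1 second. The third result gives a precision of 2/3 at rank 3, so the topic scores (1 + 2/3) / 2.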