[!question]- What is the problem with Boolean search?
[!question]- What is the problem with the Jaccard coefficient?
Procedure
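- Find `tf(t, d)`, the number of times term `t` occurs in document `d`.
- Convert this to a log scale: `1 + log(tf(t, d))` if `tf(t, d) > 0`, else `0`.
- Find `idf(t)`, the inverse document frequency of the term.
- Multiply these together to get the tf-idf weight of each term.
- Normalize the resulting vector to unit length.
- Do the same for the query.
- Find the dot product of the query and document vectors. Because both are unit length, this is their cosine similarity.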
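
The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not a reference implementation. It assumes base-10 logs, `idf(t) = log10(N / df(t))`, and pre-tokenized documents. The helper names (`log_tf`, `idf`, `tfidf_vector`, `score`) and the toy corpus are made up for the example.

```python
import math
from collections import Counter


def log_tf(count):
    """Log-scaled term frequency: 1 + log10(tf) if tf > 0, else 0."""
    return 1.0 + math.log10(count) if count > 0 else 0.0


def idf(term, docs):
    """Inverse document frequency, assumed here to be log10(N / df).

    Returns 0 for a term that appears in no document.
    """
    df = sum(1 for d in docs if term in d)
    return math.log10(len(docs) / df) if df else 0.0


def tfidf_vector(tokens, docs):
    """Unit-length tf-idf vector for a token list, as a {term: weight} dict."""
    counts = Counter(tokens)
    weights = {t: log_tf(c) * idf(t, docs) for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    # If every weight is zero there is nothing to normalize.
    return {t: w / norm for t, w in weights.items()} if norm else weights


def score(query_tokens, doc_tokens, docs):
    """Dot product of the normalized query and document vectors."""
    q = tfidf_vector(query_tokens, docs)
    d = tfidf_vector(doc_tokens, docs)
    return sum(w * d.get(t, 0.0) for t, w in q.items())


# Toy corpus (hypothetical data, for illustration only).
docs = [
    ["cat", "sat", "mat"],
    ["dog", "sat", "log"],
    ["cat", "chased", "dog"],
]
query = ["cat", "mat"]
scores = [score(query, d, docs) for d in docs]
```

With this corpus the first document, which contains both query terms, should rank highest, and the second, which contains neither, scores zero. Sparse dicts are used instead of dense vectors so that only terms shared by the query and the document contribute to the dot product.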