Thank you for providing a common format and benchmark suite for many standard datasets.
Issue:
I believe the original DEEP dataset uses Euclidean distance, not Angular as you have it.
Since the vectors are L2-normalized, the two distances are highly correlated but not the same, so you might not notice the difference immediately from QPS-Recall.
The only reason I am not certain, and have put a question mark in the title, is that, based on #145, your download source is different from the following sources and is in another format (.fvecs vs .ibin).
Sources:
I'm looking at big-ann-benchmarks regarding this issue, since the author of the original paper for DEEP (Artem Babenko) is listed as one of the organizers of the original '21 challenge. I've also consistently seen DEEP used with Euclidean distance in research papers. That makes sense since, to the best of my knowledge, Euclidean is more common for image data, while IP/angular is more common for text data.
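To illustrate the point about L2-normalized vectors, here is a minimal sketch, not taken from either benchmark's code. It assumes exactly unit-norm vectors and takes "angular" to mean cosine distance (1 - cos), which is one common definition. The data is random, and the dimension and sizes are arbitrary. Under those assumptions the squared Euclidean distance equals 2 * (1 - cos), so the reported distance values differ between the two metrics even though the nearest neighbors coincide. If the stored vectors are not exactly unit-norm, the identity no longer holds.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def euclidean(a, b):
    """Plain Euclidean (L2) distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 - cosine similarity; assumes a and b are already unit-norm."""
    return 1.0 - sum(x * y for x, y in zip(a, b))


# Synthetic stand-in data (hypothetical, not the DEEP vectors).
rng = random.Random(0)
dim, n, k = 96, 200, 10
points = [normalize([rng.gauss(0, 1) for _ in range(dim)]) for _ in range(n)]
query = normalize([rng.gauss(0, 1) for _ in range(dim)])

euc = [euclidean(query, p) for p in points]
ang = [cosine_distance(query, p) for p in points]

# For unit vectors: ||a - b||^2 = 2 - 2 * cos(a, b) = 2 * cosine_distance.
max_gap = max(abs(e * e - 2.0 * a) for e, a in zip(euc, ang))

# The k nearest neighbors under each metric.
top_euc = set(sorted(range(n), key=lambda i: euc[i])[:k])
top_ang = set(sorted(range(n), key=lambda i: ang[i])[:k])

print("max |d_euc^2 - 2*d_cos| =", max_gap)
print("same top-k neighbors:", top_euc == top_ang)
```

So the choice of metric label mainly affects the distance values stored with the ground truth and how much work each distance computation does, which would be consistent with the mismatch not being obvious from recall alone.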