Slide 4 of 25
Notes:
Borrowed from the N-gram method of comparing free-text documents
A global signature is computed for each document.
The signature is typically represented by a histogram of the number of times that each substring of length N occurs in the document.
The dot product is used to determine the similarity between two documents.
CANDID calculates a global signature that represents such features as local textures, shapes, and colors.
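
The text-document N-gram comparison described in these notes can be sketched roughly as follows. This is a minimal illustration, not CANDID's actual code: the function names `ngram_signature` and `similarity`, the default N of 3, the use of character-level substrings, and the normalization of the dot product by the signature lengths are all assumptions made for the example.

```python
from collections import Counter
import math


def ngram_signature(text, n=3):
    """Histogram of how often each substring of length n occurs in text.

    Hypothetical helper; n=3 is an arbitrary choice for illustration.
    """
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def similarity(sig_a, sig_b):
    """Dot product of two signatures, normalized to the range [0, 1].

    The normalization is an assumption; the notes only mention a dot product.
    """
    # Counter returns 0 for substrings that are missing from sig_b.
    dot = sum(count * sig_b[gram] for gram, count in sig_a.items())
    norm_a = math.sqrt(sum(c * c for c in sig_a.values()))
    norm_b = math.sqrt(sum(c * c for c in sig_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty signature matches nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    a = ngram_signature("the quick brown fox")
    b = ngram_signature("the quick brown dog")
    c = ngram_signature("zzzzzzzzzzzz")
    print(similarity(a, b))  # high: most substrings are shared
    print(similarity(a, c))  # 0.0: no substrings in common
```

Dividing by the signature lengths keeps long documents from scoring high simply because their histograms contain larger counts. For images, the same comparison would presumably apply to histograms of texture, shape, and color features rather than substrings.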