February 15, 2020

Indexing 2: inverted index

A search engine represents documents as vectors over the vocabulary. If we arrange all documents as row vectors in a matrix, with one row per document and one column per keyword, then the column vectors are inverted lists: for each keyword in the vocabulary, the column records the occurrences of that keyword across all documents.
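
The row/column view can be made concrete with a small sketch. The toy corpus, the whitespace tokenizer, and the names `docs`, `tokenize`, `matrix`, and `build_inverted_index` are all assumptions made for illustration; a real engine would not materialize the full matrix, since it is mostly zeros.

```python
from collections import defaultdict

# Hypothetical toy corpus; a document id is its position in the list.
docs = [
    "the cat sat",
    "the dog sat",
    "the cat ran",
]


def tokenize(text):
    """Naive tokenizer (an assumption): lowercase and split on whitespace."""
    return text.lower().split()


# Vocabulary: the sorted set of distinct keywords across all documents.
vocabulary = sorted({tok for doc in docs for tok in tokenize(doc)})

# Matrix of term counts: one row per document, one column per keyword.
matrix = [[tokenize(doc).count(term) for term in vocabulary] for doc in docs]


def build_inverted_index(documents):
    """Map each keyword to the sorted list of ids of documents containing it."""
    index = defaultdict(list)
    for doc_id, doc in enumerate(documents):
        for term in set(tokenize(doc)):
            index[term].append(doc_id)
    return dict(index)


def column_as_inverted_list(matrix, col):
    """Read one matrix column as an inverted list: ids of rows with a nonzero entry."""
    return [doc_id for doc_id, row in enumerate(matrix) if row[col] > 0]


inverted_index = build_inverted_index(docs)

for col, term in enumerate(vocabulary):
    print(term, column_as_inverted_list(matrix, col), inverted_index[term])
```

Each printed line shows the same list twice: once read off a matrix column and once taken from the index built directly from the documents. Storing only the nonzero entries of each column is what keeps the inverted index compact.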

