Text Retrieval and Search Engines - Digital Marketing Consultant

Online Course Support | Text Retrieval and Search Engines

What is the advantage of tokenization (normalization and stemming) before indexing?

By Admin, August 15, 2021

Question 5. What is the advantage of tokenization (normalization and stemming) before indexing? (1 point) Improves performance by mapping words with similar meanings into the same indexing term…
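To make the idea of mapping related word forms onto one indexing term concrete, here is a minimal Python sketch. The suffix list and the helper names `normalize` and `index_terms` are assumptions made up for illustration; a real engine would typically use a proper stemmer with many more rules.

```python
import re

# Hypothetical suffix list for illustration only; real stemmers use far more rules.
SUFFIXES = ("ing", "ed", "es", "s")

def normalize(token: str) -> str:
    """Lowercase a token and strip at most one common suffix (toy stemmer)."""
    token = token.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are not over-stemmed.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def index_terms(text: str) -> list[str]:
    """Split text on non-letters and normalize each token."""
    return [normalize(t) for t in re.findall(r"[A-Za-z]+", text)]

# Four surface forms collapse into a single indexing term.
terms = index_terms("Searching, searched, searches, SEARCH")
print(terms)
```

With this toy normalizer, a query for "search" would match documents containing any of the four variants, which is the effect the answer option describes.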

Let w1, w2, and w3 represent three words in the dictionary of an inverted index. Suppose we have the following document frequency distribution:

By Admin, August 15, 2021

Question 1. Let w1, w2, and w3 represent three words in the dictionary of an inverted index. Suppose we have the following document frequency distribution: Word Document Frequency w1…

Which of the following is false?

By Admin, August 15, 2021

Question 2. Which of the following is false? (1 point) Search engines rely on the text push mode. Recommender systems are based on the text…

Consider the following retrieval formula: Where c(w, D) is the count of word w in document D, dl is the document length, avdl is the average document length of the collection, N is the total number of documents in the collection,

By Admin, August 15, 2021

Question 9. Consider the following retrieval formula: Where c(w, D) is the count of word w in document D, dl is the document length, avdl is the average document…

Assume the same scenario as in Question 7, but with TF-IDF weighting. Which of the following words do you expect to have the highest weight in this case?

By Admin, August 15, 2021

Question 8. Assume the same scenario as in Question 7, but with TF-IDF weighting. Which of the following words do you expect to have the highest weight in this…
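The contrast between TF-only and TF-IDF weighting can be sketched on a toy collection. Everything below is an assumption for illustration: the three documents are invented, and the IDF form log((N + 1) / df) is just one common variant, not necessarily the one the course uses.

```python
import math
from collections import Counter

# Hypothetical three-document collection; the first document plays the baseball article.
collection = [
    "the pitcher threw the baseball and the batter hit the baseball",
    "the election results surprised the country",
    "the market fell as the report came out",
]
tokenized = [doc.split() for doc in collection]
n_docs = len(tokenized)

# Document frequency: number of documents containing each word.
df = Counter(word for doc in tokenized for word in set(doc))

def tf_weights(doc_tokens):
    """Raw term counts (TF only)."""
    return dict(Counter(doc_tokens))

def tfidf_weights(doc_tokens):
    """TF multiplied by log((N + 1) / df), an assumed IDF variant."""
    return {w: c * math.log((n_docs + 1) / df[w])
            for w, c in Counter(doc_tokens).items()}

def top_term(weights):
    """Word with the largest weight."""
    return max(weights, key=weights.get)

print(top_term(tf_weights(tokenized[0])))
print(top_term(tfidf_weights(tokenized[0])))
```

In this toy setup the most frequent word wins under TF alone, while the IDF factor discounts words that appear in every document and lets a topic-specific word come out on top.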

Consider the instantiation of the vector space model where documents and queries are represented as bit vectors. Assume we have the following query and two documents:

By Admin, August 15, 2021

Question 3. Consider the instantiation of the vector space model where documents and queries are represented as bit vectors. Assume we have the following query and two documents: Q…

In the vector space model (VSM), which of the following is a better way to measure similarity/distance?

By Admin, August 15, 2021

Question 10. In the vector space model (VSM), which of the following is a better way to measure similarity/distance? (1 point) Cosine similarity: cos(v_1, v_2) L2 distance:…
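One property that separates the two measures is sensitivity to vector length. The sketch below uses invented term-count vectors (an assumption for illustration) to compare a document with a copy of itself that is twice as long.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    """Cosine of the angle between u and v."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

def l2_distance(u, v):
    """Euclidean (L2) distance between u and v."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

doc = [2, 1, 0]      # hypothetical term counts
doubled = [4, 2, 0]  # the same document concatenated with itself

print(cosine(doc, doubled))       # approximately 1: same direction
print(l2_distance(doc, doubled))  # clearly non-zero: lengths differ
```

Cosine similarity depends only on the direction of the vectors, so the doubled document looks identical to the original, whereas L2 distance treats it as a different point.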

Suppose we compute the term vector for a baseball sports news article in a collection of general news articles using TF weighting only. Which of the following do you expect to have the highest weight?

By Admin, August 15, 2021

Question 7. Suppose we compute the term vector for a baseball sports news article in a collection of general news articles using TF weighting only. Which of the following…

In the “simplest” VSM instantiation, if we use word counts instead of 0-1 bit vectors and concatenate each document with itself, will the ranking list remain the same?

By Admin, August 15, 2021

Question 5. In the “simplest” VSM instantiation, if we use word counts instead of 0-1 bit vectors and concatenate each document with itself, will…
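The effect of concatenating each document with itself can be checked on a small example. This is a sketch under stated assumptions: the query and documents are invented, and similarity is taken to be the plain dot product of raw word-count vectors.

```python
from collections import Counter

def count_vector(text):
    """Word-count vector of a whitespace-tokenized, lowercased text."""
    return Counter(text.lower().split())

def dot_score(query, doc):
    """Dot product of the query and document count vectors."""
    q, d = count_vector(query), count_vector(doc)
    return sum(q[w] * d[w] for w in q)

def rank(query, docs):
    """Document names sorted by descending dot-product score."""
    return sorted(docs, key=lambda name: dot_score(query, docs[name]), reverse=True)

# Hypothetical query and documents.
query = "news about presidential campaign"
docs = {
    "d1": "news about presidential campaign food",
    "d2": "news of organic food campaign",
    "d3": "news",
}
# Concatenate every document with itself.
doubled = {name: text + " " + text for name, text in docs.items()}

print(rank(query, docs))
print(rank(query, doubled))
```

Doubling a document doubles every word count, so each dot-product score is multiplied by the same factor of two and the order of the documents does not change in this setup.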

Consider the same scenario as in Question 3, with dot product as the similarity measure. Which of the following is true?

By Admin, August 15, 2021

Question 4. Consider the same scenario as in Question 3, with dot product as the similarity measure. Which of the following is true? (1 point) Sim(Q,D1) = 4 …