Online Course Support | Text Retrieval and Search Engines

You are given the query Q = “online courses” and two documents: D1 = “online courses search engine” and D2 = “online education is affordable”.

3. Question 3 You are given the query Q = “online courses” and two documents: D1 = “online courses search engine” and D2 = “online education is affordable”. Assume you are using…

You are given a vocabulary composed of only four words: “the,” “computer,” “science,” and “technology.” Below are the probabilities of three of these four words given by a unigram language model.

2. Question 2 You are given a vocabulary composed of only four words: “the,” “computer,” “science,” and “technology.” Below are the probabilities of three of these four words given by…
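
Because a unigram language model is a probability distribution over the vocabulary, the four word probabilities must sum to one, so the missing value is whatever mass the other three leave over. The sketch below illustrates this; the three probabilities in `known` are hypothetical placeholders, since the quiz's actual values are not shown in this listing.

```python
# Hypothetical probabilities for three of the four vocabulary words.
# These are NOT the quiz's values; they only illustrate the calculation.
known = {"the": 0.4, "computer": 0.1, "science": 0.2}


def missing_probability(known_probs):
    """Return the probability mass left for the one remaining word."""
    remaining = 1.0 - sum(known_probs.values())
    if remaining < -1e-12:
        raise ValueError("known probabilities already exceed 1")
    return max(remaining, 0.0)


p_technology = missing_probability(known)
print(round(p_technology, 6))
```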

Refer to the Rocchio feedback formula in the lectures. If you want to eliminate the effect of non-relevant documents when doing feedback, which of the following parameters must be set to zero?

8. Question 8 Refer to the Rocchio feedback formula in the lectures. If you want to eliminate the effect of non-relevant documents when doing feedback, which of the following parameters…

Assume you are using a unigram language model to calculate the probabilities of phrases. Then, the probabilities of generating the phrases “study text mining” and “text mining study” are not equal, i.e., $P(\text{“study text mining”}) \neq P(\text{“text mining study”})$.

1. Question 1 Assume you are using a unigram language model to calculate the probabilities of phrases. Then, the probabilities of generating the phrases “study text mining” and “text mining…
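
The statement can be checked numerically. A unigram model generates each word independently, so a phrase probability is the product of its word probabilities, and a product does not depend on the order of its factors. The sketch below uses hypothetical word probabilities (the `unigram` dictionary is an assumption for illustration only).

```python
from math import isclose, prod

# Hypothetical unigram probabilities, chosen only for illustration.
unigram = {"study": 0.01, "text": 0.02, "mining": 0.005}


def phrase_probability(phrase, model):
    """Multiply the independent word probabilities of a phrase."""
    return prod(model[word] for word in phrase.split())


p_a = phrase_probability("study text mining", unigram)
p_b = phrase_probability("text mining study", unigram)

# Compare with a tolerance, since floating-point products can differ
# in the last bits depending on multiplication order.
print(isclose(p_a, p_b))
```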

Let $q$ be the original query vector, $D_R = \{P_1, \ldots, P_n\}$ be the set of positive document vectors, and $D_N = \{N_1, \ldots, N_m\}$ be the set of negative document vectors. Let $q_1$ be the expanded query vector after applying Rocchio on $D_R$ and $D_N$ with positive parameter values $\alpha$, $\beta$, and $\gamma$. Let $q_2$ be the expanded query vector after applying Rocchio on $D_R$ and $D_N$ with the same values for $\alpha$ and $\beta$, but with $\gamma$ set to zero. Which of the following is correct?

9. Question 9 Let $q$ be the original query vector, $D_R = \{P_1, \ldots, P_n\}$ be the set of positive document vectors, and $D_N = \{N_1, \ldots, N_m\}$ be the set of negative document vectors. Let $q_1$…
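
To make the roles of the three parameters concrete, here is a minimal sketch of Rocchio feedback, assuming the commonly used form: alpha times the original query, plus beta times the centroid of the positive documents, minus gamma times the centroid of the negative documents. The lecture's exact formula is not reproduced in this listing, and the function name `rocchio` and the toy vectors are hypothetical.

```python
def rocchio(q, positives, negatives, alpha, beta, gamma):
    """Return alpha*q + beta*centroid(positives) - gamma*centroid(negatives)."""
    dims = len(q)

    def centroid(vectors):
        # An empty set contributes nothing to the expanded query.
        if not vectors:
            return [0.0] * dims
        return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]

    pos_c = centroid(positives)
    neg_c = centroid(negatives)
    return [alpha * q[i] + beta * pos_c[i] - gamma * neg_c[i]
            for i in range(dims)]


# Toy three-term vocabulary; all values are made up for illustration.
q = [1.0, 0.0, 0.0]
D_R = [[1.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
D_N = [[0.0, 0.0, 2.0]]

q1 = rocchio(q, D_R, D_N, alpha=1.0, beta=0.5, gamma=0.25)
q2 = rocchio(q, D_R, D_N, alpha=1.0, beta=0.5, gamma=0.0)
print(q1)
print(q2)
```

In this toy case the two results differ only in the dimension where the negative document has weight, which is the term gamma controls.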

Assume you are using Dirichlet Prior smoothing to estimate the probabilities of words in a certain document. What happens to the smoothed probability of the word when the parameter $\mu$ is increased?

6. Question 6 Assume you are using Dirichlet Prior smoothing to estimate the probabilities of words in a certain document. What happens to the smoothed probability of the word when…
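
The effect of the parameter can be explored with a small sketch, assuming the usual Dirichlet prior estimate p(w|d) = (c(w,d) + mu * p(w|C)) / (|d| + mu), where c(w,d) is the word count in the document, |d| the document length, and p(w|C) the collection probability. The counts and the collection probability below are hypothetical.

```python
def dirichlet_smoothed(count_in_doc, doc_length, p_collection, mu):
    """Dirichlet prior smoothed estimate of p(w|d)."""
    return (count_in_doc + mu * p_collection) / (doc_length + mu)


# Hypothetical word: 5 occurrences in a 100-word document (MLE = 0.05),
# with collection probability 0.01.
mus = [0, 10, 100, 1000, 10000]
values = [dirichlet_smoothed(5, 100, 0.01, mu) for mu in mus]

for mu, value in zip(mus, values):
    print(mu, round(value, 5))
```

With mu = 0 the estimate equals the maximum likelihood estimate from the document alone; as mu grows, the estimate moves toward the collection probability.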

Assume the same scenario as in Question 3, but using linear interpolation (Jelinek-Mercer) smoothing with $\lambda = 0.5$. Furthermore, you are given the following probabilities of some of the words in the collection language model:

4. Question 4 Assume the same scenario as in Question 3, but using linear interpolation (Jelinek-Mercer) smoothing with $\lambda = 0.5$. Furthermore, you are given the following probabilities of some…
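
As a rough sketch of how this scoring could be carried out, the code below computes a query likelihood with Jelinek-Mercer smoothing, assuming the form p(w|d) = (1 - lambda) * c(w,d)/|d| + lambda * p(w|C). Some texts attach lambda to the document term instead, but with lambda = 0.5 both conventions give the same result. The collection probabilities in `p_c` are hypothetical placeholders, because the quiz's actual table is not shown in this listing.

```python
def jm_prob(word, doc_tokens, p_collection, lam):
    """Jelinek-Mercer smoothed p(w|d): mix document MLE and collection model."""
    p_ml = doc_tokens.count(word) / len(doc_tokens)
    return (1 - lam) * p_ml + lam * p_collection[word]


def query_likelihood(query, doc, p_collection, lam):
    """Multiply the smoothed probabilities of the query words."""
    doc_tokens = doc.split()
    score = 1.0
    for word in query.split():
        score *= jm_prob(word, doc_tokens, p_collection, lam)
    return score


# Hypothetical collection probabilities (NOT the quiz's values).
p_c = {"online": 0.1, "courses": 0.05}

d1 = "online courses search engine"
d2 = "online education is affordable"

s1 = query_likelihood("online courses", d1, p_c, lam=0.5)
s2 = query_likelihood("online courses", d2, p_c, lam=0.5)
print(s1 > s2)
```

Without smoothing, D2 would receive a score of zero because it lacks “courses”; the collection term keeps its score small but nonzero.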