1. Consider applying PLSI to the following corpus (each line is a separate document):

   ABACAA
   BCABABB
   CACBAB

   Furthermore, assume that there are two topics, and that A, B, and C are the only types available.

   a. Now suppose we initially assigned words to topics as above (black for topic 1, red/underline for topic 2). Calculate the topic-word vectors and document-topic vectors.
   b. Use the vectors generated in part (a) to calculate the topic probability for each word in the corpus.
   c. Use the result of (b) to recalculate the topic-word vectors and document-topic vectors.
   d. Calculate whether the vectors in (c) are better for the set of documents.

2. Now consider this corpus (each line is a separate sentence):

   ABCCC
   ADBB
   CDADD
   CABB
   DACB

   Suppose we want to build a bigram model based on the corpus above. Assume we have both a begin-sentence and an end-sentence symbol for each sentence. Calculate the perplexity of each sentence (separately) for each of the two cases:

   a. The base case (no smoothing).
   b. Using Laplace (plus 1) smoothing.

   Also show the probabilities for each bigram (preferably in a 2-d matrix).
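For question 2, the bigram counts and per-sentence perplexities can be checked with a short script. This is only a sketch under stated assumptions, not the posted solution. The symbol names `<s>` and `</s>` are made up here. The Laplace vocabulary is assumed to be the four letters plus the end symbol (V = 5). Perplexity is assumed to be normalized by the number of bigrams in the sentence, including the transition into the end symbol. A course may use different conventions for either choice.

```python
import math
from collections import Counter

corpus = ["ABCCC", "ADBB", "CDADD", "CABB", "DACB"]
BOS, EOS = "<s>", "</s>"  # assumed names for the begin/end sentence symbols


def bigrams(sentence):
    """Return the list of (previous, next) pairs, including both boundary symbols."""
    tokens = [BOS] + list(sentence) + [EOS]
    return list(zip(tokens, tokens[1:]))


bigram_counts = Counter(bg for s in corpus for bg in bigrams(s))
context_counts = Counter(prev for s in corpus for prev, _ in bigrams(s))

# Assumption: the tokens that can follow a context are the letters plus EOS.
vocab = sorted({ch for s in corpus for ch in s}) + [EOS]


def prob(prev, nxt, k=0):
    """P(nxt | prev) with add-k smoothing; k=0 is the unsmoothed base case."""
    return (bigram_counts[(prev, nxt)] + k) / (context_counts[prev] + k * len(vocab))


def perplexity(sentence, k=0):
    """Per-bigram perplexity of one sentence; infinite if any bigram has probability 0."""
    pairs = bigrams(sentence)
    log_sum = 0.0
    for prev, nxt in pairs:
        p = prob(prev, nxt, k)
        if p == 0:
            return math.inf
        log_sum += math.log(p)
    return math.exp(-log_sum / len(pairs))


for k, label in [(0, "no smoothing"), (1, "Laplace")]:
    print(label)
    # Rows are contexts, columns are next tokens: the 2-d probability matrix.
    for prev in [BOS] + vocab[:-1]:
        print(prev, [round(prob(prev, nxt, k), 3) for nxt in vocab])
    for s in corpus:
        print(s, round(perplexity(s, k), 3))
```

Because every sentence being scored is also in the training corpus, the unsmoothed perplexities are all finite. A sentence containing an unseen bigram would get infinite perplexity in the base case, which is the situation smoothing is meant to handle.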
Feb 26, 2023