Problem 1) import nltk nltk.download() Scroll through the list and find a corpus to download (do not use brown or inaugural) Answer the following: 1. What directory does the corpus download to? 2. How...


Problem 1)

import nltk
nltk.download()

Scroll through the list and find a corpus to download (do not use brown or inaugural). Answer the following:
1. What directory does the corpus download to?
2. How many files are there for that corpus?
3. In 3-4 sentences, what is the purpose of that corpus and what genres does it cover?

Problem 2)

Below are examples of how to access your corpus:

Example                        Description
fileids()                      the files of the corpus
fileids([categories])          the files of the corpus corresponding to these categories
categories()                   the categories of the corpus
categories([fileids])          the categories of the corpus corresponding to these files
raw()                          the raw content of the corpus
raw(fileids=[f1,f2,f3])        the raw content of the specified files
raw(categories=[c1,c2])        the raw content of the specified categories
words()                        the words of the whole corpus
words(fileids=[f1,f2,f3])      the words of the specified fileids
words(categories=[c1,c2])      the words of the specified categories
sents()                        the sentences of the whole corpus
sents(fileids=[f1,f2,f3])      the sentences of the specified fileids
sents(categories=[c1,c2])      the sentences of the specified categories
abspath(fileid)                the location of the given file on disk
encoding(fileid)               the encoding of the file (if known)
open(fileid)                   open a stream for reading the given corpus file
root                           the path to the root of the locally installed corpus
readme()                       the contents of the README file of the corpus

Answer the following (you might have to try different corpora than in Problem 1; try a few until you find one with the required info):
1. How many categories are in your corpus?
2. How many sentences are in the corpus?
3. How many sentences are in each category?
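The three Problem 2 questions map onto the categories() and sents() calls in the table above. The following is a minimal sketch of that mapping, not the assignment's official answer. TinyCorpus and its sample sentences are hypothetical stand-ins so the example runs without downloading anything; sentence_counts is likewise a name introduced only for illustration.

```python
class TinyCorpus:
    """Hypothetical stand-in that mimics the reader methods used below."""

    def __init__(self, sents_by_category):
        self._data = sents_by_category

    def categories(self):
        return sorted(self._data)

    def sents(self, categories=None):
        # No argument means the whole corpus, as in the table above.
        cats = self.categories() if categories is None else categories
        return [s for c in cats for s in self._data[c]]


def sentence_counts(reader):
    """Return (number of categories, total sentences, sentences per category)."""
    cats = reader.categories()
    per_category = {c: len(reader.sents(categories=[c])) for c in cats}
    return len(cats), len(reader.sents()), per_category


# Made-up sample data, not taken from any real corpus.
demo = TinyCorpus({
    "news": [["A", "short", "sentence", "."], ["Another", "one", "."]],
    "fiction": [["Once", "upon", "a", "time", "."]],
})
n_categories, n_sentences, per_category = sentence_counts(demo)
print(n_categories, n_sentences, per_category)
```

With NLTK installed and a categorized corpus downloaded, the same function should accept a real reader such as brown. In corpora where a file can belong to several categories, the per-category counts may add up to more than the total.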
For instance, for brown, you can import it with:

from nltk.corpus import brown
brown.[function]
brown.raw()

Problem 3)

First: pip install matplotlib

import nltk
from nltk.corpus import inaugural
word1 = 'country'
word2 = 'city'
cfd = nltk.ConditionalFreqDist(
    (target, fileid[:4])
    for fileid in inaugural.fileids()
    for w in inaugural.words(fileid)
    for target in [word1, word2]
    if w.lower().startswith(target))
cfd.plot()

Try finding two words to replace country and city. Find one word that is becoming more popular in recent years (2009) and one that was popular but no longer is.
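To see what the Problem 3 generator expression actually counts, it can help to reproduce the same logic with the standard library. The sketch below swaps nltk.ConditionalFreqDist for collections.Counter, so it is an illustration of the counting rule rather than the NLTK implementation. The function name prefix_counts_by_year and the word lists are invented for the example; they are not real inaugural-address counts.

```python
from collections import Counter, defaultdict


def prefix_counts_by_year(words_by_fileid, targets):
    """Count, per target and per year, the words that start with the target.

    The year is taken from the first four characters of the file id,
    mirroring fileid[:4] in the Problem 3 code.
    """
    counts = defaultdict(Counter)
    for fileid, words in words_by_fileid.items():
        year = fileid[:4]
        for w in words:
            for target in targets:
                if w.lower().startswith(target):
                    counts[target][year] += 1
    return counts


# Made-up word lists keyed by inaugural-style file ids.
sample = {
    "1789-Washington.txt": ["Country", "citizens", "countrymen"],
    "2009-Obama.txt": ["city", "country", "nation"],
}
counts = prefix_counts_by_year(sample, ["country", "city"])
for target in ["country", "city"]:
    print(target, dict(counts[target]))
```

Because the match uses startswith, "countrymen" is counted under "country", while "citizens" is not counted under "city". This is worth keeping in mind when choosing replacement words: a short target also picks up every longer word that begins with it.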


Answered Same Day: Jun 03, 2021

Mani answered on Jun 03 2021

Solutions
Problem #1:
import nltk
# download command
nltk.download()
a. The files were downloaded to the /home//nltk_data/corpora/ location.
b. The corpus has 473 files.
c. The corpus is the Patient Information Leaflet (pil) corpus. It contains detailed information about various medicines: their usage, their contents, what they are used for and how, and so on.
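Answers a and b can be checked programmatically through the root attribute and the fileids() method listed in the Problem 2 table. This is a minimal sketch under assumptions: StandInReader, its path, and its file names are hypothetical placeholders so the example runs without a download, and describe_corpus is a helper name introduced here.

```python
def describe_corpus(reader):
    """Return (install location, number of files) for a corpus reader."""
    return str(reader.root), len(reader.fileids())


class StandInReader:
    """Hypothetical stand-in exposing only root and fileids()."""

    root = "/tmp/nltk_data/corpora/example"  # placeholder path, not a real install

    def fileids(self):
        return ["a.txt", "b.txt", "c.txt"]  # placeholder file names


location, n_files = describe_corpus(StandInReader())
print(location, n_files)
```

With NLTK installed and the corpus downloaded, passing the real reader (assuming the leaflet corpus is exposed as nltk.corpus.pil) should report the actual directory and file count for comparison with the answers above.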
Problem...
