Hello, thank you for your help! The assignment instructions are in the PDF entitled Final.pdf. The assignment creates two short Python programs that automatically predict whether an email is spam or not spam (given a CSV file named spam.csv). The screenshots entitled final_part_one.py and final_part_two.py include code that MUST be included in the resulting programs (part 1 and part 2). The screenshots entitled EXAMPLE 1 and EXAMPLE 2 are my attempts at the code so far; the style of the code should be similar to these examples. My code is not running and I cannot figure out why. Also, I would like a screenshot of the decision tree in WEKA. This is the link for WEKA if needed: https://waikato.github.io/weka-wiki/downloading_weka/. Thank you.




Python Program to compute features

There is a file named final_part_one.py. This file contains the following functions:
- read_data_from_file()
  o This function will read the spam.csv file and return a list
  o Each item in the list will contain a line in the spam.csv file
- write_features_to_file(text_length, does_have_spammy_words, does_have_links, number_of_symbols)

You should add the following functions:
- does_have_links(sms_message)
  o This function returns the string TRUE if an SMS message has links and FALSE if an SMS message does not have links
- does_have_spammy_words(sms_message)
  o This function returns the string TRUE if the SMS message contains spammy words and FALSE if it does not
  o Here is the list of spammy words you should use: spammy_words = ['WINNER', 'URGENT', 'FreeMsg', 'Congrats!', 'free', 'FREE', 'winner', 'PRIVATE!', 'URGENT!', '4U', 'Free trial']
- length_of_text(sms_message)
  o This function returns the number of characters, including spaces, in the text message
- main()
  o Read data from the csv file
  o Convert each SMS message into a set of features (does_have_links, does_have_spammy_words, length_of_text)
  o Write these features to a file named features.csv using the write_features_to_file function

After you have finished creating features.csv, add a heading to the first row of the file. The heading should be:
- LENGTH_OF_TEXT
- DOES_HAVE_SPAMMY_WORDS
- DOES_HAVE_LINKS
- CLASS_LABEL

Next, you will import features.csv into WEKA.

Weka

Select the csv file you created in the first Python program. Make sure to select CSV data files under File Format. After you press select, your screen should look similar to the one below (your attributes will be different):

[Screenshot]

Next, select the Classify tab and make sure the field (Nom) value is selected. Then press the Choose button, select trees and choose J48. J48 is the decision tree classifier. Finally, press Start.
Follow the instructions in red in the image below.

[Screenshot]

You should see a visualization similar to the one below.

[Screenshot]

Python program to predict whether an SMS message is spam or not spam (add to your previous program)

Next, you will work in a file named final_part_two.py. This file contains the following functions:
- read_data_from_file()
  o This function will read the spam.csv file and return a list
  o Each item in the list will contain a line in the spam.csv file

Copy and paste the following functions from the file final_part_one.py to final_part_two.py:
- does_have_links(sms_message)
  o This function returns the string TRUE if an SMS message has links and FALSE if an SMS message does not have links
- does_have_spammy_words(sms_message)
  o This function returns the string TRUE if the SMS message contains spammy words and FALSE if it does not
  o Here is the list of spammy words you should use: spammy_words = ['WINNER', 'URGENT', 'FreeMsg', 'Congrats!', 'free', 'FREE', 'winner', 'PRIVATE!', 'URGENT!', '4U', 'Free trial']
- length_of_text(sms_message)
  o This function returns the number of characters, including spaces, in the text message

Add the following functions to your file:
- predict_spam(does_have_links, does_have_spammy_words, length_of_text)
  o In this function, you will write your if/else rules from the image in WEKA.
  o This function should return spam or ham to the main function
- main()
  o Read the data from the file spam.csv
  o Convert each SMS message into a set of features (does_have_links, does_have_spammy_words, length_of_text)
  o Pass these features into the predict_spam function
  o Print out the class value returned from the predict_spam function

final_part_one.py:

import csv

# read the spam csv file into a list
def read_data_from_file():
    data = []
    with open('spam.csv') as csv_file:
        csv_reader = csv.reader(csv_file, delimiter=',')
        # skip the first row
        next(csv_reader)
        for row in csv_reader:
            data.append(row)
    return data

# write features to a file
def write_features_to_file(text_length, does_have_spammy_words, does_have_links, class_label):
    # writing to csv file
    row = [text_length, does_have_spammy_words, does_have_links, class_label]
    with open("features.csv", 'a', newline='', encoding='utf-8') as csvfile:
        # creating a csv writer object
        csvwriter = csv.writer(csvfile)
        # writing the data rows
        csvwriter.writerow(row)

final_part_two.py:

import csv

# read the spam csv file into a list
def read_data_from_file():
    data = []
    with open('spam.csv') as csv_file:
        csv_reader = csv.reader(csv_file, delimiter=',')
        # skip the first row
        next(csv_reader)
        for row in csv_reader:
            data.append(row)
    return data
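As a rough illustration of what read_data_from_file() hands back, the sketch below runs the same csv.reader steps on an in-memory sample instead of spam.csv. The sample rows, the header names v1 and v2, and the column order (label first, message second) are assumptions made for the example; the real spam.csv may be laid out differently. Note that csv.reader accepts only a one-character delimiter, so a comma is passed as delimiter=','.

```python
import csv
import io

# Hypothetical stand-in for spam.csv: a header row, then label,message rows.
sample = io.StringIO(
    'v1,v2\n'
    'ham,"Ok, see you at 5"\n'
    'spam,WINNER! Claim your prize at www.example.com\n'
)

# The delimiter must be exactly one character.
csv_reader = csv.reader(sample, delimiter=',')

# Skip the header row, as the provided function does.
next(csv_reader)

# Each remaining row becomes a list of column values.
rows = [row for row in csv_reader]

for label, message in rows:
    print(label, '|', message)
```

The quoted message keeps its internal comma because the csv module handles quoting, which is one reason to prefer it over splitting each line on commas by hand.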
Answered Same Day, Nov 27, 2022

Sanskar answered on Nov 28 2022
29 Votes
SPAM OR HAM
Assignment No. 11461
By: Sanskar
Status: Completed, on time
Requirements:
1. A working Python installation.
2. Basic knowledge of Python.
3. Basic knowledge of reading from files.
4. All files must be placed in one folder.
Introduction
The goal is to check whether a message is spam or not. A file named spam.csv is provided for this purpose. The program automatically checks the length of the text, whether it contains any of the specified spammy words, whether it contains any kind of link, and the class label. After these checks it reports whether the message is spam or not (ham). The spammy words are declared in a list inside the does_have_spammy_words function; adding more words there may improve the accuracy of the code.
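The three checks described above can be sketched roughly as follows. This is a minimal sketch, not the graded solution: the word list is copied from the assignment text and the link markers mirror the link_list in the code further down, but the plain case-sensitive substring matching is an assumption (it will, for example, flag "freedom" because it contains "free").

```python
# Word list copied from the assignment text.
SPAMMY_WORDS = ['WINNER', 'URGENT', 'FreeMsg', 'Congrats!', 'free', 'FREE',
                'winner', 'PRIVATE!', 'URGENT!', '4U', 'Free trial']

# Substrings treated as evidence of a link (same markers as link_list below).
LINK_MARKERS = ['.com', 'https://', 'www.', '.net', 'http://']


def does_have_links(sms_message):
    """Return 'TRUE' if any link marker occurs in the message, else 'FALSE'."""
    # Assumption: simple substring search, no URL parsing.
    if any(marker in sms_message for marker in LINK_MARKERS):
        return 'TRUE'
    return 'FALSE'


def does_have_spammy_words(sms_message):
    """Return 'TRUE' if any spammy word occurs in the message, else 'FALSE'."""
    # Assumption: case-sensitive substring search, no word boundaries.
    if any(word in sms_message for word in SPAMMY_WORDS):
        return 'TRUE'
    return 'FALSE'


def length_of_text(sms_message):
    """Return the number of characters in the message, spaces included."""
    return len(sms_message)
```

The functions return the strings TRUE and FALSE rather than Python booleans because the assignment asks for strings, which WEKA can then read as a nominal attribute.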
There are 4 files in total, plus 1 picture of the WEKA tree diagram:
1. final_part_one.py
2. final_part_two.py
3. spam.csv
4. features.csv
NOTE: To run or test the code, delete the features.csv file and then run final_part_one.py. The features.csv file will be generated automatically.
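The manual delete is needed because write_features_to_file opens features.csv in append mode, so rows from earlier runs would otherwise pile up. One possible alternative, sketched below, is to recreate the file at the start of each run and write the heading row at the same time. The helper name reset_features_file and the HEADER constant are hypothetical and not part of the provided files; the column names come from the assignment text.

```python
import csv

# Heading row required by the assignment.
HEADER = ['LENGTH_OF_TEXT', 'DOES_HAVE_SPAMMY_WORDS', 'DOES_HAVE_LINKS', 'CLASS_LABEL']


def reset_features_file(path='features.csv'):
    """Create (or overwrite) the features file so it holds only the heading row."""
    # Mode 'w' truncates an existing file, so rows from old runs are discarded.
    with open(path, 'w', newline='', encoding='utf-8') as csvfile:
        csv.writer(csvfile).writerow(HEADER)
```

Calling reset_features_file() once at the top of main(), before any call to write_features_to_file, would leave the heading in the first row and the feature rows appended after it.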
WEKA tree diagram:
The code has been designed according to this WEKA tree.
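Since the tree image is not reproduced here, the sketch below only shows the general shape such a function could take once the J48 tree is translated into if/else rules. The function name and parameters follow the assignment, but the branch order and the 150-character threshold are placeholders invented for illustration; they are not taken from any real WEKA output and should be replaced with the splits from your own tree.

```python
def predict_spam(does_have_links, does_have_spammy_words, length_of_text):
    """Return 'spam' or 'ham' from the three features.

    PLACEHOLDER RULES: every branch below is hypothetical and must be
    replaced by the splits shown in your own J48 tree.
    """
    # Feature flags arrive as the strings 'TRUE' / 'FALSE'.
    if does_have_spammy_words == 'TRUE':
        return 'spam'
    if does_have_links == 'TRUE':
        return 'spam'
    # Hypothetical length threshold; J48 reports the real one as "<= N" / "> N".
    if length_of_text > 150:
        return 'spam'
    return 'ham'
```

Each internal node of the J48 tree maps to one if condition and each leaf maps to a return statement, so a deeper tree simply produces more nested branches.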
1. final_part_one.py
Each function is preceded by a comment that explains what it does.
Code:
import csv
# read the spam csv file into a list
def read_data_from_file():
    data = []
    with open('spam.csv') as csv_file:
        csv_reader = csv.reader(csv_file, delimiter=',')
        # skip the first row
        next(csv_reader)
        for row in csv_reader:
            data.append(row)
    return data
# write features to a file
def write_features_to_file(text_length, does_have_spammy_words, does_have_links, class_label):
    # writing to csv file
    row = [text_length, does_have_spammy_words, does_have_links, class_label]
    with open('features.csv', 'a', newline='', encoding='utf-8') as csvfile:
        # creating a csv writer object
        csvwriter = csv.writer(csvfile)
        # writing the data rows
        csvwriter.writerow(row)
# check whether the message contains a link
def does_have_links(sms_message):
    link_list = ['.com', 'https://', 'www.', '.net', 'http://']
    text =...