Hello, thank you for your help! The assignment instructions are in the PDF entitled Final.pdf. The assignment creates two short Python programs that automatically predict whether an email is spam or not spam (given a CSV file named spam.csv). The screenshots entitled final_part_one.py and final_part_two.py include code that MUST be included in the resulting programs (part 1 and part 2). The screenshots entitled EXAMPLE 1 and EXAMPLE 2 are my attempts at the code so far; the style of the code should be similar to these examples. My code is not running and I cannot figure out why. Also, I would like a screenshot of the decision tree in WEKA. This is the link for WEKA if needed: https://waikato.github.io/weka-wiki/downloading_weka/
Thank you.
Python Program to compute features

There is a file named final_part_one.py. This file contains the following functions:
- read_data_from_file()
  o This function will read the spam.csv file and return a list
  o Each item in the list will contain a line in the spam.csv file
- write_features_to_file(text_length, does_have_spammy_words, does_have_links, number_of_symbols)

You should add the following functions:
- does_have_links(sms_message)
  o This function returns the string TRUE if a SMS message has links and FALSE if a SMS message does not have links
- does_have_spammy_words(sms_message)
  o This function returns the string TRUE if the SMS message contains spammy words and FALSE if a SMS message does not have spammy words
  o Here is the list of spammy words you should use: spammy_words = ['WINNER', 'URGENT', 'FreeMsg', 'Congrats!', 'free', 'FREE', 'winner', 'PRIVATE!', 'URGENT!', '4U', 'Free trial']
- length_of_text(sms_message)
  o This function returns the number of characters, including spaces, in the text message
- main()
  o Read data from the csv file
  o Convert each SMS message into a set of features (does_have_links, does_have_spammy_words, length_of_text)
  o Write these features to a file named features.csv using the write_features_to_file function

After you have finished creating features.csv, add a heading to the first row of the file features.csv. The heading should be:
- LENGTH_OF_TEXT
- DOES_HAVE_SPAMMY_WORDS
- DOES_HAVE_LINKS
- CLASS_LABEL

Next, you will import features.csv into WEKA.

Weka

Select the csv file you created in the first Python program. Make sure to select CSV data files under File Format. After you press select, your screen should look similar to the one below (your attributes will be different):

Next, select the Classify tab and make sure the field (Nom) value is selected. Then press the Choose button, select trees, and choose J48. J48 is the decision tree classifier. Finally, press Start.
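The three feature functions listed above for final_part_one.py could be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not the required solution: the assignment does not say how a link should be detected, so the `link_markers` list is an assumption, and `message_to_features` is a hypothetical helper (not named in the assignment) that only shows the order of values expected by the heading row.

```python
# Spammy words copied from the assignment text.
spammy_words = ['WINNER', 'URGENT', 'FreeMsg', 'Congrats!', 'free', 'FREE',
                'winner', 'PRIVATE!', 'URGENT!', '4U', 'Free trial']

# ASSUMPTION: the assignment does not define what counts as a link,
# so these substrings are a guess and may need adjusting.
link_markers = ['http://', 'https://', 'www.']


def does_have_links(sms_message):
    # Return the string TRUE if any link marker appears in the message.
    lowered = sms_message.lower()
    for marker in link_markers:
        if marker in lowered:
            return 'TRUE'
    return 'FALSE'


def does_have_spammy_words(sms_message):
    # Case-sensitive substring check against the assignment's word list.
    for word in spammy_words:
        if word in sms_message:
            return 'TRUE'
    return 'FALSE'


def length_of_text(sms_message):
    # Number of characters, including spaces.
    return len(sms_message)


def message_to_features(class_label, sms_message):
    # HYPOTHETICAL helper: builds one row in the heading order
    # LENGTH_OF_TEXT, DOES_HAVE_SPAMMY_WORDS, DOES_HAVE_LINKS, CLASS_LABEL.
    return [length_of_text(sms_message),
            does_have_spammy_words(sms_message),
            does_have_links(sms_message),
            class_label]
```

Because the spammy-word check is a plain substring match, a word such as "freedom" would also count as containing "free"; whether that is acceptable depends on how the assignment is graded.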
Follow the instructions in red in the image below. You should see a visualization similar to the one below.

Python program to predict whether a SMS message is spam or not spam (add to your previous program)

There is a file named final_part_two.py. This file contains the following functions:
- read_data_from_file()
  o This function will read the spam.csv file and return a list
  o Each item in the list will contain a line in the spam.csv file

Copy and paste the following functions from the file final_part_one.py to final_part_two.py:
- does_have_links(sms_message)
  o This function returns the string TRUE if a SMS message has links and FALSE if a SMS message does not have links
- does_have_spammy_words(sms_message)
  o This function returns the string TRUE if the SMS message contains spammy words and FALSE if a SMS message does not have spammy words
  o Here is the list of spammy words you should use: spammy_words = ['WINNER', 'URGENT', 'FreeMsg', 'Congrats!', 'free', 'FREE', 'winner', 'PRIVATE!', 'URGENT!', '4U', 'Free trial']
- length_of_text(sms_message)
  o This function returns the number of characters, including spaces, in the text message

Add the following functions to your file:
- predict_spam(does_have_links, does_have_spammy_words, length_of_text)
  o In this function, you will write your if/else rules from the image in WEKA.
  o This function should return spam or ham to the main function
- main()
  o Read the data from the file spam.csv
  o Convert each SMS message into a set of features (does_have_links, does_have_spammy_words, length_of_text)
  o Pass these features into the predict_spam function
  o Print out the class value returned from the predict_spam function

final_part_one.py:

import csv

# read the spam csv file into a list
def read_data_from_file():
    data = []
    with open('spam.csv') as csv_file:
        csv_reader = csv.reader(csv_file, delimiter=',')
        # skip the first row
        next(csv_reader)
        for row in csv_reader:
            data.append(row)
    return data

# write features to a file
def write_features_to_file(text_length, does_have_spammy_words, does_have_links, class_label):
    # writing to csv file
    row = [text_length, does_have_spammy_words, does_have_links, class_label]
    with open("features.csv", 'a', newline='', encoding='utf-8') as csvfile:
        # creating a csv writer object
        csvwriter = csv.writer(csvfile)
        # writing the data rows
        csvwriter.writerow(row)

final_part_two.py:

import csv

# read the spam csv file into a list
def read_data_from_file():
    data = []
    with open('spam.csv') as csv_file:
        csv_reader = csv.reader(csv_file, delimiter=',')
        # skip the first row
        next(csv_reader)
        for row in csv_reader:
            data.append(row)
    return data
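The predict_spam function described for part two might take a shape like the sketch below. The actual branches have to come from the J48 tree that WEKA draws for your own features.csv, so every rule here, including the length threshold of 100, is a hypothetical placeholder and not a real result.

```python
def predict_spam(does_have_links, does_have_spammy_words, length_of_text):
    # HYPOTHETICAL rules: replace each branch with the ones shown
    # in your own J48 decision tree in WEKA.
    if does_have_spammy_words == 'TRUE':
        return 'spam'
    else:
        if does_have_links == 'TRUE':
            return 'spam'
        else:
            # Placeholder threshold, not taken from any real tree.
            if length_of_text > 100:
                return 'spam'
            else:
                return 'ham'


# Example feature triples (made up for illustration):
# (does_have_links, does_have_spammy_words, length_of_text)
sample_features = [('TRUE', 'TRUE', 150), ('FALSE', 'FALSE', 20)]
for links, spammy, length in sample_features:
    print(predict_spam(links, spammy, length))
```

In main(), the same call would be made once per row returned by read_data_from_file(), with the three arguments computed from that row's SMS message. Note that length_of_text is compared as a number here, so it should be passed as an int rather than a string.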