This is a machine learning assignment






Assignment-1-CS522-overview-slides

Assignment #1
Autism Detection

Overview
● Research has shown that gaze and fixation vision patterns differ for people with autism when compared to those without
● In this assignment, you will use data gathered by recording people’s gaze patterns as they watch videos

Sample Video
https://docs.google.com/file/d/1RfZo7LL5ldb7vr55ZW_A2m7ViKNgOBoR/preview

Data
● The data comes from a study involving 60 individuals
  ○ 25 non-autistic (aka control group), 35 autistic
  ○ Gaze was monitored using an off-the-shelf eye tracker
● The data for the participants has already been processed into .npy files for you
  ○ Check out this resource if you are unfamiliar: https://towardsdatascience.com/what-is-npy-files-and-why-you-should-use-them-603373c78883

Your Job
● You are going to perform a simple data analysis pipeline
● Your job is to build a classification model that accurately predicts whether or not someone is autistic based on their gaze patterns
● The goal of this assignment is to better understand feature engineering and how that affects the final classification model
● Skeleton.ipynb is available on Canvas
  ○ Contains 7 tasks for you to complete
  ○ Most tasks already have code
  ○ Some contain questions or require explanations; do not forget to leave responses!

Task #1 - Data Loading
● Read in the .npy video files and process them into X and Y vectors
● This is done for you
● For +5 extra credit, you may read in additional videos

Task #2 - Feature Engineering
● Calculate features to enhance your dataset
  ○ You may choose which features you would like to calculate
  ○ These can be statistical features (min, max, mean, etc.) or something more complex
  ○ Be creative!
● You should modify the function featurize_input(X,Y) to do the feature engineering
  ○ You can change the parameters of the function (or add additional parameters) if necessary (just note you may also need to modify the load functions)
● The functions load_data_autistic_fv() and load_data_non_autistic_fv() are written for you and call featurize_input(X,Y)

Task #3 - Balancing Classes
● Determine whether you need to re-balance your class labels
● Choose a method for re-balancing your data
  ○ Make sure to explain why you chose your method!
● If you are unfamiliar with class re-balancing, the method I prefer is called SMOTE.
  ○ You can read up on it here: https://towardsdatascience.com/applying-smote-for-class-imbalance-with-just-a-few-lines-of-code-python-cdf603e58688

Task #4 - Creating Classification Labels
● Use your autistic and control groups to assign the actual (read: real, ground truth) labels to the dataset
  ○ Remember classification is a supervised learning task
● Form your data and labels vectors to be used in the next steps
  ○ This is also done for you!
  ○ Use the “sanity check” to confirm your data has the correct shape

Task #5 - Normalizing
● Determine if the data needs to be normalized
  ○ Run the code and interpret the plots
  ○ Does the distribution look normal?
● Perform normalization of the data (regardless of whether you think it should be normalized or not)
  ○ Code for RobustScaler as studied in class is already written
  ○ You may modify this if you like

Task #6 - SVM Classification
● We have written a basic SVC model for you
  ○ Your job is to understand the effects of the feature engineering from task 2
  ○ You should re-run your SVC with multiple different combinations of feature engineering!
● Briefly explain your model performance and how the feature engineering played a role
  ○ You may use any statistics/graphs to support your answer

Task #7 - SVM Parameter Tuning
● SVM has many hyperparameters that could be tuned to produce a better model
● In this task, we expect you to reuse the code from task 6 but vary the hyperparameters to build two new models
  ○ You MUST try two different kernels
  ○ You should use the BEST set of features that you discovered in task 6
  ○ CS Majors are encouraged to be creative and also try tuning other parameters!
● The default configs for scikit-learn’s version of SVC are in the docs: https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html

Grading Scale
Task #1: 0% (written for you)
Task #2: 20%
Task #3: 20%
Task #4: 0% (written for you)
Task #5: 20%
Task #6: 25% (explanation of models)
Task #7: 15% (5% per kernel, 5% explanation)
Note: -10 penalty if none of the classifiers (from task 6) reach an accuracy of ~95%

What you should submit on Canvas:
1. Your assignment.ipynb file
2. assignment.pdf of your completed (and run) assignment.ipynb file

To get assignment.pdf (assuming you are in Jupyter):
● Select “run all” under the “Runtime” tab at the top of your environment
● When it has finished running, select “download as HTML” under the “File” tab
● Right click on the HTML document and save to PDF

A Note on Environment:
You should be using the version of Python (3.8+) and packages as outlined in Environment.pdf under the Assignment #1 Canvas page. Failure to follow these environment constraints may result in a 0 on the assignment. Please consult an instructor if you cannot meet the environment constraints or have questions about getting set up.

Another Note on Discord:
● If you have doubts regarding code, instructions, ideas, mistakes, solutions, etc., please use Discord!
● We are fine answering questions using email, but it’s better to discuss anything using Discord since everybody can see that “you are not alone”.
● We are not experts on all the topics, but we can all contribute by sharing our knowledge. We are all learning!

Environment.docx

Environment Setup – COSC 522 Machine Learning
Prof. Sai Swaminathan

To make it easier to help groups and provide useful feedback, we are providing only the commands that are necessary for assignment 1. On the one hand, you may feel restricted in some sense (keep in mind that for the final project you will have more freedom in terms of software), but on the other hand, you only have to worry about accomplishing the tasks of the assignment and nothing else.

Install Anaconda
● https://docs.anaconda.com/anaconda/install

Anaconda Prompt
1. Open the Anaconda Prompt
2. Create a new environment by typing:
   conda create --name ml python=3.8
   a. ml is just the name of the environment.
   b. If at some point in the semester the environment gets messed up, just create another one (with a different name).
3. Once you have an environment, activate it by using:
   conda activate ml
4. Install Jupyter and some dependencies in this environment, such as:
   pip install jupyterlab
   pip install -U scikit-learn
   pip install matplotlib
   pip install seaborn
   pip install sklearn-evaluation
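
Once the environment above is installed, the pipeline described in Tasks #2, #5, #6 and #7 can be tried out end to end. The following is only a minimal sketch, not the code from Skeleton.ipynb. It assumes each recording is an array of (x, y) gaze points, which the slides do not confirm, and it uses randomly generated stand-in data rather than the real .npy files. The helper names summary_features and make_synthetic_recordings are hypothetical. Class re-balancing (Task #3) is left out.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import RobustScaler
from sklearn.svm import SVC


def summary_features(recording):
    """Collapse one (n_points, 2) gaze recording into per-axis statistics."""
    stats = [
        recording.min(axis=0),
        recording.max(axis=0),
        recording.mean(axis=0),
        recording.std(axis=0),
    ]
    return np.concatenate(stats)  # 4 statistics x 2 axes = 8 features


def make_synthetic_recordings(n, spread, rng):
    """Hypothetical stand-in for the real data: n recordings of 200 (x, y) points."""
    return [rng.normal(loc=0.5, scale=spread, size=(200, 2)) for _ in range(n)]


rng = np.random.default_rng(0)
# Group sizes follow the slides (25 control, 35 autistic); the values are made up.
control = make_synthetic_recordings(25, 0.10, rng)
autistic = make_synthetic_recordings(35, 0.20, rng)

# Task #2 (features) and Task #4 (labels): 0 = control, 1 = autistic.
X = np.array([summary_features(r) for r in control + autistic])
y = np.array([0] * len(control) + [1] * len(autistic))

# Tasks #5 to #7: scale with RobustScaler, then compare two SVC kernels
# using 5-fold cross-validation.
scores = {}
for kernel in ("linear", "rbf"):
    model = make_pipeline(RobustScaler(), SVC(kernel=kernel))
    scores[kernel] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{kernel}: mean CV accuracy = {scores[kernel]:.2f}")
```

Putting the scaler inside the pipeline means it is fit only on the training folds during cross-validation, so no information from the held-out fold leaks into the scaling. Swapping the list of statistics in summary_features is one way to compare feature combinations, as Task #6 asks.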
Answered 3 days after Sep 21, 2022


Shreya answered on Sep 24 2022