Programming Assignment

Submission: A single PDF with your code (use any programming language), results and analysis.

Part 1

Learning algorithms (e.g., Q-learning, Monte Carlo, dynamic programming, double Q-learning, TD, SARSA and others). Choose any two algorithms and implement them on a grid world goal-searching problem.

1. Choose the two algorithms you are going to implement, briefly introduce them, and provide their pseudocode.
2. Design your own grid world example (it should be larger than 3×2).
3. Show your goal-searching process with a steps-to-go curve, the sum of squared errors, and/or a theoretical value table.
4. Please follow the project report guidelines and submit the report and code.
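As one possible starting point for Part 1, the sketch below runs tabular Q-learning on a small grid world and records the number of steps taken in each episode, which could be plotted as the steps-to-go curve. Everything specific here is an assumption made for illustration and is not given in the assignment: the 4×4 grid, the start and goal cells, the reward of -1 per step, the hyperparameters, and the function names.

```python
import random

ROWS, COLS = 4, 4                  # assumed grid size (larger than 3x2)
START, GOAL = (0, 0), (3, 3)       # assumed start and goal cells
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Move one cell; bumping into a wall leaves the agent in place."""
    r = min(max(state[0] + action[0], 0), ROWS - 1)
    c = min(max(state[1] + action[1], 0), COLS - 1)
    nxt = (r, c)
    reward = 0.0 if nxt == GOAL else -1.0   # assumed reward: -1 per non-goal step
    return nxt, reward, nxt == GOAL


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Epsilon-greedy tabular Q-learning; returns the Q table and steps per episode."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * 4 for r in range(ROWS) for c in range(COLS)}
    steps_per_episode = []
    for _ in range(episodes):
        state, done, steps = START, False, 0
        while not done and steps < 1000:     # cap guards against very long episodes
            if rng.random() < epsilon:
                a = rng.randrange(4)         # explore
            else:
                a = max(range(4), key=lambda i: q[state][i])  # exploit
            nxt, reward, done = step(state, ACTIONS[a])
            # Off-policy target: bootstrap from the best action in the next state.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (target - q[state][a])
            state, steps = nxt, steps + 1
        steps_per_episode.append(steps)
    return q, steps_per_episode


def greedy_path_length(q, limit=100):
    """Follow the greedy policy from START and count steps until GOAL (or limit)."""
    state, steps, done = START, 0, False
    while not done and steps < limit:
        a = max(range(4), key=lambda i: q[state][i])
        state, _, done = step(state, ACTIONS[a])
        steps += 1
    return steps
```

Calling `q, steps = q_learning()` gives a list `steps` that can be plotted against the episode index. Swapping the target line for the value of the action actually taken in the next state would turn this into SARSA, which is one way to obtain the second algorithm for comparison.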

Part 2

When you have a large grid world maze setup, it takes a long time for the agent to learn a value table. One way to address this challenge is to use a neural network to approximate the value function.

Two options are provided below; choose one of them to implement.

  1. a. Based on your results in Part 1, build a neural network (or deep neural network) to approximate the Q or V table you obtained. The network then generates the Q or V values used to guide the agent to the goal.
  1. b. Implement an actor-critic architecture (ADP) algorithm for grid world maze navigation. Here, an action network and a critic network are built to learn the Q values from scratch.
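For option b, a one-step actor-critic update could look roughly like the sketch below. It is deliberately simplified and differs from what the assignment asks for in two ways that should be kept in mind: the actor and critic are lookup tables (equivalent to linear layers on one-hot state inputs) rather than multi-layer networks, and the critic estimates state values V rather than Q values. The grid, rewards, learning rates, and function names are all assumptions chosen for illustration.

```python
import math
import random

ROWS, COLS = 4, 4                  # assumed grid size
START, GOAL = (0, 0), (3, 3)       # assumed start and goal cells
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Move one cell; bumping into a wall leaves the agent in place."""
    r = min(max(state[0] + action[0], 0), ROWS - 1)
    c = min(max(state[1] + action[1], 0), COLS - 1)
    nxt = (r, c)
    return nxt, (0.0 if nxt == GOAL else -1.0), nxt == GOAL


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic(episodes=500, actor_lr=0.2, critic_lr=0.2, gamma=0.9, seed=0):
    rng = random.Random(seed)
    states = [(r, c) for r in range(ROWS) for c in range(COLS)]
    value = {s: 0.0 for s in states}          # critic parameters: one V per state
    prefs = {s: [0.0] * 4 for s in states}    # actor parameters: action preferences
    steps_per_episode = []
    for _ in range(episodes):
        state, done, steps = START, False, 0
        while not done and steps < 1000:      # cap guards against very long episodes
            probs = softmax(prefs[state])
            a = rng.choices(range(4), weights=probs)[0]
            nxt, reward, done = step(state, ACTIONS[a])
            # TD error from the critic drives both updates.
            td_error = reward + (0.0 if done else gamma * value[nxt]) - value[state]
            value[state] += critic_lr * td_error
            # Policy-gradient step: gradient of log softmax is one_hot(a) - probs.
            for i in range(4):
                grad = (1.0 if i == a else 0.0) - probs[i]
                prefs[state][i] += actor_lr * td_error * grad
            state, steps = nxt, steps + 1
        steps_per_episode.append(steps)
    return prefs, value, steps_per_episode
```

To match the assignment more closely, the two tables would be replaced by an action network and a critic network trained with the same TD error signal.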

Report suggestions for Part 2:

1. State which option you are going to implement and provide its pseudocode.
2. Design your own grid world example.
3. Show the convergence of the mean squared error and the weight trajectories.
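For option a, the mean squared error and weight trajectories could be logged from a training loop along the lines of the sketch below, which fits a small one-hidden-layer network to a value table by stochastic gradient descent. The target table here is a placeholder (scaled negative Manhattan distance to the goal), not a result from Part 1, and the grid size, network size, learning rate, and function names are likewise assumptions.

```python
import math
import random

ROWS, COLS, GOAL = 4, 4, (3, 3)    # assumed grid and goal, for illustration only

# Placeholder targets: negative Manhattan distance to the goal, scaled to [-1, 0].
# In the assignment these would be replaced by the V (or Q) table from Part 1.
TABLE = {(r, c): -(abs(GOAL[0] - r) + abs(GOAL[1] - c)) / 6.0
         for r in range(ROWS) for c in range(COLS)}


def features(state):
    """Scale (row, col) to [0, 1] as the two network inputs."""
    return [state[0] / (ROWS - 1), state[1] / (COLS - 1)]


def init_net(hidden=8, seed=0):
    rng = random.Random(seed)
    return {"w1": [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(hidden)],
            "b1": [0.0] * hidden,
            "w2": [rng.uniform(-0.5, 0.5) for _ in range(hidden)],
            "b2": 0.0}


def forward(net, x):
    """Return the scalar output and the tanh hidden activations."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(net["w1"], net["b1"])]
    return sum(w * hi for w, hi in zip(net["w2"], h)) + net["b2"], h


def mse(net):
    """Mean squared error of the network over every cell in TABLE."""
    errs = [(forward(net, features(s))[0] - v) ** 2 for s, v in TABLE.items()]
    return sum(errs) / len(errs)


def train(net, epochs=300, lr=0.05, seed=0):
    """SGD on squared error; logs MSE and output-layer weights after each epoch."""
    rng = random.Random(seed)
    states = list(TABLE)
    mse_history = [mse(net)]
    weight_history = [list(net["w2"])]
    for _ in range(epochs):
        rng.shuffle(states)
        for s in states:
            x = features(s)
            y, h = forward(net, x)
            err = y - TABLE[s]               # gradient of 0.5 * err**2 w.r.t. y
            for j in range(len(h)):
                # Backpropagate through tanh using the pre-update output weight.
                grad_h = err * net["w2"][j] * (1.0 - h[j] ** 2)
                net["w2"][j] -= lr * err * h[j]
                net["b1"][j] -= lr * grad_h
                for i in range(len(x)):
                    net["w1"][j][i] -= lr * grad_h * x[i]
            net["b2"] -= lr * err
        mse_history.append(mse(net))
        weight_history.append(list(net["w2"]))
    return mse_history, weight_history
```

With `mse_history, weight_history = train(init_net())`, the first list gives the error curve and each column of the second gives one weight's trajectory over epochs. Only the output-layer weights are logged here to keep the sketch short; the hidden-layer weights could be recorded the same way.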

Mar 06, 2023