CPSC-57200 Artificial Intelligence 2 Homework #4 Introduction For this assignment, you will use the Python programming language to simulate an agent in a grid environment, based on a policy found by...



CPSC-57200 Artificial Intelligence 2
Homework #4

Introduction

For this assignment, you will use the Python programming language to simulate an agent in a grid environment, based on a policy found by the policy iteration algorithm.

Environment Description (based on AIMA textbook - see pg. 563)

The environment consists of a grid, with obstacles placed randomly within it and two terminating states at predefined locations: one positive with a reward of +1, and one negative with a reward of -1. The agent can move up, down, left, or right, but if the movement results in a collision with a wall (obstacle or grid boundary), then no movement occurs. Otherwise, the agent moves in the intended direction with 80% chance. In the remaining 20% of cases, the agent moves at right angles to the intended direction.

Requirements

1) Use the provided base code to implement an environment simulator for this environment, such that the specific geography of the environment is easily altered. In particular, you need to create two functions:

    def getMdpEnv(x_dim, y_dim, pos_terminal, neg_terminal, block_prob):
        """ Generates a GridMDP of given dimension x_dim by y_dim, with random
        obstacles placed with uniform probability block_prob and with two
        terminating states with rewards +1/-1 for pos/neg states, respectively.
        The terminating state locations are given by tuples pos_terminal and
        neg_terminal."""

    def simulate_agent(env, pi, U):
        """ Simulate the agent with the found policy pi on environment env from
        each possible starting state for 1000 iterations. Keep track of rewards.
        Afterwards, display the average reward received from each starting
        state. If agent doesn't reach terminal state after 100 moves, then end."""

2) Run the policy iteration algorithm on random square environments with side lengths of 2, 4, 8, 16, 32, 64, and 128. Run the algorithm 10 times for each size and compute the average execution time. Repeat for different values of block_prob (0, 0.25, and 0.5). Include the results in your report and answer the question: how does the run time for policy iteration vary with the size of the environment?

3) Simulate an agent that uses policy iteration on a random 5x5 grid environment with positive and negative terminating states at (4,2) and (4,3), respectively. Measure its performance in the environment simulator from all possible starting states (run the simulate_agent function). Compare the average total reward received per run with the utility of the state, as determined by your algorithm. Make sure to visualize the grid itself, the utilities determined by policy iteration, and the computed average total rewards received per run (use a seaborn heatmap in Python), as in the example image included with the assignment.

4) Write a report detailing your implementation and execution. Visualize and discuss the results. Attach your code along with the PDF of the report.
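The environment dynamics and the simulation loop described above can be sketched in plain Python. This is only an illustrative, stand-alone sketch: it does not use the provided base code or its GridMDP class, and the names make_grid, step, run_episode, and simulate_all, as well as the step_reward parameter, are assumptions introduced here. It also assumes the 20% sideways probability is split evenly (10% to each side), as in the AIMA grid world, and that an episode's total reward is the sum of the rewards of the states visited.

```python
import random

# Hypothetical stand-alone sketch; the assignment's base code is NOT used here.
ACTIONS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}

def make_grid(x_dim, y_dim, pos_terminal, neg_terminal, block_prob, rng):
    """Return (free, terminals): the set of unblocked cells and a reward map."""
    terminals = {pos_terminal: 1.0, neg_terminal: -1.0}
    free = set()
    for x in range(x_dim):
        for y in range(y_dim):
            # Terminal cells are never blocked; others are blocked with block_prob.
            if (x, y) in terminals or rng.random() >= block_prob:
                free.add((x, y))
    return free, terminals

def step(state, action, free, rng):
    """Sample the next state: 80% intended direction, 10% each perpendicular."""
    dx, dy = ACTIONS[action]
    roll = rng.random()
    if roll >= 0.9:
        dx, dy = -dy, dx   # slip 90 degrees counter-clockwise
    elif roll >= 0.8:
        dx, dy = dy, -dx   # slip 90 degrees clockwise
    target = (state[0] + dx, state[1] + dy)
    # Hitting an obstacle or the grid boundary leaves the agent in place.
    return target if target in free else state

def run_episode(start, policy, free, terminals, rng, step_reward=0.0, max_moves=100):
    """Follow the policy from start; stop at a terminal state or after max_moves."""
    state, total = start, 0.0
    for _ in range(max_moves):
        if state in terminals:
            return total + terminals[state]
        total += step_reward  # assumed per-step reward for non-terminal states
        state = step(state, policy[state], free, rng)
    # Cut off after max_moves; still count a terminal reached on the last move.
    return total + terminals.get(state, 0.0)

def simulate_all(policy, free, terminals, rng, runs=1000, **kwargs):
    """Average total reward per run for every unblocked starting state."""
    return {
        start: sum(run_episode(start, policy, free, terminals, rng, **kwargs)
                   for _ in range(runs)) / runs
        for start in sorted(free)
    }
```

Here policy is assumed to be a dictionary mapping every non-terminal free cell to one of the four action names. The resulting averages can then be compared cell by cell with the utilities computed by policy iteration; if the base code uses a non-zero reward for non-terminal states, step_reward should be set to match it.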
Answered 2 days after Apr 18, 2022


Sandeep Kumar answered on Apr 20 2022
The implementation is provided in the code. As for the results:

The graph above shows the simulation results for three different block probabilities. The one with the highest execution time has the highest block probability, 0.5, while the others are...
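Execution times of the kind compared here could be collected with a small timing harness. The following is a minimal sketch, not the code from the answer: average_runtime is a hypothetical helper, and make_env and solve are placeholders for whatever environment generator and policy iteration routine are actually used.

```python
import time

def average_runtime(make_env, solve, sizes, block_probs, repeats=10):
    """Return {(size, block_prob): mean seconds per call of solve(env)}.

    make_env(size, block_prob) and solve(env) are caller-supplied placeholders.
    """
    results = {}
    for block_prob in block_probs:
        for size in sizes:
            elapsed = 0.0
            for _ in range(repeats):
                # Build a fresh random environment; its construction is not timed.
                env = make_env(size, block_prob)
                start = time.perf_counter()
                solve(env)
                elapsed += time.perf_counter() - start
            results[(size, block_prob)] = elapsed / repeats
    return results
```

With the sizes and block probabilities from the assignment, the call would look like average_runtime(make_env, solve, [2, 4, 8, 16, 32, 64, 128], [0, 0.25, 0.5]), and the resulting dictionary can be plotted as one curve per block probability.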