CPSC-57200 Artificial Intelligence 2 Homework #4 Introduction For this assignment, you...

Question

CPSC-57200 Artificial Intelligence 2
Homework #4

Introduction

For this assignment, you will use the Python programming language to simulate an agent in a grid environment, based on a policy found by the policy iteration algorithm.

Environment Description (based on AIMA textbook - see pg. 563)

The environment consists of a grid, with obstacles placed randomly within it and two terminating states at predefined locations: one positive with a reward of +1, and one negative with a reward of -1. The agent can move up, down, left, or right, but if the movement results in a collision with a wall (obstacle or grid boundary), then no movement occurs. Otherwise, the agent moves in the intended direction with 80% chance. In 20% of cases, the agent moves at right angles to the intended direction.

Requirements

1) Use the provided base code to implement an environment simulator for this environment, such that the specific geography of the environment is easily altered. In particular, you need to create two functions:

    def getMdpEnv(x_dim, y_dim, pos_terminal, neg_terminal, block_prob):
        """ Generates a GridMDP of given dimension x_dim by y_dim, with random
        obstacles placed with uniform probability block_prob and with two terminating
        states with rewards +1/-1 for pos/neg states, respectively. The terminating
        state locations are given by tuples pos_terminal and neg_terminal."""

    def simulate_agent(env, pi, U):
        """ Simulate the agent with the found policy pi on environment env from
        each possible starting state for 1000 iterations. Keep track of rewards.
        Afterwards, display the average reward received from each starting state. If
        agent doesn't reach terminal state after 100 moves, then end."""

2) Run the policy iteration algorithm on random square environments with side lengths of 2, 4, 8, 16, 32, 64, and 128. Run the algorithm 10 times for each size and compute the average execution time. Repeat for different values of block_prob (0, 0.25, and 0.5). Include the results in your report and answer the question: how does the run time for policy iteration vary with the size of the environment?

3) Simulate an agent that uses policy iteration on a random 5x5 grid environment with positive and negative terminating states at (4,2) and (4,3), respectively. Measure its performance in the environment simulator from all possible starting states (run the simulate_agent function). Compare the average total reward received per run with the utility of the state, as determined by your algorithm. Make sure to visualize the grid itself, the utilities determined by policy iteration, and the computed average total rewards received per run (use a seaborn heatmap in Python), like the image below:

[image]

4) Write a report detailing your implementation and execution. Visualize and discuss the results. Attach your code along with the PDF of the report.
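The environment described above can be sketched in a few lines of plain Python. This is only an illustration, not the provided base code (which is not shown here): the names make_grid, step, run_episode, and average_reward are hypothetical, the 20% slip is assumed to split evenly into 10% for each right-angle direction, and the total reward of a run is assumed to be the terminal reward alone, since the assignment text gives no per-step reward.

```python
import random

# Actions as (dx, dy) offsets. These names are illustrative, not from the base code.
UP, DOWN, LEFT, RIGHT = (0, 1), (0, -1), (-1, 0), (1, 0)


def make_grid(x_dim, y_dim, pos_terminal, neg_terminal, block_prob, rng):
    """Return (free, terminals): the set of free cells and the terminal rewards."""
    terminals = {pos_terminal: 1.0, neg_terminal: -1.0}
    free = set()
    for x in range(x_dim):
        for y in range(y_dim):
            # Terminal cells are never blocked; other cells are blocked
            # independently with probability block_prob.
            if (x, y) in terminals or rng.random() >= block_prob:
                free.add((x, y))
    return free, terminals


def step(state, action, free, rng):
    """Sample the next state: 80% intended move, 10% each right angle (assumed split)."""
    dx, dy = action
    r = rng.random()
    if r < 0.8:
        move = (dx, dy)
    elif r < 0.9:
        move = (-dy, dx)   # rotate the action 90 degrees counterclockwise
    else:
        move = (dy, -dx)   # rotate the action 90 degrees clockwise
    nxt = (state[0] + move[0], state[1] + move[1])
    # A collision with an obstacle or the grid boundary leaves the agent in place.
    return nxt if nxt in free else state


def run_episode(start, policy, free, terminals, rng, max_moves=100):
    """Follow the policy from start; return the terminal reward, or 0.0 on timeout."""
    state = start
    for _ in range(max_moves):
        if state in terminals:
            return terminals[state]
        state = step(state, policy[state], free, rng)
    return terminals.get(state, 0.0)


def average_reward(start, policy, free, terminals, rng, runs=1000):
    """Average total reward over many runs from one starting state."""
    total = sum(run_episode(start, policy, free, terminals, rng) for _ in range(runs))
    return total / runs


if __name__ == "__main__":
    rng = random.Random(0)
    # A 3x1 corridor with no obstacles: negative terminal on the left, positive on the right.
    free, terminals = make_grid(3, 1, (2, 0), (0, 0), 0.0, rng)
    print(average_reward((1, 0), {(1, 0): RIGHT}, free, terminals, rng))
```

Passing the random generator in explicitly keeps runs reproducible, which helps when comparing the simulated averages against the utilities from policy iteration.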

Sandeep Kumar · Accepted Answer

The implementation is provided in the code. As for the results:
The above graph shows the simulations of an agent with three different block probabilities. The
one with the highest execution time has the highest block probability, 0.5, while the others are 0.
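A timing experiment of this kind could be set up roughly as follows. This is a minimal sketch and not the code attached to the answer: it uses a self-contained grid instead of the provided base code, places the terminals in two corners as an arbitrary choice, and assumes a discount of 0.9, a step reward of -0.04, and a 10%/10% split of the slip probability, none of which are stated in the assignment text. Policy evaluation is approximated by a fixed number of sweeps rather than solved exactly.

```python
import random
import time

ACTIONS = [(0, 1), (0, -1), (-1, 0), (1, 0)]  # up, down, left, right


def make_grid(n, block_prob, rng):
    """Random n x n grid; the terminal corners are an illustrative choice."""
    terminals = {(n - 1, n - 1): 1.0, (n - 1, 0): -1.0}
    free = {(x, y) for x in range(n) for y in range(n)
            if (x, y) in terminals or rng.random() >= block_prob}
    return free, terminals


def transitions(s, a, free):
    """(probability, next_state) pairs: 0.8 intended, 0.1 per right angle (assumed)."""
    dx, dy = a
    out = []
    for p, (mx, my) in [(0.8, (dx, dy)), (0.1, (-dy, dx)), (0.1, (dy, -dx))]:
        nxt = (s[0] + mx, s[1] + my)
        out.append((p, nxt if nxt in free else s))  # blocked moves stay in place
    return out


def policy_iteration(free, terminals, gamma=0.9, step_reward=-0.04,
                     sweeps=20, max_iters=1000):
    """Policy iteration with approximate evaluation; gamma and step_reward are assumptions."""
    U = {s: 0.0 for s in free}
    pi = {s: ACTIONS[0] for s in free if s not in terminals}

    def q(s, a):
        return sum(p * U[s2] for p, s2 in transitions(s, a, free))

    for _ in range(max_iters):
        # Policy evaluation: a fixed number of in-place Bellman sweeps.
        for _ in range(sweeps):
            for s in free:
                if s in terminals:
                    U[s] = terminals[s]
                else:
                    U[s] = step_reward + gamma * q(s, pi[s])
        # Policy improvement: switch only when an action is strictly better.
        changed = False
        for s in pi:
            best = max(ACTIONS, key=lambda a: q(s, a))
            if q(s, best) > q(s, pi[s]) + 1e-9:
                pi[s] = best
                changed = True
        if not changed:
            break
    return pi, U


def average_runtime(n, block_prob, runs=10, seed=0):
    """Average wall-clock time of policy iteration over several random grids."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        free, terminals = make_grid(n, block_prob, rng)
        t0 = time.perf_counter()
        policy_iteration(free, terminals)
        total += time.perf_counter() - t0
    return total / runs


if __name__ == "__main__":
    for block_prob in (0, 0.25, 0.5):
        for n in (2, 4, 8):  # extend to 16, 32, 64, 128 for the full experiment
            print(block_prob, n, average_runtime(n, block_prob))
```

Only the call to policy_iteration is timed, so grid generation does not distort the comparison across block probabilities.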
