# PS3

## Contents

**Problem Set 3** (70 points)[edit]

EECS492 problem sets are a collaborative activity amongst your staff-assigned team. You may also discuss the problem sets and problem solving approach with other teams, but the work you submit must be wholly that of your team. In particular, you may not derive your solutions from previous years' solutions, nor may you derive your solutions from code snippets found elsewhere (including online). Examples: studying pseudo-code as found on Wikipedia is acceptable; adapting code from an open source project is not.

Your submission will include a written portion (which must be typeset, including any equations, figures, or graphs, and must be in PDF format), and a programming portion. Your writeup should include a terse restatement of each question (copy/paste is fine) in a font face that clearly separates the question from your answer, and should be formatted in an easy-to-navigate fashion (i.e., with large headings for task, and smaller headings for each problem). Make sure that the uniqnames of each member are listed at the top of every page as a running header. Deviations from this format will be penalized.

For the programming portions, you must include all of your code in a ready-to-compile state, along with a README that gives invocation instructions. Your PDF and programming files must then be zipped and submitted by email. Your email must have the following form, where the challenge score is computed based on the scoring rules described in the problem set.

TO: eecs492-staff@april.eecs.umich.edu SUBJECT: PS3: <uniqname1>, <uniqname2>, <uniqname3>. challenge=<challenge score> CC: <all of your team members>

You may receive a "moderation pending" message; this is harmless.

## Task 1. Logical Inferences [10 points][edit]

Given the following knowledge base KB:

S1: ((¬A ⇔ B) ⇔ C) ⇔ D

S2: (A ⇒ B) ∧ (C ∨ D)

S3: (¬A ∨ ¬B) ⇔ D

and the sentence T:

A ∨ C

**1.1** Determine if KB entails the sentence T using model checking. [2 points]

**1.2** EXTRA CREDIT Determine if KB entails the sentence T using the logical equivalences in R&N figure 7.11 and the inference rules in 7.5.1 (De Morgan, modus ponen, etc). [3 points]

**1.3** Convert each sentence in KB to CNF form. [2 points]

**1.4** Using your results from 1.3, determine if KB entails the sentence T with resolution. [3 points]

## Task 2. First Order Logic [20 points][edit]

**2.1.** a) Give an example where reducing FOL to PL is possible, but impractical. [2]

b) Give an example where reducing FOL to PL is not possible. [2]

**2.2.** Let F(x,y) be the predicate that x can fool y. Assuming domain to be all people in the world, express the following using quantifiers [10]

- Adam can fool exactly three people.
- No one can fool both Marissa and Rebecca.
- There is no one who can fool everyone.
- There is exactly one person who can be fooled by everyone.
- There is at most one person who can fool Sharon.

**2.3.** R&N Problem 9.7 a [2]

**2.3.** R&N Problem 9.9 [4]

## Task 3. Planning [15 points] [Moved to PS4][edit]

**3.1.**
Consider the cleaning agent in the vacuum world again.
Given definitions of the following actions:

Go(Room1) Preconditions: ¬At(Room1) Effects: At(Room1)

Go(Room2) Preconditions: ¬At(Room2) Effects: At(Room2)

Clean(Room1) Preconditions: At(Room1) ∧ Dirty(Room1) Effects: ¬Dirty(Room1)

Clean(Room2) Preconditions: At(Room2) ∧ Dirty(Room2) Effects: ¬Dirty(Room2)

Now we have initial state

At(Room1) ∧ Dirty(Room1) ∧ Dirty(Room2).

The goal state is

At(Room2) ∧ ¬Dirty(Room1) ∧ ¬Dirty(Room2).

Draw planning graph up to level 3 (S0, S1, S2) for this problem. For each step, state action you can take, show preconditions, mutex links and effects.

## Task 4. Wumpus World (Challenge problem) [25 points][edit]

In this task, we introduce a variant of the Wumpus world. Your agent enters the world at (0,0) and can move or shoot an arrow in any direction N, S, E, W. Pits cause a breeze in adjacent cells (N, S, E, W); the wumpus causes a stink in adjacent cells, both when alive and dead, and gold causes a glimmer when you're in the same cell. There is exactly one wumpus, and exactly three pits. The world is 4x4.

Your agent's goal is to find the gold, which can be in any cell except one containing a pit. (The gold may be in a cell with the wumpus!) The agent can shoot a wumpus in an adjacent cell, but your agent has only one arrow. Once your agent has the gold, your agent must escape from the dungeon by returning to (0,0) and executing action "LEAVE".

Every turn costs you two points. Leaving the dungeon with the gold is worth 100 points: it is not worth anything if you do not escape the dungeon alive. Killing the wumpus is worth 30 points, regardless of whether you subsequently escape the dungeon. Dying costs 20 points. Naturally, your (rational) agent should try to maximize its score.

For this exercise, you will implement a logical agent that controls this agent using DPLL for logical inference. We expect that you will implement a "hybrid" agent: an agent that combines reflex, problem solving (search), and logical inference (via DPLL) in order to compute good actions. For example, grabbing the gold when your agent perceives a glimmer can be easily accomplished with a reflex action. We require all solutions to be in Java, and to interface with our provided test code. Our code is available here:

You can play the wumpus game (as a human) by running the provided code as below. Note that the "1" indicates how many games you'd like to play. Note that the games will always be the same in order to guarantee fairness across different teams, but you should not specifically tune your solutions to the particular environments.

java -cp wumpus.jar wumpus.WumpusGrader wumpus.HumanAgent 1

Legal moves are "MOVE N|S|E|W", "SHOOT N|S|E|W", "GRAB", "LEAVE", and "SUICIDE".

Get started on this problem set early; it requires a significant amount of work!

**4.1.** [3 points] Write the rules of this wumpus game in propositional logic and convert the resulting PL to CNF. For the purposes of the writeup, you can use variable indices (i.e., Wxy) instead of writing out the rules for every cell individually.

**a** "The wumpus smells bad"

**b** "Pits cause a breeze"

**c** "There is exactly one wumpus".

**4.2.** [2 points] Give an example 4x4 wumpus world in which it is not possible to safely find the gold, even with perfect logical inference.

**4.3.** [2 points] Why is it convenient (from a propositional logic and inference perspective) that the wumpus stinks even when dead? What would happen if the wumpus stopped smelling bad once it was shot?

**4.4.** [4 points] Implement a generic DPLL satisfiability algorithm. You can omit the optimizations discussed in the book (such as unit clauses). Finding a good representation for propositions, models, and clauses is important for both your sanity and efficiency. We recommend representing each proposition as an integer, for example W12 might be mapped to 27. You can convert back and forth from the name of a proposition (W12) to its index (27) by using a HashMap.

As DPLL searches, it assigns values to propositions, essentially trying to construct a model in which all of the clauses in the KB are satisfied. The model can be represented as an array of integers; we recommend using the convention that a value of -1 means "indeterminant value", 0 means "definitely false", and 1 means "definitely true". For example, if model[27]==1, then W12 is definitely true.

Finally, a clause is just an array of integers with one element for every term in the clause. Of course, the clause must encode whether each term is negated or not; we recommend using the SIGN to encode the polarity of a literal. For example, consider the clause "W12 v -P22"; if W12 is mapped to 27 and P22 is mapped to 35, the clause would be { 27, -35}.

The workhorse method of your DPLL object is a satisfiability test on *one or more* clauses. This can be implemented by temporarily adding the query clauses to the KB, testing for satisfiability (using the DPLL search algorithm), then removing those query clauses from the KB. The signature of this method would be:

boolean isSatisfiable(ArrayList<int[]> clauses);

Specifying clauses as arrays of integers is convenient computationally, but error prone for humans. You should implement a method that, using the HashMap described above, converts a string in CNF form to a clause. The signature of this method is:

int[] parseClause(String cnfclause);

When this method encouters a proposition it has never seen before, it should automatically allocate a new id for it and add the mapping to your HashMap. Thus, you should be able to populate your KB using human-readable sentences in CNF form.

You should test your DPLL implementation by creating some simple KBs and querying them... do not try to get your entire Wumpus agent working without first checking your logical inference!

Turn in your DPLL code as well as the test cases you used to test it.

**4.5.** [4 points] Implement a simple path planning algorithm that takes a set of desirable goal cells, along with a map of "safe" travel cells. This planning algorithm should return the first move that your agent should take in order to move towards a goal cell (i.e., N, S, E, or W). The goal cells might be, for example, cells that you have not yet explored. "Safe" cells might be cells that you can prove do not contain a wumpus or a pit.

The best algorithm to implement here is the wavefront motion planner, but you could use A* if that's more comfortable. The difference in speed on these small worlds is inconsequential.

**4.6.** [4 points] Implement a hybrid agent that uses your inference method. Your agent should implement the WumpusAgent interface, which is described in the source code. You can test your WumpusAgent on the first 20 environments by invoking the WumpusGrader as follows:

java -cp:. wumpus.jar wumpus.WumpusGrader mypackage.MyWumpusAgent 20

Your hybrid agent must be able to score an average greater than zero on the first 20 test cases, and MUST survive each round. Note that, in order to guarantee survival, your agent might have to "LEAVE" some games before it finds the gold.

You should turn in both your source code and a description of your agent's strategy. What aspects of your agent are reflexive? Problem solving? Logical?

**4.7.** [6 points] CHALLENGE

The challenge for this problem set is to make your wumpus agent as high-scoring as possible on the first 100 test cases (just run the WumpusGrader with 100 as a command line argument). For the challenge, however, you can take risks: you do not need to survive each round. There are several things you can try to improve your score:

- Is it worth while to go out of your way to kill the wumpus, even if you can retrieve the gold? When? Might it be worth while to take a shot at the wumpus even if you're not sure exactly which cell it's in?
- It is okay to risk death in order to maximize your score. How would you decide when to do so? How would you estimate the probability that a cell contains a pit? Given uncertain information, how would you compute the expected score associated with travelling from one point to another?
- What is the best exploration strategy? How can you efficiently find the gold with a minimum of back tracking? There's a Travelling Salesman Problem lurking within, and the problem changes every time you perceive something!
- Can you take into account the information that you will gain in the future? I.e., might it be useful to go somewhere in order to collect percept data even if it's not otherwise on the "shortest path"? (This is hard! If you want to read ahead about these approaches, see R&N's chapter on Sequential Decision Processes).

Describe your approach.

**Submission Instructions**

Your challenge score will be your average score over the full 100 test cases. This is a large number in order to encourage you to develop reasonably fast methods, and to make it difficult to "tune" your approach to specific cases.

Capture the entire text output of your program on 100 test cases with something like:

java -jar wumpus.jar wumpus.MyAgent 100 | tee myscores.txt

Upload the file myscores.txt to the leaderboard.

For your information, the staff's uninspired solution scores an average of 26, and takes about 2 minutes to run.

The winner of this challenge will not be the first place team on the leaderboard. Instead, the top four teams on the leaderboard will become "finalists". We will run your code on a DIFFERENT set of 100 test cases. The actual winner will be announced in lecture!

## Task 5. Certification and Peer Evaluation [required for credit, 0 points][edit]

Print or write the following on a sheet of paper:

**"I participated and contributed to team discussions on each problem, and I attest to the integrity of each solution. Our team met as a group on [DATE(S)]." **

Each team member should sign the paper (physically, not digitally), and a photo of the paper should be included with your problem set. By doing so, you are asserting that you were materially involved in each answer, and that you certify that every solution complies with the collaboration policy of this course. In the event of a violation of the course's guidelines, your certification will be forwarded to the honor council.

If the statement requires qualification (i.e., it's not quite true), please describe the issue and the steps you took to mitigate it. If a team member did not materially contribute to a particular solution, then their score on that problem will generally be reduced. The staff will use their discretion based on the specific circumstances you identify, and we are happy to discuss potential problems prior to the problem set due date.

If your team works together, as required, this will be painless! :)

Each member of your team must also complete a peer evaluation, accessible from the apps page.

Your evaluations of your teammates will remain private. If problems develop with your team, it is up to your team to resolve them: we will not intercede except in the most extreme situations.