Agents

  • Agent: perceives environment with sensors, acts on it with actuators
  • Percept: what sensors detect
  • Percept sequence: history of all percepts an agent has ever received
  • Agent function: maps percept sequence → action
  • Agent program: concrete implementation of the agent function (see the sketch after this list)
  • Rational agent: chooses action expected to maximize performance measure given percepts + knowledge
  • Performance measure: defines success (objective, e.g., dirt cleaned, safety, cost)
  • Rationality ≠ omniscience or perfection → just maximize expected performance
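
A minimal Python sketch of the agent function / agent program distinction above. The two-square vacuum world, the (location, status) percept format, and the helper names are illustrative assumptions, not from any library:

# Sketch: an agent program wraps an agent function (defined over the whole
# percept sequence) so it can be fed one percept at a time.

def make_agent_program(agent_function):
    percepts = []                          # percept sequence (history)

    def program(percept):
        percepts.append(percept)           # record the new percept
        return agent_function(tuple(percepts))

    return program

# Illustrative agent function for a two-square vacuum world:
def vacuum_agent_function(percept_sequence):
    location, status = percept_sequence[-1]   # only the latest percept matters here
    if status == "Dirty":
        return "Suck"
    return "Right" if location == "A" else "Left"

agent = make_agent_program(vacuum_agent_function)
print(agent(("A", "Dirty")))   # -> Suck
print(agent(("A", "Clean")))   # -> Right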

Agent examples

  • human agent: eyes/ears/etc. for sensors, hands/legs/etc. for actuators
  • robotic agent: cameras, infrared, motors
  • software agent: inputs = keystrokes/files/packets; outputs = screen, files, packets
  • environment: only the relevant part of the universe affecting perceptions/actions

PEAS framework

  • Performance measure, Environment, Actuators, Sensors
  • must define PEAS to design rational agent
  • examples:
    • vacuum agent: perf = clean squares, env = rooms, acts = suck/move, sens = dirt/location
    • internet shopping agent: perf = price/quality/efficiency, env = vendors/shippers/websites, acts = click/fill form/display, sens = html pages
    • automated taxi: perf = safety/speed/legal/profit/comfort, env = roads/traffic/police/weather, acts = steering/brake/accel/horn/etc., sens = cameras/gps/radar/etc.
    • medical diagnosis: perf = healthy patient/minimize cost, env = patient/hospital, acts = display/tests/referrals, sens = touchscreen/voice input
    • interactive tutor: perf = max student test score, env = students/testing agency, acts = display/exercises/feedback, sens = keyboard/voice
    • part-picking robot: perf = % parts binned correctly, env = conveyor + bins, acts = robotic arm, sens = camera/tactile sensors

Environment types

  • fully vs partially observable: all relevant info available or not (taxi = partial, chess = full)

  • deterministic vs stochastic: next state fully determined by current state + action, or probabilistic

  • episodic vs sequential: one-shot decisions vs decisions affect future

  • static vs dynamic: environment stays fixed vs changes during deliberation

  • discrete vs continuous: finite vs infinite states/time/percepts/actions

  • single vs multi-agent: only one agent vs multiple (competitive/cooperative)

  • simplest environment: fully observable, deterministic, episodic, static, discrete, single-agent

  • real world: partially observable, stochastic, sequential, dynamic, continuous, multi-agent

Agent types

four kinds:

  1. simple reflex
  2. model-based reflex
  3. goal-based
  4. utility-based
  • all can be turned into learning agents

Table-lookup agent (bad)

  • huge table → infeasible memory
  • takes too long to build by hand
  • no autonomy
  • cannot learn all entries from experience
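
A toy sketch of why the table blows up: the table must be indexed by the entire percept sequence, so it needs roughly one entry per possible history. The entries below are made up for illustration:

# Toy table-driven agent. The key is the whole percept sequence, so a real
# table would need roughly one entry per possible history (exponential in
# the agent's lifetime).

table = {
    (("A", "Dirty"),): "Suck",
    (("A", "Clean"),): "Right",
    (("A", "Clean"), ("B", "Dirty")): "Suck",
    # ... one entry for every other possible percept sequence
}

percepts = []

def table_driven_agent(percept):
    percepts.append(percept)
    return table.get(tuple(percepts), "NoOp")   # unknown histories fall through

print(table_driven_agent(("A", "Clean")))   # -> Right
print(table_driven_agent(("B", "Dirty")))   # -> Suck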

Simple reflex agent

(in the standard agent-structure diagrams: rectangles = the agent's current internal state in the decision process, ovals = background knowledge used in the process)

  • action only depends on current percept
  • implemented via condition-action rules (e.g., if dirty → suck)
  • works only in fully observable environments

Pseudocode:

function SIMPLE-REFLEX-AGENT(percept) returns an action
    static: rules, a set of condition-action rules
    state <- INTERPRET-INPUT(percept)
    rule <- RULE-MATCH(state, rules)
    action <- RULE-ACTION[rule]
    return action
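
A runnable Python version of the pseudocode, assuming the two-square vacuum world and a percept of the form (location, status); the rule set below is an assumption:

# Simple reflex agent for the two-square vacuum world (assumed domain).
# Each condition-action rule is (predicate on the interpreted state, action).

rules = [
    (lambda s: s["status"] == "Dirty", "Suck"),
    (lambda s: s["location"] == "A", "Right"),
    (lambda s: s["location"] == "B", "Left"),
]

def interpret_input(percept):
    location, status = percept
    return {"location": location, "status": status}

def simple_reflex_agent(percept):
    state = interpret_input(percept)
    for condition, action in rules:       # first matching rule fires
        if condition(state):
            return action
    return "NoOp"

print(simple_reflex_agent(("A", "Dirty")))   # -> Suck
print(simple_reflex_agent(("B", "Clean")))   # -> Left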

Model-based reflex agent

  • used in partially observable environments
  • maintains internal state updated by percept history + world model
  • tracks unobservable parts of world
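
A minimal sketch of the internal-state idea: the agent keeps a model of the square it cannot currently see (two-square vacuum world again; percept format and model are assumptions):

# Model-based reflex agent sketch: the internal model tracks the status of
# the square the agent is not currently observing.

class ModelBasedVacuumAgent:
    def __init__(self):
        self.model = {"A": "Unknown", "B": "Unknown"}   # internal state

    def __call__(self, percept):
        location, status = percept
        self.model[location] = status          # fold the new percept into the model
        if status == "Dirty":
            return "Suck"
        other = "B" if location == "A" else "A"
        if self.model[other] == "Clean":       # model says everything is clean
            return "NoOp"
        return "Right" if location == "A" else "Left"

agent = ModelBasedVacuumAgent()
print(agent(("A", "Clean")))   # -> Right (status of B still unknown)
print(agent(("B", "Clean")))   # -> NoOp  (model now knows both squares are clean)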

Goal-based agent

tracks world state + goals, chooses action that leads to goal achievement

  • needs a goal to know desirable states
  • considers future sequences of actions
  • tied to search/planning research
  • more flexible, explicit knowledge representation

Utility-based agent

uses world model + utility function that measures preferences among states

  • some goals achievable in different ways; some better
  • utility function: maps states → real number
  • handles conflicting goals by trading off likelihood of success against the importance of each goal (see the sketch below)
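
A small sketch of utility-based action selection: compute the expected utility of each action and pick the maximum. The states, utilities, and outcome probabilities are invented for illustration:

# Utility-based choice: pick the action with the highest expected utility.

utility = {"fast_arrival": 10.0, "slow_arrival": 4.0, "accident": -100.0}

outcomes = {   # action -> list of (resulting state, probability)
    "highway":    [("fast_arrival", 0.90), ("accident", 0.10)],
    "side_roads": [("slow_arrival", 0.99), ("accident", 0.01)],
}

def expected_utility(action):
    return sum(p * utility[state] for state, p in outcomes[action])

best = max(outcomes, key=expected_utility)
print(best, expected_utility(best))   # side_roads wins despite the slower arrival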

Learning agent

  • earlier agent types only describe how actions are selected, not where the agent program comes from
  • learning element: improves performance element, critic gives feedback
  • performance element: picks actions (core agent)
  • problem generator: suggests exploratory actions (explore vs exploit)
  • benefit = robustness in unknown environments

Supervised & unsupervised learning

Supervised

  • “teacher says yes or no”
  • training data has expected outcome

Unsupervised

  • only training data, no expected outcome
  • e.g., clustering: the agent finds structure (groups) in the data on its own (toy illustration below)
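
A toy illustration of the difference in the training data only (made-up numbers, no learning algorithm here):

# supervised: every training example carries the expected outcome (label)
supervised_data = [
    ([5.1, 3.5], "class_a"),
    ([6.7, 3.0], "class_b"),
]

# unsupervised: only the examples; the agent has to find structure itself,
# e.g. by clustering nearby points into groups
unsupervised_data = [
    [5.1, 3.5],
    [5.0, 3.4],
    [6.7, 3.0],
]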

Problem-solving agents

  • extend goal-based agents into problem-solving agents
  • must specify:
    • what goal needs to be achieved (task, situation, or properties)
    • how to know when goal is reached (goal test)
    • who specifies the goal (designer or user)

Problem-solving cycle

  1. goal formulation – define desirable states
  2. problem formulation – identify states and actions relevant to goal
  3. search – find sequence of actions that reaches goal
  4. execution – carry out the actions
flowchart TD
    A[Goal formulation] --> B[Problem formulation]
    B --> C[Search for solution]
    C --> D[Execute actions]
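
A compact sketch of the formulate-search-execute cycle; the helper functions are stand-ins for problem-specific code, and the returned plan is hard-coded for illustration:

# Formulate-search-execute cycle (sketch with stub helpers).

def formulate_goal(state):
    return "Bucharest"                        # toy goal

def formulate_problem(state, goal):
    return {"start": state, "goal": goal}

def search(problem):
    # Placeholder result; a real version would explore the state space.
    return ["go(Sibiu)", "go(Fagaras)", "go(Bucharest)"]

def run(initial_state):
    state = initial_state
    goal = formulate_goal(state)               # 1. goal formulation
    problem = formulate_problem(state, goal)   # 2. problem formulation
    plan = search(problem)                     # 3. search
    for action in plan:                        # 4. execution
        print("executing", action)

run("Arad")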

Search problems

  • a search problem defined by:
    • state space (all possible states)
    • successor function (actions and their costs)
    • start state and goal test
  • a solution = sequence of actions (plan) from start to goal

example: romania travel

  • state space: cities
  • successor: roads with distance costs
  • start: arad
  • goal test: is state bucharest?
  • solution: path of cities (e.g., arad → sibiu → fagaras → bucharest)
graph TD
    Arad --> Sibiu
    Arad --> Zerind
    Sibiu --> Fagaras
    Sibiu --> Rimnicu
    Fagaras --> Bucharest
    Rimnicu --> Pitesti
    Pitesti --> Bucharest
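
A self-contained sketch of this example: the roads from the diagram above as an adjacency dict, plus a breadth-first search that returns a path from Arad to Bucharest (it counts roads, not kilometres, since the costs are omitted here):

from collections import deque

# Partial Romania road map (only the cities shown above).
roads = {
    "Arad": ["Sibiu", "Zerind"],
    "Sibiu": ["Fagaras", "Rimnicu"],
    "Fagaras": ["Bucharest"],
    "Rimnicu": ["Pitesti"],
    "Pitesti": ["Bucharest"],
    "Zerind": [],
    "Bucharest": [],
}

def bfs_path(start, goal):
    frontier = deque([[start]])          # frontier holds whole paths
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:             # goal test
            return path
        for city in roads[path[-1]]:     # successor function
            if city not in visited:
                visited.add(city)
                frontier.append(path + [city])
    return None

print(bfs_path("Arad", "Bucharest"))   # ['Arad', 'Sibiu', 'Fagaras', 'Bucharest']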

Single-state problem formulation

  • initial state (e.g., arad)
  • successor function S(x) = set of action–state pairs
    • e.g., S(arad) = {arad→zerind, arad→sibiu, …}
  • goal test: explicit (at bucharest) or implicit (checkmate)
  • path cost: additive; step cost c(x,a,y) ≥ 0
  • solution: sequence of actions from start to goal
  • optimal solution: lowest path cost
  Element              Example
  Initial state        Arad
  Successor function   S(Arad) = {Arad→Zerind, Arad→Sibiu, …}
  Goal test            at Bucharest
  Path cost            distance, number of actions
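
These elements map naturally onto a small data structure; a sketch, with field names chosen here for illustration (not a library API):

# Single-state problem formulation as a small data structure (sketch).

class Problem:
    def __init__(self, initial, successors, goal_test, step_cost):
        self.initial = initial          # e.g. "Arad"
        self.successors = successors    # state -> set of (action, next_state) pairs
        self.goal_test = goal_test      # state -> bool
        self.step_cost = step_cost      # (state, action, next_state) -> cost >= 0

    def path_cost(self, steps):
        # additive path cost over a list of (state, action, next_state) steps
        return sum(self.step_cost(s, a, s2) for s, a, s2 in steps)

# toy instance covering just the Arad -> Sibiu step (140 km in the usual map)
romania = Problem(
    initial="Arad",
    successors=lambda s: {("go(Sibiu)", "Sibiu")} if s == "Arad" else set(),
    goal_test=lambda s: s == "Bucharest",
    step_cost=lambda s, a, s2: 140,
)
print(romania.path_cost([("Arad", "go(Sibiu)", "Sibiu")]))   # 140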

Selecting a state space

  • real world too complex → must use abstraction
  • abstract state = set of real states
  • abstract action = bundle of real actions (e.g., “arad→zerind” covers many routes)
  • abstraction valid if it preserves solution paths
  • abstract solution = set of real paths that are solutions in the real world
  • each abstract action should be easier to carry out than the original problem

Example problems

  • vacuum world
    • states: agent location + dirt status of each square (2 locations → 8 states)
    • actions: {left, right, suck}
    • goal test: no dirt
    • path cost: number of actions
stateDiagram-v2
    [*] --> LeftDirty
    LeftDirty --> LeftClean: Suck
    LeftClean --> RightDirty: Move Right
    RightDirty --> RightClean: Suck
    RightClean --> Done: all clean
  • 8-puzzle
    • states: positions of 8 tiles + blank
    • actions: {up, down, left, right}
    • transition model: action → new state
    • goal test: reach target configuration
    • path cost: number of actions
    • note: only half of all possible initial states lead to goal
graph TD
    Start[Initial state] --> Move1[Move blank left]
    Start --> Move2[Move blank up]
    Move1 --> Goal[Target configuration]
    Move2 --> Other[further moves]
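
A sketch of the 8-puzzle successor function, representing a state as a tuple of 9 entries with 0 for the blank (the representation is an assumption):

# 8-puzzle successor function (state = tuple of 9 tiles, 0 is the blank).

MOVES = {"up": -3, "down": 3, "left": -1, "right": 1}

def successors(state):
    blank = state.index(0)
    row, col = divmod(blank, 3)
    legal = {
        "up": row > 0, "down": row < 2,
        "left": col > 0, "right": col < 2,
    }
    result = []
    for action, delta in MOVES.items():
        if not legal[action]:             # would move the blank off the board
            continue
        tiles = list(state)
        target = blank + delta
        tiles[blank], tiles[target] = tiles[target], tiles[blank]
        result.append((action, tuple(tiles)))
    return result

start = (1, 2, 3, 4, 0, 5, 6, 7, 8)
for action, nxt in successors(start):
    print(action, nxt)
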
  • 8-queens
    • states: 0–8 queens on board
    • actions: add queen to safe square
    • goal: 8 queens, none attacked
    • path cost: often irrelevant
graph TD
    EmptyBoard[0 queens] --> Q1[Add queen column 1]
    Q1 --> Q2["Add queen column 2 (non-attacking)"]
    Q2 --> Q3[Add queen column 3 ...]
    Q3 --> Goal[8 queens placed safely]
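
A sketch of this incremental formulation: a state is the tuple of row positions of the queens placed so far (one per column), and successors only add queens on non-attacked squares. The small depth-first solver at the end is just to show the formulation works:

# Incremental 8-queens formulation (state = rows of queens placed so far).

N = 8

def safe(state, row):
    # a queen in the next column at `row` must not share a row or diagonal
    col = len(state)
    for c, r in enumerate(state):
        if r == row or abs(r - row) == abs(c - col):
            return False
    return True

def successors(state):
    # add one queen to the leftmost empty column, on non-attacked squares only
    return [state + (row,) for row in range(N) if safe(state, row)]

def goal_test(state):
    return len(state) == N

def solve(state=()):
    # plain depth-first search over the incremental state space
    if goal_test(state):
        return state
    for nxt in successors(state):
        result = solve(nxt)
        if result is not None:
            return result
    return None

print(solve())   # a non-attacking placement, e.g. (0, 4, 7, 5, 2, 6, 1, 3)
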
  • chess

    • states: all board positions
    • actions: legal moves
    • initial: standard chess start
    • goal: checkmate opponent
  • robot assembly

    • states: joint angles + object parts
    • actions: continuous motions
    • goal test: complete assembly
    • path cost: execution time
flowchart TD
    Init[Initial arm position] --> Move[Joint motion]
    Move --> Partial[Partial assembly]
    Partial --> Complete[Goal: full assembly]

Real-world example problems

  • route finding (networks, travel planning, military ops)
  • touring problem
  • traveling salesman problem
  • vlsi design (chip layout and routing)
  • robot navigation
  • automatic assembly sequencing

Basic search algorithms (intro)

  • how to solve the above problems? → search state space
  • generate search tree:
    • root = initial state
    • children = successors
  • in practice, search generates a graph (states may be reached by multiple paths)
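
A generic sketch of that idea: a graph-search skeleton in which the order the frontier is popped determines the concrete algorithm (FIFO gives breadth-first search; the toy graph in the usage line is made up):

# Generic graph-search skeleton (sketch). The frontier's pop order decides
# the algorithm: FIFO -> breadth-first, LIFO -> depth-first, priority -> others.

def graph_search(initial, successors, goal_test):
    frontier = [(initial, [])]            # (state, actions taken so far)
    explored = set()                      # states already expanded
    while frontier:
        state, actions = frontier.pop(0)  # FIFO: breadth-first behaviour
        if goal_test(state):
            return actions
        if state in explored:             # same state reachable by several paths
            continue
        explored.add(state)
        for action, next_state in successors(state):
            if next_state not in explored:
                frontier.append((next_state, actions + [action]))
    return None

# toy usage: successors(state) returns (action, next_state) pairs
graph = {"S": [("a", "A"), ("b", "B")], "A": [("g", "G")], "B": [], "G": []}
print(graph_search("S", lambda s: graph[s], lambda s: s == "G"))   # ['a', 'g']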