Agents

  • Agent: perceives environment with sensors, acts on it with actuators
  • Percept: what sensors detect
  • Percept sequence: history of all percepts an agent has ever received
  • Agent function: maps percept sequence → action
  • Agent program: concrete implementation of the agent function (see the sketch after this list)
  • Rational agent: chooses action expected to maximize performance measure given percepts + knowledge
  • Performance measure: defines success (objective, e.g., dirt cleaned, safety, cost)
  • Rationality ≠ omniscience or perfection → just maximize expected performance
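
A minimal Python sketch of the agent function / agent program distinction above. The two-square vacuum world, the (location, status) percept format, and the helper names are illustrative assumptions, not from any library:

# Sketch: an agent program wraps an agent function (defined over the whole
# percept sequence) so it can be fed one percept at a time.

def make_agent_program(agent_function):
    percepts = []                          # percept sequence (history)

    def program(percept):
        percepts.append(percept)           # record the new percept
        return agent_function(tuple(percepts))

    return program

# Illustrative agent function for a two-square vacuum world:
def vacuum_agent_function(percept_sequence):
    location, status = percept_sequence[-1]   # only the latest percept matters here
    if status == "Dirty":
        return "Suck"
    return "Right" if location == "A" else "Left"

agent = make_agent_program(vacuum_agent_function)
print(agent(("A", "Dirty")))   # -> Suck
print(agent(("A", "Clean")))   # -> Right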

Agent examples

  • human agent: eyes/ears/etc. for sensors, hands/legs/etc. for actuators
  • robotic agent: cameras, infrared, motors
  • software agent: inputs = keystrokes/files/packets; outputs = screen, files, packets
  • environment: only the relevant part of the universe affecting perceptions/actions

PEAS framework

  • Performance measure, Environment, Actuators, Sensors
  • must define PEAS to design rational agent
  • examples:
    • vacuum agent: perf = clean squares, env = rooms, acts = suck/move, sens = dirt/location
    • internet shopping agent: perf = price/quality/efficiency, env = vendors/shippers/websites, acts = click/fill form/display, sens = html pages
    • automated taxi: perf = safety/speed/legal/profit/comfort, env = roads/traffic/police/weather, acts = steering/brake/accel/horn/etc., sens = cameras/gps/radar/etc.
    • medical diagnosis: perf = healthy patient/minimize cost, env = patient/hospital, acts = display/tests/referrals, sens = touchscreen/voice input
    • interactive tutor: perf = max student test score, env = students/testing agency, acts = display/exercises/feedback, sens = keyboard/voice
    • part-picking robot: perf = % parts binned correctly, env = conveyor + bins, acts = robotic arm, sens = camera/tactile sensors

Environment types

  • fully vs partially observable: all relevant info available or not (taxi = partial, chess = full)

  • deterministic vs stochastic: next state fully determined by current state + action, or probabilistic

  • episodic vs sequential: one-shot decisions vs decisions affect future

  • static vs dynamic: environment stays fixed vs changes during deliberation

  • discrete vs continuous: finite vs infinite states/time/percepts/actions

  • single vs multi-agent: only one agent vs multiple (competitive/cooperative)

  • simplest environment: fully observable, deterministic, episodic, static, discrete, single-agent

  • real world: partially observable, stochastic, sequential, dynamic, continuous, multi-agent

Agent types

four kinds:

  1. simple reflex
  2. model-based reflex
  3. goal-based
  4. utility-based
  • all can be turned into learning agents

Table-lookup agent (bad)

  • huge table → infeasible memory
  • takes too long to build by hand
  • no autonomy
  • cannot learn all entries from experience
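
A toy sketch of why the table blows up: the table must be indexed by the entire percept sequence, so it needs roughly one entry per possible history. The entries below are made up for illustration:

# Toy table-driven agent. The key is the whole percept sequence, so a real
# table would need roughly one entry per possible history (exponential in
# the agent's lifetime).

table = {
    (("A", "Dirty"),): "Suck",
    (("A", "Clean"),): "Right",
    (("A", "Clean"), ("B", "Dirty")): "Suck",
    # ... one entry for every other possible percept sequence
}

percepts = []

def table_driven_agent(percept):
    percepts.append(percept)
    return table.get(tuple(percepts), "NoOp")   # unknown histories fall through

print(table_driven_agent(("A", "Clean")))   # -> Right
print(table_driven_agent(("B", "Dirty")))   # -> Suck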

Simple reflex agent

(in the standard agent-structure diagrams: rectangles = the agent's current internal state in the decision process, ovals = background knowledge used in the process)

  • action only depends on current percept
  • implemented via condition-action rules (e.g., if dirty → suck)
  • works only in fully observable environments

Pseudocode:

function SIMPLE-REFLEX-AGENT(percept) returns an action
    static: rules, a set of condition-action rules
    state <- INTERPRET-INPUT(percept)
    rule <- RULE-MATCH(state, rules)
    action <- RULE-ACTION[rule]
    return action
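
A runnable Python version of the pseudocode, assuming the two-square vacuum world and a percept of the form (location, status); the rule set below is an assumption:

# Simple reflex agent for the two-square vacuum world (assumed domain).
# Each condition-action rule is (predicate on the interpreted state, action).

rules = [
    (lambda s: s["status"] == "Dirty", "Suck"),
    (lambda s: s["location"] == "A", "Right"),
    (lambda s: s["location"] == "B", "Left"),
]

def interpret_input(percept):
    location, status = percept
    return {"location": location, "status": status}

def simple_reflex_agent(percept):
    state = interpret_input(percept)
    for condition, action in rules:       # first matching rule fires
        if condition(state):
            return action
    return "NoOp"

print(simple_reflex_agent(("A", "Dirty")))   # -> Suck
print(simple_reflex_agent(("B", "Clean")))   # -> Left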

Model-based reflex agent

  • used in partially observable environments
  • maintains internal state updated by percept history + world model
  • tracks unobservable parts of world
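
A minimal sketch of the internal-state idea: the agent keeps a model of the square it cannot currently see (two-square vacuum world again; percept format and model are assumptions):

# Model-based reflex agent sketch: the internal model tracks the status of
# the square the agent is not currently observing.

class ModelBasedVacuumAgent:
    def __init__(self):
        self.model = {"A": "Unknown", "B": "Unknown"}   # internal state

    def __call__(self, percept):
        location, status = percept
        self.model[location] = status          # fold the new percept into the model
        if status == "Dirty":
            return "Suck"
        other = "B" if location == "A" else "A"
        if self.model[other] == "Clean":       # model says everything is clean
            return "NoOp"
        return "Right" if location == "A" else "Left"

agent = ModelBasedVacuumAgent()
print(agent(("A", "Clean")))   # -> Right (status of B still unknown)
print(agent(("B", "Clean")))   # -> NoOp  (model now knows both squares are clean)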

Goal-based agent

tracks world state + goals, chooses action that leads to goal achievement

  • needs a goal to know desirable states
  • considers future sequences of actions
  • tied to search/planning research
  • more flexible, explicit knowledge representation

Utility-based agent

uses world model + utility function that measures preferences among states

  • some goals achievable in different ways; some better
  • utility function: maps states → real number
  • handles conflicting goals by trading off likelihood of success against the importance of each goal (see the sketch below)
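
A small sketch of utility-based action selection: compute the expected utility of each action and pick the maximum. The states, utilities, and outcome probabilities are invented for illustration:

# Utility-based choice: pick the action with the highest expected utility.

utility = {"fast_arrival": 10.0, "slow_arrival": 4.0, "accident": -100.0}

outcomes = {   # action -> list of (resulting state, probability)
    "highway":    [("fast_arrival", 0.90), ("accident", 0.10)],
    "side_roads": [("slow_arrival", 0.99), ("accident", 0.01)],
}

def expected_utility(action):
    return sum(p * utility[state] for state, p in outcomes[action])

best = max(outcomes, key=expected_utility)
print(best, expected_utility(best))   # side_roads wins despite the slower arrival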

Learning agent

  • earlier agent types only describe how actions are selected, not where the agent program comes from
  • learning element: improves performance element, critic gives feedback
  • performance element: picks actions (core agent)
  • problem generator: suggests exploratory actions (explore vs exploit)
  • benefit = robustness in unknown environments

Supervised & unsupervised learning

Supervised

  • “teacher says yes or no”
  • training data has expected outcome

Unsupervised

  • only training data, no expected outcome
  • e.g., clustering: the agent finds structure (groups) in the data on its own (toy illustration below)
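
A toy illustration of the difference in the training data only (made-up numbers, no learning algorithm here):

# supervised: every training example carries the expected outcome (label)
supervised_data = [
    ([5.1, 3.5], "class_a"),
    ([6.7, 3.0], "class_b"),
]

# unsupervised: only the examples; the agent has to find structure itself,
# e.g. by clustering nearby points into groups
unsupervised_data = [
    [5.1, 3.5],
    [5.0, 3.4],
    [6.7, 3.0],
]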

Problem-solving agents

  • extend goal-based agents into problem-solving agents
  • must specify:
    • what goal needs to be achieved (task, situation, or properties)
    • how to know when goal is reached (goal test)
    • who specifies the goal (designer or user)

Problem-solving cycle

  1. goal formulation – define desirable states
  2. problem formulation – identify states and actions relevant to goal
  3. search – find sequence of actions that reaches goal
  4. execution – carry out the actions
flowchart TD
    A[Goal formulation] --> B[Problem formulation]
    B --> C[Search for solution]
    C --> D[Execute actions]
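
A compact sketch of the formulate-search-execute cycle; the helper functions are stand-ins for problem-specific code, and the returned plan is hard-coded for illustration:

# Formulate-search-execute cycle (sketch with stub helpers).

def formulate_goal(state):
    return "Bucharest"                        # toy goal

def formulate_problem(state, goal):
    return {"start": state, "goal": goal}

def search(problem):
    # Placeholder result; a real version would explore the state space.
    return ["go(Sibiu)", "go(Fagaras)", "go(Bucharest)"]

def run(initial_state):
    state = initial_state
    goal = formulate_goal(state)               # 1. goal formulation
    problem = formulate_problem(state, goal)   # 2. problem formulation
    plan = search(problem)                     # 3. search
    for action in plan:                        # 4. execution
        print("executing", action)

run("Arad")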

Search problems

  • a search problem defined by:
    • state space (all possible states)
    • successor function (actions and their costs)
    • start state and goal test
  • a solution = sequence of actions (plan) from start to goal

example: romania travel

  • state space: cities
  • successor: roads with distance costs
  • start: arad
  • goal test: is state bucharest?
  • solution: path of cities (e.g., arad → sibiu → fagaras → bucharest)
graph TD
    Arad --> Sibiu
    Arad --> Zerind
    Sibiu --> Fagaras
    Sibiu --> Rimnicu
    Fagaras --> Bucharest
    Rimnicu --> Pitesti
    Pitesti --> Bucharest
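
A self-contained sketch of this example: the roads from the diagram above as an adjacency dict, plus a breadth-first search that returns a path from Arad to Bucharest (it counts roads, not kilometres, since the costs are omitted here):

from collections import deque

# Partial Romania road map (only the cities shown above).
roads = {
    "Arad": ["Sibiu", "Zerind"],
    "Sibiu": ["Fagaras", "Rimnicu"],
    "Fagaras": ["Bucharest"],
    "Rimnicu": ["Pitesti"],
    "Pitesti": ["Bucharest"],
    "Zerind": [],
    "Bucharest": [],
}

def bfs_path(start, goal):
    frontier = deque([[start]])          # frontier holds whole paths
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:             # goal test
            return path
        for city in roads[path[-1]]:     # successor function
            if city not in visited:
                visited.add(city)
                frontier.append(path + [city])
    return None

print(bfs_path("Arad", "Bucharest"))   # ['Arad', 'Sibiu', 'Fagaras', 'Bucharest']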

Single-state problem formulation

  • initial state (e.g., arad)
  • successor function S(x) = set of action–state pairs
    • e.g., S(arad) = {arad→zerind, arad→sibiu, …}
  • goal test: explicit (at bucharest) or implicit (checkmate)
  • path cost: additive; step cost c(x,a,y) ≥ 0
  • solution: sequence of actions from start to goal
  • optimal solution: lowest path cost
  Element              Example
  Initial state        Arad
  Successor function   S(Arad) = {Arad→Zerind, Arad→Sibiu, …}
  Goal test            at Bucharest
  Path cost            distance, number of actions
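
These elements map naturally onto a small data structure; a sketch, with field names chosen here for illustration (not a library API):

# Single-state problem formulation as a small data structure (sketch).

class Problem:
    def __init__(self, initial, successors, goal_test, step_cost):
        self.initial = initial          # e.g. "Arad"
        self.successors = successors    # state -> set of (action, next_state) pairs
        self.goal_test = goal_test      # state -> bool
        self.step_cost = step_cost      # (state, action, next_state) -> cost >= 0

    def path_cost(self, steps):
        # additive path cost over a list of (state, action, next_state) steps
        return sum(self.step_cost(s, a, s2) for s, a, s2 in steps)

# toy instance covering just the Arad -> Sibiu step (140 km in the usual map)
romania = Problem(
    initial="Arad",
    successors=lambda s: {("go(Sibiu)", "Sibiu")} if s == "Arad" else set(),
    goal_test=lambda s: s == "Bucharest",
    step_cost=lambda s, a, s2: 140,
)
print(romania.path_cost([("Arad", "go(Sibiu)", "Sibiu")]))   # 140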

Selecting a state space

  • real world too complex → must use abstraction
  • abstract state = set of real states
  • abstract action = bundle of real actions (e.g., “arad→zerind” covers many routes)
  • abstraction valid if it preserves solution paths
  • abstract solution = set of real paths that are solutions in the real world
  • each abstract action should be easier to carry out than the original problem

Example problems

  • vacuum world
    • states: agent location + dirt status of each square (2 locations → 8 states)
    • actions: {left, right, suck}
    • goal test: no dirt
    • path cost: number of actions
stateDiagram-v2
    [*] --> LeftDirty
    LeftDirty --> LeftClean: Suck
    LeftClean --> RightDirty: Move Right
    RightDirty --> RightClean: Suck
    RightClean --> Done: all clean
  • 8-puzzle
    • states: positions of 8 tiles + blank
    • actions: {up, down, left, right}
    • transition model: action → new state
    • goal test: reach target configuration
    • path cost: number of actions
    • note: only half of all possible initial states lead to goal
graph TD
    Start[Initial state] --> Move1[Move blank left]
    Start --> Move2[Move blank up]
    Move1 --> Goal[Target configuration]
    Move2 --> Other[further moves]
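
A sketch of the 8-puzzle successor function, representing a state as a tuple of 9 entries with 0 for the blank (the representation is an assumption):

# 8-puzzle successor function (state = tuple of 9 tiles, 0 is the blank).

MOVES = {"up": -3, "down": 3, "left": -1, "right": 1}

def successors(state):
    blank = state.index(0)
    row, col = divmod(blank, 3)
    legal = {
        "up": row > 0, "down": row < 2,
        "left": col > 0, "right": col < 2,
    }
    result = []
    for action, delta in MOVES.items():
        if not legal[action]:             # would move the blank off the board
            continue
        tiles = list(state)
        target = blank + delta
        tiles[blank], tiles[target] = tiles[target], tiles[blank]
        result.append((action, tuple(tiles)))
    return result

start = (1, 2, 3, 4, 0, 5, 6, 7, 8)
for action, nxt in successors(start):
    print(action, nxt)
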
  • 8-queens
    • states: 0–8 queens on board
    • actions: add queen to safe square
    • goal: 8 queens, none attacked
    • path cost: often irrelevant
graph TD
    EmptyBoard[0 queens] --> Q1[Add queen column 1]
    Q1 --> Q2["Add queen column 2 (non-attacking)"]
    Q2 --> Q3[Add queen column 3 ...]
    Q3 --> Goal[8 queens placed safely]
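
A sketch of this incremental formulation: a state is the tuple of row positions of the queens placed so far (one per column), and successors only add queens on non-attacked squares. The small depth-first solver at the end is just to show the formulation works:

# Incremental 8-queens formulation (state = rows of queens placed so far).

N = 8

def safe(state, row):
    # a queen in the next column at `row` must not share a row or diagonal
    col = len(state)
    for c, r in enumerate(state):
        if r == row or abs(r - row) == abs(c - col):
            return False
    return True

def successors(state):
    # add one queen to the leftmost empty column, on non-attacked squares only
    return [state + (row,) for row in range(N) if safe(state, row)]

def goal_test(state):
    return len(state) == N

def solve(state=()):
    # plain depth-first search over the incremental state space
    if goal_test(state):
        return state
    for nxt in successors(state):
        result = solve(nxt)
        if result is not None:
            return result
    return None

print(solve())   # a non-attacking placement, e.g. (0, 4, 7, 5, 2, 6, 1, 3)
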
  • chess

    • states: all board positions
    • actions: legal moves
    • initial: standard chess start
    • goal: checkmate opponent
  • robot assembly

    • states: joint angles + object parts
    • actions: continuous motions
    • goal test: complete assembly
    • path cost: execution time
flowchart TD
    Init[Initial arm position] --> Move[Joint motion]
    Move --> Partial[Partial assembly]
    Partial --> Complete[Goal: full assembly]

Real-world example problems

  • route finding (networks, travel planning, military ops)
  • touring problem
  • traveling salesman problem
  • vlsi design (chip layout and routing)
  • robot navigation
  • automatic assembly sequencing

Basic search algorithms (intro)

  • how to solve the above problems? → search state space
  • generate search tree:
    • root = initial state
    • children = successors
  • in practice, search generates a graph (states may be reached by multiple paths)
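
A generic sketch of that idea: a graph-search skeleton in which the order the frontier is popped determines the concrete algorithm (FIFO gives breadth-first search; the toy graph in the usage line is made up):

# Generic graph-search skeleton (sketch). The frontier's pop order decides
# the algorithm: FIFO -> breadth-first, LIFO -> depth-first, priority -> others.

def graph_search(initial, successors, goal_test):
    frontier = [(initial, [])]            # (state, actions taken so far)
    explored = set()                      # states already expanded
    while frontier:
        state, actions = frontier.pop(0)  # FIFO: breadth-first behaviour
        if goal_test(state):
            return actions
        if state in explored:             # same state reachable by several paths
            continue
        explored.add(state)
        for action, next_state in successors(state):
            if next_state not in explored:
                frontier.append((next_state, actions + [action]))
    return None

# toy usage: successors(state) returns (action, next_state) pairs
graph = {"S": [("a", "A"), ("b", "B")], "A": [("g", "G")], "B": [], "G": []}
print(graph_search("S", lambda s: graph[s], lambda s: s == "G"))   # ['a', 'g']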