Agents

Agent definitions

  • Agent: something that takes in information from its surroundings (sensors) and does something in response (actuators)
  • Agent function: rule or process that tells an agent what actions to take
  • Agent program: the actual implementation of the agent function
  • Percept: agent sensor input (text, images, sounds, etc.)
  • Percept sequence: the complete history of everything the agent has perceived (like a chat history of its inputs)
  • Rational: reasonably correct (does not need to be perfect), depends on 4 things:
    1. performance measure
    2. agent’s prior knowledge of the environment
    3. actions the agent can perform
    4. percept history to date
  • Rational agent: an agent that selects the action that maximizes performance measure (“does the right thing”)
  • Performance measure: a way to judge how well an agent is doing its job (e.g. chess game win rate, self driving car reaching destination safely and quickly, recommendation system how often users click/like suggestions)
  • PEAS framework:
    • Performance measure
    • Environment: the surroundings the agent operates in
    • Actuators: the parts that let the agent take action (wheels, arms, display, …)
    • Sensors: the parts that let the agent perceive or gather information (camera, microphone, GPS, …)
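As a worked example, a PEAS description for a self-driving taxi might look like the following sketch (the entries are illustrative, not from the notes):

```python
# Hypothetical PEAS description for a self-driving taxi (illustrative values)
peas_taxi = {
    "Performance": ["arrive safely", "minimize travel time", "passenger comfort"],
    "Environment": ["roads", "traffic", "pedestrians", "weather"],
    "Actuators":   ["steering", "accelerator", "brake", "signals", "display"],
    "Sensors":     ["cameras", "lidar", "GPS", "speedometer"],
}
```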

Environment types:

  • Fully observable vs partially observable
    • Fully observable: agent can see everything it needs to make a decision (e.g. chess)
    • Partially observable: agent only sees part of the environment (e.g. driving in fog)
  • Deterministic vs stochastic
    • Deterministic: the next state is completely determined by the current state and action (no randomness)
    • Stochastic: there is some randomness in what happens next
  • Episodic vs sequential
    • Episodic: each action is independent; what you do now doesn’t affect the next (e.g. image recognition)
    • Sequential: actions build on each other, past decisions matter (e.g. chess, driving)
  • Static vs dynamic
    • Static: environment doesn’t change while the agent is thinking (e.g. crossword puzzle, sudoku)
    • Dynamic: environment changes over time, even if the agent is not acting (e.g. real world, driving)
  • Discrete vs continuous
    • Discrete: a finite number of possible states and actions (e.g. chess, other board games)
    • Continuous: states and actions can take any value (e.g. driving speed, robot arm angles)
  • Single vs multi agent
    • Single-agent: only one agent is acting (e.g. solving a maze)
    • Multi-agent: multiple agents interact, possibly competing or cooperating (e.g. soccer, trading bots)

Agent types

  • Simple reflex agent: acts only based on current situation, uses conditional rules

    • only works in fully observable environments
  • Model-based reflex agent: keeps some internal memory (model) of the world, uses current input and past information to decide what to do

    • used in partially observable environments
  • Goal-based agent: has specific goals it tries to achieve, can plan ahead and compare different possible outcomes

  • Problem-solving agent: a type of goal-based agent that uses search and planning to find a way to reach its goal

    • Problem-solving cycle:
      1. goal formulation: define desirable states
      2. problem formulation: identify states and actions relevant to the goal
      3. search: find sequences of actions that reach the goal
      4. execution: carry out the actions
  • Utility-based agent: tries to maximize performance, uses utility function to measure how good each action is

  • Learning agent: improves over time by learning from experience
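A simple reflex agent is just a set of condition-action rules; below is a minimal sketch for the classic two-square vacuum world (the rule set and names are illustrative):

```python
# Minimal sketch of a simple reflex agent for the two-square vacuum world.
# Percept is a (location, status) pair; the agent acts only on the current
# percept, with no memory of past states.
def simple_reflex_vacuum_agent(percept):
    location, status = percept
    if status == "Dirty":   # condition-action rule: dirty square -> clean it
        return "Suck"
    if location == "A":     # clean -> move to the other square
        return "Right"
    return "Left"
```

For example, the percept `("A", "Dirty")` triggers the first rule and returns `"Suck"`.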

Search algorithms

Uninformed (blind) search algorithms

  • Uninformed (blind) search algorithm: a search algorithm with no extra information about the location of the goal, only knows:
    • starting state
    • possible actions from each state
    • goal test (how to check if a state is the goal)
  • Examples:
| Algorithm | Description | Complete? | Optimal? |
| --- | --- | --- | --- |
| Breadth-first search (BFS) | expands neighbouring nodes first, searches level by level | ✅ (if branching factor is finite) | ✅ (if all step costs = 1) |
| Depth-first search (DFS) | expands deepest nodes first, goes down one path fully before backtracking | ❌ (can go infinitely deep) | ❌ |
| Iterative deepening search (IDS) | repeatedly does DFS with increasing depth limits | ✅ (if branching factor is finite) | ✅ (if step cost = 1) |
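BFS can be sketched as follows; the adjacency-dict graph is a made-up example, not part of the notes:

```python
from collections import deque

# Breadth-first search: expand the shallowest path first using a FIFO queue.
def bfs(graph, start, goal):
    """Return a start->goal path (list of nodes), or None if unreachable."""
    frontier = deque([[start]])   # queue of paths; shallowest path leaves first
    explored = {start}
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == goal:          # goal test
            return path
        for neighbour in graph.get(node, []):
            if neighbour not in explored:
                explored.add(neighbour)
                frontier.append(path + [neighbour])
    return None
```

Swapping the `deque`'s `popleft()` for `pop()` (a LIFO stack) would turn this into DFS.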

Informed search algorithms

  • Informed search algorithms: a search algorithm that uses extra knowledge (heuristic) about the problem to guide its search more efficiently towards the goal
  • Heuristic: a function h(n) that evaluates a problem state, typically by estimating the cost from n to the goal
  • Admissible heuristic: a heuristic h(n) that never overestimates the true cost to reach the goal from any node
  • Examples:
| Algorithm | Description | Uses heuristic? | Complete? | Optimal? |
| --- | --- | --- | --- | --- |
| Greedy best-first search (GBFS) | expands the node that appears closest to the goal, based on the heuristic value h(n) | ✅ | ❌ (can get stuck in loops) | ❌ |
| A* | expands the node with lowest total estimated cost f(n) = g(n) + h(n) | ✅ | ✅ (if h is admissible) | ✅ (if h is consistent) |
| Iterative deepening A* (IDA*) | repeatedly runs a depth-limited DFS with increasing cost limits based on A*’s f(n) until it finds the optimal solution | ✅ | ✅ (if h is admissible) | ✅ (if h is consistent) |
| Hill climbing search | tries to continuously move towards better (higher-valued) states according to the heuristic | ✅ | ❌ (can get stuck at local maxima or plateaus) | ❌ |
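A* can be sketched like this; the graph (mapping node → list of `(neighbour, step_cost)`) and the heuristic `h` are hypothetical inputs for the example:

```python
import heapq

# A* search: always expand the frontier node with lowest f(n) = g(n) + h(n),
# where g(n) is the path cost so far and h(n) the heuristic estimate to goal.
def a_star(graph, start, goal, h):
    frontier = [(h(start), 0, start, [start])]   # (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        for neighbour, cost in graph.get(node, []):
            g2 = g + cost
            if g2 < best_g.get(neighbour, float("inf")):   # found a cheaper route
                best_g[neighbour] = g2
                heapq.heappush(
                    frontier,
                    (g2 + h(neighbour), g2, neighbour, path + [neighbour]),
                )
    return None, float("inf")
```

With h(n) = 0 everywhere (trivially admissible) this degrades to uniform-cost search, which is why admissibility is enough for optimal solutions.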

Genetic Algorithms

Darwinian Evolution Principles

  • Survival of the fittest: resources are limited, only the most adaptable individuals reproduce, selection ensures only efficient competitors survive
  • Diversity drives change:
    • behavioural/physical differences are partly inherited, partly random
    • traits that increase reproduction chances and are heritable spread over generations
    • leads to new combinations and evolutions of species
  • Population dynamics:
    • population = unit of evolution
    • individual = unit of selection
    • random variations + selection = constant diversity source
    • no guiding force: evolution is emergent, not directed

Natural & biological evolution

  • Alleles: possible values of a gene
  • Chromosomes: long DNA molecules containing many genes
  • Genome: complete genetic material
  • Genotype: gene set of an individual
  • Phenotype: observable characteristics from genotype
  • Key conditions for evolution:
    • Reproduction with inheritance: individuals make copies of themselves
      • copies should resemble their parents (not be duplicates)
    • Variation: ensure that copies are not identical to parents
      • mutations, crossover produces individuals with different traits
    • Selection: need a method to ensure that some individuals make copies of themselves more than others
      • fittest individuals (favourable traits) have more offspring than unfit individuals, and population therefore has those traits
      • over time, changes will cause new species that can specialize for particular environments

Genetic Algorithms

| Natural Concept | Genetic Algorithm Equivalent |
| --- | --- |
| Chromosome | solution representation |
| Gene | feature or variable |
| Allele | feature value |
| Locus | gene position |
| Genotype | encoded structure (solution encoding) |
| Phenotype | decoded solution (actual parameters) |

Main steps

  1. Initialization: generate initial population
  2. Evaluation: compute fitness for each solution
  3. Selection: choose individuals for reproduction
  4. Crossover: mix parents to create children
  5. Mutation: introduce random variations
  6. Replacement: update population for next generation
  7. Termination: stop when condition met (e.g. max generations, target fitness hit)
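The seven steps above can be sketched as a tiny GA that maximizes the number of 1-bits in a bit string (the OneMax toy problem; all parameter values here are illustrative):

```python
import random

def genetic_algorithm(n_bits=20, pop_size=30, generations=100,
                      p_mut=0.05, seed=0):
    rng = random.Random(seed)
    fitness = lambda ind: sum(ind)                        # 2. evaluation
    pop = [[rng.randint(0, 1) for _ in range(n_bits)]     # 1. initialization
           for _ in range(pop_size)]
    for _ in range(generations):
        def tournament():                                 # 3. selection
            return max(rng.sample(pop, 3), key=fitness)
        children = []
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n_bits)                # 4. 1-point crossover
            child = p1[:cut] + p2[cut:]
            child = [b ^ 1 if rng.random() < p_mut else b # 5. mutation
                     for b in child]
            children.append(child)
        pop = children                                    # 6. replacement (SGA style)
        if fitness(max(pop, key=fitness)) == n_bits:      # 7. termination
            break
    return max(pop, key=fitness)
```

The full-population replacement in step 6 makes this a simple GA (SGA); a steady-state GA would instead replace only a few individuals per iteration.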

Key variants

  • Simple GA (SGA): full population replaced each generation
  • Steady-state GA: only a few individuals replaced each iteration

Selection methods

  • Tournament selection: pick k random individuals and keep the fittest for reproduction; repeat until the desired number of parents is chosen
  • Roulette wheel selection: probabilistic selection where each individual is chosen with probability proportional to its fitness
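Both selection methods as minimal sketches (the fitness function is whatever the problem defines):

```python
import random

def tournament_select(population, fitness, k=3, rng=random):
    """Pick k random individuals and return the fittest of them."""
    return max(rng.sample(population, k), key=fitness)

def roulette_select(population, fitness, rng=random):
    """Return one individual, chosen with probability proportional to fitness."""
    weights = [fitness(ind) for ind in population]
    return rng.choices(population, weights=weights, k=1)[0]
```

Note that roulette selection as written assumes non-negative fitness values; negative fitnesses would need shifting first.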

Premature Convergence

  • occurs when the population comes to consist of very similar individuals, which reduces the likelihood of finding new solutions

Genetic Operators

  • Crossover: a method of combining two candidates from the population to create new candidates
  • Mutation: introduces randomness so offspring can evolve in new directions (without it, certain traits may become fixed in the population)
  • Replication: copy an individual unchanged into the next generation

Crossover Operations

  • 1-point crossover:
Step 1: Choose random crossover point
P1:  1  2  3  |  4  5  6  7
P2:  7  6  5  |  4  3  2  1
              ↑ crossover point

Step 2: Swap segments after crossover point
C1:  1  2  3  |  4  3  2  1
C2:  7  6  5  |  4  5  6  7

✅ Final Offspring  
C1: 1  2  3  4  3  2  1  
C2: 7  6  5  4  5  6  7
  • 2-points crossover:
Step 1: Choose two random crossover points
P1:  1  2 | 3  4  5 | 6  7
P2:  7  6 | 5  4  3 | 2  1
          ↑         ↑
         crossover points

Step 2: Swap middle segments between parents
C1:  1  2 | 5  4  3 | 6  7
C2:  7  6 | 3  4  5 | 2  1

✅ Final Offspring  
C1: 1  2  5  4  3  6  7  
C2: 7  6  3  4  5  2  1
  • Uniform crossover (UX): each gene has a 50% chance of being inherited from either parent
Step 1: Generate random mask
P1:   1  2  3  4  5  6  7
Mask: 1  0  1  0  0  1  0
P2:   7  6  5  4  3  2  1

Step 2: Copy by mask (1 → from P1, 0 → from P2)
C1:   1  6  3  4  3  6  1
C2:   7  2  5  4  5  2  7

✅ Final Offspring  
C1: 1 6 3 4 3 6 1  
C2: 7 2 5 4 5 2 7
  • Order crossover (OX):
    1. copying a randomly selected set from the first parent
    2. filling the remaining positions with the order of elements from the second
Step 1: Copy randomly selected set from first parent  
p1: 1 2 3 4 5 6 7 8 9  
p2: 9 3 7 8 2 6 5 1 4  

c1: * * * 4 5 6 7 * *  
c2: * * * 8 2 6 5 * *  

Step 2: Copy rest from second parent in order  
Remaining order from p2 (for C1): 1, 9, 3, 8, 2  
Remaining order from p1 (for C2): 9, 1, 3, 4, 7  

C1: 3 8 2 4 5 6 7 1 9  
C2: 3 4 7 8 2 6 5 9 1
Another example:
Step 1: Copy randomly selected set from first parent  
p1: 1 2 3 4 5 6 7 8 9  
p2: 4 5 2 1 8 7 6 9 3  

c1: * * * 4 5 6 7 * *  

Step 2: Copy rest from second parent in order  
Remaining order from p2: 9, 3, 2, 1, 8  

C1: 2 1 8 4 5 6 7 9 3  
  • Uniform order crossover (UOX):
Step 1: Setup  
Parents (P1, P2) and Mask  
P1:   6  2  1  4  5  7  3  
Mask: 0  1  1  0  1  0  1  → generate new mask for every 2 parents
P2:   4  3  7  2  1  6  5  

Step 2: Copy Genes by Mask  
- Copy genes from P1 → C1 where mask = 1  
- Copy genes from P2 → C2 where mask = 1  

C1:   -  2  1  -  5  -  3  
C2:   -  3  7  -  1  -  5  

Step 3: Fill Remaining from Opposite Parent  
- Fill blanks (-) with remaining genes from the other parent in order  

C1:   4  2  1  7  5  6  3  
C2:   6  3  7  2  1  4  5  

✅ Final Offspring  
C1:  4  2  1  7  5  6  3  
C2:  6  3  7  2  1  4  5  
  • Partially mapped crossover (PMX): preserves relative order and position (used for permutation problems)
Step 1: Select two crossover points
P1:  1  2 | 3  4  5 | 6  7  8
P2:  4  5 | 6  7  8 | 1  2  3

Step 2: Copy middle section from P1 → C1
C1:  *  * | 3  4  5 | *  *  *

Step 3: Map remaining genes from P2 using the mapping of the swapped section
Mapping: 3↔6, 4↔7, 5↔8

Fill each remaining position with P2’s gene; if that gene is already in
the copied segment, follow the mapping to avoid duplicates:
pos1: 4 → in segment → 7
pos2: 5 → in segment → 8
pos6: 1, pos7: 2 → no conflict
pos8: 3 → in segment → 6

✅ Final Offspring  
C1:  7  8 | 3  4  5 | 1  2  6
  • Cycle crossover (CX): every gene keeps its index; within each cycle, all genes come from the same parent
Step 1: Find cycles based on position mapping
P1:  1  2  3  4  5  6  7
P2:  3  7  5  1  6  2  4

Mapping:  
1 → 3 (pos3) → 5 (pos5) → 6 (pos6) → 2 (pos2) → 7 (pos7) → 4 (pos4) → 1  
Cycle: (1,3,5,6,2,7,4)

Step 2: Copy first cycle from P1 into C1, remaining cycles from P2
(here the single cycle covers every position, so the offspring equal the parents)
C1:  1  2  3  4  5  6  7
C2:  3  7  5  1  6  2  4

✅ Final Offspring  
C1: 1  2  3  4  5  6  7  
C2: 3  7  5  1  6  2  4
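Two of the operators above as code sketches; `order_crossover` follows the fill-after-the-second-cut convention used in the OX example, and the crossover points are passed in as parameters here rather than chosen at random:

```python
def one_point_crossover(p1, p2, point):
    """Swap the tails of two parents after `point`."""
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def order_crossover(p1, p2, start, end):
    """OX: copy p1[start:end] into the child, then fill the remaining
    positions with p2's unused genes, reading p2 from just after `end`."""
    n = len(p1)
    child = [None] * n
    child[start:end] = p1[start:end]
    used = set(p1[start:end])
    p2_from_cut = [p2[(end + i) % n] for i in range(n)]       # p2, rotated
    fill = iter(g for g in p2_from_cut if g not in used)      # skip copied genes
    for i in range(end, end + n):                             # fill after the cut
        if child[i % n] is None:
            child[i % n] = next(fill)
    return child
```

Note that 1-point crossover can duplicate genes, so OX (or PMX/CX) is the right choice for permutation problems like tours.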

Mutation

  • Inversion: reversed order of selected segment of genes
9 3 7 8 2 6 5 1 4 → 9 3 6 2 8 7 5 1 4
  • Insertion: select one gene and insert it into a random position
9 3 7 8 2 6 5 1 4 → 9 7 8 2 3 6 5 1 4
  • Displacement: select a subtour (continuous segment) and reinsert it somewhere else
9 3 7 8 2 6 5 1 4 → 9 8 2 6 5 3 7 1 4
  • Reciprocal exchange (swap): select two genes and swap their positions
9 3 7 8 2 6 5 1 4 → 5 3 7 8 2 6 9 1 4
  • Scramble mutation: select a subset of genes and randomly shuffle their order
9 3 7 8 2 6 5 1 4 → 9 3 6 8 7 2 5 1 4
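Three of these mutation operators as sketches; in a real GA the positions i, j would be chosen at random, but they are parameters here for clarity:

```python
import random

def inversion(tour, i, j):
    """Reverse the order of the segment tour[i:j]."""
    return tour[:i] + tour[i:j][::-1] + tour[j:]

def swap(tour, i, j):
    """Reciprocal exchange: swap the genes at positions i and j."""
    t = tour[:]
    t[i], t[j] = t[j], t[i]
    return t

def scramble(tour, i, j, rng=random):
    """Randomly shuffle the genes inside tour[i:j]."""
    segment = tour[i:j]
    rng.shuffle(segment)
    return tour[:i] + segment + tour[j:]
```

All three return a permutation of the input, which is what makes them safe for tour representations.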