Agents
Agent definitions
- Agent: something that takes in information from its surroundings (sensors) and does something in response (actuators)
- Agent function: rule or process that tells an agent what actions to take
- Agent program: the actual implementation of the agent function
- Percept: agent sensor input (text, images, sounds, etc.)
- Percept sequence: the complete history of everything the agent has perceived (e.g. a chat conversation history)
- Rational: reasonably correct (does not need to be perfect), depends on 4 things:
- performance measure
- agent’s prior knowledge of the environment
- actions the agent can perform
- percept history to date
- Rational agent: an agent that selects the action that maximizes performance measure (“does the right thing”)
- Performance measure: a way to judge how well an agent is doing its job (e.g. chess game → win rate, self driving car → reaching destination safely and quickly, recommendation system → how often users click/like suggestions)
- PEAS framework:
- Performance measure
- Environment: the surroundings the agent operates in
- Actuators: the parts that let the agent take action (wheels, arms, display, …)
- Sensors: the parts that let the agent perceive or gather information (camera, microphone, GPS, …)
Environment types:
- Fully observable vs partially observable
- Fully observable: agent can see everything it needs to make a decision (e.g. chess)
- Partially observable: agent only sees part of the environment (e.g. driving in fog)
- Deterministic vs stochastic
- Deterministic: the next state is completely determined by the current state and action (no randomness)
- Stochastic: there is some randomness in what happens next
- Episodic vs sequential
- Episodic: each action is independent → what you do now doesn’t affect the next (e.g. image recognition)
- Sequential: actions build on each other, past decisions matter (e.g. chess, driving)
- Static vs dynamic
- Static: environment doesn’t change while the agent is thinking (e.g. crossword puzzle, sudoku)
- Dynamic: environment changes over time, even if the agent is not acting (e.g. real world, driving)
- Discrete vs continuous
- Discrete: limited number of possible states (e.g. chess, other board games)
- Continuous: states and actions can take any value (e.g. driving speed, robot arm angles)
- Single vs multi agent
- Single-agent: only one agent is acting (e.g. solving a maze)
- Multi-agent: multiple agents interact, possibly competing or cooperating (e.g. soccer, trading bots)
Agent types
- Simple reflex agent: acts only based on the current situation, using condition-action rules
- only works in fully observable environments
- Model-based reflex agent: keeps some internal memory (model) of the world, uses current input and past information to decide what to do
- used in partially observable environments
- Goal-based agent: has specific goals it tries to achieve, can plan ahead and compare different possible outcomes
- Problem-solving agent: a type of goal-based agent that uses search and planning to find a way to reach its goal
- Problem-solving cycle:
- goal formulation → define desirable states
- problem formulation → identify states and actions relevant to goal
- search → find sequences of actions that reach goal
- execution → carry out actions
- Utility-based agent: tries to maximize performance, uses a utility function to measure how good each action is
- Learning agent: improves over time by learning from experience
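The simplest of the agent types above can be sketched in a few lines. This is a minimal illustration using the classic two-square vacuum world (the locations "A"/"B" and the percept format are assumptions for the example, not from the notes): the agent maps its current percept directly to an action via condition-action rules, with no memory or model of the world.

```python
def simple_reflex_vacuum_agent(percept):
    """A simple reflex agent: condition-action rules on the current percept.

    percept is a (location, status) pair, e.g. ("A", "Dirty").
    """
    location, status = percept
    if status == "Dirty":   # rule 1: clean the current square
        return "Suck"
    if location == "A":     # rule 2: move to the other square
        return "Right"
    return "Left"

print(simple_reflex_vacuum_agent(("A", "Dirty")))  # Suck
print(simple_reflex_vacuum_agent(("A", "Clean")))  # Right
print(simple_reflex_vacuum_agent(("B", "Clean")))  # Left
```

Because the agent only ever looks at the current percept, it fails as soon as the right action depends on something it cannot currently sense — which is why such agents only work in fully observable environments.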
Search algorithms
Uninformed (blind) search algorithms
- Uninformed (blind) search algorithm: a search algorithm with no extra information about the location of the goal, only knows:
- starting state
- possible actions from each state
- goal test (how to check if a state is the goal)
- Examples:
Algorithm | Description | Complete? | Optimal? |
---|---|---|---|
Breadth-first search (BFS) | expands the shallowest nodes first, searching level by level | ✅ (if branching factor is finite) | ✅ (if all step costs = 1) |
Depth-first search (DFS) | expands deepest nodes first, goes down one path fully before backtracking | ❌ (can go infinitely deep) | ❌ |
Iterative deepening search (IDS) | repeatedly does DFS with increasing depth limits | ✅ | ✅ (if step cost = 1) |
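A minimal sketch of BFS over an explicit graph (the example graph, states, and goal below are made-up illustrations). The frontier is a FIFO queue of partial paths, so nodes are expanded level by level, and with unit step costs the first path that reaches the goal is a shortest one.

```python
from collections import deque

def bfs(graph, start, goal):
    """Breadth-first search: return a shortest path from start to goal, or None."""
    frontier = deque([[start]])   # FIFO queue of partial paths
    explored = {start}
    while frontier:
        path = frontier.popleft()
        state = path[-1]
        if state == goal:         # goal test
            return path
        for nxt in graph.get(state, []):
            if nxt not in explored:
                explored.add(nxt)
                frontier.append(path + [nxt])
    return None

graph = {"S": ["A", "B"], "A": ["C"], "B": ["C", "G"], "C": ["G"]}
print(bfs(graph, "S", "G"))   # ['S', 'B', 'G']
```

Note that this matches the definition of uninformed search above: the algorithm only uses the start state, the successor function (`graph`), and the goal test.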
Informed search algorithms
- Informed search algorithms: a search algorithm that uses extra knowledge (heuristic) about the problem to guide its search more efficiently towards the goal
- Heuristic: a function h(n) that evaluates a problem state, typically by estimating the cost from node n to the goal
- Admissible heuristic: a heuristic h(n) that never overestimates the true cost to reach the goal from any node
- Examples:
Algorithm | Description | Uses Heuristic? | Complete? | Optimal? |
---|---|---|---|---|
Greedy best-first search (GBFS) | expands the node that appears closest to the goal, based on the heuristic value h(n) | ✅ | ❌ (can get stuck in loops) | ❌ |
A* | expands the node with the lowest total estimated cost f(n) = g(n) + h(n) | ✅ | ✅ (if h(n) is admissible) | ✅ (if h(n) is consistent) |
Iterative deepening A* (IDA*) | repeatedly runs a depth-limited DFS with increasing cost limits based on A*‘s f(n) until it finds the optimal solution | ✅ | ✅ (if h(n) is admissible) | ✅ (if h(n) is consistent) |
Hill climbing search | tries to continuously move towards better (higher-valued) states according to the heuristic | ✅ | ❌ (can get stuck at local maxima or plateaus) | ❌ |
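A minimal A* sketch on a weighted graph (the graph, costs, and heuristic values below are made-up illustrations; the heuristic is chosen to be admissible). The frontier is a priority queue ordered by f(n) = g(n) + h(n); with an admissible heuristic, the first goal node popped is on an optimal path.

```python
import heapq

def a_star(graph, h, start, goal):
    """A* search. graph: {state: [(neighbour, step_cost), ...]}; h: {state: estimate}."""
    frontier = [(h[start], 0, start, [start])]   # (f, g, state, path)
    best_g = {start: 0}                          # cheapest known g per state
    while frontier:
        f, g, state, path = heapq.heappop(frontier)
        if state == goal:
            return path, g
        for nxt, cost in graph.get(state, []):
            g2 = g + cost
            if g2 < best_g.get(nxt, float("inf")):   # found a cheaper route
                best_g[nxt] = g2
                heapq.heappush(frontier, (g2 + h[nxt], g2, nxt, path + [nxt]))
    return None, float("inf")

graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 6)], "B": [("G", 1)]}
h = {"S": 4, "A": 3, "B": 1, "G": 0}
print(a_star(graph, h, "S", "G"))   # (['S', 'A', 'B', 'G'], 4)
```

Here GBFS would correspond to ordering the queue by h(n) alone, and uniform-cost search to ordering by g(n) alone.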
Genetic Algorithms
Darwinian Evolution Principles
- Survival of the fittest: resources are limited, only the most adaptable individuals reproduce, selection ensures only efficient competitors survive
- Diversity drives change:
- behavioural/physical differences are partly inherited, partly random
- heritable traits that increase reproduction chances spread over generations
- leads to new combinations and evolutions of species
- Population dynamics:
- population = unit of evolution
- individual = unit of selection
- random variations + selection = constant diversity source
- no guiding force → evolution is emergent, not directed
Natural & biological evolution
- Alleles: possible values of a gene
- Chromosomes: long DNA molecules containing many genes
- Genome: complete genetic material
- Genotype: gene set of an individual
- Phenotype: observable characteristics from genotype
- Key conditions for evolution:
- Reproduction with inheritance: individuals make copies of themselves
- copies should resemble their parents (but not be exact duplicates)
- Variation: ensure that copies are not identical to parents
- mutations, crossover produces individuals with different traits
- Selection: need a method to ensure that some individuals make copies of themselves more than others
- fittest individuals (favourable traits) have more offspring than unfit individuals, and population therefore has those traits
- over time, changes will cause new species that can specialize for particular environments
Genetic Algorithms
Natural Concept | Genetic Algorithm Equivalent |
---|---|
Chromosome | solution representation |
Gene | feature or variable |
Allele | feature value |
Locus | gene position |
Genotype | encoded structure (solution encoding) |
Phenotype | decoded solution (actual parameters) |
Main steps
- Initialization: generate initial population
- Evaluation: compute fitness for each solution
- Selection: choose individuals for reproduction
- Crossover: mix parents to create children
- Mutation: introduce random variations
- Replacement: update population for next generation
- Termination: stop when condition met (e.g. max generations, target fitness hit)
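The main steps above can be sketched as one loop. This is a minimal, simple-GA-style illustration on the toy "OneMax" problem (maximize the number of 1s in a bit string); the problem, population size, and rates are made-up choices for the example.

```python
import random

def run_ga(n_bits=20, pop_size=30, generations=60,
           crossover_rate=0.9, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    fitness = lambda ind: sum(ind)                       # Evaluation: count 1s

    # Initialization: random bit-string population
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]

    for _ in range(generations):
        next_pop = []
        while len(next_pop) < pop_size:
            # Selection: tournament of size 2
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            # Crossover: 1-point
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, n_bits)
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Mutation: per-bit flip
            child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
            next_pop.append(child)
        pop = next_pop                                   # Replacement (full, SGA-style)
        if max(fitness(ind) for ind in pop) == n_bits:   # Termination: target hit
            break
    return max(pop, key=fitness)

best = run_ga()
print(sum(best))   # close to the optimum of 20 for this seed
```

Replacing the whole population each generation corresponds to the simple GA variant below; keeping most of the population and replacing only a few individuals would make it steady-state.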
Key variants
- Simple GA (SGA): full population replaced each generation
- Steady-state GA: only a few individuals replaced each iteration
Selection methods
- Tournament selection: pick a few individuals at random and keep the best of them for reproduction; repeat until the desired number of parents is chosen
- Roulette wheel selection: probabilistic selection in which each individual’s chance of being chosen is proportional to its fitness
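Both selection methods can be sketched as small functions. The individuals and toy fitness function below are made-up illustrations.

```python
import random

def tournament_select(pop, fitness, k=3, rng=random):
    """Tournament selection: sample k random individuals, return the fittest."""
    return max(rng.sample(pop, k), key=fitness)

def roulette_select(pop, fitness, rng=random):
    """Roulette wheel selection: probability of selection proportional to fitness."""
    total = sum(fitness(ind) for ind in pop)
    r = rng.uniform(0, total)          # spin the wheel
    running = 0.0
    for ind in pop:
        running += fitness(ind)        # each slice is proportional to fitness
        if running >= r:
            return ind
    return pop[-1]                     # numerical safety net

pop = ["aaaa", "aabb", "abbb", "bbbb"]
fitness = lambda s: s.count("b")       # toy fitness: number of 'b's
print(tournament_select(pop, fitness, k=4))   # 'bbbb' — with k = len(pop) the best always wins
```

Tournament selection only needs fitness *comparisons*, which makes it robust to fitness scaling; roulette selection can over-favour a single super-fit individual and push the population toward premature convergence.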
Premature Convergence
- when the population consists of very similar individuals, the likelihood of discovering new solutions is reduced
Genetic Operators
- Crossover: a method of combining two candidates from the population to create new candidates
- Mutation: lets offspring evolve in new directions → introduces a certain amount of randomness (certain traits may become fixed)
- Replication: copy an individual
Crossover Operations
- 1-point crossover:
Step 1: Choose random crossover point
P1: 1 2 3 | 4 5 6 7
P2: 7 6 5 | 4 3 2 1
↑ crossover point
Step 2: Swap segments after crossover point
C1: 1 2 3 | 4 3 2 1
C2: 7 6 5 | 4 5 6 7
✅ Final Offspring
C1: 1 2 3 4 3 2 1
C2: 7 6 5 4 5 6 7
- 2-points crossover:
Step 1: Choose two random crossover points
P1: 1 2 | 3 4 5 | 6 7
P2: 7 6 | 5 4 3 | 2 1
↑ ↑
crossover points
Step 2: Swap middle segments between parents
C1: 1 2 | 5 4 3 | 6 7
C2: 7 6 | 3 4 5 | 2 1
✅ Final Offspring
C1: 1 2 5 4 3 6 7
C2: 7 6 3 4 5 2 1
- Uniform crossover (UX): each gene has a 50% chance of being inherited from either parent
Step 1: Generate random mask
P1: 1 2 3 4 5 6 7
Mask: 1 0 1 0 0 1 0
P2: 7 6 5 4 3 2 1
Step 2: Copy by mask (1 → from P1, 0 → from P2)
C1: 1 6 3 4 3 6 1
C2: 7 2 5 4 5 2 7
✅ Final Offspring
C1: 1 6 3 4 3 6 1
C2: 7 2 5 4 5 2 7
- Order crossover (OX):
- copying a randomly selected set from the first parent
- filling the remaining positions with the order of elements from the second
Step 1: Copy randomly selected set from first parent
p1: 1 2 3 4 5 6 7 8 9
p2: 9 3 7 8 2 6 5 1 4
c1: * * * 4 5 6 7 * *
c2: * * * 8 2 6 5 * *
Step 2: Copy rest from second parent in order
Remaining order from p2 (for C1, starting after the second cut point): 1, 9, 3, 8, 2
C1: 3 8 2 4 5 6 7 1 9
Remaining order from p1 (for C2): 9, 1, 3, 4, 7
C2: 3 4 7 8 2 6 5 9 1
Step 1: Copy randomly selected set from first parent
p1: 1 2 3 4 5 6 7 8 9
p2: 4 5 2 1 8 7 6 9 3
c1: * * * 4 5 6 7 * *
Step 2: Copy rest from second parent in order
Remaining order from p2: 9, 3, 2, 1, 8
C1: 2 1 8 4 5 6 7 9 3
- Uniform order crossover (UOX):
Step 1: Setup
Parents (P1, P2) and Mask
P1: 6 2 1 4 5 7 3
Mask: 0 1 1 0 1 0 1 → generate new mask for every 2 parents
P2: 4 3 7 2 1 6 5
Step 2: Copy Genes by Mask
- Copy genes from P1 → C1 where mask = 1
- Copy genes from P2 → C2 where mask = 1
C1: - 2 1 - 5 - 3
C2: - 3 7 - 1 - 5
Step 3: Fill Remaining from Opposite Parent
- Fill blanks (-) with remaining genes from the other parent in order
C1: 4 2 1 7 5 6 3
C2: 6 3 7 2 1 4 5
✅ Final Offspring
C1: 4 2 1 7 5 6 3
C2: 6 3 7 2 1 4 5
- Partially mapped crossover (PMX): preserves relative order and position (used for permutation problems)
Step 1: Select two crossover points
P1: 1 2 | 3 4 5 | 6 7 8
P2: 4 5 | 6 7 8 | 1 2 3
Step 2: Copy middle section from P1 → C1
C1: * * | 3 4 5 | * * *
Step 3: Map remaining genes from P2 using the mapping of swapped section
Mapping: 3↔6, 4↔7, 5↔8
Fill remaining positions from P2, applying the mapping whenever the gene is already in C1:
pos1: 4 → 7, pos2: 5 → 8, pos6: 1, pos7: 2, pos8: 3 → 6
✅ Final Offspring
C1: 7 8 | 3 4 5 | 1 2 6
- Cycle crossover (CX): every gene keeps the index it had in one of the parents; positions are grouped into cycles, and each cycle is copied whole from one parent
Step 1: Find cycles based on position mapping
P1: 1 2 3 4 5 6 7
P2: 3 7 5 1 6 2 4
Mapping:
1 → 3 (pos3) → 5 (pos5) → 6 (pos6) → 2 (pos2) → 7 (pos7) → 4 (pos4) → 1
Cycle: (1, 3, 5, 6, 2, 7, 4) → here a single cycle covers every position, so the offspring equal the parents
Step 2: Copy first cycle from P1, rest from P2
C1: 1 2 3 4 5 6 7
C2: 3 7 5 1 6 2 4
✅ Final Offspring
C1: 1 2 3 4 5 6 7
C2: 3 7 5 1 6 2 4
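Two of the crossover operators above as small functions: 1-point crossover, and order crossover (OX), which keeps a slice from the first parent and fills the remaining positions with the missing genes in the order they appear in the second parent, starting after the second cut point, so offspring stay valid permutations. The cut points are passed in explicitly here (a real GA would pick them at random).

```python
def one_point_crossover(p1, p2, point):
    """1-point crossover: swap the tails of the two parents after `point`."""
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def order_crossover(p1, p2, a, b):
    """OX: copy p1[a:b] into the child, fill the rest from p2's order (wrapping)."""
    n = len(p1)
    child = [None] * n
    child[a:b] = p1[a:b]
    kept = set(p1[a:b])
    # remaining genes of p2, read starting just after the second cut point
    filler = [g for i in range(n) if (g := p2[(b + i) % n]) not in kept]
    for i in range(n - (b - a)):       # fill positions after the slice, wrapping
        child[(b + i) % n] = filler[i]
    return child

c1, c2 = one_point_crossover([1, 2, 3, 4, 5, 6, 7], [7, 6, 5, 4, 3, 2, 1], 3)
print(c1, c2)   # [1, 2, 3, 4, 3, 2, 1] [7, 6, 5, 4, 5, 6, 7]

# Reproduces the first OX worked example (slice = positions 4–7, 0-indexed 3:7):
p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p2 = [9, 3, 7, 8, 2, 6, 5, 1, 4]
print(order_crossover(p1, p2, 3, 7))   # [3, 8, 2, 4, 5, 6, 7, 1, 9]
```

Note that plain 1-point crossover can produce duplicates (as in its example above), which is why permutation problems need order-preserving operators like OX, PMX, or CX.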
Mutation
- Inversion: reversed order of selected segment of genes
9 3 7 8 2 6 5 1 4 → 9 3 6 2 8 7 5 1 4
- Insertion: select one gene and insert it into a random position
9 3 7 8 2 6 5 1 4 → 9 7 8 2 3 6 5 1 4
- Displacement: select a subtour (continuous segment) and reinsert it somewhere else
9 3 7 8 2 6 5 1 4 → 9 8 2 6 5 3 7 1 4
- Reciprocal exchange (swap): select two genes and swap their positions
9 3 7 8 2 6 5 1 4 → 5 3 7 8 2 6 9 1 4
- Scramble mutation: select a subset of genes and randomly shuffle their order
9 3 7 8 2 6 5 1 4 → 9 3 6 8 7 2 5 1 4
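Three of the mutation operators above, written as pure functions on lists so the examples from the notes can be reproduced; the positions are passed in explicitly (a real GA would pick them at random).

```python
def inversion(ind, a, b):
    """Inversion: reverse the segment ind[a:b]."""
    return ind[:a] + ind[a:b][::-1] + ind[b:]

def insertion(ind, src, dst):
    """Insertion: remove the gene at src and insert it at dst."""
    out = ind[:]
    gene = out.pop(src)
    out.insert(dst, gene)
    return out

def swap(ind, i, j):
    """Reciprocal exchange: swap the genes at positions i and j."""
    out = ind[:]
    out[i], out[j] = out[j], out[i]
    return out

tour = [9, 3, 7, 8, 2, 6, 5, 1, 4]
print(inversion(tour, 2, 6))   # [9, 3, 6, 2, 8, 7, 5, 1, 4]
print(insertion(tour, 1, 4))   # [9, 7, 8, 2, 3, 6, 5, 1, 4]
print(swap(tour, 0, 6))        # [5, 3, 7, 8, 2, 6, 9, 1, 4]
```

All of these only rearrange existing genes, so, unlike bit-flip mutation, they keep a permutation (e.g. a tour) valid.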