Cheat Sheet — Decision Support Systems

Decision Theory

One agent, one decision, against an indifferent world.

Decision theory

The normative study of how a single rational agent should choose among actions whose outcomes carry different value (utility) — it prescribes what an ideal agent ought to do, not what people actually do.

normative | actions · states · outcomes | utility | the baseline

Say: It prescribes how a rational agent should decide, not how people actually decide.

“Normative” = it prescribes what an ideal rational agent ought to do — not what real people do (that is Lecture 5).
Three ingredients: a set of actions, the possible states/outcomes, and a utility over those outcomes.
Two regimes: decision under risk (probabilities known → maximise expected utility) vs decision under ignorance (no probabilities → apply a rule).
It is the yardstick the rest of the course measures everything against.

🎯 In the exam: Always decide first which regime you are in, risk (probabilities known) or ignorance (probabilities unknown), because that determines the tool.

Utility

A real-valued function U(o) measuring how much an agent prefers outcome o; acting rationally means choosing so as to maximise it (a concave U over money = risk-averse).

U : outcomes → ℝ | interval scale | concave = risk-averse

Say: It turns preferences into numbers you can compare and average.

Only order and relative spacing matter: utility is an interval scale, so any positive affine transform (a·U+b, a>0) represents the same preferences.
Utility need not be money: a concave utility over money encodes risk aversion (a sure €50 is preferred to a 50/50 gamble between €100 and nothing).
For a utility to exist at all, preferences must be complete and transitive (part of the VNM axioms).
The recurring thread: utility becomes the reward in L4, the DSS’s objective in L6, and net benefit in L7.

🎯 In the exam: Note that utility is on an interval scale and that concavity = risk aversion.

Expected utility

The probability-weighted average of an action’s outcome utilities; under risk (probabilities known) a rational agent picks the action with the highest expected utility — justified by the von Neumann–Morgenstern axioms.

EU(a)=Σ P(o|a)·U(o) | VNM theorem | decision under risk

Say: When probabilities are known, maximise expected utility (von Neumann–Morgenstern).

VNM theorem: if your preferences over lotteries obey four axioms (completeness, transitivity, continuity, independence) you behave as if maximising expected utility.
This is the rule for decision under risk (probabilities given) — not ignorance.
Maximising expected value (money) is the special case where U is linear; expected utility lets you encode a risk attitude.
L5 shows humans systematically break this (Allais paradox, prospect theory) — exactly why the course pivots to behavior.

Example: If action A wins or loses with probability 0.5 each, then EU(A) = 0.5·U(win) + 0.5·U(lose); compute EU for every action and choose the action with the largest value.

🎯 In the exam: Classic prompt: define expected utility, give the formula, and cite the VNM justification.
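
The expected-utility rule can be sketched in a few lines. The two actions and the square-root utility below are hypothetical illustrations (not from the lecture); the square root is just one convenient concave, risk-averse shape.

```python
import math

def expected_utility(lottery, utility):
    """lottery: list of (probability, outcome) pairs whose probabilities sum to 1."""
    return sum(p * utility(o) for p, o in lottery)

# Hypothetical actions: a sure 50 versus a 50/50 gamble between 100 and 0.
actions = {
    "sure_50": [(1.0, 50.0)],
    "gamble": [(0.5, 100.0), (0.5, 0.0)],
}

def linear(x):
    return x  # risk-neutral: expected utility equals expected value

def concave(x):
    return math.sqrt(x)  # assumed risk-averse utility shape

eu_linear = {a: expected_utility(l, linear) for a, l in actions.items()}
eu_concave = {a: expected_utility(l, concave) for a, l in actions.items()}

# The rational choice under risk is the action with the highest expected utility.
best_concave = max(eu_concave, key=eu_concave.get)
```

With the linear utility both actions tie on expected value; with the concave one the sure amount wins, which is the risk-aversion point made under Utility.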

Decision under ignorance

Choosing when the outcome probabilities are unknown: you cannot take an expectation, so you apply a decision rule that encodes your attitude to the unknown.

maximin (worst-case) | maximax (best-case) | Hurwicz (α) | minimax-regret | Laplace

Say: Without probabilities, the rule you pick is your attitude to the unknown.

Maximin (Wald): maximise the worst case — pessimistic / cautious.
Maximax: maximise the best case — optimistic.
Hurwicz: blend worst and best with an optimism coefficient α∈[0,1], scoring each action as α·best + (1−α)·worst.
Minimax-regret (Savage): minimise the largest regret (gap from the best you could have done in each state).
Laplace / insufficient reason: assume all states equally likely, then maximise expected utility.

Example: Investing with no idea of the odds: maximin picks the option whose worst outcome is least bad.

🎯 In the exam: Be able to compute each rule from a payoff table and state the attitude it encodes.
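
As a worked illustration, the five rules can be computed from a payoff table as follows. The table (three investments, three states) is made up for the example, and α is taken as the weight on the best case, matching the "optimism coefficient" wording above.

```python
# Hypothetical payoff table: rows = actions, columns = states (no probabilities given).
payoffs = {
    "bonds":  [4, 4, 4],
    "stocks": [-2, 6, 12],
    "mixed":  [1, 5, 8],
}

def maximin(t):
    # Pessimistic: best worst case.
    return max(t, key=lambda a: min(t[a]))

def maximax(t):
    # Optimistic: best best case.
    return max(t, key=lambda a: max(t[a]))

def hurwicz(t, alpha):
    # alpha = optimism coefficient (weight on the best case).
    return max(t, key=lambda a: alpha * max(t[a]) + (1 - alpha) * min(t[a]))

def laplace(t):
    # Treat all states as equally likely and maximise the average payoff.
    return max(t, key=lambda a: sum(t[a]) / len(t[a]))

def minimax_regret(t):
    n_states = len(next(iter(t.values())))
    best = [max(t[a][j] for a in t) for j in range(n_states)]
    # Regret = gap between the best payoff in that state and what the action gives.
    return min(t, key=lambda a: max(best[j] - t[a][j] for j in range(n_states)))

choices = {
    "maximin": maximin(payoffs),
    "maximax": maximax(payoffs),
    "hurwicz_0.5": hurwicz(payoffs, 0.5),
    "laplace": laplace(payoffs),
    "minimax_regret": minimax_regret(payoffs),
}
```

On this table the rules disagree (bonds, stocks, stocks, stocks, mixed respectively), which is the point: the chosen rule encodes the attitude to the unknown.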

Bayesian & Decision Networks

Represent all that uncertainty without exploding.

Bayesian network

A directed acyclic graph whose nodes are random variables and whose edges encode direct dependence; each node stores P(X|parents) and the product of these gives the full joint distribution compactly.

DAG | ∏ P(Xᵢ|parents) | compact factorisation | inference

Say: Many small tables instead of one exponential joint table.

Directed and acyclic (a DAG): arrows go from parent (cause) to child (effect); no cycles.
Each node only needs P(node | its parents), so a sparse graph means dramatically fewer parameters.
It encodes independencies: a node is independent of its non-descendants given its parents (the local Markov property).
Main use is inference: observe some variables, compute the posterior over the rest (diagnosis, prediction).

Example: Rain → WetGrass ← Sprinkler: store P(Rain), P(Sprinkler) and P(Wet|Rain,Sprinkler) instead of one 8-row joint table.
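
A small sketch of the factorisation and of inference by enumeration, assuming the graph exactly as drawn (Rain and Sprinkler are parentless, WetGrass has both as parents). All probability values are invented for illustration.

```python
from itertools import product

# Assumed conditional probability tables (illustrative numbers only).
p_rain = {True: 0.2, False: 0.8}
p_sprinkler = {True: 0.1, False: 0.9}
# P(Wet=True | Rain, Sprinkler)
p_wet_given = {
    (True, True): 0.99, (True, False): 0.9,
    (False, True): 0.8, (False, False): 0.0,
}

def joint(rain, sprinkler, wet):
    """Joint probability as the product of each node's table given its parents."""
    pw = p_wet_given[(rain, sprinkler)]
    return p_rain[rain] * p_sprinkler[sprinkler] * (pw if wet else 1.0 - pw)

# Sanity check: the factorised joint sums to 1 over all 8 assignments.
total = sum(joint(r, s, w) for r, s, w in product([True, False], repeat=3))

# Inference by enumeration: P(Rain=True | Wet=True).
numerator = sum(joint(True, s, True) for s in (True, False))
evidence = sum(joint(r, s, True) for r, s in product([True, False], repeat=2))
posterior_rain = numerator / evidence
```

Enumeration touches every assignment of the hidden variables, so it is only viable for tiny networks.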

Conditional independence

X and Y are conditionally independent given Z when, once Z is known, Y carries no extra information about X (P(X|Y,Z)=P(X|Z)) — the property that lets the network factorise.

d-separation | why it’s compact | fewer edges

Say: It’s the exact property that lets the network factorise.

It is conditional: X and Y can be dependent overall yet independent once Z is fixed (and vice-versa).
Fewer dependencies ⇒ fewer edges ⇒ smaller tables ⇒ a compact model.
“d-separation” lets you read these independencies straight off the graph’s shape.
It is the reason the joint isn’t a full 2ⁿ table and why inference can sometimes be efficient.

Example: In a chain Season → Rain → Sprinkler, once you know it’s raining (Z), whether the sprinkler is on (Y) tells you nothing more about the season (X).

Decision network

A Bayesian network extended with decision nodes and a utility node; solving it means choosing the decisions that maximise expected utility (also called an influence diagram).

influence diagram | chance/decision/utility nodes | max EU

Say: Lecture 1’s rule, scaled up with the network representation.

Three node types: chance (ovals = random variables), decision (rectangles = your choices), utility (diamond = the objective).
Solving it = pick the decision(s) that maximise expected utility given the evidence.
It directly unifies L1 (expected utility) with L2’s compact representation.
It is the basis of the “probabilistic” model paradigm catalogued in Lecture 6.

🎯 In the exam: Name the three node types and state that the output is the EU-maximising decision.

Inference is #P-hard

Even though the network is compact to store, computing an exact marginal or posterior is #P-hard in general (as hard as counting a Boolean formula’s solutions) — so in practice we use approximate inference.

#P-hard | counting problem | sampling / variational

Say: A good representation doesn’t guarantee cheap computation.

#P is the counting analogue of NP: not “is there a solution?” but “how many?” — believed even harder than NP-complete.
Exact inference is only efficient for special structures (e.g. trees / low treewidth).
So in practice we use approximate inference: sampling (MCMC), variational methods, loopy belief propagation.
The lesson generalises: a good representation does not guarantee cheap computation — a recurring DSS theme.

🎯 In the exam: The key word is #P-hard; the takeaway is “use approximate inference”. (Complexity was flagged as lighter for the exam.)

Game Theory & Nash

The world is now other agents, each chasing their own goal.

Game theory

The study of decisions among several rational agents where each one’s payoff depends on the others’ choices, so each must reason about the others’ reasoning.

non-cooperative | simultaneous | zero-sum vs general-sum

Say: Your best move depends on theirs, and theirs on yours.

Key shift from L1: the outcome of your action depends on what others do, so you must reason about their reasoning.
Non-cooperative game theory: players cannot make binding agreements; each maximises its own payoff.
Central questions: what is a “solution” to a game? (Nash) and can players do better by coordinating? (correlated equilibrium).
Games can be zero-sum (pure competition) or general-sum (mixed motives, like the prisoner’s dilemma).

🎯 In the exam: The course focuses on non-cooperative games with the Nash equilibrium as the solution concept.

Normal form

A representation of a game as a matrix giving each player’s payoff for every combination of the players’ strategies (it assumes simultaneous moves).

strategic form | payoff matrix | pure / mixed | vs extensive form

Say: Rows = your moves, columns = theirs, cells = payoffs.

Also called strategic form; it assumes simultaneous moves (neither sees the other’s choice first).
Contrast with extensive form (a game tree) used for sequential moves.
From the matrix you find dominant strategies, best responses, and equilibria.
A strategy may be pure (one action) or mixed (a probability distribution over actions).

Example: The prisoner’s dilemma is a 2×2 matrix where (Defect, Defect) is the equilibrium even though (Cooperate, Cooperate) is better for both.

Nash equilibrium

A strategy profile in which no player can increase their own payoff by changing strategy unilaterally — every player is simultaneously best-responding. It always exists (possibly mixed) but need not be efficient.

mutual best response | always exists (maybe mixed) | not always efficient | prisoner’s dilemma

Say: “Given what everyone else does, I have no reason to move.”

“Unilateral”: hold the others fixed — if no single player gains by deviating alone, it is an equilibrium.
Existence (Nash’s theorem): every finite game has at least one equilibrium, possibly in mixed strategies.
It need not be unique, Pareto-efficient, or “fair” — the prisoner’s dilemma equilibrium is worse for both than cooperating.
Interpretation: a self-enforcing convention / fixed point of mutual best response — not necessarily a prediction of real behaviour.
Computing one is generally hard (PPAD-complete in general games).

Example: Prisoner’s dilemma → (Defect, Defect): neither prisoner gains by switching alone, though both prefer (Cooperate, Cooperate).

🎯 In the exam: The prof’s verbatim prompt: “define a Nash equilibrium and discuss its interpretation.” Hit three points: mutual best response · always exists (maybe mixed) · not always efficient.
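
The "no unilateral gain" test can be checked mechanically on a payoff matrix. The sketch below uses assumed prisoner's-dilemma payoffs (3/0/5/1, a common textbook choice, not taken from the lecture) and only searches pure strategies.

```python
from itertools import product

C, D = "Cooperate", "Defect"
ACTIONS = [C, D]

# Assumed payoffs (row player, column player); higher is better.
payoff = {
    (C, C): (3, 3), (C, D): (0, 5),
    (D, C): (5, 0), (D, D): (1, 1),
}

def is_pure_nash(row, col):
    """True if neither player can gain by deviating alone."""
    row_ok = all(payoff[(row, col)][0] >= payoff[(r, col)][0] for r in ACTIONS)
    col_ok = all(payoff[(row, col)][1] >= payoff[(row, c)][1] for c in ACTIONS)
    return row_ok and col_ok

pure_nash = [p for p in product(ACTIONS, repeat=2) if is_pure_nash(*p)]

# (Cooperate, Cooperate) gives both players more than the equilibrium does,
# which shows the equilibrium is not Pareto-efficient.
both_better_off = all(
    payoff[(C, C)][i] > payoff[(D, D)][i] for i in (0, 1)
)
```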

Mixed strategies

A strategy that plays each action with some probability rather than deterministically; allowing mixtures is what guarantees every finite game has at least one Nash equilibrium.

probabilities over actions | indifference condition | rock-paper-scissors

Say: Randomise so you can’t be predicted; mixing also guarantees that an equilibrium exists.

A pure strategy is the special case that puts probability 1 on a single action.
In a mixed equilibrium each player is indifferent among the actions they randomise over — that indifference is what makes it stable.
Why needed: games like rock-paper-scissors or matching pennies have no pure-strategy equilibrium.
Connects to L4’s stochastic policies and to the adversarial dynamics of GAN training mentioned in lecture.

🎯 In the exam: Explain why mixed strategies are needed (unpredictability) and that they guarantee existence.
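
The indifference condition gives a direct recipe for 2×2 games: each player mixes so that the opponent's two actions earn the same expected payoff. A minimal sketch, assuming a fully mixed equilibrium exists (the helper name and the matching-pennies payoffs are illustrative choices):

```python
from fractions import Fraction

def mixed_equilibrium_2x2(A, B):
    """A[i][j], B[i][j]: payoffs to the row and column player when row plays i
    and column plays j. Assumes a fully mixed equilibrium exists, so both
    denominators are non-zero."""
    # Row's probability p on action 0 makes the COLUMN player indifferent:
    #   p*B[0][0] + (1-p)*B[1][0] == p*B[0][1] + (1-p)*B[1][1]
    p = Fraction(B[1][1] - B[1][0], B[0][0] - B[0][1] - B[1][0] + B[1][1])
    # Column's probability q on action 0 makes the ROW player indifferent:
    #   q*A[0][0] + (1-q)*A[0][1] == q*A[1][0] + (1-q)*A[1][1]
    q = Fraction(A[1][1] - A[0][1], A[0][0] - A[0][1] - A[1][0] + A[1][1])
    return p, q

# Matching pennies: row wins on a match, column wins on a mismatch (zero-sum).
A = [[1, -1], [-1, 1]]
B = [[-1, 1], [1, -1]]
p, q = mixed_equilibrium_2x2(A, B)
```

For matching pennies both probabilities come out as 1/2, the familiar "flip a fair coin" equilibrium.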

Correlated equilibrium

An equilibrium with respect to a shared random signal: given its private recommendation no player wants to deviate. It generalises Nash and can reach joint payoffs that beat every Nash equilibrium.

shared signal / mediator | traffic light | generalises Nash

Say: Coordination via a signal, like a traffic light avoiding collisions.

A trusted “mediator” privately recommends an action to each player; following it is a best response if everyone else follows too.
Every Nash equilibrium is a correlated equilibrium, but not vice-versa — the set is strictly larger.
It can achieve expected payoffs that beat every Nash equilibrium.
It is computationally easier to find than Nash (it is a linear program).

Example: Traffic light: given your light colour you have no incentive to deviate, and collisions are avoided.

Sequential Decision Making

One agent, many decisions — today changes tomorrow.

Knowledge spectrum

Sequential decision-making = acting over time where today’s choice changes tomorrow’s state; which method you use depends on what you know — model → dynamic programming · no model → reinforcement learning · simulator → MCTS.

policy π | discounted return γ | long-run reward

Say: The tool depends on how much of the world you actually know.

The knowledge spectrum: know the model → plan with dynamic programming · no model → learn with reinforcement learning · have a simulator → search with MCTS.
A policy π maps each state to an action; the goal is the policy maximising expected discounted return.
The discount γ trades immediate vs future reward and keeps infinite-horizon sums finite.
Everything in the lecture hinges on the Bellman optimality condition.

🎯 In the exam: The prof said he will NOT ask for formulas here, only intuition (e.g. “explain Bellman optimality in intuitive terms”).

MDP ⟨S,A,P,R,γ⟩

A Markov Decision Process — states, actions, stochastic transitions P(s′|s,a), reward and discount γ — obeying the Markov property: the next state depends only on the current state and action.

Markov property | stochastic transitions | reward | discount γ

Say: The future depends only on the current state, not the whole history.

Markov property: the future depends only on the current state and action, not the whole past — this is what makes it solvable.
Transitions are stochastic: the same action can lead to different next states (the uncertainty thread continuing from L2).
Reward R(s,a) is the per-step payoff; the agent maximises the discounted sum (the “return”).
γ near 0 = myopic / short-sighted; γ near 1 = far-sighted.

Example: A robot on a grid: state = cell, actions = N/E/S/W, moves may “slip”, reward = +10 at the goal.

Bellman equation

The recursive optimality condition: a state’s optimal value is the maximum, over actions, of the immediate reward plus the discounted expected value of the next state. It is the identity every Lecture-4 algorithm solves.

recursive / self-consistent | V*, Q* | now + future

Say: “Best from here = grab now + best from wherever I end up.”

It is a self-consistency condition: the optimal value is defined in terms of itself one step later.
Q*(s,a) is the same idea for state-action pairs; the optimal policy is π*(s) = argmaxₐ Q*(s,a).
Solving this equation is solving the MDP — value iteration, policy iteration and Q-learning all do it.
The “max” makes it nonlinear, which is why we iterate rather than solve in closed form.

🎯 In the exam: The prof’s example: “explain the Bellman optimality condition intuitively”. Emphasise the now-vs-future split, not the algebra.

Value / policy iteration

Dynamic-programming algorithms that, when the model is known, repeatedly apply the Bellman update until they converge to the optimal value function and policy.

dynamic programming | model known | converges

Say: Know the rules → compute the best plan directly.

Value iteration: repeatedly apply the Bellman update to V until it stops changing, then read off the greedy policy.
Policy iteration: alternate evaluating the current policy and improving it greedily — fewer but heavier steps.
Defining assumption: you must KNOW P and R (the full model).
Guaranteed to converge to the optimal policy for a finite discounted MDP.

Example: The lesson’s robot corridor converges to V(1)=5.39, V(2)=7.1, V(3)=9 with policy “always go right”.
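
The corridor numbers can be reproduced with a short value-iteration sketch. The setup is an assumption chosen because it yields exactly these values: three states in a row with a terminal goal to the right of state 3, deterministic left/right moves, a step cost of −1, +10 on reaching the goal, and γ = 0.9. The lecture's actual parameters may differ.

```python
GAMMA = 0.9          # assumed discount factor
STATES = [1, 2, 3]
GOAL = 4             # terminal state with value 0
ACTIONS = ("left", "right")

def step(s, a):
    """Deterministic corridor model: return (next_state, reward)."""
    s2 = s + 1 if a == "right" else max(1, s - 1)
    reward = -1 + (10 if s2 == GOAL else 0)  # assumed step cost and goal bonus
    return s2, reward

def value_iteration(tol=1e-9):
    V = {s: 0.0 for s in STATES}
    V[GOAL] = 0.0
    while True:
        delta = 0.0
        for s in STATES:
            # Bellman update: best immediate reward plus discounted next value.
            best = max(r + GAMMA * V[s2] for s2, r in (step(s, a) for a in ACTIONS))
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V

def q_value(V, s, a):
    s2, r = step(s, a)
    return r + GAMMA * V[s2]

V = value_iteration()
# Greedy policy read off from the converged values.
policy = {s: max(ACTIONS, key=lambda a: q_value(V, s, a)) for s in STATES}
```

Reading the values right to left shows the recursion at work: 9, then −1 + 0.9·9 = 7.1, then −1 + 0.9·7.1 = 5.39.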

Reinforcement learning

Learning an optimal policy from sampled experience when the model is unknown (e.g. Q-learning), trading off exploration against exploitation.

model-free | Q-learning | explore vs exploit | ε-greedy

Say: Don’t know the rules → learn by trying.

Model-free: the agent never builds P explicitly; it learns values directly from experience tuples (s, a, r, s′).
Q-learning nudges Q(s,a) toward r + γ·maxₐ′ Q(s′,a′) — a sampled Bellman update.
Exploration vs exploitation: you must try new actions to learn but exploit known-good ones to score (e.g. ε-greedy).
Common distinction: off-policy (Q-learning) vs on-policy (SARSA).

🎯 In the exam: Key contrast with DP: RL has no model and learns from interaction.
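
A minimal tabular Q-learning sketch makes the contrast with DP concrete: the agent only calls the environment's step function and never reads P or R. The three-state corridor, its rewards (−1 per step, +10 at the goal) and the hyperparameters are all assumptions made for the illustration.

```python
import random

GAMMA, ALPHA, EPSILON = 0.9, 0.5, 0.2   # assumed hyperparameters
ACTIONS = ("left", "right")
GOAL = 4

def step(s, a):
    """Environment (hidden from the learner except through samples)."""
    s2 = s + 1 if a == "right" else max(1, s - 1)
    return s2, -1 + (10 if s2 == GOAL else 0)

def q_learning(episodes=500, seed=0):
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in (1, 2, 3) for a in ACTIONS}
    for _ in range(episodes):
        s = 1
        while s != GOAL:
            # epsilon-greedy: explore with probability EPSILON, else exploit.
            if rng.random() < EPSILON:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: Q[(s, act)])
            s2, r = step(s, a)
            # Sampled Bellman target; the terminal state has value 0.
            target = r if s2 == GOAL else r + GAMMA * max(Q[(s2, b)] for b in ACTIONS)
            Q[(s, a)] += ALPHA * (target - Q[(s, a)])
            s = s2
    return Q

Q = q_learning()
greedy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in (1, 2, 3)}
```

Because the update bootstraps from max over next actions regardless of which action is taken next, this is the off-policy variant mentioned above.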

Monte Carlo Tree Search

A planning method that, given a simulator, estimates action values by sampling many look-ahead roll-outs and selectively growing the search toward promising branches (selection via UCB).

select·expand·simulate·backprop | UCB | AlphaGo

Say: Can simulate → imagine many futures, pursue the good ones.

Four steps per iteration: Selection → Expansion → Simulation (roll-out) → Backpropagation.
UCB balances trying under-explored moves vs exploiting moves that look good (the explore/exploit theme again).
Needs a generative model / simulator, not the full probability tables — a middle ground between DP and RL.
Powers AlphaGo / AlphaZero when paired with learned value and policy networks.

🎯 In the exam: Be able to name the four steps and say what UCB is for.
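
To show what UCB does in the selection step, here is a sketch of the UCB1 score, a common concrete choice (the notes only say "UCB", so the exact formula and the constant c = √2 are assumptions). Only selection is shown, not a full MCTS loop.

```python
import math

def ucb1(mean_value, visits, parent_visits, c=math.sqrt(2)):
    """Exploitation term (mean value) plus an exploration bonus that shrinks
    as a child is visited more often."""
    if visits == 0:
        return math.inf  # unvisited children are tried first
    return mean_value + c * math.sqrt(math.log(parent_visits) / visits)

def select_child(children):
    """children: dict mapping move name -> (total_value, visits)."""
    parent_visits = sum(visits for _, visits in children.values())

    def score(name):
        total, visits = children[name]
        mean = total / visits if visits else 0.0
        return ucb1(mean, visits, parent_visits)

    return max(children, key=score)

# Hypothetical statistics: "a" looks slightly better but is well explored,
# "b" is under-explored, so its exploration bonus wins.
choice = select_child({"a": (6.0, 10), "b": (1.0, 2)})
```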

Behavioral Decision Making

How people really decide — predictably imperfectly.

1. Failures of rationality

Normative → descriptive (the pivot)

The shift from normative models (how an agent should decide — L1–4) to descriptive ones (how people actually decide). A DSS does not operate in isolation, so it must model human decision-making, not only optimal decisions.

normative → descriptive | DSS not in isolation | model the human | “are humans rational?”

Say: A DSS advises a human, so it must model how that human really decides.

Normative models (decision theory, game theory, MDPs/RL) say how agents should decide — enough only if systems acted in isolation.
But a DSS’s recommendations are interpreted by humans who may ignore or misinterpret them and have cognitive limits.
So a DSS must account for not just the optimal decision but actual human decision-making behaviour.
Behavioral decision theory is far less structured than normative theory — a set of important ideas, not one clean axiom system.

🎯 In the exam: Frame the normative→descriptive pivot; it is the conceptual hinge of the whole syllabus.

Failures of rationality

A set of empirical experiments showing humans systematically violate expected-utility theory: preference reversal, the Allais paradox (violating the independence axiom) and the Ellsberg paradox (ambiguity aversion).

preference reversal | Allais → independence axiom | Ellsberg → ambiguity | risk vs ambiguity

Say: Real people break the EU axioms in repeatable, named ways.

Preference reversal: people choose A over B yet price B above A → no single stable utility U fits both tasks (preferences aren’t representation-invariant).
Allais paradox: typical choices (A≻B, D≻C) violate the independence axiom — preferences shouldn’t depend on a shared/irrelevant alternative.
Ellsberg paradox: people prefer the urn with known proportions → they separate risk (known probabilities) from ambiguity (unknown) and are ambiguity-averse.
Two readings: humans are irrational, or the classical rationality assumptions are too strong — either way it reshapes DSS design.

Example: Ellsberg: bet on “black” then on “red”. Most people pick the 49/51 known urn both times, which is internally inconsistent.

🎯 In the exam: Name the three violations and what each breaks; stress risk-vs-ambiguity for Ellsberg.

2. Bounded rationality & heuristics

Bounded rationality

Herbert Simon’s principle that agents are limited by cognitive resources, time and incomplete information, so perfect optimisation is often infeasible; they optimise only over the set of feasible (computable) strategies C.

Simon | limited resources | NP-hard optimisation | π ≈ argmax over C

Say: Optimise within your limits, not over everything.

Agents are bounded by cognitive resources, time constraints and incomplete information.
Optimal decisions are often computationally infeasible (combinatorial optimisation, planning, equilibrium computation — many are NP-hard).
Classical π* = argmax_π E[U] becomes π ≈ argmax_{π∈C} E[U] over computable strategies C.
Critique: it still requires optimising and doesn’t say which rules humans actually use → that motivates heuristics next.

🎯 In the exam: Attribute to Simon; give the feasible-set view and note it doesn’t say which rules humans use.

Procedural rationality

Rubinstein’s proposal to judge the decision procedure, not just the outcome: a decision rule is a mapping f : I → A from available information to actions, so rationality depends on the computational structure used.

Rubinstein | procedure not outcome | cues c(s) | rule-based, not optimisation

Say: Judge how the decision is computed, not only the result.

Classical and bounded rationality both focus on outcomes; Rubinstein focuses on the procedure.
A state s is described by a set of cues c(s); a heuristic maps cue-sets to actions, h(c(s)) ≈ argmaxₐ U(O(a,s)).
It replaces optimisation with rule-based computation.
This frames the specific decision rules studied next — the heuristics.

🎯 In the exam: Define a decision rule f : I → A and the procedural (vs outcome) view; name Rubinstein.

Heuristic decision rules

The specific simple rules that replace optimisation with sequential, rule-based search over cues: satisficing, take-the-best, the recognition heuristic, and elimination-by-aspects.

satisficing | take-the-best | recognition | elimination-by-aspects

Say: Four named short-cuts that decide without optimising.

Satisficing (Simon): pick the first action whose utility clears an aspiration level θ — early stopping; depends on the order of actions.
Take-the-best: evaluate cues in order, stop at the first cue that discriminates — depends on the order of cues.
Recognition heuristic: if you recognise one option and not the other, choose the recognised one (e.g. infer the larger city by name recognition).
Elimination-by-aspects (Tversky): pick a cue by importance, eliminate options that fail it, repeat — sequential filtering.

🎯 In the exam: Name the four rules and each one’s distinguishing feature (satisficing = order of actions; take-the-best = order of cues).
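
Two of the rules are easy to sketch, and doing so makes the order dependence visible. The function names, the flats and the city cues below are hypothetical; take-the-best is shown for binary cues where 1 counts as the "positive" value.

```python
def satisfice(actions, utility, theta):
    """Return the first action whose utility reaches the aspiration level theta."""
    for a in actions:
        if utility(a) >= theta:
            return a  # early stopping: later actions are never inspected
    return None  # nothing clears the bar

def take_the_best(option_a, option_b, cues):
    """cues: cue names in order of assumed validity; options: dicts of 0/1 cue values.
    Decide on the first cue that discriminates and ignore the rest."""
    for cue in cues:
        if option_a[cue] != option_b[cue]:
            return option_a if option_a[cue] > option_b[cue] else option_b
    return None  # no cue discriminates, so the rule must guess

# Satisficing depends on the order in which actions are examined.
flat_utility = {"flat_1": 5, "flat_2": 8, "flat_3": 9}
first_pick = satisfice(["flat_1", "flat_2", "flat_3"], flat_utility.get, theta=7)
reversed_pick = satisfice(["flat_3", "flat_2", "flat_1"], flat_utility.get, theta=7)

# Take-the-best depends on the order of the cues.
city_a = {"name": "A", "capital": 0, "airport": 1, "university": 1}
city_b = {"name": "B", "capital": 0, "airport": 1, "university": 0}
larger = take_the_best(city_a, city_b, ["capital", "airport", "university"])
```

Neither function computes a maximum over all options, which is what "rule-based, not optimisation" means in practice.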

3. Cognitive biases

From heuristics to biases

Because a heuristic only approximates the optimum it carries an error ε(s)=U(a*)−U(h(c(s)))≥0; a bias is when this approximation error is systematic (directional) rather than random.

approximation error ε | random: E[ε]=0 | bias: E[ε]≠0 | directional · predictable

Say: A bias is a heuristic’s error that points the same way every time.

Heuristics cut computational, memory and information cost, but they only approximate the optimum, so ε(s)=U(a*)−U(h(c(s))) ≥ 0.
A random error averages out: E[ε]=0; a bias does not: E[ε]≠0.
So biases are directional, predictable and statistically regular deviations.
Behavioral decision theory studies these regularities empirically.

🎯 In the exam: Give the formal contrast E[ε]=0 (random) vs E[ε]≠0 (bias); the systematic direction is why biases are predictable.

Cognitive biases (taxonomy)

A large catalogue of systematic biases grouped into six families; not all stem from heuristics, but all are predictable enough that a DSS can be designed to counter them.

6 categories | anchoring · availability | representativeness · base-rate | framing · sunk-cost

Say: Dozens of biases in six families; they are repeatable, so you can design around them.

Judgement under uncertainty: availability, representativeness, anchoring, overconfidence/optimism bias.
Probabilistic reasoning: base-rate neglect, conjunction fallacy, gambler’s fallacy, confirmation, illusion of control.
Memory & attention: hindsight, salience, recency, the framing effect, selective perception.
Temporal: present bias, hyperbolic discounting, status-quo bias, sunk-cost fallacy, planning fallacy.
Social/attribution: attribution error, in-group, authority, halo, self-serving.
Decision/choice: loss aversion, endowment, decoy, preference reversal, ambiguity aversion.

🎯 In the exam: Name the six categories; for any bias, define it and give an example (the define→apply pattern).

4. Beyond biases

Ecological rationality

Gigerenzer’s view that there is no universally optimal decision rule — rationality is relative to the environment, and a heuristic’s quality is its expected performance over an environment distribution.

Gigerenzer | environment-relative | no universal best rule | less-is-more

Say: A rule is rational for an environment, not in the abstract.

Classical optimisation assumes complete information, stable environments and accurate models; real ones are uncertain, noisy and non-stationary.
Under uncertainty, simple heuristics can generalise better than highly optimised procedures, which may overfit (“less-is-more”).
Quality is judged ecologically: Perf(h,E) = E_{s∼E}[U(h(s))] over an environment distribution E.
Some heuristics are well-adapted to specific environments; none is universally best.

🎯 In the exam: State that rationality is environment-relative and give the less-is-more / overfitting argument; name Gigerenzer.

Fast-and-frugal heuristics & trees

The mechanism behind ecological rationality: fast-and-frugal heuristics use sequential cue search, limited cue integration, early stopping and no global optimisation. The classic example is the fast-and-frugal tree (FFT).

sequential search | early stopping | no optimisation | FFT · medical triage

Say: Check a few cues in order, exit as soon as one decides.

Four defining features: sequential information search, limited cue integration, early stopping, no global optimisation.
Ecological rationality is the criterion; fast-and-frugal heuristics are the mechanisms (take-the-best, recognition are examples).
An FFT is a decision tree with sequential binary cue checks and a possible exit at every node — decide after inspecting only a subset of cues.
Medical triage: chest pain? → admit; else abnormal ECG? → admit; else high BP? → medium risk; else discharge — no probabilistic aggregation.

🎯 In the exam: List the four fast-and-frugal properties and describe an FFT (early exit, only a subset of cues).
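
The triage tree described above maps directly onto a chain of early returns. The function name and boolean inputs are illustrative; the cue order and exits follow the example in the notes.

```python
def triage(chest_pain, abnormal_ecg, high_bp):
    """Fast-and-frugal tree: one binary cue per level, with an exit at every level."""
    if chest_pain:
        return "admit"          # exit after 1 cue
    if abnormal_ecg:
        return "admit"          # exit after 2 cues
    if high_bp:
        return "medium risk"    # exit after 3 cues
    return "discharge"          # final exit: no cue fired

# A patient with chest pain is admitted without the other cues being consulted.
decision = triage(chest_pain=True, abnormal_ecg=False, high_bp=True)
```

No weights or probabilities are combined: the first cue that fires decides, which is the "limited cue integration" and "early stopping" in the list of properties.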

Naturalistic decision making (RPD)

Klein’s account of how real experts decide under time pressure in dynamic, uncertain settings without explicit optimisation. The Recognition-Primed Decision (RPD) model formalises it.

Klein | experts under time pressure | recognise → simulate → execute | sequential, not optimised

Say: Experts recognise, simulate one option, and act, with no full comparison.

NDM shifts focus from environment structure to real-world expert behaviour under time pressure.
How do experts decide well without optimisation? Answer: recognition + experience + mental simulation.
RPD model: (1) recognise a familiar situation → (2) retrieve a plausible action → (3) mentally simulate its consequences → (4) execute if satisfactory.
Options are evaluated sequentially, not optimised simultaneously — so RPD is ecological and connects to satisficing.

🎯 In the exam: Give the four RPD steps and note it is sequential/satisficing-like, not optimisation; name Klein.

5. Prospect theory

Prospect theory

Kahneman & Tversky’s model of preference under risk: it replaces a stable utility U(x) with a value function V(x−r) over gains/losses relative to a reference point r, so an identical outcome can be a gain or a loss depending on context.

Kahneman & Tversky | reference point r | relative not absolute | S-shaped

Say: Value is measured from a reference point, not from total wealth.

It targets the deeper failure: U may not be stable across contexts — the issue is how outcomes are represented & evaluated, not just computed.
Reference dependence: the same outcome is a gain or a loss depending on r (x ↦ V(x−r)).
S-shaped value function: V(x)=x^α for x≥0, −λ(−x)^β for x<0, with α,β∈(0,1) — concave for gains, convex for losses.
It complements process models (bounded rationality, heuristics): “Behavioral DT = Computation + Representation”.

Example: A “+€1000 raise” feels like a loss if you expected +€2000: measured from the reference point, the outcome is negative.

🎯 In the exam: Write U(x) → V(x−r); stress reference dependence and the S-shape (concave gains / convex losses).

Loss aversion & probability weighting

The two further departures of prospect theory: losses loom larger than equal gains (|V(−x)| > V(x), λ>1), and objective probabilities are replaced by a nonlinear weighting w(p).

loss aversion λ>1 | endowment / status-quo | weighting w(p) | small probs overweighted

Say: Losses hurt about twice as much, and we distort the odds.

Loss aversion: the value curve is steeper for losses, |V(−x)| > V(x) for x>0 (λ≈2) — losses hurt more than equal gains please.
It explains the endowment effect, status-quo bias and risk aversion in gains.
Probability weighting: p → w(p), nonlinear — small probabilities over-weighted, medium/high under-weighted (why we buy both lottery tickets and insurance).
Decision value PU(L)=Σ w(pᵢ)V(xᵢ) ≠ EU(L); limitation: prospect theory models valuation, not the decision process.

🎯 In the exam: State λ>1 (loss aversion) and the w(p) shape; note PU ≠ EU and that PT covers valuation, not process.
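
The pieces of prospect theory combine into a few lines of code. The parameter values (α = β = 0.88, λ = 2.25) and the one-parameter weighting function with γ = 0.61 are the commonly cited Tversky and Kahneman (1992) estimates, used here as assumptions; the notes do not fix a functional form for w(p), and using a single γ for gains and losses is a simplification.

```python
ALPHA = BETA = 0.88   # assumed curvature for gains and losses
LAMBDA = 2.25         # assumed loss-aversion coefficient (lambda > 1)
GAMMA_W = 0.61        # assumed probability-weighting parameter

def value(x, r=0.0):
    """S-shaped value of outcome x relative to reference point r."""
    d = x - r
    return d ** ALPHA if d >= 0 else -LAMBDA * (-d) ** BETA

def weight(p):
    """Inverse-S weighting: small p over-weighted, medium/high p under-weighted."""
    return p ** GAMMA_W / (p ** GAMMA_W + (1 - p) ** GAMMA_W) ** (1 / GAMMA_W)

def prospect_value(lottery, r=0.0):
    """PU(L) = sum of w(p_i) * V(x_i) over (probability, outcome) pairs."""
    return sum(weight(p) * value(x, r) for p, x in lottery)

# A fair coin flip for +100 / -100 has expected value 0 but negative prospect value.
coin_flip = prospect_value([(0.5, 100.0), (0.5, -100.0)])
# A +1000 raise evaluated against an expected +2000 registers as a loss.
raise_vs_expectation = value(1000.0, r=2000.0)
```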

DSS Architecture & Design

Build a system that supports a human — doesn’t replace them.

Formal view of a DSS

The view of a DSS as a function D : 𝓘×𝓢×𝓤 → 𝓐 from information, world-state and user to a recommended action — its output is advice, leaving the final decision to the human.

augment ≠ replace | user is an input | output = recommendation

Say: It recommends; the human decides and stays accountable.

𝓘 = available information/data · 𝓢 = state of the world/problem · 𝓤 = the user (skills, preferences, context) · 𝓐 = the action/recommendation space.
Output ≠ decision: the system recommends; the human decides and is accountable.
Including 𝓤 is exactly what makes it a support system rather than an autonomous agent.
It generalises the decision rules of L1 into a system-level mapping.

🎯 In the exam: Write the signature and stress that 𝓤 (the user) is an input and the output is a recommendation.

4-part architecture

The canonical DSS structure — a data, a knowledge, a model and a user-interface subsystem joined by a workflow — built to separate representation from computation.

separate representation from computation | modular | swap the model

Say: Keep data, knowledge, reasoning and UI as separate parts.

Data subsystem: stores and serves the raw inputs/observations.
Knowledge subsystem: domain rules, constraints and expert knowledge.
Model subsystem: the reasoning/computation engine — where the L1–L5 paradigms live.
User interface: how recommendations and their rationale are communicated — critical for trust and usability.
Modularity payoff: you can swap the model without rebuilding the data layer or the UI.

🎯 In the exam: List the four components and the “separate representation from computation” principle.

Model paradigms

The families of model that can fill the model subsystem — knowledge-based, model-driven, probabilistic, machine-learning, sequential or hybrid — i.e. every earlier lecture reused as a building block.

every lecture returns | accuracy vs transparency

Say: Each earlier lecture is a plug-in paradigm for the model.

Knowledge-based: explicit rules / expert systems — transparent but brittle.
Model-driven: optimisation and decision-theoretic models (L1).
Probabilistic: Bayesian / decision networks (L2).
Learning: machine-learning models fit from data.
Sequential: MDP/RL planners (L4).
Hybrid: combinations of the above.
Trade-offs everywhere: accuracy vs transparency vs data needs vs robustness — no single best choice.

🎯 In the exam: Map each paradigm back to its lecture and note the accuracy/transparency trade-off.

Human factors & trust

The design constraints coming from the user — cognitive load, framing, and trust calibration: matching the user’s reliance to the system’s real reliability (avoiding over- and under-reliance).

trust calibration | over-reliance = automation bias | under-reliance = algorithm aversion | transparency

Say: Match the user’s reliance to the system’s real reliability.

Trust calibration: ideally reliance tracks actual reliability — trust the system when it’s right, override it when it’s wrong.
Over-reliance / automation bias: blindly following the system, including its errors.
Under-reliance / algorithm aversion: ignoring a system that is actually better than you.
Design levers: transparency, explanations, communicating uncertainty, and managing cognitive load and framing.
This sets up Lecture 7, where trust and reliance become things you must measure.

🎯 In the exam: Define trust calibration and name the two failure modes (over- and under-reliance).

DSS Evaluation

Did it actually help? — judge the system, not just the model.

Why evaluation is hard

Assessing whether a DSS genuinely improves decisions means judging the whole human-plus-system — which is socio-technical and multi-dimensional — not just the model’s predictive accuracy.

good DSS ≠ good predictor | socio-technical | multi-dimensional

Say: You evaluate the human+system, not a classifier in isolation.

Why it’s hard: the thing being evaluated is socio-technical (a human+model team), outcome quality must be separated from process quality, and the assessment is inherently multi-dimensional.
Predictive metrics (accuracy, etc.) are necessary but far from sufficient.
You must also check calibration, cost/utility (net benefit), fairness, robustness to shift, explainability, usability and adoption.
Finally, evaluate the human-DSS interaction itself: the team can be worse than either part alone.

🎯 In the exam: Lead with “good DSS ≠ good predictor” and enumerate the extra evaluation dimensions.

Confusion-matrix metrics

Metrics built from the 2×2 table of true/false positives & negatives — precision=TP/(TP+FP), recall=TP/(TP+FN), specificity, F1 — where plain accuracy misleads when one class is rare.

precision | recall / sensitivity | specificity | F1 | accuracy is lying

Say: For a disease with 1% prevalence, “always healthy” is 99% accurate yet useless.

Accuracy = (TP+TN)/total — but “always predict the majority” scores high on imbalanced data (the “accuracy is lying” trap).
Precision: of those flagged positive, how many really are (the cost of false alarms).
Recall / sensitivity: of the real positives, how many you catch (the cost of misses).
Specificity = TN/(TN+FP); F1 = harmonic mean of precision & recall; balanced accuracy averages recall across classes.
Which to optimise depends on the costs of the two error types — a decision-theoretic choice.

Example: With a rare disease (1% prevalence), predicting “healthy” for everyone is 99% accurate but has 0% recall, so it is useless.

🎯 In the exam: Define precision vs recall and explain when accuracy misleads.
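
The rare-disease trap is easy to verify by hand. The sketch below computes the metrics above from the four confusion-matrix counts; the function name `confusion_metrics` and the cohort of 1000 patients are illustrative assumptions, and reporting an undefined ratio (such as precision with no flagged cases) as 0.0 is a convention chosen here, not a standard.

```python
def confusion_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Metrics from the 2x2 table of true/false positives and negatives."""

    def ratio(num: float, den: float) -> float:
        # Assumed convention: an undefined ratio (zero denominator) is reported as 0.0.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)          # also called sensitivity
    specificity = ratio(tn, tn + fp)
    return {
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": ratio(2 * precision * recall, precision + recall),
        "balanced_accuracy": (recall + specificity) / 2,
    }


# Hypothetical cohort: 1000 patients, 1% prevalence, everyone predicted "healthy".
always_healthy = confusion_metrics(tp=0, fp=0, fn=10, tn=990)
for name, value in always_healthy.items():
    print(f"{name}: {value:.2f}")
```

Accuracy comes out at 0.99 while recall and F1 are 0 and balanced accuracy is 0.5, which is what a coin flip would score.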

Calibration & ROC

Calibration = predicted probabilities match observed frequencies (a reliability diagram on the diagonal); ROC/AUC compares performance across all thresholds and net-benefit analysis weighs the cost of errors.

reliability diagram | calibration vs discrimination | ROC / AUC | net benefit (DCA)

Say: A “70% chance” should actually happen ~70% of the time.

Calibration vs discrimination: a model can rank cases well (high AUC) yet output badly-scaled probabilities.
Reliability diagram: plot predicted vs observed frequency; on the diagonal = well-calibrated.
ROC curve: true-positive vs false-positive rate across all thresholds; AUC summarises it threshold-free.
Net benefit / decision-curve analysis turns a threshold into expected utility by weighing false-positive vs false-negative costs — the utility thread from L1.
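
A small sketch of calibration vs discrimination, on made-up data. `auc` uses the pairwise (rank) definition, the chance that a random positive outscores a random negative with ties counted as half, and `reliability_table` is a crude binned reliability diagram. Both function names and the ten cases are assumptions for illustration.

```python
def auc(scores, labels):
    """Probability that a random positive scores above a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def reliability_table(scores, labels, n_bins=5):
    """Rows of (mean predicted probability, observed frequency, count) per bin."""
    bins = [[] for _ in range(n_bins)]
    for s, y in zip(scores, labels):
        bins[min(int(s * n_bins), n_bins - 1)].append((s, y))
    rows = []
    for b in bins:
        if b:  # skip empty bins
            mean_pred = sum(s for s, _ in b) / len(b)
            observed = sum(y for _, y in b) / len(b)
            rows.append((round(mean_pred, 2), round(observed, 2), len(b)))
    return rows


# Toy data: 5 low-risk cases (1 positive) and 5 high-risk cases (4 positives).
labels = [1, 0, 0, 0, 0, 1, 1, 1, 1, 0]
honest = [0.2] * 5 + [0.8] * 5     # predicted probabilities match the frequencies
inflated = [0.6] * 5 + [0.9] * 5   # same ranking, probabilities shifted upwards

print(auc(honest, labels), reliability_table(honest, labels))
print(auc(inflated, labels), reliability_table(inflated, labels))
```

Both score lists give the same AUC because the ranking is unchanged, but only the first sits on the diagonal: the inflated scores say 0.6 where the observed frequency is 0.2.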

🎯 In the exam: Distinguish calibration (probabilities honest) from discrimination (ranking / AUC).
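
Net benefit can be sketched with the formula commonly used in decision-curve analysis: true positives per case minus false positives per case, with each false positive weighted by the odds of the threshold, pt / (1 − pt). The counts below are invented for illustration, and `net_benefit` is a hypothetical helper name.

```python
def net_benefit(tp: int, fp: int, n: int, threshold: float) -> float:
    """Net benefit per case at a probability threshold (0 < threshold < 1)."""
    # The threshold encodes the cost ratio: one false positive "costs"
    # threshold / (1 - threshold) true positives.
    harm_weight = threshold / (1 - threshold)
    return tp / n - (fp / n) * harm_weight


# Hypothetical cohort: 1000 cases, 100 true positives (10% prevalence).
n, threshold = 1000, 0.10
model_nb = net_benefit(tp=80, fp=90, n=n, threshold=threshold)
treat_all_nb = net_benefit(tp=100, fp=900, n=n, threshold=threshold)
treat_none_nb = 0.0  # nobody flagged: no benefit, no harm

print(f"model {model_nb:.3f}, treat all {treat_all_nb:.3f}, treat none {treat_none_nb:.3f}")
```

A model is worth using at a given threshold only if its net benefit beats both defaults, treat all and treat none. Here it does (about 0.07 against roughly 0 for each default).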

Fairness & robustness

The concerns beyond accuracy — distribution shift (covariate/label/concept), out-of-distribution robustness, fairness criteria (often mutually incompatible) and explanation quality.

covariate / label / concept shift | OOD | fairness criteria incompatible | explainability

Say: Accurate here and now ≠ robust and fair everywhere.

Distribution shift: covariate shift (inputs change), label shift (base rates change), concept shift (the input→output relation changes).
Robustness / OOD: behaviour on inputs unlike the training data, and whether it fails gracefully.
Fairness criteria (demographic parity, equalized odds, calibration within groups) are provably incompatible in general: outside special cases such as equal base rates across groups, you cannot satisfy all of them at once.
Explainability: explanations must be faithful and actually understood — perceived understanding ≠ real understanding.

🎯 In the exam: Name the three shift types and note that fairness criteria are mutually incompatible.
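
The tension between two of the fairness criteria shows up even in a toy case. In this sketch (invented groups, hypothetical helper `group_rates`), group A has a 50% base rate and group B a 20% base rate. A perfect predictor satisfies equalized odds but not demographic parity; forcing equal selection rates then breaks equalized odds.

```python
def group_rates(y_true, y_pred):
    """Selection rate, true-positive rate and false-positive rate for one group."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    positives = sum(y_true)
    negatives = len(y_true) - positives
    return {
        "selection_rate": sum(y_pred) / len(y_pred),  # compared for demographic parity
        "tpr": tp / positives,                        # compared for equalized odds
        "fpr": fp / negatives,                        # compared for equalized odds
    }


true_a = [1] * 5 + [0] * 5   # base rate 0.5
true_b = [1] * 2 + [0] * 8   # base rate 0.2

# Perfect predictor: equal TPR/FPR across groups, unequal selection rates.
perfect_a = group_rates(true_a, true_a)
perfect_b = group_rates(true_b, true_b)

# Force group B up to group A's selection rate by flagging 3 extra negatives.
parity_b = group_rates(true_b, [1] * 5 + [0] * 5)

print(perfect_a, perfect_b, parity_b, sep="\n")
```

With the perfect predictor the selection rates are 0.5 vs 0.2. After forcing parity, group B's false-positive rate rises from 0 to 0.375 while group A's stays at 0.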

Human–DSS interaction

Evaluation of the human-DSS team — which can perform worse than either part alone — distinguishing trust from reliance and reliance from agreement.

team can be worse | appropriate reliance | RCT / A-B / longitudinal | SUS

Say: Appropriate reliance = follow when right, override when wrong.

Complementarity is not automatic — a strong model + a human can underperform either one solo.
Trust (an attitude) ≠ reliance (a behaviour) ≠ agreement (did they follow it this time).
Appropriate reliance = following when the system is right and overriding when it is wrong (the calibrated trust from L6).
Methods: RCTs, A/B tests, simulation studies, longitudinal field studies; plus usability (e.g. SUS) and adoption.

🎯 In the exam: Stress “the team can be worse than either alone” and the trust / reliance / agreement distinctions.
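
One possible way to turn appropriate reliance into numbers, assuming a study log that records for each decision whether the system was right and whether the human followed it. The log format, the function name `reliance_summary` and the ten trials are all assumptions; published studies use several different operationalisations.

```python
def reliance_summary(trials):
    """trials: list of (system_correct, human_followed) booleans, one per decision.

    Assumes the log contains at least one case where the system was right
    and at least one where it was wrong.
    """
    followed_when_right = [followed for correct, followed in trials if correct]
    followed_when_wrong = [followed for correct, followed in trials if not correct]
    overridden_when_right = sum(not f for f in followed_when_right)
    overridden_when_wrong = sum(not f for f in followed_when_wrong)
    return {
        # followed a correct system or overrode a wrong one
        "appropriate_reliance": (sum(followed_when_right) + overridden_when_wrong) / len(trials),
        # followed the system although it was wrong (automation bias)
        "over_reliance": sum(followed_when_wrong) / len(followed_when_wrong),
        # overrode the system although it was right (algorithm aversion)
        "under_reliance": overridden_when_right / len(followed_when_right),
    }


# Hypothetical log: system right 8 times (followed 6), wrong 2 times (followed 1).
trials = [(True, True)] * 6 + [(True, False)] * 2 + [(False, True)] + [(False, False)]
summary = reliance_summary(trials)
print(summary)
```

Note that this measures reliance (behaviour) only. Trust as an attitude would need a separate instrument such as a questionnaire.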

🧵 Threads: the ideas that recur

💰 Utility

The value every decision tries to maximise.

L1 defines → L4 reward → L5 distorted → L7 net benefit

🎲 Uncertainty

We reason over probability distributions, never certainties.

L1 risk → L2 networks → L4 transitions → L7 calibration

👥 Other agents

Others’ goals interact with yours.

L3 games → L6 whose payoff? → L7 human+DSS team

🧠 Humans aren’t rational

The fact that flips theory into real systems.

L5 documents → L6 designs around → L7 measures

⚖️ Optimal ≠ effective

An optimal answer a human won’t accept is useless.

L4 optimal → L6 trust/adoption → L7 best predictor ≠ best DSS

🔁 The big shape

The arc of the whole course, in one breath.

should-decide → do-decide → build → judge

the one-sentence course

How we should decide (1–4) → how we do decide (5) → how to build the support (6) → how to judge it (7).

🧮 Decision rules: Lecture 1 quick reference

Rule	What it does	Attitude
Maximin	Maximise the worst-case outcome	Pessimistic / cautious
Maximax	Maximise the best-case outcome	Optimistic
Hurwicz	Blend worst & best with optimism coefficient α	Tunable
Minimax-regret	Minimise the largest regret vs the best you could’ve done	Avoids “if only…”
Laplace	Assume all states equally likely → maximise expected utility	Indifferent
Expected utility	Probabilities known → maximise Σ P·U	Rational under risk
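
The six rules in the table can be sketched on one small payoff matrix. The actions, states, utilities and probabilities below are invented for illustration, and the Hurwicz rule is written with α weighting the best case, which is one common convention.

```python
def maximin(payoffs):
    return max(payoffs, key=lambda a: min(payoffs[a]))


def maximax(payoffs):
    return max(payoffs, key=lambda a: max(payoffs[a]))


def hurwicz(payoffs, alpha):
    # alpha = optimism coefficient: weight on the best case, (1 - alpha) on the worst
    return max(payoffs, key=lambda a: alpha * max(payoffs[a]) + (1 - alpha) * min(payoffs[a]))


def minimax_regret(payoffs):
    n_states = len(next(iter(payoffs.values())))
    best = [max(row[s] for row in payoffs.values()) for s in range(n_states)]
    return min(payoffs, key=lambda a: max(best[s] - payoffs[a][s] for s in range(n_states)))


def laplace(payoffs):
    return max(payoffs, key=lambda a: sum(payoffs[a]) / len(payoffs[a]))


def expected_utility(payoffs, probs):
    return max(payoffs, key=lambda a: sum(p * u for p, u in zip(probs, payoffs[a])))


# Hypothetical utilities per action in the states (boom, flat, bust).
payoffs = {
    "stocks": [100, 20, -60],
    "bonds": [40, 30, 10],
    "cash": [15, 15, 15],
}

choices = {
    "maximin": maximin(payoffs),
    "maximax": maximax(payoffs),
    "hurwicz(0.5)": hurwicz(payoffs, alpha=0.5),
    "minimax-regret": minimax_regret(payoffs),
    "laplace": laplace(payoffs),
    "expected utility": expected_utility(payoffs, probs=[0.6, 0.3, 0.1]),
}
for rule, action in choices.items():
    print(f"{rule}: {action}")
```

On this matrix maximin picks cash, maximax and expected utility pick stocks, and Hurwicz (α = 0.5), minimax-regret and Laplace pick bonds, so the choice of rule changes the recommendation.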