Decisions → support systems
Before any maths, the professor sets up a cascade of three concepts. Everything in the course hangs off it, and each concept builds on the last:
Decision
The behaviour of choosing among alternatives — and the result of that choice. Ordering at a restaurant is both the act of choosing and the dish you get.
Decision-making
The process of arriving at a decision from a set of stimuli — the events or conditions that trigger you to choose.
Support system
Any system — a program, or even rules written on paper — whose goal is to improve decision-making.
We care about decision support systems (DSS). To improve decision-making, we first have to understand decisions: what they are, how they're made, how they should be made, and how to write them down mathematically. That's why a course on "support systems" spends its time on decision theory.
A decision support system is any system whose goal is to improve decision-making processes… these are not intended to replace humans, but rather to support humans.
That word support matters. A DSS isn't meant to replace the doctor or the manager — it's meant to help them decide better. Which leads straight to the first big question of the field…
What is decision theory?
The study of how agents make — or should make — decisions in a decision-making setting.
Two words in that definition are doing all the work, and the professor highlights both.
"Make" vs "should make"
Descriptive ("make")
Studies how real humans actually decide, biases and mistakes included. How do doctors really choose to treat? How do analysts really decide to invest?
Normative ("should make")
Starts from assumptions (e.g. "the agent is rational") and derives the optimal behaviour those assumptions imply. This course is mostly normative.
We don't want systems that are copies of humans, because sometimes humans make wrong decisions. We want systems that can help us improve.
But we still study the descriptive side a little — because a support system has to support a human, and you can't support someone if you don't understand how they reason.
One field, many settings
Decision theory is huge. Which sub-field you're in depends entirely on the shape of the setting — how many agents, and how they relate:
| The setting | The sub-field |
|---|---|
| One agent vs. the environment | Decision theory ← we start here |
| Many agents, each self-interested | Non-cooperative game theory |
| Many agents forced to cooperate | Coalitional game theory |
| Many agents + one central planner | Social choice theory |
| An agent learning by interacting | Reinforcement learning |
Same theory, different "instantiations" of the setting. Lectures 2–3 move into game theory.
The model & the decision matrix
We take the simplest setting — one agent vs. the environment — and turn it into maths.
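The professor models everything abstractly, with just three pieces: the actions the agent can take, the states the environment can be in, and the outcome produced by each action–state pair.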
The doctor is the agent; the world of patients is the environment. We model the doctor by the actions they can take: treat with medicine A, treat with medicine B, or do not treat. The states are: sick with this disease, sick with that one, or not sick at all.
The heart of it: you choose the row (your action); the world chooses the column (the state). You only ever control half of what happens. The outcome sits where your row meets the world's column.
| | state s₁ | … | state sₘ |
|---|---|---|---|
| action a₁ | O₁₁ | … | O₁ₘ |
| ⋮ | ⋮ | ⋱ | ⋮ |
| action aₙ | Oₙ₁ | … | Oₙₘ |
A decision matrix: n actions × m states. (Only works when both sets are finite — otherwise the table is infinite.)
A concrete one: the insurance decision
Should a DSS advise you to buy fire insurance for your house? Two actions, two states:
| | 🔥 Fire | No fire |
|---|---|---|
| Buy insurance | No house, +100 000€ | Keep house (paid premium) |
| No insurance | No house, +100€ | Keep house, +100€ |
Insurance is brilliant in the Fire column and a mild waste in the No fire column. So which row do you commit to — before you know which column the world picks?
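As a minimal sketch, the insurance matrix can be stored as a nested mapping from action to state to outcome. The outcome texts are copied from the table; the names `ACTIONS`, `STATES`, `OUTCOMES` and `outcome` are illustrative assumptions, not part of the course material.

```python
# Decision matrix for the fire-insurance example: OUTCOMES[action][state].
# Outcome descriptions come from the table in the notes; all identifiers
# here are hypothetical names chosen for this sketch.
ACTIONS = ["Buy insurance", "No insurance"]
STATES = ["Fire", "No fire"]

OUTCOMES = {
    "Buy insurance": {
        "Fire": "No house, +100 000€",
        "No fire": "Keep house (paid premium)",
    },
    "No insurance": {
        "Fire": "No house, +100€",
        "No fire": "Keep house, +100€",
    },
}


def outcome(action: str, state: str) -> str:
    """Return the outcome where the agent's row meets the world's column."""
    return OUTCOMES[action][state]


# The agent fixes the row; the world then picks the column.
for state in STATES:
    print(f"Buy insurance / {state}: {outcome('Buy insurance', state)}")
```

The agent only ever supplies the first argument of `outcome`; the second is chosen by the environment.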
Notice what's missing: nothing tells us how likely a fire is. Everything from here depends on one question:
Do you know the probabilities of the states?
• No idea → Decision under ignorance (§5)
• You have a probability distribution → Decision under risk (§6)
This setting [ignorance] seems simpler, because we don't deal with probabilities — but in fact it is harder, in the sense that there is no single optimal behaviour.
Rationality & utility
Normative theory needs a starting assumption. That assumption is rationality — and it's more precise than the everyday word.
Step 1 — preferences that make sense
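We assume the agent can rank outcomes by preference, and that this ranking is a pre-order: it obeys two common-sense rules. Reflexivity: every outcome is at least as good as itself. Transitivity: if x is at least as good as y, and y is at least as good as z, then x is at least as good as z.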
If you prefer bananas to apples, and oranges to bananas, then reasonably you'll prefer oranges to apples.
Step 2 — pre-order vs. linear order
A pre-order allows incomparable things. A linear order (like the numbers) doesn't — any two items can always be compared.
I can say a cheeseburger is better than salmon — but I cannot say whether an airplane is better than a cheeseburger. They're just not comparable.
For decision-making we add completeness: inside our setting, every pair of outcomes must be comparable. If you can't compare two outcomes, you haven't finished specifying the problem.
Step 3 — collapse it into a number: utility
Reflexivity + transitivity + completeness, together, let us line the outcomes up and assign each a real number U such that U(oᵢ) ≤ U(oⱼ) exactly when oⱼ is (weakly) preferred to oᵢ. Drop any one of the three and this becomes impossible.
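The construction can be sketched for a finite set of outcomes: check the three properties, then give each outcome the number of outcomes it is at least as good as. This counting trick is one possible choice of U, not necessarily the one used in the lectures, and the fruit data and function names are hypothetical.

```python
from itertools import product

# Hypothetical toy data: (x, y) in PREFERS means "x is at least as good as y".
# The ranking orange > banana > apple echoes the fruit example in the notes.
OUTCOMES = ["apple", "banana", "orange"]
PREFERS = {
    ("apple", "apple"), ("banana", "banana"), ("orange", "orange"),
    ("banana", "apple"), ("orange", "banana"), ("orange", "apple"),
}


def is_reflexive(items, rel):
    return all((x, x) in rel for x in items)


def is_transitive(items, rel):
    return all(
        (x, z) in rel
        for x, y, z in product(items, repeat=3)
        if (x, y) in rel and (y, z) in rel
    )


def is_complete(items, rel):
    return all((x, y) in rel or (y, x) in rel for x, y in product(items, repeat=2))


def utility_from_preference(items, rel):
    """U(x) = number of outcomes that x is at least as good as."""
    if not (is_reflexive(items, rel) and is_transitive(items, rel) and is_complete(items, rel)):
        raise ValueError("preference must be reflexive, transitive and complete")
    return {x: sum((x, y) in rel for y in items) for x in items}


print(utility_from_preference(OUTCOMES, PREFERS))
# {'apple': 1, 'banana': 2, 'orange': 3}
```

Removing a pair such as `("orange", "apple")` makes the relation fail the checks, and the function refuses to produce a utility.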
The utility function is the central notion. Once you've derived it, you can forget about the preference — and even forget the outcomes. Algorithms work on the utility function.
An agent is rational if it acts to maximise its utility. So the messy insurance outcomes ("lose the house but get 100 000€"…) collapse into bare numbers we can compute with:
| | 🔥 Fire | No fire |
|---|---|---|
| Buy insurance | 1 | 4 |
| No insurance | −100 | 5 |
This agent ranks the outcomes 5 ▸ 4 ▸ 1 ▸ −100. "No insurance, no fire" is best; "no insurance, fire" is catastrophic.
Utilities are personal: I might rate cheeseburger above salmon while you do the opposite. Different agents → different utility functions. There's no universal "correct" number.
One subtlety the professor flags: the scale
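Ordinal
Only the order matters. The numbers are arbitrary: any order-preserving relabelling (add 1000 to all of them, or square a set of positive values) gives the same utility function. No averaging allowed.
Cardinal
The distances between numbers carry meaning. Stretch one gap (say, add 1000 to the top value only) and you break it, even though the order survives. Only positive affine rescalings a·U + b with a > 0 are harmless. You can average, which expected utility (§6) requires.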
Expected utility (§6) takes a weighted average of utilities — so it needs a cardinal scale. On an ordinal scale an average is meaningless, because the spacing between the numbers means nothing. State this in one sentence and you've earned marks.
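A small sketch of why the scale matters. The two actions, their utilities and the equal probabilities are invented for illustration. Taking square roots keeps the order of the utilities (0 < 4 < 10), so an ordinal agent sees no change, yet the expected-utility winner flips.

```python
import math

# Hypothetical example: two equally likely states, two actions.
PROBS = [0.5, 0.5]
UTILS = {"A": [0.0, 10.0], "B": [4.0, 4.0]}


def expected_utility(utilities, probs):
    """Probability-weighted average of one action's utilities."""
    return sum(p * u for p, u in zip(probs, utilities))


def best_action(table, probs):
    return max(table, key=lambda a: expected_utility(table[a], probs))


# Order-preserving relabelling: sqrt keeps 0 < 4 < 10 in the same order.
relabelled = {a: [math.sqrt(u) for u in us] for a, us in UTILS.items()}

print(best_action(UTILS, PROBS))       # A  (EU 5.0 vs 4.0)
print(best_action(relabelled, PROBS))  # B  (EU about 1.58 vs 2.0)
```

The ranking of outcomes is identical in both tables; only the spacing changed, and that was enough to change the average.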
Decision under ignorance
You know the actions, the states and the utilities — but nothing about how likely each state is. "Simpler" to set up, but genuinely harder to solve.
First tool: dominance
(Preference was between outcomes; dominance is between actions.) Action aⱼ weakly dominates aᵢ if, in every state, aⱼ's outcome is at least as good as aᵢ's (by utility). It strongly dominates if it is also strictly better in at least one state.
B weakly dominates A when, no matter the actual state of the environment, action B always gives you an at-least-as-preferable outcome.
The golden rule: a rational agent should never pick a dominated action. Here's the menu — order before you know if the chef is any good:
| | Good chef | Bad chef |
|---|---|---|
| Monkfish | 4 | 1 |
| Hamburger | 3 | 3 |
| No main course | 2 | 2 |
No main course (2, 2) is strongly dominated by Hamburger (3, 3) — the burger wins in both columns, so a rational agent deletes it. But dominance can't choose between Monkfish and Hamburger: each is better in a different column.
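The dominance check on the menu can be sketched in a few lines. The definitions follow the ones given in these notes (strong = weak plus strictly better in at least one state); the function names are illustrative assumptions.

```python
# Menu utilities per state, in the order (Good chef, Bad chef).
MENU = {
    "Monkfish": [4, 1],
    "Hamburger": [3, 3],
    "No main course": [2, 2],
}


def weakly_dominates(b, a):
    """b is at least as good as a in every state."""
    return all(ub >= ua for ub, ua in zip(b, a))


def strongly_dominates(b, a):
    """Weak dominance plus strictly better in at least one state."""
    return weakly_dominates(b, a) and any(ub > ua for ub, ua in zip(b, a))


def undominated(table):
    """Actions that no other action strongly dominates."""
    return [
        a for a in table
        if not any(strongly_dominates(table[b], table[a]) for b in table if b != a)
    ]


print(undominated(MENU))  # ['Monkfish', 'Hamburger']
```

Two actions survive, which is exactly the point: dominance prunes the table but does not pick a winner here.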
Dominance is a very weak rule — it deletes the obviously bad, but rarely names a winner. To go further you must add an extra assumption about the agent's attitude, and each assumption gives a different rule. There is no single best one — only the requirement that every rule must agree with dominance.
The rules
The professor's running example is a medication decision: you feel ill but don't know the cause. The utilities below are the ones used for the same example in the section on decision under risk; under ignorance no probabilities are available.
| | Bacterial | Viral | Stress | Worst case |
|---|---|---|---|---|
| Take antibiotics | 1 | −1 | −1 | −1 |
| Take anti-fever | 0.5 | 0.5 | −0.5 | −0.5 |
| No medication | −1 | −1 | 0 | −1 |
Each rule below selects an action from this table for a different reason.
Maximin
Maximise the worst case. Take each action's minimum, then pick the biggest. The cautious, risk-averse agent.
Maximax
Maximise the best case. Take each action's maximum, then pick the biggest. The optimist.
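A minimal sketch of both rules on the medication example. The utilities are the ones listed for that example in the expected-utility table of these notes; the function names are illustrative assumptions.

```python
# Utilities per state, in the order (Bacterial, Viral, Stress).
MEDICATION = {
    "Antibiotics": [1.0, -1.0, -1.0],
    "Anti-fever": [0.5, 0.5, -0.5],
    "No medication": [-1.0, -1.0, 0.0],
}


def maximin(table):
    """Pick the action whose worst-case utility is largest."""
    return max(table, key=lambda a: min(table[a]))


def maximax(table):
    """Pick the action whose best-case utility is largest."""
    return max(table, key=lambda a: max(table[a]))


print(maximin(MEDICATION))  # Anti-fever   (worst cases: -1, -0.5, -1)
print(maximax(MEDICATION))  # Antibiotics  (best cases: 1, 0.5, 0)
```

The same table yields two different recommendations, depending only on the attitude assumed.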
Minimax Regret
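Regret = the best possible utility in that state − your utility, so regret is never negative. Take each action's largest regret, then pick the action where it is smallest. Regret plays the role of a loss function, a notion that is big in ML.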
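A sketch of minimax regret on the medication utilities from these notes. It assumes the non-negative sign convention (best utility in the state minus the utility obtained); with the opposite sign the same action is selected, as long as one minimises the largest regret in magnitude. Function names are illustrative.

```python
# Utilities per state, in the order (Bacterial, Viral, Stress).
MEDICATION = {
    "Antibiotics": [1.0, -1.0, -1.0],
    "Anti-fever": [0.5, 0.5, -0.5],
    "No medication": [-1.0, -1.0, 0.0],
}


def regret_table(table):
    """Regret = best utility achievable in that state minus the utility obtained."""
    n_states = len(next(iter(table.values())))
    best = [max(row[s] for row in table.values()) for s in range(n_states)]
    return {a: [best[s] - row[s] for s in range(n_states)] for a, row in table.items()}


def minimax_regret(table):
    """Pick the action whose largest regret is smallest."""
    regrets = regret_table(table)
    return min(regrets, key=lambda a: max(regrets[a]))


print(regret_table(MEDICATION)["Anti-fever"])  # [0.5, 0.0, 0.5]
print(minimax_regret(MEDICATION))              # Anti-fever
```

The per-action worst regrets here are 1.5 (antibiotics), 0.5 (anti-fever) and 2.0 (no medication).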
Averaging (OWA)
Compromise: sort each action's utilities from best to worst, weight them by position, and average. Weights are NOT probabilities: under ignorance there are none. They only encode optimism.
1. OWA weights are not probabilities — under ignorance there are none. They encode the agent's optimism, nothing more.
2. The indifference principle (give every state 1/|S|, then maximise expected utility) quietly invents probabilities. Popular, but contested: if you truly know nothing, why assume every state is equally likely?
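Both points can be sketched on the medication utilities from these notes. The OWA version below assumes the usual ordered-weighted-average reading (weights attach to rank positions, best to worst, not to states), and the weight vectors are invented for illustration.

```python
# Utilities per state, in the order (Bacterial, Viral, Stress).
MEDICATION = {
    "Antibiotics": [1.0, -1.0, -1.0],
    "Anti-fever": [0.5, 0.5, -0.5],
    "No medication": [-1.0, -1.0, 0.0],
}


def owa_score(utilities, weights):
    """Weight the utilities by rank: weights[0] goes to the best outcome."""
    ordered = sorted(utilities, reverse=True)
    return sum(w * u for w, u in zip(weights, ordered))


def owa_choice(table, weights):
    return max(table, key=lambda a: owa_score(table[a], weights))


def indifference_choice(table):
    """Indifference principle: pretend every state has probability 1/|S|."""
    n_states = len(next(iter(table.values())))
    return max(table, key=lambda a: sum(u / n_states for u in table[a]))


print(owa_choice(MEDICATION, [1.0, 0.0, 0.0]))  # Antibiotics (same as maximax)
print(owa_choice(MEDICATION, [0.0, 0.0, 1.0]))  # Anti-fever  (same as maximin)
print(owa_choice(MEDICATION, [0.5, 0.3, 0.2]))  # Anti-fever
print(indifference_choice(MEDICATION))          # Anti-fever
```

Putting all the weight on the first or last rank recovers maximax and maximin, which is one way to see that the weights describe an attitude rather than a belief about states.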
Under ignorance there is no single best rule. Each is justified by a different assumption, and they can disagree. The only universal law: stay coherent with dominance.
Decision under risk
Now the agent has a probability distribution over the states. Suddenly everything gets clean — and there's one rule to rule them all.
Two flavours of the probability p: frequentist (the real long-run frequency) or subjective (the agent's degree of belief). Both feed the same machine.
The expected utility of an action is the probability-weighted average of the utilities it can produce — "how much utility do I expect on average if I take it?"
Under ignorance: many rules, no winner. Under risk: exactly one rule everyone agrees on.
Expected Utility Maximisation — pick the action with the highest EU.
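Written out, using the U and O(a, s) notation that appears elsewhere in these notes. S is the set of states; the symbols A for the set of actions and a* for the chosen action are assumptions of this sketch.

```latex
\[
  EU(a) \;=\; \sum_{s \in S} p(s)\, U\bigl(O(a, s)\bigr),
  \qquad
  a^{*} \in \arg\max_{a \in A} EU(a)
\]
```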
Run the medication example again with probabilities 0.05 / 0.15 / 0.8:
| | Bacterial (.05) | Viral (.15) | Stress (.8) | EU |
|---|---|---|---|---|
| Antibiotics | 1 | −1 | −1 | −0.9 |
| Anti-fever | 0.5 | 0.5 | −0.5 | −0.3 |
| No medication | −1 | −1 | 0 | −0.2 ✓ |
The winner is No medication, even though it is the worst option (jointly, in the Viral case) in two of the three columns! Because Stress is overwhelmingly likely (0.8), EU rewards the action that does best in the world that probably happens. That counter-intuitive result is exactly the kind of thing the professor likes you to explain.
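The table's arithmetic can be reproduced with a short sketch. Probabilities and utilities are the ones given in the notes; the function and variable names are illustrative assumptions.

```python
# State order: (Bacterial, Viral, Stress).
PROBS = [0.05, 0.15, 0.80]
MEDICATION = {
    "Antibiotics": [1.0, -1.0, -1.0],
    "Anti-fever": [0.5, 0.5, -0.5],
    "No medication": [-1.0, -1.0, 0.0],
}


def expected_utility(utilities, probs):
    """Probability-weighted average of one action's utilities."""
    return sum(p * u for p, u in zip(probs, utilities))


eu = {action: expected_utility(us, PROBS) for action, us in MEDICATION.items()}
best = max(eu, key=eu.get)

for action, value in eu.items():
    print(f"{action}: {value:.2f}")
print("Choice:", best)
# Antibiotics: -0.90
# Anti-fever: -0.30
# No medication: -0.20
# Choice: No medication
```

Almost all of the weight sits on the Stress column, where No medication scores 0 and the other two actions lose utility.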
EU averages utilities — so it needs a cardinal scale (§4). On an ordinal scale the average is meaningless.
Why is EU "the" rule? Two theorems
It isn't arbitrary. Two famous results prove a rational agent must behave like an EU-maximiser. The professor won't ask you to prove them — just to state and discuss them.
Theorem 1: Von Neumann–Morgenstern (VNM)
A lottery is a probability distribution over utilities; under risk, every action is a lottery. If preferences over lotteries satisfy four axioms (completeness, transitivity, continuity, independence), then there's a utility function U (unique up to positive affine transformation) with L ≤ M ⟺ EU(L) ≤ EU(M). In words: the agent provably acts as if maximising expected utility.
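"Unique up to positive affine transformation" can be illustrated numerically: rescaling every utility as a·U + b with a > 0 leaves the expected-utility ranking of actions untouched. The sketch reuses the medication numbers from these notes; the constants a = 2 and b = 3 and the function names are arbitrary illustrative choices.

```python
# State order: (Bacterial, Viral, Stress).
PROBS = [0.05, 0.15, 0.80]
MEDICATION = {
    "Antibiotics": [1.0, -1.0, -1.0],
    "Anti-fever": [0.5, 0.5, -0.5],
    "No medication": [-1.0, -1.0, 0.0],
}


def expected_utility(utilities, probs):
    return sum(p * u for p, u in zip(probs, utilities))


def rescale(table, a, b):
    """Apply the positive affine map u -> a*u + b to every utility."""
    if a <= 0:
        raise ValueError("a must be positive")
    return {action: [a * u + b for u in us] for action, us in table.items()}


def eu_ranking(table, probs):
    """Actions sorted from highest to lowest expected utility."""
    return sorted(table, key=lambda act: expected_utility(table[act], probs), reverse=True)


original = eu_ranking(MEDICATION, PROBS)
rescaled = eu_ranking(rescale(MEDICATION, a=2.0, b=3.0), PROBS)
print(original)              # ['No medication', 'Anti-fever', 'Antibiotics']
print(original == rescaled)  # True
```

Because the probabilities sum to 1, every EU is mapped to a·EU + b, so the order of the actions cannot change.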
The catch: VNM assumes the probabilities are already given. But where does p come from?
Theorem 2: Savage (subjective expected utility)
Starting only from preferences over actions (plus axioms like the "sure-thing principle"), with finitely many states, there exists a unique subjective probability Q over states and a utility function such that preference matches EU comparison. So the probabilities aren't an input: they're derived from how the agent chooses. That's the "subjective" in subjective EU.
Savage's Q is the agent's belief and may differ from the true probability. A rational agent's beliefs should at least be coherent, i.e. obey the probability axioms (de Finetti's "Dutch book" argument), but nothing in Savage guarantees that they match the truth. And the assumption that a single, knowable probability even exists isn't universally accepted (hence imprecise / multiple-priors models).
The Exam Lab
Knowing the theory isn't enough — you have to answer his way. Here's how the exam works, the answer skeleton, then four full worked examples in Lecture-1 territory.
Written exam on the course contents — mandatory, you must score ≥ 18 to pass. Then an optional essay: an individual academic document on a topic related to the course, discussed orally, worth up to +4 points. Good news: he will not ask you to prove theorems — he wants understanding, not derivations.
① Define the concept "in your own words but as precisely as possible," using the exact vocabulary (utility, dominance, EU, cardinal/ordinal). ② Apply it to his scenario — show the computation/reasoning. ③ Decide & motivate — state what the rational agent does (or who's right) and why.
Worked example 1
Question: Define, in your own words but as precisely as possible, expected utility and the rule of expected-utility maximisation. Then an agent faces three states, Bacterial (p=.05), Viral (p=.15) and Stress (p=.8), with utilities Antibiotics (1, −1, −1), Anti-fever (.5, .5, −.5), No medication (−1, −1, 0). Which action should a rational agent take? Motivate.
Answer: The expected utility of action a is the probability-weighted average of its outcome-utilities, EU(a) = Σ p(s)·U(O(a,s)). EU maximisation says a rational agent picks the action with the greatest EU. It needs a cardinal scale: averaging ordinal utilities is meaningless.
Worked example 2
Question: Explain the difference between decision under ignorance and decision under risk. Under ignorance, is there a single "best" decision rule? Motivate.
Worked example 3
Question: Define weak and strong dominance. Then for Monkfish (Good 4, Bad 1), Hamburger (3, 3), No main course (2, 2): identify dominated actions, and say whether dominance alone determines the choice. Motivate.
Answer: aⱼ weakly dominates aᵢ if U(O(aᵢ,s)) ≤ U(O(aⱼ,s)) for every state s; it strongly dominates if the inequality is strict for at least one state. A rational agent never picks a dominated action.
Worked example 4
Question: State the Von Neumann–Morgenstern theorem: its assumptions, what it establishes, and its main limitation. Discuss.
Answer: Assumptions: the preference ≤ over lotteries satisfies completeness, transitivity, continuity (if L ≤ M ≤ N, some mix of L and N is indifferent to M) and independence (mixing both sides with a third lottery preserves the preference). What it establishes: U exists, unique up to positive affine transformation, with L ≤ M ⟺ EU(L) ≤ EU(M). So an agent obeying the axioms provably behaves as an EU-maximiser, and its scale is cardinal.
• What's the cascade decision → decision-making → DSS, and why "support" not "replace"?
• Define a utility function; which three assumptions make it possible?
• Why does expected utility require a cardinal scale?
• Give the menu matrix: which action does dominance remove, and why can't it finish the job?
• State maximin, maximax, minimax regret — what attitude does each encode?
• Why is the indifference principle "contested"?