Defining Success: The Four Layers of Goals

To build a successful AI system, developers must untangle and align goals at different levels.

  • Organisational Goals: High-level business or social objectives (e.g., increasing profit, saving lives, or improving social justice).
  • System/Feature Goals: Specific outputs the system provides (e.g., “Detect cancer in scans” or “Provide music recommendations”).
  • User Outcomes: How well the system serves the users’ needs (e.g., saving the user time or helping them make better decisions).
  • Model Properties: Technical quality metrics of the ML model (e.g., accuracy, inference time, or training cost). These are often disconnected from high-level organisational success.

The Relationship Between World and Machine

Requirements Engineering (RE) is the process of deciding precisely what to build. It is widely considered the hardest part of software development because errors here cripple the system more than bugs in code.

Core Definitions

  • The World (Environment): The physical context of users and processes where the software is deployed.
  • The Machine (Software): The code and ML components we build.
  • Shared Phenomena: The interface between the two. Inputs (sensors such as Lidar, cameras, GPS) and Outputs (actuators such as engine signals).

REQ, ASM and SPEC

  1. Requirements (REQ): Desired states of the environment (e.g., “The vehicle must not veer off the lane”).
  2. Specifications (SPEC): What the software must implement, expressed via shared phenomena (e.g., “Identify lane markings and generate steering commands”).
  3. Assumptions (ASM): Properties of the environment we assume to be true. These bridge the gap between SPEC and REQ (e.g., “Sensors provide accurate data,” “The steering wheel is functional”).

Tip

A system failure occurs if the Assumptions are violated, even if the Specification is perfectly implemented.

Why AI Systems Fail: Requirement and Assumption Violations

Failures in ML systems often stem from a poor understanding of environmental assumptions rather than coding bugs.

  • Concept Drift: The environment evolves over time, meaning the underlying data distribution changes and the original assumptions may no longer hold.
  • Adversarial Attacks: Malicious actors may deliberately try to violate system assumptions (e.g., putting stickers on stop signs to confuse a vision model).
  • Feedback Loops: The system acts on the environment, which then changes the data it receives, potentially reinforcing biases.
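Concept drift, the first failure mode above, can be monitored at runtime by comparing production inputs against the training distribution. A minimal sketch (the feature values, the standardised mean-shift statistic, and the threshold of 3.0 are all illustrative assumptions, not a production-grade drift detector):

```python
import statistics

def drift_score(train_values, prod_values):
    """Standardised mean shift between training and production data.

    A large score suggests the training-time assumption about the
    input distribution no longer holds (concept/data drift).
    """
    mu = statistics.mean(train_values)
    sigma = statistics.stdev(train_values)
    return abs(statistics.mean(prod_values) - mu) / sigma

# Hypothetical feature values: production data has shifted upward.
train = [10.0, 11.0, 9.5, 10.5, 10.2, 9.8]
prod = [14.0, 15.1, 13.8, 14.6]

if drift_score(train, prod) > 3.0:  # threshold is an assumption
    print("Drift detected: retrain or revisit assumptions")
```

Real systems typically use richer distributional tests per feature, but the principle is the same: a violated environmental assumption should trigger an alarm, not a silent failure.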

Case Study: Lufthansa 2904 Crash

  • REQ: Enable reverse thrust only when the plane is on the ground.
  • SPEC: Enable reverse thrust if wheels are turning or weight is sensed on gears.
  • ASM: Wheels turn if and only if the plane is on the ground.
  • Failure: On a rainy day, hydroplaning prevented the wheels from turning even though the plane was on the ground. The assumption was violated, the software overrode the pilot, and the plane crashed.
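The case above can be reduced to a toy boolean model. A hedged sketch (the function names and the iff-formulation of the requirement are my own simplifications, not the actual avionics logic):

```python
def spec_reverse_thrust(wheels_turning, weight_on_gears):
    # SPEC: enable reverse thrust if wheels turn or weight is sensed.
    return wheels_turning or weight_on_gears

def req_satisfied(on_ground, thrust_enabled):
    # REQ (simplified to an iff): reverse thrust exactly when the
    # plane is on the ground, since braking is needed on landing.
    return thrust_enabled == on_ground

# Normal landing: ASM holds (wheels turn iff the plane is on the ground).
assert req_satisfied(on_ground=True,
                     thrust_enabled=spec_reverse_thrust(True, True))

# Hydroplaning: on the ground, but neither sensor fires. The SPEC
# behaves exactly as written, yet the requirement is violated.
assert not req_satisfied(on_ground=True,
                         thrust_enabled=spec_reverse_thrust(False, False))
```

The point of the model: `spec_reverse_thrust` is implemented correctly in both scenarios; only the environmental assumption linking sensor readings to the real world broke.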

Quality Requirements (Non-Functional Requirements)

While Functional Requirements describe what a system does, Non-Functional Requirements (NFRs) describe how well it operates.

  • Fairness: Ensuring the system operates without systematic prejudice. Bias can enter at any stage: data collection, processing, or evaluation.
  • Explainability: The extent to which a human can understand the internal mechanics of the AI’s decisions.
  • Safety: The absence of conditions that make the system dangerous.
  • Robustness/Fault Tolerance: The ability to continue operating when components fail.

The Process of Establishing Requirements

Step-by-Step Execution

  1. Identify Entities: List all environmental entities and ML components.
  2. State Requirements (REQ): Define the desired effect on the environment.
  3. Identify the Interface: Determine the shared phenomena (inputs/outputs).
  4. Identify Assumptions (ASM): List what must be true about the environment.
  5. Develop Specifications (SPEC): Write the software instructions.
  6. Verification: Check whether ASM ∧ SPEC ⊨ REQ, i.e., that the assumptions combined with the specification guarantee the requirement.
  7. Iterate: If the logic fails, go back to step 1.
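For small boolean models, the verification step (does ASM ∧ SPEC entail REQ?) can be checked by exhaustive enumeration of world states. A sketch using an obstacle-braking example of my own invention, not one from the source:

```python
from itertools import product

def asm(obstacle_ahead, sensed):
    # ASM: sensors report the world accurately.
    return sensed == obstacle_ahead

def spec(sensed):
    # SPEC: emit a brake signal whenever an obstacle is sensed.
    return sensed

def req(obstacle_ahead, brake_signal):
    # REQ: the vehicle must brake when an obstacle is ahead.
    return (not obstacle_ahead) or brake_signal

def verify():
    """Check ASM and SPEC |= REQ over every boolean world state."""
    for obstacle, sensed in product([False, True], repeat=2):
        if asm(obstacle, sensed):            # worlds where ASM holds
            if not req(obstacle, spec(sensed)):
                return False                 # counterexample found
    return True

assert verify()  # entailment holds under these toy definitions
```

If `verify()` returned a counterexample instead of `False`, it would point to exactly the world state where the argument breaks, which is what step 7's iteration needs.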

Elicitation Techniques

  • Stakeholder Interviews: Talk to everyone affected (users, owners, regulators).
  • Ethnographic Studies: Passively observe users in their actual environment.
  • Personas: Create fictional users to explore needs from different perspectives.
  • Prototypes: Show stakeholders a mock-up to identify misunderstandings early.