December 16, 2025

Expert-in-the-Loop AI for Sewer Inspections: Supporting Engineering Decisions

Elakkiya Ramarajan

Lead Data Scientist


Over the past few years, I’ve worked on AI systems that have processed millions of frames of sewer CCTV footage. In that time, I’ve seen defects that were genuinely critical, and others that only looked that way. Spider webs mistaken for cracks. Shadows mistaken for leaks. It’s a reminder that underground infrastructure is messy, and interpretation matters.

That experience shapes how we think about AI at VAPAR.

Our models can review and classify inspection footage at scale. They can identify cracks, roots, deposits, and infiltration far faster than any manual process. But we have consistently found that the strongest outcomes happen when AI supports engineers, rather than trying to replace them.

Designing AI for real-world inspection

Sewer inspection is not a laboratory problem. Lighting varies, flow conditions change, materials age differently, and every network has its own construction history. Trying to automate every edge case is not only inefficient; it also ignores the value of engineering judgement.

Instead, we design AI as the first pass. The system processes footage, surfaces potential defects, and produces structured outputs. Engineers then review those results, apply context, and make decisions that reflect asset criticality, risk tolerance, and budget realities.
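To make the idea of "structured outputs" followed by engineer review more concrete, here is a minimal sketch in Python. It is not VAPAR's actual schema: the `Detection` class, its field names, the 1 to 5 severity scale, and the `apply_review` helper are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Detection:
    """One potential defect surfaced by the AI first pass (hypothetical schema)."""
    pipe_id: str
    frame_index: int       # frame of the CCTV footage that supports the finding
    defect_type: str       # e.g. "crack", "roots", "deposits", "infiltration"
    model_severity: int    # illustrative scale: 1 (minor) to 5 (severe)
    confidence: float      # model score in [0, 1]
    reviewed_severity: Optional[int] = None  # set only when an engineer reviews
    review_note: str = ""

    @property
    def final_severity(self) -> int:
        # The engineer's grading, when present, takes precedence over the model's.
        if self.reviewed_severity is not None:
            return self.reviewed_severity
        return self.model_severity


def apply_review(detection: Detection, severity: int, note: str) -> Detection:
    """Record an engineer's decision while keeping the model's output intact."""
    detection.reviewed_severity = severity
    detection.review_note = note
    return detection


# A model flags a likely crack; the engineer recognises a spider web.
flagged = Detection("P-001", 1520, "crack", model_severity=4, confidence=0.81)
apply_review(flagged, severity=1, note="Spider web across the lens, not a crack")
print(flagged.final_severity, "| model said:", flagged.model_severity)
```

Keeping the model's grade alongside the reviewed grade, rather than overwriting it, is one way to keep each decision traceable to both the footage and the person who made it.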

The accuracy paradox in infrastructure data

In sewer CCTV inspection, the data is never evenly distributed. Most pipes are not about to fail. Many have minor defects, historical scars, or cosmetic issues that look concerning on video but rarely pose an immediate risk. Truly critical defects are rare, but they are the ones that drive collapses, emergency callouts, and unplanned capital spend.

In a typical inspection program, an asset owner might inspect thousands of pipe segments in a year. The majority will fall into acceptable or moderate condition. A much smaller subset will contain severe cracking, fracturing, deformation, or infiltration that materially increases risk.

This is where the accuracy paradox becomes very real.

An automated system could label nearly every defect as non-critical and still achieve a high overall accuracy score, simply because non-critical defects dominate the data. On paper, it looks impressive. In reality, it may be missing the handful of pipes that matter most. From an asset management perspective, that is a failure, not a success.

We see this often in early-stage AI deployments across the industry. A model performs well on the “common cases” but struggles with the rare, high-consequence defects. Yet those rare defects are exactly what engineers are trying to find when they commission CCTV programs.
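The accuracy paradox described above is easy to reproduce with a few lines of arithmetic. The sketch below uses invented numbers (1,000 pipe segments, 20 of them critical) and a deliberately naive "model" that never flags anything as critical. None of these figures come from a real inspection program.

```python
def accuracy(y_true, y_pred):
    """Share of all segments that were labelled correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Share of truly critical segments that were actually flagged."""
    true_pos = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    return true_pos / actual_pos if actual_pos else 0.0


# Hypothetical program: 1 = critical, 0 = non-critical.
y_true = [1] * 20 + [0] * 980
# Naive model: labels every segment as non-critical.
y_pred = [0] * 1000

print(f"accuracy: {accuracy(y_true, y_pred):.1%}")  # 98.0%
print(f"recall:   {recall(y_true, y_pred):.1%}")    # 0.0%
```

In this toy case the model is right 98% of the time and still finds none of the 20 segments an engineer would most want to see.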

At VAPAR, we have seen examples where a small longitudinal crack in a concrete pipe appears visually similar to a superficial mark. In isolation, the pixel pattern looks minor. But when that defect appears repeatedly near joints, aligns with known groundwater issues, and sits on a critical trunk main, the risk profile changes entirely. No accuracy metric captures that context on its own.

Why inspection metrics must reflect engineering reality

For that reason, we do not optimise our models around accuracy alone. Instead, we focus on how the system behaves in real inspection workflows.

Precision matters because engineers do not want to waste time reviewing hundreds of false positives. If every shadow or debris line is flagged as a severe defect, trust in the system erodes quickly.

Recall matters because missing a critical defect has far greater consequences than reviewing an extra false alert. A single missed collapse-prone pipe can trigger emergency works that cost more than an entire inspection program.

We have seen this trade-off play out in real projects. When recall is pushed too high, engineers are flooded with noise. When precision is pushed too high, subtle but important defects are filtered out. Neither extreme is acceptable in practice.
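One common way this trade-off shows up is in the confidence threshold used to decide which detections are flagged. The following sketch assumes a model that outputs a score per detection and a flagging rule of "score at or above the threshold". The scores, labels, and threshold values are invented for illustration and do not describe how any particular product is configured.

```python
def precision_recall_at(scores, labels, threshold):
    """Precision and recall when every score >= threshold is flagged as critical."""
    flagged = [s >= threshold for s in scores]
    truth = [bool(label) for label in labels]
    tp = sum(f and t for f, t in zip(flagged, truth))
    fp = sum(f and not t for f, t in zip(flagged, truth))
    fn = sum((not f) and t for f, t in zip(flagged, truth))
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return precision, recall


# Hypothetical model scores and ground truth (1 = genuinely critical).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]

for threshold in (0.25, 0.50, 0.85):
    p, r = precision_recall_at(scores, labels, threshold)
    print(f"threshold {threshold:.2f}: precision {p:.2f}, recall {r:.2f}")
```

On this toy data, the lowest threshold catches every critical defect but adds false alerts to the review queue, while the highest threshold produces no false alerts but misses half of the critical defects.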

This is where expert-in-the-loop design becomes essential. In VAPAR, engineers can review how defects have been classified, apply their knowledge of pipe material, age, location, and consequence of failure, and adjust outcomes accordingly. A hairline crack in a low-risk lateral is treated differently from the same crack in a high-consequence main.

Those adjustments are not workarounds. They are part of the system by design. They ensure the outputs align with how asset owners actually assess risk, plan rehabilitation, and justify investment.
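The idea that the same crack warrants a different response depending on where it sits can be sketched as a simple grade-times-consequence rule. This is an illustrative assumption rather than a description of VAPAR's logic: the `PipeContext` class, the 1 to 5 scales, the score thresholds, and the priority labels are all hypothetical, and a real assessment would also weigh pipe material, age, and location.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipeContext:
    """Engineering context for a pipe segment (hypothetical fields)."""
    asset_class: str             # e.g. "lateral", "trunk main"
    consequence_of_failure: int  # illustrative scale: 1 (low) to 5 (high)


def review_priority(defect_grade: int, ctx: PipeContext) -> str:
    """Combine a defect grade (1 to 5) with consequence of failure."""
    score = defect_grade * ctx.consequence_of_failure
    if score >= 15:
        return "urgent engineering review"
    if score >= 6:
        return "scheduled review"
    return "routine monitoring"


hairline_crack_grade = 2
lateral = PipeContext("lateral", consequence_of_failure=1)
trunk_main = PipeContext("trunk main", consequence_of_failure=5)

print(review_priority(hairline_crack_grade, lateral))     # routine monitoring
print(review_priority(hairline_crack_grade, trunk_main))  # scheduled review
```

The defect grade is identical in both calls; only the context differs, which is the kind of information an engineer supplies and a pixel-level classifier does not have.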

Metrics that support decisions, not just models

Ultimately, inspection data only has value if it supports confident decisions. Engineers need to trust that critical defects are surfaced, consistently graded, and traceable back to the footage that supports them.

By combining AI-driven consistency with engineering judgement, we aim to produce inspection outcomes that reflect how infrastructure is managed in the real world, not just how models are scored in isolation.

For asset owners, that means fewer surprises, better forward works planning, and decisions that stand up to scrutiny when budgets, regulators, and communities are involved.

Engineers remain the decision-makers

AI can process more data, faster, and more consistently than people ever could. But engineers bring interpretation, prioritisation, and accountability. In infrastructure, those qualities are not optional.

Our goal is not full autonomy. It is to give engineers better visibility, better evidence, and more time to focus on forward works planning and rehabilitation decisions. When AI removes repetitive effort, the result is better decisions and more resilient networks.