🎯 Copilot Vision — Desired Outcome

SC Copilot — AI that guides frontline workers through the full work cycle using vision, voice & action

From reactive documentation to proactive AI assistance — helping workers capture, verify, and act on visual evidence in real time. Delivered as stacked capabilities across three domains: Compliance Copilot (hazard detection, evidence verification), Quality Copilot (photo gates, OCR, reference comparison), and Operations Copilot (continuity, repeat violations, batch analysis). Not a big-bang launch — emergent from compounding small bets.
🛡️ Compliance Copilot
Hazard detection · Evidence verification · Issue creation · Voice capture (Scenario A & B)
📸 Quality Copilot
Photo quality gates · Reference comparison · OCR / Barcode · Batch analysis (Scenario A)
⚙️ Operations Copilot
Repeat violations · Visual continuity · Sentinel · Photo organisation (Scenario A)
Opportunities (Customer Problems)
📋 Opp 1 — Compliance

Photo evidence is captured but never meaningfully verified

Managers can't review every inspection photo at scale. Non-compliant evidence is routinely missed. 'Passed' inspections don't reflect real conditions.
💰 AusPost $86K 💰 Husqvarna $82K 👥 Prince · DEKRA · Toyota
🔥 High Priority
Tier 2 Quality Operations H6
JTBD
"When I need to verify photo evidence at scale, help me catch non-compliance without reviewing every image."
Solution A
AI Visual Pass/Fail against reference images
Validate Medium Bet
🧪 Experiment Test Gemini Flash accuracy against reference images on 50 real SC inspection photos. Target: ≥80% precision before user testing.
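A minimal evaluation harness for this experiment could look like the sketch below, assuming the google-genai Python SDK and a hand-labelled set of (reference, photo, verdict) triples. The model name, prompt wording and PASS/FAIL protocol are illustrative placeholders, not a committed design.

```python
# Sketch: score Gemini Flash pass/fail verdicts against human labels.
# Assumes the google-genai SDK (pip install google-genai) with an API key
# in the environment; model name and prompt are illustrative placeholders.
from pathlib import Path
from google import genai
from google.genai import types

client = genai.Client()

PROMPT = (
    "The first image is the reference standard. The second is an inspection "
    "photo. Reply PASS if the photo shows the same compliant state as the "
    "reference, otherwise FAIL. Reply with one word only."
)

def verdict(reference: Path, photo: Path) -> str:
    resp = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=[
            PROMPT,
            types.Part.from_bytes(data=reference.read_bytes(), mime_type="image/jpeg"),
            types.Part.from_bytes(data=photo.read_bytes(), mime_type="image/jpeg"),
        ],
    )
    return "FAIL" if "FAIL" in resp.text.upper() else "PASS"

def precision(labelled: list[tuple[Path, Path, str]]) -> float:
    """labelled: (reference, photo, human_verdict) for the 50 SC photos."""
    tp = fp = 0
    for ref, photo, human in labelled:
        if verdict(ref, photo) == "FAIL":   # model flags non-compliance
            if human == "FAIL":
                tp += 1                     # correctly flagged
            else:
                fp += 1                     # false alarm
    return tp / (tp + fp) if (tp + fp) else 0.0  # target: >= 0.80
```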
Solution B
Configurable compliance rules per inspection question
Explore Small Bet
🧪 Experiment Concept test with Prince Retail & DEKRA: can a manager define "what good looks like" in under 5 mins? What level of control do they need?
Solution C
Manager review queue with AI pre-screening
Explore Medium Bet
🧪 Experiment Interview: does the manager want to review AI-flagged photos, or just trust the AI verdict? Map their current review workflow first.
💸 Opp 2 — Quality

Photo quality problems cause real financial damage

Workers submit photos that fail client standards. No real-time validation before they leave site. Result: rejected work, payment disputes, cash-flow damage.
🚨 £1M dispute ⚠️ Critical
🔥 High Priority
Tier 1 Quality Copilot H5 H6 Scenario A
JTBD
"When my worker is on site, validate that their photo meets the client standard before they leave."
Solution A
Real-time quality check before photo is submitted
Validate Small Bet
🧪 Experiment Prototype: show the worker a "your photo needs to show the depth gauge clearly" prompt. Test with field service workers — does it help or frustrate?
Solution B
Client standard profiles — required elements per job type
Explore Small Bet
🧪 Experiment Interview critical signal customer (£1M disputes): what are the specific photo requirements per job? Can they be encoded as rules?
Solution C
Pre-departure alert — flag inadequate photos before site exit
Explore Small Bet
🧪 Experiment Concept test: would a "3 photos need re-taking before you leave" end-of-inspection alert change worker behaviour? Desirability test with 4 customers.
⚡ Opp 3 — Reporting

Reporting a visual issue requires too much manual effort

Frontline workers must write full issue forms from scratch. The photo they just took doesn't fill any field. Issues go unreported or are inconsistently documented.
✅ CEA live 🎙️ Voice: in CEA milestone 👥 Coles · Thermosash · CBG · AusPost
✅ In Progress
Tier 1 Compliance Copilot H1 H2 Scenario B
JTBD
"When I see something wrong on site, help me report it quickly without the documentation overhead."
Solution A
AI Issue Creation — photo → structured issue
🟢 CEA Live Small Bet ✓
🧪 Experiment CEA with Coles, Thermosash, CBG, Scott's Miracle Gro. Measure: issue creation time, completion rate, field accuracy vs manual. Target: CEA→GA by Q2 2026.
Solution B
Video capture → AI-generated structured issue
Explore Medium Bet
🧪 Experiment Concept test with Thermosash (actively demoed by Recovision). Would video better capture the issue context than a photo? When does video win?
Solution C
Photo → inspection answer auto-fill (in-template)
Explore Medium Bet
🧪 Experiment Interview: does the worker want AI to suggest the inspection answer from the photo, or just from the issue? Map the in-inspection CV interaction moment separately.
🕐 Opp 4 — Continuity

Inspection photos have no continuity between visits

Each inspection is a fresh snapshot. No way to see what the site/asset looked like last time, identify deterioration, or track conditions over time. The result: reactive capex instead of planned maintenance.
💰 Network Rail $363K 💰 Siemens $227K 👥 Royal Caribbean
🔍 Explore
Tier 2 Operations Copilot H7
JTBD
"When my team returns to the same site or asset, show me what has changed since last time."
Solution A
Sentinel — QR-anchored time-series visual monitoring
Prototype Exists Medium Bet
🧪 Experiment Show Sentinel prototype to Royal Caribbean / Network Rail. Key question: do they scan a QR code per asset, or is location enough? What triggers a "visit"?
Solution B
Before/after comparison view per inspection question
Explore Medium Bet
🧪 Experiment Concept test: side-by-side photo comparison in the inspection UI — do managers notice deterioration faster? Test with Siemens (tool condition tracking use case).
Solution C
Asset visual history timeline in asset record
Explore Large Bet
🧪 Experiment Interview Network Rail & Royal Caribbean: do they make refurb decisions from the asset record or from inspection reports? Where does the photo need to live?
🔄 Opp 5 — Patterns

Repeat violations aren't surfaced across inspections

Managers at 280+ locations can't see which items are flagged for the 2nd or 8th time. Each inspection is treated as isolated. Repeat risks go unnoticed until they become serious incidents.
👥 280+ locations 👥 Volvo $47K
📊 Med Priority
Tier 2 Operations Copilot H7 Scenario A
JTBD
"When I review my monthly report, surface which items have been flagged before and how many times."
Solution A
Repeat flag indicators on inspection items (count badges)
Validate Small Bet
🧪 Experiment Prototype: "⚠️ This item has been flagged 4 times in the last 3 months" shown in the inspection. Does a repeat badge change manager escalation behaviour in usability test?
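The aggregation behind the badge is simple enough to sketch before the usability test; a minimal version, assuming a hypothetical table of flagged items with location_id / item_id / flagged_at columns:

```python
# Sketch: compute the repeat-flag counts behind the badge.
# The DataFrame columns (location_id, item_id, flagged_at) are
# illustrative assumptions about how SC stores flagged items.
import pandas as pd

def repeat_counts(flags: pd.DataFrame, months: int = 3) -> pd.DataFrame:
    cutoff = pd.Timestamp.now() - pd.DateOffset(months=months)
    recent = flags[flags["flagged_at"] >= cutoff]
    counts = (
        recent.groupby(["location_id", "item_id"])
        .size()
        .rename("times_flagged")
        .reset_index()
    )
    return counts[counts["times_flagged"] >= 2]  # only repeats earn a badge
```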
Solution B
Cross-inspection violation history panel per location
Explore Medium Bet
🧪 Experiment Interview 280+ location retailer: do they want this in the inspection itself, or in a separate analytics/reporting view? Who reviews it — the auditor on-site or the manager remotely?
Solution C
Location-level violation trend analytics dashboard
Explore Large Bet
🧪 Experiment Concept test with Volvo Cars & biscuits-bouvard: if SC showed "Top 5 recurring violations this month across 50 locations," would this change their inspection program design?
📁 Opp 6 — Organisation

Photo organisation blocks reporting and rollouts

Inspections export 800+ randomly named files with no link to defect, question, or asset. Manual sorting hours are a 'stopper' preventing full rollout of inspection programs.
💰 IKEA $1M 💰 DHL $646K 🚫 Stopper
⚡ Med Priority
Tier 3 Operations Copilot H6
JTBD
"When I download my inspection photos, organise them so I can write my report without spending hours sorting."
Solution A
Smart export — photos grouped by defect / question / location
Validate Small Bet
🧪 Experiment Interview bisindustries.com APAC + IKEA EMEA: what is their ideal folder/file structure? How much of their pain is naming vs sequencing vs metadata? Can we replicate their workflow in a prototype?
Solution B
Automated photo naming — site · question · timestamp
Explore Small Bet
🧪 Experiment Usability test: compare current unnamed export vs auto-named export. Does naming alone solve the problem or is matching to defects the core pain?
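If naming alone turns out to be the core pain, the convention itself is cheap to prototype; a minimal sketch, where the slug rules and collision suffix are assumptions:

```python
# Sketch: deterministic photo naming as site_question_timestamp.
# Slug rules and the collision suffix (-01, -02, ...) are assumptions.
import re
from datetime import datetime

def slug(text: str) -> str:
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

def photo_name(site: str, question: str, taken_at: datetime, seq: int = 1) -> str:
    ts = taken_at.strftime("%Y%m%d-%H%M%S")
    return f"{slug(site)}_{slug(question)}_{ts}-{seq:02d}.jpg"

# photo_name("Store 214 - Richmond", "Fire exit clear?", datetime(2026, 1, 5, 9, 30))
# -> "store-214-richmond_fire-exit-clear_20260105-093000-01.jpg"
```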
Solution C
Report-ready photo bundles with defect metadata attached
Explore Medium Bet
🧪 Experiment Concept test with DHL AMER: if SC exported a folder per defect with photo + context, would that replace their manual PowerPoint report process?
🔢 Opp 7 — OCR

Data locked in photos has to be manually typed

Workers photograph meters, labels, serial numbers, and barcodes in the field — then manually transcribe the data into inspection fields. Slow, error-prone, and ignores the photo as a data source.
👥 Coles 👥 CHEP
⚡ Med Priority
Tier 3 Quality Copilot H3 H4
JTBD
"When I capture a reading, code, or identifier in the field, put it into the platform without me typing it."
Solution A
OCR auto-fill for meters, labels & serial numbers
Validate Small Bet
🧪 Experiment Test Gemini Flash OCR accuracy on Coles odometer photos (David Bailey dataset — 136 images). Target: ≥90% read accuracy at varying lighting/angles before customer testing.
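The scoring half of this test is worth pinning down before running it; a sketch of read accuracy with digit normalisation, where the prediction and ground-truth shapes are assumptions:

```python
# Sketch: score OCR reads against ground truth for the odometer set.
# Assumes predictions arrive as raw strings (e.g. from Gemini Flash) and
# ground truth as a {filename: reading} dict; both shapes are assumptions.
import re

def normalise(reading: str) -> str:
    # Digits only, so "102,345 km" and "102345" compare equal.
    return re.sub(r"\D", "", reading)

def read_accuracy(predictions: dict[str, str], truth: dict[str, str]) -> float:
    hits = sum(
        normalise(predictions.get(f, "")) == normalise(t)
        for f, t in truth.items()
    )
    return hits / len(truth)  # target: >= 0.90 across lighting/angle variants
```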
Solution B
Barcode / QR code scan → auto-populate inspection field
Validate Small Bet
🧪 Experiment Concept test with Hazeldenes: barcode scan → field validation UI. Does the "human-in-the-loop check" model (flag only mismatches) match their mental model?
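The "flag only mismatches" model is small enough to sketch directly, assuming pyzbar + Pillow for decoding; the return shape is illustrative:

```python
# Sketch of the human-in-the-loop model: decode the barcode and only
# surface a check when it disagrees with the expected field value.
# Assumes pyzbar + Pillow; the return shape is an illustrative assumption.
from pyzbar.pyzbar import decode
from PIL import Image

def scan_and_check(photo_path: str, expected: str) -> dict | None:
    codes = [c.data.decode("utf-8") for c in decode(Image.open(photo_path))]
    if expected in codes:
        return None  # silent pass: auto-populate the field, no prompt
    # Mismatch (or unreadable): ask the worker to confirm, don't auto-fill.
    return {"expected": expected, "scanned": codes, "needs_review": True}
```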
Solution C
Count recognition — pallets, items, inventory from photo
Explore Medium Bet
🧪 Experiment Explore CHEP pallet counting use case: interview to understand required accuracy level. Is ±5% acceptable or does it need to be exact for billing/inventory purposes?
⚠️ Opp 8 — Hazards

Workers can't get expert-level hazard identification without a specialist present

Less experienced workers miss risks that experts would catch immediately. Competitors Safety Pulse and Echo Portal are actively winning deals on this capability today.
💰 nextcenturi $121K "game changer" 🏆 Safety Pulse · Echo Portal winning
⚡ High Competitive Risk
Tier 1 Compliance Copilot H1 H2 Scenario B
JTBD
"When a worker photographs a job site, automatically identify the hazards visible so any worker gets expert-level safety analysis."
Solution A
AI hazard detection overlay on captured photos
Validate Small Bet
🧪 Experiment Test hazard detection accuracy using Gemini Flash on nextcenturi.com job site photos. Benchmark against what a safety expert would identify. What is the minimum viable accuracy?
Solution B
Hazard checklist pre-population from photo analysis
Explore Small Bet
🧪 Experiment Concept test: "AI spotted 3 hazards and pre-filled your checklist" — does the worker trust and act on it? Or does it create over-reliance risk? Test liability framing with 3 safety-focused customers.
Solution C
Risk score surfaced to supervisor based on site photo analysis
Explore Large Bet
🧪 Experiment Interview Fortis + unnamed evaluator: is supervisor-level risk scoring more valuable than worker-level hazard prompting? Who is the primary buyer for this use case?
🎙️ Opp 9 — Capture Speed

Inspections are slow because workers must stop, type, and navigate — voice & photo shortcuts are missing

Workers conducting walkthroughs spend more time documenting than observing. A key structural problem: templates force sequential completion, but workers observe in physical walk order — voice + AI decouples the two, letting workers narrate freely while AI routes to the right fields. New use case confirmed: ambient meeting recording (toolbox talks, site pre-starts) → auto-populated templates.
🚨 Thermosash deal risk 🏆 GoAudits · Keen Research 👥 Coles · nextcenturi 🚇 Transport for London — "game changer" ×2 🎙️ qal.com.au — ambient walk narration → template
🔥 High Priority
Tier 1 Compliance Copilot Quality Copilot H9 H10 Scenario A Scenario B
JTBD
"When I'm conducting a walkthrough, help me capture findings using voice and photos so I spend less time typing and never miss documenting something."
Solution A
Voice-narrated inspection — AI transcribes speech into answers in real time
Validate Small Bet
🧪 Experiment Concept test with field-heavy customers (Coles, nextcenturi, Transport for London, qal.com.au): would voice narration during walkthrough change completion rate? Does AI transcription accuracy meet the bar? Test in noisy environments. Key probe: does free-form narration in walk order — mapped to template fields by AI — reduce friction vs sequential form entry?
Solution B
One-tap photo + voice moment capture → structured finding
Explore Medium Bet
🧪 Experiment Prototype: "hold camera + speak what you see → AI fills the form". Measure time-to-complete vs current method. Target: ≥30% faster than manual. Test with 4 inspection-heavy customers.
Solution C
AI inspection co-pilot — live guidance, missed item prompts & voice summary
Explore Large Bet
🧪 Experiment Interview: when does a worker want the AI to guide vs just capture? Does real-time prompting ("you haven't photographed the electrical panel yet") feel helpful or intrusive? Map the trust/autonomy threshold.
Solution D
Ambient meeting recording → template auto-fill (toolbox talks, pre-starts)
Explore Large Bet
🧪 Experiment Thermosash use case: record full toolbox talk / site pre-start, AI auto-populates the relevant template. SC "Voice-to-Voice hackathon" built a proof of concept. Interview: do safety managers currently transcribe meetings manually? What's the time cost? Trust question: what level of AI accuracy is needed before they'd stop reviewing every field?
🧤 Opp 10 — Hands-Free

Workers in PPE, gloves, or at height cannot type — and SafetyCulture is physically unusable for them today

For workers at height, wearing gloves, or carrying equipment, typing into a phone is not just slow — it is impossible. Voice is the only viable input modality. This is a physical access problem, not a preference. Competitive urgency: Ideagen Mazlan shipped explicit PPE voice (Dec 2025) with L'Oréal, NASA, Tesla as reference customers. SC has named customers waiting. Paper persists at 90% in PPE-heavy environments despite digital rollout — voice is the missing unlock.
🏗️ BAI Communications — gloved tower workers 🚒 Fire Rescue Victoria — PPE, hands occupied 📱 iOS P0: voice-to-text for PPE scenarios 🚇 Transport for London — confirmed feasibility blocker 💰 $15.3M · 45 customers (Twine Dec 2025–Mar 2026) ⚠️ Ideagen Mazlan shipped PPE voice — Dec 2025
🔍 Discovery
⚠️ Competitor shipped — accelerate
Tier 1 H11 Scenario A
JTBD
"When I'm working with gloves on, wearing PPE, or at height, I can't type — help me capture findings and complete inspections using only voice."
Solution A
Hands-free voice narration — one-tap mic, speak findings, AI fills the form
Explore Small Bet
🧪 Experiment Direct probe Q6e with BAI Communications, Fire Rescue Victoria, and Transport for London participants. Do they currently skip documentation entirely due to PPE? Would single-tap voice change that? Test in simulated gloved conditions. Add trust/accuracy probe: if transcription gets it wrong 1-in-5 times in noise, would they still use it? Note: TfL flagged abuse risk (vague dictation) — test fidelity guardrails in prototype.
Solution B
Wearable or Bluetooth trigger — voice-activate SC without touching the screen
Explore Medium Bet
🧪 Experiment Interview: which industries have the highest gloves/PPE density? Map the hardware constraints. Is the phone even accessible, or does this require a watch/earpiece integration path?
Solution C
Always-on ambient mode — AI listens and logs as the worker moves through site
Explore Large Bet
🧪 Experiment Long-term vision: Athena-style ambient listening (Gemini + LiveKit) running in background during site walkthrough. Requires robust STT in industrial noise (RNNoise + Deepgram Nova-3 + Whisper.cpp offline). Interview: would workers accept always-on listening at work?
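The offline leg of that stack can be probed cheaply by shelling out to the whisper.cpp CLI; a minimal sketch, noting the binary name varies by release and the model path is an assumption:

```python
# Sketch: offline transcription via the whisper.cpp example CLI.
# The binary is `main` in older releases and `whisper-cli` in newer ones;
# model path and 16 kHz mono WAV preprocessing are assumptions.
import subprocess

def transcribe_offline(wav_path: str, model: str = "models/ggml-base.en.bin") -> str:
    result = subprocess.run(
        ["./whisper-cli", "-m", model, "-f", wav_path, "--no-timestamps"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```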
🔀 Opp 11 — Sequential Mismatch

Inspection templates are sequential but field observation is not — workers are forced to fight the form

Digital inspection forms force workers to answer questions in template order — but physical sites don't work that way. Workers notice issues in the order they encounter them, not the order the template was built in. The current flow forces scrolling through entire templates or completing findings from memory after the walkthrough. At scale (200-site facilities managers, 4,000 weekly routes), this sequential mismatch inflates inspection time and drives abandonment. Voice changes this entirely: workers narrate observations in physical order and AI maps each to the correct template field — decoupling physical observation from digital form order. This is a structural redesign of how inspections work, not a convenience feature.
📊 Twine intelligence — Dec 2025–Mar 2026 🎙️ Josh Roby, qal.com.au — "just an admin exercise" 🏢 200-site facilities managers · 4,000 weekly routes
🔍 Discovery
Tier 1 Compliance Copilot Quality Copilot H12 Scenario B
JTBD
"When I'm walking a site, let me report what I see in the order I see it — don't make me hunt through a form to find the right question."
Solution A
Voice-to-field AI routing — narrate in walk order, AI maps to template fields
Explore Medium Bet
🧪 Experiment Prototype test: give workers a 20-question template and a simulated site walk. Compare completion time and accuracy between sequential form entry vs free-form voice narration with AI field-mapping. Key metrics: time-to-complete, field accuracy rate, worker preference. Test with facilities managers running 200+ site portfolios. Does AI correctly route "crack in north wall panel" to the structural integrity section?
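The field-mapping step is easy to mock for this prototype with a structured-output LLM call; a sketch assuming the google-genai SDK, with the field list, prompt and JSON shape as illustrative assumptions:

```python
# Sketch: route one narration utterance to a template field via an LLM
# with JSON output. SDK call is google-genai; the field list, prompt and
# response shape are illustrative assumptions, not a committed design.
import json
from google import genai
from google.genai import types

client = genai.Client()

def route_utterance(utterance: str, fields: list[dict]) -> dict:
    """fields: [{"id": "q7", "label": "Structural integrity - wall panels"}, ...]"""
    prompt = (
        "You map inspection narration to template fields.\n"
        f"Fields: {json.dumps(fields)}\n"
        f"Narration: {utterance!r}\n"
        'Reply as JSON: {"field_id": ..., "answer": ..., "confidence": 0-1}. '
        "Use field_id null if nothing matches."
    )
    resp = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=prompt,
        config=types.GenerateContentConfig(response_mime_type="application/json"),
    )
    return json.loads(resp.text)

# route_utterance("crack in the north wall panel", fields) might return
# {"field_id": "q7", "answer": "crack in north wall panel", "confidence": 0.9}
```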
Solution B
Smart template reordering — AI re-sequences questions based on GPS / floor plan position
Explore Large Bet
🧪 Experiment Interview: do workers follow a consistent physical route on repeated inspections? If yes, can we learn the walk pattern and reorder template questions to match? Probe with 4,000-weekly-route customers. Feasibility: does SC have location/zone data granular enough to map questions to physical areas?
Solution C
Post-walk AI reconciliation — complete walkthrough first, AI builds the inspection from notes
Explore Medium Bet
🧪 Experiment Concept test: worker does an unstructured site walk, captures voice notes and photos freely. AI assembles a completed template draft post-walk for review. Interview qal.com.au and Twine-flagged customers: would they trust AI to populate 80% of an inspection from raw notes? What review/edit workflow is needed? Does this reduce the "admin exercise" feeling Josh Roby described?
📹 Opp 12 — CCTV Integration

Fixed cameras capture safety incidents every day but none of it flows into SafetyCulture workflows

Organisations have extensive CCTV infrastructure — yard cameras, building surveillance, elevator and HVAC monitoring — that continuously captures safety-relevant events. But this footage sits in siloed video systems with no connection to SC. When a near-miss, spill, or PPE violation is caught on camera, someone must manually watch the footage, then separately create an issue in SC. At scale this doesn't happen: incidents go unrecorded, patterns go undetected, and the organisation's most comprehensive visual evidence source is completely disconnected from its safety management system. CV-powered camera integration could automatically detect incidents in video feeds and generate SC issues, actions, or alerts — turning passive surveillance into an active safety loop.
🛒 Coles — yard CCTV incidents unlinked 🍽️ Marley Spoon — kitchen/warehouse cameras 🏗️ Oliver Aponte — HVAC/elevator monitoring 📦 CHEP — warehouse/yard surveillance
🔍 Discovery
Tier 2 Compliance Copilot Operations Copilot H13 Scenario B
JTBD
"When a safety incident is captured on our existing cameras, automatically create an issue in SafetyCulture so nothing caught on film goes unrecorded."
Solution A
CCTV-to-issue pipeline — CV detects incidents in camera feeds and auto-creates SC issues
Explore Large Bet
🧪 Experiment Interview Coles, Marley Spoon, and Oliver Aponte: how many safety-relevant events does CCTV capture per week that never become SC issues? What's the current workflow — does anyone review footage proactively, or only after an incident? Map the camera infrastructure: IP cameras, NVR systems, cloud vs on-prem. Feasibility: can we access RTSP/ONVIF feeds, or do we need vendor partnerships (Milestone, Genetec, Verkada)?
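The feed-access half of the feasibility question can be answered with a few lines of OpenCV; a sketch where the detector and the SC issue-creation call are placeholder stubs, the SC API integration being an assumption:

```python
# Sketch: sample frames from an RTSP camera feed for incident detection.
# cv2.VideoCapture speaks RTSP directly; detect() and create_sc_issue()
# are placeholder stubs standing in for a CV model and an SC API call.
import cv2

def detect(frame) -> list[dict]:
    return []  # placeholder: plug in a PPE/spill/intrusion detector here

def create_sc_issue(event: dict, frame) -> None:
    print("would create SC issue:", event)  # placeholder for a real API call

def watch_feed(rtsp_url: str, every_n_frames: int = 50) -> None:
    cap = cv2.VideoCapture(rtsp_url)
    n = 0
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        n += 1
        if n % every_n_frames:
            continue  # sample sparsely; full-rate CV is rarely needed
        for event in detect(frame):
            create_sc_issue(event, frame)
    cap.release()
```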
Solution B
Clip-and-attach — manually flag a CCTV moment and push it into SC as evidence
Explore Small Bet
🧪 Experiment Lightweight MVP: SC integration that lets a safety manager clip a CCTV timestamp and push the frame/clip into an SC issue or action. Test with CHEP and Coles — would a simple "clip to SC" button in their existing VMS reduce the gap between incident capture and documentation? What's the current time from CCTV observation to SC issue creation?
Solution C
Zone-based alerting — CV monitors defined areas and triggers SC actions on anomaly detection
Explore Large Bet
🧪 Experiment Prototype: define safety zones in camera view (e.g. forklift exclusion zone, PPE-required area). CV model detects violations and auto-generates SC alerts. Test with Oliver Aponte HVAC/elevator monitoring — are there repeatable zone-based safety rules? Probe: what false-positive rate is tolerable before users disable alerts? Interview Marley Spoon on kitchen safety zones.
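The zone rule itself is plain geometry and cheap to prototype ahead of the model work; a sketch with illustrative zone coordinates:

```python
# Sketch: zone rule = "no detection centre inside this polygon".
# Ray-casting point-in-polygon; the zone coordinates are illustrative.
def point_in_polygon(x: float, y: float, poly: list[tuple[float, float]]) -> bool:
    inside = False
    j = len(poly) - 1
    for i, (xi, yi) in enumerate(poly):
        xj, yj = poly[j]
        if (yi > y) != (yj > y) and x < (xj - xi) * (y - yi) / (yj - yi) + xi:
            inside = not inside
        j = i
    return inside

FORKLIFT_EXCLUSION = [(120, 80), (640, 80), (640, 420), (120, 420)]

def violates_zone(bbox: tuple[int, int, int, int]) -> bool:
    x1, y1, x2, y2 = bbox                  # person detection, pixel coords
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2  # use the bbox centre
    return point_in_polygon(cx, cy, FORKLIFT_EXCLUSION)
```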
🏷️ Opp 13 — Asset Recognition

Workers can't identify asset types from photos — manual tagging is slow, inconsistent, and often skipped

When workers photograph equipment, infrastructure, or site elements during inspections, the system has no idea what it's looking at. Asset type identification relies entirely on manual entry — workers must select from dropdown lists, type asset names, or scan QR codes that may not exist. This creates incomplete asset records, inconsistent naming, and missed linkages between inspections and the assets they cover. At scale, the problem compounds: thousands of photos of pumps, panels, valves, and vehicles sit unclassified. CV-powered asset recognition could identify equipment type from the photo itself — auto-tagging, linking to asset registers, and enabling fleet-wide condition tracking without manual data entry.
📦 CHEP — pallet/asset tracking at scale 🏗️ Oliver Aponte — HVAC/elevator asset ID 🛒 Coles — equipment across 800+ stores 📢 Broadly requested across customer base
🔍 Discovery
Tier 2 Operations Copilot Quality Copilot H14 Scenario A
JTBD
"When I photograph a piece of equipment on site, automatically identify what type of asset it is so I don't have to search through lists or type it manually."
Solution A
Photo-to-asset-type classifier — CV identifies equipment category from inspection photo
Explore Medium Bet
🧪 Experiment Build a classifier prototype using SC's existing 2.3B image corpus. Start with the top 20 most-inspected asset types (fire extinguishers, electrical panels, HVAC units, forklifts, exit signs). Test with CHEP pallet images and Oliver Aponte HVAC photos. Key metric: top-3 classification accuracy. Interview: what granularity do customers need — "fire extinguisher" vs "ABC dry chemical 4.5kg"? Is category-level enough to be useful?
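A zero-shot baseline is probably the cheapest first pass before training anything on the corpus; a sketch assuming the google-genai SDK, with the label list truncated and the prompt wording illustrative:

```python
# Sketch: zero-shot top-3 asset-type classification plus the top-3
# accuracy metric. Assumes the google-genai SDK; the label list is
# truncated and the prompt wording is an illustrative assumption.
import json
from pathlib import Path
from google import genai
from google.genai import types

client = genai.Client()
ASSET_TYPES = ["fire extinguisher", "electrical panel", "HVAC unit",
               "forklift", "exit sign"]  # extend to the top 20

def top3(photo: Path) -> list[str]:
    resp = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=[
            f"Classify this asset photo. Choose only from: {ASSET_TYPES}. "
            "Reply as a JSON list of the 3 most likely types, best first.",
            types.Part.from_bytes(data=photo.read_bytes(), mime_type="image/jpeg"),
        ],
        config=types.GenerateContentConfig(response_mime_type="application/json"),
    )
    return json.loads(resp.text)

def top3_accuracy(labelled: list[tuple[Path, str]]) -> float:
    return sum(true in top3(p) for p, true in labelled) / len(labelled)
```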
Solution B
Auto-link to asset register — recognised asset type maps to customer's asset hierarchy
Explore Medium Bet
🧪 Experiment Interview Coles and CHEP: do they maintain an asset register in SC today? How do they currently link inspection findings to specific assets — QR codes, manual selection, or not at all? Prototype: after CV identifies "electrical panel", suggest matching assets from the customer's register. Test whether auto-linking reduces inspection time and improves asset record completeness. What happens when the same asset type appears multiple times on one site?
Solution C
Fleet-wide condition tracking — aggregate recognised asset photos to surface deterioration patterns
Explore Large Bet
🧪 Experiment Builds on Opp 4 (Visual Continuity) and Solution A above. Once assets are recognised and linked, can we aggregate condition photos across a fleet to answer "which HVAC units across all sites show corrosion?" Interview Oliver Aponte: would fleet-level asset condition visibility change maintenance scheduling? Test with CHEP: across all warehouses, which pallet types show the most damage? Requires Solution A accuracy ≥85% to be viable at fleet scale.