The Simple Data Checklist to Prepare Before Talking to Any AI Claims Vendor
Published by: Iyinoluwa Oyekunle

A lot of AI claims projects stall for a very boring reason:

The vendor asks, “Can you share a sample of your data?”
IT isn’t sure what’s available, Claims isn’t sure how clean it is, and Legal isn’t sure what can be shared.

This checklist is designed so that, in about a day, your team can get “AI ready enough” to have a practical conversation with an AI vendor such as Curacel about its AI claims infrastructure, without running a six-month data project first.

If you want a broader view of what this can unlock, you can also read our guide on Claims Management: AI-Powered Insurance and Automation.

1. Know Your Core Datasets

You don’t need a perfect data warehouse. You just need clarity on what exists, where it lives, and who owns it.

Claims data (per claim)

Confirm which fields you can pull reliably:

  • Claim ID, Policy ID

  • Date of loss, date reported

  • Cause/type of loss (collision, theft, inpatient, outpatient, etc.)

  • Reserve and paid amounts

  • Current status (open, closed, repudiated, pending)

  • Line of business (motor, health, property, travel…)

This is the backbone for any AI-powered triage, analytics or fraud detection.

Policy data

Next, note the policy attributes you can link to those claims:

  • Policy / contract ID

  • Product type (e.g., motor comprehensive, HMO plan, SME package)

  • Coverage summary (what’s in / out at a high level)

  • Limits, deductibles, co-pays

  • Policy start and end dates

  • Customer type (individual / corporate)

You don’t need every clause of the wording; just the key fields your teams already use to make decisions.
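As a quick way to test whether policy attributes can really be linked to claims, the sketch below lists claims whose policy ID is missing or has no match in the policy extract. The field names (`claim_id`, `policy_id`, `product_type`) and the sample rows are illustrative assumptions, not a required schema.

```python
# Hypothetical sample rows; replace with an export from your own systems.
claims = [
    {"claim_id": "C-001", "policy_id": "P-100"},
    {"claim_id": "C-002", "policy_id": "P-999"},  # no matching policy
    {"claim_id": "C-003", "policy_id": ""},       # policy ID missing
]
policies = [
    {"policy_id": "P-100", "product_type": "motor comprehensive"},
    {"policy_id": "P-200", "product_type": "HMO plan"},
]


def find_unlinked_claims(claims, policies):
    """Return claim IDs whose policy ID is missing or absent from the policy list."""
    known = {p["policy_id"] for p in policies}
    return [c["claim_id"] for c in claims if c.get("policy_id") not in known]


unlinked = find_unlinked_claims(claims, policies)
print(unlinked)
```

A long list here is not a blocker; it is simply something worth telling a vendor before they try to join the two datasets.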

Supporting documents

Most of the “mess” in claims sits in documents and media. Make a quick inventory of where you store:

  • Photos / videos (accident scenes, damaged parts, hospital pictures)

  • PDFs (policies, police reports)

  • Medical reports and encounter notes

  • Invoices, estimates, proof of payment

  • Adjuster or investigator reports

Also note: are these linked to claim IDs in your system, or scattered across drives and email?

This mix of structured (claims + policy) and unstructured (documents + images) is what an AI vendor will use to automate document handling, risk scoring and fraud flags.

2. Do a Plain-Language Data Quality Check

Before anyone says “data lake”, do something much simpler: pull ~100 recent claims from one important product line (often motor or health) and inspect them.

A one-day, 100-row review with the right people beats a six-month data cleanup project.

Ask three basic questions:

a) Missing or obviously wrong values?

Look for:

  • Negative amounts where they make no sense

  • Claims with no policy ID

  • Loss dates after closure dates

  • Zero reserves on clearly large claims

Capture a short list of “things we know are off”.
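The checks above can be sketched as a small function that returns plain-language issues per claim. Everything here is an assumption for illustration: the field names, the sample rows, and especially `LARGE_CLAIM`, since what counts as a “clearly large” claim is a business decision.

```python
from datetime import date

# Assumed threshold for a "clearly large" claim; set this to suit your portfolio.
LARGE_CLAIM = 1_000_000


def quality_issues(claim):
    """Return a list of plain-language issues found in one claim row."""
    issues = []
    if not claim.get("policy_id"):
        issues.append("no policy ID")
    if claim.get("paid_amount", 0) < 0:
        issues.append("negative paid amount")
    closed = claim.get("date_closed")
    if closed is not None and claim["date_of_loss"] > closed:
        issues.append("loss date after closure date")
    if claim.get("reserve", 0) == 0 and claim.get("paid_amount", 0) >= LARGE_CLAIM:
        issues.append("zero reserve on a large claim")
    return issues


# Hypothetical sample rows standing in for the ~100-claim extract.
sample = [
    {"claim_id": "C-001", "policy_id": "P-100", "date_of_loss": date(2024, 3, 1),
     "date_closed": date(2024, 3, 20), "reserve": 50_000, "paid_amount": 45_000},
    {"claim_id": "C-002", "policy_id": "", "date_of_loss": date(2024, 4, 10),
     "date_closed": date(2024, 4, 2), "reserve": 0, "paid_amount": 2_500_000},
]

report = {c["claim_id"]: quality_issues(c) for c in sample}
for claim_id, issues in report.items():
    print(claim_id, issues or "looks fine")
```

The output doubles as the short list of “things we know are off”.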

b) Inconsistent product and coverage names?

Check if the same thing appears under multiple names, for example:

  • “Motor Comp”, “Motor Comprehensive”, “Comprehensive Motor”

Inconsistency doesn’t stop AI, but it does require quick mapping or normalisation. Knowing this upfront saves time later.
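A minimal sketch of that mapping, assuming a hand-maintained alias table: the `ALIASES` dictionary and the choice of “Motor Comprehensive” as the canonical name are illustrative. Names that are not in the table come back as `None`, so they can be listed for review rather than silently guessed.

```python
import re

# Hypothetical alias table; extend it with the variants found in your sample.
ALIASES = {
    "motor comp": "Motor Comprehensive",
    "motor comprehensive": "Motor Comprehensive",
    "comprehensive motor": "Motor Comprehensive",
}


def normalise_product(raw):
    """Map a raw product name to its canonical form, or None if unmapped."""
    key = re.sub(r"\s+", " ", raw.strip().lower())  # trim, lowercase, collapse spaces
    return ALIASES.get(key)


raw_names = ["Motor Comp", " motor  comprehensive ", "Comprehensive Motor", "Travel Plus"]
unmapped = [name for name in raw_names if normalise_product(name) is None]
print(unmapped)
```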

c) Messy formats?

Scan for:

  • Multiple date formats across rows

  • IDs with extra spaces or prefixes

  • Obvious duplicate IDs

You’re not fixing it all now. You’re building a realistic picture of “what our data really looks like” for the vendor.
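The format scan can be roughed out as follows, using only the Python standard library. The candidate date formats in `DATE_FORMATS`, the `CLM-` prefix, and the sample values are assumptions; swap in whatever appears in your own extract.

```python
from collections import Counter
from datetime import datetime

# Assumed candidate formats, tried in order.
DATE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y"]


def detect_date_format(value):
    """Return the first format that parses the value, or None if none do."""
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(value.strip(), fmt)
            return fmt
        except ValueError:
            continue
    return None


def clean_id(raw, prefix="CLM-"):
    """Trim spaces, uppercase, and drop a known prefix (hypothetical here)."""
    value = raw.strip().upper()
    return value[len(prefix):] if value.startswith(prefix) else value


dates = ["2024-03-01", "01/03/2024", "01.03.2024", "March 1"]
format_counts = Counter(detect_date_format(d) for d in dates)

ids = [" clm-001", "CLM-001 ", "002", "CLM-003"]
id_counts = Counter(clean_id(i) for i in ids)
duplicates = [claim_id for claim_id, n in id_counts.items() if n > 1]

print(format_counts)
print(duplicates)
```

More than one key in `format_counts`, or a `None` key, means the date column mixes formats. IDs that only collide after cleaning are worth flagging separately from exact duplicates.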

3. Map Volumes, Variability & Real Examples

Now give your future partner enough signal to size effort, infrastructure and ROI.

Volumes (ballpark is fine)

For each major product line, estimate:

  • Claims per month

  • Average documents per claim (e.g., 3–5 for simple motor, 10–20 for complex health)

  • Rough split of low-touch vs high-touch claims

Even “about 1,500–2,000 motor claims a month” is useful.
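If a claims extract with report dates is available, a ballpark can be computed rather than guessed. This is a sketch under assumed field names (`line_of_business`, `date_reported`) with made-up rows. Note that months with no claims at all do not appear in the counts, so the average is over active months only.

```python
from collections import defaultdict
from datetime import date


def monthly_volumes(claims):
    """Count claims per (line of business, year, month)."""
    counts = defaultdict(int)
    for c in claims:
        d = c["date_reported"]
        counts[(c["line_of_business"], d.year, d.month)] += 1
    return dict(counts)


def average_per_month(volumes, line):
    """Average monthly claim count for one line, over months that had claims."""
    months = [n for (lob, _, _), n in volumes.items() if lob == line]
    return sum(months) / len(months) if months else 0.0


# Hypothetical rows for illustration only.
claims = [
    {"line_of_business": "motor", "date_reported": date(2024, 1, 5)},
    {"line_of_business": "motor", "date_reported": date(2024, 1, 19)},
    {"line_of_business": "motor", "date_reported": date(2024, 2, 2)},
    {"line_of_business": "health", "date_reported": date(2024, 1, 11)},
]

volumes = monthly_volumes(claims)
motor_avg = average_per_month(volumes, "motor")
print(motor_avg)
```

The same per-month counts also make seasonal spikes visible at a glance.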

Variability and seasonality

Note:

  • Seasonal spikes (rainy season, flu season, holidays, floods, etc.)

  • Any unusual periods (new product launch, regulation change, major event)

  • Typical handling times for different claim types

This helps a vendor suggest where automation will have the biggest impact first.

Real labelled examples

Prepare a small, curated set of claims with simple labels:

  • 5–10 smooth, straightforward claims

  • 5–10 edge cases / disputes

  • 5–10 confirmed fraud cases

For each, add 1–2 sentences:

  • Why it was easy or painful

  • What signal should have triggered earlier (if any)

This tiny “training set” is gold for designing triage flows, fraud scores and routing rules that match your real world. For a deeper dive into how this kind of pattern-spotting works in practice, see our article on Anomaly Detection in Claims: How AI is Transforming Insurance Operations.

4. Line Up Governance and Access Early

The quickest way to slow down an AI project is to surprise Legal, Compliance or IT.

Capture the basics before you jump on vendor calls.

Who approves data sharing?

List the key roles:

  • Legal / compliance approver

  • Data protection / privacy lead

  • IT / security owner

Knowing these names early means you can loop them in properly once a proof-of-concept is on the table.

How can data be shared?

Clarify what’s acceptable in your organisation:

  • Encrypted exports (e.g., SFTP or secure upload)

  • Read-only cloud storage (e.g., a secure bucket)

  • Use of anonymisation / pseudonymisation where necessary

You’re not choosing tools yet, just setting guardrails.
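To make “pseudonymisation” concrete for the conversation with Legal and IT, here is one common approach, sketched with Python's standard `hmac` and `hashlib` modules: replace identifiers with a keyed hash so that records still link to each other but the original IDs are not exposed. The key shown is a placeholder, and whether this technique satisfies your data protection rules is for your privacy lead to decide, not something this sketch can settle.

```python
import hashlib
import hmac


def pseudonymise(value, secret_key):
    """Return a keyed SHA-256 hash of an identifier as a hex string."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Placeholder key for illustration; a real key would be generated securely,
# stored outside the exported data, and never shared with the recipient.
SECRET_KEY = b"replace-with-a-securely-stored-key"

policy_token = pseudonymise("P-100", SECRET_KEY)
same_policy_token = pseudonymise("P-100", SECRET_KEY)
other_policy_token = pseudonymise("P-200", SECRET_KEY)

# The same input always yields the same token, so claims and policies still join.
print(policy_token == same_policy_token)
```

Because the hash is keyed, someone without the key cannot rebuild the mapping simply by hashing guessed policy numbers.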

Minimum security expectations

Write a short one-pager covering:

  • Encryption in transit and at rest

  • Role-based access and audit logs

  • Alignment with relevant standards and local data protection laws

Now, when a vendor proposes a data flow, you can immediately check: “Does this fit inside our rules?”

One-Page Checklist to Paste into Notion

A shared checklist turns ‘Are we ready for AI?’ into a clear yes, no, or ‘almost there’.

Copy this into Notion, Confluence or a shared doc and tick through as a team:

Datasets

  • Core claims, policy and document sources listed

  • Owners and locations for each dataset identified

Data Quality

  • ~100 recent claims sampled

  • Obvious missing / wrong values noted

  • Inconsistent product / coverage names listed

  • Date and ID format issues captured

Volumes & Variability

  • Approximate claims per month by product line

  • Rough documents per claim

  • Key seasonal spikes / unusual periods noted

Examples

  • 5–10 smooth claims with brief notes

  • 5–10 edge cases / disputes with notes

  • 5–10 confirmed fraud examples with notes

Governance & Access

  • Named contacts for data sharing approval

  • Accepted transfer methods documented

  • Basic security expectations written down

If you can tick most of this, you’re already ahead of most AI initiatives.

What Changes When You Bring This to a Vendor Call

When you show up with this prep:

  • You skip AI theatre and go straight to: “Given our data and volumes, what can we automate first?”

  • Vendors can propose realistic timelines and ROI, not hand-wavy estimates.

  • You de-risk the project for IT, Legal, Compliance and the C-suite because it’s grounded in the data you already have.

Book a Data Readiness Review & Demo

Share your completed checklist and a few example claims with us, and in one call we’ll show you:

  • How ready your data is for AI claims automation

  • Where the biggest gaps and quick wins are

  • Which 1–2 workflows to automate first

Book a demo and AI data readiness review →
