
How do we know it’s accurate? Inside ConsultAI’s approach to analysis quality


If you’ve ever had to analyse thousands of consultation responses by hand, you’ll know just how complex - and subjective - the task can be. Getting it right means more than just capturing what people said. It means identifying what matters, understanding nuance, and producing outputs that stand up to scrutiny.


So how do we make sure ConsultAI delivers accurate, trustworthy results?


We use a range of techniques to assess accuracy


At ConsultAI, we don’t rely on a single metric to measure performance.

Instead, we apply a combination of:


  • Manual benchmarking against officer-written summaries

  • Coverage checks to assess whether all key issues are being captured

  • Qualitative review of how well themes reflect the original data

  • User feedback from planning officers and policy teams

  • Real-world validation, including Planning Inspectorate-approved reports


This layered approach helps us understand not just whether the outputs are correct, but whether they are useful, defensible, and insightful in context.


What about data science techniques like the F1 score?


As practitioners will know, domain knowledge, local knowledge and context are essential when using automation tools in consultation. The F1 score, used in the UK Government’s experimental AI evaluations, is a useful metric for evaluating how accurately a system classifies responses against predefined categories: it balances precision (how many of the system’s assigned labels are correct) and recall (how much of the relevant material it captures).
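As a rough, self-contained illustration of how those two quantities combine, here is a minimal Python sketch for a single predefined theme. The responses and labels are invented purely for illustration; this is not ConsultAI’s implementation.

```python
# Minimal sketch: precision, recall and F1 for one predefined theme.
# The labels below are invented purely for illustration.

# 1 = a human coder tagged the response with the theme, 0 = they did not
human_labels  = [1, 1, 0, 1, 0, 0, 1, 0]
# 1 = the automated system tagged the response with the theme
system_labels = [1, 0, 0, 1, 1, 0, 1, 0]

true_positives  = sum(h == 1 and s == 1 for h, s in zip(human_labels, system_labels))
false_positives = sum(h == 0 and s == 1 for h, s in zip(human_labels, system_labels))
false_negatives = sum(h == 1 and s == 0 for h, s in zip(human_labels, system_labels))

precision = true_positives / (true_positives + false_positives)  # how often its tags are right
recall    = true_positives / (true_positives + false_negatives)  # how much it captures
f1 = 2 * precision * recall / (precision + recall)               # harmonic mean of the two

print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
# -> precision=0.75 recall=0.75 F1=0.75
```

Because F1 is the harmonic mean, it punishes a system that buys high recall with sloppy precision, or vice versa.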


But it only tells part of the story.

  • It doesn’t measure whether the right themes are being defined in the first place (the sketch after this list makes this concrete).

  • It doesn’t assess whether the outputs generate meaningful insight.

  • And it doesn’t reflect the nuance of real consultation data, where comments often span multiple, overlapping or emerging issues.
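To make the first limitation concrete, here is a small sketch with invented themes and responses (again, illustrative only): because a fixed-category F1 scores only the predefined labels, a system that entirely misses an emerging issue can still post a perfect score.

```python
# Sketch of the blind spot: F1 is computed only over the predefined themes,
# so an emerging issue outside that list never shows up in the score.
# Themes and responses are invented purely for illustration.

predefined_themes = {"traffic", "housing_density"}

# Each response: the themes a human reader would identify in it.
responses = [
    {"traffic"},
    {"housing_density"},
    {"traffic", "flood_risk"},  # flood_risk is an emerging theme, not in the list
]

# A system that matches the predefined themes perfectly but never sees flood_risk.
system_output = [
    {"traffic"},
    {"housing_density"},
    {"traffic"},
]

tp = fp = fn = 0
for human, system in zip(responses, system_output):
    # Restrict scoring to the predefined label set, as a fixed-category F1 does.
    human_scored  = human & predefined_themes
    system_scored = system & predefined_themes
    tp += len(human_scored & system_scored)
    fp += len(system_scored - human_scored)
    fn += len(human_scored - system_scored)

precision = tp / (tp + fp)
recall    = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print(f"F1 = {f1:.2f}")  # prints F1 = 1.00, yet flood_risk was missed entirely
```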


That’s why we treat F1 as one input, not the whole picture. We combine it with qualitative review, traceability checks, and a practical assessment of whether the classification genuinely helps users deliver better engagement and evidence.



Approved by inspectors, trusted by officers


Perhaps the strongest proof of ConsultAI’s accuracy is this:

  • Our AI-generated reports are the first and only to have been accepted by the Planning Inspectorate as part of Regulation 19 submissions.


That means our outputs weren’t just technically sound; they were good enough to support statutory planning decisions. It’s a first-of-its-kind milestone for AI in consultation analysis and planning, and a testament to the rigour and trust built into the platform.


In summary


  • We combine manual benchmarking, coverage checks, and real-world validation.

  • Our reports have already passed regulatory scrutiny.

  • Officers remain in control, with full traceability and oversight.


Want to see how ConsultAI performs on your own data? Book a demo.



 
 
 
