
How do we know it’s accurate? Inside ConsultAI’s approach to analysis quality


If you’ve ever had to analyse thousands of consultation responses by hand, you’ll know just how complex - and subjective - the task can be. Getting it right means more than just capturing what people said. It means identifying what matters, understanding nuance, and producing outputs that stand up to scrutiny.


So how do we make sure ConsultAI delivers accurate, trustworthy results?


We use a range of techniques to assess accuracy


At ConsultAI, we don’t rely on a single metric to measure performance.

Instead, we apply a combination of:


  • Manual benchmarking against officer-written summaries

  • Coverage checks to assess whether all key issues are being captured

  • Qualitative review of how well themes reflect the original data

  • User feedback from planning officers and policy teams

  • Real-world validation, including Planning Inspectorate-approved reports


This layered approach helps us understand not just whether the outputs are correct, but whether they are useful, defensible, and insightful in context.


What about data science techniques like the F1 score?


As practitioners will know, domain knowledge, local knowledge and context are essential when using automation tools in consultation. The F1 score, used in the UK Government’s experimental AI evaluations, is a useful metric for evaluating how accurately a system classifies responses against predefined categories: it balances precision (how many of the system’s assigned labels are correct) and recall (how much of the relevant material it captures).
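As a rough, self-contained illustration of how those two quantities combine, here is a minimal Python sketch for a single predefined theme. The responses and labels are invented purely for illustration; this is not ConsultAI’s implementation.

```python
# Minimal sketch: precision, recall and F1 for one predefined theme.
# The labels below are invented purely for illustration.

# 1 = a human coder tagged the response with the theme, 0 = they did not
human_labels  = [1, 1, 0, 1, 0, 0, 1, 0]
# 1 = the automated system tagged the response with the theme
system_labels = [1, 0, 0, 1, 1, 0, 1, 0]

true_positives  = sum(h == 1 and s == 1 for h, s in zip(human_labels, system_labels))
false_positives = sum(h == 0 and s == 1 for h, s in zip(human_labels, system_labels))
false_negatives = sum(h == 1 and s == 0 for h, s in zip(human_labels, system_labels))

precision = true_positives / (true_positives + false_positives)  # how often its tags are right
recall    = true_positives / (true_positives + false_negatives)  # how much it captures
f1 = 2 * precision * recall / (precision + recall)               # harmonic mean of the two

print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
# -> precision=0.75 recall=0.75 F1=0.75
```

Because F1 is the harmonic mean, it punishes a system that buys high recall with sloppy precision, or vice versa.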


But it only tells part of the story.

  • It doesn’t measure whether the right themes are being defined in the first place (the sketch after this list makes this concrete).

  • It doesn’t assess whether the outputs generate meaningful insight.

  • And it doesn’t reflect the nuance of real consultation data, where comments often span multiple, overlapping or emerging issues.
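To make the first limitation concrete, here is a small sketch with invented themes and responses (again, illustrative only): because a fixed-category F1 scores only the predefined labels, a system that entirely misses an emerging issue can still post a perfect score.

```python
# Sketch of the blind spot: F1 is computed only over the predefined themes,
# so an emerging issue outside that list never shows up in the score.
# Themes and responses are invented purely for illustration.

predefined_themes = {"traffic", "housing_density"}

# Each response: the themes a human reader would identify in it.
responses = [
    {"traffic"},
    {"housing_density"},
    {"traffic", "flood_risk"},  # flood_risk is an emerging theme, not in the list
]

# A system that matches the predefined themes perfectly but never sees flood_risk.
system_output = [
    {"traffic"},
    {"housing_density"},
    {"traffic"},
]

tp = fp = fn = 0
for human, system in zip(responses, system_output):
    # Restrict scoring to the predefined label set, as a fixed-category F1 does.
    human_scored  = human & predefined_themes
    system_scored = system & predefined_themes
    tp += len(human_scored & system_scored)
    fp += len(system_scored - human_scored)
    fn += len(human_scored - system_scored)

precision = tp / (tp + fp)
recall    = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print(f"F1 = {f1:.2f}")  # prints F1 = 1.00, yet flood_risk was missed entirely
```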


That’s why we treat F1 as one input, not the whole picture. We combine it with qualitative review, traceability checks, and a practical assessment of whether the classification genuinely helps users deliver better engagement and evidence.



Approved by inspectors, trusted by officers


Perhaps the strongest proof of ConsultAI’s accuracy is this:

  • Our AI-generated reports are the first and only to have been accepted by the Planning Inspectorate as part of Regulation 19 submissions.


That means our outputs weren’t just technically sound; they were good enough to support statutory planning decisions. It’s a first-of-its-kind milestone for AI in consultation analysis and planning, and a testament to the rigour and trust built into the platform.


In summary


  • We combine manual benchmarking, coverage checks, and real-world validation.

  • Our reports have already passed regulatory scrutiny.

  • Officers remain in control, with full traceability and oversight.


Want to see how ConsultAI performs on your own data? Book a demo.



 
 
 
