How do you spot an AI-generated verbatim?

May 15th, 2024

In our last blog, we discussed the potential implications of AI technology for the market research industry from the researcher’s perspective, and I’ve been both excited and concerned about the rise of AI-driven models. Excited because AI has the potential to make the creation of surveys and the analysis of data far more efficient. Concerned, however, that AI could make our already difficult job of detecting fraud even more challenging. While combating fraud has always been a moving target, the arrival of AI means the target will surely be moving again!

As a data-quality measure, I always encourage my clients to add a thoughtful, engaging verbatim question or two in their surveys…simply to help us combat and remove unengaged respondents or (occasionally) the determined cheaters! As I review open-ends, I always wonder to myself…is there a chance these were written with the help of (or entirely by) AI? Would I be able to identify them if they were?

Then…several weeks ago, it happened! An AI-written response appeared so clearly in the data that I was able to derive some takeaways about how to identify them and put my team on the lookout!

Here are the characteristics that may help in your own identification of them. Any one alone might not be a deal-breaker, but in combination, these are surely suspicious:

·  Verbatim length: Looking at the average length of the response, AI-driven responses tend to be longer (sometimes much longer) than the average response. How long is “average”? It depends on the question posed, but an AI-driven response often stands apart from the others in length.

·  Oddly perfect grammar and spelling: While it can be a red flag of overseas fraud if common words are woefully and consistently misspelled, a characteristic of AI-driven responses is PERFECT grammar and spelling. While real humans mistype an occasional word, AI-driven responses are oddly flawless!

·  Heavy use of transition words:

–   instead

–   meanwhile

–   more often

–   rather

–   as a result

These are excellent words for thoughtfully written essays…but very few survey responses contain them. AI-driven responses contain a plethora of these transitional phrases.

·  Prevalence of technical buzzwords: AI responses lean heavily on technical jargon. Once a response establishes its “context” in a technical framework, AI-driven answers pile on those buzzwords…much more often than natural responses do.

·  Focus on general rather than personal point-of-view: When most other responses contain highly personal viewpoints (e.g., “At my last doctor’s appointment, the wait was 25 minutes.”), an AI-driven response is very generalized (“On average, wait times in my area are 15 to 20 minutes, depending on demand.”).

·  Repeating the question within the response: A friend of mine, who is a screenwriter, once told me it is a sign of poor dialogue when a character repeats a question just posed by another character. (Watch for this in your next movie! It happens more often than you realize.) While this may be common on the silver screen, real humans rarely repeat a question when answering one naturally, and survey respondents almost never take the time to do so. When these show up in your verbatims – beware!
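For teams reviewing open-ends at scale, some of these red flags can be pre-screened programmatically. Here is a minimal sketch in Python of that idea — the function name, the transition-word list, and every threshold are illustrative assumptions of mine, not validated cutoffs, and a flagged response should always go to a human reviewer rather than be removed automatically:

```python
import re

# Assumed heuristics based on the red flags above; thresholds are
# illustrative, not validated cutoffs.
TRANSITIONS = {"instead", "meanwhile", "rather", "as a result", "more often"}

def flag_verbatim(response: str, question: str, avg_words: int) -> list[str]:
    """Return the red flags raised by a single open-end response."""
    flags = []
    text = response.lower()

    # 1. Much longer than the survey's average open-end (2x is an assumption)
    if len(response.split()) > 2 * avg_words:
        flags.append("unusually long")

    # 2. Heavy use of transition words/phrases
    if sum(1 for t in TRANSITIONS if t in text) >= 2:
        flags.append("transition-heavy")

    # 3. Repeats the opening words of the question back in the answer
    stem = " ".join(re.findall(r"\w+", question.lower())[:4])
    if stem and stem in text:
        flags.append("repeats the question")

    return flags
```

A suspiciously polished, transition-laden answer would then surface for review, while a short personal one would pass clean:

```python
q = "How long do you wait at the doctor?"
ai_like = ("On average, wait times in my area are reasonable; as a result, "
           "I rarely complain. Instead, I book early.")
human_like = "The wait was 25 minutes."

flag_verbatim(ai_like, q, avg_words=8)     # → ['unusually long', 'transition-heavy']
flag_verbatim(human_like, q, avg_words=8)  # → []
```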

Whether you are for or against it, AI technology has entered the research industry, and we must be on the lookout for those who would use AI for fraud, while still looking for ways AI can ultimately improve how we gather and understand insights.

Ryan Jay – Founder & CEO

Stay tuned for more information about AI tools coming from Outsized Insights in the coming months. Need help with your current projects? We’d love to help – drop us a line at Bids@outsizedinsights.com

