When someone asks ChatGPT "what's the best cruelty-free cleaning product?" or asks Google's AI Mode for "climate-certified backpacks," something specific happens behind the scenes. The AI agent doesn't browse the web the way you do. It doesn't look at a brand's sustainability page, read the paragraphs, and notice the Leaping Bunny logo in the footer.
It reads structured data fields. And right now, there's no way for it to check whether those fields are telling the truth.
How agents actually find certifications
AI shopping agents pull product information from structured data — specifically JSON-LD markup embedded in a page's HTML. This is code that's invisible to human visitors but readable by machines.
Here's what a properly structured certification looks like:
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Foaming Hand Soap Tablets",
  "brand": "Example Brand",
  "hasCertification": [
    {
      "@type": "Certification",
      "name": "Leaping Bunny",
      "certificationBody": "Coalition for Consumer Information on Cosmetics",
      "certificationStatus": "active"
    }
  ]
}
The hasCertification property is part of the schema.org vocabulary — the shared language that Google, Bing, and other platforms use to understand web content. When an AI agent encounters this markup, it can match the product to queries like "cruelty-free cleaning products" because it reads the certification as a structured attribute, the same way it reads the price or the brand name.
When this markup is missing — and it almost always is — the agent has to fall back on scraping paragraph text from the page. That approach is less reliable and less consistent, and the certifications it's looking for are often missed entirely.
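To make the mechanics concrete, here's a minimal sketch of how an agent reads a certification as a structured attribute. The markup mirrors the example above; the parsing logic is an illustration, not any platform's actual pipeline.

```python
import json

# JSON-LD as it would appear inside a <script type="application/ld+json">
# tag on a product page (same example as above).
jsonld = """
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Foaming Hand Soap Tablets",
  "brand": "Example Brand",
  "hasCertification": [
    {
      "@type": "Certification",
      "name": "Leaping Bunny",
      "certificationBody": "Coalition for Consumer Information on Cosmetics",
      "certificationStatus": "active"
    }
  ]
}
"""

def certifications(product_jsonld: str) -> list[str]:
    """Return the certification names declared in product markup."""
    data = json.loads(product_jsonld)
    certs = data.get("hasCertification", [])
    if isinstance(certs, dict):  # schema.org allows a single object, not just a list
        certs = [certs]
    return [c.get("name") for c in certs]

print(certifications(jsonld))  # ['Leaping Bunny']
```

Once the field is parsed, "Leaping Bunny" is just another attribute on the product record, matched against queries the same way price or brand would be.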
The problem: agents can't verify what they read
Here's where it breaks down. There is nothing in the schema.org specification that requires the certification data to be real.
A brand can add hasCertification: "Leaping Bunny" to their product markup without actually holding the certification. The agent reads it, uses it in recommendations, and moves on. No platform checks. No verification step. No cross-reference against the actual Leaping Bunny registry.
The agent knows what Leaping Bunny is — it has training data about the certification, what it means, who issues it. But knowing that a certification exists and verifying that a specific brand holds it are completely different capabilities. It's the difference between knowing what a driver's license is and being able to check whether someone actually has one.
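The missing step is easy to state in code. This sketch shows the cross-reference that never happens today; the registry contents and brand names are hypothetical stand-ins, since no certification body currently publishes a machine-readable registry for agents to query.

```python
# Hypothetical machine-readable registry of certified brands.
# In reality, this data exists only in the certification body's own records.
LEAPING_BUNNY_REGISTRY = {"Honest Example Co", "Certified Soap Works"}

def claim_is_verified(brand: str, claimed_cert: str) -> bool:
    """Cross-reference a structured-data claim against the issuer's registry."""
    if claimed_cert == "Leaping Bunny":
        return brand in LEAPING_BUNNY_REGISTRY
    return False  # no known issuer to check against

# Today's agents effectively skip this call and take the markup at face value.
print(claim_is_verified("Honest Example Co", "Leaping Bunny"))  # True
print(claim_is_verified("Uncertified Brand", "Leaping Bunny"))  # False
```

Knowing what the certification means is training data; answering this boolean requires a data source that doesn't exist yet.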
This isn't just a theoretical risk. There are real-world scenarios where the gap creates problems:
Expired certifications that linger. A brand certifies in 2023, adds the markup, then doesn't recertify in 2025. The structured data stays on the page. The agent still reads it as active. Nobody updates it.
Scope gets overstated. A brand is certified cruelty-free for their skincare line. They add the certification markup to all their product pages, including products that aren't covered. The certification is real, but it's applied too broadly.
Self-created labels look identical. A brand adds hasCertification: "Clean & Safe Certified" — a name they made up. In structured data, it looks exactly the same as a legitimate third-party certification. An AI agent can't tell the difference.
Unregulated terms get treated as certifications. Claims like "organic," "clean," or "natural" are not regulated in most product categories. A brand can add these as certification attributes without any third-party backing, and an AI agent will surface them alongside legitimate certifications.
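The four failure modes above could all be caught by the same kind of check, if issuer data were available. This sketch assumes a hypothetical registry keyed by brand and certification, with expiry dates on record; every name and date in it is illustrative.

```python
from datetime import date

# Hypothetical issuer data. KNOWN_ISSUERS is a list of recognized third-party
# certifications; REGISTRY maps (brand, certification) to the expiry date on
# record with the issuer. Neither exists in machine-readable form today.
KNOWN_ISSUERS = {"Leaping Bunny", "B Corp", "Fair Trade Certified"}
REGISTRY = {
    ("Example Brand", "Leaping Bunny"): date(2025, 1, 1),
}

def audit_claim(brand: str, cert_name: str, today: date) -> str:
    if cert_name not in KNOWN_ISSUERS:
        return "unrecognized label"      # self-created name or unregulated term
    expiry = REGISTRY.get((brand, cert_name))
    if expiry is None:
        return "not on issuer registry"  # claim with no record behind it
    if expiry < today:
        return "expired"                 # markup lingering past recertification
    return "verified"

print(audit_claim("Example Brand", "Leaping Bunny", date(2025, 6, 1)))        # expired
print(audit_claim("Example Brand", "Clean & Safe Certified", date(2025, 6, 1)))  # unrecognized label
```

Scope overstatement would need one more lookup — per-product or per-line coverage in the registry — but the shape of the check is the same.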
What agents see vs. what they miss
We've scanned brands that hold 5+ third-party certifications. Every single one had zero certification data in structured formats that AI agents can read.
The certifications are on the website. They're mentioned in paragraphs on the sustainability page. The logos are in image files. But AI agents read structured data fields, not paragraphs and images. When there's no JSON-LD, no schema markup, and no Shopify metafields, the certifications are effectively invisible to AI commerce.
Shopify — where most DTC brands sell — doesn't support hasCertification in its native product schema. Brands would need to add custom JSON-LD manually or use a third-party app. Almost none do.
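For a brand doing it manually, the custom markup amounts to generating a JSON-LD script tag per product. Here's a rough sketch of that generation step in Python; in practice this would live in a theme snippet or app, and the field values are illustrative.

```python
import json

def certification_jsonld(product_name: str, brand: str, certs: list[dict]) -> str:
    """Build a JSON-LD <script> tag declaring a product's certifications.

    Each entry in `certs` supplies schema.org Certification fields,
    e.g. name, certificationBody, certificationStatus.
    """
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": product_name,
        "brand": brand,
        "hasCertification": [{"@type": "Certification", **c} for c in certs],
    }
    return ('<script type="application/ld+json">'
            + json.dumps(data, indent=2)
            + "</script>")

tag = certification_jsonld(
    "Foaming Hand Soap Tablets",
    "Example Brand",
    [{"name": "Leaping Bunny", "certificationStatus": "active"}],
)
print(tag)
```

The generation itself is trivial; the hard part is that each brand has to know the vocabulary exists, wire it into their theme, and keep it in sync with their actual certification status.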
The result: brands that invested time and money in earning real certifications get the same AI visibility as brands that did nothing. Or worse — brands that self-declare unverified claims in their markup can actually outperform certified brands, because at least their data is structured.
Every industry that hit this problem built a verification layer
This pattern has played out before.
In healthcare, hospitals and insurers needed to verify whether a doctor actually held the credentials they claimed. The result was the NPI Registry (NPPES) — a central database where provider credentials can be checked in real time.
On the internet, browsers needed to verify whether a website was legitimate before allowing encrypted connections. The result was SSL Certificate Authorities — independent organizations like DigiCert and Let's Encrypt that verify website identity. That's why you see the padlock icon when you shop online.
In UK gambling, researchers found that operators who encoded their regulatory license data as structured schema markup got significantly higher visibility in AI search results. AI systems treated structured compliance data as an authority signal.
Product certifications in e-commerce have no equivalent. No central verification database. No way for an AI agent to cross-reference a brand's certification claim against the certification body's actual registry. That's the gap.
What the fix looks like
The infrastructure for verification already exists in the schema.org specification. The Certification type supports fields like certificationBody, certificationStatus, and certificationIdentification. There's even support for W3C Verifiable Credentials — cryptographic proofs that a certification is authentic.
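A sketch of what a checkable certification record could look like using fields the schema.org Certification type already defines. The identifier value is a hypothetical placeholder, not a real registry ID.

```python
import json

# Certification markup carrying enough detail to be looked up with the issuer.
# Field names are from the schema.org Certification type; the identifier
# value "LB-2025-00123" is a made-up placeholder.
cert = {
    "@type": "Certification",
    "name": "Leaping Bunny",
    "certificationBody": "Coalition for Consumer Information on Cosmetics",
    "certificationStatus": "active",
    "certificationIdentification": "LB-2025-00123",
}

# An agent could treat a claim as checkable only when it names both an
# issuing body and an identifier that can be resolved against a registry.
checkable = "certificationBody" in cert and "certificationIdentification" in cert
print(json.dumps(cert, indent=2))
print("checkable:", checkable)
```

With a resolvable identifier in place, a Verifiable Credential could go one step further and make the record cryptographically provable rather than merely checkable.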
What's missing is the data layer between the certification bodies (who know which brands are actually certified) and the product pages (where the structured data lives). Certification bodies maintain registries, but those registries aren't in machine-readable formats. Brands hold certifications but don't know how to express them as structured data. AI agents need structured data to make recommendations but have no way to verify it.
That's what we're building at mazenta. We work with certification bodies to structure their registry data, then inject verified certification data into product pages — so the structured data an AI agent reads is backed by the certification body that issued it, not self-declared by the brand.
The goal isn't to replace trust. It's to give AI agents something verifiable to reference, so that the brands who actually earned their certifications get discovered.
mazenta scan is live and in early access. If you want to see how your certifications show up (or don't) to AI shopping agents, get a demo.