Reliable AI is not the same thing as confident AI. A system can sound correct and still be impossible to examine. It can produce a useful answer and still leave no trail a human can follow. In regulated work, that difference matters.

We start from a simple constraint: if an AI system is going to be used in a setting where the output affects a decision, the system has to show its work. The answer should point back to the source that supports it. The generated text should be checked against the context it claims to use. Where the domain has hard rules, those rules should be explicit artefacts, not buried in a prompt.

This is the architecture behind the company. Retrieval should carry attribution. Detection should test whether an answer is supported. Rules should run as inspectable code when the domain requires them. Documentation should come from the system and the work around it, not from a compliance exercise written after the fact.
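A minimal sketch of the first two ideas, retrieval that carries attribution and a check that an answer is supported by the passage it cites. Everything here is an assumption made for illustration: the `Passage` and `AttributedAnswer` types, the keyword retriever, and the token-overlap score are hypothetical stand-ins, not the detection method we deploy. A production detector would replace `support_score` with a trained model.

```python
from dataclasses import dataclass
import re


@dataclass(frozen=True)
class Passage:
    doc_id: str  # hypothetical source identifier
    text: str


@dataclass(frozen=True)
class AttributedAnswer:
    text: str
    source: Passage  # the passage the answer claims to rest on


def tokens(s: str) -> set[str]:
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))


def retrieve(query: str, corpus: list[Passage]) -> Passage:
    """Toy retriever: return the passage sharing the most words with the query."""
    q = tokens(query)
    return max(corpus, key=lambda p: len(q & tokens(p.text)))


def support_score(answer: str, passage: Passage) -> float:
    """Fraction of answer tokens that also appear in the cited passage."""
    a = tokens(answer)
    return len(a & tokens(passage.text)) / len(a) if a else 0.0


def is_supported(answer: AttributedAnswer, threshold: float = 0.8) -> bool:
    """Naive support test; the threshold is an illustrative assumption."""
    return support_score(answer.text, answer.source) >= threshold


corpus = [
    Passage("policy-7", "Refunds are issued within 14 days of a written request."),
    Passage("policy-9", "Shipping takes five business days."),
]
source = retrieve("How many days for refunds?", corpus)
grounded = AttributedAnswer("Refunds are issued within 14 days", source)
drifted = AttributedAnswer("Refunds are issued within 30 days and include shipping", source)
```

The point of the shape is that the answer never travels without its source: a reviewer can open `grounded.source.doc_id`, read the passage, and see why `drifted` fails the same check.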

Reliability is a property of the mechanism, not the tone of the answer.

KR Labs was founded in 2024 by Ádám Kovács and Gábor Recski, NLP researchers from the TU Wien Data Science Unit and the HLT-Budapest group. Between 2023 and 2024 the company was incubated at the TU Wien Innovation Incubation Center. The founding work was on hallucination detection and retrieval architectures that carry their evidence with them.

That origin still shapes how we run the company. We are small on purpose. The people who write the papers also maintain the libraries, run the audits, and work on production deployments. There is no handoff between a research team that knows the method and a delivery team that has to turn it into client work. The same hands stay on the problem.

We are bootstrapped, with no investors. That is not a slogan about independence; it is an operating choice. It lets us keep the company oriented around a narrow technical standard instead of chasing a broader platform story. We build systems that can be inspected, and we keep the company inspectable enough that the work has authorship.

The commercial shape follows from that. We do not want to sell an abstract maturity exercise or a black-box API detached from the system it enters. We begin with the system in front of us: its data, retrieval, prompts, evaluation, controls, documentation, and deployment path. The first question is not what can be sold. It is what has to be true for the system to survive examination.

This is where regulated industries make the problem concrete. Finance, legal, healthcare, public sector, and research organisations are not only asking whether an AI system performs in a demo. They need to know who can examine the output, what evidence the system keeps, how unsupported claims are caught, and which human decision points remain visible.

The EU AI Act makes this a legal problem as well as a technical one. The Act is not the only constraint, but it increasingly governs whether a system can be adopted at all. Alongside it sit GDPR and the sector-specific rules already in force: MiFID II for financial services, MDR for medical devices, the upcoming European Health Data Space for clinical data, and the public-procurement transparency requirements for state contracts.

The law names the obligations, but the pressure is practical before it is legal. Someone inside the organisation has to decide whether the system can be trusted, where the human review belongs, and what evidence will be available when the answer is challenged.

For that person, the constraint is not feature parity. It is whether the system holds up under examination.

Defensible by design is the practical answer: architectures that produce evidence the client can hand to a regulator without retrofitting. Retrieval that points back to a passage. Generation that carries token-level provenance. Rules that compile to inspectable code, not opaque prompts.
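To make "rules as inspectable code" concrete, here is a small sketch under stated assumptions. The `Rule` and `Finding` types, the rule identifiers, and the record fields (`source_id`, `risk`, `reviewer`) are all hypothetical; they illustrate the pattern of keeping each rule as a named, readable artefact rather than describing any specific client system.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    rule_id: str                   # hypothetical identifier
    description: str               # human-readable statement of the rule
    check: Callable[[dict], bool]  # returns True when the record complies


@dataclass(frozen=True)
class Finding:
    rule_id: str
    description: str
    passed: bool


def evaluate(record: dict, rules: list[Rule]) -> list[Finding]:
    """Run every rule and keep one finding per rule, pass or fail."""
    return [Finding(r.rule_id, r.description, r.check(record)) for r in rules]


# Illustrative rules only; real ones would come from the domain.
RULES = [
    Rule("R1", "Every answer cites a source passage",
         lambda rec: bool(rec.get("source_id"))),
    Rule("R2", "High-risk outputs record a human reviewer",
         lambda rec: rec.get("risk") != "high" or bool(rec.get("reviewer"))),
]

findings = evaluate({"source_id": "policy-7", "risk": "high", "reviewer": None}, RULES)
for f in findings:
    print(f.rule_id, "PASS" if f.passed else "FAIL", "-", f.description)
```

Because every rule produces a finding whether it passes or fails, the list of findings doubles as the evidence trail: it records what was checked, not only what went wrong.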

Advisory firms often end at the recommendation. Generic AI vendors often begin and end at the API. KR Labs was built for the loop between the two: diagnose the system, build or repair the mechanism, test it against the domain, and leave behind evidence the client can use.

The audit is how we enter the system. It is not the whole philosophy. It is the first operating step because it stops us from pretending to know a system before we have examined it. Three weeks, two senior practitioners, a written report. The audit answers a specific question about a specific system.

Can this be deployed and defended under audit, and if not, what changes first?

From the audit, clients move into one of three follow-ons. Clients can stop at any point, and most run several stages with the same two people across them. The full sequence, with timelines and deliverables, lives on our Practice page.

We publish what we learn under permissive licences. AI infrastructure that cannot be inspected cannot be trusted, and publishing is how a research culture stays honest. The architectures we deploy with clients are the same ones the rest of the field can read, fork, and audit.

Build the evidence into the system.

Start with an audit, or read about the technology stack behind the evidence trail.