[practice]

We audit the system you have, then build what works.

A research-led delivery team for AI systems that need evidence, evaluation, and a defensible deployment path. The audit comes first: 3 weeks, 2 senior practitioners, and a written report that names the finding in the first paragraph. From there, build, rebuild, or walk away.

Start with an audit
Read about the stack we ship

How an engagement runs

4 stages, in order. Fixed scope, fixed price band, defined deliverable. Stop at any stage. Most clients run several, with the same 2 practitioners across all of them.

01 3 weeks

diagnose [audit]

We examine the system you run, or the one you're building. Retrieval, grounding, evaluation, documentation, the full trail you need to be audit-ready. Tested against real queries from your domain, not generic benchmarks.

input
Your RAG answers and sources.

output
A written report. Findings with evidence, architecture recommendations, prioritised next steps.

02 2 to 6 weeks

prioritise [consult]

Targeted work on the priorities the audit surfaced. Rewriting retrieval, designing evaluation, writing documentation a regulator will actually read. 2 practitioners, the same 2 who ran your audit.

input
The audit report and the priorities you want to act on.

output
Architecture specifications, evaluation plans, technical documentation that survives external review.

03 6 to 12 weeks

build [mvp]

We build the system. Runs on your data, in your deployment context, against your acceptance tests. Open-source architectures where they fit (VerbatimRAG, LettuceDetect, RuleChef). Custom components where they do not.

input
Your data, your deployment context, your acceptance tests.

output
A working system in your environment, with the technical documentation a deployer needs.

04 Ongoing

operate [deployment]

We move the MVP into production. Post-market monitoring, regular evaluation, incident response. The operational discipline required for high-risk AI. We stay on at the depth you ask for, from quarterly architecture review to embedded support.

input
The MVP, ready to move to production.

output
A defensible production system, plus the people who built it, on call when it matters.

A report you can defend, in 3 weeks

The audit ends with a written finding you can take to a board, a regulator, or a deployment decision. Here is what is in it, who runs it, and what happens after.

duration: 3 weeks

week 1 Intake, access, scope confirmation.

week 2 Review, testing against real queries from your domain.

week 3 Writing the report and walking your team through the findings.

team: 2 senior practitioners

output: 1 written report

scope
One AI system, or one well-defined slice of a larger estate. Retrieval, grounding, evaluation, the model layer, the technical-documentation trail. We do not produce a market scan.

who shows up
2 senior practitioners, at least one from our core R&D team. No junior analysts double-booked across accounts. The people who run your audit are the people who run your engagement, if there is one.

what we deliver
A written report of 20 to 40 pages. Executive summary that names the finding in the first paragraph, evidence, architecture recommendations, prioritised next steps, and an appendix of the test cases.

what happens after
A working session with your technical and compliance leads to walk through the report. Most clients move into a consult or MVP. Some take the report and act on it internally. We get paid for the audit either way.

Projects on the record

We work with regulated EU enterprises. Finance, legal, healthcare, public sector. Most engagements stay commercial-confidential. These three don't.

deploy TU Wien

A custom deployment of the Verbatim Platform on TU Wien's computing infrastructure. Data pipelines tuned for scientific papers; requirements analysis and user testing with TU Wien researchers; ongoing maintenance, training, and support. Built on the open-source VerbatimRAG architecture.

For TU Wien researchers and students.

research Verbatim-KG

Extends VerbatimRAG from text-only retrieval to knowledge-graph question answering and KG population from unstructured documents. A research prototype tested with researchers at three Viennese universities. A roadmap towards maximum verifiability for AI-assisted academic research.

With WU SemSys. Funded by the Vienna Business Agency.

contribute CLEAR

Hybrid rule-based and ML methods for German legal named-entity recognition, applied to the transparent anonymisation of legal text. TU Wien researchers use KR Labs's RuleChef to make the logic explainable and auditable. Human-in-the-loop learning over a hybrid architecture, instead of opaque end-to-end models.

A research consortium led by m2n. With the Austrian Ministries of Justice and Finance, the Austrian Parliament, TU Wien, and Uni Wien.

Why us, not a Big-4 advisory

A readiness slide deck does not make an AI system defensible. At KR Labs, the team that diagnoses the problem is the team that fixes it.

what they sell Big-4 advisory

deliverable: A readiness deck.
team: Rotating junior analysts.
continuity: Diagnose-build handoff between teams.
architecture: Recommendations without remediation.

what we ship KR Labs

deliverable: A working system.
team: Senior practitioners only, founders included.
continuity: Same team from diagnosis to deployment.
architecture: Architectures we publish and maintain as open source.

3 weeks, a written report, a real decision.

Tell us what needs reviewing. We set up a 30-minute call and propose a start date right away.
