[practice]
We audit the system you have, then build what works.
A research-led delivery team for AI systems that need evidence, evaluation, and a defensible deployment path. Audit first. 3 weeks, 2 senior practitioners, and a written report that names the finding in the first paragraph. From there, build, rebuild, or walk away.
How an engagement runs
4 stages, in order. Fixed scope, fixed price band, defined deliverable. Stop at any stage. Most clients run several, with the same 2 practitioners across all of them.
- diagnose [audit] We examine the system you run, or the one you're building. Retrieval, grounding, evaluation, documentation, the full trail you need to be audit-ready. Tested against real queries from your domain, not generic benchmarks.
input: Your RAG answers and sources.
output: A written report. Findings with evidence, architecture recommendations, prioritised next steps.
- prioritise [consult] Targeted work on the priorities the audit surfaced. Rewriting retrieval, designing evaluation, writing documentation a regulator will actually read. 2 practitioners, the same 2 who ran your audit.
input: The audit report and the priorities you want to act on.
output: Architecture specifications, evaluation plans, technical documentation that survives external review.
- build [mvp] We build the system. Runs on your data, in your deployment context, against your acceptance tests. Open-source architectures where they fit (VerbatimRAG, LettuceDetect, RuleChef). Custom components where they do not.
input: Your data, your deployment context, your acceptance tests.
output: A working system in your environment, with the technical documentation a deployer needs.
- operate [deployment] Move the MVP into production. Post-market monitoring, regular evaluation, incident response. The operational discipline required for high-risk AI. We stay on at the depth you ask for, from quarterly architecture review to embedded support.
input: The MVP, ready to move to production.
output: A defensible production system, plus the people who built it, on call when it matters.
A report you can defend, in 3 weeks
The audit ends with a written finding you can take to a board, a regulator, or a deployment decision. Here is what is in it, who runs it, and what happens after.
- duration
3 weeks
- week 1 Intake, access, scope confirmation.
- week 2 Review, testing against real queries from your domain.
- week 3 Writing the report and walking your team through the findings.
- team
2 senior practitioners
- output
1 written report
- scope
One AI system, or one well-defined slice of a larger estate. Retrieval, grounding, evaluation, the model layer, the technical-documentation trail. We do not produce a market scan.
- who shows up
2 senior practitioners, at least one from our core R&D team. No junior analysts double-booked across accounts. The people who run your audit are the people who run your engagement, if there is one.
- what we deliver
A written report of 20 to 40 pages. Executive summary that names the finding in the first paragraph, evidence, architecture recommendations, prioritised next steps, and an appendix of the test cases.
- what happens after
A working session with your technical and compliance leads to walk through the report. Most clients move into a consult or MVP. Some take the report and act on it internally. We get paid for the audit either way.
Projects on the record
We work with regulated EU enterprises. Finance, legal, healthcare, public sector. Most engagements stay commercial-confidential. These three don't.
- deploy TU Wien: A custom deployment of the Verbatim Platform on TU Wien's computing infrastructure. Data pipelines tuned for scientific papers; requirements analysis and user testing with TU Wien researchers; ongoing maintenance, training, and support. Built on the open-source VerbatimRAG architecture.
- research Verbatim-KG: Extends VerbatimRAG from text-only retrieval to knowledge-graph question answering and KG population from unstructured documents. A research prototype tested with researchers at three Viennese universities. A roadmap towards maximum verifiability for AI-assisted academic research.
- contribute CLEAR: Hybrid rule-based and ML methods for German legal named-entity recognition, applied to the transparent anonymisation of legal text. TU Wien researchers use KR Labs's RuleChef to make the logic explainable and auditable. Human-in-the-loop learning over a hybrid architecture, instead of opaque end-to-end models.
Why us, not a Big-4 advisory
A readiness slide deck does not make an AI system defensible. At KR Labs, the team that diagnoses the problem is the team that fixes it.
Big-4 advisory
- deliverable
A readiness deck.
- team
Rotating junior analysts.
- continuity
Diagnose-build handoff between teams.
- architecture
Recommendations without remediation.
KR Labs
- deliverable
A working system.
- team
Senior practitioners only, founders included.
- continuity
Same team from diagnosis to deployment.
- architecture
Architectures we publish and maintain as open-source.
3 weeks, a written report, a real decision.
Tell us what needs reviewing. We set up a 30-minute call and propose a start date right away.