An AI analyst for a declarations database: a working prototype we built at our own cost

The prototype lets an operator query a declarations database in plain words instead of SQL. It comes with a live demo and an honest map of what works and what doesn't.

By the summer of 2024 the client had built up a large body of documents from certification registries: declarations, certificates, history by applicant and product. Shortly before that we had moved the heaviest tables into ClickHouse, and a search that used to run for minutes started answering in a few seconds. The client’s next question followed naturally: could we bolt AI onto this?

We chose not to answer with a presentation. We built a prototype instead and put it on a test server so the client could touch it. The idea is simple. An operator writes a query in human language, and the model turns it into a result set from the declarations database. The work ran on the studio’s initiative and at our cost, so the budget conversation could turn on a live result rather than nice slides.

Snapshot

Industry | Product certification, B2B market analytics
End client | Certificate Analytics
Engagement | R&D prototype on the studio’s initiative, at our own cost
Project type | Natural-language queries over a declarations database on top of the existing platform
Work done | Built the prototype, trained the model on real questions, demoed on a test server, ran it in with the client
Effort | No agreed estimate: research work by the studio
Team | Prototype built by Anton Hersun; no engineer assigned to a production version
Tech stack | Laravel platform · ClickHouse · trainable LLM on a separate GPU server · query web interface
Status | Stalled at demo level: the model showed a result, no order to take it to production was signed

The problem

The client had a working analytics platform and a large mass of structured data. The request was vague, as it usually is at the start: “bring AI into the company, a ChatGPT,” one for the sales department and one for the technical department. There were no specifics behind it, and at that stage that is fine.

We answered with an honest question of our own: what exactly should it do, and where should it live: in the analytics portal, as a widget in the CRM, or as a chatbot in a messenger? For the technical department we already had a starting piece by then, so the easiest way to move the conversation was to show it working.

How we did it

1. Built the prototype and gave access to it.
We put a working version on a test server with a login for the client. An operator writes a question in plain words, the model builds a result set from the declarations database, and returns a table. We attached a PDF of real example queries to the demo so the client could see the edges of what was possible, not a rehearsed script.

2. Trained the model on real questions.
This is not “we plugged in an API and it worked.” The model is trainable: to understand the operators’ live language, it has to be trained further on actual phrasings. We loaded in what we had collected over a month of testing, and kept training the model during the demo itself. One telling episode from the chat: at first the model did not understand the phrase “current month” at all, because it has no built-in knowledge of today’s date. About an hour went into teaching it to handle dates. After that it correctly resolved phrasings like “documents dated after the first of June,” and showed on its own that elevator declarations come in every day.
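One common way to work around a model that does not know today's date is to compute the date range outside the model and pass it in as context. The sketch below illustrates that idea only; it is not how the prototype was taught to handle dates, and the function names and the preamble wording are hypothetical.

```python
from datetime import date


def current_month_range(today: date) -> tuple:
    """Return the half-open range [first day of this month, first day of next month)."""
    start = today.replace(day=1)
    if start.month == 12:
        end = date(start.year + 1, 1, 1)
    else:
        end = date(start.year, start.month + 1, 1)
    return start, end


def date_preamble(today: date) -> str:
    """Build a hypothetical prompt preamble that tells the model what 'today' is."""
    start, end = current_month_range(today)
    return (
        f"Today is {today.isoformat()}. "
        f"'Current month' means dates from {start.isoformat()} (inclusive) "
        f"to {end.isoformat()} (exclusive)."
    )
```

Calling `date_preamble(date.today())` before each query would give the model an explicit anchor for phrases like “current month.” The half-open range avoids end-of-month edge cases such as 30 versus 31 days.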

By the end of the run-in the prototype handled fairly complex composite queries too: take a city, an applicant, and the count of their documents, keep only those who imported a given product category, add the date of the last registration, sort by document count, collapse “LIMITED LIABILITY COMPANY” to “LLC,” and format the dates. All of that in one Russian sentence, with no SQL.
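To give a sense of how much SQL one such sentence replaces, here is a sketch of a comparable query. Everything in it is an assumption made for illustration: the `declarations` table, its columns, and the toy rows are invented, and SQLite stands in for ClickHouse, whose syntax and functions differ. The sketch also picks one reading of the request: it counts all documents of applicants that have at least one declaration in the given category.

```python
import sqlite3

# Hypothetical toy rows: (applicant, city, product_category, registered_on).
TOY_ROWS = [
    ("LIMITED LIABILITY COMPANY ALFA", "Moscow", "elevators", "2024-06-03"),
    ("LIMITED LIABILITY COMPANY ALFA", "Moscow", "elevators", "2024-06-10"),
    ("LIMITED LIABILITY COMPANY ALFA", "Moscow", "peaches", "2024-05-01"),
    ("LIMITED LIABILITY COMPANY BETA", "Kazan", "peaches", "2024-06-05"),
    ("LIMITED LIABILITY COMPANY GAMMA", "Kazan", "elevators", "2024-04-20"),
]

COMPOSITE_SQL = """
SELECT city,
       REPLACE(applicant, 'LIMITED LIABILITY COMPANY', 'LLC') AS applicant_short,
       COUNT(*) AS documents,
       strftime('%d.%m.%Y', MAX(registered_on)) AS last_registration
FROM declarations
WHERE applicant IN (
    SELECT applicant FROM declarations WHERE product_category = ?
)
GROUP BY city, applicant
ORDER BY documents DESC
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with the hypothetical schema and toy rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE declarations ("
        "applicant TEXT, city TEXT, product_category TEXT, registered_on TEXT)"
    )
    conn.executemany("INSERT INTO declarations VALUES (?, ?, ?, ?)", TOY_ROWS)
    return conn


def applicants_by_category(conn: sqlite3.Connection, category: str) -> list:
    """Return (city, short applicant name, document count, last registration date)."""
    return conn.execute(COMPOSITE_SQL, (category,)).fetchall()
```

Running `applicants_by_category(build_demo_db(), "elevators")` returns one row per applicant, sorted by document count, with the company form shortened and the date formatted as day.month.year.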

3. Showed honestly where the model fails.
The core engineering point of a product like this: the model can be confidently wrong. In the demo it used strict equality where a contains-search was needed, mixed up columns, and started pulling data from the applicant’s address instead of the right field. At times we had to literally bargain with it to get a correct result set. Without a “model proposes, human verifies” mode and continued training on edge cases, a tool like this quickly turns into a source of plausible nonsense. We raised this with the client from the start rather than leaving it for “we’ll sort it out later.”
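A “model proposes, human verifies” mode can be sketched as a small gate between the generated SQL and the database. The example below is a crude illustration, not the prototype's code: the column names are hypothetical, and the regex checks are heuristics, not a SQL parser (for instance, they do not catch a second statement appended after a SELECT).

```python
import re
from typing import Callable, Optional

# Hypothetical free-text columns where strict equality is usually a mistake.
FREE_TEXT_COLUMNS = {"applicant", "product_name", "applicant_address"}
NOT_SELECT = "Not a SELECT statement: refuse to run."


def review_generated_sql(sql: str) -> list:
    """Return warnings for the operator; an empty list means nothing was flagged."""
    warnings = []
    if not re.match(r"\s*select\b", sql, flags=re.IGNORECASE):
        warnings.append(NOT_SELECT)
    for column in sorted(FREE_TEXT_COLUMNS):
        # Flags `column = 'literal'`, the strict-equality pattern on free text.
        if re.search(rf"\b{column}\s*=\s*'", sql, flags=re.IGNORECASE):
            warnings.append(
                f"Strict equality on free-text column '{column}': "
                "was a contains-search intended?"
            )
    return warnings


def propose_then_verify(
    sql: str,
    approve: Callable[[str, list], bool],
    run: Callable[[str], list],
) -> Optional[list]:
    """Run the query only after the operator has seen the SQL and warnings and approved."""
    warnings = review_generated_sql(sql)
    if NOT_SELECT in warnings:
        return None  # Never offered for approval.
    if not approve(sql, warnings):
        return None  # Operator rejected the proposal.
    return run(sql)
```

Here `approve` stands for whatever the interface uses to show the proposed SQL and its warnings to the operator, and `run` for the actual database call. Rejected or flagged queries are natural candidates for the edge cases the model is trained on next.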

4. Mapped out what production needs.
The demo also exposed real limits. The model had no memory between queries: ask “peaches, Moscow” and then “now the analytics on them,” and it would not tie the two into one context. Carrying context across queries is a separate, more advanced level of work. A production version needed a separate powerful server and a person who keeps training the model like a student: “the dates took me an hour to teach it.” We said plainly that this involves a lot of research and experimentation, that the result is not guaranteed in advance, and that it should go into the budget in small steps, not as one large line item called “AI functionality.”
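The simplest form of memory between queries is to keep the last few question and SQL pairs and put them in front of each new question. The sketch below shows only that idea, under stated assumptions: the class name, the prompt layout, and the turn limit are hypothetical, and the prototype had nothing like this. Whether a given model actually uses such context well is exactly the kind of thing that needs experiment.

```python
from dataclasses import dataclass, field


@dataclass
class QuerySession:
    """Keeps the last few (question, generated SQL) pairs for one operator session."""

    max_turns: int = 3
    turns: list = field(default_factory=list)

    def remember(self, question: str, sql: str) -> None:
        self.turns.append((question, sql))
        # Keep only the most recent turns so the prompt stays bounded.
        self.turns = self.turns[-self.max_turns:]

    def build_prompt(self, question: str) -> str:
        """Prefix the new question with earlier turns so 'them' has a referent."""
        lines = ["Earlier questions in this session and the SQL produced for them:"]
        for past_question, past_sql in self.turns:
            lines.append(f"Q: {past_question}")
            lines.append(f"SQL: {past_sql}")
        lines.append(f"New question: {question}")
        return "\n".join(lines)
```

With this in place, a follow-up like “now the analytics on them” reaches the model together with the earlier “peaches, Moscow” query instead of on its own.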

How it was received

The reaction was what you would expect for raw R&D. The client described the first run bluntly: “an interesting modern thing, outputs nonsense for now of course.” Part of the “nonsense” turned out to be a mismatch of expectations: he was asking about certificates, while under the hood there was only the declarations database. After a couple of training iterations the tone shifted: “it solves complex queries,” “this came out nicely.” A few weeks later, once the result sets had been brought to what he wanted to see, the verdict got very short: “great.”

The client even wanted to show the prototype to partners. We reopened access to the test server for a while (between demos we shut it down so it would not burn resources) and warned that without a short instruction the model can talk nonsense to partners too.

Results

What we got | Value
Working demo | Query the declarations database in plain words instead of SQL, on live data
Map of the possible | What the model already does, where it is confidently wrong, what it lacks
Production requirements | Separate server, a human trainer, an “AI proposes, operator confirms” mode, budget in small steps
Order status | Stalled at demo level: the client did not order a production version, no engineer assigned

The main outcome here is not a shipped feature but removed illusions and a working starting point. The client saw what the technology can actually do on his data and what finishing it would cost. The topic stayed under review: a year later, in August 2025, we separately looked for off-the-shelf chatbot solutions for this task, and reported honestly that we found nothing mature enough for the client's specifics. So our own prototype remains the closest thing to the goal.

For us this is a normal way of working. We are willing to put our own time into an AI experiment so the client has something to touch rather than a slide, and can make a sober decision about budget.

Team

  • Anton Hersun, Xaver Pro: built the prototype, trained the model on real questions, ran the demo and the run-in with the client

The AI track stays at prototype stage: no separate engineer was assigned to a production version, and no order was signed. We ran the R&D at our own cost.

Screenshots and materials

Prototype on a closed test server, no public artifacts.

Thinking about bolting AI onto your own database or processes, but unsure where it pays off and where it becomes an expensive toy? We will build a prototype on your data, show what actually works and where the model lies with confidence, and help you work out how much to budget. The first review is free.


