The agency has a QA process. It runs after handoff, on staging, before anything goes to the client. They are good at it. For the first year or so of the partnership, their QA was finding things in every build.
Not the serious things. Not layout collapses or broken forms or missing pages. Those got caught in our own review before anything left our hands. What the agency kept finding was the other tier of mistakes — the kind that look minor in isolation and cumulatively tell a story about how carefully a team has reviewed its own work.
A phone number rendered as plain text, not linked with tel:. An internal link pointing to the old client domain instead of the new one. A placeholder image left in the “Meet the Team” section — the generic person silhouette that ships with the starter template. A page title in all-lowercase because the developer’s copy-paste sourced from the brief, not from the agreed heading convention. A page still named “Sample Page” in the menu, because that cleanup had been on the list and fallen off it during the final push.
None of these are hard to fix. Each one takes under a minute when you know it is there.
The problem was not the fixes. The problem was the discovery. The agency was discovering these things after handoff. Which meant that every build we marked as complete and delivered to staging carried a handful of visible, fixable mistakes that the agency’s reviewer would find in the first ten minutes of a fresh walkthrough. Every time.
“They keep shipping us things that don’t look finished.” That is a reasonable conclusion to draw. We were drawing the same conclusion from the inside and finding it uncomfortable.
What was actually happening
The developer who builds a site for two weeks does not see it clearly by the end.
This is not a skill deficiency. It is a human constant. Extended exposure to a work-in-progress dulls the ability to see it as a new viewer would. The developer who has opened the homepage several hundred times in two weeks does not notice the placeholder image anymore. It has become part of the furniture. Their eyes pass over it on the way to the thing they are actually checking.
A fresh reviewer sees it immediately.
The developer who built it and the reviewer who sees it for the first time are not looking at the same site. They are looking at the same pixels, with very different prior exposure.
— Working note · 2024
The standard response to this problem is a pre-handoff checklist. Write down the things that tend to get missed. Run through it before handing off. Checklists work, up to a point. The point where they stop working is when the list gets long enough, and the project gets rushed enough, that a tired developer starts checking boxes by recall rather than by inspection. “Phone numbers — yeah, I did those.” They didn’t, or they did them on three pages and missed one. But the box is checked.
A document is only as good as the attention given to reading it. Attention is not reliably available at the end of a two-week build sprint.
What we needed was something that did not depend on the developer’s attention. Something that would check the known categories of mistakes systematically and block handoff if it found them — regardless of how tired the developer was, regardless of how well they remembered doing things.
Building the gate
We built a WordPress plugin. Internal name: Site Checker. It runs a battery of checks across the site before a staging URL is sent to the agency. The output is a pass/fail report. A single fail on any check blocks the handoff.
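The core of the gate is just an aggregator over independent checks: run everything, report everything, and block if anything failed. A minimal Python sketch of that shape (the actual plugin is PHP inside WordPress; `CheckResult` and `run_gate` are illustrative names, not the plugin's):

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str = ""


def run_gate(checks: List[Callable[[], CheckResult]]) -> bool:
    """Run every check and print a pass/fail report.

    Returns True only if every check passed; a single failure
    blocks the handoff.
    """
    results = [check() for check in checks]
    for r in results:
        status = "PASS" if r.passed else "FAIL"
        suffix = f" - {r.detail}" if r.detail else ""
        print(f"[{status}] {r.name}{suffix}")
    return all(r.passed for r in results)


# Illustrative checks; the real rules inspect the rendered site.
ok = run_gate([
    lambda: CheckResult("phone_format", True),
    lambda: CheckResult("standard_pages", False, "'sample-page' still published"),
])
# ok is False here: one failing check is enough to block the handoff
```

The design choice worth noting is the return type: the gate answers a single yes/no question, so the surrounding tooling never has to interpret a partial score.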
The check categories evolved over several iterations, but the stable version covers roughly forty rules across six groups:
# Site Checker — check categories (simplified)
phone_format:
  rule: all phone numbers must use tel:+1XXXXXXXXXX
  scope: all pages, all widgets, header, footer
  fail_on: plain-text numbers, tel: without country code

internal_links:
  rule: no links pointing to previous client domains or placeholder domains
  scope: all pages, navigation menus, footer
  fail_on: off-domain hrefs that are not intentional outbound links

content_language:
  rule: no Cyrillic characters in published content
  scope: all rendered text, widget labels, custom field values
  fail_on: any Cyrillic codepoint outside wp-admin UI strings

placeholder_assets:
  rule: no unmodified template placeholder images
  scope: media library usage on published pages
  fail_on: known placeholder filenames, known placeholder dimensions with no alt text

page_titles:
  rule: heading case follows site convention (Title Case)
  scope: H1 and H2 elements on all published pages
  fail_on: all-lowercase headings, inconsistent capitalisation within page

standard_pages:
  rule: WordPress sample page and default "Hello world!" post removed
  scope: all pages and posts
  fail_on: "sample-page" or "hello-world" slug present and published
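Most of these rules reduce to a pattern scan over rendered HTML. As an example, the phone rule can be sketched in Python as "strip out every compliant tel: link, then flag anything that still looks like a phone number." The regexes here are deliberately simplified illustrations, not the plugin's production patterns:

```python
import re

# Digit sequences that look like a US phone number in visible text.
PHONE_TEXT = re.compile(
    r"(?<!\d)(?:\+1[\s.-]?)?\(?\d{3}\)?[\s.-]\d{3}[\s.-]\d{4}(?!\d)"
)

# A compliant link: tel: with the +1 country code, digits only.
COMPLIANT_TEL_LINK = re.compile(
    r'<a\s+href="tel:\+1\d{10}"[^>]*>.*?</a>', re.S
)


def phone_failures(html: str) -> list:
    """Return phone-like strings not wrapped in a compliant tel: link."""
    # Remove correctly linked numbers first, so only the unlinked
    # (or mis-linked) ones remain to be flagged.
    stripped = COMPLIANT_TEL_LINK.sub("", html)
    return [m.group(0) for m in PHONE_TEXT.finditer(stripped)]
```

So `phone_failures('<p>Call 555-123-4567</p>')` flags the plain-text number, while a properly formed `<a href="tel:+15551234567">...</a>` passes clean.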
The Cyrillic rule needs an explanation.
Our development team writes internal notes in Russian. That is not a problem — internal notes stay internal. The problem is copy-paste. A developer writes a working note in Russian, pastes a fragment somewhere to use as a placeholder while building, and forgets to replace it before handoff. On a WordPress site with Elementor, that fragment can end up serialised in a widget’s content field where it is not immediately visible in the editor — it renders on the front end but only if you are looking at the right viewport at the right scroll depth.
The agency found this on two sites. Both times, the Cyrillic text was in a widget label that a client would eventually see. One of them was in a contact section. We added the rule. Site Checker now checks every rendered text output, including Elementor serialised fields, for Cyrillic codepoints. It finds things that a visual review would miss.
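The scan itself is nothing more than a codepoint-range test applied to every rendered string. A Python sketch of the idea (the plugin does this in PHP, over page output and Elementor's serialised fields; this version covers only the basic Cyrillic block, which is enough to catch pasted Russian working notes):

```python
def find_cyrillic(text: str) -> list:
    """Return (index, char) pairs for Cyrillic codepoints in rendered text.

    The basic Cyrillic block is U+0400..U+04FF; supplements exist,
    but this range catches ordinary Russian prose.
    """
    return [(i, ch) for i, ch in enumerate(text) if "\u0400" <= ch <= "\u04ff"]
```

Because the check works at the codepoint level, it finds a single stray Cyrillic character buried in a widget label just as reliably as a full pasted paragraph.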
The API key discovery was different in kind from the other checks. During a review of one project’s AutoQA configuration workbook, Site Checker’s content scanner found what appeared to be an exposed OpenAI API key in a comment field. It was a test key from an early AI-assisted QA experiment, left in a public-facing page’s custom fields during development and never removed. The key had not been activated in production, but it was visible to anyone who viewed source. We invalidated it and added credential-pattern scanning to the check set. Structural QA tools sometimes surface things that were never their original scope.
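Credential-pattern scanning is the same machinery pointed at different regexes. A hedged Python sketch; the two patterns are simplified illustrations of common key shapes (the `sk-` prefix matches classic OpenAI-style keys, `AKIA` matches AWS access key IDs), not the plugin's actual rule set:

```python
import re

# Illustrative credential shapes; a real scanner would carry more
# patterns and an allow-list for known false positives.
CREDENTIAL_PATTERNS = [
    re.compile(r"\bsk-[A-Za-z0-9]{20,}\b"),   # OpenAI-style secret key
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),       # AWS access key ID
]


def find_credentials(text: str) -> list:
    """Return substrings that match a known credential shape."""
    hits = []
    for pattern in CREDENTIAL_PATTERNS:
        hits.extend(pattern.findall(text))
    return hits
```

A hit from this scan is only a candidate, not a confirmed leak; the follow-up (invalidate the key, trace where it came from) is a human step.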
What the gate changed
The first quarter after deploying Site Checker, the agency’s QA comments on convention-level issues dropped sharply.
Not to zero. The agency’s QA runs on a broader surface than ours does — they include content accuracy, brand alignment, and client-side requirements that we do not have full visibility on. Those comments continued, and correctly so. But the ten-minute discovery items — the plain-text phones, the placeholder images, the sample pages — those essentially stopped.
The shift was not just in the comment count. It was in the nature of the conversation. Before Site Checker, the review opening was often a pass through the basics — conventions, format, cleanup. After Site Checker, the opening skipped that tier entirely. The agency’s reviewers began their reviews where the substantive work was, because the baseline was already cleared.
A new developer we brought in partway through a project sprint received a link to the Site Checker report rather than a verbal walkthrough of the conventions. The report was a cleaner onboarding artefact than anything we had written, because it was specific to the actual state of the site they were finishing. “There are two failing checks: phone numbers on the contact page, and one H2 that’s all lowercase in the services section.” That is better than a general checklist, because it tells them exactly what to look at.
When we onboard a new developer, Site Checker tells them what is wrong with this site. Not what tends to go wrong with sites in general — what is wrong with the one they are about to hand off.
— Working note · 2025
The most durable effect was on the baseline assumption the agency brought to incoming handoffs. Before Site Checker, the working assumption was something like: “the developer has done their best, and we should expect to find convention-level issues.” A defensive scanning mode. After several consecutive clean handoffs, the assumption shifted: “the basic structure will be correct; let’s go straight to the substantive review.” That shift is worth more than any individual fix, because it means the agency’s review time is being spent on things only they can see, not on things we should have caught ourselves.
What the gate does not do
Site Checker does not replace the agency’s QA. It is not intended to.
The agency’s QA covers things we cannot check: whether the content is accurate to the client’s actual business, whether the imagery is on-brand, whether the page structure makes sense for the marketing goals of the campaign, whether there are accessibility issues specific to the client’s user base. None of that is in our gate. All of that is in theirs.
Site Checker is a structural gate for a specific, well-defined category of problem: the kind of mistake that is embarrassing to ship, belongs clearly to our layer of the build, and is reliably checkable by a program. Nothing more.
The boundary matters. A gate that tries to do too much becomes slow, brittle, and avoided. If every edge case and one-off judgment call flows into the check set, eventually the report is full of false positives and developers start dismissing it. A gate that does one thing well and only one thing builds trust in its signal.
We have resisted the temptation to add checks that are contextual. “Does the hero image meet the client’s brand guidelines?” is not a check we can write. “Is there a phone number on the contact page that does not use the tel: format?” is. We write the second kind and leave the first kind to humans.
When to turn the gate off
There are phases in development where the gate should be bypassed. Early prototyping, client-facing wireframe review, research branches where the point is to explore rather than deliver — running a fail-zero gate during these phases would be counterproductive. The prototype is supposed to have placeholder images. The wireframe is not supposed to have final phone numbers.
We suppress Site Checker in development environments flagged as prototypes and re-enable it at the point where a build is being prepared for staging handoff. The flag is manual. Developers can turn it off, and they have the judgment to know when they should. The gate is not a control mechanism — it is a discipline mechanism. The distinction is important for the team to maintain.
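The bypass can be as small as a single environment flag that the gate consults before running. A Python sketch of that shape; `SITE_CHECKER_PROTOTYPE` is an invented name for illustration, not the plugin's actual setting:

```python
import os


def gate_enabled() -> bool:
    """The gate runs unless the environment is explicitly flagged
    as a prototype. Defaulting to 'on' means forgetting the flag
    fails safe: the checks run."""
    return os.environ.get("SITE_CHECKER_PROTOTYPE", "0") != "1"
```

The default-on behaviour is the important part: skipping the gate has to be a deliberate act, never an accident of configuration.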
The mechanism only works if the team believes in it. A gate that feels like an obstacle gets circumvented. A gate that feels like a useful tool gets used. We have managed to keep it in the second category by being disciplined about what goes into it: only checks that are clearly ours to own, only rules that have a documented reason for existing, only failures that a developer will recognise as genuine.
The lesson for the agency
If you work with development subcontractors, the question to ask is not “do they do QA?” The answer is always yes, in some form. The better question is: “who owns which gate, and at what stage does each gate run?”
A contractor who does a visual review before handoff and calls it done is asking your QA team to be their final line of defence against structural mistakes. Your reviewers will find those mistakes. They will fix them, or they will send comments, or they will learn to expect them. All of those outcomes have a cost that is invisible in any individual project and substantial across a portfolio.
A contractor with a pre-handoff structural gate — even a simple one — is coming to you with a different offer. “We have already cleared the baseline. Your review can start at the layer that matters.” That is a different working relationship, and a meaningfully better use of your team’s attention.
The gate does not need to be sophisticated to be useful. A checklist that is actually enforced is better than a tool that is sometimes bypassed. A ten-rule automated check that reliably runs is better than a forty-rule checklist that gets skimmed at the end of a sprint. The point is not the sophistication. The point is the reliability.
The embarrassing mistakes — the placeholder images, the plain-text phones, the sample pages — do not indicate a team that doesn’t care. They indicate a team that has not structured its process to catch them before they leave. Structuring the process to catch them is not expensive. The first version of Site Checker was two hundred lines of PHP. The return, in reduced review load and in the quality of the working relationship, showed up within the first quarter.
The gate we should have had on day one would have changed the first year.