
A two-byte diagnosis: a seven-month bug closed in one evening

Seven months of broken Excel downloads: fine on disk, unopenable in the browser. Traced in one evening to two stray bytes, a CRLF, emitted by the code.

1h delivered
Broken XLSX diagnosis: 2 bytes of CRLF, 1 hour of work

Some bugs live for months precisely because they belong to no one. The developers look at the code: the code is clean. The admins look at the server: the server ships files the way it should. The bug sits on the seam and outlasts both sides. Here that seam came down to two bytes.

Snapshot

End-client sector | children’s recreation: a summer-camp site with online booking of passes
End client | a studio client; an admin panel for season reports
Engagement | DevOps retainer for the studio: tracker plus a shared chat
Project type | diagnosing an application bug on the seam between code and server
Work done | 1 report service (Yii 2 / PHP 7.4), 1 server (nginx + Apache)
Project date | 5 Sep 2024 – 11 Sep 2024 (6 days)
Effort | 1h tracked
Team | 2 specialists (sysadmin-engineer · project manager)
Tech stack | Yii 2 · PHP 7.4 · nginx · Apache · dd / xxd / file
Delivered | cause found in an evening: 2 stray CRLF bytes from the code; a workaround delivery path live since 11 Sep 2024

The problem

In early 2024 the studio moved the sites of its client, a children’s summer camp, off a third-party host and onto its own server. After the move one function in the booking admin broke: the season reports. The “generate document” button hands back an Excel file that won’t open. The same report saved to the server’s disk and pulled over FTP opens without a single error.

The request came in the client’s own words, no polish on it: “We found no problems in the code, and it feels like it gets corrupted somewhere at the moment of download into the browser. Either nginx mangles it on the way out, or something else. Can you take a look, analyse it?” By then the problem had been running since February: seven months. The camp’s staff had been given FTP access and were pulling reports off the server by hand. Survivable, but a chore.

Why this is hard

A bug like this costs you not in hours of work but in how long it hangs around. It doesn’t take the site down, so it never lands in the queue of urgent outages. The developers checked their own ground and honestly found nothing. On the old host the same code worked, so “it must be the server”. The admin looks at the server and the server is fine. Everyone is right, and the file is still broken. From that point a studio usually has two roads: rewrite the export blind, or ferry reports over FTP forever. Whoever pays for support needs a third one: pin down the guilty layer inside a finite, known-in-advance window.

How we did it

1. Fact first, hypotheses after. We took two copies of one report: the one the browser downloaded and its twin from the server’s disk. The downloaded one was 2 bytes longer. We trimmed those 2 bytes with dd ... bs=1 skip=2, and the “broken” file opened. The file utility confirmed it: it identified the trimmed copy as Microsoft Excel 2007+ and the original as nothing more than data. We pulled the bytes themselves and read them in xxd: 0d0a. That’s CRLF, a line break. The task statement shrank from “something gets corrupted somewhere” to “someone is prepending \r\n before the file body.”
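The byte-level check in step 1 can be replayed on stand-in files. What follows is a minimal sketch, not the commands from the engagement: the file names and contents are invented for illustration, and od stands in for xxd so the script needs only POSIX tools. The one fact it leans on is that an XLSX file is a ZIP container and so begins with the bytes 50 4b 03 04 ("PK").

```shell
#!/bin/sh
# Sketch: reproduce the two-byte diagnosis on synthetic stand-in files.
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Stand-in for the healthy report on disk (hypothetical content after the ZIP magic).
printf 'PK\003\004fake-xlsx-body' > "$work/disk.xlsx"
# Stand-in for the browser download: the same bytes with CRLF in front.
{ printf '\r\n'; cat "$work/disk.xlsx"; } > "$work/download.xlsx"

# Size of a file in bytes.
size_of() { wc -c < "$1" | tr -d ' '; }
# First N bytes of a file as hex, no separators (od used in place of xxd).
first_bytes() { dd if="$1" bs=1 count="$2" 2>/dev/null | od -An -tx1 | tr -d ' \n'; }

# How many bytes longer is the download, and what are the leading bytes?
extra=$(( $(size_of "$work/download.xlsx") - $(size_of "$work/disk.xlsx") ))
prefix=$(first_bytes "$work/download.xlsx" "$extra")

# Trim the extra leading bytes, then compare with the copy from disk.
dd if="$work/download.xlsx" of="$work/trimmed.xlsx" bs=1 skip="$extra" 2>/dev/null
if cmp -s "$work/trimmed.xlsx" "$work/disk.xlsx"; then
  verdict=identical
else
  verdict=different
fi

echo "extra bytes: $extra, prefix: $prefix, trimmed vs disk: $verdict"
```

Comparing sizes first keeps the check cheap: only when the two copies differ in length is it worth opening a hex dump.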

2. We found the guilty layer with controlled deliveries, not arguments. The engineer dismissed the nginx theory (compression, response rewriting) right away, from experience: nginx does nothing of the kind to a response body, which the later tests bore out. Same story for Apache. Two tests settled it. The same delivery code on a neighbouring project on the same server, but on PHP 8.2, returned the file intact. And a three-line PHP script that grabs a finished XLSX off disk and ships it past the framework returned an intact file from the very same host. The disk is clean, the web servers are clean, PHP is clean. One suspect left: the Yii delivery chain.
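The controlled deliveries in step 2 come down to one question per path: does the response body start with the XLSX signature, or with something in front of it? Below is a minimal sketch of that classification, assuming each response body has already been saved to a file (for example with curl -s -o against the relevant URL). The file names and contents are stand-ins, not data from the engagement.

```shell
#!/bin/sh
# Sketch: classify saved response bodies by their leading bytes.
set -eu

# First N bytes of a file as hex, no separators.
hex_head() { dd if="$1" bs=1 count="$2" 2>/dev/null | od -An -tx1 | tr -d ' \n'; }

# An XLSX body is a ZIP container and must begin with 50 4b 03 04.
classify_body() {
  case $(hex_head "$1" 6) in
    504b0304*)    echo intact ;;
    0d0a504b0304) echo crlf-prefixed ;;
    *)            echo unknown ;;
  esac
}

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical stand-ins for two delivery paths on the same host:
# "bypass" ships the file past the framework, "framework" goes through it.
printf 'PK\003\004body' > "$work/bypass.bin"
printf '\r\nPK\003\004body' > "$work/framework.bin"

for name in bypass framework; do
  echo "$name: $(classify_body "$work/$name.bin")"
done
```

Running the same classifier against every path turns "which layer is it" into a table of intact and prefixed results, which is what lets whole layers be crossed off at once.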

3. A run of “didn’t work” is also a result. In parallel the studio’s developers were testing pointed theories. MIME type application/zip instead of xlsx: no. A slash in Content-Disposition (a full path instead of a filename, found along the way and fixed): the download name became correct, the file still broken. sendFile() with no manual headers at all: broken. PHP’s auto_detect_line_endings setting: no. Each miss narrowed the circle. It was not the headers and not the configuration: the CRLF is injected somewhere between sending the headers and the file body.

4. We picked the fix by the cost of the question. We could have dug through the framework’s guts on PHP 7.4 hunting the line that prints the extra break. Instead the engineer proposed a workaround: a separate shim script outside Yii that takes an identifier, grabs the finished file off disk, and ships it to the browser, with the interface handing out a link to the shim. And he named it for what it was: “a crutch, yes. but it’ll work.” Five days later the developers confirmed in chat: “The little crutch works))”. Reports download from the admin again, and the FTP merry-go-round ended.
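The shape of the shim from step 4 can be sketched briefly. The real shim is a script outside Yii on the studio's PHP host; the version below is a CGI-style shell function, used here only to show the two things such a shim has to get right: accept nothing but a plain identifier, and put the file body immediately after the blank line that ends the headers. The directory layout, the report-<id>.xlsx naming, and the function name are all hypothetical.

```shell
#!/bin/sh
# Sketch: a CGI-style delivery shim that maps an identifier to a finished file.
set -eu

# Hypothetical layout: finished reports live in $REPORT_DIR as report-<id>.xlsx.
serve_report() {
  id=$1
  # Accept digits only, so an identifier can never escape the report directory.
  case $id in
    ''|*[!0-9]*) printf 'Status: 400 Bad Request\r\n\r\n'; return 1 ;;
  esac
  file="$REPORT_DIR/report-$id.xlsx"
  [ -f "$file" ] || { printf 'Status: 404 Not Found\r\n\r\n'; return 1; }
  printf 'Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet\r\n'
  printf 'Content-Disposition: attachment; filename="report-%s.xlsx"\r\n' "$id"
  printf 'Content-Length: %s\r\n\r\n' "$(wc -c < "$file" | tr -d ' ')"
  # Nothing may be printed between the blank line above and the file bytes.
  cat "$file"
}

# Demo on a stand-in report.
REPORT_DIR=$(mktemp -d)
trap 'rm -rf "$REPORT_DIR"' EXIT
printf 'PK\003\004demo' > "$REPORT_DIR/report-42.xlsx"
serve_report 42 > "$REPORT_DIR/response.bin"

body_size=$(wc -c < "$REPORT_DIR/report-42.xlsx" | tr -d ' ')
total=$(wc -c < "$REPORT_DIR/response.bin" | tr -d ' ')
hdr_size=$(( total - body_size ))

# The header block must end in CRLF CRLF and the rest must equal the file.
sep=$(dd if="$REPORT_DIR/response.bin" bs=1 skip=$(( hdr_size - 4 )) count=4 2>/dev/null \
      | od -An -tx1 | tr -d ' \n')
if tail -c "$body_size" "$REPORT_DIR/response.bin" | cmp -s - "$REPORT_DIR/report-42.xlsx"; then
  body_ok=yes
else
  body_ok=no
fi
echo "header terminator: $sep, body intact: $body_ok"
```

The self-check at the end is the same two-byte test applied to the shim's own output, so a stray line break in the workaround would be caught before anyone downloads a report through it.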

Incidents and response

Downtime: zero. Diagnosis ran on a live admin panel: experiments were driven on a test season without touching real bookings. The engineer handed back the access granted for the check the same day, and reminded the client himself: “you can take the admin access away now.” There were no destructive actions on the server at all: all the byte-level surgery was done on copies of the files.

Results

Metric | Value
Age of the problem at the time of the request | ~7 months (since February 2024)
Time to localise the guilty layer | one evening: started at 15:30, culprit identified by 18:39
Effort on the tracker | 1h
Physical size of the defect | 2 bytes (0d0a, CRLF) before the XLSX body
Layers eliminated | disk → nginx → Apache → PHP version; what remains is the framework’s delivery chain
Status | workaround delivery live since 11 Sep 2024, confirmed by the developers

In plain terms: for seven months the service shipped unreadable reports, and no one could say whose problem it was. In one evening the question closed: corrupted files differ from healthy ones by exactly two bytes, and it is the code, not the server, that prepends them. While the developers decide whether to fix the framework’s delivery chain, reports download through the shim. The studio’s client has a working admin again. The studio has a reasoned answer on what to fix next.

Team

  • sysadmin-engineer (studio) — byte-level diagnosis, layer elimination, the workaround-delivery scheme
  • project manager (studio) — parallel hypotheses (nginx compression, auto_detect_line_endings), coordination with the studio’s developers

The in-code checks were run by developers on the client side: the diagnosis was four-handed work, in one chat, in one evening.

Anton Hersun, Xaver Pro — project manager.

A familiar story: a file “gets corrupted who-knows-where,” and a service lives on a workaround for months? Send us a description of the symptoms. We’ll take a look, name the guilty layer, and come back with an estimate in hours. The review is free.

Request a file diagnosis →
