
Migrating an infected NextCloud: 400 GB with no docker in 7 hours

The studio's old cloud caught a virus through docker. The fix was not a cleanup but a new server: a clean NextCloud, no containers, 400 GB moved in nightly windows. 7h tracked.

Internal file storage is invisible infrastructure. While it’s alive, nobody thinks about it: developers drop site dumps in, backup scripts file their archives there, everyone’s busy. People remember it when it starts falling over. Migrating a service like this always means balancing two opposing demands: there is a lot of data, and the downtime window is small. Here there was a third: the old server hosted malware that must not come along.

Snapshot

End-client sector: digital agency / web studio
End client: a studio with a fleet of 12+ VDS, Russia (under NDA)
Engagement: retainer DevOps support, direct contract
Project type: migrating an internal NextCloud from an infected server to a clean one
Work done: ~400 GB of files + database + all accounts; upgrade across 5 major versions
Project date: 10 Jun 2024 – 4 Sep 2024 (86 days; active migration in nightly windows, 23 Aug – 2 Sep)
Effort: 7h tracked, plus another 6h of support up to version 32 across 2024–2025
Team: 2 specialists (sysadmin-engineer · project manager)
Tech stack: NextCloud 24→29 · PHP · MySQL (utf8mb4) · occ CLI · Zabbix
Delivered: cloud on a new server with no docker, NextCloud 29, domain switched, backups and scheduler reconfigured, old server retired

The problem

The studio’s NextCloud cloud is a working tool, not an archive. Fresh backups of client sites land there daily so developers can spin up any project locally. And on that same server a virus had taken up residence, one the previous administrator “couldn’t dig out”: processes periodically ate the CPU, the cloud went unreachable, and the working practice came down to “restart the VDS and usually it’s fine.” Once, after a restart like that, the cloud came back with a 500.

The client framed the request themselves, right in the tracker task: fighting the virus is “long, painful, and expensive,” it’s simpler to deploy the latest NextCloud on a new server and move users and files over. Two hard conditions. First: no docker. By the previous admin’s account, the infection had flown in through containers. Second, verbatim: “The main thing is that the virus doesn’t migrate to the new server along with the files.”

The nastiest part of migrations like this isn’t the volume, it’s the combination. 400 GB over an 11 MB/s link is 10 to 11 hours of copying, and all that time somebody wants to use the storage. You can’t shut the cloud off for a day: developers lose the ability to deploy projects. You can’t copy it live and then switch over either: files uploaded during the copy get lost. And dragging the old executable code along with the data means moving the virus too, the very thing the whole exercise was meant to get rid of. Any one of these three mistakes turns planned work into an incident.
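The copy-time figure is easy to sanity-check. This is a minimal sketch that assumes decimal units (1 GB = 1000 MB) and a perfectly sustained link rate; the function name is illustrative. Under those assumptions the pure transfer comes out slightly above 10 hours, and a real copy will normally take longer than the ideal figure.

```python
def transfer_hours(size_gb: float, rate_mb_per_s: float) -> float:
    """Ideal copy time in hours, assuming 1 GB = 1000 MB and a constant rate."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    seconds = size_gb * 1000 / rate_mb_per_s
    return seconds / 3600


estimate = transfer_hours(400, 11)
print(f"{estimate:.1f} h of pure transfer")  # prints "10.1 h of pure transfer"
```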

How we did it

1. A new server instead of a cleanup. This is what we would have recommended ourselves, and it is not an admission of defeat. You can dig malware out of a running system forever: hours go by, and there’s no guarantee you’ve cleaned all of it. A clean server with a data move costs a predictable amount and closes the question entirely. While the client bought the server, we kept the old one alive with a brace: a system script now automatically killed find processes once more than five had piled up, since those were exactly what drove the load average up. 15 minutes of work, and the cloud survived until the migration with no outages caused by the virus.
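The brace can be pictured roughly as follows. This is a sketch, not the script that ran on the server: it assumes a Linux /proc layout, and it assumes the policy was to kill every find process once the count exceeded five (the text gives the threshold but not which processes were killed). All function names are illustrative.

```python
import os
import signal
from pathlib import Path

MAX_FIND = 5  # threshold from the text: act once more than five have piled up


def find_pids(proc_root: str = "/proc") -> list[int]:
    """Return PIDs whose command name is exactly 'find' (Linux /proc layout)."""
    pids = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # not a process directory
        try:
            name = (entry / "comm").read_text().strip()
        except OSError:
            continue  # process exited while we were looking
        if name == "find":
            pids.append(int(entry.name))
    return sorted(pids)


def pids_to_kill(pids: list[int], limit: int = MAX_FIND) -> list[int]:
    """Assumed policy: kill all find processes once the count exceeds the limit."""
    return list(pids) if len(pids) > limit else []


def main() -> None:
    for pid in pids_to_kill(find_pids()):
        try:
            os.kill(pid, signal.SIGKILL)
        except (ProcessLookupError, PermissionError):
            pass  # already gone, or not ours to kill
```

Calling main() from a periodic job (cron or a systemd timer, for instance) would give the behaviour described. Keeping the decision in pids_to_kill separate from the kill itself makes the threshold easy to test without touching real processes.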

2. From the old server, only data, not one byte of code. On the new server we deployed NextCloud straight onto the system, from a clean distribution of the latest version: no containers, on plain PHP with MySQL (database recreated from scratch, utf8mb4). Exactly three things moved over from the old machine: the data directory, config.php, and a database dump. The executable code, the prime suspect in the infection, never reached the new server at all.
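The "only three things move" rule is effectively an allowlist. The sketch below shows one way to express it; every path in it is hypothetical, since the text does not give the real locations on the old server.

```python
from pathlib import PurePosixPath

# Hypothetical locations on the old server; the real paths are not in the text.
ALLOWED = (
    PurePosixPath("/var/www/nextcloud/data"),               # user files
    PurePosixPath("/var/www/nextcloud/config/config.php"),  # instance settings
    PurePosixPath("/root/nextcloud-db.sql"),                # database dump
)


def may_transfer(path: str) -> bool:
    """True only for an allowlisted item or a path inside an allowlisted directory."""
    p = PurePosixPath(path)
    if ".." in p.parts:
        return False  # refuse paths that could climb out of the allowlist
    return any(p == allowed or allowed in p.parents for allowed in ALLOWED)


print(may_transfer("/var/www/nextcloud/data/alice/files/site.tar.gz"))  # True
print(may_transfer("/var/www/nextcloud/lib/base.php"))                  # False
```

Anything not covered by the allowlist, including the whole application tree, is rejected by default rather than filtered out after the fact.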

3. Nightly windows instead of a “big switchover.” We started copying the 400 GB in the evening after sign-off. The longest window was 11h 56m, and we gave the client that figure in advance, along with the arithmetic: 400 GB ÷ 11 MB/s ≈ 10 to 11 hours. By 8 a.m. the old cloud came back up. By day developers worked as usual. The final database sync ran on a separate evening, after which the old cloud went into maintenance mode so the database wouldn’t diverge while the team moved to the new address.
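Whether a copy fits a nightly window is a small date calculation. In this sketch the 20:00 start time and the date are assumptions chosen for illustration; only the 11h 56m duration and the 8 a.m. return come from the text. It also assumes the copy starts in the evening, so the deadline falls on the following morning.

```python
from datetime import datetime, timedelta


def window_end(start: datetime, copy_time: timedelta) -> datetime:
    """Moment the copy finishes if it runs without interruption."""
    return start + copy_time


def fits_window(start: datetime, copy_time: timedelta, back_by_hour: int = 8) -> bool:
    """True if the copy ends by back_by_hour on the morning after an evening start."""
    deadline = (start + timedelta(days=1)).replace(
        hour=back_by_hour, minute=0, second=0, microsecond=0
    )
    return window_end(start, copy_time) <= deadline


longest = timedelta(hours=11, minutes=56)      # longest window from the text
evening_start = datetime(2024, 8, 26, 20, 0)   # hypothetical start time
print(fits_window(evening_start, longest))     # True: ends at 07:56
```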

4. A stepped upgrade 24 → 25 → 26 → 27 → 28 → 29. The plan “install the latest version and feed it the old database” ran into a hard fact: NextCloud doesn’t upgrade by skipping major versions. We found this out on the spot and re-planned in an evening: we went through all 5 steps via occ from the console, with a client check after the first step. On versions 28/29 the platform disabled 8 incompatible apps (calendar, contacts, deck, groupfolders, notes, among others). We sent the full list to the client immediately, without waiting for the question. The verification from their side: “Access is intact, files show up, they download.”
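The no-skipping rule turns the upgrade into a fixed sequence of single-major hops. A minimal sketch of that planning step, with an illustrative function name; it only computes the order and does not run any upgrade:

```python
def upgrade_path(current: int, target: int) -> list[int]:
    """Major versions to install, in order, moving one major version at a time."""
    if target < current:
        raise ValueError("downgrades are not supported")
    return list(range(current + 1, target + 1))


steps = upgrade_path(24, 29)
print(steps)       # [25, 26, 27, 28, 29]
print(len(steps))  # 5
```

Each entry is one complete upgrade that has to finish, and ideally be checked, before the next one starts.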

5. We took root away from the developers and gave back exactly one command. After the migration the client asked to remove the developers’ root access to the cloud server. We made a separate user with rights to only the one data folder, and a sudo wrapper that runs a single command: occ files:scan against a specific path. The backup-upload scripts work, with no access to other files or to the NextCloud tree. 1.5 hours, closed in two days.
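A wrapper of this kind could look roughly like the sketch below. The install path, web server user, and NextCloud-internal scan path are all hypothetical, and accepting a validated sub-path is an assumption added for illustration; the text only says the wrapper ran occ files:scan against one specific path.

```python
import posixpath
import subprocess

OCC = "/var/www/nextcloud/occ"     # hypothetical install path
WEB_USER = "www-data"              # hypothetical web server user
ALLOWED_PREFIX = "/backups/files"  # hypothetical NextCloud-internal path


def build_scan_command(scan_path: str = ALLOWED_PREFIX) -> list[str]:
    """Build the one permitted command; reject any path outside the allowed prefix."""
    normalized = posixpath.normpath(scan_path)
    inside = normalized == ALLOWED_PREFIX or normalized.startswith(ALLOWED_PREFIX + "/")
    if not inside:
        raise ValueError(f"path not allowed: {scan_path!r}")
    return ["sudo", "-u", WEB_USER, "php", OCC, "files:scan", f"--path={normalized}"]


def main() -> None:
    # Arguments are passed as a list, so no shell ever interprets the path.
    subprocess.run(build_scan_command(), check=True)
```

The sudoers entry for the restricted user would then allow only this wrapper, which is what keeps the backup-upload scripts working without giving them anything else.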

Results

| Metric | Value |
|---|---|
| Data migrated | ~400 GB + database + all accounts |
| NextCloud version | 24 → 29 in one pass; up to 32 under support |
| Downtime in working hours | nightly windows by agreement; by 8 a.m. the cloud came back up |
| Longest window | 11h 56m (planned, the figure given in advance) |
| Tracked hours | 7h migration; 1.5h access without root; 4.5h upgrades to 32 |
| Post-migration verification | client check: accounts in place, files open and download |
| Virus | stayed on the old server; the server was shut down and decommissioned |

In short: the cloud moved to a clean server, came up 5 major versions higher, and kept working by day throughout the migration. The malware stayed on the old machine. We held it powered off for a week in case something was forgotten, then wiped it. Since then the cloud has lived without docker and upgrades on schedule.

The support tail is a separate part of the story. In March 2025 the upgrade to version 31 required raising PHP from 8.2 to 8.3. The web updater failed at the backup stage, so we went through the console on an agreed weekend. Along the way the theme and one cache plugin dropped off. The client learned both facts from a report, not from user complaints. The report also carried a line you rarely hear from a contractor closing a task: “stable is version 30, 31 is beta, so bear in mind, oddities are possible.” In October 2025 the cloud reached 32. Total since the migration: 24 → 32, no losses and no firefighting.

Process

| Phase | When | Result |
|---|---|---|
| Diagnosis and brace | 10–13 Jun 2024 | old server stabilized before the move: auto-kill of excess find (0.25h) |
| New server prep | 23 Aug | clean NextCloud distribution, no docker (0.5h) |
| Nightly copy | 26–27 Aug | 400 GB moved overnight, cloud worked by day (1.5h) |
| Database move, upgrade to 25 | 27–28 Aug | cloud lives on the new server, client checked files and access (1.5h) |
| Upgrade to 29 | 29 Aug | 4 major versions in an evening, report on disabled apps (1h) |
| Finish | 31 Aug – 4 Sep | backups, mail, scheduler, certificate, domain switch, old server in shutdown (2.25h) |
| Support tail | Sep 2024 – Dec 2025 | access without root for developers; upgrades 29→30→31→32 (6h) |

More than two months passed between the task being filed (10 June) and the active migration: the client was buying a server and agreeing priorities, while our brace held the old cloud through that time. The tracked hours are far fewer than the calendar span suggests because all the work ran in short nightly and evening windows around a live service.

Team

  • sysadmin-engineer (studio) — migration, stepped upgrades, sudo access, support up to version 32
  • project manager (studio) — window sign-off, hours control in the tracker
  • on the client side — head of department: verification after the move, DNS, priorities

If you too have a service that’s frightening to touch (infected, several major versions behind, or running in “we restarted it and it’s fine” mode), send us a brief or your current technical documentation. We’ll size up the volumes, links, and migration windows and come back with a fixed estimate in hours. The migration estimate is free.
