
Anime Upscaling: a self-hosted pipeline to bring DVD anime back to life in 4K

I have a decent stack of old anime DVDs sitting on a shelf and a modern TV that makes any 480i source look like finger painting. The goal was obvious: get those episodes up to a resolution and framerate that 2026 won’t laugh at, without spending months hand-encoding.

The first nudge came from Akita on Rails’ post on upscaling old anime with AI (PT) — well worth reading, it’s where I figured out how video2x does the heavy lifting on the quality jump. The catch is that his flow handles one video at a time. To work through a full collection (hundreds of episodes), I needed reusable pipelines, a job queue, a UI, and — maybe most importantly — a final optimization stage with FFmpeg so the output wouldn’t blow up the storage.

That’s the gap Anime Upscaling fills, and it’s now open on GitHub: https://github.com/IvanMicai/anime-upscaling.


What it is

A self-hosted app (Docker Compose) running a three-stage pipeline behind a queue, with a web UI to create, save, and dispatch jobs:

Upscale  →  Interpolation  →  Optimization
(video2x)    (RIFE)             (FFmpeg)

Bring it up with docker compose up -d and open http://localhost:4750. It runs CPU-only for tinkering, but it really shines on NVIDIA — the docker-compose.nvidia.yml overlay wires up the Container Toolkit and the pipeline drives the GPUs in parallel.
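
To sanity-check that the overlay actually handed the GPUs to the containers, you can probe from inside. The service name worker below is my guess; run docker compose ps to see the real one:

docker compose -f docker-compose.yml -f docker-compose.nvidia.yml \
  exec worker nvidia-smi -L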


The pipeline

Each stage is an isolated worker with its own flags and log stream. One stage’s output becomes the next stage’s input, through canonical folders (data/input → data/output → data/interpolated → data/optimized).

1. Upscale — video2x

Bumps the resolution with an ML model. The wrapper exposes three processors: realesrgan (default, with realesr-animevideov3 / realesrgan-plus-anime models), realcugan, and libplacebo (Anime4K-v4 via shaders). 2x/3x/4x scale and configurable denoise levels.
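
Under the hood this stage shells out to video2x. The call looks roughly like the sketch below; treat the flag spellings as an assumption on my part, since video2x’s CLI has changed between releases:

# Illustrative only: flag names vary across video2x versions
video2x -i episode01.mkv -o episode01_4x.mkv \
  -p realesrgan -s 4   # processor and 4x scale; add model/denoise flags per your version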

2. Interpolation — RIFE

Doubles or triples the framerate by generating intermediate frames. Supported models include rife-v4.6, rife-v4.26, and rife-UHD. There’s a configurable --scene-thresh so it doesn’t invent frames across scene cuts — the classic giveaway of bad interpolation.
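
If you want to see what the threshold is protecting, ffmpeg’s select filter exposes the same per-frame scene score and can list the hard cuts in an episode (standard ffmpeg syntax; 0.3 is just an example threshold):

# Print frame info for every frame whose scene-change score exceeds 0.3
ffmpeg -i in.mkv -vf "select='gt(scene,0.3)',showinfo" -f null - 2>&1 \
  | grep pts_time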

3. Optimization — FFmpeg

This is the stage that earns the word “pipeline” in the name. Without it, video2x output is huge. Default: libx265 10-bit, animation tune, with four quality presets mapped as ultra/alta/media/baixa (i.e. ultra/high/medium/low: CRF 16/19/22/26). When GPU_VENDOR is set, it switches to hevc_nvenc / h264_amf / hevc_qsv and uses the hardware encoder:

# NVENC has no -crf: quality is set via -cq, and its 10-bit format is p010le
ffmpeg -hwaccel cuda -i in.mkv \
  -c:v hevc_nvenc -preset fast -rc vbr \
  -cq 19 -pix_fmt p010le out.mkv
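
For the CPU path, the libx265 default maps to something like this sketch (the -c:a copy for audio is my assumption about how the stage treats non-video streams):

ffmpeg -i in.mkv \
  -c:v libx265 -preset medium -crf 19 -tune animation \
  -pix_fmt yuv420p10le -c:a copy out.mkv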

Typical result: a final file visually almost indistinguishable from the raw upscale, but at a fraction of the size — actually viable for a full library on the NAS.


Why it isn’t just a script

Most of the work wasn’t running video2x — it was everything around it. The features that justify a UI:

  • Saved, reusable pipelines. Define “DVD anime → 4K HEVC 10-bit CRF 19” once and fire it across an entire season’s worth of episodes.
  • Multi-GPU queue with configurable streams. GPU_COUNT × STREAMS_PER_GPU slots, dispatched interleaved across GPUs (job 1 → GPU 0, job 2 → GPU 1, and so on; see the sketch after this list).
  • GPU health monitor. Periodically probes nvidia-smi -L; if the driver wedges (Xid 119, GSP RPC timeout — anyone with a 50-series RTX knows the drill), dispatch is gated before zombie processes pile up. I learned this one the hard way in the previous post, rescuing my TrueNAS.
  • Skip detection + ETA. Before running, the pipeline detects stages whose output already exists and skips them. While running, it computes time-remaining from real frame/fps numbers — not a blind guess.
  • Live tabular log viewer. Columns: timestamp / source (GPU0-S1, FFMPEG-1, PIPELINE) / level (OK, ERRO, SKIP, WARN, STEP) / message, streamed over WebSocket. You can filter by GPU when one wedges and the other doesn’t.
  • File browser. Inspect input/, output/, interpolated/, optimized/ directly in the UI — no SSH-into-the-server.
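
The interleaved dispatch from the list above is just modular arithmetic over the slot grid. A minimal sketch of the assignment (my reconstruction of the behavior described, not the actual dispatcher code):

# Round-robin across GPUs first, then across stream slots
GPU_COUNT=2
STREAMS_PER_GPU=2
for job in 0 1 2 3; do
  gpu=$(( job % GPU_COUNT ))
  stream=$(( job / GPU_COUNT % STREAMS_PER_GPU + 1 ))
  echo "job $job -> GPU${gpu}-S${stream}"
done
# job 0 -> GPU0-S1, job 1 -> GPU1-S1, job 2 -> GPU0-S2, job 3 -> GPU1-S2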

Stack

Component        Tech
Frontend         Next.js (port 4750)
API              Go (port 4751)
Upscaler         video2x (real-esrgan / real-cugan / libplacebo)
Frame interp.    RIFE (rife-v4.6, rife-v4.26, rife-UHD)
Final encoder    FFmpeg (libx265 10-bit default, NVENC/AMF/QSV optional)
Deploy           Docker Compose + optional NVIDIA overlay

How to run it

cp .env.example .env
mkdir -p data/input data/output data/optimized data/interpolated data/temp

# CPU only
docker compose up -d --build

# With NVIDIA
docker compose -f docker-compose.yml -f docker-compose.nvidia.yml up -d --build

Open http://localhost:4750, drop videos into data/input/, and outputs land in data/optimized/ (or in the other folders depending on the pipeline). Before exposing it on a network, swap AUTH_PASSWORD and AUTH_SECRET in .env — the single-password gate is only meant for home use behind VPN/HTTPS.
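
For reference, the knobs mentioned throughout this post all live in .env. Every value below is a placeholder of mine; only the variable names come from the app:

# .env sketch: placeholder values, adjust for your box
GPU_VENDOR=nvidia          # switches the optimization stage to hevc_nvenc
GPU_COUNT=2                # GPUs available to the queue
STREAMS_PER_GPU=2          # parallel jobs per GPU
AUTH_PASSWORD=change-me    # single-password gate for the UI
AUTH_SECRET=long-random-string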


Wrap-up

It’s open on GitHub under MIT: https://github.com/IvanMicai/anime-upscaling. If you also have a stash of anime DVDs on a shelf or a 50-series RTX sitting idle, you can stand this up in a weekend and grind through the whole collection without becoming a full-time encoder.

Ideas welcome: new upscale models, quality presets tuned for other styles (not just animation), Sonarr/Radarr integration to run as an automatic post-process. Issue or PR — anything is fair game.

Credit again to Akita on Rails for the original push. If you want to swap notes on upscaling, encoders, or homelab stuff in general, give me a shout. 🦥