dev mode: vendored esbuild 0.23.1 leaks kernel zone memory on macOS, causes kernel panic #3508

@CVjay

Summary

trigger.dev@4.4.5 (CLI 4.4.3) vendors @esbuild/darwin-arm64@0.23.1. After ~10 hours of trigger dev watch mode on a Mac Studio (M1 Max, macOS 26.3, kernel xnu-12377.81.4), the long-running esbuild service process accumulates kqueue/knote handles in kernel zone data.kalloc.1024, eventually exhausting the zone and causing a kernel panic.

Repro environment

  • macOS 26.3 (25D125), Darwin 25.3.0, T6000 (M1 Max)
  • Mac Studio, ~24 GB free RAM at panic time
  • trigger.dev@4.4.3 dev --skip-update-check running as a launchd LaunchAgent (KeepAlive=true)
  • The project has Telegram (gramjs), Supabase, and Sentry SDKs in src/triggers/ — sourcemap upload via sentryEsbuildPlugin

Panic data (verbatim from panic-full-2026-05-01-102441.0002.panic)

panic(cpu 1 caller 0xfffffe002fae9d58): zalloc[3]: zone map exhausted while
allocating from zone [data.kalloc.1024], likely due to memory leak in zone
[data.kalloc.1024] (20G, 21273136 elements allocated) @zalloc.c:4534

Largest zones at panic:

data.kalloc.1024            20G         0B
APFS_INODES                106M       361K
compressor_segment          43M      4558K

Process record for the leaker (procname: esbuild, pid 1302):

residentMemoryBytes: 1200313088   // 1.20 GB user-space RES
userTimeTask:        7809.37 s    // ≈ 130 min user
systemTimeTask:    24586.82 s     // ≈ 410 min kernel  ← 76% in kernel
pageFaults:        3 507 440      // 3.5M

Per-process uptime ≈ 38758 s (≈ 10.8 h). The system-time to user-time ratio of ≈ 3.15 is the smoking gun: for a JS bundler we expect the inverse ratio. 21,273,136 elements × 1024 B ≈ 21.8 GB (≈ 20.3 GiB), which matches the reported 20G zone size and is consistent with each leaked handle being padded into a kalloc.1024 slot.
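For anyone who wants to re-check the arithmetic, here is a minimal sketch that recomputes the ratios from the panic figures quoted above. The one-slot-per-element assumption and the average growth rate are illustrative; the rate in particular assumes the zone started near empty and that esbuild was the only contributor, which the panic log does not prove.

```python
# Figures copied verbatim from the panic report above.
USER_TIME_S = 7809.37
SYSTEM_TIME_S = 24586.82
UPTIME_S = 38758
ELEMENTS = 21_273_136
SLOT_BYTES = 1024  # assumption: each element occupies one kalloc.1024 slot

# Ratio of kernel CPU time to user CPU time for the esbuild process.
sys_to_user = SYSTEM_TIME_S / USER_TIME_S

# Share of the process's total CPU time that was spent in the kernel.
kernel_share = SYSTEM_TIME_S / (SYSTEM_TIME_S + USER_TIME_S)

# Zone size implied by the element count, in bytes and in GiB.
zone_bytes = ELEMENTS * SLOT_BYTES
zone_gib = zone_bytes / 2**30

# Average growth rate, assuming the zone started near empty (unverified).
elements_per_second = ELEMENTS / UPTIME_S

print(f"sys/user ratio    : {sys_to_user:.2f}")
print(f"kernel CPU share  : {kernel_share:.1%}")
print(f"implied zone size : {zone_bytes:,} B ({zone_gib:.2f} GiB)")
print(f"avg growth        : {elements_per_second:.0f} elements/s")
```

The implied size of about 20.3 GiB agrees with the 20G the kernel reported for data.kalloc.1024.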

Root cause hypothesis

esbuild 0.25.0 shipped a fix described as “Fix a memory leak when using watch mode” (Feb 2025). trigger.dev 4.x pins ^0.23.0, a range that cannot resolve to 0.25.x, so dev-mode users on macOS with long-running watch sessions are likely exposed to this.
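To illustrate why a ^0.23.0 range never picks up the 0.25.0 fix, here is a simplified sketch of caret-range matching as npm-style semver defines it for 0.x versions. The helper names are hypothetical and the sketch ignores prerelease tags; it is not the resolver that npm or pnpm actually run.

```python
def parse(version: str) -> tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into integers (no prerelease support)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def caret_upper_bound(base: tuple[int, int, int]) -> tuple[int, int, int]:
    """Exclusive upper bound: caret keeps the left-most non-zero part fixed."""
    major, minor, patch = base
    if major > 0:
        return (major + 1, 0, 0)
    if minor > 0:
        return (0, minor + 1, 0)
    return (0, 0, patch + 1)


def satisfies_caret(version: str, caret_base: str) -> bool:
    """True if `version` falls inside the range `^caret_base`."""
    v, base = parse(version), parse(caret_base)
    return base <= v < caret_upper_bound(base)


for candidate in ("0.23.1", "0.24.2", "0.25.0"):
    print(candidate, satisfies_caret(candidate, "0.23.0"))
```

Under these rules ^0.23.0 means >=0.23.0 <0.24.0, so only 0.23.x releases match and the fix can only arrive through a change to the declared range.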

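The data.kalloc.1024 zone in XNU is commonly the home for kqueue knote/filterops/vnode-watch structures, which would line up with a leak in esbuild’s file watcher. (esbuild documents its watch mode as polling-based rather than fsevents-based, so the exact kernel object being leaked is unconfirmed.)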

Why it bit hard here

KeepAlive=true on the launchd job restarted the leaker after the panic, so the cycle repeated on every reboot. The process I’m currently running under the workaround (PID 48774, esbuild service not yet spawned) had already reached the same telemetry profile (≈ 1 GB RSS, sys-time:user-time ≈ 3.5:1 at 22 h uptime) before I intervened.

Workaround in place

External wrapper (bash) that:

  1. Spawns a watchdog that samples vendored esbuild RSS every 5 min.
  2. Sends kill -TERM to the wrapper tree if RSS > 1.5 GB or wrapper uptime > 6 h.
  3. Switches launchd KeepAlive to { SuccessfulExit=false, Crashed=true } and ThrottleInterval=600 so launchd restarts cleanly without dogpiling.
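The actual wrapper is bash and is not reproduced here. As a rough sketch of the same policy, the snippet below encodes the two thresholds and the launchd keys from the list above. The function and constant names are hypothetical, "1.5 GB" is assumed to mean decimal gigabytes, and how RSS is sampled (for example via ps) is left out.

```python
import plistlib

RSS_LIMIT_BYTES = 1_500_000_000   # 1.5 GB; assumption: decimal gigabytes
UPTIME_LIMIT_S = 6 * 3600         # 6 h wrapper uptime cap
SAMPLE_INTERVAL_S = 5 * 60        # watchdog samples every 5 min


def should_terminate(rss_bytes: int, uptime_s: float) -> bool:
    """Return True when either watchdog threshold is exceeded."""
    return rss_bytes > RSS_LIMIT_BYTES or uptime_s > UPTIME_LIMIT_S


# launchd keys described in step 3, serialised as an XML plist fragment.
launchd_overrides = {
    "KeepAlive": {"SuccessfulExit": False, "Crashed": True},
    "ThrottleInterval": 600,
}
plist_xml = plistlib.dumps(launchd_overrides)

# The esbuild process from the panic report: 1.2 GB RSS at 38758 s uptime.
print(should_terminate(1_200_313_088, 38758))
print(plist_xml.decode())
```

With the figures from the panic report, the RSS threshold alone would not have fired (1.2 GB is below 1.5 GB); it is the 6 h uptime cap that would have ended the process well before the 10.8 h mark.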

Happy to share the wrapper if that helps other users, but the real fix is upstream.

Ask

Please bump esbuild to ≥ 0.25 (preferably 0.27+) in trigger.dev 4.x — or expose a knob to allow users to override via pnpm/npm overrides. Right now overriding from the consumer side breaks the trigger.dev internal API contract.

Thanks!
