Job running keeps blowing stuff up
Closed, Resolved | Public

Description

In a nutshell, every 5 minutes when jobs are run there is a load spike, often up to 7 or 8 cores and sometimes even above 8. Job running needs to be moved to a dedicated job runner eventually so that it stops impacting the rest of the site.

Event Timeline

Zippybonzo lowered the priority of this task from Urgent to Normal.

In a nutshell, we either need to optimise how jobs are run or switch to dedicated job runners.

Zippybonzo raised the priority of this task from Normal to Unbreak Now! (Jan 27 2026, 8:32 PM)

Given the recent outage, caused by what I can only assume was fpm running out of resources, this is definitely a UBN issue now.

I'm guessing that's why both SkyWiki and WikiOasis had loading issues a few days ago.

On a side note, it seems jobs handled by TimedMediaHandler don't blow up the servers. Strange, I must say...

They do — transcoding is one of the most intensive operations

In T56#679, @Zippybonzo wrote:

They do — transcoding is one of the most intensive operations

I don't have any issues even hours or days after transcoding, and video works just fine. If anything, there could be other things in conjunction with TimedMediaHandler causing these blow-ups. Sure, TMH is one of the most resource-intensive, but that alone hasn't caused any issues whatsoever. Other resource-intensive tasks running alongside TMH might be the issue.

It's more that when the jobs do run, they are very intensive. Now that mwtask11 is online, job running is less intensive, although the cross-continent latency for mwtask11 does still slow it down.

Should now be resolved with the switch to running jobs on mwtask11 and running the web server on wikioasis11.
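
For anyone reading this later, here is a rough sketch of what a capped job run on a dedicated host could look like. It is not the actual setup on mwtask11. The install path, the load threshold, and the job and time limits are all made-up values for illustration; the only real pieces are MediaWiki's `maintenance/runJobs.php` script and its `--maxjobs` and `--maxtime` options. A wiki farm would most likely also need a `--wiki` argument per wiki, which is left out here.

```python
import os
import subprocess
from typing import List

# All values below are assumptions for illustration, not taken from this task.
MEDIAWIKI_DIR = "/srv/mediawiki"  # hypothetical install path
MAX_LOAD_PER_CORE = 0.75          # hypothetical headroom threshold
MAX_JOBS = 200                    # cap on jobs per batch
MAX_TIME_SECONDS = 240            # keep a batch shorter than the 5-minute interval


def should_run(load_1min: float, cores: int,
               max_load_per_core: float = MAX_LOAD_PER_CORE) -> bool:
    """Return True when the 1-minute load average leaves room for a batch."""
    return load_1min < cores * max_load_per_core


def build_command(mediawiki_dir: str = MEDIAWIKI_DIR,
                  max_jobs: int = MAX_JOBS,
                  max_time: int = MAX_TIME_SECONDS) -> List[str]:
    """Build a runJobs.php invocation capped by job count and wall-clock time."""
    script = os.path.join(mediawiki_dir, "maintenance", "runJobs.php")
    return ["php", script, "--maxjobs", str(max_jobs), "--maxtime", str(max_time)]


def run_if_idle() -> int:
    """Run one capped batch unless the host is already under heavy load."""
    load_1min, _, _ = os.getloadavg()
    cores = os.cpu_count() or 1
    if not should_run(load_1min, cores):
        print(f"Skipping batch: load {load_1min:.2f} on {cores} cores")
        return 0
    return subprocess.run(build_command(), check=False).returncode
```

Something like `run_if_idle()` could be called from a timer on the job runner host. On the web hosts this would normally be paired with `$wgJobRunRate = 0` so that page requests do not run jobs themselves, though whether that applies here depends on the existing config.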