My blog is getting quite stale, so I am going to post about a bug I fixed in BungeeCord recently regarding scheduled tasks, relating to shutdown issues in HubMagic and RedisBungee. But before we do that, let's take a quick look at the scheduler.
The BungeeCord scheduler
The BungeeCord scheduler is a simple task running API that runs tasks asynchronously and supports repeating tasks. To handle this, it uses a special class called
BungeeTask is the actual runnable used by BungeeCord.
After task execution is deemed to be complete, it cancels the task with the scheduler. This is where we find our bug.
When a task is cancelled via the ScheduledTask's
cancel() method, the
BungeeTask sets its running flag to false. So far, so good. It will be noticed and the task will be cancelled as usual. But in between this time and the task cancelling, the task may have been cancelled!
That's right, we have a condition where the task is a zombie and tries to cancel itself, only to find it was already cancelled. The actual error is in
BungeeScheduler, which tries to find the task by its ID but fails as the task was never found. No null check is performed on the removed task. Java sees a reference to a null object (to its
cancel() method) and throws a non-specific
I was able to fix this issue rather easily. First, I made
cancel() methods always map to an corresponding scheduler
cancel(taskId), which is only called once, then made
cancel(ScheduledTask) always call
cancel() on the provided task.
The code is slightly misleading.
cancel() calls the scheduler's
cancel(taskId) method, which removes the task and calls
cancel() on it. However, since the
running boolean is
false, we don't NPE.
- It is fairly clear that the cancellation of scheduled tasks was never tested. The relevant code had not been touched for almost a year. This is further attested to by
tasksByPluginissue that was reported by vemacs.
- Silly bugs like this are low-hanging fruit and are harming the BungeeCord ecosystem greatly. A extensive bugcheck over the code base should be initiated.