Skip to content

RecurringTask#enqueue does not report rescued Job::EnqueueError to Rails.error #746

@whitemerry

Description

@whitemerry

SolidQueue::RecurringTask#enqueue rescues Job::EnqueueError and forwards only error.message (a String) to subscribers of the enqueue_recurring_task.solid_queue notification. The exception object itself does not reach the payload, and the rescue does not call Rails.error.report.

Error reporters such as sentry-rails hook into job errors by prepending ActiveJob::Base#perform_now and capturing exceptions that escape from there — this is the public ActiveJob lifecycle they observe. Recurring task enqueueing runs in the scheduler thread before any job is enqueued and before perform_now is ever invoked, so it is outside that surface. With the rescue swallowing the wrapped Job::EnqueueError, the only signal that survives is the LogSubscriber log line.

This differs from the gem's default policy elsewhere: Engine#initializer "solid_queue.app_executor" sets on_thread_error to ->(exception) { Rails.error.report(exception, handled: false) }, and the README recommends the same pattern for custom error handlers.

Observed in production

During a brief MySQL primary failover the database accepted reads but rejected writes (--super-read-only). Each scheduler's INSERT into solid_queue_recurring_executions was rejected with Trilogy::ProtocolError: 1290, which Solid Queue wrapped in Job::EnqueueError and rescued. The notification fired with enqueue_error set to the error message string, LogSubscriber produced one log line per task, and the recurring ticks for that window were not enqueued. Sentry Crons reported the corresponding ticks as Missed; no Issue was created.

Location

solid_queue-1.4.0/app/models/solid_queue/recurring_task.rb:79-102:

def enqueue(at:)
  SolidQueue.instrument(:enqueue_recurring_task, task: key, at: at) do |payload|
    # ...
  rescue RecurringExecution::AlreadyRecorded
    payload[:skipped] = true
    false
  rescue Job::EnqueueError => error
    payload[:enqueue_error] = error.message   # exception object dropped
    false
  end
end

The other_adapter branch (lines 86-90) stores job.enqueue_error&.message in the same shape.

Reproduction

# frozen_string_literal: true

require "bundler/inline"

gemfile(true) do
  source "https://rubygems.org"

  gem "rails",       "~> 8.0"
  gem "solid_queue", "1.4.0", require: false
  gem "sqlite3"
end

require "rails"
require "active_record/railtie"
require "active_job/railtie"
require "solid_queue"
require "minitest/autorun"
require "logger"
require "tmpdir"

ENV["DATABASE_URL"] = "sqlite3::memory:"

class TestApp < Rails::Application
  config.root           = Dir.mktmpdir
  config.eager_load     = false
  config.logger         = Logger.new($stdout)
  config.secret_key_base = "x" * 32
  config.active_job.queue_adapter = :solid_queue
end

Rails.application.initialize!

# pool: 1 so `PRAGMA query_only` persists on the connection that does the INSERT.
ActiveRecord::Base.establish_connection(adapter: "sqlite3", database: ":memory:", pool: 1)

ActiveRecord::Schema.verbose = false
ActiveRecord::Schema.define do
  create_table "solid_queue_blocked_executions", force: :cascade do |t|
    t.bigint "job_id", null: false
    t.string "queue_name", null: false
    t.integer "priority", default: 0, null: false
    t.string "concurrency_key", null: false
    t.datetime "expires_at", null: false
    t.datetime "created_at", null: false
    t.index [ "concurrency_key", "priority", "job_id" ], name: "index_solid_queue_blocked_executions_for_release"
    t.index [ "expires_at", "concurrency_key" ], name: "index_solid_queue_blocked_executions_for_maintenance"
    t.index [ "job_id" ], name: "index_solid_queue_blocked_executions_on_job_id", unique: true
  end

  create_table "solid_queue_claimed_executions", force: :cascade do |t|
    t.bigint "job_id", null: false
    t.bigint "process_id"
    t.datetime "created_at", null: false
    t.index [ "job_id" ], name: "index_solid_queue_claimed_executions_on_job_id", unique: true
    t.index [ "process_id", "job_id" ], name: "index_solid_queue_claimed_executions_on_process_id_and_job_id"
  end

  create_table "solid_queue_failed_executions", force: :cascade do |t|
    t.bigint "job_id", null: false
    t.text "error"
    t.datetime "created_at", null: false
    t.index [ "job_id" ], name: "index_solid_queue_failed_executions_on_job_id", unique: true
  end

  create_table "solid_queue_jobs", force: :cascade do |t|
    t.string "queue_name", null: false
    t.string "class_name", null: false
    t.text "arguments"
    t.integer "priority", default: 0, null: false
    t.string "active_job_id"
    t.datetime "scheduled_at"
    t.datetime "finished_at"
    t.string "concurrency_key"
    t.datetime "created_at", null: false
    t.datetime "updated_at", null: false
    t.index [ "active_job_id" ], name: "index_solid_queue_jobs_on_active_job_id"
    t.index [ "class_name" ], name: "index_solid_queue_jobs_on_class_name"
    t.index [ "finished_at" ], name: "index_solid_queue_jobs_on_finished_at"
    t.index [ "queue_name", "finished_at" ], name: "index_solid_queue_jobs_for_filtering"
    t.index [ "scheduled_at", "finished_at" ], name: "index_solid_queue_jobs_for_alerting"
  end

  create_table "solid_queue_pauses", force: :cascade do |t|
    t.string "queue_name", null: false
    t.datetime "created_at", null: false
    t.index [ "queue_name" ], name: "index_solid_queue_pauses_on_queue_name", unique: true
  end

  create_table "solid_queue_processes", force: :cascade do |t|
    t.string "kind", null: false
    t.datetime "last_heartbeat_at", null: false
    t.bigint "supervisor_id"
    t.integer "pid", null: false
    t.string "hostname"
    t.text "metadata"
    t.datetime "created_at", null: false
    t.string "name", null: false
    t.index [ "last_heartbeat_at" ], name: "index_solid_queue_processes_on_last_heartbeat_at"
    t.index [ "name", "supervisor_id" ], name: "index_solid_queue_processes_on_name_and_supervisor_id", unique: true
    t.index [ "supervisor_id" ], name: "index_solid_queue_processes_on_supervisor_id"
  end

  create_table "solid_queue_ready_executions", force: :cascade do |t|
    t.bigint "job_id", null: false
    t.string "queue_name", null: false
    t.integer "priority", default: 0, null: false
    t.datetime "created_at", null: false
    t.index [ "job_id" ], name: "index_solid_queue_ready_executions_on_job_id", unique: true
    t.index [ "priority", "job_id" ], name: "index_solid_queue_poll_all"
    t.index [ "queue_name", "priority", "job_id" ], name: "index_solid_queue_poll_by_queue"
  end

  create_table "solid_queue_recurring_executions", force: :cascade do |t|
    t.bigint "job_id", null: false
    t.string "task_key", null: false
    t.datetime "run_at", null: false
    t.datetime "created_at", null: false
    t.index [ "job_id" ], name: "index_solid_queue_recurring_executions_on_job_id", unique: true
    t.index [ "task_key", "run_at" ], name: "index_solid_queue_recurring_executions_on_task_key_and_run_at", unique: true
  end

  create_table "solid_queue_recurring_tasks", force: :cascade do |t|
    t.string "key", null: false
    t.string "schedule", null: false
    t.string "command", limit: 2048
    t.string "class_name"
    t.text "arguments"
    t.string "queue_name"
    t.integer "priority", default: 0
    t.boolean "static", default: true, null: false
    t.text "description"
    t.datetime "created_at", null: false
    t.datetime "updated_at", null: false
    t.index [ "key" ], name: "index_solid_queue_recurring_tasks_on_key", unique: true
    t.index [ "static" ], name: "index_solid_queue_recurring_tasks_on_static"
  end

  create_table "solid_queue_scheduled_executions", force: :cascade do |t|
    t.bigint "job_id", null: false
    t.string "queue_name", null: false
    t.integer "priority", default: 0, null: false
    t.datetime "scheduled_at", null: false
    t.datetime "created_at", null: false
    t.index [ "job_id" ], name: "index_solid_queue_scheduled_executions_on_job_id", unique: true
    t.index [ "scheduled_at", "priority", "job_id" ], name: "index_solid_queue_dispatch_all"
  end

  create_table "solid_queue_semaphores", force: :cascade do |t|
    t.string "key", null: false
    t.integer "value", default: 1, null: false
    t.datetime "expires_at", null: false
    t.datetime "created_at", null: false
    t.datetime "updated_at", null: false
    t.index [ "expires_at" ], name: "index_solid_queue_semaphores_on_expires_at"
    t.index [ "key", "value" ], name: "index_solid_queue_semaphores_on_key_and_value"
    t.index [ "key" ], name: "index_solid_queue_semaphores_on_key", unique: true
  end

  add_foreign_key "solid_queue_blocked_executions", "solid_queue_jobs", column: "job_id", on_delete: :cascade
  add_foreign_key "solid_queue_claimed_executions", "solid_queue_jobs", column: "job_id", on_delete: :cascade
  add_foreign_key "solid_queue_failed_executions", "solid_queue_jobs", column: "job_id", on_delete: :cascade
  add_foreign_key "solid_queue_ready_executions", "solid_queue_jobs", column: "job_id", on_delete: :cascade
  add_foreign_key "solid_queue_recurring_executions", "solid_queue_jobs", column: "job_id", on_delete: :cascade
  add_foreign_key "solid_queue_scheduled_executions", "solid_queue_jobs", column: "job_id", on_delete: :cascade
end

class HelloJob < ActiveJob::Base
  def perform; end
end

# Stand in for a sentry-rails / Bugsnag / Honeybadger ErrorReporter subscriber.
class RecordingSubscriber
  attr_reader :reports

  def initialize
    @reports = []
  end

  def report(error, handled:, severity:, context:, source: nil)
    @reports << { class: error.class.name, message: error.message }
  end
end

class RecurringEnqueueErrorReportingTest < Minitest::Test
  def setup
    @recorder = RecordingSubscriber.new
    Rails.error.subscribe(@recorder)

    @payloads = []
    @subscription = ActiveSupport::Notifications.subscribe("enqueue_recurring_task.solid_queue") do |*args|
      @payloads << ActiveSupport::Notifications::Event.new(*args).payload.dup
    end

    @task = SolidQueue::RecurringTask.new(
      key:        "hello_task",
      class_name: "HelloJob",
      schedule:   "* * * * *",
    )
  end

  def teardown
    ActiveSupport::Notifications.unsubscribe(@subscription)
    Rails.error.unsubscribe(@recorder) if Rails.error.respond_to?(:unsubscribe)
    ActiveRecord::Base.connection.execute("PRAGMA query_only = OFF")
  rescue ActiveRecord::StatementInvalid
    # Connection may already be in a bad state — fine for a single-test script.
  end

  # Mirrors a MySQL primary flipping to `--super-read-only` during a failover:
  # the INSERT performed by RecurringTask#enqueue is rejected at the DB layer,
  # SolidQueue wraps the AR error in Job::EnqueueError, and rescues it.
  def test_recurring_enqueue_error_is_reported_to_rails_error
    ActiveRecord::Base.connection.execute("PRAGMA query_only = ON")

    @task.enqueue(at: Time.now.utc)

    # Sanity: the AS::Notifications event is delivered, but it only carries the
    # error message as a String — the exception object itself is dropped.
    assert_equal 1, @payloads.size
    payload = @payloads.first
    assert_kind_of String, payload[:enqueue_error]
    refute payload.values.any? { |v| v.is_a?(Exception) },
      "Expected the exception object on the payload, but it was not present"

    # The actual bug: error reporters subscribed via Rails.error receive nothing,
    # so Sentry / Bugsnag / Honeybadger silently miss recurring enqueue failures.
    assert_operator @recorder.reports.size, :>=, 1,
      "Expected Rails.error.report to be called at least once, was called " \
        "#{@recorder.reports.size} time(s). Recurring enqueue errors are invisible " \
        "to ActiveSupport::ErrorReporter subscribers."
  end
end

class DiscardingFailingJob < ActiveJob::Base
  discard_on StandardError, report: true

  def perform
    raise StandardError, "boom"
  end
end

Proposed direction

Route the rescued error through Rails.error.report while keeping the existing "do not bubble up" behaviour:

rescue Job::EnqueueError => error
  Rails.error.report(error, handled: true, source: "solid_queue", context: { task: key, at: at })
  payload[:enqueue_error] = error.message
  false
end

handled: true because the rescue intentionally swallows the error; subscribers decide alert severity. The same treatment applies to the other_adapter branch (lines 86-90), where job.enqueue_error is already available as an exception object.

This aligns the recurring-task path with the gem's default on_thread_error behaviour and with the pattern recommended in the README.

Workaround available today

Without changes in Solid Queue, a host application can subscribe to the notification and forward the message to its error reporter manually:

# config/initializers/solid_queue_error_reporting.rb
ActiveSupport::Notifications.subscribe("enqueue_recurring_task.solid_queue") do |*args|
  event = ActiveSupport::Notifications::Event.new(*args)
  next unless event.payload[:enqueue_error]

  Rails.error.report(
    SolidQueue::Job::EnqueueError.new(event.payload[:enqueue_error]),
    handled: true,
    source: "solid_queue",
    context: { task: event.payload[:task], at: event.payload[:at] },
  )
  # or
  # Sentry.capture_message(
  #   "SolidQueue failed to enqueue recurring task",
  #   level: :error,
  #   tags: { component: "solid_queue", task: event.payload[:task] },
  #   extra: {
  #     task: event.payload[:task],
  #     at: event.payload[:at],
  #     enqueue_error: event.payload[:enqueue_error],
  #   },
  # )
end

This makes the failure visible to ErrorReporter subscribers, but the exception is reconstructed from a String — it carries no backtrace, no original class, and no cause chain pointing at the underlying ActiveRecord::StatementInvalid (or whichever error triggered the wrap). Grouping in Sentry falls back to the message string, which fragments across error variants and across changes in upstream error formatting.

Surfacing the rescue from inside RecurringTask#enqueue, where the original exception object is still in scope, avoids those limitations.

Environment

  • solid_queue 1.4.0
  • Rails 8, Ruby 3.4.8
  • MySQL 8 (trilogy adapter); failover flipped the primary to --super-read-only for ~13s
  • sentry-ruby 6.5.0, sentry-rails 6.5.0

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions