Recovering missing Matomo archives after a cron failure

I recently discovered that my Matomo 5.5.0 analytics instance had stopped archiving data for nearly three weeks due to a cron job failure. Here’s how I recovered the data without crashing my low-memory VPS.

The Problem

Matomo’s archiving process converts raw log data into pre-calculated reports. When the cron job stops working, reports don’t load properly in the UI, and you’re left with gaps in your analytics data.

My initial approach was to run a bulk archive command for the entire date range:

php ./console core:archive --force-date-range=2025-09-11,2025-10-08 --force-idsites=[site-id]

This failed within minutes with the dreaded error:

SQLSTATE[HY000]: General error: 2006 MySQL server has gone away

The issue? Processing three weeks of data in a single operation exceeded MySQL’s connection timeout limits on my managed hosting environment.

The Solution: Two-Step Manual Process

After reviewing the Matomo documentation, I developed a reliable two-step approach for each date range:

Step 1: Invalidate Reports

php ./console core:invalidate-report-data --dates=2025-09-15,2025-09-30 --sites=[site-id]

This marks the specified date range as needing re-archiving without actually processing the data yet.

Step 2: Run the Archiver

php ./console core:archive --force-idsites=[site-id] --concurrent-requests-per-website=1 --disable-scheduled-tasks

The archiver picks up the invalidated dates and processes them. The key flags here:

  • --concurrent-requests-per-website=1 runs one archiving request at a time for the site, which keeps memory usage down
  • --disable-scheduled-tasks skips Matomo’s scheduled tasks, so the run does nothing but archiving
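
The two steps can be wrapped in a loop that walks a longer gap in small chunks. What follows is only a sketch, not something taken from the Matomo documentation: it assumes bash and GNU date, and the names date_chunks, recover_range, SITE_ID, CHUNK_DAYS and DRY_RUN are made up for illustration. By default it is a dry run that only prints the commands it would execute.

```shell
#!/usr/bin/env bash
# Sketch only: invalidate and archive a long gap in small chunks.
# Assumes GNU date. All names below are illustrative placeholders.
SITE_ID="${SITE_ID:-1}"        # replace with your own site ID
CHUNK_DAYS="${CHUNK_DAYS:-7}"  # days per chunk; lower it on small servers
DRY_RUN="${DRY_RUN:-1}"        # 1 = print commands only, 0 = execute them

# Print one "start,end" pair per line covering FROM..TO inclusive.
date_chunks() {
  local start="$1" to="$2" days="$3" end to_s
  to_s="$(date -u -d "$to" +%s)"
  while [ "$(date -u -d "$start" +%s)" -le "$to_s" ]; do
    end="$(date -u -d "$start + $((days - 1)) days" +%F)"
    # Clamp the last chunk so it never runs past TO.
    if [ "$(date -u -d "$end" +%s)" -gt "$to_s" ]; then end="$to"; fi
    echo "$start,$end"
    start="$(date -u -d "$end + 1 day" +%F)"
  done
}

# Echo the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Invalidate, then archive, one chunk at a time; stop at the first failure.
recover_range() {
  date_chunks "$1" "$2" "$CHUNK_DAYS" | while read -r chunk; do
    run php ./console core:invalidate-report-data \
      --dates="$chunk" --sites="$SITE_ID" </dev/null || exit 1
    run php ./console core:archive --force-idsites="$SITE_ID" \
      --concurrent-requests-per-website=1 --disable-scheduled-tasks \
      </dev/null || exit 1
  done
}

recover_range 2025-09-11 2025-10-08
```

Run it from the Matomo directory. Once the printed commands look right, set DRY_RUN=0 to execute them. Archiving after each chunk keeps every run short, which is the point on a server with tight timeouts.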

Handling Partial Failures

When processing large date ranges, the archiver sometimes stopped partway through. Rather than guessing which dates had completed, I checked the Matomo UI to identify gaps, then invalidated and archived only the remaining dates.

This iterative approach proved more reliable than trying to process everything at once.
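
One way to make that loop less manual is to re-run the archiver a bounded number of times before falling back to checking the UI. This is a hedged sketch rather than a tested recipe: the retry helper is hypothetical, and it assumes that core:archive exits with a non-zero status when a run fails and that dates still marked invalid are picked up again on the next run. Both are worth confirming on your own install.

```shell
#!/usr/bin/env bash
# retry MAX DELAY CMD...: run CMD until it exits 0, at most MAX times,
# sleeping DELAY seconds between attempts. Hypothetical helper.
retry() {
  local max="$1" delay="$2" attempt=1
  shift 2
  while :; do
    "$@" && return 0
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Example (left commented out; replace [site-id] with your own site ID):
# retry 3 60 php ./console core:archive --force-idsites=[site-id] \
#   --concurrent-requests-per-website=1 --disable-scheduled-tasks
```

If the helper gives up, the UI check described above is still the way to find out which dates are missing.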

Key Takeaways

What I learned:

The --force-date-range flag in core:archive doesn’t reliably override Matomo’s default behavior of checking today and yesterday. Instead, explicit invalidation ensures the archiver processes exactly the dates you need.

For managed hosting environments with limited resources, processing historical data in smaller chunks prevents timeout errors.

Preventing future issues:

I’ve since fixed the cron job configuration and verified that it runs successfully. Regularly checking the archiving status in Matomo’s System Check will help catch failures earlier.
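
As a lightweight complement to the System Check, a small script can warn when the archiver has not written to its log recently. This sketch assumes the cron job redirects the output of core:archive to a log file. The path, the 120-minute threshold, and the function name archive_log_is_stale are all hypothetical. It relies on the -mmin test, which GNU and BSD find both provide.

```shell
#!/usr/bin/env bash
# archive_log_is_stale FILE MINUTES
# Succeeds if FILE is missing or was last modified more than MINUTES ago.
archive_log_is_stale() {
  [ ! -e "$1" ] && return 0
  [ -n "$(find "$1" -mmin +"$2" 2>/dev/null)" ]
}

# Hypothetical log path; point this at wherever your cron job writes.
LOG="${LOG:-/var/log/matomo-archive.log}"
if archive_log_is_stale "$LOG" 120; then
  echo "WARNING: no archiver output in $LOG for over 120 minutes" >&2
fi
```

Run from a separate cron entry, a check like this would flag a silent archiver within hours rather than weeks.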
