Skip to content

feat: persist update schedule across restarts#332

Open
binnichtaktiv wants to merge 4 commits into
rtuszik:devfrom
binnichtaktiv:dev
Open

feat: persist update schedule across restarts#332
binnichtaktiv wants to merge 4 commits into
rtuszik:devfrom
binnichtaktiv:dev

Conversation

@binnichtaktiv

Copy link
Copy Markdown

I have set my update schedule to 3 months. I noticed that it never updated because I either restart my server or the docker container (to update photon for example)

This PR changes the logic and saves the last time it got updated and reads from that file when starting photon-docker. So even if your server or docker container was shut down for weeks it will check and directly update if the set interval has elapsed

I also extended the test because it failed when running it for the first time. All test now succeed. I followed your guideline

I'm not a pro so my code might not be perfect, pls let me know if I should correct anything

@coderabbitai

coderabbitai Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor
📝 Walkthrough

Walkthrough

src/process_manager.py gains a _get_last_update_run_path() helper and writes a JSON timestamp file after each successful update. schedule_updates() is rewritten to read that persisted timestamp, compute the interval from config.UPDATE_INTERVAL suffixes, and either fire run_update immediately or log the next scheduled time. Two tests are updated to monkeypatch config.DATA_DIR to tmp_path.

Changes

Persisted interval-aware scheduling

Layer / File(s) Summary
Imports, path helper, and timestamp write
src/process_manager.py
Adds json and datetime imports, introduces _get_last_update_run_path() to compute the timestamp file location, and writes {"ts": current_time} to that file after a successful update and Photon health verification.
schedule_updates() rewrite and updated tests
src/process_manager.py, tests/test_process_manager.py
Rewrites schedule_updates() to initialise the timestamp file on first run, parse the persisted JSON timestamp with early exit on failure, derive interval_seconds from config.UPDATE_INTERVAL (d/h/m suffixes, daily fallback), start run_update in a daemon thread if the interval has elapsed, and log the next run time otherwise. Tests monkeypatch config.DATA_DIR to tmp_path before invoking schedule_updates().

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: persist update schedule across restarts' directly and clearly summarises the main change—persisting the update schedule to survive application restarts.
Description check ✅ Passed The description is directly related to the changeset, explaining the motivation (update schedule resetting on restarts) and the solution (persisting and reading timestamps from a file).
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (1)
src/process_manager.py (1)

281-289: ⚡ Quick win

Eliminate redundant interval parsing.

Each branch parses int(interval[:-1]) twice: once for interval_seconds and once for schedule.every(). Extracting this to a variable improves readability and performance.

♻️ Proposed refactor
        if interval.endswith("d"):
-            interval_seconds = int(interval[:-1]) * 86400
-            schedule.every(int(interval[:-1])).days.do(self.run_update)
+            interval_value = int(interval[:-1])
+            interval_seconds = interval_value * 86400
+            schedule.every(interval_value).days.do(self.run_update)
        elif interval.endswith("h"):
-            interval_seconds = int(interval[:-1]) * 3600
-            schedule.every(int(interval[:-1])).hours.do(self.run_update)
+            interval_value = int(interval[:-1])
+            interval_seconds = interval_value * 3600
+            schedule.every(interval_value).hours.do(self.run_update)
        elif interval.endswith("m"):
-            interval_seconds = int(interval[:-1]) * 60
-            schedule.every(int(interval[:-1])).minutes.do(self.run_update)
+            interval_value = int(interval[:-1])
+            interval_seconds = interval_value * 60
+            schedule.every(interval_value).minutes.do(self.run_update)

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 869763bf-4208-4634-aaf9-700e97d844cd

📥 Commits

Reviewing files that changed from the base of the PR and between e8581cc and b06c560.

📒 Files selected for processing (2)
  • src/process_manager.py
  • tests/test_process_manager.py

Comment thread src/process_manager.py
Comment on lines +245 to +246
with open(_get_last_update_run_path(), "w") as f:
f.write(json.dumps({"ts": time.time()}))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Consider explicit error handling for timestamp persistence.

If the timestamp file write fails (due to permissions, disk space, etc.), the outer exception handler at line 251 will catch it and report "Update run failed unexpectedly", which may be misleading when the actual update succeeded. Consider wrapping the timestamp write in a try-except block with specific logging.

🛡️ Proposed fix to add explicit error handling
                index.drop_backup()
-                with open(_get_last_update_run_path(), "w") as f:
-                    f.write(json.dumps({"ts": time.time()}))
+                try:
+                    with open(_get_last_update_run_path(), "w") as f:
+                        f.write(json.dumps({"ts": time.time()}))
+                except Exception as e:
+                    logger.warning(f"Failed to persist update timestamp: {e}")
            else:
🧰 Tools
🪛 ast-grep (0.43.0)

[info] 245-245: use jsonify instead of json.dumps for JSON output
Context: json.dumps({"ts": time.time()})
Note: Security best practice.

(use-jsonify)

Comment thread src/process_manager.py
Comment on lines +266 to +277
if not os.path.exists(_get_last_update_run_path()):
with open(_get_last_update_run_path(), "w") as f:
f.write(json.dumps({"ts": time.time()}))
last_run = time.time()
logger.info("No last update timestamp found, treating as first run")
else:
try:
with open(_get_last_update_run_path()) as f:
last_run = json.loads(f.read())["ts"]
except Exception:
logger.error("Unable to read last update timestamp")
return

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add error handling for first-run timestamp write.

The timestamp file write on line 267 lacks error handling. If this write fails, the function continues with last_run = time.time() but without persistence, causing the timestamp to reset on the next restart.

🛡️ Proposed fix to handle write errors
        if not os.path.exists(_get_last_update_run_path()):
-            with open(_get_last_update_run_path(), "w") as f:
-                f.write(json.dumps({"ts": time.time()}))
-            last_run = time.time()
-            logger.info("No last update timestamp found, treating as first run")
+            try:
+                with open(_get_last_update_run_path(), "w") as f:
+                    f.write(json.dumps({"ts": time.time()}))
+                last_run = time.time()
+                logger.info("No last update timestamp found, treating as first run")
+            except Exception as e:
+                logger.error(f"Failed to initialise update timestamp: {e}")
+                return
        else:
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if not os.path.exists(_get_last_update_run_path()):
with open(_get_last_update_run_path(), "w") as f:
f.write(json.dumps({"ts": time.time()}))
last_run = time.time()
logger.info("No last update timestamp found, treating as first run")
else:
try:
with open(_get_last_update_run_path()) as f:
last_run = json.loads(f.read())["ts"]
except Exception:
logger.error("Unable to read last update timestamp")
return
if not os.path.exists(_get_last_update_run_path()):
try:
with open(_get_last_update_run_path(), "w") as f:
f.write(json.dumps({"ts": time.time()}))
last_run = time.time()
logger.info("No last update timestamp found, treating as first run")
except Exception as e:
logger.error(f"Failed to initialise update timestamp: {e}")
return
else:
try:
with open(_get_last_update_run_path()) as f:
last_run = json.loads(f.read())["ts"]
except Exception:
logger.error("Unable to read last update timestamp")
return
🧰 Tools
🪛 ast-grep (0.43.0)

[info] 267-267: use jsonify instead of json.dumps for JSON output
Context: json.dumps({"ts": time.time()})
Note: Security best practice.

(use-jsonify)

Comment thread src/process_manager.py
Comment on lines 281 to 293
if interval.endswith("d"):
days = int(interval[:-1])
schedule.every(days).days.do(self.run_update)
logger.info(f"Scheduling updates every {days} days")
interval_seconds = int(interval[:-1]) * 86400
schedule.every(int(interval[:-1])).days.do(self.run_update)
elif interval.endswith("h"):
hours = int(interval[:-1])
schedule.every(hours).hours.do(self.run_update)
logger.info(f"Scheduling updates every {hours} hours")
interval_seconds = int(interval[:-1]) * 3600
schedule.every(int(interval[:-1])).hours.do(self.run_update)
elif interval.endswith("m"):
minutes = int(interval[:-1])
schedule.every(minutes).minutes.do(self.run_update)
logger.info(f"Scheduling updates every {minutes} minutes")
interval_seconds = int(interval[:-1]) * 60
schedule.every(int(interval[:-1])).minutes.do(self.run_update)
else:
logger.warning(f"Invalid UPDATE_INTERVAL format: {interval}, defaulting to daily")
interval_seconds = 86400
schedule.every().day.do(self.run_update)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Validate interval parsing to prevent runtime errors.

The code calls int(interval[:-1]) without validation. If UPDATE_INTERVAL contains non-numeric characters (e.g., "30xd" or "abcd"), this will raise ValueError and crash the scheduler setup. Additionally, negative or zero values should be rejected.

🛡️ Proposed fix to validate interval parsing
        interval = config.UPDATE_INTERVAL.lower()

        if interval.endswith("d"):
-            interval_seconds = int(interval[:-1]) * 86400
-            schedule.every(int(interval[:-1])).days.do(self.run_update)
+            try:
+                interval_value = int(interval[:-1])
+                if interval_value <= 0:
+                    raise ValueError("Interval must be positive")
+                interval_seconds = interval_value * 86400
+                schedule.every(interval_value).days.do(self.run_update)
+            except ValueError as e:
+                logger.error(f"Invalid UPDATE_INTERVAL value: {interval} - {e}")
+                return
        elif interval.endswith("h"):
-            interval_seconds = int(interval[:-1]) * 3600
-            schedule.every(int(interval[:-1])).hours.do(self.run_update)
+            try:
+                interval_value = int(interval[:-1])
+                if interval_value <= 0:
+                    raise ValueError("Interval must be positive")
+                interval_seconds = interval_value * 3600
+                schedule.every(interval_value).hours.do(self.run_update)
+            except ValueError as e:
+                logger.error(f"Invalid UPDATE_INTERVAL value: {interval} - {e}")
+                return
        elif interval.endswith("m"):
-            interval_seconds = int(interval[:-1]) * 60
-            schedule.every(int(interval[:-1])).minutes.do(self.run_update)
+            try:
+                interval_value = int(interval[:-1])
+                if interval_value <= 0:
+                    raise ValueError("Interval must be positive")
+                interval_seconds = interval_value * 60
+                schedule.every(interval_value).minutes.do(self.run_update)
+            except ValueError as e:
+                logger.error(f"Invalid UPDATE_INTERVAL value: {interval} - {e}")
+                return
        else:
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if interval.endswith("d"):
days = int(interval[:-1])
schedule.every(days).days.do(self.run_update)
logger.info(f"Scheduling updates every {days} days")
interval_seconds = int(interval[:-1]) * 86400
schedule.every(int(interval[:-1])).days.do(self.run_update)
elif interval.endswith("h"):
hours = int(interval[:-1])
schedule.every(hours).hours.do(self.run_update)
logger.info(f"Scheduling updates every {hours} hours")
interval_seconds = int(interval[:-1]) * 3600
schedule.every(int(interval[:-1])).hours.do(self.run_update)
elif interval.endswith("m"):
minutes = int(interval[:-1])
schedule.every(minutes).minutes.do(self.run_update)
logger.info(f"Scheduling updates every {minutes} minutes")
interval_seconds = int(interval[:-1]) * 60
schedule.every(int(interval[:-1])).minutes.do(self.run_update)
else:
logger.warning(f"Invalid UPDATE_INTERVAL format: {interval}, defaulting to daily")
interval_seconds = 86400
schedule.every().day.do(self.run_update)
if interval.endswith("d"):
try:
interval_value = int(interval[:-1])
if interval_value <= 0:
raise ValueError("Interval must be positive")
interval_seconds = interval_value * 86400
schedule.every(interval_value).days.do(self.run_update)
except ValueError as e:
logger.error(f"Invalid UPDATE_INTERVAL value: {interval} - {e}")
return
elif interval.endswith("h"):
try:
interval_value = int(interval[:-1])
if interval_value <= 0:
raise ValueError("Interval must be positive")
interval_seconds = interval_value * 3600
schedule.every(interval_value).hours.do(self.run_update)
except ValueError as e:
logger.error(f"Invalid UPDATE_INTERVAL value: {interval} - {e}")
return
elif interval.endswith("m"):
try:
interval_value = int(interval[:-1])
if interval_value <= 0:
raise ValueError("Interval must be positive")
interval_seconds = interval_value * 60
schedule.every(interval_value).minutes.do(self.run_update)
except ValueError as e:
logger.error(f"Invalid UPDATE_INTERVAL value: {interval} - {e}")
return
else:
logger.warning(f"Invalid UPDATE_INTERVAL format: {interval}, defaulting to daily")
interval_seconds = 86400
schedule.every().day.do(self.run_update)

Comment thread src/process_manager.py
Comment on lines +295 to +297
if (time.time() - last_run) >= interval_seconds:
logger.info("Update interval elapsed since last run, running update now...")
threading.Thread(target=self.run_update, daemon=True).start()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Prevent concurrent update executions.

When an immediate update is triggered (line 297), there's no mechanism to prevent the scheduled jobs from also firing run_update() concurrently. If the immediate update is still running when the next scheduled interval arrives, both threads will execute run_update() simultaneously, potentially causing race conditions, duplicate work, or resource conflicts.

Consider adding a lock or checking self.state before starting an update.

🔒 Proposed fix to add update locking

Add an update lock as an instance variable in __init__:

def __init__(self):
    self.state = AppState.INITIALIZING
    self.photon_process = None
    self.should_exit = False
    self.update_lock = threading.Lock()
    
    signal.signal(signal.SIGTERM, self.handle_shutdown)
    signal.signal(signal.SIGINT, self.handle_shutdown)

Then guard run_update() at the entry point:

    def run_update(self):
+        if not self.update_lock.acquire(blocking=False):
+            logger.warning("Update already in progress, skipping concurrent run")
+            return
+        try:
            if config.UPDATE_STRATEGY == "DISABLED":
                logger.info("Updates disabled, skipping")
                return
            
            # ... rest of run_update logic ...
+        finally:
+            self.update_lock.release()

@binnichtaktiv

Copy link
Copy Markdown
Author

Damn I didn't see #303

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant