feat: persist update schedule across restarts#332
Conversation
📝 WalkthroughWalkthrough
ChangesPersisted interval-aware scheduling
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (1)
src/process_manager.py (1)
281-289: ⚡ Quick winEliminate redundant interval parsing.
Each branch parses
int(interval[:-1])twice: once forinterval_secondsand once forschedule.every(). Extracting this to a variable improves readability and performance.♻️ Proposed refactor
if interval.endswith("d"): - interval_seconds = int(interval[:-1]) * 86400 - schedule.every(int(interval[:-1])).days.do(self.run_update) + interval_value = int(interval[:-1]) + interval_seconds = interval_value * 86400 + schedule.every(interval_value).days.do(self.run_update) elif interval.endswith("h"): - interval_seconds = int(interval[:-1]) * 3600 - schedule.every(int(interval[:-1])).hours.do(self.run_update) + interval_value = int(interval[:-1]) + interval_seconds = interval_value * 3600 + schedule.every(interval_value).hours.do(self.run_update) elif interval.endswith("m"): - interval_seconds = int(interval[:-1]) * 60 - schedule.every(int(interval[:-1])).minutes.do(self.run_update) + interval_value = int(interval[:-1]) + interval_seconds = interval_value * 60 + schedule.every(interval_value).minutes.do(self.run_update)
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 869763bf-4208-4634-aaf9-700e97d844cd
📒 Files selected for processing (2)
src/process_manager.pytests/test_process_manager.py
| with open(_get_last_update_run_path(), "w") as f: | ||
| f.write(json.dumps({"ts": time.time()})) |
There was a problem hiding this comment.
Consider explicit error handling for timestamp persistence.
If the timestamp file write fails (due to permissions, disk space, etc.), the outer exception handler at line 251 will catch it and report "Update run failed unexpectedly", which may be misleading when the actual update succeeded. Consider wrapping the timestamp write in a try-except block with specific logging.
🛡️ Proposed fix to add explicit error handling
index.drop_backup()
- with open(_get_last_update_run_path(), "w") as f:
- f.write(json.dumps({"ts": time.time()}))
+ try:
+ with open(_get_last_update_run_path(), "w") as f:
+ f.write(json.dumps({"ts": time.time()}))
+ except Exception as e:
+ logger.warning(f"Failed to persist update timestamp: {e}")
else:🧰 Tools
🪛 ast-grep (0.43.0)
[info] 245-245: use jsonify instead of json.dumps for JSON output
Context: json.dumps({"ts": time.time()})
Note: Security best practice.
(use-jsonify)
| if not os.path.exists(_get_last_update_run_path()): | ||
| with open(_get_last_update_run_path(), "w") as f: | ||
| f.write(json.dumps({"ts": time.time()})) | ||
| last_run = time.time() | ||
| logger.info("No last update timestamp found, treating as first run") | ||
| else: | ||
| try: | ||
| with open(_get_last_update_run_path()) as f: | ||
| last_run = json.loads(f.read())["ts"] | ||
| except Exception: | ||
| logger.error("Unable to read last update timestamp") | ||
| return |
There was a problem hiding this comment.
Add error handling for first-run timestamp write.
The timestamp file write on line 267 lacks error handling. If this write fails, the function continues with last_run = time.time() but without persistence, causing the timestamp to reset on the next restart.
🛡️ Proposed fix to handle write errors
if not os.path.exists(_get_last_update_run_path()):
- with open(_get_last_update_run_path(), "w") as f:
- f.write(json.dumps({"ts": time.time()}))
- last_run = time.time()
- logger.info("No last update timestamp found, treating as first run")
+ try:
+ with open(_get_last_update_run_path(), "w") as f:
+ f.write(json.dumps({"ts": time.time()}))
+ last_run = time.time()
+ logger.info("No last update timestamp found, treating as first run")
+ except Exception as e:
+ logger.error(f"Failed to initialise update timestamp: {e}")
+ return
else:📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if not os.path.exists(_get_last_update_run_path()): | |
| with open(_get_last_update_run_path(), "w") as f: | |
| f.write(json.dumps({"ts": time.time()})) | |
| last_run = time.time() | |
| logger.info("No last update timestamp found, treating as first run") | |
| else: | |
| try: | |
| with open(_get_last_update_run_path()) as f: | |
| last_run = json.loads(f.read())["ts"] | |
| except Exception: | |
| logger.error("Unable to read last update timestamp") | |
| return | |
| if not os.path.exists(_get_last_update_run_path()): | |
| try: | |
| with open(_get_last_update_run_path(), "w") as f: | |
| f.write(json.dumps({"ts": time.time()})) | |
| last_run = time.time() | |
| logger.info("No last update timestamp found, treating as first run") | |
| except Exception as e: | |
| logger.error(f"Failed to initialise update timestamp: {e}") | |
| return | |
| else: | |
| try: | |
| with open(_get_last_update_run_path()) as f: | |
| last_run = json.loads(f.read())["ts"] | |
| except Exception: | |
| logger.error("Unable to read last update timestamp") | |
| return |
🧰 Tools
🪛 ast-grep (0.43.0)
[info] 267-267: use jsonify instead of json.dumps for JSON output
Context: json.dumps({"ts": time.time()})
Note: Security best practice.
(use-jsonify)
| if interval.endswith("d"): | ||
| days = int(interval[:-1]) | ||
| schedule.every(days).days.do(self.run_update) | ||
| logger.info(f"Scheduling updates every {days} days") | ||
| interval_seconds = int(interval[:-1]) * 86400 | ||
| schedule.every(int(interval[:-1])).days.do(self.run_update) | ||
| elif interval.endswith("h"): | ||
| hours = int(interval[:-1]) | ||
| schedule.every(hours).hours.do(self.run_update) | ||
| logger.info(f"Scheduling updates every {hours} hours") | ||
| interval_seconds = int(interval[:-1]) * 3600 | ||
| schedule.every(int(interval[:-1])).hours.do(self.run_update) | ||
| elif interval.endswith("m"): | ||
| minutes = int(interval[:-1]) | ||
| schedule.every(minutes).minutes.do(self.run_update) | ||
| logger.info(f"Scheduling updates every {minutes} minutes") | ||
| interval_seconds = int(interval[:-1]) * 60 | ||
| schedule.every(int(interval[:-1])).minutes.do(self.run_update) | ||
| else: | ||
| logger.warning(f"Invalid UPDATE_INTERVAL format: {interval}, defaulting to daily") | ||
| interval_seconds = 86400 | ||
| schedule.every().day.do(self.run_update) |
There was a problem hiding this comment.
Validate interval parsing to prevent runtime errors.
The code calls int(interval[:-1]) without validation. If UPDATE_INTERVAL contains non-numeric characters (e.g., "30xd" or "abcd"), this will raise ValueError and crash the scheduler setup. Additionally, negative or zero values should be rejected.
🛡️ Proposed fix to validate interval parsing
interval = config.UPDATE_INTERVAL.lower()
if interval.endswith("d"):
- interval_seconds = int(interval[:-1]) * 86400
- schedule.every(int(interval[:-1])).days.do(self.run_update)
+ try:
+ interval_value = int(interval[:-1])
+ if interval_value <= 0:
+ raise ValueError("Interval must be positive")
+ interval_seconds = interval_value * 86400
+ schedule.every(interval_value).days.do(self.run_update)
+ except ValueError as e:
+ logger.error(f"Invalid UPDATE_INTERVAL value: {interval} - {e}")
+ return
elif interval.endswith("h"):
- interval_seconds = int(interval[:-1]) * 3600
- schedule.every(int(interval[:-1])).hours.do(self.run_update)
+ try:
+ interval_value = int(interval[:-1])
+ if interval_value <= 0:
+ raise ValueError("Interval must be positive")
+ interval_seconds = interval_value * 3600
+ schedule.every(interval_value).hours.do(self.run_update)
+ except ValueError as e:
+ logger.error(f"Invalid UPDATE_INTERVAL value: {interval} - {e}")
+ return
elif interval.endswith("m"):
- interval_seconds = int(interval[:-1]) * 60
- schedule.every(int(interval[:-1])).minutes.do(self.run_update)
+ try:
+ interval_value = int(interval[:-1])
+ if interval_value <= 0:
+ raise ValueError("Interval must be positive")
+ interval_seconds = interval_value * 60
+ schedule.every(interval_value).minutes.do(self.run_update)
+ except ValueError as e:
+ logger.error(f"Invalid UPDATE_INTERVAL value: {interval} - {e}")
+ return
else:📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if interval.endswith("d"): | |
| days = int(interval[:-1]) | |
| schedule.every(days).days.do(self.run_update) | |
| logger.info(f"Scheduling updates every {days} days") | |
| interval_seconds = int(interval[:-1]) * 86400 | |
| schedule.every(int(interval[:-1])).days.do(self.run_update) | |
| elif interval.endswith("h"): | |
| hours = int(interval[:-1]) | |
| schedule.every(hours).hours.do(self.run_update) | |
| logger.info(f"Scheduling updates every {hours} hours") | |
| interval_seconds = int(interval[:-1]) * 3600 | |
| schedule.every(int(interval[:-1])).hours.do(self.run_update) | |
| elif interval.endswith("m"): | |
| minutes = int(interval[:-1]) | |
| schedule.every(minutes).minutes.do(self.run_update) | |
| logger.info(f"Scheduling updates every {minutes} minutes") | |
| interval_seconds = int(interval[:-1]) * 60 | |
| schedule.every(int(interval[:-1])).minutes.do(self.run_update) | |
| else: | |
| logger.warning(f"Invalid UPDATE_INTERVAL format: {interval}, defaulting to daily") | |
| interval_seconds = 86400 | |
| schedule.every().day.do(self.run_update) | |
| if interval.endswith("d"): | |
| try: | |
| interval_value = int(interval[:-1]) | |
| if interval_value <= 0: | |
| raise ValueError("Interval must be positive") | |
| interval_seconds = interval_value * 86400 | |
| schedule.every(interval_value).days.do(self.run_update) | |
| except ValueError as e: | |
| logger.error(f"Invalid UPDATE_INTERVAL value: {interval} - {e}") | |
| return | |
| elif interval.endswith("h"): | |
| try: | |
| interval_value = int(interval[:-1]) | |
| if interval_value <= 0: | |
| raise ValueError("Interval must be positive") | |
| interval_seconds = interval_value * 3600 | |
| schedule.every(interval_value).hours.do(self.run_update) | |
| except ValueError as e: | |
| logger.error(f"Invalid UPDATE_INTERVAL value: {interval} - {e}") | |
| return | |
| elif interval.endswith("m"): | |
| try: | |
| interval_value = int(interval[:-1]) | |
| if interval_value <= 0: | |
| raise ValueError("Interval must be positive") | |
| interval_seconds = interval_value * 60 | |
| schedule.every(interval_value).minutes.do(self.run_update) | |
| except ValueError as e: | |
| logger.error(f"Invalid UPDATE_INTERVAL value: {interval} - {e}") | |
| return | |
| else: | |
| logger.warning(f"Invalid UPDATE_INTERVAL format: {interval}, defaulting to daily") | |
| interval_seconds = 86400 | |
| schedule.every().day.do(self.run_update) |
| if (time.time() - last_run) >= interval_seconds: | ||
| logger.info("Update interval elapsed since last run, running update now...") | ||
| threading.Thread(target=self.run_update, daemon=True).start() |
There was a problem hiding this comment.
Prevent concurrent update executions.
When an immediate update is triggered (line 297), there's no mechanism to prevent the scheduled jobs from also firing run_update() concurrently. If the immediate update is still running when the next scheduled interval arrives, both threads will execute run_update() simultaneously, potentially causing race conditions, duplicate work, or resource conflicts.
Consider adding a lock or checking self.state before starting an update.
🔒 Proposed fix to add update locking
Add an update lock as an instance variable in __init__:
def __init__(self):
self.state = AppState.INITIALIZING
self.photon_process = None
self.should_exit = False
self.update_lock = threading.Lock()
signal.signal(signal.SIGTERM, self.handle_shutdown)
signal.signal(signal.SIGINT, self.handle_shutdown)Then guard run_update() at the entry point:
def run_update(self):
+ if not self.update_lock.acquire(blocking=False):
+ logger.warning("Update already in progress, skipping concurrent run")
+ return
+ try:
if config.UPDATE_STRATEGY == "DISABLED":
logger.info("Updates disabled, skipping")
return
# ... rest of run_update logic ...
+ finally:
+ self.update_lock.release()|
Damn I didn't see #303 |
I have set my update schedule to 3 months. I noticed that it never updated because I either restart my server or the docker container (to update photon for example)
This PR changes the logic and saves the last time it got updated and reads from that file when starting photon-docker. So even if your server or docker container was shut down for weeks it will check and directly update if the set interval has elapsed
I also extended the test because it failed when running it for the first time. All test now succeed. I followed your guideline
I'm not a pro so my code might not be perfect, pls let me know if I should correct anything