
Handle transient GraphQL responses without data in daily update #1

Merged
hoangsvit merged 1 commit into master from codex/fix-daily_update-job-failure on Apr 14, 2026
Conversation

@hoangsvit
Member

Motivation

  • The daily_update workflow was failing with repeated Error accessing data element messages when the GraphQL response lacked a top-level data node.
  • The change aims to make SearchUsers resilient to transient API responses and provide clearer logs for diagnosis.
  • Secondary rate-limit responses should back off longer to reduce repeated failures during scheduled runs.

Description

  • Updated SearchUsers in github/github.go to detect when the top-level data node is missing and retry instead of immediately exiting.
  • When data is missing, the code now extracts and logs the API message field (via strPropOrEmpty) to improve failure visibility.
  • Added adaptive backoff: if the API message contains "secondary rate limit", the retry wait is increased to 65 seconds (from the 10-second default) before retrying.
  • Kept existing retry counting logic and maximum retry behavior intact so persistent failures still abort after maxRetryCount.

Testing

  • Ran gofmt -w github/github.go to format the change successfully.
  • Ran go test ./... and all packages reported no test files and completed without errors.

Codex Task

@qodo-code-review

Review Summary by Qodo

Handle transient GraphQL responses without data node

🐞 Bug fix ✨ Enhancement


Walkthroughs

Description
• Handle missing GraphQL data node with retry logic instead of immediate failure
• Extract and log API error messages for improved diagnostics
• Implement adaptive backoff: 65 seconds for secondary rate limits, 10 seconds default
• Preserve existing retry counting and maximum retry behavior
Diagram
flowchart LR
  A["GraphQL Response"] --> B{"data node present?"}
  B -->|Yes| C["Process data"]
  B -->|No| D["Extract error message"]
  D --> E["Log error details"]
  E --> F{"Secondary rate limit?"}
  F -->|Yes| G["Wait 65 seconds"]
  F -->|No| H["Wait 10 seconds"]
  G --> I{"Retry count exceeded?"}
  H --> I
  I -->|No| J["Retry request"]
  I -->|Yes| K["Fatal exit"]


File Changes

1. github/github.go 🐞 Bug fix +11/-2

Add adaptive backoff and error logging for missing data

• Enhanced error handling when GraphQL response lacks top-level data node
• Extract and conditionally log API message field using strPropOrEmpty for better diagnostics
• Implement adaptive backoff duration: 65 seconds for secondary rate limit errors, 10 seconds default
• Maintain existing retry counting and maximum retry abort logic





qodo-code-review bot commented Apr 14, 2026

Code Review by Qodo

🐞 Bugs (1)   📘 Rule violations (0)   📎 Requirement gaps (0)
🐞 Reliability (1)



Remediation recommended

1. Backoff not applied on errors 🐞
Description
SearchUsers still retries rootNode["errors"] responses with a fixed 10s sleep and immediately
continues, so the new 65s “secondary rate limit” backoff will never apply on that path. This can
keep scheduled runs retrying too aggressively whenever the API provides the rate-limit signal via
the errors field rather than via a top-level message+missing data shape.
Code

github/github.go[R169-175]

+				if retryCount < maxRetryCount {
+					waitDuration := 10 * time.Second
+					if strings.Contains(strings.ToLower(errorMsg), "secondary rate limit") {
+						waitDuration = 65 * time.Second
+					}
+					time.Sleep(waitDuration)
					continue Pages
Evidence
The code has two distinct retry branches: (1) an earlier errors-present branch that always sleeps
10s and continue Pages, and (2) the new data-missing branch that computes an adaptive wait
duration (65s for "secondary rate limit"). Because the errors branch continues before reaching
the data-missing logic, the adaptive backoff is not used for any response that includes an
errors key.

github/github.go[150-158]
github/github.go[160-176]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`SearchUsers` implements adaptive backoff (65s) only when `rootNode["data"]` is missing/invalid. However, when `rootNode["errors"]` is present, the function sleeps a fixed 10s and `continue`s before the adaptive logic, so secondary-rate-limit retries can remain too aggressive.

## Issue Context
There are currently two retry paths:
- `if val, ok := rootNode["errors"]; ok { ... time.Sleep(10s) ... continue Pages }`
- `if !ok { ... waitDuration := 10s; if strings.Contains(..."secondary rate limit") waitDuration = 65s; time.Sleep(waitDuration); continue Pages }`

## Fix Focus Areas
- github/github.go[150-176]

## Suggested approach
1. Extract a helper like `computeBackoff(msg string) time.Duration` (or similar) and use it in *both* retry branches.
2. In the `errors` branch, derive a message string for backoff decisions:
  - If `errors` is a slice, pull `errors[0].(map[string]any)["message"].(string)` when available.
  - Fall back to existing logging when message extraction fails.
3. Use the computed duration in the `errors` branch instead of hardcoding `10 * time.Second`.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



@hoangsvit hoangsvit merged commit b0b45a3 into master Apr 14, 2026
1 check passed
@hoangsvit hoangsvit deleted the codex/fix-daily_update-job-failure branch April 14, 2026 08:19