RANGER-5654: Solr audit dispatcher fails to index after Kerberos TGT relogin (No key to store) with default useTicketCache=true #1030
Open
ramackri wants to merge 3 commits into
…elogin (No key to store) with default useTicketCache=true Co-authored-by: Cursor <cursoragent@cursor.com>
mneethiraj
reviewed
Jun 23, 2026
  <property>
    <name>xasecure.audit.jaas.Client.option.useTicketCache</name>
-   <value>true</value>
+   <value>false</value>
Contributor
Is there an alternative to disabling the ticket cache, such as having a thread refresh the Kerberos ticket before it expires? Is the ticket cache disabled in all Ranger modules, such as Ranger admin (to fetch audit logs from Solr) and plugins (to download policies/tags/roles/...)?
Fixes RANGER-5654: the Solr audit dispatcher stops indexing audits into Kerberos-protected Solr after TGT refresh/relogin when useTicketCache=true (the shipped default).

What changes were proposed in this pull request?

Set xasecure.audit.jaas.Client.option.useTicketCache=false in:
- audit-server/audit-dispatcher/dispatcher-solr/src/main/resources/conf/ranger-audit-dispatcher-solr-site.xml
- dev-support/ranger-docker/scripts/audit-dispatcher/ranger-audit-dispatcher-solr-site.xml

No Java changes. Config-only fix.
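
For orientation, here is a rough Java sketch of the JAAS entry this setting is meant to produce. It assumes the xasecure.audit.jaas.Client.option.* properties are passed through to Krb5LoginModule as login options; that wiring is not shown in this PR. The class name, principal, and keytab path are hypothetical, and storeKey=true is an assumption inferred from the "No key to store" message rather than something stated in the config diff.

```java
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.AppConfigurationEntry.LoginModuleControlFlag;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not Ranger code.
public class KeytabJaasSketch {

    // Builds a JAAS entry for a keytab-based daemon login.
    static AppConfigurationEntry keytabEntry(String principal, String keyTabPath) {
        Map<String, String> options = new HashMap<>();
        options.put("useKeyTab", "true");        // authenticate from the keytab
        options.put("useTicketCache", "false");  // the value this PR sets
        options.put("keyTab", keyTabPath);
        options.put("principal", principal);
        options.put("storeKey", "true");         // assumption, see lead-in
        return new AppConfigurationEntry(
                "com.sun.security.auth.module.Krb5LoginModule",
                LoginModuleControlFlag.REQUIRED,
                options);
    }

    public static void main(String[] args) {
        // Placeholder principal and keytab path, not real deployment values.
        AppConfigurationEntry entry =
                keytabEntry("svc/host.example.com@EXAMPLE.COM", "/path/to/svc.keytab");
        entry.getOptions().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

With useTicketCache=false, a login built from an entry like this has no ticket cache to consult, so every login attempt has to go through the keytab.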
Problem

The Solr dispatcher consumes from Kafka but eventually stops writing to Solr when Kerberos is enabled. Logs show "Failure in sending audits into Solr" and "No key to store". Kafka offsets advance; Solr document counts stay flat until dispatcher restart.

Root cause (corrected causal chain):
1. SolrAuditDestination uses KerberosJAASConfigUser + KerberosAction.
2. KerberosAction.execute() → checkTGTAndRelogin().
3. At about 80% of the TGT lifetime (TICKET_RENEW_WINDOW = 0.80 in AbstractKerberosUser), that method does logout(); login(). This is by design in KerberosAction, not caused by useKeyTab=true.
4. The shipped config sets useKeyTab=true (correct for a keytab daemon) and useTicketCache=true (incorrect here).
5. After logout(), login() with useTicketCache=true makes Krb5LoginModule use the ticket cache; with no valid cache entry → "No key to store" → dispatcher stuck until restart.

In short: logout(); login() is triggered by KerberosAction / checkTGTAndRelogin(); useKeyTab=true is not the cause; useTicketCache=true is.

Fix: useTicketCache=false, so step 3 still happens at ~80% TGT, but step 5 succeeds by reading the keytab again (same pattern as ingestor Kafka JAAS and Kafka plugin).

Why useTicketCache=false?

useTicketCache=true is appropriate when a process relies on an existing user ticket cache (kinit / KRB5CCNAME). For long-running services that authenticate from a keytab, Ranger and Hadoop convention is useTicketCache=false so that relogin always uses the keytab, the same pattern already used elsewhere in the audit stack.

[Comparison table: useTicketCache setting and relogin mechanism per audit component. Recoverable cells: (… AuditServerConstants); checkTGTAndReloginFromKeytab() (two rows); (… isInitiator=false in unused acceptor JAAS helper); KerberosAction relogin; KerberosAction (logout(); login() at ~80% TGT).]

The Solr dispatcher was the outlier: the only long-running audit daemon using outbound JAAS client login with proactive logout()/login() while useTicketCache=true. HDFS dispatcher and plugin→ingestor paths avoid this by using UGI checkTGTAndReloginFromKeytab() instead of JAAS logout()/login().

How was this patch tested?
Manual testing (Docker Tier 3 audit stack with Kerberos)

Environment: dev-support/ranger-docker Tier 3 compose (Admin, KDC, Kafka, ingestor, Solr dispatcher, HDFS/Ozone/Hive plugins).

Reproduce (master behavior):
1. Generate audit events (hdfs dfs -ls /, Ozone volume create).
2. Dispatcher logs show "No key to store" and "Failure in sending audits into Solr".
3. Solr reqUser:* count stays flat; Kafka offset continues to grow.

Verify (with useTicketCache=false + Solr dispatcher restart):
1. Dispatcher log shows a successful relogin (Successful login for rangerauditserver/...).
2. accessAudit totalCount increases.
3. End-to-end path: … :7081 → Kafka ranger_audits → Solr dispatcher → Solr ranger_audits → Admin Access Audit UI.

Upgrade note
Existing deployments that already have useTicketCache=true in their live ranger-audit-dispatcher-solr-site.xml must set it to false and restart the Solr dispatcher (or redeploy from the updated tarball).
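
As a convenience for the upgrade check, a small standalone Java sketch like the following could report the current value in a live site file. It is not part of this PR; the class and method names are hypothetical, and it assumes the file uses the usual configuration/property/name/value layout. It returns the first matching property, which may differ from how the dispatcher resolves duplicate definitions.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

import javax.xml.parsers.DocumentBuilderFactory;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper; not Ranger code.
public class TicketCacheCheck {

    static final String PROP = "xasecure.audit.jaas.Client.option.useTicketCache";

    // Returns the value of the first <property> whose <name> matches, if any.
    static Optional<String> readProperty(InputStream xml, String name) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            NodeList names = p.getElementsByTagName("name");
            NodeList values = p.getElementsByTagName("value");
            if (names.getLength() > 0 && values.getLength() > 0
                    && name.equals(names.item(0).getTextContent().trim())) {
                return Optional.of(values.item(0).getTextContent().trim());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: TicketCacheCheck <path-to-site.xml>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            Optional<String> value = readProperty(in, PROP);
            if (value.isPresent() && "true".equalsIgnoreCase(value.get())) {
                System.out.println(PROP + "=true: set it to false and restart the Solr dispatcher");
            } else {
                System.out.println(PROP + "=" + value.orElse("(not set)"));
            }
        }
    }
}
```

Run it with the path to the deployed ranger-audit-dispatcher-solr-site.xml as the only argument.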