Skip to content

SIGHUP does not close all FDs to the old log file #8303

@whitslack

Description

@whitslack

#1847 implemented log rotation only for the main lightningd daemon but ignored subdaemons. While ostensibly subdaemons are supposed to write their log messages through their sockets to the main daemon, nevertheless they do inherit the file descriptor to the log file from the main daemon, and the main daemon does not signal them to close that file descriptor upon receipt of SIGHUP. Due to the lingering file descriptors, the old log file continues consuming disk space even after the last directory entry referencing it is unlinked. (Of course, it can still be truncated manually to reclaim the disk space.)

To demonstrate this, I did the following:

  1. Configure CLN to log to /var/log/lightningd.log and start it.
  2. Create an empty file at /var/log/lightningd/2025-05-20.debug.log, writable by lightningd.
  3. Run mv --exchange /var/log/lightningd.log /var/log/lightningd/2025-05-20.debug.log to atomically swap the active log file and the rotated log file.
  4. Run killall -HUP lightningd to tell lightningd to close and reopen its log file.
  5. Now the lightningd process has the new log file open for writing (and indeed is writing to it):
    # ls -l "/proc/$(pidof lightningd)/fd/1"
    l-wx------ 1 lightning lightning 64 21-May-2025 15:13 /proc/23166/fd/1 -> /var/log/lightningd.log
    
  6. However, the old log file is still open by many subdaemons:
    # fuser /var/log/lightningd/2025-05-20.debug.log
    /var/log/lightningd/2025-05-20.debug.log:   549   550   551   553   937  1420  2028  2264  2963  2990  2994  2995  3016  3018  3019  3578  3762  3781  4461  4887  5008  5502  5592  5629  5796  5853  6296  6479  7637  7998  8096  8148  8166  8232  8267  8532  8687  9166  9238  9239  9296  9544 10799 10967 10968 11465 11848 12015 12587 12701 12702 12707 12709 12728 13120 13197 13369 13639 13860 13958 13959 14086 14089 14441 14629 15677 16601 16919 16920 17285 17293 17516 18105 18239 18325 18919 19022 19050 19683 19977 20763 21276 21467 21489 21670 21934 22262 22360 22537 22558 22677 22869 23087 23091 23096 23097 23099 23100 23102 23103 23105 23107 23149 23232 23234 23323 23324 23327 23328 23392 23588 23701 23812 24112 24177 24603 24813 24957 25009 25314 25512 25536 25877 26053 26169 26351 26373 26603 26845 27085 27299 27745 27821 27996 28105 28713 28844 28845 28846 28848 28850 28852 28853 28854 28855 28856 28857 28867 28876 28877 28878 28879 28880 28882 28883 28885 28886 28888 28889 28893 28895 28903 28934 28951 29460 30286 30793 30961 31199 31268 31302 31333 31441 31490 31942 31943 31944 31946 31948 31949 31950 31951 32763
    
    # ls -l /proc/549/fd/1
    l-wx------ 1 lightning lightning 64 21-May-2025 15:16 /proc/549/fd/1 -> /var/log/lightningd/2025-05-20.debug.log
    

Given that the subdaemons really shouldn't be writing directly to the log file, I am suspicious about whether they ought to be inheriting a file descriptor to the log file. I would suggest that lightningd should redirect stdout (FD 1) to /dev/null whenever it is launching a subprocess (subdaemon or plugin).

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions