MINIFICPP-2849 Implement LMDB based content repository #2201
size_t LmdbStream::write(const uint8_t* value, size_t size) {
  if (!write_enable_) { return STREAM_ERROR; }
  if (size != 0 && IsNullOrEmpty(value)) { return STREAM_ERROR; }
  value_.append(reinterpret_cast<const char*>(value), size);
LMDB has no append operation for values comparable to RocksDB's Merge function. Instead of rereading the original value, appending to it, and writing the new value back, all writes are buffered until the stream is closed; the actual write and commit happen at that point. Currently all content repository streams are used either write-only or read-only, so there should be no use case that mixes reads and writes. Enforcing this belongs in a separate PR that changes the content repository interface to use separate OutputStream and InputStream types, which would also result in separate LmdbInputStream and LmdbOutputStream types (and the same for RocksDB).
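The buffering described above can be illustrated with a minimal sketch. It is not the code from this PR: `FakeStore`, `BufferedWriteStream`, and `kError` are hypothetical names, and a `std::map` stands in for the LMDB environment so the example stays self-contained. The sketch assumes that closing the stream replaces any value previously stored under the key.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the key-value store. A real implementation
// would put the value inside an LMDB write transaction and commit it.
using FakeStore = std::map<std::string, std::string>;

class BufferedWriteStream {
 public:
  // Hypothetical error value, playing the role of STREAM_ERROR.
  static constexpr std::size_t kError = static_cast<std::size_t>(-1);

  BufferedWriteStream(FakeStore& store, std::string key)
      : store_(store), key_(std::move(key)) {}

  // Appends to the in-memory buffer only; nothing reaches the store yet.
  std::size_t write(const std::uint8_t* data, std::size_t size) {
    if (closed_) { return kError; }
    if (size != 0 && data == nullptr) { return kError; }
    buffer_.append(reinterpret_cast<const char*>(data), size);
    return size;
  }

  // Performs the single store write for everything buffered so far.
  // Assumption of this sketch: the buffered value replaces any old value.
  void close() {
    if (closed_) { return; }
    store_[key_] = buffer_;
    closed_ = true;
  }

 private:
  FakeStore& store_;
  std::string key_;
  std::string buffer_;
  bool closed_ = false;
};
```

With this shape, several `write` calls cost one store write at `close()`, at the price of holding the whole value in memory until then.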
if (WIN32)
  get_directory_property(MINIFI_SAVED_COMPILE_DEFS COMPILE_DEFINITIONS)
  remove_definitions(-DWIN32_LEAN_AND_MEAN)
Can you explain why you did this?
LMDB fails to compile on Windows when WIN32_LEAN_AND_MEAN is defined. That definition is added to the compile definitions in CMakeLists.txt automatically, so it applies to every third-party library pulled in with FetchContent and has to be removed separately for LMDB on Windows.
I'd like to make absolutely sure that we only undefine WIN32_LEAN_AND_MEAN while compiling LMDB, or find another workaround. If you look up what it does, it prevents windows.h from including a bunch of additional headers that usually end up unused and may conflict with other headers, while also increasing compile times. Alternatively, we could look up the transitively included headers that LMDB relies on and include them explicitly with target_compile_options(target PRIVATE -include foo.h).
Ideally Microsoft would've made the lightweight header the default, with the option to opt in to additional features, but for historical reasons, the heavyweight header is the default, and you can opt out of the extra features. (bloat)
As far as I could tell when I checked, this change only affects the lmdb subdirectory, so it should not affect any other compilation. I can still check whether there is another workaround.
There were 2 issues:
- Defining WIN32_LEAN_AND_MEAN removed `winternl.h`, which is needed for `NTSTATUS` usage, so it had to be included explicitly.
- After including `winternl.h`, the function pointer names in `mdb.c` clashed with the Windows symbols, so they had to be renamed.
Added patch in: 52d93b2
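The second issue can be shown with a small portable sketch. It uses hypothetical names throughout (`SystemClose`, `CloseFunc`, `demo_SystemClose`) rather than the real LMDB or Windows identifiers, and it is not the content of the patch. It only shows why a file-scope function pointer cannot share a name with a function that an included header already declares.

```cpp
#include <cassert>

// Stand-in for a function declared by a system header
// (hypothetical name; plays the role of a symbol from winternl.h).
int SystemClose(int handle) { return handle >= 0 ? 0 : -1; }

// Signature of the function the library wants to call through a pointer.
using CloseFunc = int(int);

// Clashing form: once the declaration above is visible, a file-scope
// pointer with the same name no longer compiles:
//   static CloseFunc* SystemClose;  // error: redeclared as a different kind of entity
//
// Renamed form: a library-specific prefix (hypothetical) avoids the clash.
static CloseFunc* demo_SystemClose = nullptr;

// Binds the renamed pointer on first use and calls through it.
int close_via_pointer(int handle) {
  if (demo_SystemClose == nullptr) {
    demo_SystemClose = &SystemClose;
  }
  return demo_SystemClose(handle);
}
```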
Pull request overview
Implements an LMDB-backed Content Repository extension for MiNiFi C++ (as an alternative to the RocksDB-based DatabaseContentRepository), including build integration, configuration plumbing, and dedicated unit tests.
Changes:
- Adds a new `minifi-lmdb` extension implementing `LmdbContentRepository` and `LmdbStream`, plus LMDB-focused unit tests.
- Wires LMDB into configuration (new `nifi.content.repository.lmdb.max.db.size` property) and improves user-facing error reporting/docs.
- Introduces a FetchContent-based LMDB third-party build with patching, and enables LMDB in CI/default extensions.
Reviewed changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| thirdparty/lmdb/fix-windows-symbols.patch | Patches upstream LMDB to avoid Windows NT API symbol collisions. |
| thirdparty/lmdb/add-cmake-file.patch | Adds a CMakeLists.txt to LMDB upstream sources for cross-platform builds. |
| minifi-api/include/minifi-cpp/properties/Configuration.h | Adds config key for LMDB max DB size. |
| libminifi/test/libtest/unit/TestBase.h | Adds virtual destructor to TestController for safer polymorphic use. |
| libminifi/src/core/RepositoryFactory.cpp | Adds clearer error log when LMDB extension is missing. |
| libminifi/src/Configuration.cpp | Registers LMDB max DB size property validator. |
| extensions/rocksdb-repos/tests/ContentSessionTests.cpp | Renames member and fixes destructor override for test controller. |
| extensions/lmdb/tests/LmdbStreamTests.cpp | Adds unit tests for LmdbStream read/write/commit behavior. |
| extensions/lmdb/tests/LmdbContentSessionTests.cpp | Adds session semantics tests for LMDB content repository. |
| extensions/lmdb/tests/LmdbContentRepositoryTests.cpp | Adds repository init/exists/read/remove/orphan tests for LMDB. |
| extensions/lmdb/tests/CMakeLists.txt | Adds CMake rules to build/register LMDB unit tests. |
| extensions/lmdb/LmdbStream.h | Introduces LMDB-backed stream abstraction. |
| extensions/lmdb/LmdbStream.cpp | Implements LMDB stream read/write/commit logic. |
| extensions/lmdb/LmdbContentRepository.h | Declares LMDB-backed ContentRepository implementation and session type. |
| extensions/lmdb/LmdbContentRepository.cpp | Implements LMDB-backed content repository lifecycle, GC/orphaning, stats. |
| extensions/lmdb/CMakeLists.txt | Adds extension build and registration for minifi-lmdb. |
| CONFIGURE.md | Documents LMDB content repository option and caveats. |
| conf/minifi.properties.in | Adds commented LMDB max DB size property template. |
| CMakeLists.txt | Adds minifi-lmdb to default enabled extensions list. |
| cmake/MiNiFiOptions.cmake | Adds ENABLE_LMDB option. |
| cmake/LMDB.cmake | FetchContent integration for building LMDB and applying local patches. |
| .github/workflows/ci.yml | Enables LMDB in CI build matrices. |
https://issues.apache.org/jira/browse/MINIFICPP-2849
Thank you for submitting a contribution to Apache NiFi - MiNiFi C++.
In order to streamline the review of the contribution we ask you to ensure the following steps have been taken:
For all changes:
Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
Does your PR title start with MINIFICPP-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
Has your PR been rebased against the latest commit within the target branch (typically main)?
Is your initial contribution a single, squashed commit?
For code changes:
For documentation related changes:
Note:
Please ensure that once the PR is submitted, you check GitHub Actions CI results for build issues and submit an update to your PR as soon as possible.