PM-33527 Database SQL scripts #7616
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7616      +/-   ##
==========================================
+ Coverage   59.84%   64.22%   +4.38%
==========================================
  Files        2121     2123       +2
  Lines       93464    93517      +53
  Branches     8292     8292
==========================================
+ Hits        55929    60063    +4134
+ Misses      35548    31385    -4163
- Partials     1987     2069      +82
@claude Please add a review
Claude finished @Banrion's task in 2m 56s
Code Review: PM-33527 Database SQL scripts
Thanks for the GDPR cleanup scaffolding. The DDL/sproc/Dapper repo look coherent on their own, but there are gaps before this is wireable into the worker. Findings below are grouped by severity.

🔴 Blocking

1. Missing EF Core implementation (dual-ORM strategy)
See the precedent set by

2. DI registration missing
Without this,

3.
SELECT TOP 1 * FROM [dbo].[OrganizationEventCleanup]
WHERE [CompletedAt] IS NULL
ORDER BY [QueuedAt] ASC

If more than one worker instance polls (which is the normal case in App Services / k8s), every worker will pick the same row. Recommended fix: atomically claim and return the row in one statement.

-- UPDATE does not accept ORDER BY, so the ordered TOP (1) lives in a CTE
-- and the UPDATE targets the CTE. Inside a sproc, terminate the previous
-- statement with a semicolon before WITH.
WITH c AS (
    SELECT TOP (1) *
    FROM [dbo].[OrganizationEventCleanup] WITH (UPDLOCK, READPAST, ROWLOCK)
    WHERE [CompletedAt] IS NULL
      AND [StartedAt] IS NULL
    ORDER BY [QueuedAt] ASC
)
UPDATE c
SET [StartedAt] = SYSUTCDATETIME(),
    [LastProgressAt] = SYSUTCDATETIME()
OUTPUT inserted.*;

If you do want a separate "peek" + "start", at minimum use

🟠 Important

4. No max-attempts / terminal failure handling
5. No retry backoff
6. Missing tests
7.
CREATE NONCLUSTERED INDEX [IX_Event_OrganizationId]
ON [dbo].[Event]([OrganizationId] ASC)
WHERE [OrganizationId] IS NOT NULL
WITH (ONLINE = ON, MAXDOP = 4);

Also: the existing

🟡 Minor / Style

8.
9. Filtered index would be cheaper
CREATE NONCLUSTERED INDEX [IX_OrganizationEventCleanup_Pending]
ON [dbo].[OrganizationEventCleanup]([QueuedAt] ASC)
WHERE [CompletedAt] IS NULL;
10.
11.
12.
13. GDPR retention question
14. SonarCloud flagged 2 new issues on the Sonar dashboard. Please review.

Summary

The shape of the queue table is reasonable, but as written this PR can't ship: there is no EF Core implementation for non-MSSQL providers, the repository isn't registered with DI, and
🤖 Bitwarden Claude Code Review
Overall Assessment: REQUEST CHANGES
This PR introduces the SQL Server schema, stored procedures, Dapper repository, and entity for an
Code Review Details
Additional one-line observations (not posted inline):
    Task IncrementProgressAsync(Guid id, long delta);
    Task RecordErrorAsync(Guid id, string message);
    Task MarkCompletedAsync(Guid id);
}
Details and fix
Bitwarden's data access layer requires both Dapper (MSSQL) and EF Core (PostgreSQL/MySQL/SQLite) implementations for any new repository interface unless the feature is explicitly EF-only. Only the Dapper implementation exists in this PR. Self-hosted instances on Postgres/MySQL/SQLite cannot resolve IOrganizationEventCleanupRepository once a consumer is wired in.
Required additions:
- src/Infrastructure.EntityFramework/Dirt/Models/OrganizationEventCleanup.cs (entity)
- src/Infrastructure.EntityFramework/Dirt/Configurations/OrganizationEventCleanupEntityTypeConfiguration.cs
- src/Infrastructure.EntityFramework/Dirt/Repositories/OrganizationEventCleanupRepository.cs
- DatabaseContext.OrganizationEventCleanups DbSet
- EF migrations under util/Migrator/PostgresMigrations, MySqlMigrations, SqliteMigrations
Reference: Database migrations (EF)
{
    public OrganizationEventCleanupRepository(GlobalSettings globalSettings)
        : base(globalSettings.SqlServer.ConnectionString, globalSettings.SqlServer.ReadOnlyConnectionString)
    { }
Details and fix
OrganizationEventCleanupRepository is not added to DapperServiceCollectionExtensions.AddDapperRepositories (or the EF equivalent). Any future consumer that injects IOrganizationEventCleanupRepository will fail at runtime with an unresolved-dependency error.
Add in src/Infrastructure.Dapper/DapperServiceCollectionExtensions.cs:
services.AddSingleton<IOrganizationEventCleanupRepository, OrganizationEventCleanupRepository>();
And the equivalent registration in EntityFrameworkServiceCollectionExtensions.AddPasswordManagerEFRepositories once the EF implementation lands.
WHERE
    [CompletedAt] IS NULL
ORDER BY
    [QueuedAt] ASC
ReadNextPending does not atomically claim work — concurrent workers will dequeue the same row.
Details and fix
The procedure returns the oldest non-completed row but does not mark it claimed. Two workers polling at the same time will each receive the same OrganizationEventCleanup row, both will call MarkStarted, and both will issue DELETE against the same organization's events in parallel — wasteful work, contention on the Event table, and inflated Attempts/EventsDeletedCount accounting.
The presence of StartedAt, Attempts, and LastError columns implies multi-worker retry semantics. Without atomic claim, those guarantees are unenforceable.
Suggested approach using READPAST + UPDLOCK + OUTPUT:
UPDATE TOP (1) [dbo].[OrganizationEventCleanup] WITH (READPAST, UPDLOCK, ROWLOCK)
SET [StartedAt] = COALESCE([StartedAt], SYSUTCDATETIME()),
    [LastProgressAt] = SYSUTCDATETIME()
OUTPUT INSERTED.*
WHERE [Id] = (
    SELECT TOP 1 [Id]
    FROM [dbo].[OrganizationEventCleanup] WITH (READPAST, UPDLOCK, ROWLOCK)
    WHERE [CompletedAt] IS NULL
      AND ([StartedAt] IS NULL OR [LastProgressAt] < DATEADD(MINUTE, -<lease>, SYSUTCDATETIME()))
    ORDER BY [QueuedAt] ASC
);
This also lets you reclaim stale leases when a worker dies mid-run. If a single-worker model is intended instead, that constraint should be documented and enforced at the consumer layer.
IF NOT EXISTS(SELECT name FROM sys.indexes WHERE name = 'IX_Event_OrganizationId' AND object_id = OBJECT_ID('[dbo].[Event]'))
BEGIN
    CREATE NONCLUSTERED INDEX [IX_Event_OrganizationId]
        ON [dbo].[Event]([OrganizationId] ASC)
        WHERE [OrganizationId] IS NOT NULL;
END
Event without ONLINE = ON will block writes on a very large table during deploy.
Details and fix
dbo.Event is one of the highest-volume tables in the system. Creating a non-clustered index without WITH (ONLINE = ON) holds a shared (S) table lock for the duration of the index build, which blocks all writes to the event stream while the migration runs. On Bitwarden Cloud this is the difference between a transparent deploy and an event-logging outage.
Azure SQL and SQL Server Enterprise both support online index builds; recommended:
CREATE NONCLUSTERED INDEX [IX_Event_OrganizationId]
    ON [dbo].[Event]([OrganizationId] ASC)
    WHERE [OrganizationId] IS NOT NULL
    WITH (ONLINE = ON, MAXDOP = 0);
Note: Self-hosted SQL Server Standard does not support ONLINE = ON. If self-hosted compatibility matters, gate the option behind an edition check or document the maintenance window expectation. The migration could also be split out of this PR so the index build runs separately from the table/sproc additions.
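One way the edition check could look, as a sketch rather than a tested migration. It assumes that SERVERPROPERTY('EngineEdition') values 3 (Enterprise, Developer, Evaluation), 5 (Azure SQL Database), and 8 (Azure SQL Managed Instance) are the engines that accept online index builds, and it uses dynamic SQL so the same script also runs on editions that reject ONLINE = ON. The @Sql variable name is illustrative only.

```sql
IF NOT EXISTS (
    SELECT 1
    FROM sys.indexes
    WHERE name = 'IX_Event_OrganizationId'
      AND object_id = OBJECT_ID('[dbo].[Event]')
)
BEGIN
    -- Base statement, identical to the index in the migration under review.
    DECLARE @Sql NVARCHAR(MAX) = N'
        CREATE NONCLUSTERED INDEX [IX_Event_OrganizationId]
            ON [dbo].[Event]([OrganizationId] ASC)
            WHERE [OrganizationId] IS NOT NULL';

    -- Assumption: only these engine editions support online index builds.
    IF CAST(SERVERPROPERTY('EngineEdition') AS INT) IN (3, 5, 8)
    BEGIN
        SET @Sql = @Sql + N' WITH (ONLINE = ON)';
    END

    EXEC sp_executesql @Sql;
END
```

On other editions this falls back to an offline build, so the maintenance-window expectation would still need to be documented for those installs.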
public async Task CreateAsync(OrganizationEventCleanup cleanup)
{
    using var connection = new SqlConnection(ConnectionString);
    await connection.ExecuteAsync(
        "[dbo].[OrganizationEventCleanup_Create]",
        new { cleanup.Id, cleanup.OrganizationId, cleanup.QueuedAt },
        commandType: CommandType.StoredProcedure);
}
🎨 SUGGESTED: CreateAsync does not call SetNewId(), diverging from the generic Repository<T,TId> convention.
Details and fix
The base Repository<T, TId>.CreateAsync calls obj.SetNewId() before the insert (see src/Infrastructure.Dapper/Repositories/Repository.cs:46-48). Because this repository inherits from BaseRepository directly, that behavior is lost — callers who construct an OrganizationEventCleanup without setting Id will insert Guid.Empty and the second such insert will collide on the PK.
public async Task CreateAsync(OrganizationEventCleanup cleanup)
{
    cleanup.SetNewId();
    using var connection = new SqlConnection(ConnectionString);
    await connection.ExecuteAsync(
        "[dbo].[OrganizationEventCleanup_Create]",
        new { cleanup.Id, cleanup.OrganizationId, cleanup.QueuedAt },
        commandType: CommandType.StoredProcedure);
}
Same applies for QueuedAt: setting it server-side (SYSUTCDATETIME() default on the column, or inside the sproc) would remove client-clock drift from the ordering used by ReadNextPending.
mkincaid-bw left a comment
This PR creates a new table and modifies an existing table but there are no EF migrations to handle these changes for self-host/lite customers. Is that intentional?
[QueuedAt] DATETIME2 (7) NOT NULL,
[StartedAt] DATETIME2 (7) NULL,
[LastProgressAt] DATETIME2 (7) NULL,
[CompletedAt] DATETIME2 (7) NULL,
Datetime columns should end with Date, not At
https://contributing.bitwarden.com/contributing/code-style/sql/#column-definitions
Also, perhaps consider if the table should use our standard CreationDate column name for QueueDate and RevisionDate column name in lieu of LastProgressDate.
Suggested change
- [QueuedAt] DATETIME2 (7) NOT NULL,
- [StartedAt] DATETIME2 (7) NULL,
- [LastProgressAt] DATETIME2 (7) NULL,
- [CompletedAt] DATETIME2 (7) NULL,
+ [CreationDate] DATETIME2 (7) NOT NULL,
+ [RevisionDate] DATETIME2 (7) NULL,
+ [StartDate] DATETIME2 (7) NULL,
+ [CompletedDate] DATETIME2 (7) NULL,
    ON [dbo].[Event]([Date] DESC, [OrganizationId] ASC, [ActingUserId] ASC, [CipherId] ASC) INCLUDE ([ServiceAccountId], [GrantedServiceAccountId]);

GO
CREATE NONCLUSTERED INDEX [IX_Event_OrganizationId]
ℹ️ Just noting that Events for our cloud database are kept in Azure Table storage, not in the SQL database, so this index would only be applicable to self-host/lite customers.
@@ -0,0 +1,15 @@
CREATE PROCEDURE [dbo].[OrganizationEventCleanup_IncrementProgress]
This stored procedure name does not follow our naming conventions. This should be named something like OrganizationEventCleanup_UpdateProgress.
    [dbo].[OrganizationEventCleanup]
SET
    [EventsDeletedCount] = [EventsDeletedCount] + @Delta,
    [LastProgressAt] = SYSUTCDATETIME()
When updating a value in the database, the value should be generated in code and passed in as a parameter to the stored procedure. We should not be using built-in SQL functions like SYSUTCDATETIME or GETUTCDATE.
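A sketch of how the progress sproc might look with the timestamp passed in. It uses the OrganizationEventCleanup_UpdateProgress name and the RevisionDate column proposed in the other comments on this PR. The @Id and @RevisionDate parameters and all parameter types are assumptions, since the full procedure body is not shown here. The caller would supply the timestamp, for example DateTime.UtcNow from the repository method.

```sql
CREATE PROCEDURE [dbo].[OrganizationEventCleanup_UpdateProgress]
    @Id UNIQUEIDENTIFIER,        -- assumed key parameter
    @Delta BIGINT,               -- matches the long delta on the repository interface
    @RevisionDate DATETIME2(7)   -- generated in application code, not in SQL
AS
BEGIN
    SET NOCOUNT ON

    UPDATE
        [dbo].[OrganizationEventCleanup]
    SET
        [EventsDeletedCount] = [EventsDeletedCount] + @Delta,
        [RevisionDate] = @RevisionDate
    WHERE
        [Id] = @Id
END
```

The same shape would apply to the other procedures flagged for SYSUTCDATETIME: each takes the timestamp as a parameter instead of computing it.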
@@ -0,0 +1,14 @@
CREATE PROCEDURE [dbo].[OrganizationEventCleanup_MarkCompleted]
This stored procedure name does not follow our naming conventions. This should be named something like OrganizationEventCleanup_UpdateCompleted.
    [CompletedAt] = SYSUTCDATETIME(),
    [LastProgressAt] = SYSUTCDATETIME()
See previous comment regarding SYSUTCDATETIME
@@ -0,0 +1,14 @@
CREATE PROCEDURE [dbo].[OrganizationEventCleanup_MarkStarted]
This stored procedure name does not follow our naming conventions. This should be named something like OrganizationEventCleanup_UpdateStarted.
    [StartedAt] = COALESCE([StartedAt], SYSUTCDATETIME()),
    [LastProgressAt] = SYSUTCDATETIME()
See previous comments regarding SYSUTCDATETIME
@@ -0,0 +1,16 @@
CREATE PROCEDURE [dbo].[OrganizationEventCleanup_RecordError]
This stored procedure name does not follow our naming conventions. This should be named something like OrganizationEventCleanup_UpdateError.
SET
    [Attempts] = [Attempts] + 1,
    [LastError] = @Message,
    [LastProgressAt] = SYSUTCDATETIME()
See previous comments regarding SYSUTCDATETIME



🎟️ Tracking
PM-33527
📔 Objective
To stay GDPR compliant when an organization asks to be deleted, we need to remove its event logs from Azure Table Storage. This PR contains the database changes needed to delete those event logs using jobs and long-running processes.
📸 Screenshots