I noticed a strange issue where some remote accounts' custom profile fields are returned/rendered as if they were split into individual characters.
This shows up in Mastodon API clients such as Phanpy and SubwayTooter. The UI ends up showing field names like 0, 1, 2, ... with one character of HTML in each value.
I am not yet fully sure about the root cause, but here is what I found so far.
The following analysis notes were prepared together with an AI assistant.
Observed behavior
For some remote accounts, custom profile fields appear broken into many one-character entries.
This seems to happen when accounts.field_htmls is stored as a JSON string rather than a JSON object.
For example, instead of something like:
{
"Website": "<a href=\"https://example.com\">https://example.com</a>"
}
the DB contains a JSON string whose content is an object:
"{\"Website\":\"<a href=\\\"https://example.com\\\">https://example.com</a>\"}"
Then the Mastodon API serializer runs Object.entries(account.fieldHtmls). If account.fieldHtmls is a string, Object.entries() returns one [index, character] pair per character instead of one [name, value] pair per field.
Relevant code:
src/entities/account.ts
fields: Object.entries(account.fieldHtmls).map(...)
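The per-character expansion is plain JavaScript behavior and not specific to Hollo. A minimal sketch with a made-up field value (the variable names are only for illustration):

```typescript
// Hypothetical field data; not taken from a real account.
const asObject: Record<string, string> = { Website: "<a>x</a>" };

// The same data as the application would see it when the column
// holds a JSON string instead of a JSON object.
const asString: string = JSON.stringify(asObject);

// Object.entries() on an object yields one entry per field.
const objectEntries = Object.entries(asObject);

// Object.entries() on a string yields one entry per character,
// keyed by the character's index ("0", "1", "2", ...).
const stringEntries = Object.entries(asString);

console.log(objectEntries);
console.log(stringEntries.slice(0, 3));
```

The second log line shows the same shape that the affected clients render: numeric field names with a single character as each value.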
Local investigation
On my instance, I checked which JSON types field_htmls contains:
SELECT json_typeof(field_htmls), count(*)
FROM accounts
GROUP BY 1;
The affected rows have json_typeof(field_htmls) = 'string'.
More specifically:
SELECT
  count(*) FILTER (WHERE (field_htmls::text) LIKE '\"{%') AS looks_like_object_string,
  count(*) FILTER (WHERE (field_htmls::text) LIKE '\"[%') AS looks_like_array_string,
  count(*) FILTER (
    WHERE (field_htmls::text) NOT LIKE '\"{%'
      AND (field_htmls::text) NOT LIKE '\"[%'
  ) AS other_string
FROM accounts
WHERE json_typeof(field_htmls) = 'string';
Result:
looks_like_object_string = 6128
looks_like_array_string = 0
other_string = 0
The affected rows were last updated during a specific historical window:
2024-10-15 through 2024-12-29
Recent rows do not seem to be newly affected.
Possible reason this became visible recently
This may have been latent data corruption that only became visible recently.
From a quick look, older Drizzle ORM versions used by Hollo had JSON column mapping that attempted to parse string values again:
mapFromDriverValue(value) {
  if (typeof value === "string") {
    try {
      return JSON.parse(value);
    } catch {
      return value;
    }
  }
  return value;
}
So even if the database stored field_htmls as a JSON string, the application may have received it as an object after the extra parse.
After the recent Drizzle ORM 1.0 rc2 upgrade, the JSON column implementation appears to use the newer codec = "json" behavior and no longer has that extra mapFromDriverValue() parsing step.
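To illustrate why the extra parse could have masked the problem, here is a small sketch. legacyMapFromDriverValue mirrors the snippet quoted above; passthroughMap is a hypothetical stand-in for a mapping without the second parse, not Drizzle's actual codec implementation. It assumes the value reaches the mapping layer as a string that contains the object's JSON text.

```typescript
// Mirrors the older mapping quoted above: strings get parsed a second time.
function legacyMapFromDriverValue(value: unknown): unknown {
  if (typeof value === "string") {
    try {
      return JSON.parse(value);
    } catch {
      return value;
    }
  }
  return value;
}

// Hypothetical stand-in for a mapping without the extra parse.
function passthroughMap(value: unknown): unknown {
  return value;
}

// Assumed shape of a double-encoded value at the mapping layer:
// a string whose content happens to be a JSON object.
const driverValue: unknown = JSON.stringify({ Website: "<a>x</a>" });

const withExtraParse = legacyMapFromDriverValue(driverValue);
const withoutExtraParse = passthroughMap(driverValue);

console.log(typeof withExtraParse); // "object"
console.log(typeof withoutExtraParse); // "string"
```

Under that assumption, the same stored row looks healthy with the older mapping and malformed without it, which would be consistent with the symptoms appearing only after the upgrade.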
So my current guess is:
- Some remote account field_htmls rows were stored as JSON strings in the past.
- Older Drizzle versions accidentally hid this by parsing JSON strings again.
- After the Drizzle upgrade, the application receives the actual JSON string.
- Object.entries(account.fieldHtmls) then turns the string into per-character entries.
I am not 100% sure this is the whole story, but it seems to match the symptoms and the timing.
Relevant commits / areas
Potentially related recent change:
- Upgrade Drizzle ORM to 1.0 rc2 (f443210)
Places where the malformed value becomes visible:
- src/entities/account.ts (Mastodon API account serializer)
Expected behavior
Custom profile fields should be returned as normal field name/value pairs.
Malformed historical field_htmls values probably should not result in per-character fields in API responses.
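One possible way to tolerate the malformed historical rows at serialization time would be to normalize the value before calling Object.entries(). The sketch below is only an illustration: normalizeFieldHtmls is a hypothetical helper, not existing Hollo code, and it assumes that field values are always strings.

```typescript
// Hypothetical helper: accepts either a proper object or a JSON string
// that encodes an object, and always returns a plain name -> HTML map.
function normalizeFieldHtmls(value: unknown): Record<string, string> {
  let candidate: unknown = value;
  if (typeof candidate === "string") {
    try {
      candidate = JSON.parse(candidate);
    } catch {
      return {}; // Not JSON at all: treat as "no fields".
    }
  }
  // Reject null, primitives, and arrays; only plain objects are usable.
  if (
    candidate === null ||
    typeof candidate !== "object" ||
    Array.isArray(candidate)
  ) {
    return {};
  }
  const result: Record<string, string> = {};
  for (const [name, html] of Object.entries(candidate)) {
    // Keep only string values (assumption: field HTML is always a string).
    if (typeof html === "string") result[name] = html;
  }
  return result;
}

const fromObject = normalizeFieldHtmls({ Website: "<a>x</a>" });
const fromString = normalizeFieldHtmls('{"Website":"<a>x</a>"}');

console.log(Object.entries(fromObject));
console.log(Object.entries(fromString));
```

With a helper like this, the serializer could call Object.entries(normalizeFieldHtmls(account.fieldHtmls)) and return the same fields for both storage shapes. Repairing the stored rows themselves would be a separate step.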
Actual behavior
If accounts.field_htmls is a JSON string, Mastodon API account serialization expands it with Object.entries(), producing one entry per character.
AI disclosure
This issue text was drafted with AI assistance (GPT-5.5) and reviewed by a human before submission.