Custom profile fields may be split into per-character entries in Mastodon API responses #504

@ntsklab

Description

I noticed a strange issue where some remote accounts' custom profile fields are returned/rendered as if they were split into individual characters.

This shows up in Mastodon API clients such as Phanpy and SubwayTooter. The UI ends up showing field names like 0, 1, 2, ... with one character of HTML in each value.

I am not yet fully sure about the root cause, but here is what I found so far.

The following analysis notes were prepared together with an AI assistant.

Observed behavior

For some remote accounts, custom profile fields appear broken into many one-character entries.

This seems to happen when accounts.field_htmls is stored as a JSON string rather than a JSON object.

For example, instead of something like:

{
  "Website": "<a href=\"https://example.com\">https://example.com</a>"
}

the DB contains a JSON string whose content is an object:

"{\"Website\":\"<a href=\\\"https://example.com\\\">https://example.com</a>\"}"

Then the Mastodon API serializer runs Object.entries(account.fieldHtmls). If account.fieldHtmls is a string, Object.entries() returns one [index, character] pair per character, which is where the field names 0, 1, 2, ... come from.

Relevant code:

  • src/entities/account.ts
    • fields: Object.entries(account.fieldHtmls).map(...)
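
To make the failure mode concrete, here is a minimal sketch of what Object.entries() does with each shape. The toFields helper and the variable names are hypothetical and only mimic the shape of the serializer line above; they are not Hollo's actual code.

```typescript
// The shape the serializer expects: a plain object of field name -> HTML.
const fieldHtmlsAsObject = {
  Website: '<a href="https://example.com">https://example.com</a>',
};

// The shape the affected rows seem to produce: the same object, but as a string.
const fieldHtmlsAsString: string = JSON.stringify(fieldHtmlsAsObject);

// Hypothetical stand-in for the serializer's mapping step.
function toFields(
  fieldHtmls: Record<string, string> | string,
): { name: string; value: string }[] {
  return Object.entries(fieldHtmls).map(([name, value]) => ({ name, value }));
}

const good = toFields(fieldHtmlsAsObject);
const bad = toFields(fieldHtmlsAsString);

console.log(good.length); // 1
console.log(bad.length === fieldHtmlsAsString.length); // true
console.log(bad[0]); // { name: "0", value: "{" }
```

The string case yields one entry per character, which matches the 0, 1, 2, ... field names seen in the clients.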

Local investigation

On my instance, I first checked which JSON types field_htmls holds across all accounts:

SELECT json_typeof(field_htmls), count(*)
FROM accounts
GROUP BY 1;

The affected rows have:

json_typeof = string

More specifically:

SELECT
  count(*) FILTER (WHERE (field_htmls::text) LIKE '\"{%') AS looks_like_object_string,
  count(*) FILTER (WHERE (field_htmls::text) LIKE '\"[%') AS looks_like_array_string,
  count(*) FILTER (
    WHERE (field_htmls::text) NOT LIKE '\"{%'
      AND (field_htmls::text) NOT LIKE '\"[%'
  ) AS other_string
FROM accounts
WHERE json_typeof(field_htmls) = 'string';

Result:

looks_like_object_string = 6128
looks_like_array_string  = 0
other_string             = 0

The affected rows were last updated during a specific historical window:

2024-10-15 through 2024-12-29

Recent rows do not seem to be newly affected.

Possible reason this became visible recently

This may have been latent data corruption that only became visible recently.

From a quick look, older Drizzle ORM versions used by Hollo had JSON column mapping that attempted to parse string values again:

mapFromDriverValue(value) {
  if (typeof value === "string") {
    try {
      return JSON.parse(value);
    } catch {
      return value;
    }
  }
  return value;
}

So even if the database stored field_htmls as a JSON string, the application may have received it as an object after the extra parse.

After the recent Drizzle ORM 1.0 rc2 upgrade, the JSON column implementation appears to use the newer codec = "json" behavior and no longer has that extra mapFromDriverValue() parsing step.

So my current guess is:

  1. Some remote account field_htmls rows were stored as JSON strings in the past.
  2. Older Drizzle versions accidentally hid this by parsing JSON strings again.
  3. After the Drizzle upgrade, the application receives the actual JSON string.
  4. Object.entries(account.fieldHtmls) then turns the string into per-character entries.
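
The steps above can be simulated outside the ORM. This is only a sketch under assumptions: legacyMapFromDriverValue is an illustrative copy of the snippet quoted earlier, passthroughMapping is a hypothetical stand-in for a mapping without the extra parse, and the driver is assumed to decode the column text once before mapping.

```typescript
// Stand-in for the older mapping step quoted above (illustrative copy, not Drizzle's source).
function legacyMapFromDriverValue(value: unknown): unknown {
  if (typeof value === "string") {
    try {
      return JSON.parse(value);
    } catch {
      return value;
    }
  }
  return value;
}

// Hypothetical stand-in for a mapping that passes the driver's value through unchanged.
function passthroughMapping(value: unknown): unknown {
  return value;
}

// Assumption: the column text is a double-encoded object, and the driver decodes it once.
const columnText = JSON.stringify(JSON.stringify({ Website: "<a>example</a>" }));
const driverValue: unknown = JSON.parse(columnText); // a string, not an object

const before = legacyMapFromDriverValue(driverValue);
const after = passthroughMapping(driverValue);

console.log(typeof before); // "object"
console.log(typeof after); // "string"
```

If this model is right, the same stored row would look healthy to the application before the upgrade and broken after it, without the data itself changing.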

I am not 100% sure this is the whole story, but it seems to match the symptoms and the timing.

Relevant commits / areas

Potentially related recent change:

  • Upgrade Drizzle ORM to 1.0 rc2
    • f443210

Places where the malformed value becomes visible:

  • src/entities/account.ts
    • Mastodon API account serializer

Expected behavior

Custom profile fields should be returned as normal field name/value pairs.

Malformed historical field_htmls values probably should not result in per-character fields in API responses.
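
One possible shape for a defensive guard is sketched below. normalizeFieldHtmls and isStringRecord are hypothetical helpers, not existing Hollo functions, and falling back to an empty object for unusable values is an assumption about the desired behavior. Repairing the stored rows would still be needed separately.

```typescript
type FieldHtmls = Record<string, string>;

// True only for plain objects whose values are all strings.
function isStringRecord(value: unknown): value is FieldHtmls {
  return (
    typeof value === "object" &&
    value !== null &&
    !Array.isArray(value) &&
    Object.values(value).every((v) => typeof v === "string")
  );
}

// Hypothetical helper: accept an object, or a JSON string that encodes one.
function normalizeFieldHtmls(raw: unknown): FieldHtmls {
  let value: unknown = raw;
  if (typeof value === "string") {
    try {
      value = JSON.parse(value); // undo one level of double encoding
    } catch {
      return {}; // not JSON at all: show no fields rather than per-character ones
    }
  }
  return isStringRecord(value) ? value : {};
}

const fields = Object.entries(
  normalizeFieldHtmls('{"Website":"<a>example</a>"}'),
).map(([name, value]) => ({ name, value }));

console.log(fields); // [ { name: "Website", value: "<a>example</a>" } ]
```

The guard parses at most once, so a value that is still a string after parsing is treated as unusable rather than parsed repeatedly.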

Actual behavior

If accounts.field_htmls is a JSON string, Mastodon API account serialization expands it with Object.entries(), producing one entry per character.

AI disclosure

This issue text was drafted with AI assistance (GPT-5.5) and reviewed by a human before submission.
