
Don't remove stars from alt attribute of an image #44


Closed
4 tasks done
talatkuyuk opened this issue Apr 28, 2025 · 4 comments
Labels
🙋 no/question This does not need any changes 👎 phase/no Post cannot or will not be acted on

Comments

@talatkuyuk

Initial checklist

Affected package

mdast-util-from-markdown

Steps to reproduce

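import {fromMarkdown} from 'mdast-util-from-markdown';

const input = '![*text*](image.png "*title*")';

// Print the full mdast tree, including nested nodes.
console.dir(fromMarkdown(input), {depth: null});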

Actual behavior

It removes the stars from "alt" but keeps them in "title".

{
  "alt": "text",
  "title": "*title*",
  "type": "image",
  "url": "image.png",
},

Expected behavior

It should keep stars in "alt" as well.

{
  "alt": "*text*",
  "title": "*title*",
  "type": "image",
  "url": "image.png",
},

But I see that the CommonMark spec also removes the stars from "alt". I couldn't work out the rationale for removing the stars from "alt" while leaving them in "title". It would be better to keep the stars in "alt", so that I could use them as a directive for image transformations in a plugin.

I know you will close the issue and point to the CommonMark spec. I just wanted to know the rationale behind the spec.

(I opened this issue because my next question is why curly braces are also removed from the "alt" attribute of an image when remark-mdx parses it. I would expect the curly braces to be kept in "alt" when remark-mdx parses it, since the CommonMark spec keeps curly braces in the alt attribute. This issue is critical for me because of this discussion.)
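For the plugin use case described above, one possible workaround is to read the original description back from the source text instead of relying on "alt". The sketch below assumes the image node carries unist-style `position` offsets, as nodes produced by `fromMarkdown` do. The node here is written by hand so the example has no dependencies, and `rawAlt` is a hypothetical helper, not part of any package.

```javascript
// Hypothetical helper: recover the raw image description from the source.
// Assumption: node.position.start.offset and node.position.end.offset are set.
function rawAlt(source, node) {
  const raw = source.slice(node.position.start.offset, node.position.end.offset);
  if (!raw.startsWith('![')) return undefined;

  // Walk to the bracket that closes the description, tracking nesting depth.
  let depth = 1;
  for (let i = 2; i < raw.length; i++) {
    const ch = raw[i];
    if (ch === '\\') {
      i++; // Skip the escaped character.
    } else if (ch === '[') {
      depth++;
    } else if (ch === ']') {
      depth--;
      if (depth === 0) return raw.slice(2, i);
    }
  }
  return undefined;
}

const source = '![*text*](image.png "*title*")';

// Hand-written stand-in for the image node shown in "Actual behavior".
const imageNode = {
  type: 'image',
  url: 'image.png',
  title: '*title*',
  alt: 'text',
  position: {
    start: {line: 1, column: 1, offset: 0},
    end: {line: 1, column: source.length + 1, offset: source.length}
  }
};

console.log(rawAlt(source, imageNode)); // prints "*text*"
```

This is a simplification: it does not account for brackets inside code spans, and it only makes sense for inline images, not reference-style ones with definitions elsewhere.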

Runtime

node@latest

Package manager

npm@latest

Operating system

macos@latest

Build and bundle tools

No response

@github-actions github-actions bot added 👋 phase/new Post is being triaged automatically 🤞 phase/open Post is being triaged manually and removed 👋 phase/new Post is being triaged automatically labels Apr 28, 2025
@ChristianMurphy
Member

It comes from this line in the spec:

Though this spec is concerned with parsing, not rendering, it is recommended that in rendering to HTML, only the plain string content of the image description be used.

https://spec.commonmark.org/0.31.2/#example-575

This is because alt is intended for screen readers. Extra symbols are meant for sighted users and are handled inconsistently (at best) by screen readers. Traditionally, asterisks are ignored by screen readers and should not be included in text or alt text intended to be consumed by a reader.
https://www.deque.com/blog/dont-screen-readers-read-whats-screen-part-1-punctuation-typographic-symbols/
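To illustrate what "only the plain string content" means in practice, here is a minimal sketch. It assumes the image description is first parsed as inline content (so `*text*` becomes an emphasis node wrapping a text node) and is then flattened to a string. The nodes are written by hand and `plainText` is a hypothetical helper, shown only to make the idea concrete.

```javascript
// Hypothetical helper: concatenate the text found in a list of inline nodes,
// ignoring the formatting nodes that wrap it.
function plainText(nodes) {
  return nodes
    .map((node) => {
      if (typeof node.value === 'string') return node.value;
      if (Array.isArray(node.children)) return plainText(node.children);
      return '';
    })
    .join('');
}

// Hand-written inline content for the description `*text*`.
const description = [
  {type: 'emphasis', children: [{type: 'text', value: 'text'}]}
];

console.log(plainText(description)); // prints "text"
```

The asterisks never reach "alt" because they are consumed as emphasis markers before the description is flattened. The title is not parsed as inline content, which is why `*title*` survives unchanged.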

@ChristianMurphy ChristianMurphy added the 🙋 no/question This does not need any changes label Apr 28, 2025
@github-actions github-actions bot added 👎 phase/no Post cannot or will not be acted on and removed 🤞 phase/open Post is being triaged manually labels Apr 28, 2025
@talatkuyuk
Author

talatkuyuk commented May 1, 2025

Thank you for the clarification, @ChristianMurphy, that makes sense in the context of accessibility and screen readers. I got the rationale.

However, I still find the decision to strip only the asterisk (*) character from image alt text somewhat questionable. This specific exclusion suggests that the motivation might be more than just accessibility.

Looking at the CommonMark spec, link texts [text]() are parsed as inline markdown and can include formatting elements like strong and em, which correspond to the same syntactic position as the image alt. My assumption is that the spec authors wanted to prevent similar parsing behavior from happening in image alt text, especially since alt attributes are plain strings and not rendered HTML. From this perspective, disallowing * could be a preemptive measure to avoid confusion or misuse.

Still, this feels overly prescriptive. Users might initially include asterisks in image descriptions expecting inline markdown parsing, but upon seeing that the asterisks are rendered literally in the fallback alt text when the image fails to load (or read aloud awkwardly by screen readers), they would naturally adjust their behavior and remove them themselves. Over time, such feedback would discourage misuse organically, without needing to enforce this restriction at the parser level.

In my view, this behavior limits flexibility. I suspect it stems from the desire to avoid the formatting complexity already present in link labels, and that rationale was extended, unnecessarily in my opinion, to image descriptions.

Anyway, I understand, and there is nothing to do since the CommonMark spec is what it is.

My main question: why are curly braces {} also removed from the alt attribute of an image when it is parsed by remark-mdx? I would expect curly braces to be kept in "alt" when remark-mdx parses it, since the CommonMark spec keeps curly braces in the alt attribute. 100% CommonMark compliance is always mentioned in the docs! Curly braces in an image like ![{here}](...) should not be removed, especially considering that MDX is still markdown at its core, just with JSX extensions. Should I open an issue on remark-mdx for this?

@ChristianMurphy
Member

In my view, this behavior limits flexibility. I suspect it stems from the desire to avoid the formatting complexity already present in link labels, and that rationale was extended, unnecessarily in my opinion, to image descriptions.

If you disagree with the approach, CommonMark has a discussion forum where you could share your thoughts with the spec team: https://talk.commonmark.org/

Should I open an issue on remark-mdx for this?

Yes, if you want to discuss it further. That is specific to remark-mdx.

My main question: why curly braces {}

{} are a JSX inline variable, which is not allowed to appear in the link text or alt in MDX.

@talatkuyuk
Author

talatkuyuk commented May 2, 2025

Thanks @ChristianMurphy, I will continue with https://talk.commonmark.org/ and remark-mdx, then.

{} are a JSX inline variable, which is not allowed to appear in the link text or alt in MDX.

Regarding {},

  • MDX expressions in link text, for example [**{name}**](...), are interpolated, parsed as inline markdown, and produce the equivalent of the JSX <strong>{name}</strong>, as expected in MDX. That is fine.
  • However, in image alt, for example, ![{name}](...) produces <img src="..." alt="name">, but it should produce <img src="..." alt="{name}"> as the CommonMark spec says. If I use double curly braces in alt, one pair of curly braces survives in my tests: ![{{name}}](...) produces <img src="..." alt="{name}">, meaning the MDX parser does not remove all curly braces. In my view, the authors of the MDX spec should allow {} to appear in the alt attribute in order to stay CommonMark compliant. The title of an image or link can hold {} even in MDX, so why can't alt hold {}, in line with CommonMark? I have insisted on this issue because of that.
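The two outputs reported above are consistent with one simple model, sketched here as an assumption rather than a confirmed description of remark-mdx internals: the description is parsed as inline MDX, `{...}` becomes an expression node whose `value` is the source between the outer braces, and "alt" is built by flattening node values. The `mdxTextExpression` nodes below are written by hand, and `flattenDescription` is a hypothetical helper.

```javascript
// Hypothetical helper: build an alt string by joining the `value` of every
// node in the description, descending into children where present.
function flattenDescription(nodes) {
  return nodes
    .map((node) => {
      if (typeof node.value === 'string') return node.value;
      if (Array.isArray(node.children)) return flattenDescription(node.children);
      return '';
    })
    .join('');
}

// Assumed parse of `![{name}](...)`: one expression, outer braces dropped.
const single = [{type: 'mdxTextExpression', value: 'name'}];

// Assumed parse of `![{{name}}](...)`: the inner braces are part of the
// expression source, so they remain in `value`.
const doubled = [{type: 'mdxTextExpression', value: '{name}'}];

console.log(flattenDescription(single)); // prints "name"
console.log(flattenDescription(doubled)); // prints "{name}"
```

Under this model the braces are not stripped as characters. The outer pair is consumed as expression delimiters, which would explain why exactly one pair disappears in both tests.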

Development

No branches or pull requests

2 participants