Vote #81398
完了Code blocks - consistent rendering and retaining user-supplied language name in rendered HTML
0%
説明
This patch:
- Makes code block rendering consistent across formats, which is important for plugins, client-side scripts and styling.
- Makes sure rendered language name is escaped.
- Exposes user-supplied language name in an HTML attribute, so that client-side scripts and Redmine plugins can always access it.
The patch applies to the current Textile and Markdown formatters. The very same change to the planned CommonMark formatter will be attached to #32424.
Please see below for design decisions rationale.
h3. Code block rendering
@<pre><code>@ vs. -just @<pre>@-:
Textile supports blocks without code (@<pre>...</pre>@), while code can be part of such blocks. Typically, there is just one and only code block per @<pre>@ block, but it's possible to have more than one code block in a @<pre>@ block. So in general, we cannot avoid the @code@ tag in the HTML rendered from Textile.
While it is possible to render only the @pre@ tag for Markdown code blocks and rendering libraries do have option for that, we should use the @code@ tag too to be consistent. Fortunatelly, this is already the case and it is even what CommonMark spec suggests.
-No-code blocks in Markdown-
Markdown does not have preformatted blocks except for code blocks. Code blocks are either idented or fenced, but both are meant to represent code ("spec":https://spec.commonmark.org/0.29/#indented-code-blocks).
The fenced code blocks can also carry an info string, which is used to determine the languge of the code.
Comparison of different renderers:
|<. Formatter |. No-code block → HTML |. Code block w/o lang → HTML |. Code block with supported lang → HTML |. Code block with unsupported lang → HTML |
|<. Redmine Textile | @<pre>...</pre>@ | @<pre><code>...</code></pre>@ | @<pre><code class="ruby syntaxhl">...</code></pre>@ | @<pre><code>...</code></pre>@ |
|<. CommonMark reference renderer | N/A | @<pre><code>...</code></pre>@ | @<pre><code class="language-ruby">...</code></pre>@ | @<pre><code class="language-foo">...</code></pre>@ |
|<. Redmine Markdown | N/A | @<pre>...</pre>@ | @<pre><code class="ruby syntaxhl">...</code></pre>@ | @<pre>...</pre>@ |
|_<. Redmine CommonMark #32424 | N/A | @<pre><code>...</code></pre>@ | @<pre><code class="ruby syntaxhl">...</code></pre>@ | @<pre>...</pre>@ |
It's quite obvious that we should follow the Textile behavior to be consistent.
h3. Original data language
The user-supplied language information is currently dropped if the particular language does not have any lexer for syntax highlighting. However, there are several use cases when this information is valuable:
- Plugins like "redmine_wysiwyg_editor":https://redmine.org/plugins/redmine_wysiwyg_editor
- Data format converters like "redmine_reformat":https://redmine.org/plugins/redmine_reformat
- Client-side scripts that could render or decorate specific languages in their own way, e.g. "mermaid":https://mermaid-js.github.io/mermaid/#/
For the reasons above, the information has to be carried on the @code@ tag. I've examined a few projects and libraries related to the topic to find an appropriate name for such attribute. The conclusion is to use @<code data-language="foo" ...>@.
Resulting code block rendering:
|<. Formatter |. No-code block → HTML |. Code block w/o lang → HTML |. Code block with supported lang → HTML |. Code block with unsupported lang → HTML |
|<. All Redmine formatters | @<pre>...</pre>@ | @<pre><code>...</code></pre>@ |
@<pre><code class="ruby syntaxhl" data-language="ruby">...</code></pre>@ | @<pre><code data-language="foo">...</code></pre>@ |
h3. Quality Assurance, Security & Compatibility
- +QA+: Added new unit tests and updated the existing ones.
- +Security+: Previously, Textile formatter did not escape language name and relied on validating it against syntax highlighter, while expecting language names to be safe. Language names are now always subject to escaping.
- +Compatibility+: ** The rendered HTML is now the same as the HTML from Textile. ** The added @data-language@ attribute should be indeed ignored by unconcerned components. ** I don't really expect that anybody to rely on getting @<pre>@ and not @<pre><code>@, especially when it is the current behavior of Textile. I've verified this on redmine_wysiwyg_editor as a rare example of a potentially concerned plugin.
So I'm convinced it is safe to include this in any minor release.
journals
--------------------------------------------------------------------------------
...accidentaly marked this as a defect, it should have been patch. Thank you.
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Patches committed, thanks again Martin for your work on this.
--------------------------------------------------------------------------------
related_issues
relates,Closed,32424,CommonMark Markdown Text Formatting