<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE reference PUBLIC "-//OASIS//DTD DITA Reference//EN" "reference.dtd">
<!-- This file is part of the DITA Open Toolkit project. See the accompanying LICENSE file for applicable license. -->
<reference id="code-reference">
<title>Extended codeblock processing</title>
<titlealts>
<navtitle>Codeblock extensions</navtitle>
</titlealts>
<shortdesc>DITA-OT provides additional processing support beyond what the DITA specification mandates.
These extensions can be used to define character encodings or line ranges for code references, normalize
indentation, add line numbers, or display whitespace characters in code blocks.</shortdesc>
<prolog>
<metadata>
<keywords>
<indexterm><xmlelement>coderef</xmlelement></indexterm>
<indexterm><xmlelement>codeblock</xmlelement></indexterm>
<indexterm><xmlatt>format</xmlatt></indexterm>
<indexterm><xmlatt>outputclass</xmlatt></indexterm>
<indexterm>encoding</indexterm>
<indexterm><msgnum>DOTJ052E</msgnum></indexterm>
<indexterm>character set</indexterm>
</keywords>
</metadata>
</prolog>
<refbody>
<section id="coderef-charset">
<title>Character set definition</title>
<p>For <xmlelement>coderef</xmlelement> elements, DITA-OT supports defining the code reference target file
encoding using the <xmlatt>format</xmlatt> attribute. The supported format is:</p>
<codeblock>format (";" space* "charset=" charset)?</codeblock>
<p>If a character set is not defined, the system default character set will be used. If the character set is not
recognized or supported, the <msgnum>DOTJ052E</msgnum> error is thrown and the system default character set is
used as a fallback.</p>
<codeblock outputclass="language-xml">&lt;coderef href="unicode.txt" format="txt; charset=UTF-8"/></codeblock>
<p>As of DITA-OT 3.3, the default character set for code references can be changed by adding the
<parmname>default.coderef-charset</parmname> key to the
<xref keyref="configuration-properties-file">configuration.properties</xref> file:</p>
<codeblock outputclass="language-properties">default.coderef-charset = ISO-8859-1</codeblock>
<p>The character set values are those supported by the Java
<xref format="html" href="https://docs.oracle.com/javase/8/docs/api/java/nio/charset/Charset.html"
scope="external">Charset</xref> class.</p>
</section>
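<example>
<p>As a further sketch of the same syntax, the following reference reads a file as ISO-8859-1. The
<filepath>messages.properties</filepath> file name and the <codeph>properties</codeph> format value are
assumptions made for illustration. Because the grammar above allows zero or more spaces after the semicolon,
the space before <codeph>charset</codeph> is left out here:</p>
<codeblock outputclass="language-xml">&lt;coderef href="messages.properties" format="properties;charset=ISO-8859-1"/></codeblock>
</example>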
<section>
<title>Line range extraction</title>
<p>Code references can be limited to extract only a specified line range by defining the
<codeph>line-range</codeph> pointer in the URI fragment. The format is:</p>
<codeblock>uri ("#line-range(" start ("," end)? ")" )?</codeblock>
<p>Start and end line numbers start from 1 and are inclusive. If the end range is omitted, the range ends on the
last line of the file.</p>
</section>
<example>
<codeblock outputclass="language-xml">&lt;coderef href="Parser.scala#line-range(5, 10)" format="scala"/></codeblock>
<p>Only lines from 5 to 10 will be included in the output.</p>
</example>
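<example>
<p>The end line is optional. As a sketch based on the format above, reusing the
<filepath>Parser.scala</filepath> file from the previous example, the following reference should include
everything from line 5 through the last line of the file:</p>
<codeblock outputclass="language-xml">&lt;coderef href="Parser.scala#line-range(5)" format="scala"/></codeblock>
</example>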
<section>
<title>RFC 5147</title>
<indexterm>RFC 5147</indexterm>
<p>DITA-OT also supports the line position and range syntax from
<xref keyref="rfc5147"/>. The format for line range is:</p>
<codeblock>uri ("#line=" start? "," end? )?</codeblock>
<p>Line numbers start from 0. The start line is inclusive and the end line is exclusive. If the start is
omitted, the range starts from the first line; if the end is omitted, the range ends on the last line of
the file. The format for line position is:</p>
<codeblock>uri ("#line=" position )?</codeblock>
<p>The position line number starts from 0.</p>
</section>
<example>
<codeblock outputclass="language-xml">&lt;coderef href="Parser.scala#line=4,10" format="scala"/></codeblock>
<p>Because counting starts at 0 and the end is exclusive, only the fifth through tenth lines of the file
(lines 5 to 10 when counting from 1) will be included in the output.</p>
</example>
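<example>
<p>Following the rules above, either end of the range may be left out. The references below are illustrative
sketches that reuse <filepath>Parser.scala</filepath> from the previous example. The first should select the
first ten lines of the file (positions 0 up to, but not including, 10), and the second should select
everything from the fifth line onward:</p>
<codeblock outputclass="language-xml">&lt;coderef href="Parser.scala#line=,10" format="scala"/>
&lt;coderef href="Parser.scala#line=4," format="scala"/></codeblock>
</example>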
<section>
<title>Line range by content</title>
<p>Instead of specifying line numbers, you can also select lines to include in the code reference by specifying
keywords (or “<term>tokens</term>”) that appear in the referenced file.</p>
<div id="coderef-by-content">
<p>DITA-OT supports the <codeph>token</codeph> pointer in the URI fragment to extract a line range based on the
file content. The format for referencing a range of lines by content is:</p>
<codeblock>uri ("#token=" start? ("," end)? )?</codeblock>
<p>Lines identified using start and end tokens are exclusive: the lines that contain the start token and end
token will not be included. If the start token is omitted, the range starts from the first line in the
file; if the end token is omitted, the range ends on the last line of the file.</p>
</div>
</section>
<example>
<p>Given a Haskell source file named <filepath>fact.hs</filepath> with the following content,</p>
<codeblock outputclass="language-haskell normalize-space show-line-numbers show-whitespace"><coderef href="../resources/fact.hs"/></codeblock>
<p>a range of lines can be referenced as:</p>
<codeblock outputclass="language-xml">&lt;coderef href="fact.hs#token=START-FACT,END-FACT"/></codeblock>
<p>to include the range of lines that follows the <codeph>START-FACT</codeph> token on Line 1, up to (but not
including) the line that contains the <codeph>END-FACT</codeph> token (Line 5). The resulting
<xmlelement>codeblock</xmlelement> would contain lines 2–4:</p>
<codeblock outputclass="language-haskell"><coderef href="../resources/fact.hs#token=START-FACT,END-FACT"/></codeblock>
<note type="tip" id="coderef-by-content-tip">This approach can be used to reference code samples that are
frequently edited. In these cases, referencing line ranges by line number can be error-prone, as the target line
range for the reference may shift if preceding lines are added or removed. Specifying ranges by line content
makes references more robust, as long as the <codeph>token</codeph> keywords are preserved when the referenced
resource is modified.</note></example>
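<example>
<p>Either token may be omitted. As a sketch based on the format above, using the same
<filepath>fact.hs</filepath> file, the first reference below should include the lines from the start of the
file up to (but not including) the line that contains <codeph>END-FACT</codeph>. The second should include the
lines after the <codeph>START-FACT</codeph> line through the end of the file:</p>
<codeblock outputclass="language-xml">&lt;coderef href="fact.hs#token=,END-FACT"/>
&lt;coderef href="fact.hs#token=START-FACT"/></codeblock>
</example>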
<refbodydiv id="normalize-codeblock-whitespace">
<section>
<title>Whitespace normalization</title>
<indexterm>whitespace handling</indexterm>
<p>DITA-OT can adjust the leading whitespace in code blocks to remove excess indentation and keep lines short.
Given an XML snippet in a codeblock with lines that all begin with spaces (indicated here as dots “·”),</p>
</section>
<example>
<p><codeblock outputclass="language-xml">··&lt;subjectdef keys="audience">
····&lt;subjectdef keys="novice"/>
····&lt;subjectdef keys="expert"/>
··&lt;/subjectdef></codeblock></p>
<p>DITA-OT can remove the leading whitespace that is common to all lines in the code block. To trim the excess
space, set the <xmlatt>outputclass</xmlatt> attribute on the <xmlelement>codeblock</xmlelement> element to
include the <codeph>normalize-space</codeph> keyword.</p>
<p>In this case, two spaces (“··”) would be removed from the beginning of each line, shifting content to the
left by two characters, while preserving the indentation of lines that contain additional whitespace (beyond
the common indent):</p>
<p><codeblock outputclass="language-xml">&lt;subjectdef keys="audience">
··&lt;subjectdef keys="novice"/>
··&lt;subjectdef keys="expert"/>
&lt;/subjectdef></codeblock></p>
</example>
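<example>
<p>In source markup, the keyword sits alongside any other <xmlatt>outputclass</xmlatt> values. A minimal
sketch, assuming a code block that pulls its content from the <filepath>fact.hs</filepath> sample used
earlier in this topic:</p>
<codeblock outputclass="language-xml">&lt;codeblock outputclass="language-haskell normalize-space">&lt;coderef href="fact.hs"/>&lt;/codeblock></codeblock>
</example>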
</refbodydiv>
<refbodydiv id="visualize-codeblock-whitespace">
<section>
<title>Whitespace visualization (PDF)</title>
<p>DITA-OT can be set to display the whitespace characters in code blocks to visualize indentation in PDF
output.</p>
<p>To enable this feature, set the <xmlatt>outputclass</xmlatt> attribute on the
<xmlelement>codeblock</xmlelement> element to include the <codeph>show-whitespace</codeph> keyword.</p>
<p>When PDF output is generated, space characters in the code are replaced with a middle dot or “interpunct”
character ( <codeph>·</codeph> ); tab characters are replaced with a rightwards arrow and three spaces
( <codeph>→   </codeph> ).</p>
</section>
<example deliveryTarget="pdf">
<fig>
<title>Sample Java code with visible whitespace characters <i>(PDF only)</i></title>
<codeblock outputclass="language-java show-whitespace">for (int i = 0; i &lt; 10; i++) {
    System.out.println(i);
}</codeblock>
</fig>
</example>
</refbodydiv>
<refbodydiv id="codeblock-line-numbers">
<section>
<title>Line numbering (PDF)</title>
<indexterm>line numbering</indexterm>
<p>DITA-OT can be set to add line numbers to code blocks to make it easier to distinguish specific lines.</p>
<p>To enable this feature, set the <xmlatt>outputclass</xmlatt> attribute on the
<xmlelement>codeblock</xmlelement> element to include the <codeph>show-line-numbers</codeph> keyword.</p>
</section>
<example deliveryTarget="pdf">
<fig>
<title>Sample Java code with line numbers and visible whitespace characters <i>(PDF only)</i></title>
<codeblock outputclass="language-java show-line-numbers show-whitespace">for (int i = 0; i &lt; 10; i++) {
    System.out.println(i);
}</codeblock>
</fig>
</example>
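<example>
<p>Because <xmlatt>outputclass</xmlatt> takes a space-separated list of values, the keywords described in this
topic can be combined on a single code block. A sketch, assuming a hypothetical
<filepath>Example.java</filepath> source file that is pulled in by reference:</p>
<codeblock outputclass="language-xml">&lt;codeblock outputclass="language-java normalize-space show-line-numbers show-whitespace">&lt;coderef href="Example.java" format="java"/>&lt;/codeblock></codeblock>
<p>The line numbers and whitespace markers would appear only in PDF output, as noted in the section titles
above.</p>
</example>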
</refbodydiv>
</refbody>
</reference>