udoxy

Guidelines and script (bash) for generic standalone code documentation

Rationale for guidelines adoption

Why document your code (if that’s not already obvious)?

Beyond enabling the sharing and reuse of your code, documenting it has practical benefits: it supports reproducibility and verification, and it makes later extension and migration possible.

Hence, good documentation is not only useful to the users who run and (re)use your code; it also helps developers maintain, share, extend, and migrate this code.

As stated by Ince et al. (2011), “with some exceptions, anything less than the release of source programs is intolerable for results that depend on computation”. Ultimately, we believe that one should “provide public access to scripts, runs, and results” (Sandve et al., 2013): not only the outcomes of a given analysis, but also the processes, data and tools needed to produce them should be open and shared. Source code documentation supports these objectives.

Why adopt markdown for the documentation?

Lightweight markup languages, such as markdown and AsciiDoc, provide formats that are both processable by documentation generators and easily readable by human produsers (see also the comparison between languages).

| Language | Supported implementations | Output: XHTML | Output: PDF | Output: DocBook | Output: ODF | Output: Doc |
|---|---|---|---|---|---|---|
| AsciiDoc | Python, JavaScript, Ruby | Yes | Yes | Yes | Yes | Yes |
| markdown (and variants) | C, C#, Java, R, Python, JavaScript, Ruby, PHP, Perl, Haskell | Yes (HTML) | Yes | Yes | Yes | Yes |
| MediaWiki | PHP, Perl, Haskell | Yes | No | No | No | No |
| reStructuredText | Java, Python, Haskell | Yes (HTML, XML) | Yes | Yes | Yes | No |

For some languages, the literature may provide “consistent” examples of documentation; still, they are often not generic enough and do not go beyond inline documentation (which targets the developer, not the user). For instance, there is, to our knowledge, no documentation framework or built-in tool that is compatible with SAS (a tool like DocItOut has not been maintained since 2008).

Based on the results reported in the wiki mentioned above, we preselected 4 markup languages that: (i) are widely adopted and open source, (ii) enable HTML import/export (note though that Textile does not enable HTML import), and (iii) are supported (possibly through different documentation generators) by more than one language. In the end, markdown is the language adopted.

Note that it is also important that the use of a specific documentation style (possibly associated with a given generator) does not alter the natural documentation of a language (intrinsic to the language itself). In many languages (like SAS or Stata), this is not an issue, since the documentation is inserted as comments, as in C.
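
To illustrate this point, here is a minimal Python sketch. Python is used only for the example, and the function `rescale` is hypothetical. It assumes Doxygen's documented convention of reading Python comment blocks that start with `##`, and it shows that such a block can sit next to the language's native docstring without replacing it.

```python
## @brief Rescale a list of numbers to the unit interval.
#
#  Markdown can be used inside the block, e.g. **bold** text or `code`.
#
#  @param values non-empty list of numbers
#  @return a new list of floats between 0 and 1
def rescale(values):
    """Rescale values linearly so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical: avoid a division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# The native docstring is still what Python itself reports
print(rescale.__doc__)
print(rescale([2, 4, 6]))
```

For the interpreter, the `##` block is an ordinary comment, so Python's own tools (`help()`, `__doc__`) keep reporting the docstring unchanged.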

Why use Doxygen as the documentation generator?

Documentation generators can be used to create portable documentation. Such tools (e.g. the well-known javadoc) generate software documentation from comments embedded in the source code.
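
The core idea can be sketched in a few lines of Python. This is a toy illustration only, not how Doxygen, javadoc or udoxy are actually implemented: it assumes that the documentation lives in C-style `/** ... */` blocks, and the SAS-like macro `add_one` is a made-up example.

```python
import re

# Hypothetical source file: a SAS-like macro with a C-style documentation header
SOURCE = '''\
/**
## add_one

Return the input incremented by one.

- `x`: input number
*/
%macro add_one(x);
  %eval(&x + 1)
%mend;
'''

# Matches /** ... */ blocks; DOTALL lets '.' span several lines
DOC_BLOCK = re.compile(r"/\*\*(.*?)\*/", re.DOTALL)


def extract_docs(source):
    """Return the text of every /** ... */ block, without surrounding whitespace."""
    return [block.strip() for block in DOC_BLOCK.findall(source)]


docs = extract_docs(SOURCE)
print(docs[0])  # the markdown header only, none of the macro code
```

A real generator goes much further (cross-references, several output formats, language-aware parsing), but the separation shown here is the same: the documentation is read from the comments, and the code itself is left untouched.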

Generators compared: Doc++, Doxygen, HeaderDoc, Natural Docs, ROBODoc, Sphinx

Programming languages
C/C++ Yes Yes Yes Yes (partial) Yes Yes
C# Yes Yes
Java Yes Yes Yes Yes (partial) Yes
Python Yes Yes Yes (partial) Yes Yes
JavaScript Yes Yes Yes Yes
IDL Yes Yes Yes Yes
PHP Yes Yes Yes (partial) Yes Yes
Perl Yes Yes Yes
Ruby Yes Yes (partial) Yes Yes
SQL Yes Yes (partial) Yes
Visual Basic Yes (plugin) Yes (partial) Yes (plugin)
R

Output types
HTML Yes Yes Yes Yes Yes Yes
XML Yes Yes Yes
DocBook Yes Yes
man Yes Yes Yes Yes
RTF Yes Yes Yes
PDF/PS Yes Yes Yes Yes
LaTeX Yes Yes Yes Yes

Based on the results reported in the previously mentioned wiki, we preselected 6 documentation generators that: (i) are open source, (ii) are multi-platform, i.e. run on Windows, Linux, Unix, Mac OS X and BSD operating systems (note though that HeaderDoc does not run directly on Windows), and (iii) support more than one language. Our final choice is Doxygen, in part because it supports markdown.

References