Documentation Best Practices
============================
This guide provides practical advice for creating high-quality documentation for scientific Python projects, with a focus on beginner-friendly approaches and integration with lab workflows.
Why Documentation Matters
--------------------------
Good documentation:
* π **Helps future you**: Remember what you did 6 months from now
* π€ **Enables collaboration**: Team members can understand and use your code
* π¬ **Ensures reproducibility**: Others can replicate your analysis
* π **Supports publication**: Methods section writes itself
* βοΈ **Meets compliance**: Required for regulated industries
* π **Facilitates learning**: Onboard new lab members faster
**Reality check**: Writing documentation takes time, but saves much more time later.
Documentation Types
-------------------
Different documentation serves different purposes:
.. list-table:: Documentation Types
:widths: 25 35 40
:header-rows: 1
* - Type
- Purpose
- Examples
* - Code Documentation
- Explain how functions work
- Docstrings, inline comments
* - User Guide
- Show how to use the code
- Tutorials, examples, workflows
* - API Reference
- List all functions/classes
- Auto-generated from docstrings
* - Theory/Methods
- Explain the science
- Equations, algorithms, references
* - Contributing Guide
- Help others contribute
- Setup, style guide, PR process
**For beginners**: Start with docstrings and one example. Expand from there.
Writing Docstrings
------------------
Docstrings are documentation inside your code. Python uses NumPy-style docstrings:
Basic Structure
~~~~~~~~~~~~~~~
.. code-block:: python
def function_name(parameter1, parameter2):
"""
One-line summary of what the function does.
Longer description with more details about the function,
its purpose, and how it works. Can span multiple lines.
Parameters
----------
parameter1 : type
Description of the first parameter.
parameter2 : type
Description of the second parameter.
Returns
-------
return_type
Description of what the function returns.
Examples
--------
>>> result = function_name(1, 2)
>>> print(result)
3
Notes
-----
Any additional notes, warnings, or important information.
See Also
--------
related_function : Brief description of relation
"""
# Function implementation
pass
Docstring Sections
~~~~~~~~~~~~~~~~~~
**Required sections**:
* **Summary line**: One line explaining what it does
* **Parameters**: All input parameters with types
* **Returns**: What the function returns
**Recommended sections**:
* **Examples**: Actual code showing usage
* **Notes**: Important details, limitations, or warnings
* **See Also**: Links to related functions
**Optional sections**:
* **References**: Citations to papers or books
* **Raises**: Exceptions that might be raised
Example: Good Docstring
~~~~~~~~~~~~~~~~~~~~~~~~
.. code-block:: python
def calculate_peak_area(time, absorbance, start_idx, end_idx):
"""
Calculate the area under a chromatographic peak.
Uses trapezoidal integration to compute the peak area between
specified indices. This is a standard method for quantitative
chromatography analysis.
Parameters
----------
time : np.ndarray
Time values in minutes (1D array).
absorbance : np.ndarray
Absorbance values in mAU (1D array).
start_idx : int
Index of peak start in the data arrays.
end_idx : int
Index of peak end in the data arrays.
Returns
-------
area : float
Peak area in mAUΒ·min.
Examples
--------
>>> time = np.array([0.0, 0.1, 0.2, 0.3, 0.4])
>>> absorbance = np.array([2.0, 5.0, 8.0, 5.0, 2.0])
>>> area = calculate_peak_area(time, absorbance, 0, 4)
>>> print(f"Peak area: {area:.2f} mAUΒ·min")
Peak area: 2.00 mAUΒ·min
Notes
-----
For eLabFTW integration, document integration parameters
(start/end indices, baseline correction) in the experiment
record for full traceability.
References
----------
.. [1] Snyder, L. R., et al. (2010). Introduction to Modern
Liquid Chromatography. Wiley.
"""
return np.trapezoid(absorbance[start_idx:end_idx+1],
time[start_idx:end_idx+1])
Type Hints
~~~~~~~~~~
Use type hints for clarity:
.. code-block:: python
from typing import Tuple, List, Optional
import numpy as np
def find_peaks(time: np.ndarray,
absorbance: np.ndarray,
threshold: float = 10.0) -> List[dict]:
"""
Find peaks in chromatogram data.
Parameters
----------
time : np.ndarray
Time values.
absorbance : np.ndarray
Absorbance values.
threshold : float, optional
Minimum peak height (default: 10.0).
Returns
-------
list of dict
Each dict contains 'retention_time', 'height', 'area'.
"""
pass
Type hints help:
* Users understand what inputs are expected
* IDEs provide better autocomplete
* Tools can catch type errors
Writing User Guides
-------------------
User guides show *how* to use your code. Key principles:
Start with Installation
~~~~~~~~~~~~~~~~~~~~~~~
.. code-block:: rst
Installation
============
Requirements
------------
* Python 3.7 or higher
* NumPy 1.19.0 or higher
* eLabFTW instance (optional, for full workflow)
Install
-------
Clone the repository:
.. code-block:: bash
git clone https://github.com/username/project.git
cd project
Install dependencies:
.. code-block:: bash
pip install -r requirements.txt
Provide Quick Start
~~~~~~~~~~~~~~~~~~~
.. code-block:: rst
Quick Start
===========
Analyze an HPLC chromatogram in 3 steps:
1. **Load data**:
.. code-block:: python
from hplc_analysis import load_chromatogram
time, absorbance = load_chromatogram('data/sample.txt')
2. **Find peaks**:
.. code-block:: python
from hplc_analysis import find_peaks
peaks = find_peaks(time, absorbance, threshold=50.0)
3. **View results**:
.. code-block:: python
for i, peak in enumerate(peaks, 1):
print(f"Peak {i}: {peak['retention_time']:.2f} min")
Include Complete Examples
~~~~~~~~~~~~~~~~~~~~~~~~~~
Show realistic, complete workflows:
.. code-block:: rst
Complete Workflow Example
=========================
This example shows the full analysis pipeline from data loading
to results reporting, including eLabFTW integration.
Step 1: Reference eLabFTW Experiment
-------------------------------------
Before starting, note your eLabFTW experiment ID::
Experiment: #67890
URL: https://your-instance.org/experiments.php?mode=view&id=67890
Step 2: Load and Analyze Data
------------------------------
.. code-block:: python
import numpy as np
from hplc_analysis import analyze_chromatogram
# Reference eLabFTW in code
# Experiment: https://your-instance.org/experiments.php?mode=view&id=67890
results = analyze_chromatogram('data/sample.txt', threshold=50.0)
print(f"Found {results['n_peaks']} peaks")
for peak in results['peaks']:
print(f" RT: {peak['retention_time']:.2f} min")
print(f" Area: {peak['area']:.1f} mAUΒ·min")
Step 3: Document in eLabFTW
----------------------------
Upload results to your eLabFTW experiment and include:
* Peak table (CSV export)
* Analysis parameters used
* GitHub commit hash for reproducibility
Using Sphinx
------------
Sphinx converts documentation to beautiful HTML. Key concepts:
reStructuredText Basics
~~~~~~~~~~~~~~~~~~~~~~~~
reStructuredText (.rst files) is a markup language:
.. code-block:: rst
Section Heading
===============
Subsection
----------
**Bold text** and *italic text*
Bullet lists:
* First item
* Second item
* Third item
Numbered lists:
1. Step one
2. Step two
3. Step three
Code blocks:
.. code-block:: python
import numpy as np
x = np.array([1, 2, 3])
External links:
`Python `_
Cross-references:
See :doc:`other_page` for more details.
See :func:`module.function` for API docs.
Mathematical Equations
~~~~~~~~~~~~~~~~~~~~~~
Use LaTeX syntax for equations:
.. code-block:: rst
The peak area :math:`A` is calculated as:
.. math::
A = \int_{t_1}^{t_2} h(t) \, dt
where :math:`h(t)` is the absorbance at time :math:`t`.
This renders as:
The peak area :math:`A` is calculated as:
.. math::
A = \int_{t_1}^{t_2} h(t) \, dt
where :math:`h(t)` is the absorbance at time :math:`t`.
Cross-References
~~~~~~~~~~~~~~~~
Link between documentation pages:
.. code-block:: rst
# Link to another page
See :doc:`elabftw_integration` for details.
# Link to a section
See :ref:`writing-docstrings` for examples.
# Link to a function
Use :func:`hplc_analysis.find_peaks` to detect peaks.
# Link to a module
See :mod:`hplc_analysis` for all analysis functions.
Auto-Generated API Docs
~~~~~~~~~~~~~~~~~~~~~~~
Sphinx can automatically generate API documentation from docstrings:
.. code-block:: rst
API Reference
=============
HPLC Analysis Module
--------------------
.. automodule:: hplc_analysis
:members:
:undoc-members:
:show-inheritance:
This creates a complete API reference from your docstrings.
Organizing Documentation
-------------------------
Structure for Small Projects
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code-block:: text
docs/source/
βββ index.rst # Homepage with quick start
βββ installation.rst # How to install
βββ tutorial.rst # Step-by-step guide
βββ api.rst # Auto-generated API docs
βββ elabftw.rst # eLabFTW integration guide
Structure for Larger Projects
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code-block:: text
docs/source/
βββ index.rst
βββ getting_started/
β βββ installation.rst
β βββ quickstart.rst
β βββ configuration.rst
βββ user_guide/
β βββ basic_usage.rst
β βββ advanced_topics.rst
β βββ elabftw_integration.rst
βββ api_reference/
β βββ analysis.rst
β βββ visualization.rst
β βββ utilities.rst
βββ developer_guide/
βββ contributing.rst
βββ testing.rst
βββ release_process.rst
**For beginners**: Start with the simple structure. Reorganize as you grow.
eLabFTW Integration in Docs
----------------------------
Show Complete Workflows
~~~~~~~~~~~~~~~~~~~~~~~~
Document the full cycle:
.. code-block:: rst
Laboratory Workflow
===================
Before Running Analysis
-----------------------
1. Create eLabFTW experiment record
2. Document method and parameters
3. Upload raw data files
4. Note experiment ID for reference
During Analysis
---------------
1. Reference eLabFTW ID in code header
2. Load data with eLabFTW references
3. Run analysis with documented parameters
4. Save intermediate results
After Analysis
--------------
1. Upload results to eLabFTW
2. Link GitHub commit for traceability
3. Document analysis parameters used
4. Review and validate results
Provide Examples with eLabFTW Links
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code-block:: rst
Example Analysis
================
This example analyzes HPLC data from eLabFTW experiment #67890:
https://your-elabftw-instance.org/experiments.php?mode=view&id=67890
The experiment used equipment EQUIP-12345 (HPLC-UV system):
https://your-elabftw-instance.org/database.php?mode=view&id=EQUIP-12345
Method details, calibration data, and quality control results
are documented in the eLabFTW records.
Document Where Things Go
~~~~~~~~~~~~~~~~~~~~~~~~~
.. code-block:: rst
Data Organization
=================
.. list-table::
:widths: 30 35 35
:header-rows: 1
* - Data Type
- Store In
- Notes
* - Raw instrument data
- eLabFTW (attached to experiment)
- Original files, unmodified
* - Analysis scripts
- GitHub repository
- Version controlled
* - Processed results
- eLabFTW (attached to experiment)
- With GitHub commit reference
* - Protocols/methods
- eLabFTW (database)
- Shared across experiments
* - Small test data
- GitHub (data/ folder)
- For examples and testing
Version Control for Documentation
----------------------------------
Documentation needs version control too:
Commit Documentation Changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code-block:: bash
# Good commit messages for docs
git commit -m "Add HPLC analysis tutorial with examples"
git commit -m "Fix broken links in API reference"
git commit -m "Update installation instructions for Windows"
Keep Docs Synchronized
~~~~~~~~~~~~~~~~~~~~~~~
When changing code:
1. **Update docstrings** if function signatures change
2. **Update examples** if usage patterns change
3. **Update tutorials** if workflows change
4. **Check cross-references** still work
5. **Rebuild docs** to catch issues
Version Documentation
~~~~~~~~~~~~~~~~~~~~~
For releases, document the version:
.. code-block:: python
# In docs/source/conf.py
version = '1.0' # Short X.Y version
release = '1.0.0' # Full version
Tag releases in Git:
.. code-block:: bash
git tag -a v1.0.0 -m "Version 1.0.0 release"
git push origin v1.0.0
Common Mistakes to Avoid
-------------------------
β **No Documentation**
"The code is self-documenting" - No, it's not.
β
**Fix**: Start with minimal docstrings. Expand gradually.
β **Outdated Documentation**
Documentation describes old version of code.
β
**Fix**: Update docs when changing code. Test examples.
β **Assuming Too Much Knowledge**
"Obviously you need to preprocess the data first..."
β
**Fix**: Write for beginners. Explain each step.
β **No Examples**
Just parameter lists without showing usage.
β
**Fix**: Add at least one working example per function.
β **Broken Links**
Cross-references point to nonexistent pages.
β
**Fix**: Build docs locally and check for warnings.
β **Inconsistent Style**
Random mix of formatting and organization.
β
**Fix**: Follow this template. Be consistent.
Tools and Resources
-------------------
Documentation Tools
~~~~~~~~~~~~~~~~~~~
* **Sphinx**: Main documentation generator
* **sphinx-rtd-theme**: Professional Read the Docs theme
* **napoleon**: NumPy docstring support
* **autodoc**: Auto-generate API docs
* **doctest**: Test code examples in docstrings
Checking Documentation
~~~~~~~~~~~~~~~~~~~~~~
.. code-block:: bash
# Build and check for warnings
cd docs
make clean
make html
# Check for broken links
make linkcheck
# Spell check (if aspell installed)
find source -name "*.rst" -exec aspell check {} \;
Learning Resources
~~~~~~~~~~~~~~~~~~
* `Sphinx Documentation `_
* `NumPy Docstring Guide `_
* `Write the Docs `_
* `Documentation Guide `_
Example Documentation Checklist
--------------------------------
Use this checklist for your documentation:
.. code-block:: text
Code Documentation
==================
[ ] All functions have docstrings
[ ] Docstrings include Parameters and Returns
[ ] At least one example per main function
[ ] Type hints on function signatures
[ ] Complex logic has inline comments
[ ] eLabFTW references where applicable
User Guide
==========
[ ] Installation instructions included
[ ] Quick start example provided
[ ] Complete workflow example shown
[ ] eLabFTW integration explained
[ ] Troubleshooting section exists
[ ] Links to external resources
API Reference
=============
[ ] Auto-generated from docstrings
[ ] All modules documented
[ ] Cross-references work
[ ] Examples build without errors
Build and Deploy
================
[ ] Builds locally without errors
[ ] No broken links (linkcheck passes)
[ ] Equations render correctly
[ ] GitHub Actions workflow configured
[ ] Deploys to GitHub Pages
Getting Help
------------
If you're stuck on documentation:
1. **Check existing examples** in this repository
2. **Search Sphinx documentation** for specific features
3. **Look at similar projects** (NumPy, SciPy, pandas)
4. **Ask in discussions** on GitHub
5. **Start simple** and improve iteratively
Remember: **Perfect documentation doesn't exist**. Good enough documentation that actually exists is infinitely better than perfect documentation that never gets written.
See Also
--------
* :doc:`index` - Main documentation
* :doc:`elabftw_integration` - eLabFTW integration guide
* :mod:`hplc_analysis` - Example code with complete documentation