Scientific Python Documentation with eLabFTW Integration

Welcome! This repository provides a beginner-friendly template for Python projects in scientific labs, demonstrating how to create professional documentation with Sphinx and integrate with eLabFTW electronic lab notebooks for complete experimental traceability.

What You’ll Learn

This template demonstrates:

  • 📖 Professional Documentation: Using Sphinx to create beautiful, searchable docs

  • 🔗 eLabFTW Integration: Linking code to experiment records for full traceability

  • 🧪 Complete Workflows: From lab experiment → data → analysis → results

  • 🤝 Best Practices: Industry-standard approaches for scientific computing

  • 🚀 Automated Deployment: Publishing docs to GitHub Pages automatically

Perfect for scientists, students, and lab users starting with Python, GitHub, and modern lab data management.

Why This Template?

Traditional Lab Workflow Problems

❌ Lab notebook entries disconnected from analysis code
❌ Data files scattered across computers
❌ No clear link between experiments and results
❌ Hard to reproduce analyses months later
❌ Methods sections written from memory

Modern Integrated Solution

✅ eLabFTW stores experiment protocols, equipment info, and raw data
✅ GitHub manages analysis code with full version history
✅ Python scripts reference eLabFTW experiment IDs directly
✅ Sphinx docs explain methods and link everything together
✅ Automated publishing keeps documentation current

Result: Complete traceability from experiment to publication.

Quick Start

  1. Clone the repository:

git clone https://github.com/B-Wie/how_to_document_your_reposiory_with_sphinx.git
cd how_to_document_your_reposiory_with_sphinx

  2. Install dependencies:

pip install -r requirements.txt

  3. Try the example:

python src/hplc_analysis.py

  4. Build the documentation:

cd docs
make html

Then open docs/build/html/index.html in your browser.

Example: HPLC Chromatogram Analysis

This repository includes a complete HPLC analysis example demonstrating eLabFTW integration:

Loading Data with eLabFTW References

The sample data file includes eLabFTW experiment references:

# HPLC Chromatogram Data
# Experiment: eLabFTW #67890
# URL: https://your-elabftw-instance.org/experiments.php?mode=view&id=67890
# Equipment: HPLC-UV (eLabFTW ID: EQUIP-12345)
# Time(min)    Absorbance(mAU)
0.00    2.1
...
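When a script needs the experiment reference at run time, this header can be parsed programmatically. A minimal sketch (the `extract_elabftw_refs` helper is illustrative and not part of this repository):

```python
import re

# Illustrative helper (not part of hplc_analysis): pull eLabFTW references
# out of the '#' comment header of a chromatogram file.
def extract_elabftw_refs(header_text: str) -> dict:
    refs = {}
    exp = re.search(r"Experiment:\s*eLabFTW\s*#(\d+)", header_text)
    if exp:
        refs["experiment_id"] = int(exp.group(1))
    url = re.search(r"URL:\s*(\S+)", header_text)
    if url:
        refs["experiment_url"] = url.group(1)
    equip = re.search(r"eLabFTW ID:\s*([A-Z0-9-]+)", header_text)
    if equip:
        refs["equipment_id"] = equip.group(1)
    return refs

header = """\
# HPLC Chromatogram Data
# Experiment: eLabFTW #67890
# URL: https://your-elabftw-instance.org/experiments.php?mode=view&id=67890
# Equipment: HPLC-UV (eLabFTW ID: EQUIP-12345)
"""
print(extract_elabftw_refs(header))
```

Keeping the reference machine-readable like this lets downstream scripts record the experiment ID alongside their outputs automatically.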

Analysis Code with Documentation

The analysis script demonstrates:

  • NumPy-style docstrings with complete parameter documentation

  • eLabFTW references in module and function docstrings

  • Type hints for clarity

  • Working examples with sample data

Try it:

python src/hplc_analysis.py

Output:

HPLC Chromatogram Analysis
==================================================
eLabFTW Experiment: https://your-elabftw-instance.org/experiments.php?mode=view&id=67890
eLabFTW Equipment: https://your-elabftw-instance.org/database.php?mode=view&id=EQUIP-12345

File: data/sample_hplc_chromatogram.txt
Data points: 101
Time range: 0.00 - 10.00 min
Baseline: 2.40 mAU

Detected 2 peaks:
--------------------------------------------------
Peak 1: RT=2.10 min, Height=98.7 mAU
Peak 2: RT=5.70 min, Height=122.3 mAU

Peak Resolution: Rs = 1.71

What is eLabFTW?

eLabFTW is a free, open-source electronic lab notebook designed for research laboratories. It provides:

  • Experiment Documentation: Record protocols, observations, and results

  • Equipment Database: Track instruments, specifications, and maintenance

  • Data Management: Store and organize research data with versioning

  • Collaboration: Share experiments with team members

  • Compliance: Meet FDA 21 CFR Part 11 and GLP requirements

  • Persistent IDs: Every experiment gets a permanent reference number

Combined with GitHub for code and Sphinx for documentation, eLabFTW completes the research workflow:

eLabFTW Experiment → GitHub Code → Python Analysis → Results back to eLabFTW

(Protocol & Data) → (Version Control) → (Reproducible) → (Complete Record)

The Complete Workflow

Step 1: Document in eLabFTW

Create experiment record with:

  • Protocol and method details

  • Equipment used (with database IDs)

  • Raw data files uploaded

  • Note the experiment ID (e.g., #67890)

Step 2: Write Analysis Code

Reference eLabFTW in your Python scripts:

"""
HPLC Analysis Script

eLabFTW References:
- Experiment: https://your-instance.org/experiments.php?mode=view&id=67890
- Equipment: https://your-instance.org/database.php?mode=view&id=EQUIP-12345
"""

from hplc_analysis import analyze_chromatogram

# Load data (file header includes eLabFTW reference)
results = analyze_chromatogram('data/sample.txt')

Step 3: Generate Documentation

Sphinx automatically builds docs from your docstrings:

cd docs
make html

Documentation includes:

  • Auto-generated API reference

  • eLabFTW integration guide

  • Best practices for scientific docs

  • Working code examples

Step 4: Return Results to eLabFTW

Upload results to experiment record:

  • Processed data and figures

  • Analysis parameters used

  • GitHub commit reference for reproducibility

This closes the loop: experiment → analysis → results → permanent record.
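Step 4 can be partly automated. The sketch below only packages results with provenance; the helper name and payload fields are illustrative, and the actual upload would go through your instance's eLabFTW REST API:

```python
import json

# Illustrative sketch: bundle analysis results with provenance before
# attaching them to the eLabFTW experiment record. The dict keys mirror
# the results returned by analyze_chromatogram(); the commit hash is a
# placeholder for your real GitHub commit.
def package_results(results: dict, commit: str) -> str:
    record = {
        "n_peaks": results["n_peaks"],
        "peaks": results["peaks"],
        "baseline_mAU": results["baseline"],
        "github_commit": commit,  # links the code version to the record
    }
    return json.dumps(record, indent=2)

payload = package_results(
    {
        "n_peaks": 2,
        "peaks": [
            {"retention_time": 2.10, "height": 98.7},
            {"retention_time": 5.70, "height": 122.3},
        ],
        "baseline": 2.40,
    },
    commit="abc1234",
)
# 'payload' can then be uploaded as an attachment via eLabFTW's REST API;
# see your instance's API documentation for the exact endpoint.
```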

API Documentation

HPLC Analysis Module

The hplc_analysis module provides functions for analyzing HPLC chromatogram data, including peak detection and basic chromatographic parameter calculations, with eLabFTW integration.

eLabFTW Integration

This analysis script is designed to work with chromatogram data exported from laboratory instruments. Link your analysis to experimental records in eLabFTW:

Example eLabFTW Reference in Code

Always include eLabFTW links in your data files and analysis scripts:

# Link to eLabFTW experiment in file header or docstring
# Experiment: https://your-elabftw-instance.org/experiments.php?mode=view&id=67890
# Equipment: https://your-elabftw-instance.org/database.php?mode=view&id=EQUIP-12345

For more information on eLabFTW integration, see the documentation.

hplc_analysis.analyze_chromatogram(filepath: str, threshold: float = 10.0) → Dict

Complete analysis workflow for an HPLC chromatogram.

Loads data, detects peaks, and calculates chromatographic parameters. Provides a comprehensive summary suitable for reporting in laboratory notebooks or publications.

Parameters:
  • filepath (str) – Path to chromatogram data file.

  • threshold (float, optional) – Peak detection threshold in mAU (default: 10.0).

Returns:

results – Analysis results containing:

  • 'filepath': str - Input file path

  • 'n_points': int - Number of data points

  • 'time_range': tuple - (min_time, max_time) in minutes

  • 'peaks': list - Detected peaks with properties

  • 'n_peaks': int - Number of detected peaks

  • 'baseline': float - Estimated baseline absorbance

Return type:

dict

Examples

>>> results = analyze_chromatogram('data/sample_hplc_chromatogram.txt',
...                                threshold=50.0)
>>> print(f"Detected {results['n_peaks']} peaks")
Detected 2 peaks
>>> for i, peak in enumerate(results['peaks'], 1):
...     print(f"Peak {i}: RT={peak['retention_time']:.2f} min")
Peak 1: RT=2.10 min
Peak 2: RT=5.70 min

Notes

Complete Workflow with eLabFTW:

  1. Before Analysis:

    • Create experiment record in eLabFTW with method details

    • Record instrument ID and calibration information

    • Upload raw data file to eLabFTW

  2. During Analysis:

    • Run this analysis function

    • Reference eLabFTW experiment ID in analysis script header

  3. After Analysis:

    • Export results to CSV or JSON

    • Upload results to eLabFTW experiment record

    • Link GitHub repository/commit in eLabFTW for code traceability

    • Document any manual peak assignments or corrections

This workflow ensures complete traceability from raw data to final results.

See also

load_chromatogram

Load data file

find_peaks

Peak detection algorithm

hplc_analysis.calculate_resolution(peak1: Dict[str, float], peak2: Dict[str, float], time: ndarray, absorbance: ndarray) → float

Calculate chromatographic resolution between two peaks.

Resolution (Rs) is a measure of peak separation, defined as:

\[R_s = \frac{2(t_{R2} - t_{R1})}{w_1 + w_2}\]

where \(t_{R1}\) and \(t_{R2}\) are retention times, and \(w_1\) and \(w_2\) are peak widths at baseline.

Parameters:
  • peak1 (dict) – First peak dictionary from find_peaks().

  • peak2 (dict) – Second peak dictionary from find_peaks().

  • time (np.ndarray) – Time values in minutes.

  • absorbance (np.ndarray) – Absorbance values in mAU.

Returns:

resolution – Resolution value. Rs > 1.5 indicates baseline separation.

Return type:

float

Examples

>>> time, absorbance = load_chromatogram('data/sample_hplc_chromatogram.txt')
>>> peaks = find_peaks(time, absorbance, threshold=50.0)
>>> if len(peaks) >= 2:
...     rs = calculate_resolution(peaks[0], peaks[1], time, absorbance)
...     print(f"Resolution: {rs:.2f}")
Resolution: 4.23

Notes

This implementation estimates peak width at half height (FWHH) and converts to baseline width using the approximation: baseline_width ≈ 2 * FWHH.

Quality Control: Document resolution values in eLabFTW for method validation and quality control. Include acceptance criteria (typically Rs > 1.5 for baseline separation).
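The formula maps directly onto numbers from a report. A minimal sketch, assuming you already have retention times and estimated baseline widths for the two peaks (the function and its arguments are illustrative, not the module's API):

```python
# Sketch of Rs = 2 * (t_R2 - t_R1) / (w1 + w2); 'rt' and 'w' are
# illustrative names, not necessarily those used by find_peaks().
def resolution(rt1: float, w1: float, rt2: float, w2: float) -> float:
    return 2.0 * (rt2 - rt1) / (w1 + w2)

# Two peaks 3.6 min apart with baseline widths of ~2 min each:
rs = resolution(2.10, 2.0, 5.70, 2.2)
print(f"Rs = {rs:.2f}")  # prints: Rs = 1.71 (above the 1.5 baseline criterion)
```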

hplc_analysis.find_peaks(time: ndarray, absorbance: ndarray, threshold: float = 10.0, min_distance: int = 5) → List[Dict[str, float]]

Detect peaks in HPLC chromatogram data.

Identifies local maxima in the absorbance signal that exceed a specified threshold. Returns peak properties including retention time, height, and approximate area.

Parameters:
  • time (np.ndarray) – Time values in minutes (1D array).

  • absorbance (np.ndarray) – Absorbance values in mAU (1D array).

  • threshold (float, optional) – Minimum peak height in mAU to be considered a peak (default: 10.0).

  • min_distance (int, optional) – Minimum number of data points between peaks (default: 5).

Returns:

peaks – List of detected peaks, where each peak is a dictionary containing:

  • 'retention_time': float - Peak retention time in minutes

  • 'height': float - Peak height in mAU

  • 'area': float - Approximate peak area (trapezoidal integration)

  • 'index': int - Index of peak maximum in the data array

Return type:

list of dict

Examples

>>> time, absorbance = load_chromatogram('data/sample_hplc_chromatogram.txt')
>>> peaks = find_peaks(time, absorbance, threshold=50.0)
>>> for i, peak in enumerate(peaks, 1):
...     print(f"Peak {i}: RT={peak['retention_time']:.2f} min, "
...           f"Height={peak['height']:.1f} mAU")
Peak 1: RT=2.10 min, Height=98.7 mAU
Peak 2: RT=5.70 min, Height=122.3 mAU

Notes

This is a simple peak detection algorithm suitable for well-resolved peaks. For complex chromatograms with overlapping peaks, consider using more sophisticated peak deconvolution methods.

The peak area is calculated using trapezoidal integration from the point where absorbance drops below the threshold on either side of the peak.
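The integration scheme described above can be sketched in pure Python (illustrative only; the module works on NumPy arrays and its exact boundary handling may differ):

```python
# Trapezoidal integration over the region where the signal stays above
# the detection threshold (a sketch of the approach, not the module code).
def trapezoid_area(time, signal, threshold):
    area = 0.0
    for i in range(1, len(time)):
        # Sum a trapezoid only where both endpoints are above threshold.
        if signal[i - 1] >= threshold and signal[i] >= threshold:
            area += 0.5 * (signal[i - 1] + signal[i]) * (time[i] - time[i - 1])
    return area

t = [0.0, 0.1, 0.2, 0.3, 0.4]
a = [2.0, 40.0, 100.0, 40.0, 2.0]  # a single symmetric peak, mAU
print(trapezoid_area(t, a, threshold=10.0))  # prints: 14.0
```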

Documentation in eLabFTW: When documenting peak identification results, include them in your eLabFTW experiment record along with:

  • Integration parameters (threshold, baseline correction)

  • Peak assignments (compound identities)

  • Calibration curve information for quantification

This ensures all analysis parameters are tracked with your experimental data.

See also

load_chromatogram

Load chromatogram data from file

calculate_resolution

Calculate peak resolution

hplc_analysis.load_chromatogram(filepath: str) → Tuple[ndarray, ndarray]

Load HPLC chromatogram data from a text file.

Reads a two-column text file containing time and absorbance data. Lines starting with '#' are treated as comments and skipped.

Parameters:

filepath (str) – Path to the chromatogram data file. File should contain two columns: time (minutes) and absorbance (mAU).

Returns:

  • time (np.ndarray) – Time values in minutes (1D array).

  • absorbance (np.ndarray) – Absorbance values in mAU (1D array).

Raises:
  • FileNotFoundError – If the specified file does not exist.

  • ValueError – If the file format is invalid or cannot be parsed.

Examples

>>> time, absorbance = load_chromatogram('data/sample_hplc_chromatogram.txt')
>>> print(f"Data points: {len(time)}")
Data points: 101
>>> print(f"Time range: {time[0]:.2f} - {time[-1]:.2f} min")
Time range: 0.00 - 10.00 min

Notes

eLabFTW Best Practice: Include the eLabFTW experiment ID in the chromatogram file header as a comment. This creates a permanent link between your raw data and experimental documentation.

Example file header:

# Experiment: eLabFTW #67890
# Equipment: HPLC-UV (eLabFTW ID: EQUIP-12345)
# Time(min)    Absorbance(mAU)
0.00    2.1
...
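Files in this format can be read with NumPy directly, since np.loadtxt skips '#' comment lines; load_chromatogram adds validation and error handling on top of something like this sketch:

```python
import io
import numpy as np

# A StringIO stands in for a real data file here; np.loadtxt accepts any
# file-like object and ignores the '#' comment header automatically.
data = io.StringIO(
    "# Experiment: eLabFTW #67890\n"
    "# Time(min)    Absorbance(mAU)\n"
    "0.00    2.1\n"
    "0.10    2.3\n"
)
time, absorbance = np.loadtxt(data, comments="#", unpack=True)
print(time, absorbance)
```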

See also

find_peaks

Detect peaks in the loaded chromatogram


Sample Module (General Scientific Computing)

The sample_module module demonstrates general scientific computing with NumPy-style documentation, providing functions for data analysis and statistical computations commonly used in scientific research.

class sample_module.DataAnalyzer(data: ndarray, name: str = 'dataset')

Bases: object

A class for performing common data analysis operations.

This class provides methods for statistical analysis, data transformation, and visualization preparation for scientific datasets.

Parameters:
  • data (np.ndarray) – Input dataset to analyze.

  • name (str, optional) – Name identifier for the dataset (default is 'dataset').

data

The stored dataset.

Type:

np.ndarray

name

The dataset name.

Type:

str

n_samples

Number of samples in the dataset.

Type:

int

Examples

>>> import numpy as np
>>> data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
>>> analyzer = DataAnalyzer(data, name='test_data')
>>> summary = analyzer.get_summary()
>>> print(summary['mean'])
5.5

Notes

This class is designed for 1D numerical datasets. For multidimensional data, consider reshaping or using specialized tools.

__init__(data: ndarray, name: str = 'dataset')

Initialize the DataAnalyzer.

detect_outliers(method: str = 'iqr', threshold: float = 1.5) → ndarray

Detect outliers in the dataset.

Parameters:
  • method ({'iqr', 'zscore'}, optional) –

    Method for outlier detection (default is 'iqr').

    • 'iqr': Interquartile range method

    • 'zscore': Z-score method

  • threshold (float, optional) – Threshold for outlier detection (default is 1.5 for IQR, typically 3.0 for z-score).

Returns:

Boolean array indicating outliers (True) and inliers (False).

Return type:

np.ndarray

Examples

>>> import numpy as np
>>> data = np.array([1, 2, 3, 4, 5, 100])  # 100 is an outlier
>>> analyzer = DataAnalyzer(data)
>>> outliers = analyzer.detect_outliers(method='iqr')
>>> print(data[outliers])
[100]

Notes

The IQR method considers values outside of \([Q1 - threshold \times IQR, Q3 + threshold \times IQR]\) as outliers, where \(IQR = Q3 - Q1\).
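The IQR rule can be written standalone in a few lines of NumPy (a sketch of the same logic the method applies):

```python
import numpy as np

# Standalone sketch of the IQR outlier rule described above.
def iqr_outliers(data: np.ndarray, threshold: float = 1.5) -> np.ndarray:
    q1, q3 = np.percentile(data, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - threshold * iqr, q3 + threshold * iqr
    # Boolean mask: True marks an outlier.
    return (data < lower) | (data > upper)

data = np.array([1, 2, 3, 4, 5, 100])
print(data[iqr_outliers(data)])  # prints: [100]
```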

get_summary() → dict

Get statistical summary of the dataset.

Returns:

Dictionary containing statistical measures:

  • 'mean': arithmetic mean

  • 'median': median value

  • 'std': standard deviation

  • 'min': minimum value

  • 'max': maximum value

  • 'q25': 25th percentile

  • 'q75': 75th percentile

Return type:

dict

Examples

>>> import numpy as np
>>> analyzer = DataAnalyzer(np.array([1, 2, 3, 4, 5]))
>>> summary = analyzer.get_summary()
>>> summary['median']
3.0
sample_module.calculate_mean_std(data: ndarray) → Tuple[float, float]

Calculate the mean and standard deviation of a dataset.

This function computes the arithmetic mean and population standard deviation of the input data using NumPy’s efficient implementations.

Parameters:

data (np.ndarray) – A 1D numpy array containing numerical data.

Returns:

  • mean (float) – The arithmetic mean of the dataset.

  • std (float) – The population standard deviation of the dataset.

Examples

>>> import numpy as np
>>> data = np.array([1, 2, 3, 4, 5])
>>> mean, std = calculate_mean_std(data)
>>> print(f"Mean: {mean}, Std: {std:.2f}")
Mean: 3.0, Std: 1.41

Notes

The standard deviation is calculated using the population formula (N divisor), not the sample formula (N-1 divisor).
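A quick way to see the difference is NumPy's ddof argument, which switches between the two divisors:

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5])
pop_std = np.std(data)             # ddof=0: population formula, N divisor
sample_std = np.std(data, ddof=1)  # ddof=1: sample formula, N-1 divisor
print(f"population: {pop_std:.4f}, sample: {sample_std:.4f}")
```

For this dataset the population value is sqrt(2) ≈ 1.4142 and the sample value sqrt(2.5) ≈ 1.5811, so the choice of divisor matters most for small N.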

See also

numpy.mean

Compute the arithmetic mean.

numpy.std

Compute the standard deviation.

sample_module.linear_regression(x: ndarray, y: ndarray) → Tuple[float, float, float]

Perform simple linear regression on two variables.

Fits a linear model y = mx + b to the data using the least squares method. Returns the slope, intercept, and coefficient of determination (R²).

Parameters:
  • x (np.ndarray) – Independent variable data (1D array).

  • y (np.ndarray) – Dependent variable data (1D array).

Returns:

  • slope (float) – The slope (m) of the fitted line.

  • intercept (float) – The y-intercept (b) of the fitted line.

  • r_squared (float) – The coefficient of determination (R²), indicating goodness of fit. Values range from 0 to 1, where 1 indicates perfect fit.

Raises:

ValueError – If x and y have different lengths or if x has zero variance.

Examples

>>> import numpy as np
>>> x = np.array([1, 2, 3, 4, 5])
>>> y = np.array([2, 4, 5, 4, 5])
>>> slope, intercept, r2 = linear_regression(x, y)
>>> print(f"y = {slope:.2f}x + {intercept:.2f}, R² = {r2:.3f}")
y = 0.60x + 2.20, R² = 0.600

Notes

The R² value is calculated as:

\[R^2 = 1 - \frac{SS_{res}}{SS_{tot}}\]

where \(SS_{res}\) is the residual sum of squares and \(SS_{tot}\) is the total sum of squares.
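The same quantities can be computed explicitly with NumPy, which also reproduces the docstring example's fit (np.polyfit is used here for the line fit; the module's internal implementation may differ):

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 4, 5, 4, 5], dtype=float)

# Least-squares line fit: degree-1 polyfit returns (slope, intercept).
slope, intercept = np.polyfit(x, y, 1)

# R² = 1 - SS_res / SS_tot, matching the formula above.
residuals = y - (slope * x + intercept)
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot
print(f"y = {slope:.2f}x + {intercept:.2f}, R² = {r_squared:.3f}")
```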

sample_module.normalize_data(data: ndarray, method: str = 'zscore') → ndarray

Normalize data using specified method.

Transforms the input data to a standardized scale for improved comparability and analysis.

Parameters:
  • data (np.ndarray) – Input data to be normalized (1D or 2D array).

  • method ({'zscore', 'minmax'}, optional) –

    Normalization method to use (default is 'zscore').

    • 'zscore': Standardize to zero mean and unit variance

    • 'minmax': Scale to [0, 1] range

Returns:

normalized – Normalized data with the same shape as input.

Return type:

np.ndarray

Raises:

ValueError – If method is not 'zscore' or 'minmax'.

Examples

>>> import numpy as np
>>> data = np.array([1, 2, 3, 4, 5])
>>> normalized = normalize_data(data, method='zscore')
>>> print(f"Mean: {np.mean(normalized):.2f}, Std: {np.std(normalized):.2f}")
Mean: 0.00, Std: 1.00
>>> normalized = normalize_data(data, method='minmax')
>>> print(f"Min: {np.min(normalized):.2f}, Max: {np.max(normalized):.2f}")
Min: 0.00, Max: 1.00

Notes

Z-score normalization is defined as:

\[z = \frac{x - \mu}{\sigma}\]

Min-max normalization is defined as:

\[x_{norm} = \frac{x - x_{min}}{x_{max} - x_{min}}\]
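Both transformations are one-liners in NumPy; this sketch mirrors normalize_data's behavior for a 1D array:

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5], dtype=float)

# z-score: subtract the mean, divide by the (population) standard deviation.
zscore = (data - data.mean()) / data.std()

# min-max: shift to zero, divide by the range to land in [0, 1].
minmax = (data - data.min()) / (data.max() - data.min())

print(zscore.round(2))  # zero mean, unit variance
print(minmax.round(2))  # prints: [0.   0.25 0.5  0.75 1.  ]
```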


Repository Structure

.
├── data/
│   └── sample_hplc_chromatogram.txt    # Example data with eLabFTW refs
├── src/
│   ├── hplc_analysis.py                # HPLC analysis with eLabFTW
│   └── sample_module.py                # General scientific computing
├── docs/
│   ├── source/
│   │   ├── conf.py                     # Sphinx configuration
│   │   ├── index.rst                   # This page
│   │   ├── elabftw_integration.rst     # eLabFTW workflow guide
│   │   └── best_practices.rst          # Documentation best practices
│   └── Makefile                        # Build commands
├── README.md                           # Main repository documentation
├── CONTRIBUTING.md                     # Contribution guidelines
├── LICENSE.md                          # MIT license
└── requirements.txt                    # Python dependencies


Deploying to GitHub Pages

This repository includes GitHub Actions automation:

  1. Push to main/master - Triggers automatic build

  2. Documentation builds - Sphinx generates HTML

  3. Deploys to GitHub Pages - Published automatically

  4. View at: https://<username>.github.io/<repository>/

See GITHUB_PAGES_SETUP.md for setup instructions.

Contributing

Contributions welcome! This is a learning-friendly project. See CONTRIBUTING.md for:

  • How to set up development environment

  • Code style guidelines

  • Documentation standards

  • Pull request process

All skill levels welcome!

License

MIT License - see LICENSE.md.

Acknowledgments

Built on best practices from:

  • NumPy and SciPy documentation communities

  • Sphinx and Read the Docs projects

  • eLabFTW development team

  • Scientific Python package maintainers

  • Research software engineering community
