Documenting and detailing a single script based on the comments inside

2024/9/30 5:22:27

I am going to write a set of scripts, each independent of the others but with some similarities. The structure will most likely be the same for all of them and will probably look like this:

# -*- coding: utf-8 -*-
"""
Small description and information
@author: Author
"""

# Imports
import numpy as np
import math
from scipy import signal
...

# Constant definition (always with variable in capital letters)
CONSTANT_1 = 5
CONSTANT_2 = 10

# Main class
class Test():
    def __init__(self, run_id, parameters):
        # Some stuff not too important

    def _run(self, parameters):
        # Main program returning a result object.

For each script, I would like to write documentation and export it as a PDF. I need a library/module/parser that reads the script, extracts the marked comments and code, and assembles them into the desired output format.

For instance, in the _run() method, there might be several steps detailed in the comments:

def _run(self, parameters):
    # Step 1: we start by doing this
    code to do it
    # Step 2: then we do this
    code to do it
    code code # this code does that

Which library/parser could I use to analyze the Python script and output a PDF? At first I was thinking of Sphinx, but it does not suit my needs, as I would have to design a custom extension. Moreover, Sphinx's strength lies in the links and hierarchy between multiple scripts of the same or of different modules. In my case, I will only be documenting one script, one file at a time.

My second idea is to use the RST format and rst2pdf to create the PDF. I could then design a parser that reads the .py file, extracts the commented/decorated lines or sets of lines as proposed below, and writes the RST file.

#-description
## Title of something
# doing this here
#-#-code
some code to extract and put in the doc
some more code
#-

Finally, I would also like to be able to execute some code and capture the result in order to put it in the output PDF file. For instance, I could run some Python code to compute the SHA1 hash of the .py file content and include it as a reference in the PDF documentation.

Answer

Docstrings instead of comments

In order to make things easier for yourself, you probably want to make use of docstrings rather than comments:

A docstring is a string literal that occurs as the first statement in a module, function, class, or method definition. Such a docstring becomes the __doc__ special attribute of that object.

This way, you can read the __doc__ attribute when parsing the scripts to generate the documentation.

The triple-quoted string placed immediately after the function or module definition, which becomes the docstring, is just syntactic sugar. You can also edit the __doc__ attribute programmatically as needed.
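As a minimal sketch of both points, the snippet below reads a docstring through __doc__ and then appends to it at runtime. The function name and the text are made up for illustration.

```python
def area(width, height):
    """Return the area of a rectangle."""
    return width * height


# The docstring is available at runtime without parsing the source file.
original_doc = area.__doc__
print(original_doc)

# __doc__ is an ordinary writable attribute, so it can be extended in code.
area.__doc__ = original_doc + "\nStep 1: multiply width by height"
print(area.__doc__)
```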

For example, you can use decorators to make creating docstrings nicer in your specific case. The following lets you comment the steps inline while still adding the comments to the docstring (programmed in browser, probably with errors):

def with_steps(func):
    def add_step(n, doc):
        # Append the step description to the function's docstring
        func.__doc__ = func.__doc__ + "\nStep %d: %s" % (n, doc)
    func.add_step = add_step
    # A decorator must return the function, otherwise _run becomes None
    return func

@with_steps
def _run(self, parameters):
    """Initial description that is turned into the initial docstring"""
    _run.add_step(1, "we start by doing this")
    code to do it
    _run.add_step(2, "then we do this")
    code to do it
    code

Which would create a docstring like this:

Initial description that is turned into the initial docstring
Step 1: we start by doing this
Step 2: then we do this

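You get the idea. Note that with this approach the steps are appended only when the function is actually called, and again on every call, so the docstring is complete only after exactly one run.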

Generating PDF from documented scripts

Sphinx

Personally, I'd just try the PDF-builders available for Sphinx, via the bundled LaTeXBuilder or using rinoh if you don't want to depend on LaTeX.

However, you would have to use a docstring format that Sphinx understands, such as reStructuredText or Google Style Docstrings (the latter through the napoleon extension).
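For illustration only, a Google-style docstring on a method like the one in the question might look as follows. The parameter description and return value are assumptions, not taken from the original scripts.

```python
class Test:
    def _run(self, parameters):
        """Run the main program.

        Args:
            parameters (dict): Hypothetical settings for this run.

        Returns:
            dict: A result object (a plain dict in this sketch).
        """
        return {"parameters": parameters}


# The structured sections are plain text inside __doc__.
print(Test._run.__doc__)
```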

AST

An alternative is to use ast to extract the docstrings. Unlike the Sphinx autodoc extension, which imports the modules it documents, ast works on the source text alone, so the script never has to be executed. There are a few examples out there on how to do this, like this gist or this blog post.

This way you can write a script that parses the source and outputs any format you want. For instance, you can output Markdown or reST and convert it to PDF using pandoc.
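A minimal sketch of that idea, assuming the script's source is available as a string (here a small inline stand-in for the real file). The helper name docstrings_to_markdown and the heading layout are arbitrary choices for this example.

```python
import ast

# Stand-in for the contents of one of the scripts.
SOURCE = '''
"""Small description and information."""

class Test:
    """Main class."""

    def _run(self, parameters):
        """Run the main program and return a result object."""
        return parameters
'''


def docstrings_to_markdown(source):
    """Collect module, class and function docstrings as Markdown text."""
    tree = ast.parse(source)
    lines = ["# Module", "", ast.get_docstring(tree) or "(no docstring)", ""]
    for node in ast.walk(tree):
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
            doc = ast.get_docstring(node) or "(no docstring)"
            lines += ["## " + node.name, "", doc, ""]
    return "\n".join(lines)


markdown = docstrings_to_markdown(SOURCE)
print(markdown)
```

The resulting text could be written to a file and converted with a command such as `pandoc doc.md -o doc.pdf` (the file names are placeholders, and pandoc needs a PDF engine such as LaTeX installed).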

You could write marked-up text directly in the docstrings, which would give you a lot of flexibility. Let's say you wanted to write your documentation using Markdown: just write Markdown directly in your docstring.

def _run(self, parameters):
    """
    Example script
    ================

    This script does a, b, c

    1. Does something first
    2. Does something else next
    3. Returns something else

    Usage example:

        results = script(parameters)
        foo = [r.foo for r in results]
    """

This string can be extracted using ast and parsed/processed using whatever library you see fit.
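The question also asks for computed values, such as a SHA-1 hash of the script, to appear in the output. Since the extraction script is ordinary Python, it can compute such values itself and add them to the generated text. The sketch below assumes a helper named sha1_of_file and uses a throwaway temporary file in place of the real script.

```python
import hashlib
import os
import tempfile


def sha1_of_file(path):
    """Return the hexadecimal SHA-1 digest of a file's bytes."""
    with open(path, "rb") as handle:
        return hashlib.sha1(handle.read()).hexdigest()


# Throwaway file standing in for the documented script.
with tempfile.NamedTemporaryFile("wb", suffix=".py", delete=False) as tmp:
    tmp.write(b"print('hello')\n")
demo_path = tmp.name

digest = sha1_of_file(demo_path)
os.remove(demo_path)

# A line like this could be appended to the Markdown or reST output.
reference_line = "Source SHA-1: " + digest
print(reference_line)
```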

https://en.xdnf.cn/q/71122.html
