Reusable actions
Like scripted actions, reusable actions are logical units of analytic code. However, whereas a scripted action is written to solve a problem for one study and must be copied-and-pasted to solve a similar problem for another study, a reusable action is written to solve a problem for several studies without copying-and-pasting between them. This makes reusable actions ideal for tasks that must be completed by several studies, such as joining datasets or producing deciles charts.
Running reusable actions🔗
You can browse existing reusable actions at https://actions.opensafely.org. Although each is different, they have a common API. Consider the following extract from a study's project.yaml:
actions:
generate_dataset:
run: ehrql:v0 generate-dataset analysis/dataset_definition.py --output output/dataset.csv.gz
outputs:
highly_sensitive:
dataset: output/dataset.csv.gz
run_a_reusable_action:
# We will run version `v1.0.0` of the reusable action called `a_reusable_action`.
# The reusable action accepts an argument; in this case, a path to a file.
run: a_reusable_action:v1.0.0 output/dataset.csv.gz
# The reusable action accepts a configuration option;
# in this case, an output format.
config:
output-format: PNG
needs: [generate_dataset]
outputs:
moderately_sensitive:
output: output/output_from_a_reusable_action.png
In the above extract, the run
and config
properties of run_a_reusable_action
are the common API.
The run
property, which is required, describes how the reusable action is run.
It comprises the name and the version of the reusable action, and, optionally, one or more arguments.
The config
property, which is optional, describes configuration options.
Developing reusable actions🔗
Note
If you're thinking about developing a reusable action, then start by creating a new study within the opensafely
organisation that encapsulates the problem.
As a minimum, the study should extract and operate on a dataset:
indeed, the code that operates on the dataset is the reusable action.
At this point, you should open an issue. Below, we describe how to convert the study into a reusable action.
A reusable action is a public repo within the opensafely-actions
organisation.
It has a main
branch, which is the release branch;
versions are marked with tags.
In the above extract from a study's project.yaml, for example, we will run version v1.0.0
of the reusable action called a_reusable_action
.
The repo has the following minimal structure:
.
├── README.md
└── action.yaml
README.md contains information about the reusable action, which is displayed by https://actions.opensafely.org.
action.yaml contains a run command, which is composed of a runtime image (either python
, r
, or stata-mp
) and an entry point.
The entry point is a path to a script.
For example:
run: python:latest python action/cli.py
Where action/cli.py is:
def main():
print("A reusable action")
if __name__ == "__main__":
main()
When developing a reusable action, just as when developing a scripted action, the action's dependencies are made available by the runtime; they are not made available by the action.
- The Python runtime is provided by
python-docker
. Its dependencies are in requirements.txt. In practice, this means that a Python action's requirements.txt is ignored. - The R runtime is provided by
r-docker
. Its dependencies are in packages.csv. - The Stata runtime is provided by
stata-docker
. Its dependencies are in libraries.