.. commit 512af50: add benchmark.rst file (docs/source/benchmarking.rst)
.. _benchmarkutility:

Benchmark Utility in PyLops-MPI
===============================

PyLops-MPI users can conveniently benchmark the performance of their code with a simple decorator.

This tutorial demonstrates how to use the :py:func:`pylops_mpi.utils.benchmark` and
:py:func:`pylops_mpi.utils.mark` utility methods in PyLops-MPI. These utilities support the various
function-calling patterns that arise when benchmarking distributed code.

- :py:func:`pylops_mpi.utils.benchmark` is a **decorator** used to time the execution of entire functions.
- :py:func:`pylops_mpi.utils.mark` is a **function** used inside decorated functions to insert fine-grained time measurements.
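To build intuition for what such a decorator does, here is a minimal, pure-Python sketch of the timing-decorator pattern. This is only an illustration of the general technique; it is not PyLops-MPI's actual implementation, and ``simple_benchmark`` and ``compute_squares`` are hypothetical names.

.. code-block:: python

    import functools
    import time

    def simple_benchmark(func=None, *, description=None):
        """Illustrative timing decorator: prints the wall-clock runtime of a call.

        Supports both ``@simple_benchmark`` and ``@simple_benchmark(description=...)``
        call styles, mirroring the two usages shown in this tutorial.
        """
        def decorate(f):
            label = description or f.__name__

            @functools.wraps(f)
            def wrapper(*args, **kwargs):
                start = time.perf_counter()
                result = f(*args, **kwargs)
                elapsed = time.perf_counter() - start
                print(f"{label}: {elapsed:.6f} s")
                return result
            return wrapper

        # Called as @simple_benchmark (func is given) or @simple_benchmark(...) (func is None)
        return decorate(func) if func is not None else decorate

    @simple_benchmark(description="squares")
    def compute_squares(n):
        return [i * i for i in range(n)]

Calling ``compute_squares(4)`` returns ``[0, 1, 4, 9]`` and prints one timing line labeled ``squares``.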
Basic Setup
-----------

We start by importing the required modules and setting up the parameters of our simple program.
.. code-block:: python

    import sys
    import logging
    import numpy as np
    from mpi4py import MPI
    from pylops_mpi import DistributedArray, Partition
    from pylops_mpi.utils.benchmark import benchmark, mark

    np.random.seed(42)
    rank = MPI.COMM_WORLD.Get_rank()

    par = {'global_shape': (500, 501),
           'partition': Partition.SCATTER, 'dtype': np.float64,
           'axis': 1}
Benchmarking a Simple Function
------------------------------

We define a simple function and decorate it with :py:func:`benchmark`.
.. code-block:: python

    @benchmark
    def inner_func(par):
        dist_arr = DistributedArray(global_shape=par['global_shape'],
                                    partition=par['partition'],
                                    dtype=par['dtype'], axis=par['axis'])
        # further computation may be performed here
        dist_arr.dot(dist_arr)
Calling the function prints the elapsed runtime to standard output.

.. code-block:: python

    inner_func(par)
You can also customize the label of the printout through the ``description`` parameter:

.. code-block:: python

    @benchmark(description="printout_name")
    def my_func(...):
        ...
Fine-grained Time Measurements
------------------------------

To gain more insight into the runtime of specific code regions, use :py:func:`mark` within
a decorated function. This allows insertion of labeled time checkpoints.
.. code-block:: python

    @benchmark
    def inner_func_with_mark(par):
        mark("Begin array constructor")
        dist_arr = DistributedArray(global_shape=par['global_shape'],
                                    partition=par['partition'],
                                    dtype=par['dtype'], axis=par['axis'])
        mark("Begin dot")
        dist_arr.dot(dist_arr)
        mark("Finish dot")
The output now contains a timestamped entry for each marked location, along with the total time
from the outer decorator (marked with ``[decorator]`` in the output).

.. code-block:: python

    inner_func_with_mark(par)
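Conceptually, each checkpoint call records a label together with the current wall-clock time, and the intervals between consecutive checkpoints are reported at the end. The following pure-Python sketch illustrates that mechanism; it is an illustration only, not the library's implementation, and ``simple_mark`` and ``report`` are hypothetical names.

.. code-block:: python

    import time

    _checkpoints = []  # (label, timestamp) pairs recorded by simple_mark

    def simple_mark(label):
        """Record a labeled wall-clock checkpoint."""
        _checkpoints.append((label, time.perf_counter()))

    def report():
        """Print the time elapsed between consecutive checkpoints."""
        for (label, t0), (_, t1) in zip(_checkpoints, _checkpoints[1:]):
            print(f"{label}: {t1 - t0:.6f} s")

    simple_mark("Begin work")
    total = sum(i * i for i in range(1000))
    simple_mark("Finish work")
    report()

Each interval is attributed to the label that opened it, which is why the tutorial's labels read like phase names ("Begin array constructor", "Begin dot").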
Nested Function Benchmarking
----------------------------

You can nest benchmarked functions to track execution times across layers of function calls.
Below, we define an :py:func:`outer_func_with_mark` that calls the :py:func:`inner_func_with_mark` defined earlier.
.. code-block:: python

    @benchmark
    def outer_func_with_mark(par):
        mark("Outer func start")
        inner_func_with_mark(par)
        dist_arr = DistributedArray(global_shape=par['global_shape'],
                                    partition=par['partition'],
                                    dtype=par['dtype'], axis=par['axis'])
        dist_arr + dist_arr
        mark("Outer func ends")
Calling the function prints the full call tree with indentation, capturing both outer and nested timings.

.. code-block:: python

    outer_func_with_mark(par)
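The indentation of nested calls can be pictured with a small pure-Python sketch that tracks the current nesting depth in the decorator. Again, this illustrates the general technique only; ``timed``, ``inner``, and ``outer`` are hypothetical names and the output format is not PyLops-MPI's.

.. code-block:: python

    import functools
    import time

    _depth = 0  # current nesting level of timed calls

    def timed(func):
        """Print one indented runtime line per call; deeper calls indent further."""
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            global _depth
            indent = "    " * _depth
            _depth += 1
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                _depth -= 1
                elapsed = time.perf_counter() - start
                print(f"{indent}{func.__name__}: {elapsed:.6f} s")
        return wrapper

    @timed
    def inner():
        return sum(range(100))

    @timed
    def outer():
        return inner() + 1

    result = outer()

Because the depth counter is incremented on entry and decremented on exit, the inner call's line is indented one level relative to the outer call's.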
Logging Benchmark Output
------------------------

To store benchmarking results in a file, pass a custom :py:class:`logging.Logger` instance
to the :py:func:`benchmark` decorator. Below is a utility function that constructs such a logger.
.. code-block:: python

    def make_logger(save_file=False, file_path=''):
        logger = logging.getLogger(__name__)
        logger.setLevel(logging.INFO)
        # keep records out of the root logger; use only our own handler
        logger.propagate = False
        if save_file:
            handler = logging.FileHandler(file_path, mode='w')
        else:
            handler = logging.StreamHandler(sys.stdout)
        logger.addHandler(handler)
        return logger
Use this logger when decorating your function:

.. code-block:: python

    save_file = True
    file_path = "benchmark.log"
    logger = make_logger(save_file, file_path)

    @benchmark(logger=logger)
    def inner_func_with_logger(par):
        dist_arr = DistributedArray(global_shape=par['global_shape'],
                                    partition=par['partition'],
                                    dtype=par['dtype'], axis=par['axis'])
        # further computation may be performed here
        dist_arr.dot(dist_arr)
Running the function writes the benchmark output directly to ``benchmark.log``.

.. code-block:: python

    inner_func_with_logger(par)
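In multi-rank runs, every rank writing to the same file can interleave lines. A common workaround, sketched below, is to give each rank its own log file. This per-rank naming convention is an assumption for illustration, not something PyLops-MPI prescribes, and ``make_rank_logger`` is a hypothetical helper.

.. code-block:: python

    import logging

    def make_rank_logger(rank, base_path="benchmark"):
        """Create a logger writing to a per-rank file, e.g. benchmark_rank0.log."""
        logger = logging.getLogger(f"benchmark_rank{rank}")
        logger.setLevel(logging.INFO)
        logger.propagate = False
        if not logger.handlers:  # avoid stacking duplicate handlers on repeated calls
            handler = logging.FileHandler(f"{base_path}_rank{rank}.log", mode="w")
            logger.addHandler(handler)
        return logger

    # in an MPI program, pass the rank obtained from MPI.COMM_WORLD.Get_rank()
    logger0 = make_rank_logger(0)
    logger0.info("benchmark started")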
Final Notes
-----------

This tutorial demonstrated how to benchmark distributed PyLops-MPI operations using both
coarse- and fine-grained instrumentation tools. These utilities help track and debug
performance bottlenecks in parallel workloads.
