Commit 81e62b9

simplify benchmarking.rst, support environment variable to toggle benchmark
1 parent c77912f commit 81e62b9

3 files changed (+35, -148 lines changed)

docs/source/benchmarking.rst

Lines changed: 19 additions & 140 deletions
@@ -3,160 +3,39 @@
 Benchmark Utility in PyLops-MPI
 ===============================
 PyLops-MPI users can conveniently benchmark the performance of their code with a simple decorator.
-
-This tutorial demonstrates how to use the :py:func:`pylops_mpi.utils.benchmark` and
-:py:func:`pylops_mpi.utils.mark` utility methods in PyLops-MPI. These utilities support various
+:py:func:`pylops_mpi.utils.benchmark` and :py:func:`pylops_mpi.utils.mark` support various
 function calling patterns that may arise when benchmarking distributed code.

 - :py:func:`pylops_mpi.utils.benchmark` is a **decorator** used to time the execution of entire functions.
 - :py:func:`pylops_mpi.utils.mark` is a **function** used inside decorated functions to insert fine-grained time measurements.

-Basic Setup
------------
-
-We start by importing the required modules and setting up some parameters of our simple program.
-
-.. code-block:: python
-
-    import sys
-    import logging
-    import numpy as np
-    from mpi4py import MPI
-    from pylops_mpi import DistributedArray, Partition
-
-    from pylops_mpi.utils.benchmark import benchmark, mark
+.. note::
+    This benchmark utility is enabled by default, i.e., if the user decorates a function with :py:func:`@benchmark`, the function will go through
+    the time measurements, adding overhead. Users can turn off the benchmark while leaving the decorator in place with:

-    np.random.seed(42)
-    rank = MPI.COMM_WORLD.Get_rank()
+.. code-block:: bash

-    par = {'global_shape': (500, 501),
-           'partition': Partition.SCATTER, 'dtype': np.float64,
-           'axis': 1}
+    >> export BENCH_PYLOPS_MPI=0

-Benchmarking a Simple Function
-------------------------------
-
-We define a simple function and decorate it with :py:func:`benchmark`.
+The usage can be as simple as:

 .. code-block:: python

     @benchmark
-    def inner_func(par):
-        dist_arr = DistributedArray(global_shape=par['global_shape'],
-                                    partition=par['partition'],
-                                    dtype=par['dtype'], axis=par['axis'])
-        # may perform computation here
-        dist_arr.dot(dist_arr)
-
-Calling the function will result in the elapsed runtime being printed to standard output.
-
-.. code-block:: python
-
-    inner_func(par)
-
-You can also customize the label of the printout using the ``description`` parameter:
+    def function_to_time():
+        # Your computation

-.. code-block:: python
-
-    @benchmark(description="printout_name")
-    def my_func(...):
-        ...
-
-Fine-grained Time Measurements
-------------------------------
-
-To gain more insight into the runtime of specific code regions, use :py:func:`mark` within
-a decorated function. This allows insertion of labeled time checkpoints.
+The result is printed to standard output.
+For fine-grained time measurements, :py:func:`pylops_mpi.utils.mark` can be inserted in the code region of benchmarked functions:

 .. code-block:: python

     @benchmark
-    def inner_func_with_mark(par):
-        mark("Begin array constructor")
-        dist_arr = DistributedArray(global_shape=par['global_shape'],
-                                    partition=par['partition'],
-                                    dtype=par['dtype'], axis=par['axis'])
-        mark("Begin dot")
-        dist_arr.dot(dist_arr)
-        mark("Finish dot")
-
-The output will now contain timestamped entries for each marked location, along with the total time
-from the outer decorator (marked with ``[decorator]`` in the output).
-
-.. code-block:: python
-
-    inner_func_with_mark(par)
-
-Nested Function Benchmarking
-----------------------------
-
-You can nest benchmarked functions to track execution times across layers of function calls.
-Below, we define an :py:func:`outerfunc_with_mark` that calls :py:func:`inner_func_with_mark` defined earlier.
-
-.. code-block:: python
-
-    @benchmark
-    def outer_func_with_mark(par):
-        mark("Outer func start")
-        inner_func_with_mark(par)
-        dist_arr = DistributedArray(global_shape=par['global_shape'],
-                                    partition=par['partition'],
-                                    dtype=par['dtype'], axis=par['axis'])
-        dist_arr + dist_arr
-        mark("Outer func ends")
-
-Calling the function prints the full call tree with indentation, capturing both outer and nested timing.
-
-.. code-block:: python
-
-    outer_func_with_mark(par)
-
-Logging Benchmark Output
-------------------------
-
-To store benchmarking results in a file, pass a custom :py:class:`logging.Logger` instance
-to the :py:func:`benchmark` decorator. Below is a utility function that constructs such a logger.
-
-.. code-block:: python
-
-    def make_logger(save_file=False, file_path=''):
-        logger = logging.getLogger(__name__)
-        logging.basicConfig(filename=file_path if save_file else None,
-                            filemode='w', level=logging.INFO, force=True)
-        logger.propagate = False
-        if save_file:
-            handler = logging.FileHandler(file_path, mode='w')
-        else:
-            handler = logging.StreamHandler(sys.stdout)
-        logger.addHandler(handler)
-        return logger
-
-Use this logger when decorating your function:
-
-.. code-block:: python
-
-    save_file = True
-    file_path = "benchmark.log"
-    logger = make_logger(save_file, file_path)
-
-    @benchmark(logger=logger)
-    def inner_func_with_logger(par):
-        dist_arr = DistributedArray(global_shape=par['global_shape'],
-                                    partition=par['partition'],
-                                    dtype=par['dtype'], axis=par['axis'])
-        # may perform computation here
-        dist_arr.dot(dist_arr)
-
-Run the function to generate output written directly to ``benchmark.log``.
-
-.. code-block:: python
-
-    inner_func_with_logger(par)
-
-Final Notes
------------
-
-This tutorial demonstrated how to benchmark distributed PyLops-MPI operations using both
-coarse and fine-grained instrumentation tools. These utilities help track and debug
-performance bottlenecks in parallel workloads.
-
+    def function_to_time():
+        # Your computation that you may want to exclude from the benchmark
+        mark("Begin Region")
+        # Your computation
+        mark("Finish Region")
+
+You can also nest benchmarked functions to track execution times across layers of function calls with the output being correctly formatted.
+Additionally, the result can also be exported to a text file. For complete and runnable examples, visit :ref:`sphx_glr_tutorials_benchmarking.py`.
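
Note on the ``BENCH_PYLOPS_MPI`` toggle: the sketch below mirrors the expression this commit introduces in ``pylops_mpi/utils/benchmark.py`` so that its behaviour for different values can be checked in isolation. The helper name ``benchmark_enabled`` and its ``environ`` parameter are hypothetical and exist only for illustration; they are not part of PyLops-MPI.

```python
import os


def benchmark_enabled(environ=None):
    """Return True unless BENCH_PYLOPS_MPI is set to an integer other than 1.

    Mirrors int(os.getenv("BENCH_PYLOPS_MPI", 1)) == 1 from this commit.
    This helper is a hypothetical illustration, not a PyLops-MPI function.
    """
    environ = os.environ if environ is None else environ
    return int(environ.get("BENCH_PYLOPS_MPI", 1)) == 1


# Unset variable: benchmarking stays enabled (the default).
print(benchmark_enabled({}))
# Explicitly enabled.
print(benchmark_enabled({"BENCH_PYLOPS_MPI": "1"}))
# Disabled, as in `export BENCH_PYLOPS_MPI=0`.
print(benchmark_enabled({"BENCH_PYLOPS_MPI": "0"}))
# Any integer other than 1 also disables it.
print(benchmark_enabled({"BENCH_PYLOPS_MPI": "2"}))
```

Two consequences follow from the expression as committed. The flag is evaluated at module level, so the variable has to be set before ``pylops_mpi.utils.benchmark`` is first imported. A non-integer value such as ``off`` makes ``int()`` raise ``ValueError`` rather than disabling the benchmark.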

pylops_mpi/utils/benchmark.py

Lines changed: 16 additions & 8 deletions
@@ -1,5 +1,6 @@
 import functools
 import logging
+import os
 import time
 from typing import Callable, Optional, List
 from mpi4py import MPI
@@ -16,8 +17,8 @@
     def _nccl_sync():
         pass

-# TODO (tharitt): later move to env file or something
-ENABLE_BENCHMARK = True
+# Benchmark is enabled by default
+ENABLE_BENCHMARK = int(os.getenv("BENCH_PYLOPS_MPI", 1)) == 1

 # Stack of active mark functions for nested support
 _mark_func_stack = []
@@ -77,6 +78,8 @@ def mark(label: str):
         A label of the mark. This signifies both 1) the end of the
         previous mark 2) the beginning of the new mark
     """
+    if not ENABLE_BENCHMARK:
+        return
     if not _mark_func_stack:
         raise RuntimeError("mark() called outside of a benchmarked region")
     _mark_func_stack[-1](label)
@@ -108,9 +111,11 @@ def benchmark(func: Optional[Callable] = None,
         is not provided, the output is printed to stdout.
     """

-    # Zero-overhead
-    if not ENABLE_BENCHMARK:
-        return func
+    def noop_decorator(func):
+        @functools.wraps(func)
+        def wrapped(*args, **kwargs):
+            return func(*args, **kwargs)
+        return wrapped

     @functools.wraps(func)
     def decorator(func):
@@ -153,7 +158,10 @@ def local_mark(label):
                 print("".join(output))
             return result
         return wrapper
-    if func is not None:
-        return decorator(func)

-    return decorator
+    # The code still has to return decorator so that the in-place decorator with arguments
+    # like @benchmark(logger=logger) does not throw the error and can be kept untouched.
+    if not ENABLE_BENCHMARK:
+        return noop_decorator if func is None else noop_decorator(func)
+
+    return decorator if func is None else decorator(func)
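
Note on the return logic: the pattern this diff settles on, one decorator that works both bare (``@benchmark``) and with arguments (``@benchmark(logger=logger)``) and that can be switched to a pass-through, can be sketched in isolation. The code below is a simplified stand-in and not the PyLops-MPI implementation: ``make_benchmark``, ``bench`` and the ``enabled`` argument are hypothetical names, and the timing is a bare ``time.perf_counter`` measurement with no MPI or NCCL synchronisation.

```python
import functools
import time


def make_benchmark(enabled=True):
    """Build a toy decorator usable as @bench or @bench(description=...).

    `enabled` is a hypothetical stand-in for the module-level
    ENABLE_BENCHMARK flag in the diff.
    """

    def bench(func=None, description=None):
        def noop_decorator(f):
            # Pass-through: keeps the call signature, adds no timing.
            @functools.wraps(f)
            def wrapped(*args, **kwargs):
                return f(*args, **kwargs)
            return wrapped

        def decorator(f):
            # Timing wrapper: reports the elapsed wall-clock time per call.
            @functools.wraps(f)
            def wrapper(*args, **kwargs):
                start = time.perf_counter()
                result = f(*args, **kwargs)
                elapsed = time.perf_counter() - start
                print(f"[{description or f.__name__}] {elapsed:.6f} s")
                return result
            return wrapper

        chosen = decorator if enabled else noop_decorator
        # Bare use passes the function directly; use with arguments
        # passes func=None and expects a decorator back.
        return chosen if func is None else chosen(func)

    return bench


bench_on = make_benchmark(enabled=True)
bench_off = make_benchmark(enabled=False)


@bench_on
def add(a, b):
    return a + b


@bench_off(description="never printed")
def mul(a, b):
    return a * b


print(add(1, 2))  # a timing line is printed before the result
print(mul(2, 3))  # only the result is printed
```

This also shows why the removed early ``return func`` could not stay: with ``@benchmark(logger=logger)`` the ``func`` argument is ``None``, so the old disabled path handed ``None`` back as the decorator, and applying it raised ``TypeError``. Returning a no-op decorator keeps both call forms valid whether or not benchmarking is enabled.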
File renamed without changes.
