-
Notifications
You must be signed in to change notification settings - Fork 1.4k
[tmva][sofie] New Keras Parser #19692
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
[tmva][sofie] New Keras Parser #19692
Conversation
Thank you very much for the PR! It's indeed better to implement this in Python.
While waiting for the review by @lmoneta and @sanjibansg, I have a general question. What is the plan with the existing Keras parser in C++? If it is superseded by your new parser, can we remove it maybe in this PR as well? Like this we ensure that the users are not confused by two implementations, and also that the total maintenance cost doesn't increase. |
I agree with @guitargeek, we will remove the existing C++ parser for Keras within this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did an initial review
@@ -58,7 +58,31 @@ if(tmva) | |||
ROOT/_pythonization/_tmva/_rtensor.py | |||
ROOT/_pythonization/_tmva/_tree_inference.py | |||
ROOT/_pythonization/_tmva/_utils.py | |||
ROOT/_pythonization/_tmva/_gnn.py) | |||
ROOT/_pythonization/_tmva/_gnn.py | |||
ROOT/_pythonization/_tmva/_sofie/_parser/_keras/__init__.py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe better to have an additional CMake file for the keras parser files within their directory, so as to not add all of them here.
@@ -0,0 +1,202 @@ | |||
# functional_models.py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need this comment here?
# # 1. Dropout (to test SOFIE's Identity operator) | ||
# inp = layers.Input(shape=(10,)) | ||
# out = layers.Dropout(0.5)(inp) | ||
# model = models.Model(inputs=inp, outputs=out) | ||
# train_and_save(model, "Functional_Dropout_test") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we remove the commented out code if this is not used anyway?
# 1. Dropout | ||
# model = models.Sequential([ | ||
# layers.Input(shape=(10,)), | ||
# layers.Dropout(0.5) # Dropout | ||
# ]) | ||
# train_and_save(model, "Sequential_Dropout_test") | ||
|
||
# 2. Binary Ops: Add, Subtract, Multiply are not typical in Sequential — skipping here | ||
|
||
# 3. Concat (not applicable in Sequential without multi-input) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as earlier for functional model
return op | ||
else: | ||
raise RuntimeError( | ||
"TMVA::SOFIE - Unsupported - Operator Gemm does not yet support input type " + fLayerDType |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"TMVA::SOFIE - Unsupported - Operator Gemm does not yet support input type " + fLayerDType | |
"TMVA::SOFIE - Unsupported - Operator Conv does not yet support input type " + fLayerDType |
op = gbl_namespace.TMVA.Experimental.SOFIE.ROperator_BasicBinary(float,'TMVA::Experimental::SOFIE::EBasicBinaryOperator::Mul')(fX1, fX2, fY) | ||
else: | ||
raise RuntimeError( | ||
"TMVA::SOFIE - Unsupported - Operator Identity does not yet support input type " + fLayerDType |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"TMVA::SOFIE - Unsupported - Operator Identity does not yet support input type " + fLayerDType | |
"TMVA::SOFIE - Unsupported - Operator BasicBinary does not yet support input type " + fLayerDType |
@@ -0,0 +1,48 @@ | |||
from cppyy import gbl as gbl_namespace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe good to check if the datatypes for the input tensors are float since we don't support the other cases?
input = [str(i) for i in finput] | ||
output = str(foutput[0]) | ||
axis = int(attributes["axis"]) | ||
op = gbl_namespace.TMVA.Experimental.SOFIE.ROperator_Concat(input, axis, 0, output) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe good to check the datatype for the input tensor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TensoFlow/Keras is too fragile to import unconditionally. You need to make sure it's not imported unconditionally, otherwise its presence will break several ROOT usecases, as we see in the tests. Always importing keras will also slow down importing ROOT, which is not desired
@@ -0,0 +1,479 @@ | |||
from ......_pythonization import pythonization | |||
from cppyy import gbl as gbl_namespace | |||
import keras |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here for example: instead of importing keras
whenever the parsers are imported, you need to import it locally in the functions.
@@ -44,6 +44,7 @@ def inject_rbatchgenerator(ns): | |||
|
|||
|
|||
from ._gnn import RModel_GNN, RModel_GraphIndependent | |||
from ._sofie._parser._keras.parser import RModelParser_Keras |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But here is the main culprit: this global import is the entry point for all other imports I think.
Test Results 21 files 21 suites 3d 23h 48m 50s ⏱️ For more details on these failures, see this check. Results for commit 93a55ac. |
@@ -44,6 +44,7 @@ def inject_rbatchgenerator(ns): | |||
|
|||
|
|||
from ._gnn import RModel_GNN, RModel_GraphIndependent | |||
from ._sofie._parser._keras.parser import RModelParser_Keras |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from ._sofie._parser._keras.parser import RModelParser_Keras |
This Pull request:
This pull request adds the support for parsing Keras models with SOFIE
Changes or fixes:
Currently, SOFIE's existing Keras parser is written in C++ and is quite old. Although it's written in C++ but its actual parsing logic is written in Python. Additionally, it lacks support for parsing layers such as MaxPool.
This new parser is natively written in Python and uses of pythnoization to access C++ methods from SOFIE. The parser support latest version of Keras (Keras 3) and also has backwards compatibility with earlier versions of Keras 2. It also supports for both types of models i.e. models built using Keras' Functional as well as Sequential API.
Layers supported by the parser:
Checklist: