-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Added weightless cache attributes for intel GPU plugin. #31468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
f108dbe
to
c9f955c
Compare
c9f955c
to
84e65dd
Compare
build_jenkins |
} | ||
} else if (auto ti = ov::as_type<const ov::op::v0::TensorIterator>(node.get())) { | ||
auto ti_body = ti->get_body(); | ||
fill_offset_to_constant_map(ti_body); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fill_offset_to_constant_map(ti_body); | |
fill_offset_to_constant_map(ti_body, cache_attr); |
There is no logic for handling TensorIterator in set_weightless_cache_attributes()
, so either its body should be ignored here, or the logic should be added to the corresponding functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are unit test failure with suggested changes. 4 tests out of 193 weightless cache related tests are failing. One such test is below one.
[ FAILED ] smoke_CheckWeightlessCacheAccuracy/CheckWeightlessCacheAccuracy.TiWithLstmCell/import_api=compile_model_do_encryption=1_inference_mode=f32_model_dtype=f32, where GetParam() = (4-byte object <02-00 00-00>, true, f32, f32) (273 ms)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@susbhere, did you check the tests with proper TensorIterator handling in set_weightless_cache_attributes() / create_weightless_cache_attributes() functioncs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sshlyapn, I have run below command and saw the failures. 4 out of 193 tests failed. Those are not failing with current state of the PR.
ov_gpu_func_tests.exe --gtest_filter=Weightless
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@susbhere, my point is that either this TensorIterator-related condition is unnecessary, or TensorIterator should be properly handled during runtime attribute configuration. Currently, only constants are processed there:
for (const auto& node : model->get_ordered_ops()) {
if (ov::op::util::is_constant(node)) {
// Offset behaves as a unique key for each constant. Size = 1 is used as dummy.
cache_attr_map->emplace(node->get_instance_id(), ov::WeightlessCacheAttribute(1, offset++, node->get_element_type()));
}
}
src/plugins/intel_gpu/include/intel_gpu/runtime/internal_properties.hpp
Outdated
Show resolved
Hide resolved
@@ -1860,14 +1860,20 @@ void program::save(cldnn::BinaryOutputBuffer& ob) const { | |||
} | |||
} | |||
|
|||
void program::load(cldnn::BinaryInputBuffer& ib, std::shared_ptr<const ov::Model> model_ptr) { | |||
void program::load(cldnn::BinaryInputBuffer& ib, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[random spot]
Could you please add some functional tests for this caching mechanism?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do post merging this PR.
84e65dd
to
e4a48d2
Compare
Renaming variables and minor updates as per review comments.
Weightless cache attributes are added when the weights are not coming from bin file. That happens for non-IR inputs like ORT.
https://jira.devtools.intel.com/browse/CVS-167691