Skip to content

Added weightless cache attributes for intel GPU plugin. #31468

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

susbhere
Copy link

@susbhere susbhere commented Jul 25, 2025

Weightless cache attributes are added when the weights are not coming from bin file. That happens for non-IR inputs like ORT.

https://jira.devtools.intel.com/browse/CVS-167691

@github-actions github-actions bot added the category: GPU OpenVINO GPU plugin label Jul 25, 2025
@sys-openvino-ci sys-openvino-ci added the ExternalIntelPR External contributor from Intel label Jul 25, 2025
@susbhere susbhere marked this pull request as ready for review July 25, 2025 08:57
@susbhere susbhere requested review from a team as code owners July 25, 2025 08:57
@susbhere susbhere force-pushed the gpu_weighless_caching branch 3 times, most recently from f108dbe to c9f955c Compare July 25, 2025 11:02
@susbhere
Copy link
Author

@susbhere susbhere force-pushed the gpu_weighless_caching branch from c9f955c to 84e65dd Compare July 28, 2025 03:24
@praasz
Copy link
Contributor

praasz commented Jul 28, 2025

build_jenkins

}
} else if (auto ti = ov::as_type<const ov::op::v0::TensorIterator>(node.get())) {
auto ti_body = ti->get_body();
fill_offset_to_constant_map(ti_body);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fill_offset_to_constant_map(ti_body);
fill_offset_to_constant_map(ti_body, cache_attr);

There is no logic for handling TensorIterator in set_weightless_cache_attributes(), so either its body should be ignored here, or the logic should be added to the corresponding functions

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are unit test failure with suggested changes. 4 tests out of 193 weightless cache related tests are failing. One such test is below one.

[ FAILED ] smoke_CheckWeightlessCacheAccuracy/CheckWeightlessCacheAccuracy.TiWithLstmCell/import_api=compile_model_do_encryption=1_inference_mode=f32_model_dtype=f32, where GetParam() = (4-byte object <02-00 00-00>, true, f32, f32) (273 ms)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@susbhere, did you check the tests with proper TensorIterator handling in set_weightless_cache_attributes() / create_weightless_cache_attributes() functioncs?

Copy link
Author

@susbhere susbhere Jul 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sshlyapn, I have run below command and saw the failures. 4 out of 193 tests failed. Those are not failing with current state of the PR.

ov_gpu_func_tests.exe --gtest_filter=Weightless

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@susbhere, my point is that either this TensorIterator-related condition is unnecessary, or TensorIterator should be properly handled during runtime attribute configuration. Currently, only constants are processed there:

    for (const auto& node : model->get_ordered_ops()) {
        if (ov::op::util::is_constant(node)) {            
            // Offset behaves as a unique key for each constant. Size = 1 is used as dummy.
            cache_attr_map->emplace(node->get_instance_id(), ov::WeightlessCacheAttribute(1, offset++, node->get_element_type()));
        }
    }

@@ -1860,14 +1860,20 @@ void program::save(cldnn::BinaryOutputBuffer& ob) const {
}
}

void program::load(cldnn::BinaryInputBuffer& ib, std::shared_ptr<const ov::Model> model_ptr) {
void program::load(cldnn::BinaryInputBuffer& ib,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[random spot]
Could you please add some functional tests for this caching mechanism?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do post merging this PR.

@susbhere susbhere force-pushed the gpu_weighless_caching branch from 84e65dd to e4a48d2 Compare July 29, 2025 10:36
Renaming variables and minor updates as per review comments.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: GPU OpenVINO GPU plugin ExternalIntelPR External contributor from Intel
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants