[SPARK-52146][SQL] Detect cyclic function references in SQL UDFs #51626

Closed

allisonwang-db
Contributor

What changes were proposed in this pull request?

This change adds logic to detect cyclic function references when creating SQL UDFs to prevent infinite analysis.
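As a rough illustration of the idea (not the code in this patch), cycle detection can be sketched as a depth-first walk over the functions referenced by the new function's body. The `CyclicReferenceSketch` object, the `Refs` map, and `findCycle` are hypothetical names for this sketch; the real check works on resolved plans in the session catalog rather than on a plain map of names.

```scala
// A minimal sketch, assuming a hypothetical in-memory model:
// each existing function name maps to the names its body references.
object CyclicReferenceSketch {
  type Refs = Map[String, Seq[String]]

  /**
   * Returns the first reference path that leads back to `name`, if any,
   * when `name` is about to be created with a body that calls `bodyRefs`.
   */
  def findCycle(name: String, bodyRefs: Seq[String], catalog: Refs): Option[Seq[String]] = {
    def visit(current: String, path: Seq[String], visited: Set[String]): Option[Seq[String]] = {
      val newPath = path :+ current
      if (current == name) {
        // The walk reached the function being created: this path is a cycle.
        Some(newPath)
      } else if (visited.contains(current)) {
        // Already on this path; stop so the walk always terminates.
        None
      } else {
        catalog.getOrElse(current, Seq.empty).iterator
          .map(next => visit(next, newPath, visited + current))
          .collectFirst { case Some(cycle) => cycle }
      }
    }

    bodyRefs.iterator
      .map(ref => visit(ref, Seq(name), Set.empty))
      .collectFirst { case Some(cycle) => cycle }
  }
}
```

For example, with existing functions `b -> c` and `c -> a`, creating `a` with a body that calls `b` yields the path `a, b, c, a`, which can be reported in the error message.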

Why are the changes needed?

To improve SQL UDF usability: without this check, a cyclic reference between SQL UDFs can cause infinite analysis.

Does this PR introduce any user-facing change?

No

How was this patch tested?

New tests

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label Jul 23, 2025
@allisonwang-db allisonwang-db changed the title [SPARK-52146][SQL] Detect cyclic function references in SQL UDFs WIP [SPARK-52146][SQL] Detect cyclic function references in SQL UDFs Jul 23, 2025
Comment on lines +1671 to +1675
val outer = expr.transform {
  case a: Attribute if a.resolved => OuterReference(a)
  case o: OuterReference => OuterReference(o)
}
Alias(Cast(outer, param.dataType), param.name)(
Contributor Author

cc @cloud-fan this is needed to make the plan structure valid when using OneRowRelation.

@allisonwang-db allisonwang-db changed the title WIP [SPARK-52146][SQL] Detect cyclic function references in SQL UDFs [SPARK-52146][SQL] Detect cyclic function references in SQL UDFs Jul 23, 2025
@allisonwang-db allisonwang-db requested a review from cloud-fan July 23, 2025 21:36
@allisonwang-db allisonwang-db force-pushed the spark-52146-cyc-func-usage branch from 30c74a8 to eaee0a7 Compare July 23, 2025 21:44
}
// Check cyclic reference using qualified function names.
val newPath = path :+ f.function.name
if (f.function.name == name) {
Contributor

Shall we consider case sensitivity?

Contributor

And is this name qualified?

Contributor Author

Function names are always case-insensitive, and the name here is qualified. I've added a few more tests to cover both scenarios.
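To make the two points above concrete, here is a minimal sketch of comparing qualified function names case-insensitively. `QualifiedName` and `sameFunction` are hypothetical stand-ins rather than Spark classes, and lower-casing every part of the identifier (not just the function part) is an assumption made for this sketch.

```scala
import java.util.Locale

object QualifiedNameSketch {
  // Hypothetical stand-in for a fully qualified function identifier.
  final case class QualifiedName(catalog: String, database: String, function: String) {
    // Assumption: every name part is compared case-insensitively.
    def normalized: QualifiedName = QualifiedName(
      catalog.toLowerCase(Locale.ROOT),
      database.toLowerCase(Locale.ROOT),
      function.toLowerCase(Locale.ROOT))
  }

  /** True when both identifiers refer to the same function, ignoring case. */
  def sameFunction(a: QualifiedName, b: QualifiedName): Boolean =
    a.normalized == b.normalized
}
```

Under this sketch, `FoO3_4b` and `foo3_4b` in the same catalog and database compare as equal, while the same function name in a different database does not.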

allisonwang-db and others added 2 commits July 24, 2025 10:47
…log/SessionCatalog.scala

Co-authored-by: Wenchen Fan <cloud0fan@gmail.com>
@cloud-fan
Contributor

We need to fix SparkThrowableSuite.

@@ -367,6 +373,61 @@ case class CreateSQLFunctionCommand(
}
}

/**
Member

Could you fix the indentation, @allisonwang-db?

CREATE OR REPLACE FUNCTION foo3_4a(x INT) RETURN FoO3_4b(x);
CREATE OR REPLACE FUNCTION foo3_4a(x INT) RETURNS INT RETURN SELECT SUM(a) FROM foo3_4e(x);
CREATE OR REPLACE FUNCTION foo3_4e(x INT) RETURNS TABLE (c INT) RETURN SELECT * FROM foo3_4f(x);
CREATE OR REPLACE FUNCTION foo3_4e(x INT) RETURNS TABLE RETURN SELECT * FROM fOo3_4F(x);
Member

Thank you for adding this test coverage.

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM (with one minor comment).

cc @peter-toth

@cloud-fan
Contributor

cloud-fan commented Jul 28, 2025

The last commit just fixed indentation. Thanks, merging to master/4.0!

@cloud-fan cloud-fan closed this in 3ff28ae Jul 28, 2025
cloud-fan added a commit that referenced this pull request Jul 28, 2025
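What changes were proposed in this pull request?

This change adds logic to detect cyclic function references when creating SQL UDFs to prevent infinite analysis.

Why are the changes needed?

To improve SQL UDF usability

Does this PR introduce any user-facing change?

No

How was this patch tested?

New tests

Was this patch authored or co-authored using generative AI tooling?

No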

Closes #51626 from allisonwang-db/spark-52146-cyc-func-usage.

Lead-authored-by: Allison Wang <allison.wang@databricks.com>
Co-authored-by: Allison Wang <allisonwang@apache.org>
Co-authored-by: Wenchen Fan <cloud0fan@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 3ff28ae)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
@peter-toth
Contributor

Late LGTM.
