-
Notifications
You must be signed in to change notification settings - Fork 28.7k
[SPARK-52444][SQL][CONNECT] Add support for Variant/Char/Varchar Literal #51215
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
ping @hvanhovell to take a look. |
|
||
message Char { | ||
string value = 1; | ||
optional int32 length = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only needed when the length of the value and the intended data type do not match right? If so please document this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think when using Char or Varchar, the length field is better to provide, or String type is preferred. We will use the length field if it's provided and do some validation, or use len(value) as a default when length is omitted. I added these descriptions to the documents.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering whether we should just use DataType.Char.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good idea, using char in DataType
will avoid duplication of meta info, but we already have some duplication in Decimal
and Time
, should we also make some adjustments for them? cc @hvanhovell
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be nice to make some adjustments for them as well. cc @HyukjinKwon and @zhengruifeng
@@ -240,6 +243,21 @@ message Expression { | |||
Strings strings = 6; | |||
} | |||
} | |||
|
|||
message Variant { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make sure to provide a reference to the format that is used here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a reference to Spark's VariantVal, and I searched the docs directory without finding expressions-related docs, so I changed no documents in the docs directory.
b9cf82b
to
dfbdf73
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hvanhovell Thank you for your review, I'm sorry that I forgot to submit my reply, and I have resolved your comments, PTAL again.
@@ -240,6 +243,21 @@ message Expression { | |||
Strings strings = 6; | |||
} | |||
} | |||
|
|||
message Variant { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a reference to Spark's VariantVal, and I searched the docs directory without finding expressions-related docs, so I changed no documents in the docs directory.
|
||
message Char { | ||
string value = 1; | ||
optional int32 length = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think when using Char or Varchar, the length field is better to provide, or String type is preferred. We will use the length field if it's provided and do some validation, or use len(value) as a default when length is omitted. I added these descriptions to the documents.
bc7bdd6
to
bbdbbce
Compare
// - If omitted, uses the actual length of `value`. | ||
// - If provided but smaller than `value`'s length, an error will be thrown. | ||
// - If provided and larger than `value`'s length, stores exact value without padding. | ||
optional int32 length = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering whether we should just use DataType.VarChar.
e98cfdb
to
2d8db4b
Compare
What changes were proposed in this pull request?
We are currently missing support for a number of DataTypes, add Variant/Char/Varchar in this PR,
Strings with Collation, YearMonthIntervalType with begin/end and DayTimeIntervalType with begin/end require us to change the expression structure and will be done in another PR.
Why are the changes needed?
Align the gap between literal expressions and data types.
Does this PR introduce any user-facing change?
Yes.
fun.lit()
can pass a VariantVal.How was this patch tested?
fun.lit
andfun.typedLit
.Was this patch authored or co-authored using generative AI tooling?
No