Operators#
Category |
Description |
|---|---|
Boolean logic operators (AND, OR, NOT). |
|
Value comparison and membership tests (=, >, <, IS, BETWEEN, IN, ALL, ANY). |
|
Arithmetic operators (+, -, *, /, //, %, unary). |
|
Bitwise integer operators (&, |, XOR). |
|
String matching and concatenation (LIKE, ILIKE, ||, ^@). |
|
Regular expression matching (~, ~*, REGEXP, RLIKE). |
|
Struct field and array element access (->, ->>, #>, [n]). |
Logical#
Operator |
Description |
|---|---|
Combines conditions, returning True if both conditions are True. |
|
Combines conditions, returning True if either condition is True. |
|
Negates a condition, returning True if the condition is False. |
AND: Combines conditions, returning True if both conditions are True.
Example:
df = pl.DataFrame({"foo": [10, 20, 50, 35], "bar": [10, 10, 50, 35]})
df.sql("""
SELECT * FROM self
WHERE (foo >= bar) AND (bar < 50) AND (foo != 10)
""")
# shape: (2, 2)
# ┌─────┬─────┐
# │ foo ┆ bar │
# │ --- ┆ --- │
# │ i64 ┆ i64 │
# ╞═════╪═════╡
# │ 20 ┆ 10 │
# │ 35 ┆ 35 │
# └─────┴─────┘
OR: Combines conditions, returning True if either condition is True.
Example:
df = pl.DataFrame({"foo": [10, 20, 50, 35], "bar": [10, 10, 50, 35]})
df.sql("""
SELECT * FROM self
WHERE (foo % 2 != 0) OR (bar > 40)
""")
# shape: (2, 2)
# ┌─────┬─────┐
# │ foo ┆ bar │
# │ --- ┆ --- │
# │ i64 ┆ i64 │
# ╞═════╪═════╡
# │ 50 ┆ 50 │
# │ 35 ┆ 35 │
# └─────┴─────┘
NOT: Negates a condition, returning True if the condition is False.
Example:
df = pl.DataFrame({"foo": [10, 20, 50, 35], "bar": [10, 10, 50, 35]})
df.sql("""
SELECT * FROM self
WHERE NOT(foo % 2 != 0 OR bar > 40)
""")
# shape: (2, 2)
# ┌─────┬─────┐
# │ foo ┆ bar │
# │ --- ┆ --- │
# │ i64 ┆ i64 │
# ╞═════╪═════╡
# │ 10 ┆ 10 │
# │ 20 ┆ 10 │
# └─────┴─────┘
Comparison#
Operator |
Description |
|---|---|
Greater than (or equal to). |
|
Less than (or equal to). |
|
Equal to (or not equal to). |
|
Check if a value is one of the special values NULL, TRUE, or FALSE. |
|
Check if a value is distinct from another value; this is the NULL-safe equivalent of |
|
Check if a value is between two other values. |
|
Check if a value is present in a list of values. |
|
Compare a value against all values in a column. |
|
Compare a value against any value in a column. |
Returns True if the first value is greater than the second value.
>: Greater than.>=: Greater than or equal to.Example:
df = pl.DataFrame({"n": [1000, 3000, 5000]})
df.sql("SELECT * FROM self WHERE n > 3000")
# shape: (1, 1)
# ┌──────┐
# │ n │
# │ --- │
# │ i64 │
# ╞══════╡
# │ 5000 │
# └──────┘
df.sql("SELECT * FROM self WHERE n >= 3000")
# shape: (2, 1)
# ┌──────┐
# │ n │
# │ --- │
# │ i64 │
# ╞══════╡
# │ 3000 │
# │ 5000 │
# └──────┘
Returns True if the first value is less than the second value.
<: Less than.<=: Less than or equal to.Example:
df = pl.DataFrame({"n": [1000, 3000, 5000]})
df.sql("SELECT * FROM self WHERE n < 3000")
# shape: (1, 1)
# ┌──────┐
# │ n │
# │ --- │
# │ i64 │
# ╞══════╡
# │ 1000 │
# └──────┘
df.sql("SELECT * FROM self WHERE n <= 3000")
# shape: (2, 1)
# ┌──────┐
# │ n │
# │ --- │
# │ i64 │
# ╞══════╡
# │ 1000 │
# │ 3000 │
# └──────┘
Returns True if the two values are considered equal.
=: Equal to.!=, <>: Not equal to.Example:
df = pl.DataFrame({"n": [1000, 3000, 5000]})
df.sql("SELECT * FROM self WHERE n = 3000")
# shape: (1, 1)
# ┌──────┐
# │ n │
# │ --- │
# │ i64 │
# ╞══════╡
# │ 3000 │
# └──────┘
df.sql("SELECT * FROM self WHERE n != 3000")
# shape: (2, 1)
# ┌──────┐
# │ n │
# │ --- │
# │ i64 │
# ╞══════╡
# │ 1000 │
# │ 5000 │
# └──────┘
IS: Returns True if the first value is identical to the second value (typically one of NULL, TRUE, or FALSE).
Unlike == (and !=), this operator will always return TRUE or FALSE, never NULL.
Example:
df = pl.DataFrame({"lbl": ["aa", "bb", "cc"], "n": [1000, None, 5000]})
df.sql("SELECT * FROM self WHERE n IS NULL")
# shape: (1, 2)
# ┌─────┬──────┐
# │ lbl ┆ n │
# │ --- ┆ --- │
# │ str ┆ i64 │
# ╞═════╪══════╡
# │ bb ┆ null │
# └─────┴──────┘
df.sql("SELECT * FROM self WHERE n IS NOT NULL")
# shape: (2, 2)
# ┌─────┬──────┐
# │ lbl ┆ n │
# │ --- ┆ --- │
# │ str ┆ i64 │
# ╞═════╪══════╡
# │ aa ┆ 1000 │
# │ cc ┆ 5000 │
# └─────┴──────┘
IS DISTINCT FROM: Compare two values; this operator assumes that NULL values are equal.IS NOT DISTINCT FROM, can also be written using the <=> operator.== (or !=) for non-NULL values.Example:
df = pl.DataFrame({"n1": [2222, None, 8888], "n2": [4444, None, 8888]})
df.sql("SELECT * FROM self WHERE n1 IS DISTINCT FROM n2")
# shape: (1, 2)
# ┌──────┬──────┐
# │ n1 ┆ n2 │
# │ --- ┆ --- │
# │ i64 ┆ i64 │
# ╞══════╪══════╡
# │ 2222 ┆ 4444 │
# └──────┴──────┘
df.sql("SELECT * FROM self WHERE n1 IS NOT DISTINCT FROM n2")
df.sql("SELECT * FROM self WHERE n1 <=> n2")
# shape: (2, 2)
# ┌──────┬──────┐
# │ n1 ┆ n2 │
# │ --- ┆ --- │
# │ i64 ┆ i64 │
# ╞══════╪══════╡
# │ null ┆ null │
# │ 8888 ┆ 8888 │
# └──────┴──────┘
BETWEEN: Returns True if the first value is between the second and third values (inclusive).
Example:
df = pl.DataFrame({"n": [1000, 2000, 3000, 4000]})
df.sql("SELECT * FROM self WHERE n BETWEEN 2000 AND 3000")
# shape: (2, 1)
# ┌──────┐
# │ n │
# │ --- │
# │ i64 │
# ╞══════╡
# │ 2000 │
# │ 3000 │
# └──────┘
df.sql("SELECT * FROM self WHERE n NOT BETWEEN 2000 AND 3000")
# shape: (2, 1)
# ┌──────┐
# │ n │
# │ --- │
# │ i64 │
# ╞══════╡
# │ 1000 │
# │ 4000 │
# └──────┘
IN: Returns True if the value is found in the given list (or subquery result).
Example:
df = pl.DataFrame({"n": [1, 2, 3, 4, 5]})
df.sql("SELECT * FROM self WHERE n IN (2, 4)")
# shape: (2, 1)
# ┌─────┐
# │ n │
# │ --- │
# │ i64 │
# ╞═════╡
# │ 2 │
# │ 4 │
# └─────┘
df.sql("SELECT * FROM self WHERE n NOT IN (2, 4)")
# shape: (3, 1)
# ┌─────┐
# │ n │
# │ --- │
# │ i64 │
# ╞═════╡
# │ 1 │
# │ 3 │
# │ 5 │
# └─────┘
ALL: Compare a value against all values in a column expression.>, <, >=, <=.Example:
df = pl.DataFrame({"a": [1, 5, 10], "b": [3, 3, 3]})
df.sql("SELECT * FROM self WHERE a > ALL(b)")
# shape: (2, 2)
# ┌─────┬─────┐
# │ a ┆ b │
# │ --- ┆ --- │
# │ i64 ┆ i64 │
# ╞═════╪═════╡
# │ 5 ┆ 3 │
# │ 10 ┆ 3 │
# └─────┴─────┘
ANY: Compare a value against any value in a column expression.>, <, >=, <=, =, !=.Example:
df = pl.DataFrame({"a": [1, 5, 10], "b": [3, 7, 3]})
df.sql("SELECT * FROM self WHERE a < ANY(b)")
# shape: (2, 2)
# ┌─────┬─────┐
# │ a ┆ b │
# │ --- ┆ --- │
# │ i64 ┆ i64 │
# ╞═════╪═════╡
# │ 1 ┆ 3 │
# │ 5 ┆ 7 │
# └─────┴─────┘
Numeric#
Operator |
Description |
|---|---|
Addition and subtraction. |
|
Multiplication and division. |
|
Modulo (remainder). |
|
Unary plus and minus. |
+: Addition.-: Subtraction.Example:
df = pl.DataFrame({"a": [10, 20, 30], "b": [3, 7, 4]})
df.sql("SELECT a + b AS sum, a - b AS diff FROM self")
# shape: (3, 2)
# ┌─────┬──────┐
# │ sum ┆ diff │
# │ --- ┆ --- │
# │ i64 ┆ i64 │
# ╞═════╪══════╡
# │ 13 ┆ 7 │
# │ 27 ┆ 13 │
# │ 34 ┆ 26 │
# └─────┴──────┘
*: Multiplication./: Division.//: Integer division.Example:
df = pl.DataFrame({"a": [10, 20, 30], "b": [5.55, 4.25, 3.33]})
df.sql("""
SELECT
*,
a * b AS mul,
a / b AS div,
a // b AS int_div
FROM self
""")
# shape: (3, 2)
# ┌─────┬─────┐
# │ mul ┆ div │
# │ --- ┆ --- │
# │ i64 ┆ i64 │
# ╞═════╪═════╡
# │ 30 ┆ 3 │
# │ 140 ┆ 2 │
# │ 120 ┆ 7 │
# └─────┴─────┘
%: Returns the remainder of the division.
Example:
df = pl.DataFrame({"a": [10, 20, 30], "b": [3, 7, 4]})
df.sql("SELECT a % b AS mod FROM self")
# shape: (3, 1)
# ┌─────┐
# │ mod │
# │ --- │
# │ i64 │
# ╞═════╡
# │ 1 │
# │ 6 │
# │ 2 │
# └─────┘
+: Unary plus (no-op; returns the value unchanged).-: Unary minus (negates the value).Example:
df = pl.DataFrame({"n": [1, -2, 3]})
df.sql("SELECT -n AS neg FROM self")
# shape: (3, 1)
# ┌─────┐
# │ neg │
# │ --- │
# │ i64 │
# ╞═════╡
# │ -1 │
# │ 2 │
# │ -3 │
# └─────┘
Bitwise#
&: Performs a bitwise AND on two integer values.
Example:
df = pl.DataFrame({"a": [12, 10], "b": [10, 6]})
df.sql("SELECT a & b AS bw_and FROM self")
# shape: (2, 1)
# ┌────────┐
# │ bw_and │
# │ --- │
# │ i64 │
# ╞════════╡
# │ 8 │
# │ 2 │
# └────────┘
|: Performs a bitwise OR on two integer values.
Example:
df = pl.DataFrame({"a": [12, 10], "b": [10, 6]})
df.sql("SELECT a | b AS bw_or FROM self")
# shape: (2, 1)
# ┌───────┐
# │ bw_or │
# │ --- │
# │ i64 │
# ╞═══════╡
# │ 14 │
# │ 14 │
# └───────┘
XOR: Performs a bitwise exclusive OR on two integer values.
Example:
df = pl.DataFrame({"a": [12, 10], "b": [10, 6]})
df.sql("SELECT a XOR b AS bw_xor FROM self")
# shape: (2, 1)
# ┌────────┐
# │ bw_xor │
# │ --- │
# │ i64 │
# ╞════════╡
# │ 6 │
# │ 12 │
# └────────┘
String#
Operator |
Description |
|---|---|
Matches a SQL “LIKE” expression. |
|
Matches a SQL “ILIKE” expression (case-insensitive “LIKE”). |
|
Concatenate two string values. |
|
Check if a string starts with a given prefix. |
Note
The LIKE operators match on string patterns, using the following wildcards:
The percent sign
%represents zero or more characters.An underscore
_represents one character.
LIKE, ~~: Matches a SQL “LIKE” pattern.NOT LIKE, !~~: Does not match a SQL “LIKE” pattern.Example:
df = pl.DataFrame({"name": ["Alice", "Bob", "Charlie", "David"]})
df.sql("SELECT * FROM self WHERE name LIKE 'A%'")
# shape: (1, 1)
# ┌───────┐
# │ name │
# │ --- │
# │ str │
# ╞═══════╡
# │ Alice │
# └───────┘
df.sql("SELECT * FROM self WHERE name NOT LIKE '%o%'")
# shape: (3, 1)
# ┌─────────┐
# │ name │
# │ --- │
# │ str │
# ╞═════════╡
# │ Alice │
# │ Charlie │
# │ David │
# └─────────┘
Using the PostgreSQL-style operator syntax:
df.sql("SELECT * FROM self WHERE name ~~ '_o%'")
# shape: (1, 1)
# ┌──────┐
# │ name │
# │ --- │
# │ str │
# ╞══════╡
# │ Bob │
# └──────┘
ILIKE, ~~*: Matches a SQL “LIKE” pattern, case insensitively.NOT ILIKE, !~~*: Does not match a SQL “LIKE” pattern, case insensitively.Example:
df = pl.DataFrame({"name": ["Alice", "bob", "CHARLIE", "David"]})
df.sql("SELECT * FROM self WHERE name ILIKE '%LI%'")
# shape: (2, 1)
# ┌─────────┐
# │ name │
# │ --- │
# │ str │
# ╞═════════╡
# │ Alice │
# │ CHARLIE │
# └─────────┘
df.sql("SELECT * FROM self WHERE name NOT ILIKE 'b%'")
# shape: (3, 1)
# ┌─────────┐
# │ name │
# │ --- │
# │ str │
# ╞═════════╡
# │ Alice │
# │ CHARLIE │
# │ David │
# └─────────┘
||: Concatenate two or more string values.
Example:
df = pl.DataFrame({"first": ["Alice", "Bob"], "last": ["Smith", "Jones"]})
df.sql("SELECT first || ' ' || last AS full_name FROM self")
# shape: (2, 1)
# ┌─────────────┐
# │ full_name │
# │ --- │
# │ str │
# ╞═════════════╡
# │ Alice Smith │
# │ Bob Jones │
# └─────────────┘
Note
Non-string values are automatically cast to string when used with the || operator.
^@: Returns True if the first string value starts with the second string value.
Example:
df = pl.DataFrame({"name": ["Alice", "Bob", "Arthur", "David"]})
df.sql("SELECT * FROM self WHERE name ^@ 'A'")
# shape: (2, 1)
# ┌────────┐
# │ name │
# │ --- │
# │ str │
# ╞════════╡
# │ Alice │
# │ Arthur │
# └────────┘
Regex#
Operator |
Description |
|---|---|
Match a regular expression. |
|
Match a regular expression, case insensitively. |
|
Match a regular expression using keyword syntax. |
~: Match a regular expression.!~: Does not match a regular expression.Example:
df = pl.DataFrame({"idx": [0, 1, 2, 3], "lbl": ["foo", "bar", "baz", "zap"]})
df.sql("SELECT * FROM self WHERE lbl ~ '.*[aeiou]$'")
# shape: (1, 2)
# ┌─────┬─────┐
# │ idx ┆ lbl │
# │ --- ┆ --- │
# │ i64 ┆ str │
# ╞═════╪═════╡
# │ 0 ┆ foo │
# └─────┴─────┘
df.sql("SELECT * FROM self WHERE lbl !~ '^b[aeiou]'")
# shape: (2, 2)
# ┌─────┬─────┐
# │ idx ┆ lbl │
# │ --- ┆ --- │
# │ i64 ┆ str │
# ╞═════╪═════╡
# │ 0 ┆ foo │
# │ 3 ┆ zap │
# └─────┴─────┘
~*: Match a regular expression, case insensitively.!~*: Does not match a regular expression, case insensitively.Example:
df = pl.DataFrame({"idx": [0, 1, 2, 3], "lbl": ["FOO", "bar", "Baz", "ZAP"]})
df.sql("SELECT * FROM self WHERE lbl ~* '.*[aeiou]$'")
# shape: (1, 2)
# ┌─────┬─────┐
# │ idx ┆ lbl │
# │ --- ┆ --- │
# │ i64 ┆ str │
# ╞═════╪═════╡
# │ 0 ┆ FOO │
# └─────┴─────┘
df.sql("SELECT * FROM self WHERE lbl !~* '^b[aeiou]'")
# shape: (2, 2)
# ┌─────┬─────┐
# │ idx ┆ lbl │
# │ --- ┆ --- │
# │ i64 ┆ str │
# ╞═════╪═════╡
# │ 0 ┆ FOO │
# │ 3 ┆ ZAP │
# └─────┴─────┘
REGEXP, RLIKE: Match a regular expression using keyword syntax.NOT REGEXP, NOT RLIKE: Does not match a regular expression.These are equivalent to the ~ and !~ operators, but use keyword syntax
instead of symbolic operators.
Example:
df = pl.DataFrame({"idx": [0, 1, 2, 3], "lbl": ["foo", "bar", "baz", "zap"]})
df.sql("SELECT * FROM self WHERE lbl REGEXP '^b'")
# shape: (2, 2)
# ┌─────┬─────┐
# │ idx ┆ lbl │
# │ --- ┆ --- │
# │ i64 ┆ str │
# ╞═════╪═════╡
# │ 1 ┆ bar │
# │ 2 ┆ baz │
# └─────┴─────┘
df.sql("SELECT * FROM self WHERE lbl NOT RLIKE '[aeiou]$'")
# shape: (3, 2)
# ┌─────┬─────┐
# │ idx ┆ lbl │
# │ --- ┆ --- │
# │ i64 ┆ str │
# ╞═════╪═════╡
# │ 1 ┆ bar │
# │ 2 ┆ baz │
# │ 3 ┆ zap │
# └─────┴─────┘
Indexing#
Operator |
Description |
|---|---|
Access a struct field by name or index, returning the native type. |
|
Access a struct field by name or index, returning a string. |
|
Access a struct field by path, returning the native type or a string. |
|
Access an array element by index (1-indexed). |
->: Access a struct field by name or by ordinal index, returning the value in its native type.
Example:
df = pl.DataFrame(
{"data": [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]}
)
df.sql("SELECT data -> 'name' AS name, data -> 'age' AS age FROM self")
# shape: (2, 2)
# ┌───────┬─────┐
# │ name ┆ age │
# │ --- ┆ --- │
# │ str ┆ i64 │
# ╞═══════╪═════╡
# │ Alice ┆ 30 │
# │ Bob ┆ 25 │
# └───────┴─────┘
->>: Access a struct field by name or by ordinal index, returning the value cast to a string.
Example:
df = pl.DataFrame(
{"data": [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]}
)
df.sql("SELECT data ->> 'age' AS age FROM self")
# shape: (2, 1)
# ┌─────┐
# │ age │
# │ --- │
# │ str │
# ╞═════╡
# │ 30 │
# │ 25 │
# └─────┘
#>: Access a struct field by path, returning the native type.#>>: Access a struct field by path, returning a string.Example:
df = pl.DataFrame(
{"data": [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]}
)
df.sql("SELECT data #>> 'name' AS name FROM self")
# shape: (2, 1)
# ┌───────┐
# │ name │
# │ --- │
# │ str │
# ╞═══════╡
# │ Alice │
# │ Bob │
# └───────┘
[n]: Access an array element by index. SQL array indices are 1-based.
Example:
df = pl.DataFrame({"arr": [[10, 20, 30], [40, 50, 60]]})
df.sql("SELECT arr[1] AS first, arr[3] AS last FROM self")
# shape: (2, 2)
# ┌───────┬──────┐
# │ first ┆ last │
# │ --- ┆ --- │
# │ i64 ┆ i64 │
# ╞═══════╪══════╡
# │ 10 ┆ 30 │
# │ 40 ┆ 60 │
# └───────┴──────┘