Skip to content

REGEXP function

Estimated reading: 6 minutes 692 views
REGEXP_COUNT

Returns the number of times a regular expression matches a string.

Syntax: REGEXP_COUNT(string|categorical_column,categorical_expr, pattern, position, regexp_modifier )

Example: Finding the count of strings which contain ‘United.’

Example Queries:

  • Count occurrences of ‘High’ in the order priority column:
    REGEXP_COUNT([Retail].[Order Priority], ‘High’)
  • Count occurrences of ‘United’ in the region column:
    REGEXP_COUNT([Retail].[Region], ‘United’)
  • Count occurrences of ‘Tech’ in the product category column:
    REGEXP_COUNT([Retail].[Category], ‘Tech’)
  • Count occurrences of ‘Express’ in the ship mode column:
    REGEXP_COUNT([Retail].[Ship Mode], ‘Express’)
  • Count occurrences of ‘Discount’ in the order discount column (useful if it contains text values:
    REGEXP_COUNT([Retail].[Order Discount], ‘Discount’)
REGEXP_ILIKE

Performs a case-insensitive pattern-matching comparison.

Syntax – REGEXP_ILIKE(string|categorical_column,categorical_expr )

Example: Returning true if the string contains ‘I.’

Example Queries:

  • Check if the country starts with ‘I’, otherwise return the city:
    CASE WHEN REGEXP_ILIKE([Retail].[Country], ‘^I’) THEN [Retail].[Country]
    ELSE [Retail].[City]
    END 
  • Check if the customer name contains ‘Sharma’, otherwise return ‘Other Customer’:
    CASE WHEN REGEXP_ILIKE([Customer].[Customer Name], ‘Sharma’) THEN [Customer].[Customer Name] ELSE ‘Other Customer’
    END
  • Check if the product sub-category contains ‘Phone’, otherwise return ‘Other Product’:
    CASE WHEN REGEXP_ILIKE([Product].[Sub Category], ‘Phone’) THEN ‘Mobile Device’ 
    ELSE ‘Other Product’ 
    END
  • Check if the order priority contains ‘Critical’, otherwise return ‘Normal Order’:
    CASE WHEN REGEXP_ILIKE([Order].[Order Priority], ‘Critical’) THEN ‘Urgent Order’
    ELSE ‘Normal Order’
    END 
  • Check if the ship mode starts with ‘S’ (e.g., ‘Standard’, ‘Same Day’), otherwise return ‘Other Mode’:
    CASE WHEN REGEXP_ILIKE([Order].[Ship Mode], ‘^S’) THEN [Order].[Ship Mode]
    ELSE ‘Other Mode’
    END 
REGEXP_INSTR

It is used to find the position of the first occurrence of a regular expression pattern in a string.

Syntax: REGEXP_INSTR(string|categorical_column, categorical_expr)

Example: Returning true if the string matches.

Example Queries:

  • Return true if occurrences of ‘High’ in the order priority column:
    REGEXP_INSTR([Retail].[Order Priority], ‘High’)
  • Return true if occurrences of ‘United’ in the region column:
    REGEXP_INSTR([Retail].[Region], ‘United’)
  • Return true if occurrences of ‘Tech’ in the product category column:
    REGEXP_INSTR([Retail].[Category], ‘Tech’)
  • Return true if occurrences of ‘Express’ in the ship mode column:
    REGEXP_INSTR([Retail].[Ship Mode], ‘Express’)
  • Find the position of ‘Phone’ in the Sub Category column:
    REGEXP_INSTR([Product].[Sub Category], ‘Phone’)
REGEXP_LIKE

Performs conditional matching.

Syntax: REGEXP_LIKE(string|categorical_column,categorical_expr)

Example: It will give country name if it starts with ‘I’ else city name.

Example Queries:

  • Check if the country starts with ‘I’, otherwise return the city:
    CASE WHEN REGEXP_LIKE([Retail].[Country], ‘^I’) THEN [Retail].[Country]
    ELSE [Retail].[City]
    END 
  • Check if the customer name contains ‘Sharma’, otherwise return ‘Other Customer’:
    CASE WHEN REGEXP_LIKE([Customer].[Customer Name], ‘Sharma’) THEN [Customer].[Customer Name]
    ELSE ‘Other Customer’
    END
  • Check if the product sub-category contains ‘Phone’, otherwise return ‘Other Product’:
    CASE WHEN REGEXP_LIKE([Product].[Sub Category], ‘Phone’) THEN ‘Mobile Device’
    ELSE ‘Other Product’
    END
  • Check if the order priority contains ‘Critical’, otherwise return ‘Normal Order’:
    CASE WHEN REGEXP_LIKE([Order].[Order Priority], ‘Critical’) THEN ‘Urgent Order’
    ELSE ‘Normal Order’
    END 
  • Check if the ship mode starts with ‘S’ (e.g., ‘Standard’, ‘Same Day’), otherwise return ‘Other Mode’:
    CASE WHEN REGEXP_LIKE([Order].[Ship Mode], ‘^S’) THEN [Order].[Ship Mode]
    ELSE ‘Other Mode’
    END 
REGEXP_NOT_ILIKE

This query will return rows where the specified pattern does not match in a case-insensitive manner.

Syntax: REGEXP_NOT_ILIKE(string|categorical_column,categorical_expr)

Example: This will return true if ‘United’ is not present in the string.

Example Queries:

  • Check if Country does not start with ‘United’, then return State, else return City:
    CASE WHEN REGEXP_NOT_LIKE([Retail].[Country], ‘^United’) THEN [Retail].[State]
    ELSE [Retail].[City]
    END 
  • Check if Order Priority does not contain ‘High’, then return Order Date, else return Ship Date:
    CASE WHEN REGEXP_NOT_LIKE([Retail].[Order Priority], ‘High’) THEN [Retail].[Order Date]
    ELSE [Retail].[Ship Date]
    END 
  • Check if Customer Segment does not contain ‘Corporate’, then return Customer Name, else return Customer ID:
    CASE WHEN REGEXP_NOT_LIKE([Retail].[Customer Segment], ‘Corporate’) THEN [Retail].[Customer Name] ELSE [Retail].[Customer ID]
    END 
  • Check if Ship Mode does not start with ‘Standard’, then return Shipping Cost, else return Order Discount:
    CASE WHEN REGEXP_NOT_LIKE([Retail].[Ship Mode], ‘^Standard’) THEN [Retail].[Shipping Cost]
    ELSE [Retail].[Order Discount]
    END 
  • Check if Category does not contain ‘Technology’, then return Sub Category, else return Product:
    CASE WHEN REGEXP_NOT_LIKE([Retail].[Category], ‘Technology’) THEN [Retail].[Sub Category]
    ELSE [Retail].[Product] 
    END 
REGEXP_REPLACE

Replace all occurrences of a substring that match a regular expression with another substring.

Syntax: REGEXP_REPLACE(column, expression)

Example: Here we are replacing capital ‘I’ with small ‘i.’

Example Queries:

  • Replace the first letter ‘I’ with ‘i’ in the Country column:
    REGEXP_REPLACE([Retail].[Country], ‘^I’, ‘i’)
  • Replace the first occurrence of ‘New’ with ‘Old’ in the City column:
    REGEXP_REPLACE([Retail].[City], ‘^New’, ‘Old’)
  • Replace ‘Critical’ with ‘High Priority’ in the Order Priority column:
    REGEXP_REPLACE([Retail].[Order Priority], ‘Critical’, ‘High Priority’)
  • Replace the first occurrence of ‘Tech’ with ‘Technology’ in the Category column:
    REGEXP_REPLACE([Retail].[Category], ‘Tech’, ‘Technology’)
  • Replace ‘Online’ with ‘Digital’ in the Channel column:
    REGEXP_REPLACE([Retail].[Channel], ‘Online’, ‘Digital’)
REGEXP_SUBSTR

Returns the substring that matches a regular expression within a string. If no matches are found, this function returns NULL.

Syntax: REGEXP_SUBSTR(string|categorical_column,categorical_expr)

Example: If no matches are found this will return null.

Example Queries:

  • Extract the substring starting with ‘Ind’ from the Country column (e.g., ‘India’, ‘Indonesia’):
    REGEXP_SUBSTR([Retail].[Country], ‘^Ind’)
  • Extract the substring starting with ‘New’ from the City column (e.g., ‘New York’, ‘New Delhi’):
    REGEXP_SUBSTR([Retail].[City], ‘^New’)
  • Extract ‘Tech’ if it appears at the beginning of the Category column:
    REGEXP_SUBSTR([Retail].[Category], ‘^Tech’)
  • Extract ‘Express’ if it appears at the beginning of the Ship Mode column:
    REGEXP_SUBSTR([Retail].[Ship Mode], ‘^Express’)
  • Extract ‘Online’ if it appears at the beginning of the Channel column:
    REGEXP_SUBSTR([Retail].[Channel], ‘^Online’)

Leave a Reply

Your email address will not be published. Required fields are marked *

Share this Doc

REGEXP function

Or copy link

CONTENTS