« Data Whisperer: Detecting & Extracting Meaning from Changing Data Presentation at Philly .NET Code Camp 2012.2 | Main | Presenting at Philly.Net Code Camp 2011.2 Oct 15 »
Monday
Oct172011

Learn RegEx in T-SQL & kick as{2}|a[s][s]+

Thanks to all who attended the Learn RegEx in TSQL presentation at the Philly.Net Code Camp 2011.2.  Especially those who took the time to catch up with me throughout the day.

RegEx is a big topic and is hard to learn in only an hour and 20 minutes. I do hope that you got enough information to get yourself started in learning how to use RegEx in TSQL in your own environment.

All the links discussed in the presentation are included throughout this post. The PowerPoint was not included since it seemed kind of pointless without me also attached to the presentation. Some of the content was extracted and included in this post.

The Class!

Code Camp Heroes Bill and Rob

Code:

Down Load The code for the presentation.

The file 01 - installRegexAssembly.sql is my version of the create assembley and create of the RegEx functions in the [util] schema.

Goal:

Get you started using RegEx in TSQL

  • Load the toolbox with techniques 
  • Exposure to variations 
  • Under the hood knowledge

 

Warning

  •  There are many ways to write a Regular Expression!
  • RegEx isn’t inherently better then other solutions, even in TSQL.

 

What Are Regular Expressions?

Tools for matching and manipulating text RegEx pattern = as{2}

Will Match

  • ass
  • assume
  • passenger
  • class

Will NOT match

  • fast
  • cake
  • salad

 

Why use RegEx?

Over standard TSQL RegEx can:

  • Be faster
  • Be easier to code
  • Require fewer steps
  • Accomplish more complex tasks

 

What can RegEx be used for?

  • Validate Data Format 
  • Scrub Data 
  • Same as standard TSQL text functions but allow for more complex criteria 
    • Replace() 
    • LIKE ‘%’  --Technically not a function but you get the idea
    • PATINDEX() 
    • SUBSTRING()

 

How to use RegEx in TSQL?

 

Formula for RegEx in TSQL

SQL 2005 and up supports .Net CLR for User Defined Functions

.Net has built in base class library for RegEx: System.Text.RegularExpressions

Combine above two facts -> RegEx in TSQL

 

Now how do we actually use RegEx in TSQL?

Write the code for the .NET based UDFs ourselves 

OR

Use someone else's code!

http://www.simple-talk.com/sql/t-sql-programming/clr-assembly-regex-functions-for-sql-server-by-example/

The is an excellent article by Phil Factor over at Simple-talk.  There are a few parts to this article including the basics of writing .NET based functions in TSQL.  There is also a file that lets you jump start using RegEx in TSQL.

Keep in mind you will need to be running SQL 2005 or newer for this code to work.  It will also work on express versions of SQL server too.  The CLR will also need to be enabled for the database you want to install the RegEx functions.

 There are 4 files at the top of the article you can download:

  1. InstallRegex.sql
  2. installRegexAssembly.sql
  3. installRegexSample.sql
  4. RegexSQLCLR.vb

The fourth file is a text file with the VB code that gets compiled in to the DLL.  You do not need to do this unless you are curious and interested in the details.

To jump right in to using the TSQL UDFs in your data base just run the first to files.  The first one, despite its name creates the Assembly on the SQL db.  The second creates the functions in TSQL that are implemented by the code in the assembly.

NOTE: The code linked to above installs the functions in the [dbo] schema.  The version used in the presentation was modified to create a new Schema [util] and install the assembly and functions under that schema.

 

Overview of the RegEx Functions

The functions installed by the SQL scripts will create 9 funtions that will be useable in your TSQL code.  The break down as follows based on return type:

 

  • Boolean 
    • RegExIsMatch() 
  • Text Returning 
    • RegExReplace () 
    • RegExReplaceX ()
    • RegExMatch()
  • Table Valued 
    • RegExSplit() 
    • RegExMatches () 
  • Integer 
    • RegExIndex() 
  • RegEx Utility 
    • RegExEscape() 
    • RegExOptionEnumeration()

 

Tips for Learning RegEx patterns

Have a good cheat sheet. Here are some I've used

http://www.regular-expressions.info/reference.html

http://regexlib.com/CheatSheet.aspx

http://regexlib.com/DisplayPatterns.aspx This link is to a libray of expressions for matching common patterns.

References (1)

References allow you to track sources for this article, as well as articles that were written in response to this article.

Reader Comments

There are no comments for this journal entry. To create a new comment, use the form below.

PostPost a New Comment

Enter your information below to add a new comment.

My response is on my own website »
Author Email (optional):
Author URL (optional):
Post:
 
All HTML will be escaped. Hyperlinks will be created for URLs automatically.