Create SQL for Temp Table

Sometimes it is really useful to work with temp tables and to create them up front rather than using ‘Select Into’, but they can be a pain create for a couple of reasons:

  1. The temp table has loads of columns and you really don’t want to sit and type them all in.
  2. You are not aware of the structure of the source data.

The following stored procedure takes the hassle away. Merely select some data (use Select Top(0) for speed) into a temp table that has the desired structure and then execute the stored procedure below to generate the SQL will create the temp table.

Anyway, here’s the code.


Create Procedure dbo.GenerateTempTableSQL
    Given a temp table the sql required to create a new temp table of the same
    structure is generated.
    This is useful during development to select into a temp table and
    then generate the code that could be used to create the temp table.
        The name of the temp table to use as the template the generate the SQL
    @output_temp_table_name [optional: @input_temp_table_name]
        The name of the table that is used in the create table script. If
        left as null, then the @input_temp_table_name will be used.
    @database_collation [optional: current database collation]
        The collation to use for char based columns.
    16-Oct-2015 S.Wilber
    09-Nov-2015 S.Wilber
        Updated to include # in temp table name if it had been ommitted.
    14-Jul-2016 S.Wilber
        Updated to work from temp tables instead of permanent tables
    @input_temp_table_name sysname
   ,@output_temp_table_name sysname = null
   ,@database_collation sysname = null

-- declare constants

-- declare variables
Declare     @sql nvarchar(max)

-- table to store the columns
If Object_ID('tempdb..#temp') Is Not Null Drop Table #temp
Create Table #temp
    column_id int not null
   ,column_name nvarchar(130) null
   ,type_name nvarchar(130) not null
   ,max_length smallint not null
   ,scale tinyint not null
   ,[precision] tinyint not null
   ,is_nullable bit null
   ,collation_name sysname null

-- table to store the generated SQL
Declare     @output_sql table (line_id int identity(1, 1), sql nvarchar(max))
Declare     @object_id int

-- set the output name to be the input name if it is null or blank or whitespace
If NullIf(LTrim(RTrim(@output_temp_table_name)), '') Is Null Set @output_temp_table_name = @input_temp_table_name

-- set the collation if it's not set
If NullIf(LTrim(RTrim(@database_collation)), '') Is Null Set @database_collation = Cast(DatabasePropertyEx(db_name(), 'collation') As nvarchar(128))

-- add a # to the input temp table name if it does not already start with #
If Left(@input_temp_table_name, 1) != '#' Set @input_temp_table_name = '#' + @input_temp_table_name

-- add a # to the output temp table name if it does not already start with #
If Left(@output_temp_table_name, 1) != '#' Set @output_temp_table_name = '#' + @output_temp_table_name

-- get the object id
Select      @object_id = Object_ID('tempdb..' + @input_temp_table_name)

If @object_id Is Null
    RaisError('''%s'' cannot be found.', 11, 1, @input_temp_table_name)

-- generate the sql
Insert      @output_sql (sql)
Select      'If Object_ID(''tempdb..' + @output_temp_table_name + ''') Is Not Null Drop Table ' + @output_temp_table_name

Insert      @output_sql (sql)
Select      'Create Table ' + @output_temp_table_name

Insert      @output_sql (sql)
Select      '('

Insert      #temp (column_id, column_name, type_name, max_length, scale, precision, is_nullable, collation_name)
Select      c.column_id
           ,Case When CharIndex('|' + + '|', @reserved_words) != 0 Then '[' + + ']' Else End As column_name
           ,Case When CharIndex('|' + + '|', @reserved_words) != 0 Then '[' + + ']' Else End As type_name
From        tempdb.sys.objects o
Inner Join  tempdb.sys.schemas s On (s.schema_id = o.schema_id)
Inner Join  tempdb.sys.columns c On (o.object_id = c.object_id)
Inner Join  tempdb.sys.types t On (t.user_type_id = c.user_type_id)
Where       o.object_id = @object_id

Insert      @output_sql (sql)
Select      Case When column_id = 1 Then space(4) Else space(3) + ',' End
          + column_name + ' '
          + type_name
          + Case
                When Lower(type_name) In ('datetimeoffset', 'datetime2', 'time') Then
                    '(' + Cast(IsNull(c.scale, '') As varchar(10)) + ')'
                When Lower(type_name) In ('decimal', 'numeric') Then
                    '(' + Cast(IsNull(c.precision, '') As varchar(3)) + ', ' + Cast(IsNull(c.scale, '') As varchar(10)) + ')'
                When Lower(type_name) In ('varbinary', 'binary') Then
                  + Case
                        When c.max_length = -1 Then
                            Cast(c.max_length As varchar(10))
                  + ')'
                When Lower(type_name) In ('char', 'varchar', 'nchar', 'nvarchar') Then
                  + Case
                        When c.max_length = -1 Then
                        When type_name In ('nchar', 'nvarchar') Then
                            Cast(c.max_length / 2 As varchar(10))
                            Cast(IsNull(c.max_length, '') As varchar(10))
                  + ')'
                  + Case
                        When c.collation_name != @database_collation Then
                            ' Collate ' + c.collation_name
                        Else ''
          + Case
                When c.is_nullable = 0 Then
                    ' not null'
                    ' null'
From        #temp c
Order By    column_id

Insert      @output_sql (sql)
Select      ');'

Select      sql
From        @output_sql
Order By    line_id

T-SQL Strip Invalid Characters

Occasionally I need to strip invalid characters from a string and the requirements are often subtly different, sometimes we want numbers only, other times numbers and decimal points, and maybe only letters.

We’ve probably all had to do this at some point and probably grabbed something off the internet, but I finally took the time to write a more generic and performant version as listed below which allows for a valid character pattern to be passed in.

Create Function dbo.StripInvalidCharacters
    Strips invalid characters from a string. This is useful for cleaning up
    strings such as telephone numbers or human entered values such as
    '10.7 miles'
        The string to strip characters from.
        A search pattern that identifies invalid characters. Some example are:
        - '0-9'    - numbers excluding decimal point
        - '0-9.'   - numbers including decimal point
        - 'a-z'    - letters only
        - 'a-f0-9' - hexadecimal only
    10-May-2016 S.Wilber
    @input varchar(1000)
   ,@valid_character_pattern varchar(100)
Returns varchar(1000)
    While PatIndex('%[^' + @valid_character_pattern + ']%', @input) > 0
        Set @input = Stuff(@input,PatIndex('%[^' + @valid_character_pattern + ']%', @input), 1, '')     

    -- return
    Return @input

Feel free to copy and re-use. If you spot any improvements, please let me know and I’ll update this post.

Loading JSON List of Lists into PowerBI using PowerQuery

I was recently stuck trying to load a JSON source into PowerBI as the source itself appeared to be at its core a list of lists as per the below example.



This is actually an extract of the result from the following public API call to Kraken.

When trying to load this as a Web source it was initially presented as expected as shown below.



Drilling down on the ‘result’ Record produced the following where XETHZUSD was presented as a list. No problem so far.



Drilling down further on the XETHZUSD list produced the following, a list of lists.



There is not much that can be done at this point without transforming the list to a table, however this is only presented the list of lists as a column of lists as follows.



Expanding the columns does not create a result that we would be looking for as all the content from each individual list is merely expanded into the same list as shown below.



The secret is to add columns that reference specific items within the List stored in ‘Column1’.

To do this I added a new column with the following formula:



This statement returns the first item in the list stored in ‘Column1’.

This is great – it allows us to create our first column by dipping into the list. However, there is not a direct way to access individual elements of a list given their position.

Therefore to get the second item we need to skip over the first item. Thus, the formula for the second column is as follows:



This can then be repeated as required, e.g.






Once you have all the elements of the list required in their own separate columns the original list column, ‘Column1’ in our example here, can be removed. The final result is as shown below.



Categories: PowerBI, PowerQuery Tags: , , ,

Clear All Tables Within a Schema

August 11, 2015 Leave a comment

Often as part of an ETL process I want to simply clear out all the table within a schema, e.g. ‘landing’ or ‘etl’.

However, as new tables are added to the schema, the ‘clear out’ code needs to be maintained. In addition, if any referential integrity exists, the tables will need to be cleared in a certain order. Finally, the Truncate Table statement is significantly faster than a simple Delete, but there are restrictions on when it can be used.
All of which led me to create the following stored procedure so that I don’t have to think these concerns again, allowing me time to focus on the core of the ETL work.

Feel free to use/copy/amend as appropriate, but please let me have your comments/updates.

Create Procedure etl.ClearSchemaTables
    Clears down any tables in the target schema.
    Tables are cleared down in referential order using truncate where possible.
        The name of the schema whose tables should be cleared, e.g. 'landing'.
    03-Aug-2015 S.Wilber
    @target_schema_name sysname

Set         NoCount On

-- declare variables
Declare     @object_id int
           ,@table_name sysname
           ,@schema_name sysname
           ,@is_referenced bit
           ,@success bit
           ,@sql nvarchar(max)

If Object_ID('tempdb..#table_dependencies') Is Not Null Drop Table #table_dependencies
Create Table #table_dependencies
    source_object_id int
   ,dependancy_object_id int

-- declare tables
If Object_ID('tempdb..#tables') Is Not Null Drop Table #tables
Create Table #tables
    object_id int
   ,schema_name sysname
   ,table_name sysname
   ,is_referenced bit default(0)
   ,is_processed bit default(0)

-- find all tables in the schema - look for fk references that are not from self
Insert      #tables (object_id, schema_name, table_name, is_referenced)
Select      t.object_id
           ,Case When fk.object_id Is Not Null Then 1 Else 0 End
From        sys.tables t
Inner Join  sys.schemas s On (s.schema_id = t.schema_id)
Left Join   sys.foreign_keys fk On (fk.referenced_object_id = t.object_id and fk.parent_object_id != t.object_id)
Where = @target_schema_name

-- retrieve all interdependancies between batch managed tables.
Insert      #table_dependencies (source_object_id, dependancy_object_id)
Select      fk.parent_object_id
From        sys.foreign_keys fk
Inner Join  #tables s On (s.object_id = fk.parent_object_id)
Inner Join  #tables t On (t.object_id = fk.referenced_object_id)

-- loop through all the tables delete the content, either via a truncate or delete depending on foreign keys
While Exists (Select 1 From #tables Where is_processed = 0)
    -- find the first table where all of the dependant tables have been processed
    Select      @object_id = object_id
               ,@schema_name = schema_name
               ,@table_name = table_name
               ,@is_referenced = is_referenced
    From        #tables sbt
    Where       is_processed = 0
    And         Not Exists
                    Select      1
                    From        #table_dependencies td
                    Inner Join  #tables tbt On (tbt.object_id = td.source_object_id)
                    Where       td.dependancy_object_id = sbt.object_id
                    And         tbt.is_processed = 0

    -- if none found, then raise an error as it seems that we have a circular dependancy that we can't resolve
    If @object_id Is Null
        RaisError ('A table where all dependant tables have been processed cannot be found. This implies that circular dependencies exist. Batch rollback cannot continue.', 11, 0)

    Begin Try
        -- if referenced then we need to perform a delete otherwise we can perform a truncate
        If @is_referenced = 1 
            Set         @sql = 'Delete From [' + @schema_name + '].[' + @table_name + ']'
            Set         @sql = 'Truncate Table [' + @schema_name + '].[' + @table_name + ']'
        -- exeute the sql
        Exec        sp_executesql @stmt = @sql          
    End Try
    Begin Catch
        -- flag as unsuccessful
        Set         @success = 0

        -- declare variables
        Declare         @error_description varchar(2000)
                       ,@error_severity int
                       ,@error_state int

        -- fetch the error details
        Select         @error_description = ERROR_MESSAGE(), @error_severity = ERROR_SEVERITY(), @error_state = ERROR_STATE()

        -- throw the error
        RaisError (@error_description, @error_severity, @error_state) 
    End Catch
    -- update the table to state that that it has been processed
    Update      #tables
    Set         is_processed = 1
    Where       object_id = @object_id

Temporarily Disabling Referential Integrity Constraints on a SQL Table

I recently came across a situation where part of the data processing would put the data temporarily into a non-referential state. In order to accommodate this, without scripting Drop/Create statements which could get out of sync with other development work, I put together the following stored procedure which enables/disables check constraints.

Feel free to use/copy/amend as appropriate, but please let me have your comments/updates.

Create Procedure [etl].[SetCheckContraintsState]
    Enables/Disables any check contraints that reference the target table.
    This is useful for when referenced data is going to be temporarily modified
    in a way that will invalidate any check constraints.
    02-Jun-2015 S.Wilber
    @target_schema_name sysname
   ,@target_table_name sysname
   ,@do_set_enabled bit

-- declare variables
Declare     @constraint_name sysname
           ,@referencing_schema_name sysname
           ,@referencing_table_name sysname
           ,@sql nvarchar(max)

Declare @constraints table
    ConstraintName sysname
   ,ReferencingSchemaName sysname
   ,ReferencingTableName sysname

Insert      @constraints (ConstraintName, ReferencingSchemaName, ReferencingTableName)
Select As ConstraintName
           , As ReferencingSchemaName
           , As ReferencingTableName
From        sys.foreign_keys fk
Inner Join  sys.tables t On (t.object_id = fk.parent_object_id)
Inner Join  sys.schemas sch On (sch.schema_id = t.schema_id)
Where       fk.referenced_object_id = Object_ID(QuoteName(@target_schema_name) + '.' + QuoteName(@target_table_name))

Declare     constraint_cursor Cursor Fast_Forward For
Select      ConstraintName
From        @constraints

Open constraint_cursor
Fetch From constraint_cursor Into @constraint_name, @referencing_schema_name, @referencing_table_name

While @@Fetch_Status = 0
    If @do_set_enabled = 1
        Set @sql = 'Alter Table ' + QuoteName(@referencing_schema_name) + '.' + QuoteName(@referencing_table_name) + ' With Check Check Constraint ' + QuoteName(@constraint_name)
        Set @sql = 'Alter Table ' + QuoteName(@referencing_schema_name) + '.' + QuoteName(@referencing_table_name) + ' NoCheck Constraint ' + QuoteName(@constraint_name)

    Exec sp_executesql @sql

    -- get the next constraint
    Fetch From constraint_cursor Into @constraint_name, @referencing_schema_name, @referencing_table_name

Close constraint_cursor
Deallocate constraint_cursor

Code Reuse in SSIS – Passing Values To and From Child Packages

August 15, 2014 2 comments

I recently found myself having to replicate the same piece of script code in a SSIS package multiple times, which of course makes any developer feel dirty as copied code tends to drift divergently and not be maintained correctly, especially where the number of copies is beyond a manageable few.

Given restrictions of various clients that I’ve worked with I’ve had to avoid creating SSIS extensions or DLLs that would have to be deployed along with the package. This has meant that code reuse is always a little tricky. The tricky aspect is that in order to reuse a script in any meaningful way parameter values need to be passed in and return values received back. Thus, with plain SSIS this is difficult.

Or so I thought. I’ve now found that it is possible, and in fact easy to pass parameters down to child packages, via the well-known ‘Parent package variable’ configuration, but also to pass values back up to the parent package. Below I’ll walk through a simple example showing how this is done.

Firstly, create a parent package as shown below. We’ll fill in the details shortly.

Contents of Parent Package


The parent package will execute a child package two times in parallel to show that there are no issues with parallel running.

The Sequence Container labelled ‘Start’ is purely a placeholder and contains no sub-tasks.

The two parallel Sequence Containers each contain a call to the child package and a script task to message box out the result. Message boxes clearly are not the best thing in an SSIS package, but they suit our purposes for this demo.

Against each Sequence Container I set up two variables, one as a parameter value to pass to the child package and another to store the result coming back. To show that the parameter value passed down is consumed correctly each container is given a different value as shown in the screenshots below.

Sequence 1 Variables

Sequence 2 Variables


The code to show the message box containing the resultant value is shown below. Don’t forget to add ‘User::Result’ as a read only variable for the script task.

public void Main()
    // display result in a message box

    // return success as the task result
    Dts.TaskResult = (int)ScriptResults.Success;


Before we can set up the task to execute the child package we first need to create the child package.

The child package simply consists of two script tasks. The first to create a resultant value from the parent parameter value and the second to sleep for 5 seconds to show parallel running.

Child Package


In order to access the parent parameter value we need to set up package configuration on the child package. To do this, firstly create a variable on the child package to hold the parent value.

Child Package Variables


Secondly, right click anywhere on the background of the child package and select ‘Package Configurations…’.

Tick the check box labelled ‘Enable package configurations’.

Click ‘Add…’ to add a new configuration item which will pop up the Package Configuration Wizard as per the screenshot below.

Package Configuration Wizard Step 1


Select ‘Parent package variable’ as the configuration type.

For ‘Parent variable’ simply enter the name and not the full path of the parent variable, ‘ParameterValue’. By not entering the path, the variable is searched for initially in the scope of the task calling the child package and if not found, then upwards from there. Since we want to reuse the child task it is important that a full path is not specified so that we can pick up the value from wherever the child task is called from.

Click ‘Next >’.

Now we need to assign the parent value to a property within the child package. Select the ‘Value’ property of the variable we created earlier called ‘ParentParameterValue’ as per the screenshot below.

Package Configuration Wizard Step 2


Click ‘Next >’.

Finally, give the configuration a name as per the screenshot below. Again, I’ve chosen ‘ParentParameterValue’.

Package Configuration Wizard Step 3


Finally, click ‘Finish’ and then ‘Close’.

So now, when the parent package is executed, the variable values set in the parent package can be picked up in the child package. Let’s now see how we can pass result values back. It is surprisingly simple.

Edit the script task in the child package called ‘Set Result Value’. In the initial dialogue box we need to set up the variables that the script can access. Set them up as per the screenshot below, i.e. adding ‘User::ParentParameterValue’ as a read only variable and ‘User::Result’ as a read/write variable.

Child Script Task Configuration


But wait! What is this ‘User::Result’ variable. We’ve not set it up in the child package. The script task won’t complain. What this in fact will refer to, once the package is run, is the nearest variable of that name in scope, starting from the child package, but then bubbling up the search through to the parent package starting at the point where the child package is called and then onwards and upwards. One way to think about it is to imagine that a copy of the contents of the child package is in fact inserted into the parent package in place of the ‘Execute Child Package’ task.

Once you think of it like that, the next logical question is whether it is possible to read parent package variables without having to add in the ‘Parent Package Variable’ configuration, and it is. So long as no explicit paths are given a variable reference will bubble up from the scope of the child package through to the parent calling package just as if the child were in fact part of the parent.

So why do we need ‘Parent Package Variable’ configuration at all? It is needed because validation of the child package for nearly all task types will fail if the variable references don’t point to variables that the validation process can see, i.e. are in scope in the current package as packages are validated in their own right and not in the context of any calling package. For Script Tasks this variable validation is not carried out. Variables within the script itself are available through the Dts.Variables object which is an aggregation of all variables in scope at the time of execution going up from the child package up to and including the parent package. Dts.Variables also has a ‘Contains’ method to allow you to check for the existence of a variable before use.

So, back to the demo. The script itself is shown below and simply shows that the parent value is accessible and also adds in a message from the child.

public void Main()
    // set the result value to be the parent value plus some changes
    Dts.Variables["User::Result"].Value = "Result: '" + Dts.Variables["User::ParentParameterValue"].Value + "'. Child says hello.";

    // return success as the task result
    Dts.TaskResult = (int)ScriptResults.Success;


The script for the ‘Sleep for 5 Seconds’ task is as below. You will need to add a ‘using’ to ‘System.Threading’ at the top of the script.

public void Main()
    // set the result value to be the parent value plus some changes

    // return success as the task result
    Dts.TaskResult = (int)ScriptResults.Success;


Finally, we can return to the parent package and complete the details required in the ‘Execute Child Package’ tasks, as per the screenshot below.

Execute Child Task


Executing the parent package, following the 5 second sleep, results in the following two message boxes popping up simultaneously.

Result 1Result 2


From this we can see that both child tasks executed in parallel, both could access variables in the parent package and could set values in the parent package and that the parent package itself could see those changes.

Going back to the initial code reuse problem this means that script tasks can be moved into child packages and can pick up in scope variables values from the executing parent and return values back for further consumption within the parent package. Brilliant.


Categories: SSIS Tags: , , ,

Show Expression Logic in SSIS Task Links (Constraints)

June 16, 2014 1 comment

As part of my campaign to make logic discovery easier in SSIS for my future self I discovered by chance today a great little feature.
Often key logic can be build into the constraints (the red/green arrows) between tasks in the form of expressions. The logic is not visible without showing the properties of the constraint or by double clicking on it.


A simple SSIS package including some constraint expressions where the logic is not clear.

Often the logic for a package can be hidden in constraint expressions.


However in the properties for a constraint there is an item called “ShowAnnotation”. Set this to “ConstraintOptions” and the logic of the constraint is shown as an annotation next to the constraint. Brilliant!

There are a few other options, such as “ConstraintDescription” which can be useful to summarise the logic if it grows to long. An example is shown below.


Properties window in SSIS highlighting the 'ShowAnnotation' property.

Properties window in SSIS highlighting the ‘ShowAnnotation’ property.


Rather sadly the layout of the constraint annotation is not brilliant, so sometimes it overlaps other items when ‘Auto Layout’ is used. However, in allowing an understanding of process flow logic at a glance, it’s great.


A simple SSIS package including some constraint expressions where the logic is made visible via annotations.

The logic of the constraint expressions shown by setting ‘ShowAnnotation’ to ‘ConstraintOptions’.