Louis's profileThe SQL Doctor is In (Re...PhotosBlogListsMore Tools Help

Blog


    September 06

    Split table using views and instead of triggers

    I was playing around with instead of triggers to solve a problem with two tables being treated as one, and I came upon the following solution for making a view that behaves almost exactly like a single table.
     
    For example, I will use the following tables, both sharing the same primary key:
     
    CREATE TABLE tableNameA
    (
        tableNameId int PRIMARY KEY,
        valueA  varchar(10)
    )
    CREATE TABLE tableNameB
    (
        tableNameId int PRIMARY KEY,
        valueB  varchar(10)
    )
    GO
     
    Why would you do this?  I could see various reasons, none too common.  You might want to attach your table to a third party table.  For example, the tables needn''t be in the same database.  Or one of the tables might have a couple of varchar(max) values which are rarely used.  It might be valuable to keep one of the tables small.  (I was just answering a newsgroup message a while back, and the reader wanted to do something like this, so I popped this solution together.
     
    First, create a view that FULL OUTER joins the tables. A row does not need to exist on both sides to use this view. The FULL OUTER gets all rows from both tables, where there is a match.
     
    CREATE VIEW tableName
    AS
        SELECT  coalesce(tableNameA.tableNameId, tableNameB.tableNameId) AS tableNameId,
                tableNameA.valueA, tableNameB.valueB
        FROM    tableNameA
                    FULL OUTER JOIN tableNameB
                        ON tableNameA.tableNameId = tableNameB.tableNameId
    Now, I create an instead of trigger to insert the rows.  If you insert a duplicate in either side it will fail:
     
    CREATE TRIGGER tableName_insteadOFInsert
    ON tableName
    INSTEAD OF INSERT
    AS
    --note, add error handling for production worthy code
      BEGIN
        SET NOCOUNT ON
        INSERT INTO tableNameA(tableNameId,valueA)
        SELECT tableNameId, valueA
        FROM inserted
        WHERE valueA IS NOT null
     
        INSERT INTO tableNameB(tableNameId,valueB)
        SELECT tableNameId, valueB
        FROM inserted
        WHERE valueB IS NOT null
     END
    Not insert data, noting that you treat the primary key as if it is just a single column.
     
    INSERT INTO tableName(tableNameId, valueA, valueB)
    VALUES (1, NULL, '10')
    INSERT INTO tableName(tableNameId, valueA, valueB)
    VALUES (2, '20', '20')
    INSERT INTO tableName(tableNameId, valueA, valueB)
    VALUES (3, '30', NULL)

    Then viewing the data:
     
    SELECT *
    FROM tableName
    It looks just like a normal table:
     
    tableNameId valueA     valueB
    ----------- ---------- ----------
    1           NULL       10
    2           20         20
    3           30         NULL

    Then you can create an update trigger to manage changes.  It is more tricky, because you have to consider that one of the objects may or may not exist:
     
    CREATE TRIGGER tableName_insteadOFUpdate
    ON tableName
    INSTEAD OF UPDATE
    AS
      BEGIN
        SET NOCOUNT ON
      
        UPDATE tableNameA
        SET    valueA = inserted.valueA
        FROM   tableNameA
                  JOIN inserted
                      ON tableNameA.tableNameId = inserted.tableNameId
       
        --since we want this to behave like a real update, we have to add the related rows only if some row exists
        INSERT INTO tableNameA(tableNameId,valueA)
        SELECT tableNameId, valueA
        FROM inserted
        WHERE valueA IS NOT NULL
          and exists (select *
                      from   tableNameB
                      where  tableNameB.tableNameId = inserted.tableNameId)
          and not exists (select *
                          from   tableNameA
                          where  tableNameA.tableNameId = inserted.tableNameId)
                  
        UPDATE tableNameB
        SET    valueB = inserted.valueB
        FROM   tableNameB
                  JOIN inserted
                      ON tableNameB.tableNameId = inserted.tableNameId
        INSERT INTO tableNameB(tableNameId,valueB)
        SELECT tableNameId, valueB
        FROM inserted
        WHERE valueB IS NOT NULL
          and not exists (select *
                      from   tableNameB
                      where  tableNameB.tableNameId = inserted.tableNameId)
          and exists (select *
                          from   tableNameA
                          where  tableNameA.tableNameId = inserted.tableNameId)
     END
    Now I make a couple of changes to the data, to see that it works:
     
    UPDATE tableName
    SET    valueA = 'U10'
    WHERE  tableNameId = 1
     
    UPDATE tableName
    SET    valueA = 'U20',
             valueB = 'U21'
    WHERE  tableNameId = 2
     
    UPDATE tableName
    SET    valueB = 'U30'
    WHERE  tableNameId = 2
     
    UPDATE tableName
    SET    valueB = 'No new row'
    WHERE  tableNameId = 20
    Then you can see the changes here (including the fact that no changes were affected by the statement with the non-existant tableNameId value.
     
    SELECT *
    FROM tableName
     
    This returns:
     
    tableNameId valueA     valueB
    ----------- ---------- ----------
    1           U10        10
    2           U20        U30
    3           30         NULL
     
    Finally, we implement the delete:
     
    CREATE TRIGGER tableName_insteadOFDelete
    ON tableName
    INSTEAD OF DELETE
    AS
    --note, add error handling for production worthy code
      BEGIN
        SET NOCOUNT ON
        DELETE tableNameA
        FROM   tableNameA
                  JOIN deleted
                      ON tableNameA.tableNameId = deleted.tableNameId
        DELETE tableNameB
        FROM   tableNameB
                  JOIN deleted
                      ON tableNameB.tableNameId = deleted.tableNameId
     END
    Then test it out:
     
    --one row
    DELETE  tableName
    WHERE  tableNameId = 2
     
    --zero rows
    DELETE  tableName
    WHERE  tableNameId = 20

    Check that:
     
    SELECT *
    FROM tableName
     
    tableNameId valueA     valueB
    ----------- ---------- ----------
    1           U10        10
    3           30         NULL
     
    Now delete the rest:
     
    DELETE  tableName
    WHERE  tableNameId in (1,3)
     
    Is it useful?  Only if it is...  Which is pretty much why I write stuff like this.  I find that if I have thought of it, sooner or later it will come in handy.  And I haven't thought of much, that's for sure.
    August 17

    Dates to a person's next birthday

    The question was raised on the forums the other day about how to figure out how many days until the next birthday for a person.  It was a question that I had never answered before, and a technique that I am not sure why I had never had to employ before.  It seems so common really.

    So I put together a faux person table, with just an id and a birthdate (a bit Orwellian I suppose, but I was just feeling too lazy to phony up names, which then made me have at least two columns...so you get an id)

    drop table person
    go
    create table person
    (
        personId int,
        birthdate datetime
    )
    go
    insert into person
    select 1,'19661220'
    union all
    select 2,'19800514'
    union all
    select 3,'19670712'
    union all
    select 4,'19890203'
    union all
    select 5,'19600503'
    union all
    select 6,'19680304'
    union all
    select 7,'19680824'

    go

    The basics are that want to look for the next date, either in 2006 for October, November, etc, and in 2007 for Feb, April, etc. (note that the article is being published in August 2006).

    I use this expression take the birthdate and "move" it into this year:

    cast(year(getdate()) as char(4)) + right(convert(char(8),birthdate,112),4)

    and then see if it is less than today.  If so, I add one to the year and then datediff with today, else I just use today.  As is true with any function based solution in SQL, it is not pretty, and I might consider doing this in a CLR function if I had to do it more than once.

    select birthdate,  
            case when cast(year(getdate()) as char(4)) + right(convert(char(8),birthdate,112),4)
                                                                                                     < convert(char(8),getdate(),112)
                then datediff(day, convert(char(8),getdate(),112),
                                     cast(year(getdate()) + 1 as char(4)) + right(convert(char(8),birthdate,112),4))
            else
                datediff(day, convert(char(8),getdate(),112),cast(year(getdate()) as char(4)) +
                               right(convert(char(8),birthdate,112),4))
            end as dayToBirthday
    from person

    This returns:

    personId    birthdate               dayToBirthday
    ----------- ----------------------- -------------
    1           1966-12-20 00:00:00.000 125
    2           1980-05-14 00:00:00.000 270
    3           1967-07-12 00:00:00.000 329
    4           1989-02-03 00:00:00.000 170
    5           1960-05-03 00:00:00.000 259
    6           1968-03-04 00:00:00.000 199
    7           1968-08-24 00:00:00.000 7

    I cant think of an easier way, even using a date table.  Instinctively I wanted to do some sort of integer math with mod and div, but since the number of dates in a year varies, that wasn't possible. 

    Edit:  Got an email comment from Frank Kalis (he runs this website http://www.insidesql.de amongst other things) with the following solution:

    SELECT personID, birthdate
      , CASE
             WHEN DATEADD(YEAR, YEAR(GETDATE())- YEAR(birthdate), birthdate) <
                       DATEADD(DAY, 0, DATEDIFF(DAY, 0, GETDATE()))
             THEN DATEDIFF(DAY, GETDATE(), DATEADD(YEAR, 1, DATEADD(YEAR,
                       YEAR(GETDATE())- YEAR(birthdate), birthdate)))
             ELSE DATEDIFF(DAY, GETDATE(), DATEADD(YEAR, YEAR(GETDATE())-
                      YEAR(birthdate), birthdate)) END AS dayToBirthdate
    FROM person

    This is really cool, as it relies less on substrings (right is a special version of substring) and more on date math.  Two really interesting things.  First is yet another way to strip off the date from the current day from the date (and probably the best). 

    DATEADD(DAY, 0, DATEDIFF(DAY, 0, GETDATE()))

    This bit: select DATEDIFF(DAY, 0, GETDATE()) gets the number of days since the beginning of time, then the DATEADD turns it back into a datetime.  Then this:

    DATEADD(YEAR, YEAR(GETDATE())- YEAR(birthdate), birthdate)

    Sets the year to the current year.  Quite a nice little solution indeed.

    July 23

    Using @@rowcount in a trigger

    When writing a trigger, it is important to consider the case where no rows are modified by the statement involved.  Furthermore, knowing how many rows are affected can be useful in other circumstances as well.  In some validation routines, I will count the number of valid rows and the number of null rows and sum them up to see if they match the number of total rows affected by the trigger (a trick I learned at least from ERwin macros over the years.

    There are a couple of ways to get the number of rows affected, the most obvious being to count the number of rows in the inserted or deleted table (inserted for insert triggers, deleted for delete triggers, and either for an update trigger.  I rarely advocate having a trigger do more than one action, to avoid any code that does one action on a DELETE, and another on an UPDATE, and yet another on an INSERT.  It isn't a hard and fast rule by any means (and one that I sometimes break when I am building code by hand and not code generators :)

    Anyhow, back to rowcount.  The easy way to do this is to use the @@rowcount system function (or sometimes thought of as a global variable, thought it is really a function.)  The important thing to make sure is that no code that can change the value of @@rowcount is executed before fetching the value of @@rowcount (which will also change the value :)  Pretty much the only thing you can do is declare a variable to hold the value.

    So, consider the following table:

    create table test
    (
        testId  int primary key
    )

    And the following trigger, that simply takes the value, puts it in a varariable and the selects that value as a result set (not something typically done in a trigger, but I am doing it here for demonstration purposes.)

    create trigger testRowcount
    on test
    after insert
    as
      declare @rowcount int
      set @rowcount = @@rowcount

      select @rowcount

    Now, enter a row:

    insert into test(testId)
    select 1

    returns:


    -----------
    1

    (1 row(s) affected)


    (1 row(s) affected)

     

    Or for two rows:

    insert into test(testId)
    select 2
    union all
    select 3


    -----------
    2

    (1 row(s) affected)


    (2 row(s) affected)


    But, you think, I want to turn off those pesky row(s) affected messages for the trigger, so you make the following change:

    alter trigger testRowcount
    on test
    after insert
    as
      set nocount on

      declare @rowcount int
      set @rowcount = @@rowcount

      select @rowcount

    And try to enter the data again:

    insert into test(testId)
    select 4
    union all
    select 5

    Returns:

    -----------
    0

    (2 row(s) affected)

    This turned off the message alright, but now the value of @rowcount is 0.  Turns out, SET commands affect the rowcount, so you change it to:

    alter trigger testRowcount
    on test
    after insert
    as
      declare @rowcount int
      set @rowcount = @@rowcount

      set nocount on

      select @rowcount

    Executing this:

    insert into test(testId)
    select 6
    union all
    select 7

    Now returns:

    -----------
    2

    (2 row(s) affected)

    So now rows affected messages from the trigger are not returned, but those from your current connection are.  If you have written stored procedures, this probably feels wierd, as it is common practice to always start a proc with set nocount on.

    What you can do from here is use this value to simply quit the trigger if no data is modified (ie @rowcount = 0).  Of course if it is interesting to you that no rows were affected, you instead let it go, or log the fact that it occurs:

    alter trigger testRowcount
    on test
    after insert
    as
      declare @rowcount int
      set @rowcount = @@rowcount
      set nocount on

      if @rowcount = 0
         return

      select @rowcount

    Now if you insert 0 rows, the select statement will never be reached:

    insert into test
    select 100
    where  1=2

    Just returns:

    (0 row(s) affected)

    July 19

    Ambiguous column, just one table?

    We have all written queries that return a column list where you see the error:
     
    Msg 209, Level 16, State 1, Line 1
    Ambiguous column name 'columnName'.
     
    If you don't qualify the names of columns (because you are just writing a quick ad-hoc statement of course) then a couple of the tables use the same column name, it is normal enough.  
     
    But it can also crop up in query with only a single table.  For example, take this table:
     
    create table test
    (
        testId  int primary key,
        value   varchar(200),
        sortColumn varchar(10)
    )
     
    Load a bit of data:

    --the sortColumn value is opposite of the testId for demo reasons
    insert into test
    select 1, 'Value','b'
    union all
    select 2, 'Another value','a'
     
    And you run some query like this:
     
    select *
    from   test
    order  by sortColumn
     
    But the data is so long that you can't see the end of it:
     
    testId      value                                                        ...
    ----------- ----------------------------------------------------------------
    2           Another value                                                ...
    1           Value                                                        ...
    So you take the sort column and add it to the select clause:
     
    select  sortColumn, *
    from    test
    order by sortColumn
     
    Wa, wa, waaaa.  Error:
     
    Msg 209, Level 16, State 1, Line 1
    Ambiguous column name 'sortColumn'.
     
    What is sort of surprising is the fix.  Of course you can type out a proper column list (and in your production code, you really ought to,) but you can also use the positional syntax:
     
    select  sortColumn, *  --(or sortColumn, sortColumn, rather than the *)
    from    test
    order by 1
     
    Though the same caveats apply and you shouldn't do this in your production code if at all possible. However, the following works just fine, and should be what you would do to allow the rest of your code to execute unchanged.
     
    select  sortColumn, *
    from    test
    order by test.sortColumn
     
    At first, this is kind of surprising, but really it isn't. In a SELECT statement, the ORDER BY clause executes after the SELECT clause, so it sees two columns named sortColumn in the set.  You can see this by executing this statement:
     
    select  testId as sortColumn
    from    test
    order by sortColumn
    This will sort by the value of testId.  But change it to:
     
    select  testId as sortColumn
    from    test
    order by test.sortColumn
    And instead it virtually adds the test.sortColumn value to the SELECT set, sorts by it, and then discards the value.  Of course the best policy is to just go ahead and qualify your names in your SELECT statements, but that isn't always reasonable when writing ad-hoc statements.
    July 17

    Formatting a datediff

    First off, I had started this post before I saw this post: http://sqlservercode.blogspot.com/2006/07/three-ways-to-get-time-portion-of.html and mine is a bit different anyhow.  (For full disclosure, the blogger in question was the inspiration for part of this post.)
     
    Sometimes it is just advantageous to format a piece of data in SQL, if for no other reason than sometimes you might want to look at data in an administrative capacity.  A good place for that is formatting a date to show the amount of time between some events.
     
    So here is a basic table that might be used to record an event (obviously there would be more information, for that you will have to engage your imagination.)

    create table event
    (
        eventId    int primary key,
        startTime   datetime,
        finishTime  datetime
    )

    And some basic test data
    insert into event
    select 1, '2006-01-01T00:00:00.000', '2006-01-01T02:14:23.000'
    union all
    select 2, '2006-01-01T00:00:00.000', '2006-01-02T03:22:23.233'
    union all
    select 3, '2006-01-01T00:00:00.000', '2006-06-02T13:10:30.233'
    go
     
    As shown in the blog referred to above, you can use convert and 108 style to get the time part (or 114 if you want milliseconds).  For example, executing this:
     
    select convert(varchar(10),getdate(),108)
     
    Tonight, returns:
     
    23:32:22
     
    To get the event time, I use the followng query (well, parts of it anyhow.) 
     
    select eventId,
                    cast(hours as varchar(10)) + timeSeconds as secondsOnly,
                     cast(hours as varchar(10)) + timeMilliSeconds as inclMilliseconds,
                     cast(hours / 24 as varchar(10)) + ' day(s) ' +
                     cast(hours % 24 as varchar(10)) + timeSeconds as formatDays
    from   (    select eventId, datediff(hour,startTime, finishTime) as hours,
                       right(convert(varchar(20),finishTime-startTime,108),6) as timeSeconds,
                       right(convert(varchar(20),finishTime-startTime,114),10) as timeMilliSeconds
                from event) as eventTimesNeedsFormatting
    Returns: 

    eventId     secondsOnly      inclMilliseconds     formatDays
    ----------- ---------------- -------------------- ----------------------------------
    1           2:14:23          2:14:23:000          0 day(s) 2:14:23
    2           27:22:23         27:22:23:233         1 day(s) 3:22:23
    3           3661:10:30       3661:10:30:233       152 day(s) 13:10:30

     
    Breaking this down the derived table: gets the difference in hours, so I can determine days and hours difference, and then I do something that looks really odd, and that is subtract one time value from another. 
     
    If you execute:
     
    select finishTime - startTime
    from  event
     
    You will see that this returns the time difference offset from '1900-01-01 00:00:00.000':
     
    1900-01-01 02:14:23.000
    1900-01-02 03:22:23.233
    1900-06-02 13:10:30.233
    The time part is all we care about, so using right(), I peel off the values and plop onto the end of the N day(s) and or simply the hours.  Months is too dangerous an equation because of variable length months, but you could easily extend to weeks if you cared.
     
    What makes this significant (at least to me) is that when I first answered the question, I posited the following solution:
     
    select eventId, cast(timeSeconds / 3600 as varchar(10)) + ':'
                                          + right('0' + cast((timeseconds % 3600) /60 as varchar(2)),2) + ':' 
                                          + right('0' + cast((timeseconds % 3600) % 60 as varchar(2)),2) as elapsedTime
    from   (    select eventId, datediff(second,startTime, finishTime) as timeseconds
                    from event) as runnerTimesNeedsFormatting
     
    The whole taking seconds and turning into the bits of a result is a ghastly bore and really annoying.  And if you want to include milliseconds it is just too ugly.  A function could be written, likely in CLR if you need this done too many times, but as should be the supposition of any "format using SQL Server" topic, this is not for major production use.  Mostly this kind of thing should be for use when you as the dba need to look at some data and don't want to produce a complet application to do so.  Hmm, that sounds like a blog all on its own :)
     
    Edit: Had the formatting of seconds and milliseconds backwards.  Fixed
    July 13

    datalength() and len() (plus something you may not realize about len())

    Both are very similar, but each of them have a distinct purpose.

    Len is a user oriented function, meant to let you see the "visual" representation size of a column as it can be displayed.  I actually didn't realize this, but it will work on more datatypes than just character data, in fact the one datatype that it failed for me was text (but who cares about that now that 2005 has arrived?!)  This little fact is a good part of why I chose to write this today.

    On the other hand, datalength is more of a physical storage oriented function that tells you about the amount of storage required for the data. So if you have any fixed length columns, or unicode columns nvarchar, ntext, etc, it may give you "confusing" answers compared to what you probably want (I will demonstrate).  You may want to use len instead to get the number of characters in the character string.

    As an example, consider the following table with several data types represented:

    create table testDataLength
    (
        intCol int,
        decimalCol decimal(15,5),
        varcharCol varchar(10),
        nvarcharCol nvarchar(10),
        uniqueidentifierCol uniqueidentifier
    )

    And some sample data (using 12345 as a recurring theme :)

    insert into testDataLength
    select 12345,12.345,'12345','12345',
          
    '12345678-0123-5678-0123-567890123456'

    Now, I will run two selects, one using datalength, the other using len:

    select 'datalength' as operation,
           datalength(intCol) as '12345',
           datalength(decimalCol) as '12.345',
           datalength(varcharCol) as 'ascii-12345',
           datalength(nvarcharCol) as 'unicode-12345',
           datalength(uniqueidentifierCol)
                   
    as '12345678-0123-5678-0123-567890123456'
    from testDataLength
    union all
    select 'len' as operation,
           len(intCol),
           len(decimalCol),
           len(varcharCol),
           len(nvarcharCol),
           len(uniqueidentifierCol)
    from testDataLength

    This returns:

    operation   12345       12.345      ascii-12345 unicode-12345 <the guid>

    ----------  ----------- ----------- ----------- ------------- -----------------------

    datalength  4           5           5           10            16
    len         5           8           5           5             36

    First, note that the length of the decimal col says it is 8 characters long.  We only put 6 digits in, but run this query:

    select cast(intCol as varchar(10)) as intCol,
           cast(decimalCol as varchar(10)) as decimalCol
    from testDataLength

    And you will see that

    intCol     decimalCol
    ---------- ----------
    12345      12.34500

    As mentioned before, another concern with using datalength() is when using unicode data.  Going back to the data look at the to 5 character columns:

    select 'datalength' as operation,
           datalength(varcharCol) as 'ascii-12345',
           datalength(nvarcharCol) as 'unicode-12345'
    from testDataLength
    union all
    select 'len' as operation,
           len(varcharCol),
           len(nvarcharCol)
    from testDataLength

    The following is output:

    operation  ascii-12345 unicode-12345
    ---------- ----------- -------------
    datalength 5           10
    len        5           5

    The problem is, if you are used to using datalength to get the size of string columns, it will give misleading results for unicode columns, which most people in the US don't really make too much use of for the most part (not that it is unheard of, and it is becoming more and more common.  The system I am working with now using all unicode columns, but it is a product meant for international sale.

    So almost always what you will want to use is len() to get the visual representation size of the data, not the storage size.

    July 09

    Procedure to drop all relationships from a table (and a brief discussion on preventing injection attacks)

    Someone asked a while back how to drop a table and foreign key constraints in one statement, and while that is not possible, you can use a procedure to do this pretty easily.   I have had this proc around for a while as I actually create it before I drop a table and drop it afterwards in a tool that I have built in ERwin.  So I figured I would post it here too.
     
    The procedure works on a cursor, since I have to execute multiple statements. When I first posted the procedure this was the select clause of the procedure:
     
     select  'alter table ' + ctu.table_schema + '.' + ctu.table_name +
             ' drop constraint ' + cc.constraint_name
     
    This suffers from two major problems.  One, people can name objects using quoted names that might contain spaces, causing my code to crash. Second, Steve Kass (one of my favorite SQL Server minds because of his incredibly detailed mind) busted me on the fact that it would be possible for a hacker to name a foreign key constraint in such a way to do bad stuff to your table (like drop it.)  For example:
     
    create table test
    (
        testid  int constraint [PKTest] primary key,
        constraint AKTest unique (testId), --redundant constraint
        parentTestId int constraint [AKTest; select 'GOTCHA' drop table test RETURN]
                         references test(testId)
    )
     
    My code would produce the following, that will drop the constraint by dropping the table as well.
     
    exec ('alter table dbo.test drop constraint AKTest; drop table test RETURN')

    So I had to make proper use of the quotename function:
     
     select  'alter table ' + quotename(ctu.table_schema) + '.' + quotename(ctu.table_name) +
             ' drop constraint ' + quotename(cc.constraint_name)
     which changes the output to:
     
    alter table [dbo].[test] drop constraint [AKTest; drop table test RETURN]
     
    Which is perfectly safe (if a somewhat stupid name for a constraint.) The reason I make note of this is that I generally use quotename for all dynamic SQL.  However, I never even thought of doing it for names of objects.  I NEVER name objects with spaces or special characters (NEVER), and I just never even considered this kind of hack.  It just highlights the importants of thinking globally when posting code, rather than how you do things in your limited scope of your "home turf".  (This is the kind of learning experience that I was talking about in my article "Why answer questions".  Learning bits and pieces like this are a key part.)
     
    So, without further adeiu, here is the procedure:
     
    create  procedure utility$removeRelationships
    (
     @table_schema  sysname = 'dbo', --does not do a like comparison
     @parent_table_name sysname = '%', --it is the parent when it is being referred to
     @child_table_name sysname = '%', --it is the child table when it is the table referring
          --to another
     @constraint_name sysname = '%'  --can be used to drop only a single constraint
    ) as
    -- ----------------------------------------------------------------
    -- Drop all of the foreign key contraints on and or to a table
    -- ----------------------------------------------------------------
     begin
     set nocount on
     declare @statements cursor
     set @statements = cursor static for
     select  'alter table ' + quotename(ctu.table_schema) + '.' + quotename(ctu.table_name) +
             ' drop constraint ' + quotename(cc.constraint_name)
     from  information_schema.referential_constraints as cc
              join information_schema.constraint_table_usage as ctu
               on cc.constraint_catalog = ctu.constraint_catalog
                  and cc.constraint_schema = ctu.constraint_schema
                  and cc.constraint_name = ctu.constraint_name
     where   ctu.table_schema = @table_schema 
       and ctu.table_name like @child_table_name
       and cc.constraint_name like @constraint_name
       and   exists (select *
       from information_schema.constraint_table_usage ctu2
       where cc.unique_constraint_catalog = ctu2.constraint_catalog
          and  cc.unique_constraint_schema = ctu2.constraint_schema
          and  cc.unique_constraint_name = ctu2.constraint_name
          and ctu2.table_schema = @table_schema
          and ctu2.table_name like @parent_table_name)
     open @statements
     declare @statement nvarchar(1000)
     While  (1=1)
      begin
             fetch from @statements into @statement
                    if @@fetch_status <> 0
                     break
                    exec (@statement)
             end
     end
    go

    To test it, here are three related tables, with unnamed constraints (which really makes the procedure useful:


    create table parent
    (
        parentId    int primary key
    )
    create table child
    (
        childId     int  primary key,
        parentId    int  references parent(parentId)
    )
    create table grandChild
    (
        grandChildId int primary key,
        childId     int  references child(childId)
    )
    go

    If you try to drop the child table:

    drop table child

    No dice:

    Msg 3726, Level 16, State 1, Line 1
    Could not drop object 'child' because it is referenced by a FOREIGN KEY constraint.

    Instead:

     
    exec utility$removeRelationships @parent_table_name = 'child'
    drop table child
     
    This works (no matter what the constraints are named!)  Over the next few weeks I will post the other procs I have like this for check and default constraints.
     
    June 25

    Unique constraint, or unique index

    The question is often asked about the difference between an unique constraint, and an unique index.  Using them should be thought of as quite different tasks, thought there large similarities as well.

    A unique constraint should be used as an implementation tool primarily to enforce some uniqueness condition, in addition to the primary key (in other words, an alternate key.)  An unique index should be used to enhance performance.

    For example, if you use an identity surrogate for the primary key of a purchase order table, you might also have an purchase order number column that is printed on a paper form that is your users way of identifying the order.  For an order table, you might have a customer purchase order value on an order table that you would want to enforce uniqueness on for a customer. (If the customer purchase order value is optional and unique, you will have to employ a technique known as selective uniqueness handling.)

    My suggestion is that every table has at least one natural (at least non-surrogate) key enforced with a constraint. This is not always possible of course, but it is clearly true in 95-99% of all tables. 

    Unique indexes on the other hand should be used to improve performance.   So when you choose to index a column or set of columns, if the values in this index do turn out to contain unique values then use a unique index.  An obvious case would be when you include the keys of a unique constraint in your index (like an index on orderNumber, customerId.  orderNumber is unique itself because of a constraint, so this is a case where a unique index is called for.) Unique indexes are better than non-unique because the optimizer doesn't have to guess about the max number of rows that might be returned using the index, it is either 1 or 0.

    A key difference in usage should be that you might use a unique constraint on a column or set of columns that might never be searched on, strictly to enforce uniqueness.  On the other hand, a unique index that is never used should likely be dropped (and if not, it might just be a constraint.

    June 15

    Preventing user from modifying data in a column

    This question came up in the forums, so I wrote this to answer the question. 

    There are two ways to do this using a trigger (as well as using column level security, but that won't keep programmers from writing code to modify values!)

    First a bit of sample data:

    create table testTriggerNoUpdateOfColumns
    (                                 --no update of pk or this is a very hard problem
        testTriggerNoUpdateOfColumnsId  int NOT NULL PRIMARY KEY, 
        canChange varchar(10) NOT NULL,
        dontChange varchar(10) NOT NULL,
        dontChangeNullable varchar(10) NULL
    )
    go
    insert into testTriggerNoUpdateOfColumns(testTriggerNoUpdateOfColumnsId, canChange,
                                                                         dontChange, dontChangeNullable)
    values (1,'changeMe','don''t',null)
    insert into testTriggerNoUpdateOfColumns(testTriggerNoUpdateOfColumnsId, canChange,
                                                   dontChange, dontChangeNullable)
    values (2,'changeMe','don''t','not')

    The first method won't even let DML statements reference the data.  This is the most efficient method, but might  be too annoying if you have existing code (note this trigger is just barebones, you may need to add error handling depending on how your other code works with this):

    create trigger testTriggerNoUpdateOfColumns_afterUpdate
    on testTriggerNoUpdateOfColumns
    after update
    as
     begin
        --just see if the column is referenced by the DML statement
        if update(dontChange) or update(dontChangeNullable)
          begin
                raiserror ('You may not change the value of the dontChange or dontChangeNullable columns',16,1)
                rollback transaction
          end
      end
    go

    So this is allowed:

    update testTriggerNoUpdateOfColumns
    set    canChange = 'changed'

    But this fails:

    update testTriggerNoUpdateOfColumns
    set    dontChangeNullable = 'not'
    where  testTriggerNoUpdateOfColumnsId = 2

    With the error coded in the trigger

    Msg 50000, Level 16, State 1, Procedure testTriggerNoUpdateOfColumns_afterUpdate, Line 8
    You may not change the value of the dontChange or dontChangeNullable columns

    The second method allows DML statements to reference the columns, but it validates the data to see if it has changed.:

    alter trigger testTriggerNoUpdateOfColumns_afterUpdate
    on testTriggerNoUpdateOfColumns
    after update
    as
     begin
        --still see if the column is part of the DML
        if update(dontChange) or update(dontChangeNullable)
          begin             --join inserted and deleted tables, no need to touch real tables, so for reasonable sized sets
                                 --will be very fast.  If you update thousands of rows this could be trouble though
                                 --so leaving it out of the DML would be best in that case, or disable the trigger
                if exists ( select *
                                 from   deleted
                                                join inserted
                                                       on deleted.testTriggerNoUpdateOfColumnsId =
                                                                                          inserted.testTriggerNoUpdateOfColumnsId
                                         --not null columns are simple. 
                            where  inserted.dontChange <> deleted.dontChange
                                        --nullable columns require checking for mismatched nulls too
                               or  (inserted.dontChangeNullable <> deleted.dontChangeNullable
                                    or (inserted.dontChangeNullable is null and deleted.dontChangeNullable is not null)
                                    or (inserted.dontChangeNullable is not null and deleted.dontChangeNullable is null)))
                  begin
                        raiserror ('You may not change the value of the dontChange or dontChangeNullable columns',16,1)
                        rollback transaction
                  end
          end
      end

    So this allowed

    update testTriggerNoUpdateOfColumns
    set    canChange = 'changed'

    As well as this is also now allowed, since the data doesn't change

    update testTriggerNoUpdateOfColumns
    set    dontChange = 'don''t'
    where  testTriggerNoUpdateOfColumnsId = 1

    Changing the value will however fail

    update testTriggerNoUpdateOfColumns
    set    dontChange = 'do'
    where  testTriggerNoUpdateOfColumnsId = 1

    Same error

    Msg 50000, Level 16, State 1, Procedure testTriggerNoUpdateOfColumns_afterUpdate, Line 16
    You may not change the value of the dontChange or dontChangeNullable columns

    --the next statements shows that the NULL business works:

    update testTriggerNoUpdateOfColumns
    set    dontChangeNullable = null
    where  testTriggerNoUpdateOfColumnsId = 1

    update testTriggerNoUpdateOfColumns
    set    dontChangeNullable = 'notnull'
    where  testTriggerNoUpdateOfColumnsId = 1

    Msg 50000, Level 16, State 1, Procedure testTriggerNoUpdateOfColumns_afterUpdate, Line 16
    You may not change the value of the dontChange or dontChangeNullable columns

    update testTriggerNoUpdateOfColumns
    set    dontChangeNullable = 'not'
    where  testTriggerNoUpdateOfColumnsId = 2

    update testTriggerNoUpdateOfColumns
    set    dontChangeNullable = null
    where  testTriggerNoUpdateOfColumnsId = 2

    Msg 50000, Level 16, State 1, Procedure testTriggerNoUpdateOfColumns_afterUpdate, Line 16
    You may not change the value of the dontChange or dontChangeNullable columns

     

     

    May 10

    Queries to get filegroup of table/indexes

    Yesterday, when I was looking at disk space utilization for a server, I had a need to see what objects were placed on what filegroups, and these queries were the outcome.  The queries should be pretty self explanatory, but if you want more info, email me or leave a comment and I will add comments.
     
    --2005
    select  case when indexes.type_desc in ('HEAP','CLUSTERED')
                           then 'Table-' + indexes.type_desc
                else '    NC Index' end as indexType,
     
               rtrim(cast(
                           case when indexProperty(objects.object_id,indexes.name,'IsUnique') = 1
                                       then 'unique ' else '' end +
                           case when isNull(objectProperty(object_id(schemas.name + '.'
                                                                                           + indexes.name),'IsConstraint'),0) = 1 
                                      then 'constraint ' else '' end + 
                           case when indexProperty(objects.object_id,indexes.name,'IsAutoStatistics') = 1
                                      then 'auto ' else '' end +
                           case when indexProperty(objects.object_id,indexes.name,'IsStatistics') = 1
                                      then 'statistics ' else '' end +
                           case when indexProperty(objects.object_id,indexes.name,'IsHypothetical') = 1
                                      then 'hypothetical ' else '' end
                  as varchar(30))) as indexProperties,
              schemas.name + '.' + objects.name as tableName,
              coalesce(indexes.name,'') as indexName,
              filegroups.name as filegroup
    from    sys.indexes as indexes
                 join sys.objects
                      on indexes.object_id = objects.object_id
                 join sys.schemas
                      on objects.schema_id = schemas.schema_id
                 join sys.filegroups as filegroups
                      on indexes.data_space_id = filegroups.data_space_id
    where  objectproperty(indexes.object_id,'IsMSShipped') = 0
    order by tableName, case when indexes.type_desc in ('HEAP','CLUSTERED') then 0 else 1 end
     
    --2000 (Assumes that owner is DBO)
    select case when indid = 0 then 'Table-Heap'
                       when indid = 1 then 'Table-Clustered'
                        else '          NC Index'
               end as objectType,
              rtrim(cast(
                              case when indexProperty(sysindexes.id,sysindexes.name,'IsUnique') = 1
                                         then 'unique ' else '' end +
                              case when isNull(objectProperty(object_id(sysindexes.name),'IsConstraint'),0) = 1 
                                         then 'constraint ' else '' end + 
                              case when indexProperty(sysindexes.id,sysindexes.name,'IsAutoStatistics') = 1
                                        then 'auto ' else '' end +
                              case when indexProperty(sysindexes.id,sysindexes.name,'IsStatistics') = 1
                                        then 'statistics ' else '' end +
                              case when indexProperty(sysindexes.id,sysindexes.name,'IsHypothetical') = 1
                                        then 'hypothetical ' else '' end
                as varchar(30))) as indexProperties,
                object_name(sysindexes.id) as tableName,
                coalesce(sysindexes.name,'') as indexName,
                sysfilegroups.groupname as filegroup
    from   sysindexes
               join sysfilegroups
                    on sysindexes.groupId = sysfilegroups.groupId
    where  objectproperty(sysindexes.id,'IsMSShipped') = 0
    order by object_name(sysindexes.id),indid

    Both queries return something like:
     
    indexType        indexProperties      tableName      indexName       fileGroup
    ---------------- -------------------  -------------- --------------- -----------
    Table-CLUSTERED  unique constraint    schema.table   PKschema_table  PRIMARY
        NC Index     unique constraint    schema.table   AKschema_table  PRIMARY
     
     
    May 02

    A scripting idea to allow scripting ALTERs instead of DROP/CREATE pairs...

    I was writing some procedures today, and getting kind of annoyed that I had to keep changing from CREATE PROCEDURE to ALTER PROCEDURE in my database create scripts, or DROP the procedure first, losing the original create date, and when it was altered, and I came up with this as a possible way around that.
     
    Instead of looking for the procedure's existance to drop it, I look for the procedure's existance and if it hasn't been created yet I create a "dummy" procedure.   Then I use ALTER to modify the procedure, so it won't lose any permissions that some other user has created, plus the date it was first created, and in 2005, the date it was modified is also maintained:
     
    if object_id('procedureName') is null
        exec ('create procedure procedureName as select ''replace me''')
    go
    alter procedure procedureName
    as
        select 'this is the real thing!'
    go
     
    Edit:  As commenter Scott notes, we are generaly more interested in when the procedure was modified, not when it was created.  In the 2005 metadata, you get a modified date as well as a create_date:
    After running this script a few times while writing this post, the create date stays constant, but the modify_date changes:
     
    select cast(name as varchar(30)) as name, create_date, modify_date
    from   sys.procedures
     
    This returns (for me):
     
    name                           create_date             modify_date
    ------------------------------ ----------------------- -----------------------
    procedureName                  2006-05-03 08:42:27.780 2006-05-03 08:48:11.480
    For 2005, It might have a downside, but I haven't thought of one yet (this clearly doesn't mean there isn't one :)  I probably wouldn't use this in 2000 and earlier.
    April 25

    PARSENAME and IP Addresses

    I had never heard of this function until the other day when Steve Kass (check out his SQL page too.  Steve has a very detailed and awesome mind) and Denis The SQL Menace, (who has a very nice SQL blog too.)
     
    Anyhow, a person asked in the forums about how to sort by an IP address:
     
    So, I took this data:
     
    create table ipAddress
    (
        ipaddress nvarchar(18)
    )
    insert into ipAddress
    select '0.0.18.1'
    union all
    select '0.1.1.2'
    union all
    select '0.0.0.1'
    And created this query, which has two derived tables to "peel" off the parts of the ip address:

    select oct1, oct2, substring(ipAddress,1,charindex('.',ipaddress) - 1) as oct3,
              substring(ipAddress,charindex('.',ipaddress) + 1,len(ipAddress)) as oct4
    from   (select oct1, substring(ipAddress,1,charindex('.',ipaddress) - 1) as oct2,
               substring(ipAddress,charindex('.',ipaddress) + 1,len(ipAddress)) as ipaddress
               from ( select substring(ipAddress,1,charindex('.',ipaddress) - 1) as oct1,
                                    substring(ipAddress,charindex('.',ipaddress) + 1,len(ipAddress)) as ipaddress
                         from ipAddress) as derivedTable1) as derivedTable2
    order by 1,2,3,4
    Which was a fun query, yet htere were better ways to do it. Steve and Denis noted that:
     
    select cast(parsename(ipaddress,1) as tinyint),
              cast(parsename(ipaddress,2) as tinyint),
              cast(parsename(ipaddress,3) as tinyint),
              cast(parsename(ipaddress,4) as tinyint)
    from ipaddress
    order by cast(parsename(ipaddress,1) as tinyint),
                 cast(parsename(ipaddress,2) as tinyint),
                 cast(parsename(ipaddress,3) as tinyint),
                 cast(parsename(ipaddress,4) as tinyint)
     
    To which I replied, "parse-who-now?"  PARSENAME is a built-in function that was created for parsing object names, like 'servername.databasename.schemaname.objectname'.  Because of this it only will get the first four values, seperated by periods, which just happens to be how ipAddresses are constructed (in case you didn't know :)
     
    A few additional articles on using PARSENAME:
     
     
    Of course, by the looks of this search, I am the last person on the planet to hear of PARSENAME:
     
     
    The thing is, this is just not the best way to store an ip address.  I generally prefer to store ipAddresses as a bigint, or another great way is as four tinyint values.   Check this article for a more complete reference: http://www.aspfaq.com/show.asp?id=2450
    April 17

    You may have a normalization problem if...you have multi purpose character columns

    If you find yourself using substring in a substantial number of queries on the data in your database, you may have a normalization problem. And this means any functionality that uses a substring-like interface (replace, stuff, charindex, patindex, etc). Truly there is very little use for these functions in a well normalized system.
     
    This article will discus the first of the types of issues, the multi-purpose character column.
    The first kind of example is something like a name column that contains a full persons name.  Seems reasonable so you can put ever how many parts of the name in there.  So you code something like:
     
    SELECT <columns>
    FROM   contact
    WHERE  rtrim(substring(fullName,charindex(' ',fullName),len(fullname))) = @lastName
    or

    WHERE  fullName like '%' + @lastName
     
    Or some variation on this theme. Seems innocent enough, but now the SQL engine cannot effectively act on this query effectively using an index to perform the search. And what if the person is a 'Jr' like John Smith Jr. Now you are really in a mess trying to substring your way out of a mess.
     
    Take this to the next level what about updating this value, say to change a persons last name. You are really hosed and are stuck writing ridiculous code to ignore Junior, Jr and anything else that looks like a suffix, and dealing with people with cool names to look at and pronounce, but a nightmare to code against, like 'Van Der Mullen' or 'Mac Donald'   Just try to write the query that decides if this is this a guy named Mac, last name Donald, or perhaps the data entry person screwed up Mrs Delores Mac Donalds name?
     
    It is a very common type of mistake that every one of us has made at one time or another because one column seems easier than two (or three, or whatever).
     
    The rule of thumb here is that if you will "likely" be dealing with part of a column seperately, give it a different place for storage.  You can deal with usage in a more elegant manner by building tools in the client layer, or you can use a view or computed column.  I often use a computed column when I have this kind of need.  For example, take the following abbreviated name table (no primary key, and only the very basic name bits.)
     
    create table nameExample
    (
        firstName varchar(30) not null,
        middleName varchar(30) null,
        lastName varchar(30) not null,
        fullname as firstName + coalesce(' ' + middleName,'') + ' '+ lastName,
        formalName as lastName + ', ' + firstName + coalesce(' ' + middleName,'')
    )
     
    I have included two versions of the name that will allow you to display the data for the user, as well as the three parts of the name (very common in the United States to record these three names)
     
    insert into nameExample(firstName, middleName, lastName)
    select 'John',null,'Smith'
    union all
    select 'Bob','G','Leonard'
    union all
    select 'Fred','Allen','Stein'
     
    Now view the data:
      
    select * from nameExample
     
    firstName  middleName  lastName  fullName          formalName
    ---------- ----------- --------- ---------------   ----------------
    John       NULL        Smith     John Smith        Smith, John
    Bob        G           Leonard   Bob G Leonard     Leonard, Bob G
    Fred       Allen       Stein     Fred Allen Stein  Stein, Fred Allen
     
    The next question that this data puts to mind is "what about the middle initial?"  Should there be a column for just the middle initial?  And what of the case where Bob Leonard only gave a single character response the question of middleName (something I frequently do when I don't feel like sharing!) I would most likely say add a computed column that gets the substring of the first character of the middleName, since the middle initial is functionally dependent on the middleName column, even if the user only put in one character.
     
    alter table nameExample
      add middleInitial as substring(middleName,1,1)
     
    Now you can persist this data by indexing it, or in SQL Server 2005 using the PERSIST keyword for faster access.  The key here is that you have presented the least surface space to be edited, and you will rarely if ever need, in SQL code, to look for parts of a name.  You might need to look for people whose last name starts with 'F', or the such, but it would be rare to look for values that end in 's', but you still have the SQL tools to go there if the user wants to.

    The next in this series will discuss bitmask columns, followed by multi-valued columns (such as '1,2,3')
    April 14

    Kicking users out of your database, now!

    For many years, I had a technique that more or less worked.  Declare a cursor on sysprocesses (or in 2005, sys.dm_exec_requests) and kill each user individually.  Then go back and make sure that they are all dead, or try again.  Not a perfect system, but it does relatively work.
     
    alter database <databaseName>
           set single_user with rollback immediate --(or rollback after <seconds>)
     
    This is really handy, especially when you drop and recreate a database multiple times during testing scripts.  Often I will end up with five connections to the database and when I try to drop the database, this message appears:
     
    Msg 3702, Level 16, State 4, Line 2
    Cannot drop database "<databaseName>" because it is currently in use.
     
    The cost of this is that other connections to the server for the affected database will be immediately severed, so the first time you try something you will have to reconnect.  In SSMS you will likely get this error the first time you try something since the tools don't realize that they have been cut off:
     
    Msg 233, Level 20, State 0, Line 0
    A transport-level error has occurred when sending the request to the server. (provider: Shared Memory Provider, error: 0 - No process is on the other end of the pipe.)

    But the second time will automatically reconnect.
    March 14

    How many concurrent connections can MSDE have?

    If you answered 5 (or even thought it) you would do yourself a favor to go read Euan Garden's SQL Mythbuster blog MSDE/SQL Express has 5 concurrent user limit.  Excellent history lesson, and good clarification on the limits (and lack of the limits you may have thought existed) in the free versions of SQL Server.
    February 26

    Change table collations en masse.

    This question came up today in the newsgroups, so I answered it there, but it is probably something that might be interesting to more than one or two people, so I expanded it here for "future use." 

    Say you have the following database where the collation was case sensitive and you want to change this to case insensitive:

    create database testCollationChange
    collate
    SQL_Latin1_General_CP1_CS_AS
    go

    use testCollationChange
    go

    Then check out the collation to make sure:

    select databasepropertyex(db_name(),'collation') as collation_name
    go

    collation_name
    ---------------------------------
    SQL_Latin1_General_CP1_CS_AS

    Then create a table:

    create table test
    (
          value varchar(20
    ),
          textValue text
    ,
          varchar_max varchar(max
    )
    )

    And check the collation of the columns:

    select column_name, collation_name
    from
    INFORMATION_SCHEMA.COLUMNS
    where table_name = 'test'

    This returns:

    column_name    collation_name

    ---------------     ----------------------------------

    value       SQL_Latin1_General_CP1_CS_AS
    textValue   SQL_Latin1_General_CP1_CS_AS
    varchar_max SQL_Latin1_General_CP1_CS_AS

    Now you decide to change the collation of the database to case insensitive, just alter the database:

    alter database testCollationChange
    collate SQL_Latin1_General_CP1_CI_AS

    go

    select databasepropertyex(db_name(),'collation') as collation_name

    This returns:

    collation_name
    --------------------------------
    SQL_Latin1_General_CP1_CI_AS

    But the tables stay in the previous collation:

    select column_name, collation_name
    from INFORMATION_SCHEMA.COLUMNS

    where table_name = 'test'

    Which returns:

    column_name collation_name
    ----------- ----------------------------------
    value      
    SQL_Latin1_General_CP1_CS_AS
    textValue   SQL_Latin1_General_CP1_CS_AS
    varchar_max SQL_Latin1_General_CP1_CS_AS

    To change this, I will use a query that I Adapted from Justin Mallier's blog: (http://mallier.blogspot.com/2005/11/how-to-change-collation-against-all.html) that builds a script to alter the columns:

    --I added the variables to make it more safe (Some collations may be desired
    --to stay the same. I did use like on the from so you could change them all pretty
    --easily. Also I added text and max datatype support

    declare  @fromCollation sysname,
             
    @toCollation sysname

    set      @fromCollation = 'SQL_Latin1_General_CP1_CS_AS'
    set      @toCollation = 'SQL_Latin1_General_CP1_CI_AS'

    SELECT  

    'ALTER TABLE ' + TABLE_NAME +
          
    '   ALTER COLUMN ' + COLUMN_NAME + ' '
    + DATA_TYPE +
          
    CASE
    WHEN CHARACTER_MAXIMUM_LENGTH = -1 then '(max)'
                WHEN 
    DATA_TYPE in ('text','ntext') then ''
               
    WHEN CHARACTER_MAXIMUM_LENGTH IS NOT NULL 
                 THEN '('+(CONVERT(VARCHAR,CHARACTER_MAXIMUM_LENGTH)+')' )
               
    ELSE isnull(CONVERT(VARCHAR,CHARACTER_MAXIMUM_LENGTH),' ')
    END
          
    +' COLLATE ' + @toCollation+ ' ' + CASE
    IS_NULLABLE
                                               WHEN 'YES' THEN 'NULL'
                                               WHEN 'No' THEN 'NOT NULL'

    END
    FROM
    INFORMATION_SCHEMA.COLUMNS
    WHERE
    DATA_TYPE IN ('varchar' ,'char','nvarchar','nchar','text','ntext')
      and COLLATION_NAME like @fromCollation

    This is the result. Execute this query and you can change all of the collations.

    ALTER TABLE test ALTER COLUMN value varchar(20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
    ALTER TABLE test ALTER COLUMN textValue text COLLATE SQL_Latin1_General_CP1_CI_AS NULL
    ALTER TABLE test ALTER COLUMN varchar_max varchar(max) COLLATE SQL_Latin1_General_CP1_CI_AS NULL

    go

    Checking them again:

    select column_name, collation_name
    from
    INFORMATION_SCHEMA.COLUMNS
    where table_name = 'test'

    Returns:

    column_name collation_name
    ----------- ----------------------------------
    value      
    SQL_Latin1_General_CP1_CI_AS
    textValue   SQL_Latin1_General_CP1_CI_AS
    varchar_max SQL_Latin1_General_CP1_CI_AS

     

    Then clean up after yourself

    use tempdb

    go

    drop database testCollationChange

     

    A couple of gotchas prompted by Adam Machanic's comments

    1. If you use user defined types, this technique won't work.  It only works on base datatypes.  As Adam Machanic commented, in these cases use DTS to take the data out of one database and into another.

    2. You will have to drop indexes, constraints, etc that are dependent on the columns being altered. 

     

    February 17

    Default Cursor setting

    It is not often that I miss such a useful feature completely, but I admit to you here, I have.  This setting existed in 2000 and in 2005, at least, but I never even noticed.  This could be because I have always regarded cursors as more evil than listenting to the entire "Jessica Simpson Sings Opera Classics" box set.  (no, there is not really one, but wouldn't that make a great torture method?)

    The main thing that I have stressed over the years is that if you HAD to use a cursor, make sure to declare them as LOCAL:

    DECLARE evilThing CURSOR LOCAL ...

    I still stand by that.  You should rarely assume anything when you can make sure that the proper setting is set regardless of a setting.  However, I do prefer to set up thing such that when someone, ok, when I get careless, particularly in testing the stricter/safer thing occurs.  Like the ANSI_NULL_DFLT_OF setting, so that columns are NOT NULL unless otherwise specified.

    Part of the ALTER DATABASE command allows the setting of the cursor default:

    CURSOR_DEFAULT LOCAL | GLOBAL

    Controls whether cursor scope defaults to LOCAL or GLOBAL.

    Perhaps you have missed it as well, if so, now you haven't.  On the other hand, if you knew of this setting, sorry to have wasted your time :)

    For example, set it to GLOBAL (it defaults to GLOBAL, but just to be sure for this example.  You can check by executing select databaseProperty(db_name(),'IsLocalCursorsDefault') as IsLocalCursorsDefault):

    ALTER DATABASE tempdb
      SET CURSOR_DEFAULT GLOBAL
    GO

    DECLARE evilReally cursor for select 'hi'
    GO

    DECLARE evilReally cursor for select 'hi'
    GO

    The second one will cause the following

    Msg 16915, Level 16, State 1, Line 2
    A cursor with the name 'evilReally' already exists.

    However deallocate the cursor, then set it to LOCAL:

    DEALLOCATE evilReally
    GO

    ALTER DATABASE tempdb
         SET CURSOR_DEFAULT GLOBAL
    GO

    DECLARE evilReally cursor for select 'hi'
    GO

    DECLARE evilReally cursor for select 'hi'
    GO

    Now if someone forgets the DEALLOCATE it won't hurt anything.  The only caveat is the case where someone is still using a global cursor to pass cursors around from batch to batch and expecting them to be global.  This is pretty rare, and certainly should be very rare indeed. 

    I tend to always use cursors declared as:

    DECLARE @cursor CURSOR

    SET @cursor = CURSOR ...

    On the very rare occasion I may have to write one.  Usually I am building some form of code generation tool when it occurs, and yes I am starting to make excuses. :)

     

    February 13

    SQL Formatter

    A while back, when I was moaning about how annoying it was to have to format code for my book over and over, particularly in a manner that is not natural to me (I hate putting keywords in all caps.  I like camel casing startSmall, and not pascal casing StartBig or even underscores shift_minus_sign.  I admit it, I am picky about my own formatting) and Denis from http://sqlservercode.blogspot.com/ suggested I check out:
    http://www.wangz.net/cgi-bin/pp/gsqlparser/sqlpp/sqlformat.tpl.  It does an admirable job and is now a tool that I keep open when I am in the newsgroups.  For example, consider this statement:
     
    select * FRom tablename join (select * from tablename where value = 'something') as bob on tablename.othervalue = bob.othervalue
    Even that small of a statement is a mess.  Now run the formatter:
     
    SELECT *
    FROM   tablename
                    JOIN (SELECT *
                              FROM   tablename
                             WHERE  VALUE = 'something') AS bob
                        ON tablename.othervalue = bob.othervalue
     
    Far more clear.  The only problem I have had is that if something is wrong with is if the SQL isn't quite right it can choke.  For example, this gives it fits:
     
    select *
    FRom tablename
    where table.othervalue = 'test'
     
    But not a real problem as most code you will want to run through this will probably compile, or you can use SSMS or QA to debug it.
     
    They ask for money and will give you an offline version for 30 bucks, but I haven't given them any cash, yet.  I will eventually when I decide how valuable it is to me.  The only thing I would really like to change is a slightly better version of the identifier casing.  It will do UPPER, lower, or Initcap.  If I could integrate a data dictionary/real dictionary I would probably pay the 30 right now instead of finishing this email.  Then tablename could be automagically changed to tableName (or if you prefer table_name, or TableName, or whatever.)  I might want to change things a tad.
     
    The main thing is, this is a solid tool when you have to read other peoples code, and it cannot always be avoided.
    January 22

    Finding Duplicate Values (any number of columns)

    This is by no means an advanced topic, but the question pops up pretty often, so here you go.

    All

    too often it is necessary to find duplicate values in a table, either because a unique constraint was missed, or simply because you need to find out how many duplicate actions occurred over some time period (how many widgets did we sell to customers of a given type in a given region.) Usually it is the former problem, and we all have done it at least once (I know I have too often not created keys where I have actually modelled them just because of a simple goof.)

    Take this following

    set of data:

    create

    table #findDups
    (
     
        findDupsId
    int primary key
    ,
        somecolumn varchar(10
    ),
        checkColumn1 int
    ,
        checkColumn2
    int
    )

    insert

    into #findDups
    select 1,'barney', 1,
    1
    union
    all
    select 2,'barney', 1,
    2
    union
    all
    select 3,'bam-bam',1,
    1
    union
    all
    select 4,'pebbles',1,
    3
    union
    all
    select 5,'pebbles',1,
    4
    union
    all
    select 6,'fred',1,
    2
    union
    all
    select 7,'george',2,
    1
    union
    all
    select 8,'judy',3,
    1

    Just a simple

    set of pretty much non-sensical data, which is also a common problem when working on someone else's database where your only knowledge of the data is that you need to find rows with non-unique values over some columns.

    First

    , if you want to find duplicates in someColumn (which is why I chose that name,) simply execute:

    select

    somecolumn
    from
    #findDups
    group by
    somecolumn
    having count(*) >
    1

    Which will

    return:

    somecolumn

    ----------
    barney
    pebbles

    If

    you need to work with > 2 columns, like the two check columns, it extends very easily:

    select

    checkColumn1, checkColumn2
    from
    #findDups
    group by checkColumn1,
    checkColumn2
    having count(*) >
    1

    This

    returns:

    checkColumn1      checkColumn2

    ------------ ------------
    1            1
    1            2

    Which you can see are the rows

    with duplicates. (note that if you want to see rows without duplicates, just change the > to =)

    The next thing you probably will want

    to do is see the rows with duplicates. For this, use an exists comparison in a correlated subquery on a derived table in the where clause (that was fun to type!):

    select

    *
    from #findDups as
    mainTable
    where exists (
    select 1
                        
    from
    ( select checkColumn1, checkColumn2
                                   
    from
    #findDups
                                  
    group by checkColumn1,
    checkColumn2
                                  
    having count(*) > 1
    ) as dups
                        
    where dups.checkColumn1 = mainTable.
    checkColumn1
                           
    and dups.checkColumn2 = mainTable.checkColumn2
    )
    order by checkColumn1,
    checkColumn2

    This

    returns:

    findDupsId  somecolumn checkColumn1 checkColumn2

    ----------- ---------- ------------ ------------
    1           barney     1            1
    3           bam-
    bam    1            1
    6           fred       1            2
    2           barney     1            2

    The

    key to this query is the derived table, shown in this color that returns the data from the previous query. This is then used in the correlated subquery in this color, which finally gives you the set of rows in the mainTable that rows will be returned, based on any matches to the subquery.

    If

    you want to delete rows automatically to delete the duplicates, you can run a query like this. It will delete the row with the max primary key value, so it is kind of destructive in that it will just delete "some" data to get the set to have no duplicates (be careful with this.) You would just run this over and over until the @@rowcount = 0, or 1 less than the maximum number of duplicate rows. I will just run it twice:

    delete

    mainTable
    from #findDups as
    mainTable
    where exists ( select
    1
                         from (select max(findDupsId) as
    findDupsId
                                 
    from ( select checkColumn1,
    checkColumn2
                                           
    from
    #findDups
                                           
    group by checkColumn1,
    checkColumn2
                                            having count(*) > 1) as
    dups
                                    join #findDups as
    keyRetriever
                                       on dups.checkColumn1 = keyRetriever.
    checkColumn1
                                            and dups.checkColumn2 = keyRetriever.
    checkColumn2
                                   group by keyRetriever.checkColumn1, keyRetriever.
    checkColumn2
                                   having count(*) > 1) as
    keys
                         where keys.findDupsId = mainTable.findDupsId)

    select

    @@rowcount

    First

    time:
    -----------
    2

    Second

    :
    -----------

    Then checking for duplicates, running the previous statement that showed the rows, and by browsing the data:

    select

    * from #findDups

    findDupsId somecolumn checkColumn1 checkColumn2

    ---------- ---------  ------------ ------------
    1          barney     1            1
    2          barney     1            2
    4          pebbles    1            3
    5          pebbles    1            4
    7          george     2            1
    8          judy       3            1

    No

    more duplicates! This is the fun of building set based solutions to problems. By layering more and more subqueries and derived tables, you can build incredibly powerful statements that in one statement does a great deal of work. Just be sure to indent because:

    delete

    mainTable
    from #findDups as
    mainTable
    where exists ( select 1 from (select max(findDupsId) as
    findDupsId
    from ( select checkColumn1,
    checkColumn2
    from #findDups group by checkColumn1,
    checkColumn2
    having count(*) > 1) as
    dups
    join #findDups as keyRetriever on dups.checkColumn1 = keyRetriever.checkColumn1 and dups.checkColumn2 = keyRetriever.
    checkColumn2
    group by keyRetriever.checkColumn1, keyRetriever.
    checkColumn2
    having count(*) > 1) as
    keys
    where keys.findDupsId = mainTable.findDupsId
    )

    Not easy to read!

    Edit:  Denis10504 left a comment that gave a much easier solution to the deleting problem:

    delete from #findDups
    where findDupsId not in
       (   select min(findDupsId)
        from #findDups as mainTable
        group by checkColumn1, checkColumn2)

    This is a far more elegant solution, of course :)  I made the critical mistake of following along a logical solution to illogical conclusion.