Louis's profileThe SQL Doctor is In (Re...PhotosBlogListsMore ![]() | Help |
|
|
September 06 Split table using views and instead of triggersI was playing around with instead of triggers to solve a problem with two tables being treated as one, and I came upon the following solution for making a view that behaves almost exactly like a single table.
For example, I will use the following tables, both sharing the same primary key:
CREATE TABLE tableNameA
( tableNameId int PRIMARY KEY, valueA varchar(10) ) CREATE TABLE tableNameB ( tableNameId int PRIMARY KEY, valueB varchar(10) ) GO Why would you do this? I could see various reasons, none too common. You might want to attach your table to a third party table. For example, the tables needn''t be in the same database. Or one of the tables might have a couple of varchar(max) values which are rarely used. It might be valuable to keep one of the tables small. (I was just answering a newsgroup message a while back, and the reader wanted to do something like this, so I popped this solution together.
First, create a view that FULL OUTER joins the tables. A row does not need to exist on both sides to use this view. The FULL OUTER gets all rows from both tables, where there is a match.
CREATE VIEW tableName
AS SELECT coalesce(tableNameA.tableNameId, tableNameB.tableNameId) AS tableNameId, tableNameA.valueA, tableNameB.valueB FROM tableNameA FULL OUTER JOIN tableNameB ON tableNameA.tableNameId = tableNameB.tableNameId Now, I create an instead of trigger to insert the rows. If you insert a duplicate in either side it will fail:
CREATE TRIGGER tableName_insteadOFInsert
ON tableName INSTEAD OF INSERT AS --note, add error handling for production worthy code BEGIN SET NOCOUNT ON INSERT INTO tableNameA(tableNameId,valueA)
SELECT tableNameId, valueA FROM inserted WHERE valueA IS NOT null INSERT INTO tableNameB(tableNameId,valueB)
SELECT tableNameId, valueB FROM inserted WHERE valueB IS NOT null END Not insert data, noting that you treat the primary key as if it is just a single column.
INSERT INTO tableName(tableNameId, valueA, valueB)
VALUES (1, NULL, '10') INSERT INTO tableName(tableNameId, valueA, valueB) VALUES (2, '20', '20') INSERT INTO tableName(tableNameId, valueA, valueB) VALUES (3, '30', NULL) Then viewing the data:
SELECT *
FROM tableName It looks just like a normal table:
tableNameId valueA valueB
----------- ---------- ---------- 1 NULL 10 2 20 20 3 30 NULL Then you can create an update trigger to manage changes. It is more tricky, because you have to consider that one of the objects may or may not exist: CREATE TRIGGER tableName_insteadOFUpdate
ON tableName INSTEAD OF UPDATE AS BEGIN SET NOCOUNT ON UPDATE tableNameA
SET valueA = inserted.valueA FROM tableNameA JOIN inserted ON tableNameA.tableNameId = inserted.tableNameId --since we want this to behave like a real update, we have to add the related rows only if some row exists INSERT INTO tableNameA(tableNameId,valueA) SELECT tableNameId, valueA FROM inserted WHERE valueA IS NOT NULL and exists (select * from tableNameB where tableNameB.tableNameId = inserted.tableNameId) and not exists (select * from tableNameA where tableNameA.tableNameId = inserted.tableNameId) UPDATE tableNameB SET valueB = inserted.valueB FROM tableNameB JOIN inserted ON tableNameB.tableNameId = inserted.tableNameId INSERT INTO tableNameB(tableNameId,valueB)
SELECT tableNameId, valueB FROM inserted WHERE valueB IS NOT NULL and not exists (select * from tableNameB where tableNameB.tableNameId = inserted.tableNameId) and exists (select * from tableNameA where tableNameA.tableNameId = inserted.tableNameId) END
Now I make a couple of changes to the data, to see that it works:
UPDATE tableName
SET valueA = 'U10' WHERE tableNameId = 1 UPDATE tableName
SET valueA = 'U20', valueB = 'U21' WHERE tableNameId = 2 UPDATE tableName
SET valueB = 'U30' WHERE tableNameId = 2 UPDATE tableName
SET valueB = 'No new row' WHERE tableNameId = 20 Then you can see the changes here (including the fact that no changes were affected by the statement with the non-existant tableNameId value.
SELECT *
FROM tableName This returns:
tableNameId valueA valueB
----------- ---------- ---------- 1 U10 10 2 U20 U30 3 30 NULL Finally, we implement the delete:
CREATE TRIGGER tableName_insteadOFDelete
ON tableName INSTEAD OF DELETE AS --note, add error handling for production worthy code BEGIN SET NOCOUNT ON DELETE tableNameA
FROM tableNameA JOIN deleted ON tableNameA.tableNameId = deleted.tableNameId DELETE tableNameB
FROM tableNameB JOIN deleted ON tableNameB.tableNameId = deleted.tableNameId END
Then test it out:
--one row
DELETE tableName WHERE tableNameId = 2 --zero rows
DELETE tableName WHERE tableNameId = 20 Check that:
SELECT *
FROM tableName tableNameId valueA valueB
----------- ---------- ---------- 1 U10 10 3 30 NULL Now delete the rest:
DELETE tableName
WHERE tableNameId in (1,3) Is it useful? Only if it is... Which is pretty much why I write stuff like this. I find that if I have thought of it, sooner or later it will come in handy. And I haven't thought of much, that's for sure. August 17 Dates to a person's next birthdayThe question was raised on the forums the other day about how to figure out how many days until the next birthday for a person. It was a question that I had never answered before, and a technique that I am not sure why I had never had to employ before. It seems so common really. So I put together a faux person table, with just an id and a birthdate (a bit Orwellian I suppose, but I was just feeling too lazy to phony up names, which then made me have at least two columns...so you get an id) The basics are that want to look for the next date, either in 2006 for October, November, etc, and in 2007 for Feb, April, etc. (note that the article is being published in August 2006). I use this expression take the birthdate and "move" it into this year: cast(year(getdate()) as char(4)) + right(convert(char(8),birthdate,112),4) and then see if it is less than today. If so, I add one to the year and then datediff with today, else I just use today. As is true with any function based solution in SQL, it is not pretty, and I might consider doing this in a CLR function if I had to do it more than once. select birthdate, This returns:
personId birthdate dayToBirthday I cant think of an easier way, even using a date table. Instinctively I wanted to do some sort of integer math with mod and div, but since the number of dates in a year varies, that wasn't possible. Edit: Got an email comment from Frank Kalis (he runs this website http://www.insidesql.de amongst other things) with the following solution: SELECT personID, birthdate This is really cool, as it relies less on substrings (right is a special version of substring) and more on date math. Two really interesting things. First is yet another way to strip off the date from the current day from the date (and probably the best). DATEADD(DAY, 0, DATEDIFF(DAY, 0, GETDATE())) This bit: select DATEDIFF(DAY, 0, GETDATE()) gets the number of days since the beginning of time, then the DATEADD turns it back into a datetime. Then this: DATEADD(YEAR, YEAR(GETDATE())- YEAR(birthdate), birthdate) Sets the year to the current year. Quite a nice little solution indeed. July 23 Using @@rowcount in a triggerWhen writing a trigger, it is important to consider the case where no rows are modified by the statement involved. Furthermore, knowing how many rows are affected can be useful in other circumstances as well. In some validation routines, I will count the number of valid rows and the number of null rows and sum them up to see if they match the number of total rows affected by the trigger (a trick I learned at least from ERwin macros over the years. There are a couple of ways to get the number of rows affected, the most obvious being to count the number of rows in the inserted or deleted table (inserted for insert triggers, deleted for delete triggers, and either for an update trigger. I rarely advocate having a trigger do more than one action, to avoid any code that does one action on a DELETE, and another on an UPDATE, and yet another on an INSERT. It isn't a hard and fast rule by any means (and one that I sometimes break when I am building code by hand and not code generators :) Anyhow, back to rowcount. The easy way to do this is to use the @@rowcount system function (or sometimes thought of as a global variable, thought it is really a function.) The important thing to make sure is that no code that can change the value of @@rowcount is executed before fetching the value of @@rowcount (which will also change the value :) Pretty much the only thing you can do is declare a variable to hold the value. So, consider the following table: create table test And the following trigger, that simply takes the value, puts it in a varariable and the selects that value as a result set (not something typically done in a trigger, but I am doing it here for demonstration purposes.) create trigger testRowcount select @rowcount Now, enter a row: insert into test(testId) returns:
(1 row(s) affected)
Or for two rows: insert into test(testId)
(1 row(s) affected)
alter trigger testRowcount declare @rowcount int select @rowcount And try to enter the data again: insert into test(testId) Returns: ----------- (2 row(s) affected) This turned off the message alright, but now the value of @rowcount is 0. Turns out, SET commands affect the rowcount, so you change it to: alter trigger testRowcount set nocount on select @rowcount Executing this: insert into test(testId) Now returns: ----------- (2 row(s) affected) So now rows affected messages from the trigger are not returned, but those from your current connection are. If you have written stored procedures, this probably feels wierd, as it is common practice to always start a proc with set nocount on. What you can do from here is use this value to simply quit the trigger if no data is modified (ie @rowcount = 0). Of course if it is interesting to you that no rows were affected, you instead let it go, or log the fact that it occurs: alter trigger testRowcount if @rowcount = 0 select @rowcount Now if you insert 0 rows, the select statement will never be reached: insert into test Just returns: (0 row(s) affected) July 19 Ambiguous column, just one table?We have all written queries that return a column list where you see the error:
Msg 209, Level 16, State 1, Line 1
Ambiguous column name 'columnName'. If you don't qualify the names of columns (because you are just writing a quick ad-hoc statement of course) then a couple of the tables use the same column name, it is normal enough.
But it can also crop up in query with only a single table. For example, take this table:
create table test
( testId int primary key, value varchar(200), sortColumn varchar(10) ) Load a bit of data:
--the sortColumn value is opposite of the testId for demo reasons insert into test
select 1, 'Value','b' union all select 2, 'Another value','a' And you run some query like this:
select *
from test order by sortColumn But the data is so long that you can't see the end of it:
testId value ...
----------- ---------------------------------------------------------------- 2 Another value ... 1 Value ...
So you take the sort column and add it to the select clause:
select sortColumn, *
from test order by sortColumn Wa, wa, waaaa. Error:
Msg 209, Level 16, State 1, Line 1
Ambiguous column name 'sortColumn'. What is sort of surprising is the fix. Of course you can type out a proper column list (and in your production code, you really ought to,) but you can also use the positional syntax:
select sortColumn, * --(or sortColumn, sortColumn, rather than the *)
from test order by 1 Though the same caveats apply and you shouldn't do this in your production code if at all possible. However, the following works just fine, and should be what you would do to allow the rest of your code to execute unchanged.
select sortColumn, *
from test order by test.sortColumn At first, this is kind of surprising, but really it isn't. In a SELECT statement, the ORDER BY clause executes after the SELECT clause, so it sees two columns named sortColumn in the set. You can see this by executing this statement:
select testId as sortColumn
from test order by sortColumn This will sort by the value of testId. But change it to:
select testId as sortColumn
from test order by test.sortColumn And instead it virtually adds the test.sortColumn value to the SELECT set, sorts by it, and then discards the value. Of course the best policy is to just go ahead and qualify your names in your SELECT statements, but that isn't always reasonable when writing ad-hoc statements. July 17 Formatting a datediffFirst off, I had started this post before I saw this post: http://sqlservercode.blogspot.com/2006/07/three-ways-to-get-time-portion-of.html and mine is a bit different anyhow. (For full disclosure, the blogger in question was the inspiration for part of this post.)
Sometimes it is just advantageous to format a piece of data in SQL, if for no other reason than sometimes you might want to look at data in an administrative capacity. A good place for that is formatting a date to show the amount of time between some events.
So here is a basic table that might be used to record an event (obviously there would be more information, for that you will have to engage your imagination.)
create table event ( eventId int primary key, startTime datetime, finishTime datetime ) And some basic test data insert into event
select 1, '2006-01-01T00:00:00.000', '2006-01-01T02:14:23.000' union all select 2, '2006-01-01T00:00:00.000', '2006-01-02T03:22:23.233' union all select 3, '2006-01-01T00:00:00.000', '2006-06-02T13:10:30.233' go
As shown in the blog referred to above, you can use convert and 108 style to get the time part (or 114 if you want milliseconds). For example, executing this:
select convert(varchar(10),getdate(),108)
Tonight, returns:
23:32:22
To get the event time, I use the followng query (well, parts of it anyhow.)
select eventId,
cast(hours as varchar(10)) + timeSeconds as secondsOnly, cast(hours as varchar(10)) + timeMilliSeconds as inclMilliseconds, cast(hours / 24 as varchar(10)) + ' day(s) ' + cast(hours % 24 as varchar(10)) + timeSeconds as formatDays from ( select eventId, datediff(hour,startTime, finishTime) as hours, right(convert(varchar(20),finishTime-startTime,108),6) as timeSeconds, right(convert(varchar(20),finishTime-startTime,114),10) as timeMilliSeconds from event) as eventTimesNeedsFormatting Returns:
eventId secondsOnly inclMilliseconds formatDays Breaking this down the derived table: gets the difference in hours, so I can determine days and hours difference, and then I do something that looks really odd, and that is subtract one time value from another.
If you execute:
select finishTime - startTime
from event You will see that this returns the time difference offset from '1900-01-01 00:00:00.000':
1900-01-01 02:14:23.000
1900-01-02 03:22:23.233 1900-06-02 13:10:30.233 The time part is all we care about, so using right(), I peel off the values and plop onto the end of the N day(s) and or simply the hours. Months is too dangerous an equation because of variable length months, but you could easily extend to weeks if you cared.
What makes this significant (at least to me) is that when I first answered the question, I posited the following solution:
select eventId, cast(timeSeconds / 3600 as varchar(10)) + ':'
+ right('0' + cast((timeseconds % 3600) /60 as varchar(2)),2) + ':' + right('0' + cast((timeseconds % 3600) % 60 as varchar(2)),2) as elapsedTime from ( select eventId, datediff(second,startTime, finishTime) as timeseconds from event) as runnerTimesNeedsFormatting The whole taking seconds and turning into the bits of a result is a ghastly bore and really annoying. And if you want to include milliseconds it is just too ugly. A function could be written, likely in CLR if you need this done too many times, but as should be the supposition of any "format using SQL Server" topic, this is not for major production use. Mostly this kind of thing should be for use when you as the dba need to look at some data and don't want to produce a complet application to do so. Hmm, that sounds like a blog all on its own :)
Edit: Had the formatting of seconds and milliseconds backwards. Fixed July 13 datalength() and len() (plus something you may not realize about len())Both are very similar, but each of them have a distinct purpose. Len is a user oriented function, meant to let you see the "visual" representation size of a column as it can be displayed. I actually didn't realize this, but it will work on more datatypes than just character data, in fact the one datatype that it failed for me was text (but who cares about that now that 2005 has arrived?!) This little fact is a good part of why I chose to write this today. On the other hand, datalength is more of a physical storage oriented function that tells you about the amount of storage required for the data. So if you have any fixed length columns, or unicode columns nvarchar, ntext, etc, it may give you "confusing" answers compared to what you probably want (I will demonstrate). You may want to use len instead to get the number of characters in the character string. As an example, consider the following table with several data types represented: create table testDataLength And some sample data (using 12345 as a recurring theme :) insert into testDataLength Now, I will run two selects, one using datalength, the other using len: select 'datalength' as operation, This returns: operation 12345 12.345 ascii-12345 unicode-12345 <the guid> ---------- ----------- ----------- ----------- ------------- ----------------------- datalength 4 5 5 10 16 First, note that the length of the decimal col says it is 8 characters long. We only put 6 digits in, but run this query: select cast(intCol as varchar(10)) as intCol, And you will see that intCol decimalCol As mentioned before, another concern with using datalength() is when using unicode data. Going back to the data look at the to 5 character columns: select 'datalength' as operation, The following is output: operation ascii-12345 unicode-12345 The problem is, if you are used to using datalength to get the size of string columns, it will give misleading results for unicode columns, which most people in the US don't really make too much use of for the most part (not that it is unheard of, and it is becoming more and more common. The system I am working with now using all unicode columns, but it is a product meant for international sale. So almost always what you will want to use is len() to get the visual representation size of the data, not the storage size. July 09 Procedure to drop all relationships from a table (and a brief discussion on preventing injection attacks)Someone asked a while back how to drop a table and foreign key constraints in one statement, and while that is not possible, you can use a procedure to do this pretty easily. I have had this proc around for a while as I actually create it before I drop a table and drop it afterwards in a tool that I have built in ERwin. So I figured I would post it here too.
The procedure works on a cursor, since I have to execute multiple statements. When I first posted the procedure this was the select clause of the procedure:
select 'alter table ' + ctu.table_schema + '.' + ctu.table_name +
' drop constraint ' + cc.constraint_name This suffers from two major problems. One, people can name objects using quoted names that might contain spaces, causing my code to crash. Second, Steve Kass (one of my favorite SQL Server minds because of his incredibly detailed mind) busted me on the fact that it would be possible for a hacker to name a foreign key constraint in such a way to do bad stuff to your table (like drop it.) For example:
create table test
( testid int constraint [PKTest] primary key, constraint AKTest unique (testId), --redundant constraint parentTestId int constraint [AKTest; select 'GOTCHA' drop table test RETURN] references test(testId)
) My code would produce the following, that will drop the constraint by dropping the table as well.
exec ('alter table dbo.test drop constraint AKTest; drop table test RETURN')
So I had to make proper use of the quotename function: select 'alter table ' + quotename(ctu.table_schema) + '.' + quotename(ctu.table_name) +
' drop constraint ' + quotename(cc.constraint_name) which changes the output to:
alter table [dbo].[test] drop constraint [AKTest; drop table test RETURN]
Which is perfectly safe (if a somewhat stupid name for a constraint.) The reason I make note of this is that I generally use quotename for all dynamic SQL. However, I never even thought of doing it for names of objects. I NEVER name objects with spaces or special characters (NEVER), and I just never even considered this kind of hack. It just highlights the importants of thinking globally when posting code, rather than how you do things in your limited scope of your "home turf". (This is the kind of learning experience that I was talking about in my article "Why answer questions". Learning bits and pieces like this are a key part.)
So, without further adeiu, here is the procedure:
create procedure utility$removeRelationships
( @table_schema sysname = 'dbo', --does not do a like comparison @parent_table_name sysname = '%', --it is the parent when it is being referred to @child_table_name sysname = '%', --it is the child table when it is the table referring --to another @constraint_name sysname = '%' --can be used to drop only a single constraint ) as -- ---------------------------------------------------------------- -- Drop all of the foreign key contraints on and or to a table -- ---------------------------------------------------------------- begin set nocount on declare @statements cursor set @statements = cursor static for select 'alter table ' + quotename(ctu.table_schema) + '.' + quotename(ctu.table_name) + ' drop constraint ' + quotename(cc.constraint_name) from information_schema.referential_constraints as cc join information_schema.constraint_table_usage as ctu on cc.constraint_catalog = ctu.constraint_catalog and cc.constraint_schema = ctu.constraint_schema and cc.constraint_name = ctu.constraint_name where ctu.table_schema = @table_schema and ctu.table_name like @child_table_name and cc.constraint_name like @constraint_name and exists (select * from information_schema.constraint_table_usage ctu2 where cc.unique_constraint_catalog = ctu2.constraint_catalog and cc.unique_constraint_schema = ctu2.constraint_schema and cc.unique_constraint_name = ctu2.constraint_name and ctu2.table_schema = @table_schema and ctu2.table_name like @parent_table_name) open @statements declare @statement nvarchar(1000) While (1=1) begin fetch from @statements into @statement if @@fetch_status <> 0 break exec (@statement) end end
go To test it, here are three related tables, with unnamed constraints (which really makes the procedure useful:
If you try to drop the child table: drop table child No dice: Instead: exec utility$removeRelationships @parent_table_name = 'child'
drop table child This works (no matter what the constraints are named!) Over the next few weeks I will post the other procs I have like this for check and default constraints.
June 25 Unique constraint, or unique indexThe question is often asked about the difference between an unique constraint, and an unique index. Using them should be thought of as quite different tasks, thought there large similarities as well.
A unique constraint should be used as an implementation tool primarily to enforce some uniqueness condition, in addition to the primary key (in other words, an alternate key.) An unique index should be used to enhance performance. For example, if you use an identity surrogate for the primary key of a purchase order table, you might also have an purchase order number column that is printed on a paper form that is your users way of identifying the order. For an order table, you might have a customer purchase order value on an order table that you would want to enforce uniqueness on for a customer. (If the customer purchase order value is optional and unique, you will have to employ a technique known as selective uniqueness handling.) My suggestion is that every table has at least one natural (at least non-surrogate) key enforced with a constraint. This is not always possible of course, but it is clearly true in 95-99% of all tables. Unique indexes on the other hand should be used to improve performance. So when you choose to index a column or set of columns, if the values in this index do turn out to contain unique values then use a unique index. An obvious case would be when you include the keys of a unique constraint in your index (like an index on orderNumber, customerId. orderNumber is unique itself because of a constraint, so this is a case where a unique index is called for.) Unique indexes are better than non-unique because the optimizer doesn't have to guess about the max number of rows that might be returned using the index, it is either 1 or 0. A key difference in usage should be that you might use a unique constraint on a column or set of columns that might never be searched on, strictly to enforce uniqueness. On the other hand, a unique index that is never used should likely be dropped (and if not, it might just be a constraint. June 15 Preventing user from modifying data in a columnThis question came up in the forums, so I wrote this to answer the question. There are two ways to do this using a trigger (as well as using column level security, but that won't keep programmers from writing code to modify values!) First a bit of sample data: create table testTriggerNoUpdateOfColumns The first method won't even let DML statements reference the data. This is the most efficient method, but might be too annoying if you have existing code (note this trigger is just barebones, you may need to add error handling depending on how your other code works with this): create trigger testTriggerNoUpdateOfColumns_afterUpdate So this is allowed: update testTriggerNoUpdateOfColumns But this fails: update testTriggerNoUpdateOfColumns With the error coded in the trigger Msg 50000, Level 16, State 1, Procedure testTriggerNoUpdateOfColumns_afterUpdate, Line 8 The second method allows DML statements to reference the columns, but it validates the data to see if it has changed.: alter trigger testTriggerNoUpdateOfColumns_afterUpdate So this allowed update testTriggerNoUpdateOfColumns As well as this is also now allowed, since the data doesn't change update testTriggerNoUpdateOfColumns Changing the value will however fail update testTriggerNoUpdateOfColumns Same error Msg 50000, Level 16, State 1, Procedure testTriggerNoUpdateOfColumns_afterUpdate, Line 16 --the next statements shows that the NULL business works: update testTriggerNoUpdateOfColumns update testTriggerNoUpdateOfColumns Msg 50000, Level 16, State 1, Procedure testTriggerNoUpdateOfColumns_afterUpdate, Line 16 update testTriggerNoUpdateOfColumns update testTriggerNoUpdateOfColumns Msg 50000, Level 16, State 1, Procedure testTriggerNoUpdateOfColumns_afterUpdate, Line 16
June 14 Someday this post could save you hours...Though I will almost guarantee you will have forgotten about it:
May 10 Queries to get filegroup of table/indexesYesterday, when I was looking at disk space utilization for a server, I had a need to see what objects were placed on what filegroups, and these queries were the outcome. The queries should be pretty self explanatory, but if you want more info, email me or leave a comment and I will add comments.
--2005
select case when indexes.type_desc in ('HEAP','CLUSTERED')
then 'Table-' + indexes.type_desc else ' NC Index' end as indexType, rtrim(cast(
case when indexProperty(objects.object_id,indexes.name,'IsUnique') = 1 then 'unique ' else '' end + case when isNull(objectProperty(object_id(schemas.name + '.' + indexes.name),'IsConstraint'),0) = 1
then 'constraint ' else '' end + case when indexProperty(objects.object_id,indexes.name,'IsAutoStatistics') = 1 then 'auto ' else '' end + case when indexProperty(objects.object_id,indexes.name,'IsStatistics') = 1 then 'statistics ' else '' end + case when indexProperty(objects.object_id,indexes.name,'IsHypothetical') = 1 then 'hypothetical ' else '' end as varchar(30))) as indexProperties, schemas.name + '.' + objects.name as tableName, coalesce(indexes.name,'') as indexName, filegroups.name as filegroup from sys.indexes as indexes join sys.objects on indexes.object_id = objects.object_id join sys.schemas on objects.schema_id = schemas.schema_id join sys.filegroups as filegroups on indexes.data_space_id = filegroups.data_space_id where objectproperty(indexes.object_id,'IsMSShipped') = 0 order by tableName, case when indexes.type_desc in ('HEAP','CLUSTERED') then 0 else 1 end --2000 (Assumes that owner is DBO)
select case when indid = 0 then 'Table-Heap'
when indid = 1 then 'Table-Clustered' else ' NC Index' end as objectType, rtrim(cast( case when indexProperty(sysindexes.id,sysindexes.name,'IsUnique') = 1 then 'unique ' else '' end + case when isNull(objectProperty(object_id(sysindexes.name),'IsConstraint'),0) = 1 then 'constraint ' else '' end + case when indexProperty(sysindexes.id,sysindexes.name,'IsAutoStatistics') = 1 then 'auto ' else '' end + case when indexProperty(sysindexes.id,sysindexes.name,'IsStatistics') = 1 then 'statistics ' else '' end + case when indexProperty(sysindexes.id,sysindexes.name,'IsHypothetical') = 1 then 'hypothetical ' else '' end as varchar(30))) as indexProperties, object_name(sysindexes.id) as tableName, coalesce(sysindexes.name,'') as indexName, sysfilegroups.groupname as filegroup from sysindexes join sysfilegroups on sysindexes.groupId = sysfilegroups.groupId where objectproperty(sysindexes.id,'IsMSShipped') = 0 order by object_name(sysindexes.id),indid Both queries return something like: indexType indexProperties tableName indexName fileGroup
---------------- ------------------- -------------- --------------- -----------
Table-CLUSTERED unique constraint schema.table PKschema_table PRIMARY
NC Index unique constraint schema.table AKschema_table PRIMARY May 02 A scripting idea to allow scripting ALTERs instead of DROP/CREATE pairs...I was writing some procedures today, and getting kind of annoyed that I had to keep changing from CREATE PROCEDURE to ALTER PROCEDURE in my database create scripts, or DROP the procedure first, losing the original create date, and when it was altered, and I came up with this as a possible way around that.
Instead of looking for the procedure's existance to drop it, I look for the procedure's existance and if it hasn't been created yet I create a "dummy" procedure. Then I use ALTER to modify the procedure, so it won't lose any permissions that some other user has created, plus the date it was first created, and in 2005, the date it was modified is also maintained:
if object_id('procedureName') is null
exec ('create procedure procedureName as select ''replace me''') go alter procedure procedureName as select 'this is the real thing!' go
Edit: As commenter Scott notes, we are generaly more interested in when the procedure was modified, not when it was created. In the 2005 metadata, you get a modified date as well as a create_date:
After running this script a few times while writing this post, the create date stays constant, but the modify_date changes:
select cast(name as varchar(30)) as name, create_date, modify_date
from sys.procedures This returns (for me):
name create_date modify_date
------------------------------ ----------------------- ----------------------- procedureName 2006-05-03 08:42:27.780 2006-05-03 08:48:11.480 For 2005, It might have a downside, but I haven't thought of one yet (this clearly doesn't mean there isn't one :) I probably wouldn't use this in 2000 and earlier. April 25 PARSENAME and IP AddressesI had never heard of this function until the other day when Steve Kass (check out his SQL page too. Steve has a very detailed and awesome mind) and Denis The SQL Menace, (who has a very nice SQL blog too.)
Anyhow, a person asked in the forums about how to sort by an IP address:
So, I took this data:
create table ipAddress
( ipaddress nvarchar(18) ) insert into ipAddress select '0.0.18.1' union all select '0.1.1.2' union all select '0.0.0.1' And created this query, which has two derived tables to "peel" off the parts of the ip address:
select oct1, oct2, substring(ipAddress,1,charindex('.',ipaddress) - 1) as oct3, substring(ipAddress,charindex('.',ipaddress) + 1,len(ipAddress)) as oct4 from (select oct1, substring(ipAddress,1,charindex('.',ipaddress) - 1) as oct2, substring(ipAddress,charindex('.',ipaddress) + 1,len(ipAddress)) as ipaddress from ( select substring(ipAddress,1,charindex('.',ipaddress) - 1) as oct1, substring(ipAddress,charindex('.',ipaddress) + 1,len(ipAddress)) as ipaddress from ipAddress) as derivedTable1) as derivedTable2 order by 1,2,3,4 Which was a fun query, yet htere were better ways to do it. Steve and Denis noted that:
select cast(parsename(ipaddress,1) as tinyint),
cast(parsename(ipaddress,2) as tinyint), cast(parsename(ipaddress,3) as tinyint), cast(parsename(ipaddress,4) as tinyint) from ipaddress order by cast(parsename(ipaddress,1) as tinyint), cast(parsename(ipaddress,2) as tinyint), cast(parsename(ipaddress,3) as tinyint), cast(parsename(ipaddress,4) as tinyint) To which I replied, "parse-who-now?" PARSENAME is a built-in function that was created for parsing object names, like 'servername.databasename.schemaname.objectname'. Because of this it only will get the first four values, seperated by periods, which just happens to be how ipAddresses are constructed (in case you didn't know :)
A few additional articles on using PARSENAME:
Of course, by the looks of this search, I am the last person on the planet to hear of PARSENAME:
The thing is, this is just not the best way to store an ip address. I generally prefer to store ipAddresses as a bigint, or another great way is as four tinyint values. Check this article for a more complete reference: http://www.aspfaq.com/show.asp?id=2450. April 17 You may have a normalization problem if...you have multi purpose character columnsIf you find yourself using substring in a substantial number of queries on the data in your database, you may have a normalization problem. And this means any functionality that uses a substring-like interface (replace, stuff, charindex, patindex, etc). Truly there is very little use for these functions in a well normalized system.
This article will discus the first of the types of issues, the multi-purpose character column.
The first kind of example is something like a name column that contains a full persons name. Seems reasonable so you can put ever how many parts of the name in there. So you code something like:
SELECT <columns>
FROM contact WHERE rtrim(substring(fullName,charindex(' ',fullName),len(fullname))) = @lastName or
WHERE fullName like '%' + @lastName Or some variation on this theme. Seems innocent enough, but now the SQL engine cannot effectively act on this query effectively using an index to perform the search. And what if the person is a 'Jr' like John Smith Jr. Now you are really in a mess trying to substring your way out of a mess.
Take this to the next level what about updating this value, say to change a persons last name. You are really hosed and are stuck writing ridiculous code to ignore Junior, Jr and anything else that looks like a suffix, and dealing with people with cool names to look at and pronounce, but a nightmare to code against, like 'Van Der Mullen' or 'Mac Donald' Just try to write the query that decides if this is this a guy named Mac, last name Donald, or perhaps the data entry person screwed up Mrs Delores Mac Donalds name?
It is a very common type of mistake that every one of us has made at one time or another because one column seems easier than two (or three, or whatever).
The rule of thumb here is that if you will "likely" be dealing with part of a column seperately, give it a different place for storage. You can deal with usage in a more elegant manner by building tools in the client layer, or you can use a view or computed column. I often use a computed column when I have this kind of need. For example, take the following abbreviated name table (no primary key, and only the very basic name bits.)
create table nameExample
( firstName varchar(30) not null, middleName varchar(30) null, lastName varchar(30) not null, fullname as firstName + coalesce(' ' + middleName,'') + ' '+ lastName, formalName as lastName + ', ' + firstName + coalesce(' ' + middleName,'') ) I have included two versions of the name that will allow you to display the data for the user, as well as the three parts of the name (very common in the United States to record these three names)
insert into nameExample(firstName, middleName, lastName)
select 'John',null,'Smith' union all select 'Bob','G','Leonard' union all select 'Fred','Allen','Stein' Now view the data:
select * from nameExample firstName middleName lastName fullName formalName
---------- ----------- --------- --------------- ---------------- John NULL Smith John Smith Smith, John Bob G Leonard Bob G Leonard Leonard, Bob G Fred Allen Stein Fred Allen Stein Stein, Fred Allen The next question that this data puts to mind is "what about the middle initial?" Should there be a column for just the middle initial? And what of the case where Bob Leonard only gave a single character response the question of middleName (something I frequently do when I don't feel like sharing!) I would most likely say add a computed column that gets the substring of the first character of the middleName, since the middle initial is functionally dependent on the middleName column, even if the user only put in one character.
alter table nameExample
add middleInitial as substring(middleName,1,1) Now you can persist this data by indexing it, or in SQL Server 2005 using the PERSIST keyword for faster access. The key here is that you have presented the least surface space to be edited, and you will rarely if ever need, in SQL code, to look for parts of a name. You might need to look for people whose last name starts with 'F', or the such, but it would be rare to look for values that end in 's', but you still have the SQL tools to go there if the user wants to.
The next in this series will discuss bitmask columns, followed by multi-valued columns (such as '1,2,3') April 14 Kicking users out of your database, now!For many years, I had a technique that more or less worked. Declare a cursor on sysprocesses (or in 2005, sys.dm_exec_requests) and kill each user individually. Then go back and make sure that they are all dead, or try again. Not a perfect system, but it does relatively work.
alter database <databaseName>
set single_user with rollback immediate --(or rollback after <seconds>)
This is really handy, especially when you drop and recreate a database multiple times during testing scripts. Often I will end up with five connections to the database and when I try to drop the database, this message appears:
Msg 3702, Level 16, State 4, Line 2
Cannot drop database "<databaseName>" because it is currently in use. The cost of this is that other connections to the server for the affected database will be immediately severed, so the first time you try something you will have to reconnect. In SSMS you will likely get this error the first time you try something since the tools don't realize that they have been cut off:
Msg 233, Level 20, State 0, Line 0
A transport-level error has occurred when sending the request to the server. (provider: Shared Memory Provider, error: 0 - No process is on the other end of the pipe.) But the second time will automatically reconnect. March 14 How many concurrent connections can MSDE have?If you answered 5 (or even thought it) you would do yourself a favor to go read Euan Garden's SQL Mythbuster blog MSDE/SQL Express has 5 concurrent user limit. Excellent history lesson, and good clarification on the limits (and lack of the limits you may have thought existed) in the free versions of SQL Server. February 26 Change table collations en masse.This question came up today in the newsgroups, so I answered it there, but it is probably something that might be interesting to more than one or two people, so I expanded it here for "future use." Say you have the following database where the collation was case sensitive and you want to change this to case insensitive: create database testCollationChange Then check out the collation to make sure: select databasepropertyex(db_name(),'collation') as collation_name collation_name Then create a table: create table test And check the collation of the columns: select column_name, collation_name This returns: column_name collation_name --------------- ---------------------------------- value SQL_Latin1_General_CP1_CS_AStextValue SQL_Latin1_General_CP1_CS_AS varchar_max SQL_Latin1_General_CP1_CS_AS Now you decide to change the collation of the database to case insensitive, just alter the database: alter database testCollationChange select databasepropertyex(db_name(),'collation') as collation_name This returns: collation_name But the tables stay in the previous collation: select column_name, collation_name Which returns: column_name collation_name To change this, I will use a query that I Adapted from Justin Mallier's blog: (http://mallier.blogspot.com/2005/11/how-to-change-collation-against-all.html) that builds a script to alter the columns: --I added the variables to make it more safe (Some collations may be desired declare @fromCollation sysname, set @fromCollation = 'SQL_Latin1_General_CP1_CS_AS' SELECT 'ALTER TABLE ' + TABLE_NAME +' ALTER COLUMN ' + COLUMN_NAME + ' ' + DATA_TYPE + CASE WHEN CHARACTER_MAXIMUM_LENGTH = -1 then '(max)' WHEN DATA_TYPE in ('text','ntext') then '' WHEN CHARACTER_MAXIMUM_LENGTH IS NOT NULL THEN '('+(CONVERT(VARCHAR,CHARACTER_MAXIMUM_LENGTH)+')' ) ELSE isnull(CONVERT(VARCHAR,CHARACTER_MAXIMUM_LENGTH),' ') END +' COLLATE ' + @toCollation+ ' ' + CASE IS_NULLABLE WHEN 'YES' THEN 'NULL' WHEN 'No' THEN 'NOT NULL' END This is the result. Execute this query and you can change all of the collations. ALTER TABLE test ALTER COLUMN value varchar(20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL go Checking them again: select column_name, collation_name Returns: column_name collation_name Then clean up after yourself use tempdb go drop database testCollationChange
A couple of gotchas prompted by Adam Machanic's comments 1. If you use user defined types, this technique won't work. It only works on base datatypes. As Adam Machanic commented, in these cases use DTS to take the data out of one database and into another. 2. You will have to drop indexes, constraints, etc that are dependent on the columns being altered.
February 17 Default Cursor settingThe main thing that I have stressed over the years is that if you HAD to use a cursor, make sure to declare them as LOCAL: DECLARE evilThing CURSOR LOCAL ... I still stand by that. You should rarely assume anything when you can make sure that the proper setting is set regardless of a setting. However, I do prefer to set up thing such that when someone, ok, when I get careless, particularly in testing the stricter/safer thing occurs. Like the ANSI_NULL_DFLT_OF setting, so that columns are NOT NULL unless otherwise specified. Part of the ALTER DATABASE command allows the setting of the cursor default: CURSOR_DEFAULT LOCAL | GLOBAL Perhaps you have missed it as well, if so, now you haven't. On the other hand, if you knew of this setting, sorry to have wasted your time :) For example, set it to GLOBAL (it defaults to GLOBAL, but just to be sure for this example. You can check by executing select databaseProperty(db_name(),'IsLocalCursorsDefault') as IsLocalCursorsDefault): ALTER DATABASE tempdb DECLARE evilReally cursor for select 'hi' DECLARE evilReally cursor for select 'hi' The second one will cause the following Msg 16915, Level 16, State 1, Line 2 However deallocate the cursor, then set it to LOCAL: DEALLOCATE evilReally ALTER DATABASE tempdb DECLARE evilReally cursor for select 'hi' DECLARE evilReally cursor for select 'hi' Now if someone forgets the DEALLOCATE it won't hurt anything. The only caveat is the case where someone is still using a global cursor to pass cursors around from batch to batch and expecting them to be global. This is pretty rare, and certainly should be very rare indeed. I tend to always use cursors declared as: DECLARE @cursor CURSOR SET @cursor = CURSOR ... On the very rare occasion I may have to write one. Usually I am building some form of code generation tool when it occurs, and yes I am starting to make excuses. :)
February 13 SQL FormatterA while back, when I was moaning about how annoying it was to have to format code for my book over and over, particularly in a manner that is not natural to me (I hate putting keywords in all caps. I like camel casing startSmall, and not pascal casing StartBig or even underscores shift_minus_sign. I admit it, I am picky about my own formatting) and Denis from http://sqlservercode.blogspot.com/ suggested I check out:
http://www.wangz.net/cgi-bin/pp/gsqlparser/sqlpp/sqlformat.tpl. It does an admirable job and is now a tool that I keep open when I am in the newsgroups. For example, consider this statement:
select * FRom tablename join (select * from tablename where value = 'something') as bob on tablename.othervalue = bob.othervalue
Even that small of a statement is a mess. Now run the formatter:
SELECT *
FROM tablename JOIN (SELECT * FROM tablename WHERE VALUE = 'something') AS bob ON tablename.othervalue = bob.othervalue Far more clear. The only problem I have had is that if something is wrong with is if the SQL isn't quite right it can choke. For example, this gives it fits:
select *
FRom tablename where table.othervalue = 'test' But not a real problem as most code you will want to run through this will probably compile, or you can use SSMS or QA to debug it.
They ask for money and will give you an offline version for 30 bucks, but I haven't given them any cash, yet. I will eventually when I decide how valuable it is to me. The only thing I would really like to change is a slightly better version of the identifier casing. It will do UPPER, lower, or Initcap. If I could integrate a data dictionary/real dictionary I would probably pay the 30 right now instead of finishing this email. Then tablename could be automagically changed to tableName (or if you prefer table_name, or TableName, or whatever.) I might want to change things a tad.
The main thing is, this is a solid tool when you have to read other peoples code, and it cannot always be avoided. January 22 Finding Duplicate Values (any number of columns)This is by no means an advanced topic, but the question pops up pretty often, so here you go.All too often it is necessary to find duplicate values in a table, either because a unique constraint was missed, or simply because you need to find out how many duplicate actions occurred over some time period (how many widgets did we sell to customers of a given type in a given region.) Usually it is the former problem, and we all have done it at least once (I know I have too often not created keys where I have actually modelled them just because of a simple goof.)Take this following set of data:create table #findDups( findDupsId int primary key, somecolumn varchar(10), checkColumn1 int, checkColumn2 int ) insert into #findDupsselect 1,'barney', 1,1 union all select 2,'barney', 1,2 union all select 3,'bam-bam',1,1 union all select 4,'pebbles',1,3 union all select 5,'pebbles',1,4 union all select 6,'fred',1,2 union all select 7,'george',2,1 union all select 8,'judy',3,1 Just a simple set of pretty much non-sensical data, which is also a common problem when working on someone else's database where your only knowledge of the data is that you need to find rows with non-unique values over some columns.First , if you want to find duplicates in someColumn (which is why I chose that name,) simply execute:select somecolumnfrom #findDups group by somecolumn having count(*) > 1 Which will return:somecolumn barney pebbles If you need to work with > 2 columns, like the two check columns, it extends very easily:select checkColumn1, checkColumn2from #findDups group by checkColumn1, checkColumn2 having count(*) > 1 This returns:checkColumn1 checkColumn2 1 1 1 2 Which you can see are the rows with duplicates. (note that if you want to see rows without duplicates, just change the > to =)The next thing you probably will want to do is see the rows with duplicates. For this, use an exists comparison in a correlated subquery on a derived table in the where clause (that was fun to type!):select *from #findDups as mainTable where exists ( select 1 from ( select checkColumn1, checkColumn2 from #findDups group by checkColumn1, checkColumn2 having count(*) > 1) as dups where dups.checkColumn1 = mainTable.checkColumn1 and dups.checkColumn2 = mainTable.checkColumn2) order by checkColumn1, checkColumn2 This returns:findDupsId somecolumn checkColumn1 checkColumn2 1 barney 1 1 3 bam-bam 1 1 6 fred 1 2 2 barney 1 2 The key to this query is the derived table, shown in this color that returns the data from the previous query. This is then used in the correlated subquery in this color, which finally gives you the set of rows in the mainTable that rows will be returned, based on any matches to the subquery.If you want to delete rows automatically to delete the duplicates, you can run a query like this. It will delete the row with the max primary key value, so it is kind of destructive in that it will just delete "some" data to get the set to have no duplicates (be careful with this.) You would just run this over and over until the @@rowcount = 0, or 1 less than the maximum number of duplicate rows. I will just run it twice:delete mainTablefrom #findDups as mainTable where exists ( select 1 from (select max(findDupsId) as findDupsId from ( select checkColumn1, checkColumn2 from #findDups group by checkColumn1, checkColumn2 having count(*) > 1) as dups join #findDups as keyRetriever on dups.checkColumn1 = keyRetriever.checkColumn1 and dups.checkColumn2 = keyRetriever.checkColumn2 group by keyRetriever.checkColumn1, keyRetriever.checkColumn2 having count(*) > 1) as keys where keys.findDupsId = mainTable.findDupsId) select @@rowcountFirst time:----------- 2 Second :----------- Then checking for duplicates, running the previous statement that showed the rows, and by browsing the data: select * from #findDupsfindDupsId somecolumn checkColumn1 checkColumn2 1 barney 1 1 2 barney 1 2 4 pebbles 1 3 5 pebbles 1 4 7 george 2 1 8 judy 3 1 No more duplicates! This is the fun of building set based solutions to problems. By layering more and more subqueries and derived tables, you can build incredibly powerful statements that in one statement does a great deal of work. Just be sure to indent because:delete mainTablefrom #findDups as mainTable where exists ( select 1 from (select max(findDupsId) as findDupsId from ( select checkColumn1, checkColumn2 from #findDups group by checkColumn1, checkColumn2 having count(*) > 1) as dups join #findDups as keyRetriever on dups.checkColumn1 = keyRetriever.checkColumn1 and dups.checkColumn2 = keyRetriever.checkColumn2 group by keyRetriever.checkColumn1, keyRetriever.checkColumn2 having count(*) > 1) as keys where keys.findDupsId = mainTable.findDupsId) Not easy to read! Edit: Denis10504 left a comment that gave a much easier solution to the deleting problem: delete from #findDups This is a far more elegant solution, of course :) I made the critical mistake of following along a logical solution to illogical conclusion.
|
|
|