In SQL Server, it’s possible to insert into a table using a SELECT statement:
Is it also possible to update via a SELECT? I have a temporary table containing the values and would like to update another table using those values. Perhaps something like this:
In SQL Server 2008 (or later), use MERGE:
MERGE INTO YourTable T
USING other_table S
ON T.id = S.id
AND S.tsql = 'cool'
WHEN MATCHED THEN
UPDATE
SET col1 = S.col1,
col2 = S.col2;
Alternatively:
UPDATE table
SET Col1 = i.Col1,
Col2 = i.Col2
FROM (
SELECT ID, Col1, Col2
FROM other_table) i
WHERE
i.ID = table.ID
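For readers who want to try the idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. It swaps in a correlated-subquery UPDATE (a more portable form than MERGE or UPDATE ... FROM), and the table names target and staging are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE target (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT);
    CREATE TABLE staging (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT);
    INSERT INTO target VALUES (1, 'old', 'old'), (2, 'old', 'old'), (3, 'old', 'old');
    INSERT INTO staging VALUES (1, 'new-a', 'new-b'), (3, 'new-c', 'new-d');
""")

# Correlated-subquery form: each target row looks up its match in staging.
# The WHERE EXISTS guard keeps unmatched rows (id 2) from being set to NULL.
conn.execute("""
    UPDATE target
    SET col1 = (SELECT s.col1 FROM staging s WHERE s.id = target.id),
        col2 = (SELECT s.col2 FROM staging s WHERE s.id = target.id)
    WHERE EXISTS (SELECT 1 FROM staging s WHERE s.id = target.id)
""")

updated_rows = conn.execute("SELECT id, col1, col2 FROM target ORDER BY id").fetchall()
print(updated_rows)
```

Only rows 1 and 3 change; row 2 has no match in staging and keeps its old values.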
How do I perform an IF...THEN in an SQL SELECT statement?
For example:
The CASE statement is the closest to IF in SQL and is supported on all versions of SQL Server.
SELECT CAST(
CASE
WHEN Obsolete = 'N' or InStock = 'Y'
THEN 1
ELSE 0
END AS bit) as Saleable, *
FROM Product
You only need to do the CAST if you want the result as a Boolean value. If you are happy with an int, this works:
CASE statements can be embedded in other CASE statements and even included in aggregates.
SQL Server Denali (SQL Server 2012) adds the IIF function, which is also available in Access (pointed out by Martin Smith):
The case statement is your friend in this situation, and takes one of two forms:
The simple case:
SELECT CASE <variable> WHEN <value> THEN <returnvalue>
WHEN <othervalue> THEN <returnthis>
ELSE <returndefaultcase>
END AS <newcolumnname>
FROM <table>
The extended case:
SELECT CASE WHEN <test> THEN <returnvalue>
WHEN <othertest> THEN <returnthis>
ELSE <returndefaultcase>
END AS <newcolumnname>
FROM <table>
You can even put case statements in an order by clause for really fancy ordering.
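As a rough illustration of both uses, here is a small sketch run against SQLite through Python's sqlite3 module (a stand-in for SQL Server; the Product rows are invented). It uses a searched CASE in the select list and a second one in ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (Name TEXT, Obsolete TEXT, InStock TEXT);
    INSERT INTO Product VALUES
        ('Widget', 'N', 'N'),
        ('Gadget', 'Y', 'Y'),
        ('Relic',  'Y', 'N');
""")

# Searched CASE in the select list computes the flag;
# a second CASE in ORDER BY puts saleable products first.
saleable_rows = conn.execute("""
    SELECT Name,
           CASE WHEN Obsolete = 'N' OR InStock = 'Y' THEN 1 ELSE 0 END AS Saleable
    FROM Product
    ORDER BY CASE WHEN Obsolete = 'N' OR InStock = 'Y' THEN 0 ELSE 1 END,
             Name
""").fetchall()
print(saleable_rows)
```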
From SQL Server 2012 you can use the IIF function for this.
This is effectively just a shorthand (albeit not standard SQL) way of writing CASE.
I prefer the conciseness when compared with the expanded CASE version.
Both IIF() and CASE resolve as expressions within a SQL statement and can only be used in well-defined places.
The CASE expression cannot be used to control the flow of execution of Transact-SQL statements, statement blocks, user-defined functions, and stored procedures.
If your needs cannot be met within these limitations (for example, a need to return differently shaped result sets dependent on some condition), then SQL Server also has a procedural IF keyword.
IF @IncludeExtendedInformation = 1
BEGIN
SELECT A,B,C,X,Y,Z
FROM T
END
ELSE
BEGIN
SELECT A,B,C
FROM T
END
Care must sometimes be taken to avoid parameter sniffing issues with this approach however.
The OR operator in the WHEN clause of a CASE statement is not supported. How can I do this?
That format requires you to use either:
CASE ebv.db_no
WHEN 22978 THEN 'WECS 9500'
WHEN 23218 THEN 'WECS 9500'
WHEN 23219 THEN 'WECS 9500'
ELSE 'WECS 9520'
END as wecs_system
Otherwise, use:
I have a .sql file with an export from phpMyAdmin. I want to import it into a different server using the command line.
I have a Windows Server 2008 R2 installation. I placed the .sql file on the C drive, and I tried this command:
It is not working. I get syntax errors.
Try:
Check MySQL Options.
Note-1: It is better to use the full path of the SQL file file.sql.
Note-2: Use -R and --triggers with mysqldump to keep the routines and triggers of the original database. Routines are not dumped by default (triggers are, unless --skip-triggers is given).
Note-3: You may have to create the (empty) database from mysql if it doesn’t exist already and the exported SQL doesn’t contain CREATE DATABASE (exported with the --no-create-db or -n option), before you can import it.
A common use of mysqldump is for making a backup of an entire database:
You can load the dump file back into the server like this:
UNIX
The same in Windows command prompt:
PowerShell
MySQL command line
Regarding the time taken for importing huge files: most importantly, it takes more time because the default setting of MySQL is autocommit = true. You must turn that off before importing your file, and then the import should run much faster.
You just need to do the following thing:
How can a column with a default value be added to an existing table in SQL Server 2000 / SQL Server 2005?
ALTER TABLE {TABLENAME}
ADD {COLUMNNAME} {TYPE} {NULL|NOT NULL}
CONSTRAINT {CONSTRAINT_NAME} DEFAULT {DEFAULT_VALUE}
WITH VALUES
ALTER TABLE SomeTable
ADD SomeCol Bit NULL --Or NOT NULL.
CONSTRAINT D_SomeTable_SomeCol --When Omitted a Default-Constraint Name is autogenerated.
DEFAULT (0)--Optional Default-Constraint.
WITH VALUES --Add if Column is Nullable and you want the Default Value for Existing Records.
Optional Constraint Name:
If you leave out CONSTRAINT D_SomeTable_SomeCol then SQL Server will autogenerate a Default-Constraint with a funny Name like: DF__SomeTa__SomeC__4FB7FEF6
Optional With-Values Statement:
The WITH VALUES is only needed when your Column is Nullable and you want the Default Value used for Existing Records.
If your Column is NOT NULL, then it will automatically use the Default Value for all Existing Records, whether you specify WITH VALUES or not.
How Inserts work with a Default-Constraint:
If you insert a Record into SomeTable and do not Specify SomeCol’s value, then it will Default to 0.
If you insert a Record and Specify SomeCol’s value as NULL (and your column allows nulls), then the Default-Constraint will not be used and NULL will be inserted as the Value.
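The two insert behaviours described above can be checked with a quick sketch. This uses SQLite via Python's sqlite3 module as a stand-in, so it only illustrates the insert semantics, not SQL Server's ALTER TABLE ... WITH VALUES handling, which SQLite does not share.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SomeTable (Id INTEGER, SomeCol INTEGER DEFAULT 0)")

# Column omitted: the default (0) is applied.
conn.execute("INSERT INTO SomeTable (Id) VALUES (1)")
# Column given explicitly as NULL: the default is bypassed and NULL is stored.
conn.execute("INSERT INTO SomeTable (Id, SomeCol) VALUES (2, NULL)")

default_rows = conn.execute("SELECT Id, SomeCol FROM SomeTable ORDER BY Id").fetchall()
print(default_rows)
```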
Notes were based on everyone’s great feedback below.
Special Thanks to:
@Yatrix, @WalterStabosz, @YahooSerious, and @StackMan for their Comments.
The inclusion of the DEFAULT fills the column in existing rows with the default value, so the NOT NULL constraint is not violated.
When adding a nullable column, WITH VALUES will ensure that the specific DEFAULT value is applied to existing rows:
ALTER TABLE table
ADD column BIT -- Demonstration with NULL-able column added
CONSTRAINT Constraint_name DEFAULT 0 WITH VALUES
Returns: 2008-09-22 15:24:13.790
I want that date part without the time part: 2008-09-22 00:00:00.000
How can I get that?
On SQL Server 2008 and higher, you should CONVERT to date:
On older versions, you can do the following:
for example
gives me
Pros:
No varchar<->datetime conversions required
No need to think about locale
As suggested by Michael
Use this variant: SELECT DATEADD(dd, 0, DATEDIFF(dd, 0, getdate()))
select getdate()
SELECT DATEADD(hh, DATEDIFF(hh, 0, getdate()), 0)
SELECT DATEADD(hh, 0, DATEDIFF(hh, 0, getdate()))
SELECT DATEADD(dd, DATEDIFF(dd, 0, getdate()), 0)
SELECT DATEADD(dd, 0, DATEDIFF(dd, 0, getdate()))
SELECT DATEADD(mm, DATEDIFF(mm, 0, getdate()), 0)
SELECT DATEADD(mm, 0, DATEDIFF(mm, 0, getdate()))
SELECT DATEADD(yy, DATEDIFF(yy, 0, getdate()), 0)
SELECT DATEADD(yy, 0, DATEDIFF(yy, 0, getdate()))
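To see why the DATEADD(dd, DATEDIFF(dd, 0, getdate()), 0) form strips the time, it may help to spell out the arithmetic. The sketch below mimics it with Python's datetime module; truncate_to_day is a hypothetical helper name, and the base date reflects SQL Server treating 0 as 1900-01-01.

```python
from datetime import datetime, timedelta

# SQL Server treats the integer 0 as 1900-01-01 when it is used as a datetime.
BASE_DATE = datetime(1900, 1, 1)

def truncate_to_day(value: datetime) -> datetime:
    # DATEDIFF(dd, 0, value): number of day boundaries crossed since the base date.
    whole_days = (value.date() - BASE_DATE.date()).days
    # DATEADD(dd, whole_days, 0): add those whole days back to the base date.
    return BASE_DATE + timedelta(days=whole_days)

truncated = truncate_to_day(datetime(2008, 9, 22, 15, 24, 13, 790000))
print(truncated)
```

Because only whole days are counted and added back, the time-of-day part never makes it into the result. The hh, mm and yy variants work the same way with a different unit.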
Output:
SQL Server 2008 now has a ‘date’ data type which contains only a date with no time component. Anyone using SQL Server 2008 and beyond can do the following:
If using SQL 2008 and above:
It’s easy to find duplicates with one field:
So if we have a table
ID NAME EMAIL
1 John asd@asd.com
2 Sam asd@asd.com
3 Tom asd@asd.com
4 Bob bob@asd.com
5 Tom asd@asd.com
This query will give us John, Sam, Tom, Tom because they all have the same email.
However, what I want is to get duplicates with the same email and name.
That is, I want to get “Tom”, “Tom”.
The reason I need this: I made a mistake and allowed duplicate name and email values to be inserted. Now I need to remove/change the duplicates, so I need to find them first.
Simply group on both of the columns.
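A minimal sketch of grouping on both columns, using the sample rows from the question and SQLite through Python's sqlite3 module (the table name users is an assumption, since the question does not name the table):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
    INSERT INTO users VALUES
        (1, 'John', 'asd@asd.com'),
        (2, 'Sam',  'asd@asd.com'),
        (3, 'Tom',  'asd@asd.com'),
        (4, 'Bob',  'bob@asd.com'),
        (5, 'Tom',  'asd@asd.com');
""")

# Grouping on both columns means only identical (name, email) pairs share a group.
duplicate_pairs = conn.execute("""
    SELECT name, email, COUNT(*) AS occurrences
    FROM users
    GROUP BY name, email
    HAVING COUNT(*) > 1
""").fetchall()
print(duplicate_pairs)
```

Only the Tom / asd@asd.com pair comes back, with a count of 2.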
Note: the older ANSI standard is to have all non-aggregated columns in the GROUP BY but this has changed with the idea of “functional dependency”:
In relational database theory, a functional dependency is a constraint between two sets of attributes in a relation from a database. In other words, functional dependency is a constraint that describes the relationship between attributes in a relation.
Support is not consistent:
MySQL is unpredictable and you need sql_mode=only_full_group_by:
try this:
declare @YourTable table (id int, name varchar(10), email varchar(50))
INSERT @YourTable VALUES (1,'John','John-email')
INSERT @YourTable VALUES (2,'John','John-email')
INSERT @YourTable VALUES (3,'fred','John-email')
INSERT @YourTable VALUES (4,'fred','fred-email')
INSERT @YourTable VALUES (5,'sam','sam-email')
INSERT @YourTable VALUES (6,'sam','sam-email')
SELECT
name,email, COUNT(*) AS CountOf
FROM @YourTable
GROUP BY name,email
HAVING COUNT(*)>1
OUTPUT:
name email CountOf
---------- ----------- -----------
John John-email 2
sam sam-email 2
(2 row(s) affected)
if you want the IDs of the dups use this:
SELECT
y.id,y.name,y.email
FROM @YourTable y
INNER JOIN (SELECT
name,email, COUNT(*) AS CountOf
FROM @YourTable
GROUP BY name,email
HAVING COUNT(*)>1
) dt ON y.name=dt.name AND y.email=dt.email
OUTPUT:
id name email
----------- ---------- ------------
1 John John-email
2 John John-email
5 sam sam-email
6 sam sam-email
(4 row(s) affected)
to delete the duplicates try:
DELETE d
FROM @YourTable d
INNER JOIN (SELECT
y.id,y.name,y.email,ROW_NUMBER() OVER(PARTITION BY y.name,y.email ORDER BY y.name,y.email,y.id) AS RowRank
FROM @YourTable y
INNER JOIN (SELECT
name,email, COUNT(*) AS CountOf
FROM @YourTable
GROUP BY name,email
HAVING COUNT(*)>1
) dt ON y.name=dt.name AND y.email=dt.email
) dt2 ON d.id=dt2.id
WHERE dt2.RowRank!=1
SELECT * FROM @YourTable
OUTPUT:
Try this:
I have multiple sets of data to insert at once, say 4 rows. My table has three columns: Person, Id and Office.
INSERT INTO MyTable VALUES ("John", 123, "Lloyds Office");
INSERT INTO MyTable VALUES ("Jane", 124, "Lloyds Office");
INSERT INTO MyTable VALUES ("Billy", 125, "London Office");
INSERT INTO MyTable VALUES ("Miranda", 126, "Bristol Office");
Can I insert all 4 rows in a single SQL statement?
In SQL Server 2008 you can insert multiple rows using a single SQL INSERT statement.
For reference to this have a look at MOC Course 2778A - Writing SQL Queries in SQL Server 2008.
For example:
If you are inserting into a single table, you can write your query like this (maybe only in MySQL):
INSERT INTO table1 (First, Last)
VALUES
('Fred', 'Smith'),
('John', 'Smith'),
('Michael', 'Smith'),
('Robert', 'Smith');
I am trying to INSERT INTO a table using the input from another table. Although this is entirely feasible for many database engines, I always seem to struggle to remember the correct syntax for the SQL engine of the day (MySQL, Oracle, SQL Server, Informix, and DB2).
Is there a silver-bullet syntax coming from an SQL standard (for example, SQL-92) that would allow me to insert the values without worrying about the underlying database?
Try:
This is standard ANSI SQL and should work on any DBMS
It definitely works for:
@Shadow_x99: That should work fine, and you can also have multiple columns and other data as well:
INSERT INTO table1 ( column1, column2, someInt, someVarChar )
SELECT table2.column1, table2.column2, 8, 'some string etc.'
FROM table2
WHERE table2.ID = 7;
Edit: I should mention that I’ve only used this syntax with Access, SQL 2000/2005/Express, MySQL, and PostgreSQL, so those should be covered. A commenter has pointed out that it’ll work with SQLite3.
To get only one value in a multi-value INSERT from another table, I did the following in SQLite3:
INSERT INTO column_1 ( val_1, val_from_other_table )
VALUES('val_1', (SELECT val_2 FROM table_2 WHERE val_2 = something))
Also, how do LEFT JOIN, RIGHT JOIN and FULL JOIN fit in?
Assuming you’re joining on columns with no duplicates, which is a very common case:
An inner join of A and B gives the result of A intersect B, i.e. the inner part of a Venn diagram intersection.
An outer join of A and B gives the results of A union B, i.e. the outer parts of a Venn diagram union.
Examples
Suppose you have two tables, with a single column each, and data as follows:
A    B
-    -
1    3
2    4
3    5
4    6
Note that (1,2) are unique to A, (3,4) are common, and (5,6) are unique to B.
Inner join
An inner join using either of the equivalent queries gives the intersection of the two tables, i.e. the two rows they have in common.
select * from a INNER JOIN b on a.a = b.b;
select a.*, b.* from a,b where a.a = b.b;
a | b
--+--
3 | 3
4 | 4
Left outer join
A left outer join will give all rows in A, plus any common rows in B.
select * from a LEFT OUTER JOIN b on a.a = b.b;
select a.*, b.* from a,b where a.a = b.b(+);
a | b
--+-----
1 | null
2 | null
3 | 3
4 | 4
Right outer join
A right outer join will give all rows in B, plus any common rows in A.
select * from a RIGHT OUTER JOIN b on a.a = b.b;
select a.*, b.* from a,b where a.a(+) = b.b;
a | b
-----+----
3 | 3
4 | 4
null | 5
null | 6
Full outer join
A full outer join will give you the union of A and B, i.e. all the rows in A and all the rows in B. If something in A doesn’t have a corresponding datum in B, then the B portion is null, and vice versa.
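Here is a sketch of that result, using the a and b tables described above and SQLite via Python's sqlite3 module. Older SQLite versions lack FULL OUTER JOIN, so this assumes an emulation instead: a LEFT JOIN plus the unmatched rows of b.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (a INTEGER);
    CREATE TABLE b (b INTEGER);
    INSERT INTO a VALUES (1), (2), (3), (4);
    INSERT INTO b VALUES (3), (4), (5), (6);
""")

# Emulated full outer join: every row of a with its match (or NULL),
# plus the rows of b that have no match in a.
full_join_rows = conn.execute("""
    SELECT a.a, b.b FROM a LEFT JOIN b ON a.a = b.b
    UNION ALL
    SELECT NULL, b.b FROM b
    WHERE NOT EXISTS (SELECT 1 FROM a WHERE a.a = b.b)
""").fetchall()

# Sort in Python so NULLs (None) order predictably for display.
full_join_rows.sort(key=lambda row: row[0] if row[0] is not None else row[1])
print(full_join_rows)
```

On an engine with native support, select * from a FULL OUTER JOIN b on a.a = b.b should produce the same six rows.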
The Venn diagrams don’t really do it for me.
They don’t show any distinction between a cross join and an inner join, for example, or more generally show any distinction between different types of join predicate or provide a framework for reasoning about how they will operate.
There is no substitute for understanding the logical processing and it is relatively straightforward to grasp anyway.
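1. Imagine a cross join.
2. Evaluate the on clause against all rows from step 1, keeping those where the predicate evaluates to true.
3. (For outer joins only) add back in any outer rows that were lost in step 2.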
(NB: In practice the query optimiser may find more efficient ways of executing the query than the purely logical description above but the final result must be the same)
I’ll start off with an animated version of a full outer join. Further explanation follows.
Source Tables
First start with a CROSS JOIN (AKA Cartesian Product). This does not have an ON clause and simply returns every combination of rows from the two tables.
SELECT A.Colour, B.Colour FROM A CROSS JOIN B
Inner and Outer joins have an “ON” clause predicate.
SELECT A.Colour, B.Colour FROM A INNER JOIN B ON A.Colour = B.Colour
The above is the classic equi join.
The inner join condition need not necessarily be an equality condition and it need not reference columns from both (or even either) of the tables. Evaluating A.Colour NOT IN ('Green','Blue') on each row of the cross join returns.
SELECT A.Colour, B.Colour FROM A INNER JOIN B ON 1 =1
The join condition evaluates to true for all rows in the cross join result so this is just the same as a cross join. I won’t repeat the picture of the 16 rows again.
Outer Joins are logically evaluated in the same way as inner joins except that if a row from the left table (for a left join) does not join with any rows from the right hand table at all it is preserved in the result with NULL values for the right hand columns.
This simply restricts the previous result to only return the rows where B.Colour IS NULL. In this particular case these will be the rows that were preserved as they had no match in the right hand table and the query returns the single red row not matched in table B. This is known as an anti semi join.
It is important to select a column for the IS NULL test that is either not nullable or for which the join condition ensures that any NULL values will be excluded in order for this pattern to work correctly and avoid just bringing back rows which happen to have a NULL value for that column in addition to the unmatched rows.
Right outer joins act similarly to left outer joins except they preserve non matching rows from the right table and null extend the left hand columns.
Full outer joins combine the behaviour of left and right joins and preserve the non matching rows from both the left and the right tables.
No rows in the cross join match the 1=0 predicate. All rows from both sides are preserved using normal outer join rules with NULL in the columns from the table on the other side.
With a minor amend to the preceding query one could simulate a UNION ALL of the two tables.
Note that the WHERE clause (if present) logically runs after the join. One common error is to perform a left outer join and then include a WHERE clause with a condition on the right table that ends up excluding the non matching rows. The above ends up performing the outer join…
… And then the “Where” clause runs. NULL = 'Green' does not evaluate to true so the row preserved by the outer join ends up discarded (along with the blue one) effectively converting the join back to an inner one.
If the intention was to include only rows from B where Colour is Green and all rows from A regardless the correct syntax would be
See these examples run live at SQLFiddle.com.
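The WHERE-versus-ON pitfall is easy to reproduce locally. The sketch below uses SQLite via Python's sqlite3 module with made-up sample rows (the original source tables were shown as images and are not reproduced here), so treat the data as an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (Colour TEXT);
    CREATE TABLE B (Colour TEXT);
    INSERT INTO A VALUES ('Red'), ('Green'), ('Blue');
    INSERT INTO B VALUES ('Green'), ('Blue');
""")

# Filter in WHERE: runs after the join, so NULL-extended rows are discarded.
where_filtered = conn.execute("""
    SELECT A.Colour, B.Colour
    FROM A LEFT JOIN B ON A.Colour = B.Colour
    WHERE B.Colour = 'Green'
    ORDER BY A.Colour
""").fetchall()

# Filter in ON: only restricts which B rows can match; every A row is preserved.
on_filtered = conn.execute("""
    SELECT A.Colour, B.Colour
    FROM A LEFT JOIN B ON A.Colour = B.Colour AND B.Colour = 'Green'
    ORDER BY A.Colour
""").fetchall()

print(where_filtered)
print(on_filtered)
```

The first query behaves like an inner join and returns one row. The second keeps all three rows of A, with NULL next to Red and Blue.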
Joins are used to combine the data from two tables, with the result being a new, temporary table. Joins are performed based on something called a predicate, which specifies the condition to use in order to perform a join. The difference between an inner join and an outer join is that an inner join will return only the rows that actually match based on the join predicate. For example, let’s consider the Employee and Location tables:
Inner Join:- Inner join creates a new result table by combining column values of two tables (Employee and Location) based upon the join-predicate. The query compares each row of Employee with each row of Location to find all pairs of rows which satisfy the join-predicate. When the join-predicate is satisfied by matching non-NULL values, column values for each matched pair of rows of Employee and Location are combined into a result row. Here’s what the SQL for an inner join will look like:
select * from employee inner join location on employee.empID = location.empID
OR
select * from employee, location where employee.empID = location.empID
Now, here is what the result of running that SQL would look like:
Outer Join:- An outer join does not require each record in the two joined tables to have a matching record. The joined table retains each record—even if no other matching record exists. Outer joins subdivide further into left outer joins and right outer joins, depending on which table’s rows are retained (left or right).
Left Outer Join:- The result of a left outer join (or simply left join) for tables Employee and Location always contains all records of the “left” table (Employee), even if the join-condition does not find any matching record in the “right” table (Location). Here is what the SQL for a left outer join would look like, using the tables above:
select * from employee left outer join location on employee.empID = location.empID;
-- Use of the OUTER keyword is optional
Now, here is what the result of running this SQL would look like:
Right Outer Join:- A right outer join (or right join) closely resembles a left outer join, except with the treatment of the tables reversed. Every row from the “right” table (Location) will appear in the joined table at least once. If no matching row from the “left” table (Employee) exists, NULL will appear in columns from Employee for those records that have no match in Location. This is what the SQL looks like:
select * from employee right outer join location on employee.empID = location.empID;
-- Use of the OUTER keyword is optional
Using the tables above, we can show what the result set of a right outer join would look like:
Full Outer Join:- A full outer join (or full join) retains the nonmatching information by including nonmatching rows from both tables in the results of a join. It includes all rows from both tables, regardless of whether or not the other table has a matching value.
MySQL 8.0 Reference Manual - Join Syntax
Consider a database table holding names, with three rows:
Is there an easy way to turn this into a single string of Peter, Paul, Mary?
If you are on SQL Server 2017 or Azure, see Mathieu Renda answer.
I had a similar issue when I was trying to join two tables with one-to-many relationships. In SQL 2005 I found that the XML PATH method can handle the concatenation of the rows very easily.
If there is a table called STUDENTS
Result I expected was:
I used the following T-SQL:
SELECT Main.SubjectID,
LEFT(Main.Students,Len(Main.Students)-1) As "Students"
FROM
(
SELECT DISTINCT ST2.SubjectID,
(
SELECT ST1.StudentName + ',' AS [text()]
FROM dbo.Students ST1
WHERE ST1.SubjectID = ST2.SubjectID
ORDER BY ST1.SubjectID
FOR XML PATH ('')
) [Students]
FROM dbo.Students ST2
) [Main]
You can do the same thing in a more compact way if you can concat the commas at the beginning and use substring to skip the first one so you don’t need to do a sub-query:
This answer may return unexpected results. For consistent results, use one of the FOR XML PATH methods detailed in other answers.
Use COALESCE:
Just some explanation (since this answer seems to get relatively regular views):
No need to initialize @Names with an empty string value.
No need to strip off an extra separator at the end.
If a row has a NULL Name value, the NULL will make @Names NULL after that row, and the next row will start over as an empty string again. Easily fixed with one of two solutions:
DECLARE @Names VARCHAR(8000)
SELECT @Names = COALESCE(@Names + ', ', '') + Name
FROM People
WHERE Name IS NOT NULL
or:
DECLARE @Names VARCHAR(8000)
SELECT @Names = COALESCE(@Names + ', ', '') +
ISNULL(Name, 'N/A')
FROM People
Depending on what behavior you want (the first option just filters NULLs out, the second option keeps them in the list with a marker message [replace ‘N/A’ with whatever is appropriate for you]).
Starting with SQL Server 2017, we can finally concatenate across rows without having to resort to any variable or XML witchery.
Without grouping
With grouping :
SELECT GroupName, STRING_AGG(Name, ', ') AS Departments
FROM HumanResources.Department
GROUP BY GroupName;
With grouping and sub-sorting
SELECT GroupName, STRING_AGG(Name, ', ') WITHIN GROUP (ORDER BY Name ASC) AS Departments
FROM HumanResources.Department
GROUP BY GroupName;
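For a quick local experiment with the same idea, SQLite's group_concat aggregate plays a similar role to STRING_AGG. This is a sketch via Python's sqlite3 module, assuming a People table with a Name column like the one in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE People (Name TEXT);
    INSERT INTO People VALUES ('Peter'), ('Paul'), ('Mary');
""")

# group_concat collapses the Name values of all rows into one string.
# SQLite does not guarantee the order in which values are concatenated.
(names_csv,) = conn.execute("SELECT group_concat(Name, ', ') FROM People").fetchone()
print(names_csv)
```

Unlike STRING_AGG ... WITHIN GROUP (ORDER BY ...), this form gives no ordering guarantee, so sort afterwards if the order matters.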
How do I query an Oracle database to display the names of all tables in it?
This is assuming that you have access to the DBA_TABLES data dictionary view. If you do not have those privileges but need them, you can request that the DBA explicitly grants you privileges on that table, or that the DBA grants you the SELECT ANY DICTIONARY privilege or the SELECT_CATALOG_ROLE role (either of which would allow you to query any data dictionary table). Of course, you may want to exclude certain schemas like SYS and SYSTEM which have large numbers of Oracle tables that you probably don’t care about.
Alternatively, if you do not have access to DBA_TABLES, you can see all the tables that your account has access to through the ALL_TABLES view:
Although, that may be a subset of the tables available in the database (ALL_TABLES shows you the information for all the tables that your user has been granted access to).
If you are only concerned with the tables that you own, not those that you have access to, you could use USER_TABLES:
Since USER_TABLES only has information about the tables that you own, it does not have an OWNER column; the owner, by definition, is you.
Oracle also has a number of legacy data dictionary views (TAB, DICT, TABS, and CAT, for example) that could be used. In general, I would not suggest using these legacy views unless you absolutely need to backport your scripts to Oracle 6. Oracle has not changed these views in a long time so they often have problems with newer types of objects. For example, the TAB and CAT views both show information about tables that are in the user’s recycle bin while the [DBA|ALL|USER]_TABLES views all filter those out. CAT also shows information about materialized view logs with a TABLE_TYPE of “TABLE” which is unlikely to be what you really want. DICT combines tables and synonyms and doesn’t tell you who owns the object.
Querying user_tables and dba_tables didn’t work.
This one did:
Going one step further, there is another view called cols (all_tab_columns) which can be used to ascertain which tables contain a given column name.
For example:
SELECT table_name, column_name
FROM cols
WHERE table_name LIKE 'EST%'
AND column_name LIKE '%CALLREF%';
to find all tables having a name beginning with EST and columns containing CALLREF anywhere in their names.
This can help when working out what columns you want to join on, for example, depending on your table and column naming conventions.
Is it possible to query for table names which contain columns being
?
Search Tables:
SELECT c.name AS 'ColumnName'
,t.name AS 'TableName'
FROM sys.columns c
JOIN sys.tables t ON c.object_id = t.object_id
WHERE c.name LIKE '%MyName%'
ORDER BY TableName
,ColumnName;
Search Tables & Views:
We can also use the following syntax:-
How do I do a SELECT * INTO [temp table] FROM [stored procedure]? Not FROM [Table] and without defining [temp table]?
Select all data from BusinessLine into tmpBusLine works fine.
I am trying the same, but using a stored procedure that returns data is not quite the same.
Output message:
Msg 156, Level 15, State 1, Line 2 Incorrect syntax near the keyword ‘exec’.
I have read several examples of creating a temporary table with the same structure as the output stored procedure, which works fine, but it would be nice to not supply any columns.
You can use OPENROWSET for this. Have a look. I’ve also included the sp_configure code to enable Ad Hoc Distributed Queries, in case it isn’t already enabled.
CREATE PROC getBusinessLineHistory
AS
BEGIN
SELECT * FROM sys.databases
END
GO
sp_configure 'Show Advanced Options', 1
GO
RECONFIGURE
GO
sp_configure 'Ad Hoc Distributed Queries', 1
GO
RECONFIGURE
GO
SELECT * INTO #MyTempTable FROM OPENROWSET('SQLNCLI', 'Server=(local)\SQL2008;Trusted_Connection=yes;',
'EXEC getBusinessLineHistory')
SELECT * FROM #MyTempTable
If you want to do it without first declaring the temporary table, you could try creating a user-defined function rather than a stored procedure and make that user-defined function return a table. Alternatively, if you want to use the stored procedure, try something like this:
I need a select which would return results like this:
And I need all results, i.e. this includes strings with ‘word2 word3 word1’ or ‘word1 word3 word2’ or any other combination of the three.
All words need to be in the result.
Rather slow, but working method to include any of the words:
SELECT * FROM mytable
WHERE column1 LIKE '%word1%'
OR column1 LIKE '%word2%'
OR column1 LIKE '%word3%'
If you need all words to be present, use this:
SELECT * FROM mytable
WHERE column1 LIKE '%word1%'
AND column1 LIKE '%word2%'
AND column1 LIKE '%word3%'
If you want something faster, you need to look into full text search, and this is very specific for each database type.
Note that if you use LIKE to determine if a string is a substring of another string, you must escape the pattern matching characters in your search string.
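One way to do that escaping is sketched below with SQLite via Python's sqlite3 module. The helper name escape_like and the sample rows are made up for illustration, and the backslash escape character is a choice made explicit with the ESCAPE clause. T-SQL also treats [ ] as pattern syntax, which this sketch does not handle.

```python
import sqlite3

def escape_like(term: str, escape_char: str = "\\") -> str:
    # Escape the escape character first, then the LIKE wildcards % and _.
    for special in (escape_char, "%", "_"):
        term = term.replace(special, escape_char + special)
    return term

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (column1 TEXT);
    INSERT INTO mytable VALUES ('50% off'), ('50 percent off'), ('500 items');
""")

search_term = "50%"
pattern = "%" + escape_like(search_term) + "%"
literal_matches = conn.execute(
    "SELECT column1 FROM mytable WHERE column1 LIKE ? ESCAPE '\\'",
    (pattern,),
).fetchall()
print(literal_matches)
```

Without the escaping step, the % in the search term would act as a wildcard and all three rows would match.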
If your SQL dialect supports CHARINDEX, it’s a lot easier to use it instead:
SELECT * FROM MyTable
WHERE CHARINDEX('word1', Column1) > 0
AND CHARINDEX('word2', Column1) > 0
AND CHARINDEX('word3', Column1) > 0
Also, please keep in mind that this and the method in the accepted answer only cover substring matching rather than word matching. So, for example, the string 'word1word2word3' would still match.
CREATE FUNCTION [dbo].[fnSplit] ( @sep CHAR(1), @str VARCHAR(512) )
RETURNS TABLE AS
RETURN (
WITH Pieces(pn, start, stop) AS (
SELECT 1, 1, CHARINDEX(@sep, @str)
UNION ALL
SELECT pn + 1, stop + 1, CHARINDEX(@sep, @str, stop + 1)
FROM Pieces
WHERE stop > 0
)
SELECT
pn AS Id,
SUBSTRING(@str, start, CASE WHEN stop > 0 THEN stop - start ELSE 512 END) AS Data
FROM
Pieces
)
DECLARE @FilterTable TABLE (Data VARCHAR(512))
INSERT INTO @FilterTable (Data)
SELECT DISTINCT S.Data
FROM fnSplit(' ', 'word1 word2 word3') S -- Contains words
SELECT DISTINCT
T.*
FROM
MyTable T
INNER JOIN @FilterTable F1 ON T.Column1 LIKE '%' + F1.Data + '%'
LEFT JOIN @FilterTable F2 ON T.Column1 NOT LIKE '%' + F2.Data + '%'
WHERE
F2.Data IS NULL
If user input is inserted without modification into an SQL query, then the application becomes vulnerable to SQL injection, like in the following example:
$unsafe_variable = $_POST['user_input'];
mysql_query("INSERT INTO `table` (`column`) VALUES ('$unsafe_variable')");
That’s because the user can input something like value'); DROP TABLE table;--, and the query becomes:
What can be done to prevent this from happening?
Use prepared statements and parameterized queries. These are SQL statements that are sent to and parsed by the database server separately from any parameters. This way it is impossible for an attacker to inject malicious SQL.
You basically have two options to achieve this:
Using PDO (for any supported database driver):
$stmt = $pdo->prepare('SELECT * FROM employees WHERE name = :name');
$stmt->execute(array('name' => $name));
foreach ($stmt as $row) {
// Do something with $row
}
Using MySQLi (for MySQL; see http://php.net/manual/en/book.mysqli.php):
$stmt = $dbConnection->prepare('SELECT * FROM employees WHERE name = ?');
$stmt->bind_param('s', $name); // 's' specifies the variable type => 'string'
$stmt->execute();
$result = $stmt->get_result();
while ($row = $result->fetch_assoc()) {
    // Do something with $row
}
If you’re connecting to a database other than MySQL, there is a driver-specific second option that you can refer to (for example, pg_prepare() and pg_execute() for PostgreSQL). PDO is the universal option.
Correctly setting up the connection
Note that when using PDO to access a MySQL database, real prepared statements are not used by default. To fix this you have to disable the emulation of prepared statements. An example of creating a connection using PDO is:
$dbConnection = new PDO('mysql:dbname=dbtest;host=127.0.0.1;charset=utf8', 'user', 'password');
$dbConnection->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
$dbConnection->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
In the above example the error mode isn’t strictly necessary, but it is advised to add it. This way the script will not stop with a Fatal Error when something goes wrong. And it gives the developer the chance to catch any error(s) which are thrown as PDOExceptions.
What is mandatory, however, is the first setAttribute() line, which tells PDO to disable emulated prepared statements and use real prepared statements. This makes sure the statement and the values aren’t parsed by PHP before sending it to the MySQL server (giving a possible attacker no chance to inject malicious SQL).
Although you can set the charset in the options of the constructor, it’s important to note that ‘older’ versions of PHP (before 5.3.6) silently ignored the charset parameter in the DSN.
The SQL statement you pass to prepare is parsed and compiled by the database server. By specifying parameters (either a ? or a named parameter like :name in the example above) you tell the database engine where you want to filter on. Then when you call execute, the prepared statement is combined with the parameter values you specify.
The important thing here is that the parameter values are combined with the compiled statement, not an SQL string. SQL injection works by tricking the script into including malicious strings when it creates SQL to send to the database. So by sending the actual SQL separately from the parameters, you limit the risk of ending up with something you didn’t intend.
Any parameters you send when using a prepared statement will just be treated as strings (although the database engine may do some optimization so parameters may end up as numbers too, of course). In the example above, if the $name variable contains 'Sarah'; DELETE FROM employees the result would simply be a search for the string "'Sarah'; DELETE FROM employees", and you will not end up with an empty table.
Another benefit of using prepared statements is that if you execute the same statement many times in the same session it will only be parsed and compiled once, giving you some speed gains.
Oh, and since you asked about how to do it for an insert, here’s an example (using PDO):
$preparedStatement = $db->prepare('INSERT INTO table (column) VALUES (:column)');
$preparedStatement->execute(array('column' => $unsafeValue));
While you can still use prepared statements for the query parameters, the structure of the dynamic query itself cannot be parametrized and certain query features cannot be parametrized.
For these specific scenarios, the best thing to do is use a whitelist filter that restricts the possible values.
Deprecated Warning: This answer’s sample code (like the question’s sample code) uses PHP’s MySQL extension, which was deprecated in PHP 5.5.0 and removed entirely in PHP 7.0.0.
Security Warning: This answer is not in line with security best practices. Escaping is inadequate to prevent SQL injection, use prepared statements instead. Use the strategy outlined below at your own risk. (Also, mysql_real_escape_string() was removed in PHP 7.)
If you’re using a recent version of PHP, the mysql_real_escape_string
option outlined below will no longer be available (though mysqli::escape_string
is a modern equivalent). These days the mysql_real_escape_string
option would only make sense for legacy code on an old version of PHP.
You’ve got two options - escaping the special characters in your unsafe_variable
, or using a parameterized query. Both would protect you from SQL injection. The parameterized query is considered the better practice but will require changing to a newer MySQL extension in PHP before you can use it.
We’ll cover the lower impact string escaping one first.
//Connect
$unsafe_variable = $_POST["user-input"];
$safe_variable = mysql_real_escape_string($unsafe_variable);
mysql_query("INSERT INTO table (column) VALUES ('" . $safe_variable . "')");
//Disconnect
See also, the details of the mysql_real_escape_string
function.
To use the parameterized query, you need to use MySQLi rather than the MySQL functions. To rewrite your example, we would need something like the following.
<?php
$mysqli = new mysqli("server", "username", "password", "database_name");
// TODO - Check that connection was successful.
$unsafe_variable = $_POST["user-input"];
$stmt = $mysqli->prepare("INSERT INTO table (column) VALUES (?)");
// TODO check that $stmt creation succeeded
// "s" means the database expects a string
$stmt->bind_param("s", $unsafe_variable);
$stmt->execute();
$stmt->close();
$mysqli->close();
?>
The key function you’ll want to read up on there would be mysqli::prepare
.
Also, as others have suggested, you may find it useful/easier to step up a layer of abstraction with something like PDO.
Please note that the case you asked about is a fairly simple one and that more complex cases may require more complex approaches. In particular:
mysql_real_escape_string
. In this kind of case, you would be better off passing the user’s input through a whitelist to ensure only ‘safe’ values are allowed through.
mysql_real_escape_string
approach, you will suffer from the problem described by Polynomial in the comments below. This case is trickier because integers would not be surrounded by quotes, so you could deal with it by validating that the user input contains only digits.
Every answer here covers only part of the problem. In fact, there are four different query parts which we can add to it dynamically: -
And prepared statements cover only two of them.
But sometimes we have to make our query even more dynamic, adding operators or identifiers as well. So, we will need different protection techniques.
In general, such a protection approach is based on whitelisting.
In this case, every dynamic parameter should be hardcoded in your script and chosen from that set. For example, to do dynamic ordering:
$orders = array("name", "price", "qty"); // Whitelisted field names
$key = array_search($_GET['sort'], $orders); // Index of the requested name, or false if it is not listed
$orderby = $orders[$key]; // false is used as index 0, so an unknown name falls back to the first field
$query = "SELECT * FROM `table` ORDER BY $orderby"; // Value is safe: it always comes from $orders
However, there is another way to secure identifiers - escaping. As long as you have an identifier quoted, you can escape backticks inside by doubling them.
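A minimal sketch of that doubling rule, written in Python rather than PHP. The helper name quote_identifier and the hostile column value are hypothetical; the function only covers the backtick-doubling step described above and is not a complete identifier validator.

```python
def quote_identifier(name: str) -> str:
    """Wrap a MySQL-style identifier in backticks, doubling any backtick inside it."""
    return "`" + name.replace("`", "``") + "`"


# Hypothetical hostile value that tries to break out of the quoted identifier.
column = "price` DESC; DROP TABLE users; -- "
query = "SELECT * FROM `table` ORDER BY " + quote_identifier(column)
print(query)
```

Because every backtick inside the name is doubled, the whole value stays inside one quoted identifier. The query would then fail on an unknown column rather than run the injected statement.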
As a further step, we can borrow a truly brilliant idea of using some placeholder (a proxy to represent the actual value in the query) from the prepared statements and invent a placeholder of another type - an identifier placeholder.
So, to make a long story short: it is the placeholder, not the prepared statement, that can be considered a silver bullet.
So, a general recommendation may be phrased as: as long as you are adding dynamic parts to the query using placeholders (and these placeholders are properly processed, of course), you can be sure that your query is safe.
Still, there is an issue with SQL syntax keywords (such as AND, DESC and such), but whitelisting seems to be the only approach in this case.
Although there is general agreement on the best practices regarding SQL injection protection, there are still many bad practices as well. And some of them are too deeply rooted in the minds of PHP users. For instance, on this very page there are (although invisible to most visitors) more than 80 deleted answers - all removed by the community due to bad quality or for promoting bad and outdated practices. Worse yet, some of the bad answers aren’t deleted, but are rather prospering.
For example, there are still many answers, including the second most upvoted answer, suggesting manual string escaping - an outdated approach that is proven to be insecure.
Or there is a slightly better answer that suggests just another method of string formatting and even boasts of it as the ultimate panacea. Of course, it is not. This method is no better than regular string formatting, and it keeps all its drawbacks: it is applicable to strings only and, like any other manual formatting, it’s an essentially optional, non-obligatory measure, prone to human error of any sort.
I think that all this is because of one very old superstition, supported by such authorities as OWASP or the PHP manual, which proclaims equality between any kind of “escaping” and protection from SQL injections.
Regardless of what the PHP manual has said for ages, *_escape_string by no means makes data safe and was never intended to. Besides being useless for any SQL part other than a string, manual escaping is wrong because it is manual, as opposed to automated.
And OWASP makes it even worse, stressing escaping user input, which is utter nonsense: there should be no such words in the context of injection protection. Every variable is potentially dangerous - no matter the source! Or, in other words - every variable has to be properly formatted to be put into a query - no matter the source again. It’s the destination that matters. The moment a developer starts to separate the sheep from the goats (thinking about whether some particular variable is “safe” or not), he or she takes the first step towards disaster. Not to mention that even the wording suggests bulk escaping at the entry point, resembling the very magic quotes feature - already despised, deprecated and removed.
So, unlike any kind of “escaping”, prepared statements are the measure that indeed protects from SQL injection (when applicable).
If you’re still not convinced, here is a step-by-step explanation I wrote, The Hitchhiker’s Guide to SQL Injection prevention, where I explained all these matters in detail and even compiled a section entirely dedicated to bad practices and their disclosure.
I have a database with account numbers
and card numbers
. I match these to a file to update
any card numbers to the account number, so that I am only working with account numbers.
I created a view linking the table to the account/card database to return the Table ID
and the related account number, and now I need to update those records where the ID matches with the Account Number.
This is the Sales_Import
table, where the account number
field needs to be updated:
And this is the RetrieveAccountNumber
table, where I need to update from:
I tried the below, but no luck so far:
UPDATE [Sales_Lead].[dbo].[Sales_Import]
SET [AccountNumber] = (SELECT RetrieveAccountNumber.AccountNumber
FROM RetrieveAccountNumber
WHERE [Sales_Lead].[dbo].[Sales_Import]. LeadID =
RetrieveAccountNumber.LeadID)
It updates the card numbers to account numbers, but the account numbers get replaced by NULL
I believe an UPDATE FROM
with a JOIN
will help:
UPDATE
Sales_Import
SET
Sales_Import.AccountNumber = RAN.AccountNumber
FROM
Sales_Import SI
INNER JOIN
RetrieveAccountNumber RAN
ON
SI.LeadID = RAN.LeadID;
The simple way to copy the content from one table to another is as follows:
UPDATE table2
SET table2.col1 = table1.col1,
table2.col2 = table1.col2,
...
FROM table1, table2
WHERE table1.memberid = table2.memberid
You can also add a condition so that only particular rows are copied.
I have a start_date
and end_date
. I want to get the list of dates in between these two dates. Can anyone help me by pointing out the mistake in my query?
select Date,TotalAllowance
from Calculation
where EmployeeId=1
and Date between 2011/02/25 and 2011/02/27
Here Date
is a datetime
variable.
You should put those two dates between single quotes, like this:
select Date, TotalAllowance from Calculation where EmployeeId = 1
and Date between '2011/02/25' and '2011/02/27'
or can use
Since a datetime without a specified time segment will have a value of date 00:00:00.000
, if you want to be sure you get all the dates in your range, you must either supply the time for your ending date or increase your ending date and use <
.
-- datetime is stored in steps of about 3 ms, so .999 would round up to 2011/02/28 00:00:00.000;
-- .997 is the last representable instant of 2011/02/27
select Date,TotalAllowance from Calculation where EmployeeId=1
and Date between '2011/02/25' and '2011/02/27 23:59:59.997'
OR
select Date,TotalAllowance from Calculation where EmployeeId=1
and Date >= '2011/02/25' and Date < '2011/02/28'
OR
select Date,TotalAllowance from Calculation where EmployeeId=1
and Date >= '2011/02/25' and Date <= '2011/02/27 23:59:59.997'
DO NOT use the following, as it could return some records from 2011/02/28 if their times are 00:00:00.000.
Try this:
select Date,TotalAllowance from Calculation where EmployeeId=1
and [Date] between '2011/02/25' and '2011/02/27'
The date values need to be typed as strings.
To ensure future-proofing your query for SQL Server 2008 and higher, Date
should be escaped because it’s a reserved word in later versions.
Bear in mind that the dates without times take midnight as their defaults, so you may not have the correct value there.
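The midnight pitfall is easy to reproduce. The following is a rough sketch using Python's sqlite3 module rather than SQL Server, so the dates are stored as ISO text instead of a real datetime type. The four sample rows are invented for the illustration. It compares BETWEEN with date-only bounds against the half-open range (>= first day, < day after the last day).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Calculation (EmployeeId INTEGER, [Date] TEXT, TotalAllowance INTEGER)"
)
conn.executemany(
    "INSERT INTO Calculation VALUES (?, ?, ?)",
    [
        (1, "2011-02-25 08:00:00", 10),
        (1, "2011-02-27 00:00:00", 20),
        (1, "2011-02-27 15:30:00", 30),  # afternoon of the last day
        (1, "2011-02-28 00:00:00", 40),  # midnight of the following day
    ],
)

# BETWEEN with date-only bounds stops at midnight on the 27th.
between = conn.execute(
    "SELECT TotalAllowance FROM Calculation "
    "WHERE EmployeeId = ? AND [Date] BETWEEN ? AND ? ORDER BY [Date]",
    (1, "2011-02-25 00:00:00", "2011-02-27 00:00:00"),
).fetchall()

# Half-open range: from the first day up to, but not including, the 28th.
half_open = conn.execute(
    "SELECT TotalAllowance FROM Calculation "
    "WHERE EmployeeId = ? AND [Date] >= ? AND [Date] < ? ORDER BY [Date]",
    (1, "2011-02-25 00:00:00", "2011-02-28 00:00:00"),
).fetchall()

print(between)
print(half_open)
```

The BETWEEN version misses the afternoon row on the 27th, while the half-open version includes it and still excludes midnight on the 28th.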
I would like to query the name of all columns of a table. I found how to do this in:
But I need to know: how can this be done in Microsoft SQL Server (2008 in my case)?
You can obtain this information and much, much more by querying the Information Schema views.
This sample query:
Can be made over all these DB objects:
You can use the stored procedure sp_columns which would return information pertaining to all columns for a given table. More info can be found here http://msdn.microsoft.com/en-us/library/ms176077.aspx
You can also do it with a SQL query. Something like this should help:
Or a variation would be:
SELECT o.Name, c.Name
FROM sys.columns c
JOIN sys.objects o ON o.object_id = c.object_id
WHERE o.type = 'U'
ORDER BY o.Name, c.Name
This gets all columns from all tables, ordered by table name and then on column name.
This is better than getting from sys.columns
because it shows DATA_TYPE
directly.
I need to update this table in SQL Server 2005 with data from its ‘parent’ table, see below:
sale
ud
sale.assid
contains the correct value to update ud.assid
.
What query will do this? I’m thinking a join
but I’m not sure if it’s possible.
Syntax strictly depends on which SQL DBMS you’re using. Here are some ways to do it in ANSI/ISO (aka should work on any SQL DBMS), MySQL, SQL Server, and Oracle. Be advised that my suggested ANSI/ISO method will typically be much slower than the other two methods, but if you’re using a SQL DBMS other than MySQL, SQL Server, or Oracle, then it may be the only way to go (e.g. if your SQL DBMS doesn’t support MERGE
):
ANSI/ISO:
update ud
set assid = (
select sale.assid
from sale
where sale.udid = ud.id
)
where exists (
select *
from sale
where sale.udid = ud.id
);
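To try the ANSI/ISO form above without installing a server, here is a small sketch that runs it through Python's sqlite3 module. The column types and sample rows are assumptions; only the table and column names come from the question. It also shows why the WHERE EXISTS guard matters: the ud row with no matching sale keeps its old value instead of being set to NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sale (id INTEGER, udid INTEGER, assid INTEGER);
    CREATE TABLE ud   (id INTEGER, assid INTEGER);
    INSERT INTO sale VALUES (1, 10, 100), (2, 20, 200);
    INSERT INTO ud   VALUES (10, NULL), (20, NULL), (30, 999);
""")

# Correlated subquery update, guarded so unmatched rows are left alone.
conn.execute("""
    UPDATE ud
    SET assid = (SELECT sale.assid FROM sale WHERE sale.udid = ud.id)
    WHERE EXISTS (SELECT * FROM sale WHERE sale.udid = ud.id)
""")

result = conn.execute("SELECT id, assid FROM ud ORDER BY id").fetchall()
print(result)
```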
MySQL:
SQL Server:
PostgreSQL:
Note that the target table must not be repeated in the FROM
clause for Postgres.
Oracle:
update
(select
u.assid as new_assid,
s.assid as old_assid
from ud u
inner join sale s on
u.id = s.udid) up
set up.new_assid = up.old_assid
SQLite:
This should work in SQL Server:
postgres
UPDATE table1
SET COLUMN = value
FROM table2,
table3
WHERE table1.column_id = table2.id
AND table1.column_id = table3.id
AND table1.COLUMN = value
AND table2.COLUMN = value
AND table3.COLUMN = value
My table is:
id home datetime player resource
---|-----|------------|--------|---------
1 | 10 | 04/03/2009 | john | 399
2 | 11 | 04/03/2009 | juliet | 244
5 | 12 | 04/03/2009 | borat | 555
3 | 10 | 03/03/2009 | john | 300
4 | 11 | 03/03/2009 | juliet | 200
6 | 12 | 03/03/2009 | borat | 500
7 | 13 | 24/12/2008 | borat | 600
8 | 13 | 01/01/2009 | borat | 700
I need to select each distinct home
holding the maximum value of datetime
.
Result would be:
id home datetime player resource
---|-----|------------|--------|---------
1 | 10 | 04/03/2009 | john | 399
2 | 11 | 04/03/2009 | juliet | 244
5 | 12 | 04/03/2009 | borat | 555
8 | 13 | 01/01/2009 | borat | 700
I have tried:
-- 1 ..by the MySQL manual:
SELECT DISTINCT
home,
id,
datetime AS dt,
player,
resource
FROM topten t1
WHERE datetime = (SELECT
MAX(t2.datetime)
FROM topten t2
GROUP BY home)
GROUP BY datetime
ORDER BY datetime DESC
Doesn’t work. Result-set has 130 rows although database holds 187. Result includes some duplicates of home
.
-- 2 ..join
SELECT
s1.id,
s1.home,
s1.datetime,
s1.player,
s1.resource
FROM topten s1
JOIN (SELECT
id,
MAX(datetime) AS dt
FROM topten
GROUP BY id) AS s2
ON s1.id = s2.id
ORDER BY datetime
Nope. Gives all the records.
With various results.
You are so close! All you need to do is select BOTH the home and its max date time, then join back to the topten
table on BOTH fields:
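A sketch of that idea, run through Python's sqlite3 module so it can be tried anywhere. This is not necessarily the exact query from the answer. The dates from the question are rewritten in ISO format (an assumption made here so that MAX() on text sorts correctly), and the alias names grouped and max_dt are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE topten (id INTEGER, home INTEGER, datetime TEXT, "
    "player TEXT, resource INTEGER)"
)
conn.executemany(
    "INSERT INTO topten VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10, "2009-03-04", "john", 399),
        (2, 11, "2009-03-04", "juliet", 244),
        (5, 12, "2009-03-04", "borat", 555),
        (3, 10, "2009-03-03", "john", 300),
        (4, 11, "2009-03-03", "juliet", 200),
        (6, 12, "2009-03-03", "borat", 500),
        (7, 13, "2008-12-24", "borat", 600),
        (8, 13, "2009-01-01", "borat", 700),
    ],
)

# Inner query: one row per home with its latest datetime.
# Outer join: match back on BOTH home and that datetime.
latest = conn.execute("""
    SELECT tt.id, tt.home, tt.datetime, tt.player, tt.resource
    FROM topten tt
    INNER JOIN (SELECT home, MAX(datetime) AS max_dt
                FROM topten
                GROUP BY home) grouped
            ON tt.home = grouped.home
           AND tt.datetime = grouped.max_dt
    ORDER BY tt.home
""").fetchall()

ids = [row[0] for row in latest]
print(ids)
```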
Here goes T-SQL version:
-- Test data
DECLARE @TestTable TABLE (id INT, home INT, date DATETIME,
player VARCHAR(20), resource INT)
INSERT INTO @TestTable
SELECT 1, 10, '2009-03-04', 'john', 399 UNION
SELECT 2, 11, '2009-03-04', 'juliet', 244 UNION
SELECT 5, 12, '2009-03-04', 'borat', 555 UNION
SELECT 3, 10, '2009-03-03', 'john', 300 UNION
SELECT 4, 11, '2009-03-03', 'juliet', 200 UNION
SELECT 6, 12, '2009-03-03', 'borat', 500 UNION
SELECT 7, 13, '2008-12-24', 'borat', 600 UNION
SELECT 8, 13, '2009-01-01', 'borat', 700
-- Answer
SELECT id, home, date, player, resource
FROM (SELECT id, home, date, player, resource,
RANK() OVER (PARTITION BY home ORDER BY date DESC) N
FROM @TestTable
)M WHERE N = 1
-- and if you really want only home with max date
SELECT T.id, T.home, T.date, T.player, T.resource
FROM @TestTable T
INNER JOIN
( SELECT TI.id, TI.home, TI.date,
RANK() OVER (PARTITION BY TI.home ORDER BY TI.date) N
FROM @TestTable TI
WHERE TI.date IN (SELECT MAX(TM.date) FROM @TestTable TM)
)TJ ON TJ.N = 1 AND T.id = TJ.id
EDIT
Unfortunately, there is no RANK() OVER function in MySQL before version 8.0, which added window functions.
But it can be emulated, see Emulating Analytic (AKA Ranking) Functions with MySQL.
So this is MySQL version:
The fastest MySQL
solution, without inner queries and without GROUP BY
:
SELECT m.* -- get the row that contains the max value
FROM topten m -- "m" from "max"
LEFT JOIN topten b -- "b" from "bigger"
ON m.home = b.home -- match "max" row with "bigger" row by `home`
AND m.datetime < b.datetime -- want "bigger" than "max"
WHERE b.datetime IS NULL -- keep only if there is no bigger than max
Explanation:
Join the table with itself using the home
column. The use of LEFT JOIN
ensures all the rows from table m
appear in the result set. Those that don’t have a match in table b
will have NULL
s for the columns of b
.
The other condition on the JOIN
asks to match only the rows from b
that have bigger value on the datetime
column than the row from m
.
Using the data posted in the question, the LEFT JOIN will produce these pairs:
+------------------------------------------+--------------------------------+
| the row from `m` | the matching row from `b` |
|------------------------------------------|--------------------------------|
| id home datetime player resource | id home datetime ... |
|----|-----|------------|--------|---------|------|------|------------|-----|
| 1 | 10 | 04/03/2009 | john | 399 | NULL | NULL | NULL | ... | *
| 2 | 11 | 04/03/2009 | juliet | 244 | NULL | NULL | NULL | ... | *
| 5 | 12 | 04/03/2009 | borat | 555 | NULL | NULL | NULL | ... | *
| 3 | 10 | 03/03/2009 | john | 300 | 1 | 10 | 04/03/2009 | ... |
| 4 | 11 | 03/03/2009 | juliet | 200 | 2 | 11 | 04/03/2009 | ... |
| 6 | 12 | 03/03/2009 | borat | 500 | 5 | 12 | 04/03/2009 | ... |
| 7 | 13 | 24/12/2008 | borat | 600 | 8 | 13 | 01/01/2009 | ... |
| 8 | 13 | 01/01/2009 | borat | 700 | NULL | NULL | NULL | ... | *
+------------------------------------------+--------------------------------+
Finally, the WHERE
clause keeps only the pairs that have NULL
s in the columns of b
(they are marked with *
in the table above); this means, due to the second condition from the JOIN
clause, the row selected from m
has the biggest value in column datetime
.
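One property of this self-join is worth checking before relying on it: because the join condition uses a strict <, two rows that share the maximum datetime for the same home both survive the WHERE filter. The sketch below demonstrates that with Python's sqlite3 module. The rows are hypothetical, not the question's data, and the table is trimmed to the columns the query needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE topten (id INTEGER, home INTEGER, datetime TEXT)")
conn.executemany(
    "INSERT INTO topten VALUES (?, ?, ?)",
    [
        (1, 10, "2009-03-04"),
        (3, 10, "2009-03-03"),
        (9, 10, "2009-03-04"),  # hypothetical tie with id 1 for home 10
        (7, 13, "2008-12-24"),
        (8, 13, "2009-01-01"),
    ],
)

# Same LEFT JOIN / IS NULL pattern as above: keep rows with no "bigger" partner.
rows = conn.execute("""
    SELECT m.id
    FROM topten m
    LEFT JOIN topten b
           ON m.home = b.home
          AND m.datetime < b.datetime
    WHERE b.datetime IS NULL
    ORDER BY m.id
""").fetchall()

ids = [r[0] for r in rows]
print(ids)
```

If exactly one row per home is required, a tie-breaker (for example on id) would have to be added to the join condition.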
Read the SQL Antipatterns: Avoiding the Pitfalls of Database Programming book for other SQL tips.
I am working on a query in Sql Server 2005 where I need to convert a value in DateTime
variable into a varchar
variable in yyyy-mm-dd
format (without time part). How do I do that?
With Microsoft Sql Server:
Here’s some test sql for all the styles.
DECLARE @now datetime
SET @now = GETDATE()
select convert(nvarchar(MAX), @now, 0) as output, 0 as style
union select convert(nvarchar(MAX), @now, 1), 1
union select convert(nvarchar(MAX), @now, 2), 2
union select convert(nvarchar(MAX), @now, 3), 3
union select convert(nvarchar(MAX), @now, 4), 4
union select convert(nvarchar(MAX), @now, 5), 5
union select convert(nvarchar(MAX), @now, 6), 6
union select convert(nvarchar(MAX), @now, 7), 7
union select convert(nvarchar(MAX), @now, 8), 8
union select convert(nvarchar(MAX), @now, 9), 9
union select convert(nvarchar(MAX), @now, 10), 10
union select convert(nvarchar(MAX), @now, 11), 11
union select convert(nvarchar(MAX), @now, 12), 12
union select convert(nvarchar(MAX), @now, 13), 13
union select convert(nvarchar(MAX), @now, 14), 14
--15 to 19 not valid
union select convert(nvarchar(MAX), @now, 20), 20
union select convert(nvarchar(MAX), @now, 21), 21
union select convert(nvarchar(MAX), @now, 22), 22
union select convert(nvarchar(MAX), @now, 23), 23
union select convert(nvarchar(MAX), @now, 24), 24
union select convert(nvarchar(MAX), @now, 25), 25
--26 to 99 not valid
union select convert(nvarchar(MAX), @now, 100), 100
union select convert(nvarchar(MAX), @now, 101), 101
union select convert(nvarchar(MAX), @now, 102), 102
union select convert(nvarchar(MAX), @now, 103), 103
union select convert(nvarchar(MAX), @now, 104), 104
union select convert(nvarchar(MAX), @now, 105), 105
union select convert(nvarchar(MAX), @now, 106), 106
union select convert(nvarchar(MAX), @now, 107), 107
union select convert(nvarchar(MAX), @now, 108), 108
union select convert(nvarchar(MAX), @now, 109), 109
union select convert(nvarchar(MAX), @now, 110), 110
union select convert(nvarchar(MAX), @now, 111), 111
union select convert(nvarchar(MAX), @now, 112), 112
union select convert(nvarchar(MAX), @now, 113), 113
union select convert(nvarchar(MAX), @now, 114), 114
union select convert(nvarchar(MAX), @now, 120), 120
union select convert(nvarchar(MAX), @now, 121), 121
--122 to 125 not valid
union select convert(nvarchar(MAX), @now, 126), 126
union select convert(nvarchar(MAX), @now, 127), 127
--128, 129 not valid
union select convert(nvarchar(MAX), @now, 130), 130
union select convert(nvarchar(MAX), @now, 131), 131
--132 not valid
order BY style
Here’s the result
output style
Apr 28 2014 9:31AM 0
04/28/14 1
14.04.28 2
28/04/14 3
28.04.14 4
28-04-14 5
28 Apr 14 6
Apr 28, 14 7
09:31:28 8
Apr 28 2014 9:31:28:580AM 9
04-28-14 10
14/04/28 11
140428 12
28 Apr 2014 09:31:28:580 13
09:31:28:580 14
2014-04-28 09:31:28 20
2014-04-28 09:31:28.580 21
04/28/14 9:31:28 AM 22
2014-04-28 23
09:31:28 24
2014-04-28 09:31:28.580 25
Apr 28 2014 9:31AM 100
04/28/2014 101
2014.04.28 102
28/04/2014 103
28.04.2014 104
28-04-2014 105
28 Apr 2014 106
Apr 28, 2014 107
09:31:28 108
Apr 28 2014 9:31:28:580AM 109
04-28-2014 110
2014/04/28 111
20140428 112
28 Apr 2014 09:31:28:580 113
09:31:28:580 114
2014-04-28 09:31:28 120
2014-04-28 09:31:28.580 121
2014-04-28T09:31:28.580 126
2014-04-28T09:31:28.580 127
28 جمادى الثانية 1435 9:31:28:580AM 130
28/06/1435 9:31:28:580AM 131
Make nvarchar(max)
shorter to trim the time. For example:
outputs:
Try the following:
For a full date time and not just date do:
See this page for convert styles:
http://msdn.microsoft.com/en-us/library/ms187928.aspx
OR
SQL Server CONVERT() Function
Which of these queries is the faster?
NOT EXISTS:
SELECT ProductID, ProductName
FROM Northwind..Products p
WHERE NOT EXISTS (
SELECT 1
FROM Northwind..[Order Details] od
WHERE p.ProductId = od.ProductId)
Or NOT IN:
SELECT ProductID, ProductName
FROM Northwind..Products p
WHERE p.ProductID NOT IN (
SELECT ProductID
FROM Northwind..[Order Details])
The query execution plan says they both do the same thing. If that is the case, which is the recommended form?
This is based on the NorthWind database.
[Edit]
Just found this helpful article: http://weblogs.sqlteam.com/mladenp/archive/2007/05/18/60210.aspx
I think I’ll stick with NOT EXISTS.
I always default to NOT EXISTS
.
The execution plans may be the same at the moment but if either column is altered in the future to allow NULL
s the NOT IN
version will need to do more work (even if no NULL
s are actually present in the data) and the semantics of NOT IN
if NULL
s are present are unlikely to be the ones you want anyway.
When neither Products.ProductID nor [Order Details].ProductID allows NULLs, the NOT IN will be treated identically to the following query.
SELECT ProductID,
ProductName
FROM Products p
WHERE NOT EXISTS (SELECT *
FROM [Order Details] od
WHERE p.ProductId = od.ProductId)
The exact plan may vary but for my example data I get the following.
A reasonably common misconception seems to be that correlated sub queries are always “bad” compared to joins. They certainly can be when they force a nested loops plan (sub query evaluated row by row) but this plan includes an anti semi join logical operator. Anti semi joins are not restricted to nested loops but can use hash or merge (as in this example) joins too.
/*Not valid syntax but better reflects the plan*/
SELECT p.ProductID,
p.ProductName
FROM Products p
LEFT ANTI SEMI JOIN [Order Details] od
ON p.ProductId = od.ProductId
If [Order Details].ProductID
is NULL
-able the query then becomes
SELECT ProductID,
ProductName
FROM Products p
WHERE NOT EXISTS (SELECT *
FROM [Order Details] od
WHERE p.ProductId = od.ProductId)
AND NOT EXISTS (SELECT *
FROM [Order Details]
WHERE ProductId IS NULL)
The reason for this is that the correct semantics, if [Order Details] contains any NULL ProductIds, is to return no results. To verify this, see the extra anti semi join and row count spool that are added to the plan.
If Products.ProductID
is also changed to become NULL
-able the query then becomes
SELECT ProductID,
ProductName
FROM Products p
WHERE NOT EXISTS (SELECT *
FROM [Order Details] od
WHERE p.ProductId = od.ProductId)
AND NOT EXISTS (SELECT *
FROM [Order Details]
WHERE ProductId IS NULL)
AND NOT EXISTS (SELECT *
FROM (SELECT TOP 1 *
FROM [Order Details]) S
WHERE p.ProductID IS NULL)
The reason for that one is because a NULL
Products.ProductId
should not be returned in the results except if the NOT IN
sub query were to return no results at all (i.e. the [Order Details]
table is empty). In which case it should. In the plan for my sample data this is implemented by adding another anti semi join as below.
The effect of this is shown in the blog post already linked by Buckley. In the example there the number of logical reads increases from around 400 to 500,000.
Additionally the fact that a single NULL
can reduce the row count to zero makes cardinality estimation very difficult. If SQL Server assumes that this will happen but in fact there were no NULL
rows in the data the rest of the execution plan may be catastrophically worse, if this is just part of a larger query, with inappropriate nested loops causing repeated execution of an expensive sub tree for example.
This is not the only possible execution plan for a NOT IN
on a NULL
-able column however. This article shows another one for a query against the AdventureWorks2008
database.
For the NOT IN
on a NOT NULL
column or the NOT EXISTS
against either a nullable or non nullable column it gives the following plan.
When the column changes to NULL
-able the NOT IN
plan now looks like
It adds an extra inner join operator to the plan. This apparatus is explained here. It is all there to convert the previous single correlated index seek on Sales.SalesOrderDetail.ProductID = <correlated_product_id>
to two seeks per outer row. The additional one is on WHERE Sales.SalesOrderDetail.ProductID IS NULL
.
As this is under an anti semi join if that one returns any rows the second seek will not occur. However if Sales.SalesOrderDetail
does not contain any NULL
ProductID
s it will double the number of seek operations required.
Also be aware that NOT IN is not equivalent to NOT EXISTS when it comes to null.
This post explains it very well
http://sqlinthewild.co.za/index.php/2010/02/18/not-exists-vs-not-in/
When the subquery returns even one null, NOT IN will not match any rows.
The reason for this can be found by looking at the details of what the NOT IN operation actually means.
Let’s say, for illustration purposes, that there are 4 rows in the table called t, and there’s a column called ID with values 1..4.
WHERE SomeValue NOT IN (SELECT AVal FROM t)
is equivalent to
WHERE SomeValue != (SELECT AVal FROM t WHERE ID=1) AND SomeValue != (SELECT AVal FROM t WHERE ID=2) AND SomeValue != (SELECT AVal FROM t WHERE ID=3) AND SomeValue != (SELECT AVal FROM t WHERE ID=4)
Let’s further say that AVal is NULL where ID = 4. Hence that != comparison returns UNKNOWN. The logical truth table for AND states that UNKNOWN and TRUE is UNKNOWN, UNKNOWN and FALSE is FALSE. There is no value that can be AND’d with UNKNOWN to produce the result TRUE.
Hence, if any row of that subquery returns NULL, the entire NOT IN operator will evaluate to either FALSE or NULL and no records will be returned.
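The difference is easy to observe. Below is a minimal sketch using Python's sqlite3 module instead of SQL Server. The table OrderDetails (without the space) and all rows are invented for the example. A single NULL in the subquery column empties the NOT IN result, while NOT EXISTS still returns the unordered products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (ProductID INTEGER, ProductName TEXT);
    CREATE TABLE OrderDetails (ProductID INTEGER);
    INSERT INTO Products VALUES (1, 'Chai'), (2, 'Chang'), (3, 'Tofu');
    INSERT INTO OrderDetails VALUES (1), (NULL);
""")

# NOT IN: every comparison against the NULL row is UNKNOWN, so nothing qualifies.
not_in = conn.execute("""
    SELECT p.ProductID
    FROM Products p
    WHERE p.ProductID NOT IN (SELECT ProductID FROM OrderDetails)
    ORDER BY p.ProductID
""").fetchall()

# NOT EXISTS: the NULL row never matches the correlation, so it is ignored.
not_exists = conn.execute("""
    SELECT p.ProductID
    FROM Products p
    WHERE NOT EXISTS (SELECT 1 FROM OrderDetails od
                      WHERE od.ProductID = p.ProductID)
    ORDER BY p.ProductID
""").fetchall()

print(not_in)
print(not_exists)
```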
If the execution planner says they’re the same, they’re the same. Use whichever one will make your intention more obvious – in this case, the second.
I need amount
to be amount
if report.type='P'
and -amount
if report.type='N'
. How do I add this to the above query?
See http://dev.mysql.com/doc/refman/5.0/en/control-flow-functions.html.
Additionally, you could handle when the condition is null. In the case of a null amount:
The part IFNULL(amount,0)
means when amount is not null return amount else return 0.
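As a portable illustration of the same idea, here is a sketch that uses a CASE expression together with IFNULL and runs it through Python's sqlite3 module. It does not use MySQL's IF() function. The report table layout and its rows are assumptions, since the original query is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE report (id INTEGER, type TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO report VALUES (?, ?, ?)",
    [(1, "P", 100), (2, "N", 40), (3, "N", None)],
)

# 'N' rows get a negated amount; a NULL amount is treated as 0 in both branches.
signed = conn.execute("""
    SELECT id,
           CASE WHEN type = 'N' THEN -IFNULL(amount, 0)
                ELSE IFNULL(amount, 0)
           END AS amount
    FROM report
    ORDER BY id
""").fetchall()

print(signed)
```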
Use a case
statement:
SELECT CompanyName,
CASE WHEN Country IN ('USA', 'Canada') THEN 'North America'
WHEN Country = 'Brazil' THEN 'South America'
ELSE 'Europe' END AS Continent
FROM Suppliers
ORDER BY CompanyName;
What’s the difference between INNER JOIN
, LEFT JOIN
, RIGHT JOIN
and FULL JOIN
in MySQL?
Reading this original article on The Code Project will help you a lot: Visual Representation of SQL Joins.
Also check this post: SQL SERVER – Better Performance – LEFT JOIN or NOT IN?.
Find original one at: Difference between JOIN and OUTER JOIN in MySQL.
INNER JOIN gets all records that are common between both tables based on the supplied ON clause.
LEFT JOIN gets all records from the LEFT linked table, but if you have selected some columns from the RIGHT table and there are no related records, these columns will contain NULL.
RIGHT JOIN is like the above but gets all records in the RIGHT table.
FULL JOIN gets all records from both tables and puts NULL in the columns where related records do not exist in the opposite table.
An SQL JOIN clause is used to combine rows from two or more tables, based on a common field between them.
There are different types of joins available in SQL:
INNER JOIN: returns rows when there is a match in both tables.
LEFT JOIN: returns all rows from the left table, even if there are no matches in the right table.
RIGHT JOIN: returns all rows from the right table, even if there are no matches in the left table.
FULL JOIN: It combines the results of both left and right outer joins.
The joined table will contain all records from both the tables and fill in NULLs for missing matches on either side.
SELF JOIN: is used to join a table to itself as if the table were two tables, temporarily renaming at least one table in the SQL statement.
CARTESIAN JOIN: returns the Cartesian product of the sets of records from the two or more joined tables.
We can look at the first four joins in detail:
We have two tables with the following values.
TableA
id firstName lastName
.......................................
1 arun prasanth
2 ann antony
3 sruthy abc
6 new abc
TableB
…………………………………………………………..
INNER JOIN
Note: it gives the intersection of the two tables, i.e. the rows that TableA and TableB have in common.
Syntax
SELECT table1.column1, table2.column2...
FROM table1
INNER JOIN table2
ON table1.common_field = table2.common_field;
Apply it in our sample table :
SELECT TableA.firstName,TableA.lastName,TableB.age,TableB.Place
FROM TableA
INNER JOIN TableB
ON TableA.id = TableB.id2;
Result Will Be
firstName lastName age Place
..............................................
arun prasanth 24 kerala
ann antony 24 usa
sruthy abc 25 ekm
LEFT JOIN
Note: it will give all selected rows in TableA, plus any common selected rows in TableB.
Syntax
SELECT table1.column1, table2.column2...
FROM table1
LEFT JOIN table2
ON table1.common_field = table2.common_field;
Apply it in our sample table :
SELECT TableA.firstName,TableA.lastName,TableB.age,TableB.Place
FROM TableA
LEFT JOIN TableB
ON TableA.id = TableB.id2;
Result
firstName lastName age Place
...............................................................................
arun prasanth 24 kerala
ann antony 24 usa
sruthy abc 25 ekm
new abc NULL NULL
RIGHT JOIN
Note: it will give all selected rows in TableB, plus any common selected rows in TableA.
Syntax
SELECT table1.column1, table2.column2...
FROM table1
RIGHT JOIN table2
ON table1.common_field = table2.common_field;
Apply it in our sample table :
SELECT TableA.firstName,TableA.lastName,TableB.age,TableB.Place
FROM TableA
RIGHT JOIN TableB
ON TableA.id = TableB.id2;
Result
firstName lastName age Place
...............................................................................
arun prasanth 24 kerala
ann antony 24 usa
sruthy abc 25 ekm
NULL NULL 24 chennai
FULL JOIN
Note: it will return all selected values from both tables.
Syntax
SELECT table1.column1, table2.column2...
FROM table1
FULL JOIN table2
ON table1.common_field = table2.common_field;
Apply it in our sample table :
SELECT TableA.firstName,TableA.lastName,TableB.age,TableB.Place
FROM TableA
FULL JOIN TableB
ON TableA.id = TableB.id2;
Result
firstName lastName age Place
...............................................................................
arun prasanth 24 kerala
ann antony 24 usa
sruthy abc 25 ekm
new abc NULL NULL
NULL NULL 24 chennai
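The four results above can be reproduced with a short script. This is a sketch using Python's sqlite3 module. The TableB rows are reconstructed from the result listings, and the id2 value 5 for the chennai row is an assumption. RIGHT JOIN is emulated by swapping the tables in a LEFT JOIN, and FULL JOIN by a UNION of the two, which is also the usual workaround in MySQL, since MySQL has no FULL JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (id INTEGER, firstName TEXT, lastName TEXT);
    CREATE TABLE TableB (id2 INTEGER, age INTEGER, Place TEXT);
    INSERT INTO TableA VALUES (1, 'arun', 'prasanth'), (2, 'ann', 'antony'),
                              (3, 'sruthy', 'abc'), (6, 'new', 'abc');
    -- id2 = 5 is a made-up key for the unmatched row
    INSERT INTO TableB VALUES (1, 24, 'kerala'), (2, 24, 'usa'),
                              (3, 25, 'ekm'), (5, 24, 'chennai');
""")

cols = "a.firstName, a.lastName, b.age, b.Place"
left_sql = f"SELECT {cols} FROM TableA a LEFT JOIN TableB b ON a.id = b.id2"
# RIGHT JOIN emulated: same join with the table order swapped.
right_sql = f"SELECT {cols} FROM TableB b LEFT JOIN TableA a ON a.id = b.id2"

inner = conn.execute(
    f"SELECT {cols} FROM TableA a INNER JOIN TableB b ON a.id = b.id2 ORDER BY a.id"
).fetchall()
left = conn.execute(left_sql + " ORDER BY a.id").fetchall()
right = conn.execute(right_sql + " ORDER BY b.id2").fetchall()
# FULL JOIN emulated: UNION removes the rows both halves share.
full = conn.execute(left_sql + " UNION " + right_sql).fetchall()

print(inner)
print(left)
print(right)
print(full)
```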
Interesting Fact
For INNER joins the order doesn’t matter.
For (LEFT, RIGHT or FULL) OUTER joins, the order matters.
It is worth checking this link, which gives interesting details about join order.
Is there a way to make an Oracle
query behave like it contains a MySQL limit
clause?
In MySQL
, I can do this:
to get the 21st to the 30th rows (skip the first 20, give the next 10). The rows are selected after the order by
, so it really starts on the 20th name alphabetically.
In Oracle
, the only thing people mention is the rownum
pseudo-column, but it is evaluated before order by
, which means this:
will return a random set of ten rows ordered by name, which is not usually what I want. It also doesn’t allow for specifying an offset.
Starting from Oracle 12c R1 (12.1), there is a row limiting clause. It does not use familiar LIMIT
syntax, but it can do the job better with more options. You can find the full syntax here.
To answer the original question, here’s the query:
(For earlier Oracle versions, please refer to other answers in this question)
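A minimal sketch of such a query, assuming Oracle 12.1 or later and a hypothetical table sometable with a name column (both names are placeholders for illustration):

```sql
SELECT *
FROM sometable
ORDER BY name
OFFSET 20 ROWS FETCH NEXT 10 ROWS ONLY;  -- skip 20 rows, return rows 21 to 30
```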
Following examples were quoted from linked page, in the hope of preventing link rot.
CREATE TABLE rownum_order_test (
val NUMBER
);
INSERT ALL
INTO rownum_order_test
INTO rownum_order_test -- second INTO: each generated value is inserted twice
SELECT level
FROM dual
CONNECT BY level <= 10;
COMMIT;
SELECT val
FROM rownum_order_test
ORDER BY val;
VAL
----------
1
1
2
2
3
3
4
4
5
5
6
6
7
7
8
8
9
9
10
10
20 rows selected.
First N rows
SELECT val
FROM rownum_order_test
ORDER BY val DESC
FETCH FIRST 5 ROWS ONLY;
VAL
----------
10
10
9
9
8
5 rows selected.
First N rows; if the Nth row has ties, get all the tied rows
SELECT val
FROM rownum_order_test
ORDER BY val DESC
FETCH FIRST 5 ROWS WITH TIES;
VAL
----------
10
10
9
9
8
8
6 rows selected.
First x% of rows
SELECT val
FROM rownum_order_test
ORDER BY val
FETCH FIRST 20 PERCENT ROWS ONLY;
VAL
----------
1
1
2
2
4 rows selected.
SELECT val
FROM rownum_order_test
ORDER BY val
OFFSET 4 ROWS FETCH NEXT 4 ROWS ONLY;
VAL
----------
3
3
4
4
4 rows selected.
You can use a subquery for this like
Also have a look at the topic On ROWNUM and limiting results at Oracle/AskTom for more information.
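As a rough sketch of the subquery approach (sometable and name are hypothetical placeholders): the inner query does the sorting, and ROWNUM is applied afterwards in the outer query, so the limit respects the sort order.

```sql
SELECT *
FROM (
    SELECT *
    FROM sometable
    ORDER BY name
)
WHERE ROWNUM <= 10;  -- first ten rows of the sorted result
```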
Update: To limit the result with both lower and upper bounds things get a bit more bloated with
select * from
( select a.*, ROWNUM rnum from
( <your_query_goes_here, with order by> ) a
where ROWNUM <= :MAX_ROW_TO_FETCH )
where rnum >= :MIN_ROW_TO_FETCH;
(Copied from specified AskTom-article)
Update 2: Starting with Oracle 12c (12.1) there is a syntax available to limit rows or start at offsets.
See this answer for more examples. Thanks to Krumia for the hint.
I did some performance testing for the following approaches:
select * from (
select a.*, ROWNUM rnum from (
<select statement with order by clause>
) a where rownum <= MAX_ROW
) where rnum >= MIN_ROW
select * from (
<select statement with order by clause, including rownum as rn>
) where rn >= MIN_ROW and rn <= MAX_ROW
The table had 10 million records; the sort was on an unindexed datetime column:
Selecting first 10 rows took:
Selecting rows between 100,000 and 100,010:
Selecting rows between 9,000,000 and 9,000,010:
What is the difference between UNION
and UNION ALL
?
UNION
removes duplicate records (where all columns in the results are the same), UNION ALL
does not.
There is a performance hit when using UNION
instead of UNION ALL
, since the database server must do additional work to remove the duplicate rows, but usually you do not want the duplicates (especially when developing reports).
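A minimal illustration of the difference, using literal rows so it needs no tables (this form of SELECT without FROM works in SQL Server, MySQL, PostgreSQL and SQLite; Oracle would need FROM dual):

```sql
-- UNION removes duplicate rows from the combined result.
SELECT 'foo' AS bar
UNION
SELECT 'foo' AS bar;
-- returns one row: foo

-- UNION ALL keeps every row from both queries.
SELECT 'foo' AS bar
UNION ALL
SELECT 'foo' AS bar;
-- returns two rows: foo, foo
```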
Both UNION and UNION ALL concatenate the results of two different queries. They differ in the way they handle duplicates.
UNION performs a DISTINCT on the result set, eliminating any duplicate rows.
UNION ALL does not remove duplicates, and is therefore faster than UNION.
Note: When using these commands, both queries must select the same number of columns, and corresponding columns must have compatible data types.
Example: If we have two tables, 1) Employee and 2) Customer
UNION
removes duplicates, whereas UNION ALL
does not.
In order to remove duplicates the result set must be sorted, and this may have an impact on the performance of the UNION, depending on the volume of data being sorted and the settings of various RDBMS parameters (for Oracle, PGA_AGGREGATE_TARGET with WORKAREA_SIZE_POLICY=AUTO, or SORT_AREA_SIZE and SORT_AREA_RETAINED_SIZE if WORKAREA_SIZE_POLICY=MANUAL).
Basically, the sort is faster if it can be carried out in memory, but the same caveat about the volume of data applies.
Of course, if you need data returned without duplicates then you must use UNION, depending on the source of your data.
I would have commented on the first post to qualify the “is much less performant” comment, but have insufficient reputation (points) to do so.
I want to search a text from all my database stored procedures. I use the below SQL:
SELECT DISTINCT
o.name AS Object_Name,
o.type_desc
FROM sys.sql_modules m
INNER JOIN
sys.objects o
ON m.object_id = o.object_id
WHERE m.definition Like '%[ABD]%';
I want to search for [ABD]
in all stored procedures including square brackets, but it’s not giving the proper result. How can I change my query to achieve this?
Escape the square brackets:
Then the square brackets will be treated as string literals, not as wildcards.
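Two common ways to do that in T-SQL are sketched below against the query from the question. In a LIKE pattern, [ABD] is a character class matching a single A, B or D, so the opening bracket has to be neutralised:

```sql
-- Option 1: put the opening bracket in its own character class.
SELECT DISTINCT o.name AS Object_Name, o.type_desc
FROM sys.sql_modules m
INNER JOIN sys.objects o
    ON m.object_id = o.object_id
WHERE m.definition LIKE '%[[]ABD]%';

-- Option 2: declare an escape character with ESCAPE.
SELECT DISTINCT o.name AS Object_Name, o.type_desc
FROM sys.sql_modules m
INNER JOIN sys.objects o
    ON m.object_id = o.object_id
WHERE m.definition LIKE '%\[ABD\]%' ESCAPE '\';
```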
Try this request:
Query
Have you tried using some of the third party tools to do the search? There are several available out there that are free and that saved me a ton of time in the past.
Below are two SSMS Addins I used with good success.
ApexSQL Search – Searches both schema and data in databases and has additional features such as dependency tracking and more…
SSMS Tools pack – Has same search functionality as previous one and several other cool features. Not free for SQL Server 2012 but still very affordable.
I know this answer is not 100% related to the question (which was more specific) but hopefully others will find this useful.
I want to delete using INNER JOIN
in SQL Server 2008.
But I get this error:
Msg 156, Level 15, State 1, Line 15
Incorrect syntax near the keyword ‘INNER’.
My code:
You need to specify what table you are deleting from, here is a version with an alias:
Just add the name of the table you want to delete records from between DELETE and FROM, because we have to specify the table to delete from. Also remove the ORDER BY clause, because there is nothing to order while deleting records.
So your final query should be like this:
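A sketch of what such a statement could look like in T-SQL, reusing the table and column names that appear in the alternatives below (dbo.WorkRecord2, dbo.Employee, EmployeeRun, EmployeeNo). The Company filter is a hypothetical placeholder for your own condition:

```sql
-- The alias after DELETE names the table whose rows are removed.
DELETE w
FROM dbo.WorkRecord2 AS w
INNER JOIN dbo.Employee AS e
    ON w.EmployeeRun = e.EmployeeNo
WHERE e.Company = '1';  -- hypothetical filter
```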
This may be helpful for you:
DELETE FROM dbo.WorkRecord2
WHERE EmployeeRun IN (
SELECT e.EmployeeNo
FROM dbo.Employee e
WHERE ...
)
Or try this -
DELETE FROM dbo.WorkRecord2
WHERE EXISTS(
SELECT 1
FROM dbo.Employee e
WHERE EmployeeRun = e.EmployeeNo
AND ....
)
I’m trying to join 3 tables in a view; here is the situation:
I have a table that contains information on students who are applying to live on this College Campus. I have another table that lists the Hall Preferences (3 of them) for each Student. But each of these preferences is merely an ID Number, and the ID Number has a corresponding Hall Name in a third table (did not design this database…).
Pretty much, I have INNER JOIN
on the table with their preferences, and their information, the result is something like…
Where 005
would be the HallID
. So Now I want to match that HallID
to a third table, where this table contains a HallID
and HallName
.
So pretty much, I want my result to be like…
Here is what I currently have:
You can do the following (I guessed on table fields,etc)
SELECT s.studentname
, s.studentid
, s.studentdesc
, h.hallname
FROM students s
INNER JOIN hallprefs hp
on s.studentid = hp.studentid
INNER JOIN halls h
on hp.hallid = h.hallid
Based on your request for multiple halls you could do it this way. You just join on your Hall table multiple times for each room pref id:
SELECT s.StudentID
, s.FName
, s.LName
, s.Gender
, s.BirthDate
, s.Email
, r.HallPref1
, h1.hallName as Pref1HallName
, r.HallPref2
, h2.hallName as Pref2HallName
, r.HallPref3
, h3.hallName as Pref3HallName
FROM dbo.StudentSignUp AS s
INNER JOIN RoomSignUp.dbo.Incoming_Applications_Current AS r
ON s.StudentID = r.StudentID
INNER JOIN HallData.dbo.Halls AS h1
ON r.HallPref1 = h1.HallID
INNER JOIN HallData.dbo.Halls AS h2
ON r.HallPref2 = h2.HallID
INNER JOIN HallData.dbo.Halls AS h3
ON r.HallPref3 = h3.HallID
If you have 3 tables with the same ID
to be joined, I think it would be like this:
Just replace *
with what you want to get from the tables.
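A sketch of that shape, assuming three hypothetical tables table1, table2 and table3 that all share an ID column:

```sql
SELECT *
FROM table1 a
JOIN table2 b ON a.ID = b.ID
JOIN table3 c ON a.ID = c.ID;
```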
How can I reset the AUTO_INCREMENT
of a field? I want it to start counting from 1
again.
You can reset the counter with:
For InnoDB you cannot set the auto_increment
value lower or equal to the highest current index. (quote from ViralPatel):
Note that you cannot reset the counter to a value less than or equal to any that have already been used. For MyISAM, if the value is less than or equal to the maximum value currently in the AUTO_INCREMENT column, the value is reset to the current maximum plus one. For InnoDB, if the value is less than the current maximum value in the column, no error occurs and the current sequence value is not changed.
See How to Reset an MySQL AutoIncrement using a MAX value from another table? on how to dynamically get an acceptable value.
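One way to compute an acceptable value dynamically in MySQL is sketched below. It assumes a hypothetical table your_table with an id column. ALTER TABLE does not accept a user variable for AUTO_INCREMENT, so the statement is built as a string and run as a prepared statement:

```sql
-- Next free value: current maximum plus one (1 if the table is empty).
SELECT COALESCE(MAX(id), 0) + 1 INTO @next_id FROM your_table;

-- Build and execute the ALTER TABLE dynamically.
SET @sql = CONCAT('ALTER TABLE your_table AUTO_INCREMENT = ', @next_id);
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
```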
SET @num := 0;
UPDATE your_table SET id = @num := (@num+1);
ALTER TABLE your_table AUTO_INCREMENT =1;
I think this will do it
I am using Sybase and I am doing a select which returns me a column called “iftype”, but its type is int and I need to convert into varchar. When I try to do the select without the convert function I get this error:
Error code 257, SQL state 37000: Implicit conversion from datatype ‘VARCHAR’ to ‘INT’ is not allowed. Use the CONVERT function to run this query.
I don’t know how to use the CONVERT function. Can anyone help me, please?
Use the convert function.
Use the STR
function:
float_expression
Is an expression of approximate numeric (float) data type with a decimal point.
length
Is the total length. This includes decimal point, sign, digits, and spaces. The default is 10.
decimal
Is the number of places to the right of the decimal point. decimal must be less than or equal to 16. If decimal is more than 16 then the result is truncated to sixteen places to the right of the decimal point.
source: https://msdn.microsoft.com/en-us/library/ms189527.aspx
You can use CAST
function:
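As a sketch of both conversions for the iftype column from the question (the table name some_table and the length 10 are assumptions for illustration):

```sql
-- CONVERT(target_type, expression)
SELECT CONVERT(varchar(10), iftype) AS iftype_text
FROM some_table;

-- CAST(expression AS target_type)
SELECT CAST(iftype AS varchar(10)) AS iftype_text
FROM some_table;
```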
I have a MS SQL CTE query from which I want to create a temporary table. I am not sure how to do it as it gives an Invalid Object name
error.
Below is the whole query for reference
SELECT * INTO TEMPBLOCKEDDATES FROM
;with Calendar as (
select EventID, EventTitle, EventStartDate, EventEndDate, EventEnumDays,EventStartTime,EventEndTime, EventRecurring, EventStartDate as PlannedDate
,EventType from EventCalender
where EventActive = 1 AND LanguageID =1 AND EventBlockDate = 1
union all
select EventID, EventTitle, EventStartDate, EventEndDate, EventEnumDays,EventStartTime,EventEndTime, EventRecurring, dateadd(dd, 1, PlannedDate)
,EventType from Calendar
where EventRecurring = 1
and dateadd(dd, 1, PlannedDate) <= EventEndDate
)
select EventID, EventStartDate, EventEndDate, PlannedDate as [EventDates], Cast(PlannedDate As datetime) AS DT, Cast(EventStartTime As time) AS ST,Cast(EventEndTime As time) AS ET, EventTitle
,EventType from Calendar
where (PlannedDate >= GETDATE()) AND ',' + EventEnumDays + ',' like '%,' + cast(datepart(dw, PlannedDate) as char(1)) + ',%'
or EventEnumDays is null
order by EventID, PlannedDate
option (maxrecursion 0)
I would appreciate a point in the right direction or if I can create a temporary table from this CTE query
create table #Temp
(
EventID int,
EventTitle Varchar(50),
EventStartDate DateTime,
EventEndDate DateTime,
EventEnumDays int,
EventStartTime Datetime,
EventEndTime DateTime,
EventRecurring Bit,
EventType int
)
;WITH Calendar
AS (SELECT /*...*/)
Insert Into #Temp
Select EventID, EventStartDate, EventEndDate, PlannedDate as [EventDates], Cast(PlannedDate As datetime) AS DT, Cast(EventStartTime As time) AS ST,Cast(EventEndTime As time) AS ET, EventTitle
,EventType from Calendar
where (PlannedDate >= GETDATE()) AND ',' + EventEnumDays + ',' like '%,' + cast(datepart(dw, PlannedDate) as char(1)) + ',%'
or EventEnumDays is null
Make sure that the table is deleted after use
Really the format can be quite simple - sometimes there’s no need to predefine a temp table - it will be created from results of the select.
So unless you want different types or are very strict on definition, keep things simple. Note also that any temporary table created inside a stored procedure is automatically dropped when the stored procedure finishes executing. If stored procedure A creates a temp table and calls stored procedure B, then B will be able to use the temporary table that A created.
However, it’s generally considered good coding practice to explicitly drop every temporary table you create anyway.
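A minimal sketch of that pattern in T-SQL, using a few columns of the EventCalender table from the question; the temp table name #BlockedDates is a hypothetical choice:

```sql
-- SELECT ... INTO creates #BlockedDates from the shape of the result set.
SELECT EventID, EventTitle, EventStartDate
INTO #BlockedDates
FROM EventCalender
WHERE EventActive = 1;

-- ... work with #BlockedDates here ...

-- Explicit clean-up once the table is no longer needed.
IF OBJECT_ID('tempdb..#BlockedDates') IS NOT NULL
    DROP TABLE #BlockedDates;
```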
The SELECT ... INTO
needs to be in the select from the CTE.
;WITH Calendar
AS (SELECT /*... Rest of CTE definition removed for clarity*/)
SELECT EventID,
EventStartDate,
EventEndDate,
PlannedDate AS [EventDates],
Cast(PlannedDate AS DATETIME) AS DT,
Cast(EventStartTime AS TIME) AS ST,
Cast(EventEndTime AS TIME) AS ET,
EventTitle,
EventType
INTO TEMPBLOCKEDDATES /* <---- INTO goes here*/
FROM Calendar
WHERE ( PlannedDate >= Getdate() )
AND ',' + EventEnumDays + ',' LIKE '%,' + Cast(Datepart(dw, PlannedDate) AS CHAR(1)) + ',%'
OR EventEnumDays IS NULL
ORDER BY EventID,
PlannedDate
OPTION (maxrecursion 0)
I am using SQL Server 2008 and Navicat. I need to rename a column in a table using SQL.
This statement doesn’t work.
Use sp_rename
See: SQL SERVER – How to Rename a Column Name or Table Name
Documentation: sp_rename (Transact-SQL)
For your case it would be:
Remember to use single quotes to enclose your values.
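The general shape of the call is sketched here with hypothetical names (table dbo.table_name, column old_name renamed to new_name). The first argument is the qualified current name, the second is only the new column name, and the third is the object type:

```sql
EXEC sp_rename 'dbo.table_name.old_name', 'new_name', 'COLUMN';
```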
As an alternative to SQL, you can do this in Microsoft SQL Server Management Studio. Here are a few quick ways using the GUI:
Slow double-click on the column. The column name will become an editable text box.
Right click on column and choose Rename from the context menu.
For example:
This way is preferable when you need to rename multiple columns in one go.
For example:
NOTE: I know OP specifically asked for SQL solution, thought this might help others :)
Try:
I am working on an application which can deal with multiple database servers like “MySQL” and “MS SQL Server”.
I want to get the table names of a particular database using a general query which should be suitable for all database types. I have tried the following:
But it returns the table names of all databases on a particular server, whereas I want the table names of the selected database only. How can I restrict this query to get the tables of a particular database?
Probably due to the way different sql dbms deal with schemas.
Try the following
For SQL Server:
SELECT TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE' AND TABLE_CATALOG='dbName'
For MySQL:
SELECT TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE' AND TABLE_SCHEMA='dbName'
For Oracle I think the equivalent would be to use DBA_TABLES.
Stolen from here:
The following query will select all of the Tables
in the database named DBName
:
I have this table for documents (simplified version here):
+------+-------+--------------------------------------+
| id | rev | content |
+------+-------+--------------------------------------+
| 1 | 1 | ... |
| 2 | 1 | ... |
| 1 | 2 | ... |
| 1 | 3 | ... |
+------+-------+--------------------------------------+
How do I select one row per id and only the greatest rev?
With the above data, the result should contain two rows: [1, 3, ...]
and [2, 1, ..]
. I’m using MySQL.
Currently I use checks in the while
loop to detect and over-write old revs from the resultset. But is this the only method to achieve the result? Isn’t there a SQL solution?
Update
As the answers suggest, there is a SQL solution, and here a sqlfiddle demo.
Update 2
I noticed after adding the above sqlfiddle, the rate at which the question is upvoted has surpassed the upvote rate of the answers. That has not been the intention! The fiddle is based on the answers, especially the accepted answer.
All you need is a GROUP BY
clause with the MAX
aggregate function:
I just noticed you need the content
column as well.
This is a very common question in SQL: find the whole data for the row with some max value in a column per some group identifier. I heard that a lot during my career. Actually, it was one of the questions I answered in my current job’s technical interview.
It is, actually, so common that the StackOverflow community has created a single tag just to deal with questions like that: greatest-n-per-group.
Basically, you have two approaches to solve that problem:
group-identifier, max-value-in-group
Sub-query
In this approach, you first find the group-identifier, max-value-in-group
(already solved above) in a sub-query. Then you join your table to the sub-query with equality on both group-identifier
and max-value-in-group
:
SELECT a.id, a.rev, a.contents
FROM YourTable a
INNER JOIN (
SELECT id, MAX(rev) rev
FROM YourTable
GROUP BY id
) b ON a.id = b.id AND a.rev = b.rev
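In this approach, you left join the table with itself. Equality, of course, goes in the group-identifier. Then, 2 smart moves:
The second join condition requires the left-side value to be less than the right-side value.
With that condition, the rows that actually hold the max value get NULL on the right side (it’s a LEFT JOIN, remember?). Then, we filter the joined result, showing only the rows where the right side is NULL.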
So you end up with:
SELECT a.*
FROM YourTable a
LEFT OUTER JOIN YourTable b
ON a.id = b.id AND a.rev < b.rev
WHERE b.id IS NULL;
Both approaches bring the exact same result.
If you have two rows with max-value-in-group
for group-identifier
, both rows will be in the result in both approaches.
Both approaches are SQL ANSI compatible, thus, will work with your favorite RDBMS, regardless of its “flavor”.
Both approaches are also performance friendly; however, your mileage may vary (RDBMS, DB structure, indexes, etc.). So when you pick one approach over the other, benchmark. And make sure you pick the one that makes the most sense to you.
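To try both approaches side by side, here is a small self-contained sketch built on the sample data from the question. The table name YourTable follows the queries above; the content strings are made up for illustration:

```sql
CREATE TABLE YourTable (
    id      INT NOT NULL,
    rev     INT NOT NULL,
    content VARCHAR(100),
    PRIMARY KEY (id, rev)
);

INSERT INTO YourTable (id, rev, content) VALUES
    (1, 1, 'first draft'),
    (2, 1, 'only revision'),
    (1, 2, 'second draft'),
    (1, 3, 'third draft');

-- Approach 1: join against the per-group maximum.
SELECT a.id, a.rev, a.content
FROM YourTable a
INNER JOIN (
    SELECT id, MAX(rev) AS rev
    FROM YourTable
    GROUP BY id
) b ON a.id = b.id AND a.rev = b.rev
ORDER BY a.id;

-- Approach 2: self left join, keep rows with no higher revision.
SELECT a.id, a.rev, a.content
FROM YourTable a
LEFT OUTER JOIN YourTable b
    ON a.id = b.id AND a.rev < b.rev
WHERE b.id IS NULL
ORDER BY a.id;

-- Both queries return (1, 3, 'third draft') and (2, 1, 'only revision').
```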
My preference is to use as little code as possible…
You can do it using IN
try this:
to my mind it is less complicated… easier to read and maintain.
Yet another solution is to use a correlated subquery:
select yt.id, yt.rev, yt.contents
from YourTable yt
where rev =
(select max(rev) from YourTable st where yt.id=st.id)
Having an index on (id,rev) renders the subquery almost as a simple lookup…
Following are comparisons to the solutions in @AdrianCarneiro’s answer (subquery, leftjoin), based on MySQL measurements with InnoDB table of ~1million records, group size being: 1-3.
While for full table scans subquery/leftjoin/correlated timings relate to each other as 6/8/9, when it comes to direct lookups or batch (id in (1,2,3)), subquery is much slower than the others (due to rerunning the subquery). However, I couldn’t differentiate between the leftjoin and correlated solutions in speed.
One final note, as leftjoin creates n*(n+1)/2 joins in groups, its performance can be heavily affected by the size of groups…
I have seen SQL
that uses both !=
and <>
for not equal. What is the preferred syntax and why?
I like !=
, because <>
reminds me of Visual Basic
.
Technically they function the same if you’re using SQL Server AKA T-SQL. If you’re using it in stored procedures there is no performance reason to use one over the other. It then comes down to personal preference. I prefer to use <> as it is ANSI compliant.
You can find links to the various ANSI standards at…
Most databases support != (popular programming languages) and <> (ANSI).
Databases that support both != and <>:
Databases that support the ANSI standard operator, exclusively:
'<>' is from the SQL-92 standard and '!=' is a proprietary T-SQL operator. It’s available in other databases as well, but since it isn’t standard you have to take it on a case-by-case basis.
In most cases, you’ll know what database you’re connecting to so this isn’t really an issue. At worst you might have to do a search and replace in your SQL.
I’m trying to insert
some text data into a table in SQL Server
9.
The text includes a single quote(’).
How do I escape that?
I tried using two single quotes, but it threw me some errors.
eg. insert into my_table values('hi, my name''s tim.');
Single quotes are escaped by doubling them up, just as you’ve shown us in your example. The following SQL illustrates this functionality. I tested it on SQL Server 2008:
DECLARE @my_table TABLE (
[value] VARCHAR(200)
)
INSERT INTO @my_table VALUES ('hi, my name''s tim.')
SELECT * FROM @my_table
If escaping your single quote with another single quote isn’t working for you (like it didn’t for one of my recent REPLACE()
queries), you can use SET QUOTED_IDENTIFIER OFF
before your query, then SET QUOTED_IDENTIFIER ON
after your query.
For example
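A sketch of how that could look, reusing my_table from the question. With QUOTED_IDENTIFIER off, SQL Server treats double-quoted text as a string literal, so the embedded single quote needs no doubling:

```sql
SET QUOTED_IDENTIFIER OFF;

-- Double quotes now delimit a string, not an identifier.
INSERT INTO my_table VALUES ("hi, my name's tim.");

SET QUOTED_IDENTIFIER ON;
```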
How about:
I’m trying:
A.Date
looks like: 2010-03-04 00:00:00.000
However, this is not working.
Can anyone provide a reference for why?
In your query, 2010-4-01 is treated as a mathematical expression, so in essence it reads as a comparison with the number 2005 (2010 minus 4 minus 1 is 2005). Converting it to a proper datetime and using single quotes will fix this issue.
Technically, the parser might allow you to get away with
it will do the conversion for you, but in my opinion it is less readable than explicitly converting to a DateTime
for the maintenance programmer that will come after you.
Try enclosing your date into a character string.
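A sketch of the corrected filter, using the table and column names that appear in the examples below (dbo.March2010, A.Date). The date is written in the unseparated YYYYMMDD form, which SQL Server interprets the same way regardless of language or DATEFORMAT settings:

```sql
SELECT *
FROM dbo.March2010 A
WHERE A.Date >= CONVERT(datetime, '20100401');
```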
We can use like below as well
SELECT *
FROM dbo.March2010 A
WHERE CAST(A.Date AS Date) >= '2017-03-22';
SELECT *
FROM dbo.March2010 A
WHERE CAST(A.Date AS Datetime) >= '2017-03-22 06:49:53.840';
I understand the point of GROUP BY x
But how does GROUP BY x, y
work, and what does it mean?
Group By X
means put all those with the same value for X in the one group.
Group By X, Y
means put all those with the same values for both X and Y in the one group.
To illustrate using an example, let’s say we have the following table, to do with who is attending what subject at a university:
Table: Subject_Selection
Subject Semester Attendee
---------------------------------
ITB001 1 John
ITB001 1 Bob
ITB001 1 Mickey
ITB001 2 Jenny
ITB001 2 James
MKB114 1 John
MKB114 1 Erica
When you use a group by
on the subject column only; say:
You will get something like:
…because there are 5 entries for ITB001, and 2 for MKB114
If we were to group by
two columns:
we would get this:
This is because, when we group by two columns, it is saying “Group them so that all of those with the same Subject and Semester are in the same group, and then calculate all the aggregate functions (Count, Sum, Average, etc.) for each of those groups”. In this example, this is demonstrated by the fact that, when we count them, there are three people doing ITB001 in semester 1, and two doing it in semester 2. Both of the people doing MKB114 are in semester 1, so there is no row for semester 2 (no data fits into the group “MKB114, Semester 2”)
Hopefully that makes sense.
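The example can be reproduced with the sketch below. The table and data come from the description above; the column types and the alias Attendees are assumptions for illustration:

```sql
CREATE TABLE Subject_Selection (
    Subject  VARCHAR(10),
    Semester INT,
    Attendee VARCHAR(20)
);

INSERT INTO Subject_Selection (Subject, Semester, Attendee) VALUES
    ('ITB001', 1, 'John'),
    ('ITB001', 1, 'Bob'),
    ('ITB001', 1, 'Mickey'),
    ('ITB001', 2, 'Jenny'),
    ('ITB001', 2, 'James'),
    ('MKB114', 1, 'John'),
    ('MKB114', 1, 'Erica');

-- One group per subject.
SELECT Subject, COUNT(*) AS Attendees
FROM Subject_Selection
GROUP BY Subject
ORDER BY Subject;
-- ITB001 | 5
-- MKB114 | 2

-- One group per (subject, semester) combination.
SELECT Subject, Semester, COUNT(*) AS Attendees
FROM Subject_Selection
GROUP BY Subject, Semester
ORDER BY Subject, Semester;
-- ITB001 | 1 | 3
-- ITB001 | 2 | 2
-- MKB114 | 1 | 2
```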
The GROUP BY
clause is used in conjunction with the aggregate functions to group the result-set by one or more columns. e.g.:
SELECT column_name, aggregate_function(column_name)
FROM table_name
WHERE column_name operator value
GROUP BY column_name;
Remember this order:
- SELECT (is used to select data from a database)
- FROM (clause is used to list the tables)
- WHERE (clause is used to filter records)
- GROUP BY (clause can be used in a SELECT statement to collect data across multiple records and group the results by one or more columns)
- HAVING (clause is used in combination with the GROUP BY clause to restrict the groups of returned rows to only those for which the condition is TRUE)
- ORDER BY (keyword is used to sort the result-set)
You can use all of these together when you are using aggregate functions, and they must appear in this order; otherwise you can get an error.
Aggregate Functions are:
MIN returns the smallest value in a given column
SUM returns the sum of the numeric values in a given column
AVG returns the average value of a given column
COUNT returns the total number of values in a given column
COUNT(*) returns the number of rows in a table
What SQL can be used to list the tables, and the rows within those tables in a SQLite database file - once I have attached it with the ATTACH
command on the SQLite 3 command line tool?
The .tables
, and .schema
“helper” functions don’t look into ATTACHed databases: they just query the SQLITE_MASTER
table for the “main” database. Consequently, if you used
then you need to do
Note that temporary tables don’t show up with .tables
either: you have to list sqlite_temp_master
for that:
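A sketch of the whole sequence, assuming a hypothetical file some_file.db attached under the alias my_db:

```sql
-- Attach a second database file under an alias.
ATTACH 'some_file.db' AS my_db;

-- Tables of the attached database: query its own master table via the alias.
SELECT name FROM my_db.sqlite_master WHERE type = 'table';

-- Temporary tables are listed in a separate master table.
SELECT name FROM sqlite_temp_master WHERE type = 'table';
```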
There are a few steps to see the tables in an SQLite database:
List the tables in your database:
.tables
List how the table looks:
.schema tablename
Print the entire table:
SELECT * FROM tablename;
List all of the available SQLite prompt commands:
.help
It appears you need to go through the sqlite_master table, like this:
SELECT * FROM dbname.sqlite_master WHERE type='table';
And then manually go through each table with a SELECT
or similar to look at the rows.
The .DUMP and .SCHEMA commands don’t appear to see the database at all.
I have months stored in SQL Server as 1,2,3,4,…12. I would like to display them as January,February etc. Is there a function in SQL Server like MonthName(1) = January? I am trying to avoid a CASE statement, if possible.
A little hacky but should work:
I think this is the best way to get the month name when you have the month number
Or
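One way to do this without a CASE statement is sketched below: turn the month number into a date in that month, then ask DATENAME for the month name. The variable @MonthNumber is a hypothetical stand-in for the stored value, and the returned name depends on the session language:

```sql
DECLARE @MonthNumber int = 3;

-- 1900-01-01 plus (n - 1) months lands in month n.
SELECT DATENAME(month, DATEADD(month, @MonthNumber - 1, '19000101')) AS MonthName;
-- With an English session language this returns March.
```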
As the title suggests, I’d like to select the first row of each set of rows grouped with a GROUP BY
.
Specifically, if I’ve got a purchases
table that looks like this:
My Output:
id | customer | total
---+----------+------
 1 | Joe      | 5
 2 | Sally    | 3
 3 | Joe      | 2
 4 | Sally    | 1
I’d like to query for the id
of the largest purchase (total
) made by each customer
. Something like this:
Expected Output:
FIRST(id) | customer | FIRST(total)
----------+----------+-------------
        1 | Joe      | 5
        2 | Sally    | 3
WITH summary AS (
SELECT p.id,
p.customer,
p.total,
ROW_NUMBER() OVER(PARTITION BY p.customer
ORDER BY p.total DESC) AS rk
FROM PURCHASES p)
SELECT s.*
FROM summary s
WHERE s.rk = 1
But you need to add logic to break ties:
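A sketch of one possible tie-breaker: adding the id as a second sort key makes the ranking deterministic, so the smallest id wins when two purchases of the same customer share the highest total. Choosing id as the tie-breaker is an assumption; any unique column would do.

```sql
WITH summary AS (
    SELECT p.id,
           p.customer,
           p.total,
           ROW_NUMBER() OVER(PARTITION BY p.customer
                             ORDER BY p.total DESC, p.id) AS rk
    FROM PURCHASES p)
SELECT s.*
FROM summary s
WHERE s.rk = 1;
```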
In PostgreSQL this is typically simpler and faster (more performance optimization below):
SELECT DISTINCT ON (customer)
id, customer, total
FROM purchases
ORDER BY customer, total DESC, id;
Or shorter (if not as clear) with ordinal numbers of output columns:
SELECT DISTINCT ON (2)
id, customer, total
FROM purchases
ORDER BY 2, 3 DESC, 1;
If total can be NULL (won’t hurt either way, but you’ll want to match existing indexes):
...
ORDER BY customer, total DESC NULLS LAST, id;
Major points
- DISTINCT ON (https://www.postgresql.org/docs/current/sql-select.html#SQL-DISTINCT) is a PostgreSQL extension of the standard (where only DISTINCT on the whole SELECT list is defined).
- List any number of expressions in the DISTINCT ON clause; the combined row value defines duplicates. The manual (https://www.postgresql.org/docs/current/queries-select-lists.html#QUERIES-DISTINCT):
  "Obviously, two rows are considered distinct if they differ in at least one column value. **Null values are considered equal in this comparison.**"
  Bold emphasis mine.
- DISTINCT ON can be combined with ORDER BY. Leading expressions have to match leading DISTINCT ON expressions in the same order. You can add additional expressions to ORDER BY to pick a particular row from each group of peers. I added id as last item to break ties:
  "Pick the row with the smallest id from each group sharing the highest total."
  To order results in a way that disagrees with the sort order determining the first per group, you can nest the above query in an outer query with another ORDER BY. Like:
  PostgreSQL DISTINCT ON with different ORDER BY (https://stackoverflow.com/questions/9795660/postgresql-distinct-on-with-different-order-by/9796104#9796104)
- If total can be NULL, you most probably want the row with the greatest non-null value. Add NULLS LAST like demonstrated. Details:
  PostgreSQL sort by datetime asc, null first? (https://stackoverflow.com/questions/9510509/postgresql-sort-by-datetime-asc-null-first/9511492#9511492)
- The SELECT list is not constrained by expressions in DISTINCT ON or ORDER BY in any way. (Not needed in the simple case above):
  - You don’t have to include any of the expressions in DISTINCT ON or ORDER BY.
  - You can include any other expression in the SELECT list. This is instrumental for replacing much more complex queries with subqueries and aggregate / window functions.
- I tested with Postgres versions 8.3 – 12. But the feature has been there at least since version 7.1, so basically always.
Index
The perfect index for the above query would be a multi-column index (https://www.postgresql.org/docs/current/indexes-multicolumn.html) spanning all three columns in matching sequence and with matching sort order:
CREATE INDEX purchases_3c_idx ON purchases (customer, total DESC, id);
May be too specialized. But use it if read performance for the particular query is crucial. If you have DESC NULLS LAST
in the query, use the same in the index so that sort order matches and the index is applicable.
Weigh cost and benefit before creating tailored indexes for each query. The potential of above index largely depends on data distribution.
The index is used because it delivers pre-sorted data. In Postgres 9.2 or later the query can also benefit from an index only scan if the index is smaller than the underlying table. The index has to be scanned in its entirety, though.
For few rows per customer (high cardinality in column customer
), this is very efficient. Even more so if you need sorted output anyway. The benefit shrinks with a growing number of rows per customer.
Ideally, you have enough work_mem
to process the involved sort step in RAM and not spill to disk. But generally setting work_mem
too high can have adverse effects. Consider SET LOCAL
for exceptionally big queries. Find how much you need with EXPLAIN ANALYZE
. Mention of “Disk:” in the sort step indicates the need for more:
For many rows per customer (low cardinality in column customer
), a loose index scan (a.k.a. “skip scan”) would be (much) more efficient, but that’s not implemented up to Postgres 12. (An implementation for index-only scans is in development for Postgres 13. See here and here.)
For now, there are faster query techniques to substitute for this. In particular if you have a separate table holding unique customers, which is the typical use case. But also if you don’t:
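One commonly used technique of this kind is a LATERAL join that fetches the top row per customer with a separate index lookup. The sketch below assumes Postgres 9.3 or later, plus the customer table and the (customer_id, total DESC, id) index from the benchmark setup further down; it is an illustration and may differ from the queries the author links to:

```sql
SELECT l.*
FROM customer c
CROSS JOIN LATERAL (
    SELECT id, customer_id, total
    FROM purchases
    WHERE customer_id = c.customer_id  -- correlated to the outer row
    ORDER BY total DESC, id            -- id breaks ties
    LIMIT 1
) l;
```

Customers without any purchases drop out of this result; LEFT JOIN LATERAL ... ON true would keep them.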
I had a simple benchmark here which is outdated by now. I replaced it with a detailed benchmark in this separate answer.
Testing the most interesting candidates with Postgres 9.4 and 9.5 with a halfway realistic table of 200k rows in purchases
and 10k distinct customer_id
(avg. 20 rows per customer).
For Postgres 9.5 I ran a 2nd test with effectively 86446 distinct customers. See below (avg. 2.3 rows per customer).
Main table
CREATE TABLE purchases (
id serial
, customer_id int -- REFERENCES customer
, total int -- could be amount of money in Cent
, some_column text -- to make the row bigger, more realistic
);
I use a serial
(PK constraint added below) and an integer customer_id
since that’s a more typical setup. Also added some_column
to make up for typically more columns.
Dummy data, PK, index - a typical table also has some dead tuples:
INSERT INTO purchases (customer_id, total, some_column) -- insert 200k rows
SELECT (random() * 10000)::int AS customer_id -- 10k customers
, (random() * random() * 100000)::int AS total
, 'note: ' || repeat('x', (random()^2 * random() * random() * 500)::int)
FROM generate_series(1,200000) g;
ALTER TABLE purchases ADD CONSTRAINT purchases_id_pkey PRIMARY KEY (id);
DELETE FROM purchases WHERE random() > 0.9; -- some dead rows
INSERT INTO purchases (customer_id, total, some_column)
SELECT (random() * 10000)::int AS customer_id -- 10k customers
, (random() * random() * 100000)::int AS total
, 'note: ' || repeat('x', (random()^2 * random() * random() * 500)::int)
FROM generate_series(1,20000) g; -- add 20k to make it ~ 200k
CREATE INDEX purchases_3c_idx ON purchases (customer_id, total DESC, id);
VACUUM ANALYZE purchases;
customer table - for superior query
CREATE TABLE customer AS
SELECT customer_id, 'customer_' || customer_id AS customer
FROM purchases
GROUP BY 1
ORDER BY 1;
ALTER TABLE customer ADD CONSTRAINT customer_customer_id_pkey PRIMARY KEY (customer_id);
VACUUM ANALYZE customer;
In my second test for 9.5 I used the same setup, but with random() * 100000 to generate customer_id to get only few rows per customer_id.
purchases
Generated with this query.
what | bytes/ct | bytes_pretty | bytes_per_row
-----------------------------------+----------+--------------+---------------
core_relation_size | 20496384 | 20 MB | 102
visibility_map | 0 | 0 bytes | 0
free_space_map | 24576 | 24 kB | 0
table_size_incl_toast | 20529152 | 20 MB | 102
indexes_size | 10977280 | 10 MB | 54
total_size_incl_toast_and_indexes | 31506432 | 30 MB | 157
live_rows_in_text_representation | 13729802 | 13 MB | 68
------------------------------ | | |
row_count | 200045 | |
live_tuples | 200045 | |
dead_tuples | 19955 | |
row_number() in CTE (see other answer)
WITH cte AS (
SELECT id, customer_id, total
, row_number() OVER(PARTITION BY customer_id ORDER BY total DESC) AS rn
FROM purchases
)
SELECT id, customer_id, total
FROM cte
WHERE rn = 1;
row_number() in subquery (my optimization)
SELECT id, customer_id, total
FROM (
SELECT id, customer_id, total
, row_number() OVER(PARTITION BY customer_id ORDER BY total DESC) AS rn
FROM purchases
) sub
WHERE rn = 1;
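To see what the row_number() variant returns on a tiny data set, here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in for Postgres. The five sample rows are made up for illustration, and it assumes the bundled SQLite is 3.25 or newer (the first version with window functions). The extra id in the ORDER BY is my addition so that ties on total resolve deterministically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchases (id INTEGER PRIMARY KEY, customer_id INT, total INT);
    -- hypothetical sample rows, not the benchmark data
    INSERT INTO purchases (id, customer_id, total) VALUES
        (1, 1, 50), (2, 1, 90), (3, 2, 30), (4, 2, 30), (5, 3, 10);
""")

# Number the rows per customer from highest total down, then keep row 1.
top_purchases = conn.execute("""
    SELECT id, customer_id, total
    FROM (
        SELECT id, customer_id, total,
               row_number() OVER (PARTITION BY customer_id
                                  ORDER BY total DESC, id) AS rn
        FROM purchases
    ) sub
    WHERE rn = 1
    ORDER BY customer_id
""").fetchall()

print(top_purchases)
```

Each customer contributes exactly one row, which is the property all the candidate queries in this benchmark share.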
DISTINCT ON (see other answer)
SELECT DISTINCT ON (customer_id)
id, customer_id, total
FROM purchases
ORDER BY customer_id, total DESC, id;
Recursive CTE with LATERAL subquery (see here)
WITH RECURSIVE cte AS (
( -- parentheses required
SELECT id, customer_id, total
FROM purchases
ORDER BY customer_id, total DESC
LIMIT 1
)
UNION ALL
SELECT u.*
FROM cte c
, LATERAL (
SELECT id, customer_id, total
FROM purchases
WHERE customer_id > c.customer_id -- lateral reference
ORDER BY customer_id, total DESC
LIMIT 1
) u
)
SELECT id, customer_id, total
FROM cte
ORDER BY customer_id;
customer table with LATERAL (see here)
SELECT l.*
FROM customer c
, LATERAL (
SELECT id, customer_id, total
FROM purchases
WHERE customer_id = c.customer_id -- lateral reference
ORDER BY total DESC
LIMIT 1
) l;
array_agg() with ORDER BY (see other answer)
SELECT (array_agg(id ORDER BY total DESC))[1] AS id
, customer_id
, max(total) AS total
FROM purchases
GROUP BY customer_id;
Execution time for above queries with EXPLAIN ANALYZE (and all options off), best of 5 runs.
All queries used an Index Only Scan on purchases2_3c_idx (among other steps). Some of them just for the smaller size of the index, others more effectively.
Here is a new one by “ogr” testing with 10M rows and 60k unique “customers” on Postgres 11.5 (current as of Sep. 2019). Results are still in line with what we have seen so far:
I ran three tests with PostgreSQL 9.1 on a real life table of 65579 rows and single-column btree indexes on each of the three columns involved and took the best execution time of 5 runs.
Comparing @OMGPonies’ first query (A) to the above DISTINCT ON solution (B):
1. Select the whole table, results in 5958 rows in this case.
A: 567.218 ms
B: 386.673 ms
2. Use condition WHERE customer BETWEEN x AND y resulting in 1000 rows.
A: 249.136 ms
B: 55.111 ms
3. Select a single customer with WHERE customer = x.
A: 0.143 ms
B: 0.072 ms
Same test repeated with the index described in the other answer:
CREATE INDEX purchases_3c_idx ON purchases (customer, total DESC, id);
1A: 277.953 ms
1B: 193.547 ms
2A: 249.796 ms -- special index not used
2B: 28.679 ms
3A: 0.120 ms
3B: 0.048 ms
I didn’t see any similar questions asked on this topic, and I had to research this for something I’m working on right now. Thought I would post the answer for it in case anyone else had the same question.
I found the answer here: http://blog.sqlauthority.com/2007/08/22/sql-server-t-sql-script-to-insert-carriage-return-and-new-line-feed-in-code/
You just concatenate the string and insert a CHAR(13) where you want your line break.
Example:
DECLARE @text NVARCHAR(100)
SET @text = 'This is line 1.' + CHAR(13) + 'This is line 2.'
SELECT @text
This prints out the following:
This is line 1.
This is line 2.
char(13) is CR. For DOS-/Windows-style CRLF linebreaks, you want char(13)+char(10), like:
Another way to do this is as such:
That is, simply inserting a line break in your query while writing it will add the line break to the database. This works in SQL Server Management Studio and Query Analyzer. I believe this will also work in C# if you use the @ sign on strings.
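The difference between a bare CR and a CRLF pair is easy to check from client code. The following is a minimal sketch in Python with the standard-library sqlite3 module; SQLite's char() is used here as a stand-in for T-SQL's CHAR(), and || replaces the + concatenation operator, so treat it as an illustration of the characters involved rather than SQL Server syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Build one string with CR only and one with CR + LF inside the database.
cr_only, crlf = conn.execute(
    "SELECT 'line 1' || char(13) || 'line 2',"
    "       'line 1' || char(13) || char(10) || 'line 2'"
).fetchone()

print(repr(cr_only))  # shows \r between the two parts
print(repr(crlf))     # shows \r\n between the two parts
```

Whether a lone CR is rendered as a line break depends on the tool displaying the value, which is why the CRLF pair is the safer choice on Windows.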
I am using SQL Server 2008 Enterprise. I want to see any active SQL Server connections, and the related information of all the connections, like from which IP address, connect to which database or something.
Are there existing commands to solve this issue?
You can use the sp_who stored procedure.
Provides information about current users, sessions, and processes in an instance of the Microsoft SQL Server Database Engine. The information can be filtered to return only those processes that are not idle, that belong to a specific user, or that belong to a specific session.
SELECT
DB_NAME(dbid) as DBName,
COUNT(dbid) as NumberOfConnections,
loginame as LoginName
FROM
sys.sysprocesses
WHERE
dbid > 0
GROUP BY
dbid, loginame
;
See also the Microsoft documentation for sys.sysprocesses.
Apart from sp_who, you can also use the “undocumented” sp_who2 system stored procedure which gives you more detailed information. See Difference between sp_who and sp_who2.
I need to access only Month.Year from Date field in SQL Server.
As well as the suggestions given already, there is one other possibility I can infer from your question:
- You still want the result to be a date
- But you want to ‘discard’ the Days, Hours, etc
- Leaving a year/month only date field
SELECT
DATEADD(MONTH, DATEDIFF(MONTH, 0, <dateField>), 0) AS [year_month_date_field]
FROM
<your_table>
This gets the number of whole months from a base date (0) and then adds them to that base date, thus rounding down to the first day of the month the date falls in.
NOTE: In SQL Server 2008, the result still has the time attached as 00:00:00.000 and the day set to the first, e.g. 2009-10-01 00:00:00.000. This is not exactly the same as “removing” any notation of day and time altogether.
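The arithmetic behind the DATEDIFF/DATEADD pair can be followed step by step in a short Python sketch. It assumes the base date 0 corresponds to 1900-01-01, which is how SQL Server interprets 0 for datetime; the function name truncate_to_month is made up for this illustration.

```python
from datetime import date

BASE = date(1900, 1, 1)  # what SQL Server treats datetime value 0 as


def truncate_to_month(d):
    # DATEDIFF(MONTH, 0, d): number of month boundaries between base and d
    months = (d.year - BASE.year) * 12 + (d.month - BASE.month)
    # DATEADD(MONTH, months, 0): add those months back onto the base date
    years, month_index = divmod(BASE.month - 1 + months, 12)
    return date(BASE.year + years, month_index + 1, 1)


print(truncate_to_month(date(2009, 10, 15)))
```

Because only whole months are counted, the day component is lost on the way and comes back as the first of the month.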
I want to query something with SQL’s like query:
How do I achieve the same in MongoDB? I can’t find an operator for like in the documentation.
That would have to be:
or, similar:
You’re looking for something that contains “m” somewhere (SQL’s ‘%’ operator is equivalent to Regexp’s ‘.*’), not something that has “m” anchored to the beginning of the string.
db.users.insert({name: 'paulo'})
db.users.insert({name: 'patric'})
db.users.insert({name: 'pedro'})
db.users.find({name: /a/}) //like '%a%'
out: paulo, patric
out: paulo, patric
out: pedro
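How LIKE wildcards map onto regular-expression anchors can be tried without a MongoDB instance. The sketch below uses Python's re module on the three sample names; the patterns ^pa and ro$ are my own illustrative choices for a prefix and a suffix match, and re.search is used because, like an unanchored MongoDB regex, it matches anywhere in the string.

```python
import re

names = ["paulo", "patric", "pedro"]


def matching(pattern):
    # re.search matches anywhere unless the pattern is anchored with ^ or $
    return [n for n in names if re.search(pattern, n)]


contains_a = matching(r"a")        # like '%a%'
starts_with_pa = matching(r"^pa")  # like 'pa%'
ends_with_ro = matching(r"ro$")    # like '%ro'

print(contains_a, starts_with_pa, ends_with_ro)
```

An unanchored pattern already behaves like a leading and trailing %, so there is no need to wrap it in .* on both sides.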
In
you can do:
I have a query in SQL and I have to get the date in the format dd/mmm/yy.
Example: 25/jun/2013.
How can I convert it for SQL Server?
I’m not sure there is an exact match for the format you want. But you can get close with convert() and style 106. Then, replace the spaces:
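The same two steps (format as “dd mon yyyy”, then swap the spaces for slashes) can be sketched in Python for comparison. This is only an analogy to the convert() approach, not T-SQL; it assumes the default C locale, where %b yields English month abbreviations.

```python
from datetime import datetime

d = datetime(2013, 6, 25)

# Step 1: the same shape that style 106 produces, "dd mon yyyy"
style_106_like = d.strftime("%d %b %Y")

# Step 2: replace the spaces with slashes
formatted = style_106_like.replace(" ", "/")

print(formatted)
```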
There are already multiple answers and formatting types for SQL Server 2008. But this method is somewhat ambiguous, and it is difficult to remember which style number corresponds to which date format. That’s why later versions of SQL Server offer a better option.
With the culture option, you can format the date according to your viewers’ locale.
DECLARE @d DATETIME = '10/01/2011';
SELECT FORMAT ( @d, 'd', 'en-US' ) AS 'US English Result'
,FORMAT ( @d, 'd', 'en-gb' ) AS 'Great Britain English Result'
,FORMAT ( @d, 'd', 'de-de' ) AS 'German Result'
,FORMAT ( @d, 'd', 'zh-cn' ) AS 'Simplified Chinese (PRC) Result';
SELECT FORMAT ( @d, 'D', 'en-US' ) AS 'US English Result'
,FORMAT ( @d, 'D', 'en-gb' ) AS 'Great Britain English Result'
,FORMAT ( @d, 'D', 'de-de' ) AS 'German Result'
,FORMAT ( @d, 'D', 'zh-cn' ) AS 'Chinese (Simplified PRC) Result';
US English Result Great Britain English Result German Result Simplified Chinese (PRC) Result
---------------- ----------------------------- ------------- -------------------------------------
10/1/2011 01/10/2011 01.10.2011 2011/10/1
US English Result Great Britain English Result German Result Chinese (Simplified PRC) Result
---------------------------- ----------------------------- ----------------------------- ---------------------------------------
Saturday, October 01, 2011 01 October 2011 Samstag, 1. Oktober 2011 2011年10月1日
For OP’s solution, we can use following format, which is already mentioned by @Martin Smith:
Some sample date formats:
If you want more date formats of SQL server, you should visit:
we can convert date into many formats like
This returns dd mon yyyy
More Here This may help you
Table 1:
Table 2:
In Oracle SQL, how do I run an SQL update query that can update Table 1 with Table 2’s name and desc using the same id? So the end result I would get is
Table 1:
Question is taken from update one table with data from another, but specifically for oracle SQL.
This is called a correlated update
UPDATE table1 t1
SET (name, desc) = (SELECT t2.name, t2.desc
FROM table2 t2
WHERE t1.id = t2.id)
WHERE EXISTS (
SELECT 1
FROM table2 t2
WHERE t1.id = t2.id )
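A correlated update of this shape can be tried out with Python's sqlite3 module. This is a minimal sketch, not Oracle: the sample rows are invented, the column is called descr because desc is a keyword in SQLite, and the outer table is referenced by name instead of through an alias. It assumes SQLite 3.15 or newer for the row-value assignment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INT PRIMARY KEY, name TEXT, descr TEXT);
    CREATE TABLE table2 (id INT PRIMARY KEY, name TEXT, descr TEXT);
    -- hypothetical sample data
    INSERT INTO table1 VALUES (1, 'a', 'abc'), (2, 'b', 'def'), (3, 'c', 'adf');
    INSERT INTO table2 VALUES (1, 'x', '123'), (2, 'y', '345');

    -- The subquery is re-evaluated for every row of table1 (correlated).
    UPDATE table1
    SET (name, descr) = (SELECT t2.name, t2.descr
                         FROM table2 t2
                         WHERE t2.id = table1.id)
    -- Without this filter, unmatched rows would be set to NULL.
    WHERE EXISTS (SELECT 1 FROM table2 t2 WHERE t2.id = table1.id);
""")

rows = conn.execute("SELECT id, name, descr FROM table1 ORDER BY id").fetchall()
print(rows)
```

Row 3 has no partner in table2 and keeps its values, which is exactly what the WHERE EXISTS clause is there for.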
Assuming the join results in a key-preserved view, you could also
Try this:
try
UPDATE Table1 T1 SET
T1.name = (SELECT T2.name FROM Table2 T2 WHERE T2.id = T1.id),
T1.desc = (SELECT T2.desc FROM Table2 T2 WHERE T2.id = T1.id)
WHERE T1.id IN (SELECT T2.id FROM Table2 T2 WHERE T2.id = T1.id);
I have inserted records into a SQL Server database table. The table had a primary key defined and the auto increment identity seed is set to “Yes”. This is done primarily because in SQL Azure, each table has to have a primary key and identity defined.
But since I have to delete some records from the table, the identity seed for those tables will be disturbed and the index column (which is auto-generated with an increment of 1) will get disturbed.
How can I reset the identity column after I deleted the records so that the column has sequence in ascending numerical order?
The identity column is not used as a foreign key anywhere in database.
The DBCC CHECKIDENT management command is used to reset the identity counter. The command syntax is:
Example:
It was not supported in previous versions of Azure SQL Database, but is supported now.
Please note that the handling of the new_reseed_value argument varies across SQL Server versions according to the documentation:
If rows are present in the table, the next row is inserted with the new_reseed_value value. In version SQL Server 2008 R2 and earlier, the next row inserted uses new_reseed_value + the current increment value.
However, I find this information misleading (just plain wrong actually) because observed behaviour indicates that at least SQL Server 2012 still uses the new_reseed_value + the current increment value logic. Microsoft even contradicts itself with its own Example C found on the same page:
C. Forcing the current identity value to a new value
The following example forces the current identity value in the AddressTypeID column in the AddressType table to a value of 10. Because the table has existing rows, the next row inserted will use 11 as the value, that is, the new current increment value defined for the column value plus 1.
Still, this all leaves open the possibility of different behaviour on newer SQL Server versions. I guess the only way to be sure, until Microsoft clears things up in its own documentation, is to run actual tests before use.
Where 0 is the identity start value
It should be noted that IF all of the data is being removed from the table via the DELETE (i.e. no WHERE clause), then as long as a) permissions allow for it, and b) there are no FKs referencing the table (which appears to be the case here), using TRUNCATE TABLE would be preferred as it does a more efficient DELETE and resets the IDENTITY seed at the same time. The following details are taken from the MSDN page for TRUNCATE TABLE:
Compared to the DELETE statement, TRUNCATE TABLE has the following advantages:
If the table contains an identity column, the counter for that column is reset to the seed value defined for the column. If no seed was defined, the default value 1 is used. To retain the identity counter, use DELETE instead.
Less transaction log space is used.
The DELETE statement removes rows one at a time and records an entry in the transaction log for each deleted row. TRUNCATE TABLE removes the data by deallocating the data pages used to store the table data and records only the page deallocations in the transaction log.
Fewer locks are typically used.
When the DELETE statement is executed using a row lock, each row in the table is locked for deletion. TRUNCATE TABLE always locks the table (including a schema (SCH-M) lock) and page but not each row.
Without exception, zero pages are left in the table.
After a DELETE statement is executed, the table can still contain empty pages. For example, empty pages in a heap cannot be deallocated without at least an exclusive (LCK_M_X) table lock. If the delete operation does not use a table lock, the table (heap) will contain many empty pages. For indexes, the delete operation can leave empty pages behind, although these pages will be deallocated quickly by a background cleanup process.
So the following:
Becomes just:
Please see the TRUNCATE TABLE documentation (linked above) for additional information on restrictions, etc.
It appears that there is no concept of AUTO_INCREMENT in Oracle, up until and including version 11g.
How can I create a column that behaves like auto increment in Oracle 11g?
There is no such thing as “auto_increment” or “identity” columns in Oracle as of Oracle 11g. However, you can model it easily with a sequence and a trigger:
Table definition:
CREATE TABLE departments (
ID NUMBER(10) NOT NULL,
DESCRIPTION VARCHAR2(50) NOT NULL);
ALTER TABLE departments ADD (
CONSTRAINT dept_pk PRIMARY KEY (ID));
CREATE SEQUENCE dept_seq START WITH 1;
Trigger definition:
CREATE OR REPLACE TRIGGER dept_bir
BEFORE INSERT ON departments
FOR EACH ROW
BEGIN
SELECT dept_seq.NEXTVAL
INTO :new.id
FROM dual;
END;
/
IDENTITY column is now available on Oracle 12c:
or specify starting and increment values, also preventing any insert into the identity column (GENERATED ALWAYS) (again, Oracle 12c+ only)
create table t1 (
c1 NUMBER GENERATED ALWAYS as IDENTITY(START with 1 INCREMENT by 1),
c2 VARCHAR2(10)
);
Alternatively, Oracle 12 also allows using a sequence as a default value:
SYS_GUID returns a GUID, a globally unique ID. A SYS_GUID is a RAW(16). It does not generate an incrementing numeric value.
If you want to create an incrementing numeric key, you’ll want to create a sequence.
You would then either use that sequence in your INSERT statement
INSERT INTO name_of_table( primary_key_column, <<other columns>> )
VALUES( name_of_sequence.nextval, <<other values>> );
Or you can define a trigger that automatically populates the primary key value using the sequence
CREATE OR REPLACE TRIGGER trigger_name
BEFORE INSERT ON table_name
FOR EACH ROW
BEGIN
SELECT name_of_sequence.nextval
INTO :new.primary_key_column
FROM dual;
END;
If you are using Oracle 11.1 or later, you can simplify the trigger a bit
CREATE OR REPLACE TRIGGER trigger_name
BEFORE INSERT ON table_name
FOR EACH ROW
BEGIN
:new.primary_key_column := name_of_sequence.nextval;
END;
If you really want to use SYS_GUID
In Oracle 12c onward you could do something like,
CREATE TABLE MAPS
(
MAP_ID INTEGER GENERATED ALWAYS AS IDENTITY (START WITH 1 INCREMENT BY 1) NOT NULL,
MAP_NAME VARCHAR(24) NOT NULL,
UNIQUE (MAP_ID, MAP_NAME)
);
And in Oracle (Pre 12c).
-- create table
CREATE TABLE MAPS
(
MAP_ID INTEGER NOT NULL ,
MAP_NAME VARCHAR(24) NOT NULL,
UNIQUE (MAP_ID, MAP_NAME)
);
-- create sequence
CREATE SEQUENCE MAPS_SEQ;
-- create trigger using the sequence
CREATE OR REPLACE TRIGGER MAPS_TRG
BEFORE INSERT ON MAPS
FOR EACH ROW
WHEN (new.MAP_ID IS NULL)
BEGIN
SELECT MAPS_SEQ.NEXTVAL
INTO :new.MAP_ID
FROM dual;
END;
/
Why am I getting an error doing an insert when IDENTITY_INSERT is set to OFF?
How do I turn it on properly in SQL Server 2008? Is it by using SQL Server Management Studio?
I have run this query:
Then I got the message back in the console that the Command(s) completed successfully. However when I run the application, it still gives me the error shown below:
Via SQL as per MSDN
SET IDENTITY_INSERT sometableWithIdentity ON
INSERT sometableWithIdentity (IdentityColumn, col2, col3, ...)
VALUES (AnIdentityValue, col2value, col3value, ...)
SET IDENTITY_INSERT sometableWithIdentity OFF
The complete error message tells you exactly what is wrong…
Cannot insert explicit value for identity column in table ‘sometableWithIdentity’ when IDENTITY_INSERT is set to OFF.
I had a problem where it did not allow me to insert even after setting IDENTITY_INSERT ON.
The problem was that I did not specify the column names, and for some reason it did not like that.
So basically do the full INSERT INTO tbl(cols) Values(vals)
Important: You must write the columns in the INSERT statement
Is not correct.
Is correct
In SQL Server 2005 I have a table cm_production that lists all the code that’s been put into production. The table has a ticket_number, program_type, and program_name and push_number along with some other columns.
GOAL: Count all the DISTINCT program names by program type and push number
What I have so far is:
DECLARE @push_number INT;
SET @push_number = [HERE_ADD_NUMBER];
SELECT DISTINCT COUNT(*) AS Count, program_type AS [Type]
FROM cm_production
WHERE push_number=@push_number
GROUP BY program_type
This gets me partway there, but it’s counting all the program names, not the distinct ones (which I don’t expect it to do in that query). I guess I just can’t wrap my head around how to tell it to count only the distinct program names without selecting them. Or something.
Count all the DISTINCT program names by program type and push number
SELECT COUNT(DISTINCT program_name) AS Count,
program_type AS [Type]
FROM cm_production
WHERE push_number=@push_number
GROUP BY program_type
DISTINCT COUNT(*) will return a row for each unique count. What you want is COUNT(DISTINCT <expression>): it evaluates the expression for each row in a group and returns the number of unique, non-null values.
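The difference between COUNT(*) and COUNT(DISTINCT ...) shows up clearly when both are put side by side. Here is a minimal sketch using Python's sqlite3 module with a handful of made-up rows; the table and column names follow the question, while the data and the push number 7 are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cm_production (program_type TEXT, program_name TEXT,
                                push_number INT);
    -- hypothetical rows: 'load_a' was pushed twice in push 7
    INSERT INTO cm_production VALUES
        ('proc', 'load_a', 7), ('proc', 'load_a', 7), ('proc', 'load_b', 7),
        ('view', 'v_sales', 7), ('proc', 'load_c', 8);
""")

counts = conn.execute("""
    SELECT program_type,
           COUNT(*)                     AS all_rows,
           COUNT(DISTINCT program_name) AS distinct_names
    FROM cm_production
    WHERE push_number = ?
    GROUP BY program_type
    ORDER BY program_type
""", (7,)).fetchall()

print(counts)
```

For type proc the plain count is 3 while the distinct count is 2, because load_a appears twice in the same push.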
I needed to get the number of occurrences of each distinct value. The column contained Region info. The simple SQL query I ended up with was:
Which would give me a list like, say:
You have to create a derived table for the distinct columns and then query the count from that table:
Here dt is a derived table.
I need to retrieve all rows from a table where 2 columns combined are all different. So I want all the sales that do not have any other sales that happened on the same day for the same price. The sales that are unique based on day and price will get updated to an active status.
So I’m thinking:
UPDATE sales
SET status = 'ACTIVE'
WHERE id IN (SELECT DISTINCT (saleprice, saledate), id, count(id)
FROM sales
HAVING count = 1)
But my brain hurts going any farther than that.
is roughly equivalent to:
It’s a good idea to get used to the GROUP BY syntax, as it’s more powerful.
For your query, I’d do it like this:
If you put together the answers so far, clean up and improve, you would arrive at this superior query:
UPDATE sales
SET status = 'ACTIVE'
WHERE (saleprice, saledate) IN (
SELECT saleprice, saledate
FROM sales
GROUP BY saleprice, saledate
HAVING count(*) = 1
);
Which is much faster than either of them. Nukes the performance of the currently accepted answer by factor 10 - 15 (in my tests on PostgreSQL 8.4 and 9.1).
But this is still far from optimal. Use a NOT EXISTS (anti-)semi-join for even better performance. EXISTS is standard SQL, has been around forever (at least since PostgreSQL 7.2, long before this question was asked) and fits the presented requirements perfectly:
UPDATE sales s
SET status = 'ACTIVE'
WHERE NOT EXISTS (
SELECT FROM sales s1 -- SELECT list can be empty for EXISTS
WHERE s.saleprice = s1.saleprice
AND s.saledate = s1.saledate
AND s.id <> s1.id -- except for row itself
)
AND s.status IS DISTINCT FROM 'ACTIVE'; -- avoid empty updates. see below
db<>fiddle here
Old SQL Fiddle
If you don’t have a primary or unique key for the table (id in the example), you can substitute with the system column ctid for the purpose of this query (but not for some other purposes):
Every table should have a primary key. Add one if you didn’t have one, yet. I suggest a serial or an IDENTITY column in Postgres 10+.
Related:
The subquery in the EXISTS anti-semi-join can stop evaluating as soon as the first dupe is found (no point in looking further). For a base table with few duplicates this is only mildly more efficient. With lots of duplicates this becomes way more efficient.
For rows that already have status = 'ACTIVE' this update would not change anything, but still insert a new row version at full cost (minor exceptions apply). Normally, you do not want this. Add another WHERE condition like demonstrated above to avoid this and make it even faster:
If status is defined NOT NULL, you can simplify to:
This query (unlike the currently accepted answer by Joel) does not treat NULL values as equal. The following two rows for (saleprice, saledate) would qualify as “distinct” (though looking identical to the human eye):
Also passes in a unique index and almost anywhere else, since NULL values do not compare equal according to the SQL standard. See:
OTOH, GROUP BY, DISTINCT or DISTINCT ON () treat NULL values as equal. Use an appropriate query style depending on what you want to achieve. You can still use this faster query with IS NOT DISTINCT FROM instead of = for any or all comparisons to make NULL compare equal. More:
If all columns being compared are defined NOT NULL, there is no room for disagreement.
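The NULL behaviour described above can be reproduced with a small experiment. The sketch below uses Python's sqlite3 module as a stand-in for Postgres, so two things differ from the answer's query: SQLite's IS operator plays the role of IS NOT DISTINCT FROM, and SELECT 1 replaces the empty SELECT list. The five sample rows and the helper name active_ids are invented for the illustration.

```python
import sqlite3


def active_ids(update_sql):
    """Run one UPDATE against fresh sample data and return the ACTIVE ids."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sales (id INTEGER PRIMARY KEY, saleprice INT,
                            saledate TEXT, status TEXT);
        INSERT INTO sales (id, saleprice, saledate) VALUES
            (1, 10,   '2012-01-01'),   -- unique
            (2, 20,   '2012-01-02'),   -- duplicate pair
            (3, 20,   '2012-01-02'),
            (4, NULL, '2012-01-03'),   -- look identical, but price is NULL
            (5, NULL, '2012-01-03');
    """)
    conn.execute(update_sql)
    return [row[0] for row in conn.execute(
        "SELECT id FROM sales WHERE status = 'ACTIVE' ORDER BY id")]


NOT_EXISTS = """
    UPDATE sales SET status = 'ACTIVE'
    WHERE NOT EXISTS (
        SELECT 1 FROM sales s1
        WHERE sales.saleprice {eq} s1.saleprice
          AND sales.saledate  {eq} s1.saledate
          AND sales.id <> s1.id)
"""

with_equals = active_ids(NOT_EXISTS.format(eq="="))   # NULLs never match
null_safe = active_ids(NOT_EXISTS.format(eq="IS"))    # NULLs compare equal

print(with_equals, null_safe)
```

With plain = the two NULL-priced rows never match each other, so both count as unique; with the null-safe comparison they are treated as duplicates of one another.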
The problem with your query is that when using a GROUP BY clause (which you essentially do by using distinct) you can only use columns that you group by or aggregate functions. You cannot use the column id because there are potentially different values. In your case there is always only one value because of the HAVING clause, but most RDBMS are not smart enough to recognize that.
This should work however (and doesn’t need a join):
UPDATE sales
SET status='ACTIVE'
WHERE id IN (
SELECT MIN(id) FROM sales
GROUP BY saleprice, saledate
HAVING COUNT(id) = 1
)
You could also use MAX or AVG instead of MIN, it is only important to use a function that returns the value of the column if there is only one matching row.
I have to update a field with a value which is returned by a join of 3 tables.
Example:
select
im.itemid
,im.sku as iSku
,gm.SKU as GSKU
,mm.ManufacturerId as ManuId
,mm.ManufacturerName
,im.mf_item_number
,mm.ManufacturerID
from
item_master im, group_master gm, Manufacturer_Master mm
where
im.mf_item_number like 'STA%'
and im.sku=gm.sku
and gm.ManufacturerID = mm.ManufacturerID
and gm.manufacturerID=34
I want to update the mf_item_number
field values of table item_master
with some other value which is joined in the above condition.
How can I do this in MS SQL Server?
UPDATE im
SET mf_item_number = gm.SKU --etc
FROM item_master im
JOIN group_master gm
ON im.sku = gm.sku
JOIN Manufacturer_Master mm
ON gm.ManufacturerID = mm.ManufacturerID
WHERE im.mf_item_number like 'STA%' AND
gm.manufacturerID = 34
To make it clear… The UPDATE clause can refer to a table alias specified in the FROM clause. So im in this case is valid.
One of the easiest ways is to use a common table expression (since you’re already on SQL 2005):
with cte as (
select
im.itemid
,im.sku as iSku
,gm.SKU as GSKU
,mm.ManufacturerId as ManuId
,mm.ManufacturerName
,im.mf_item_number
,mm.ManufacturerID
, <your other field>
from
item_master im, group_master gm, Manufacturer_Master mm
where
im.mf_item_number like 'STA%'
and im.sku=gm.sku
and gm.ManufacturerID = mm.ManufacturerID
and gm.manufacturerID=34)
update cte set mf_item_number = <your other field>
The query execution engine will figure out on its own how to update the record.
Adapting this to MySQL – there is no FROM clause in UPDATE, but this works:
UPDATE
item_master im
JOIN
group_master gm ON im.sku=gm.sku
JOIN
Manufacturer_Master mm ON gm.ManufacturerID=mm.ManufacturerID
SET
im.mf_item_number = gm.SKU -- etc
WHERE
im.mf_item_number like 'STA%'
AND
gm.manufacturerID=34
How to find third or nth maximum salary from salary table(EmpID,EmpName,EmpSalary) in Optimized way?
Use ROW_NUMBER (if you want a single) or DENSE_RANK (for all related rows):
Row Number :
SELECT Salary,EmpName
FROM
(
SELECT Salary,EmpName,ROW_NUMBER() OVER(ORDER BY Salary DESC) As RowNum
FROM EMPLOYEE
) As A
WHERE A.RowNum IN (2,3)
Sub Query :
SELECT *
FROM Employee Emp1
WHERE (N-1) = (
SELECT COUNT(DISTINCT(Emp2.Salary))
FROM Employee Emp2
WHERE Emp2.Salary > Emp1.Salary
)
Top Keyword :
Try this
SELECT TOP 1 salary FROM (
SELECT TOP 3 salary
FROM employees
ORDER BY salary DESC) AS emp
ORDER BY salary ASC
Replace 3 with any value of n.
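The correlated-subquery variant is the easiest one to check by hand, because it needs nothing beyond COUNT(DISTINCT ...). Here is a minimal sketch with Python's sqlite3 module; the employee rows and the helper nth_highest are invented for the example, and ties are returned together, as in the “Sub Query” answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT,
                           salary INT);
    -- hypothetical data with ties on 900 and 700
    INSERT INTO employee (emp_name, salary) VALUES
        ('Ann', 900), ('Bob', 700), ('Cid', 900), ('Dee', 500), ('Eve', 700);
""")


def nth_highest(n):
    # A salary is the n-th highest when exactly n-1 distinct salaries exceed it.
    return conn.execute("""
        SELECT emp_name, salary
        FROM employee e1
        WHERE ? - 1 = (SELECT COUNT(DISTINCT e2.salary)
                       FROM employee e2
                       WHERE e2.salary > e1.salary)
        ORDER BY emp_name
    """, (n,)).fetchall()


print(nth_highest(2))
print(nth_highest(3))
```

The subquery runs once per outer row, so this form is mainly useful for small tables or for cross-checking the window-function versions.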
I want to insert a datetime value into a table (SQL Server) using the SQL query below
But I get this Error msg. Incorrect syntax near '10'.
I tried it with the quotes
I get this error message Cannot convert varchar to datetime
Kindly help! Thanks.
You will want to use the YYYYMMDD for unambiguous date determination in SQL Server.
If you are married to the dd-mm-yy hh:mm:ss xm format, you will need to use CONVERT with the specific style.
5 here is the style for Italian dates. Well, not just Italians, but that’s the culture it’s attributed to in Books Online.
A more language-independent choice for string literals is the international standard ISO 8601 format “YYYY-MM-DDThh:mm:ss”. I used the SQL query below to test the format, and it does indeed work in all SQL languages in sys.syslanguages:
declare @sql nvarchar(4000)
declare @LangID smallint
declare @Alias sysname
declare @MaxLangID smallint
select @MaxLangID = max(langid) from sys.syslanguages
set @LangID = 0
while @LangID <= @MaxLangID
begin
select @Alias = alias
from sys.syslanguages
where langid = @LangID
if @Alias is not null
begin
begin try
set @sql = N'declare @TestLang table (langdate datetime)
set language ''' + @alias + N''';
insert into @TestLang (langdate)
values (''2012-06-18T10:34:09'')'
print 'Testing ' + @Alias
exec sp_executesql @sql
end try
begin catch
print 'Error in language ' + @Alias
print ERROR_MESSAGE()
end catch
end
select @LangID = min(langid)
from sys.syslanguages
where langid > @LangID
end
According to the String Literal Date and Time Formats section in Microsoft TechNet, the ANSI standard SQL date format “YYYY-MM-DD hh:mm:ss” is supposed to be “multi-language”. However, using the same query, the ANSI format does not work in all SQL languages.
For example, in Danish, you will get many errors like the following:
Error in language Danish The conversion of a varchar data type to a datetime data type resulted in an out-of-range value.
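When the date literal is built in client code, the only difference between the two formats is the separator between date and time. A minimal Python sketch of producing both forms from the same value follows; it only shows the string shapes, and where the driver allows it, passing the value as a typed parameter avoids the formatting question altogether.

```python
from datetime import datetime

ts = datetime(2012, 6, 18, 10, 34, 9)

# ISO 8601 with the 'T' separator: the language-independent form tested above
iso_literal = ts.isoformat(timespec="seconds")

# ANSI-style form with a space: the one that failed for some languages
ansi_literal = ts.strftime("%Y-%m-%d %H:%M:%S")

print(iso_literal)
print(ansi_literal)
```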
If you want to build a query in C# to run on SQL Server, and you need to pass a date in the ISO 8601 format, use the Sortable “s” format specifier:
Management studio creates scripts like:
I want to create a table from select query result in SQL Server, I tried
but I got an error
Incorrect syntax near the keyword ‘AS’
Use following syntax to create new table from old table in SQL server 2008
use SELECT...INTO
The SELECT INTO statement creates a new table and populates it with the result set of the SELECT statement. SELECT INTO can be used to combine data from several tables or views into one table. It can also be used to create a new table that contains data selected from a linked server.
Example,
Standard Syntax,
Please be careful, MSSQL: "SELECT * INTO NewTable FROM OldTable"
is not always the same as MYSQL: "create table temp AS select.."
I think that there are occasions when this (in MSSQL) does not guarantee that all the fields in the new table are of the same type as the old.
For example :
create table oldTable (field1 varchar(10), field2 integer, field3 float)
insert into oldTable (field1,field2,field3) values ('1', 1, 1)
select top 1 * into newTable from oldTable
does not always yield:
but may be:
Using SQL Server, how do I split a string so I can access item x?
Take a string “Hello John Smith”. How can I split the string by space and access the item at index 1 which should return “John”?
You may find the solution in SQL User Defined Function to Parse a Delimited String helpful (from The Code Project).
You can use this simple logic:
Declare @products varchar(200) = '1|20|3|343|44|6|8765'
Declare @individual varchar(20) = null
WHILE LEN(@products) > 0
BEGIN
IF PATINDEX('%|%', @products) > 0
BEGIN
SET @individual = SUBSTRING(@products,
0,
PATINDEX('%|%', @products))
SELECT @individual
SET @products = SUBSTRING(@products,
LEN(@individual + '|') + 1,
LEN(@products))
END
ELSE
BEGIN
SET @individual = @products
SET @products = NULL
SELECT @individual
END
END
I don’t believe SQL Server has a built-in split function, so other than a UDF, the only other answer I know is to hijack the PARSENAME function:
PARSENAME takes a string and splits it on the period character. It takes a number as its second argument, and that number specifies which segment of the string to return (working from back to front).
Obvious problem is when the string already contains a period. I still think using a UDF is the best way…any other suggestions?
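To make the back-to-front indexing and its limits concrete, here is a rough Python model of the PARSENAME trick. It is a sketch of the behaviour as I understand it, not the real function: the assumption that names with more than four parts yield NULL (None here) follows from PARSENAME being designed for four-part object names, and bracket or quote handling is ignored. Both function names are made up.

```python
def parsename(object_name, part):
    """Simplified model: part 1 is the last dot-separated piece, 4 the first."""
    pieces = object_name.split(".")
    if not 1 <= part <= 4 or len(pieces) > 4 or part > len(pieces):
        return None  # stands in for the NULL that T-SQL returns
    return pieces[-part]


def item_at(text, index, sep=" "):
    """Zero-based item lookup built on the replace-then-PARSENAME idea."""
    dotted = text.replace(sep, ".")
    count = len(dotted.split("."))
    return parsename(dotted, count - index)


print(item_at("Hello John Smith", 1))
```

The second function shows why the approach is fragile: a string with five or more items, or one that already contains periods, no longer maps onto a four-part name.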
First, create a function (using CTE, common table expression does away with the need for a temp table)
create function dbo.SplitString
(
@str nvarchar(4000),
@separator char(1)
)
returns table
AS
return (
with tokens(p, a, b) AS (
select
1,
1,
charindex(@separator, @str)
union all
select
p + 1,
b + 1,
charindex(@separator, @str, b + 1)
from tokens
where b > 0
)
select
p-1 zeroBasedOccurance,
substring(
@str,
a,
case when b > 0 then b-a ELSE 4000 end)
AS s
from tokens
)
GO
Then, use it as any table (or modify it to fit within your existing stored proc) like this.
Update
Previous version would fail for input string longer than 4000 chars. This version takes care of the limitation:
create function dbo.SplitString
(
@str nvarchar(max),
@separator char(1)
)
returns table
AS
return (
with tokens(p, a, b) AS (
select
cast(1 as bigint),
cast(1 as bigint),
charindex(@separator, @str)
union all
select
p + 1,
b + 1,
charindex(@separator, @str, b + 1)
from tokens
where b > 0
)
select
p-1 ItemIndex,
substring(
@str,
a,
case when b > 0 then b-a ELSE LEN(@str) end)
AS s
from tokens
);
GO
Usage remains the same.
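The recursive CTE walks the string from one separator position to the next. The same walk, written as a Python generator, may make the roles of the a and b columns easier to follow. This is a sketch of the logic only, with 0-based positions instead of T-SQL's 1-based ones and a single-character separator assumed, as in the function's char(1) parameter.

```python
def split_string(text, separator):
    """Yield (item_index, token) pairs, zero-based like the ItemIndex column."""
    index = 0
    start = 0                       # column a: where the current token begins
    end = text.find(separator)      # column b: next separator, -1 if none
    while True:
        if end == -1:
            # No further separator: the rest of the string is the last token.
            yield index, text[start:]
            return
        yield index, text[start:end]
        index += 1
        start = end + 1
        end = text.find(separator, start)


tokens = list(split_string("Hello John Smith", " "))
print(dict(tokens)[1])
```

Looking up index 1 returns the second item, which corresponds to filtering the function's result on its index column in SQL.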
I have a database called foo and a database called bar. I have a table in foo called tblFoobar that I want to move (data and all) to database bar from database foo. What is the SQL statement to do this?
On SQL Server, and on the same database server? Use three-part naming.
This just moves the data. If you want to move the table definition (and other attributes such as permissions and indexes), you’ll have to do something else.
SQL Server Management Studio’s “Import Data” task (right-click on the DB name, then tasks) will do most of this for you. Run it from the database you want to copy the data into.
If the tables don’t exist it will create them for you, but you’ll probably have to recreate any indexes and such. If the tables do exist, it will append the new data by default but you can adjust that (edit mappings) so it will delete all existing data.
I use this all the time and it works fairly well.
This should work:
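One statement that fits this description is SELECT ... INTO, sketched here with the database and table names from the question; the dbo schema is an assumption.

```sql
-- Creates bar.dbo.tblFoobar from the result set and copies the rows in one step.
-- The statement fails if the destination table already exists.
SELECT *
INTO bar.dbo.tblFoobar
FROM foo.dbo.tblFoobar;
```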
It will not copy constraints, defaults or indexes. The table created will not have a clustered index.
Alternatively you could:
If your destination table exists and is empty.
I have a stored procedure that returns rows:
My actual procedure is a little more complicated, which is why a sproc is necessary.
Is it possible to select the output by calling this procedure?
Something like:
I need to use SELECT TOP X, ROW_NUMBER, and an additional WHERE clause to page my data, and I don’t really want to pass these values as parameters.
You can use a User-defined function or a view instead of a procedure.
A procedure can return multiple result sets, each with its own schema. It’s not suitable for use in a SELECT statement.
You can
… sql ….
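One commonly used option in SQL Server is INSERT ... EXEC, which captures a procedure's result set in a table so it can be queried afterwards. The sketch below assumes a hypothetical procedure dbo.MyProc that returns two columns (Id, Name); the table definition has to match whatever the real procedure returns.

```sql
-- Hypothetical procedure dbo.MyProc returning (Id int, Name varchar(50)).
DECLARE @Results TABLE (Id int, Name varchar(50));

-- Capture the procedure's single result set.
INSERT INTO @Results (Id, Name)
EXEC dbo.MyProc;

-- The captured rows can now be filtered and paged like any other table.
SELECT TOP (10) Id, Name
FROM @Results
WHERE Id > 100
ORDER BY Id;
```

This only works cleanly when the procedure returns one result set, and INSERT ... EXEC cannot be nested inside another INSERT ... EXEC.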
You should look at this excellent article by Erland Sommarskog:
It basically lists all available options for your scenario.
I need to change the primary key of a table to an identity column, and there’s already a number of rows in table.
I’ve got a script to clean up the IDs to ensure they’re sequential starting at 1, works fine on my test database.
What’s the SQL command to alter the column to have an identity property?
You can’t alter an existing column to add the identity property.
You have two options:
1. Create a new table with identity and drop the existing table
2. Create a new column with identity and drop the existing column
Approach 1. (New table) Here you can retain the existing data values on the newly created identity column.
CREATE TABLE dbo.Tmp_Names
(
Id int NOT NULL
IDENTITY(1, 1),
Name varchar(50) NULL
)
ON [PRIMARY]
go
SET IDENTITY_INSERT dbo.Tmp_Names ON
go
IF EXISTS ( SELECT *
FROM dbo.Names )
INSERT INTO dbo.Tmp_Names ( Id, Name )
SELECT Id,
Name
FROM dbo.Names WITH (TABLOCKX)
go
SET IDENTITY_INSERT dbo.Tmp_Names OFF
go
DROP TABLE dbo.Names
go
Exec sp_rename 'Tmp_Names', 'Names'
Approach 2. (New column) Here you can’t retain the existing data values on the newly created identity column; the identity column will hold a newly generated sequence of numbers.
Alter Table Names
Add Id_new Int Identity(1, 1)
Go
Alter Table Names Drop Column ID
Go
Exec sp_rename 'Names.Id_new', 'ID', 'Column'
See the following Microsoft SQL Server Forum post for more details:
In SQL 2005 and above, there’s a trick to solve this problem without changing the table’s data pages. This is important for large tables where touching every data page can take minutes or hours. The trick also works even if the identity column is a primary key, is part of a clustered or non-clustered index, or has other gotchas which can trip up the simpler “add/remove/rename column” solution.
Here’s the trick: you can use SQL Server’s ALTER TABLE…SWITCH statement to change the schema of a table without changing the data, meaning you can replace a table with an IDENTITY with an identical table schema, but without an IDENTITY column. The same trick works to add IDENTITY to an existing column.
Normally, ALTER TABLE…SWITCH is used to efficiently replace a full partition in a partitioned table with a new, empty partition. But it can also be used in non-partitioned tables.
I’ve used this trick to convert, in under 5 seconds, a column of a 2.5 billion row table from IDENTITY to a non-IDENTITY (in order to run a multi-hour query whose query plan worked better for non-IDENTITY columns), and then restored the IDENTITY setting, again in less than 5 seconds.
Here’s a code sample of how it works.
CREATE TABLE Test
(
id int identity(1,1),
somecolumn varchar(10)
);
INSERT INTO Test VALUES ('Hello');
INSERT INTO Test VALUES ('World');
-- copy the table. use same schema, but no identity
CREATE TABLE Test2
(
id int NOT NULL,
somecolumn varchar(10)
);
ALTER TABLE Test SWITCH TO Test2;
-- drop the original (now empty) table
DROP TABLE Test;
-- rename new table to old table's name
EXEC sp_rename 'Test2','Test';
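-- update the identity seed (applies only when the new table has an IDENTITY column,
-- i.e. when adding IDENTITY rather than removing it as in this sample)
DBCC CHECKIDENT('Test');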
-- see same records
SELECT * FROM Test;
This is obviously more involved than the solutions in other answers, but if your table is large this can be a real life-saver. There are some caveats:
There’s a good article on TechNet detailing the requirements above.
UPDATE - Eric Wu had a comment below that adds important info about this solution. Copying it here to make sure it gets more attention:
There’s another caveat here that is worth mentioning. Although the new table will happily receive data from the old table, and all the new rows will be inserted following an identity pattern, they will start at 1 and potentially break if the said column is a primary key. Consider running DBCC CHECKIDENT('<newTableName>') immediately after switching. See msdn.microsoft.com/en-us/library/ms176057.aspx for more info.
If the table is actively being extended with new rows (meaning you don’t have much, if any, downtime between adding IDENTITY and adding new rows), then instead of DBCC CHECKIDENT you’ll want to manually set the identity seed value in the new table schema to be larger than the largest existing ID in the table, e.g. IDENTITY (2435457, 1). You might be able to include both the ALTER TABLE...SWITCH and the DBCC CHECKIDENT in a transaction (or not; I haven’t tested this), but setting the seed value manually seems easier and safer.
Obviously, if no new rows are being added to the table (or they’re only added occasionally, like a daily ETL process), then this race condition won’t happen, so DBCC CHECKIDENT is fine.
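The reverse direction (adding IDENTITY to an existing column with a manually chosen seed) might look like the sketch below. The table name dbo.Orders, its columns, and the seed of 1000 are all assumptions for illustration; in practice the seed should be larger than the largest existing id.

```sql
-- Hypothetical starting point: a table whose id column has no IDENTITY property.
CREATE TABLE dbo.Orders
(
    id int NOT NULL,
    somecolumn varchar(10)
);
INSERT INTO dbo.Orders (id, somecolumn) VALUES (1, 'Hello');
INSERT INTO dbo.Orders (id, somecolumn) VALUES (2, 'World');
GO
-- Same schema, but id is an IDENTITY column seeded above the current maximum id.
CREATE TABLE dbo.Orders_Identity
(
    id int IDENTITY(1000, 1) NOT NULL,
    somecolumn varchar(10)
);
GO
-- Metadata-only move of all rows, then swap the names.
ALTER TABLE dbo.Orders SWITCH TO dbo.Orders_Identity;
DROP TABLE dbo.Orders;
EXEC sp_rename 'dbo.Orders_Identity', 'Orders';
GO
-- New rows should now take their id from the seed instead of colliding with 1 and 2.
INSERT INTO dbo.Orders (somecolumn) VALUES ('Again');
SELECT id, somecolumn FROM dbo.Orders ORDER BY id;
```

Choosing the seed in the table definition avoids the window between the switch and a later DBCC CHECKIDENT call.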
You cannot alter a column to be an IDENTITY column. What you’ll need to do is create a new column which is defined as an IDENTITY from the get-go, then drop the old column, and rename the new one to the old name.
ALTER TABLE (yourTable) ADD NewColumn INT IDENTITY(1,1)
ALTER TABLE (yourTable) DROP COLUMN OldColumnName
EXEC sp_rename 'yourTable.NewColumn', 'OldColumnName', 'COLUMN'
Marc
Given that indexing is so important as your data set increases in size, can someone explain how indexing works at a database-agnostic level?
For information on queries to index a field, check out How do I index a database column.
Why is it needed?
When data is stored on disk-based storage devices, it is stored as blocks of data. These blocks are accessed in their entirety, making them the atomic disk access operation. Disk blocks are structured in much the same way as linked lists; both contain a section for data, a pointer to the location of the next node (or block), and both need not be stored contiguously.
Due to the fact that a number of records can only be sorted on one field, we can state that searching on a field that isn’t sorted requires a linear search, which requires N/2 block accesses (on average), where N is the number of blocks that the table spans. If that field is a non-key field (i.e. doesn’t contain unique entries), then the entire tablespace must be searched at N block accesses.
Whereas with a sorted field, a binary search may be used, which has log2 N block accesses. Also, since the data is sorted, given a non-key field the rest of the table doesn’t need to be searched for duplicate values once a higher value is found. Thus the performance increase is substantial.
What is indexing?
Indexing is a way of sorting a number of records on multiple fields. Creating an index on a field in a table creates another data structure which holds the field value, and a pointer to the record it relates to. This index structure is then sorted, allowing Binary Searches to be performed on it.
The downside to indexing is that these indices require additional space on the disk. Since the indices are stored together in a table using the MyISAM engine, this file can quickly reach the size limits of the underlying file system if many fields within the same table are indexed.
How does it work?
Firstly, let’s outline a sample database table schema;
Field name       | Data type    | Size on disk
-----------------------------------------------
id (Primary key) | Unsigned INT | 4 bytes
firstName        | Char(50)     | 50 bytes
lastName         | Char(50)     | 50 bytes
emailAddress     | Char(100)    | 100 bytes
Note: char was used in place of varchar to allow for an accurate size on disk value. This sample database contains five million rows and is unindexed. The performance of several queries will now be analyzed. These are a query using the id (a sorted key field) and one using the firstName (a non-key unsorted field).
Example 1 - sorted vs unsorted fields
Given our sample database of r = 5,000,000 records of a fixed size, giving a record length of R = 204 bytes, stored in a table using the MyISAM engine with the default block size B = 1,024 bytes, the blocking factor of the table would be bfr = (B/R) = 1024/204 = 5 records per disk block. The total number of blocks required to hold the table is N = (r/bfr) = 5000000/5 = 1,000,000 blocks.
A linear search on the id field would require an average of N/2 = 500,000 block accesses to find a value, given that the id field is a key field. But since the id field is also sorted, a binary search can be conducted, requiring an average of log2 1000000 = 19.93, rounded up to 20 block accesses. Instantly we can see this is a drastic improvement.
Now the firstName field is neither sorted nor a key field, so a binary search is impossible, nor are the values unique, and thus the table will require searching to the end for an exact N = 1,000,000 block accesses. It is this situation that indexing aims to correct.
Given that an index record contains only the indexed field and a pointer to the original record, it stands to reason that it will be smaller than the multi-field record that it points to. So the index itself requires fewer disk blocks than the original table, which therefore requires fewer block accesses to iterate through. The schema for an index on the firstName field is outlined below;
Field name       | Data type | Size on disk
--------------------------------------------
firstName        | Char(50)  | 50 bytes
(record pointer) | Special   | 4 bytes
Note: Pointers in MySQL are 2, 3, 4 or 5 bytes in length depending on the size of the table.
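In MySQL, the sample table and the firstName index described above could be declared roughly as follows. The table name people and the index name are assumptions; the column definitions follow the sample schema.

```sql
-- Hypothetical table matching the sample schema (MySQL syntax, MyISAM engine).
CREATE TABLE people (
    id           INT UNSIGNED NOT NULL PRIMARY KEY,
    firstName    CHAR(50)  NOT NULL,
    lastName     CHAR(50)  NOT NULL,
    emailAddress CHAR(100) NOT NULL
) ENGINE = MyISAM;

-- Secondary index on firstName: each entry holds a firstName value plus a row pointer.
CREATE INDEX idx_people_firstName ON people (firstName);

-- A lookup like this can then search the index instead of scanning every table block.
SELECT id, firstName, lastName
FROM people
WHERE firstName = 'Thomas';
```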
Example 2 - indexing
Given our sample database of r = 5,000,000 records with an index record length of R = 54 bytes and using the default block size B = 1,024 bytes, the blocking factor of the index would be bfr = (B/R) = 1024/54 = 18 records per disk block. The total number of blocks required to hold the index is N = (r/bfr) = 5000000/18 = 277,778 blocks.
Now a search using the firstName field can utilize the index to increase performance. This allows for a binary search of the index with an average of log2 277778 = 18.08, rounded up to 19 block accesses to find the address of the actual record, which requires a further block access to read, bringing the total to 19 + 1 = 20 block accesses, a far cry from the 1,000,000 block accesses required to find a firstName match in the non-indexed table.
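The arithmetic from both examples can be re-checked with a single query. This is only a sketch using the numbers quoted above; FLOOR, CEILING and the single-argument LOG are used because they exist in both MySQL and SQL Server, and the ratio of two logarithms gives log2 whatever base LOG uses.

```sql
SELECT
    FLOOR(1024 / 204)                        AS table_bfr,          -- 5 records per block
    5000000 / FLOOR(1024 / 204)              AS table_blocks,       -- 1,000,000 blocks
    CEILING(LOG(1000000.0) / LOG(2.0))       AS sorted_id_accesses, -- 19.93 rounded up to 20
    FLOOR(1024 / 54)                         AS index_bfr,          -- 18 records per block
    CEILING(5000000.0 / FLOOR(1024 / 54))    AS index_blocks,       -- 277,778 blocks
    CEILING(LOG(277778.0) / LOG(2.0)) + 1    AS indexed_accesses;   -- 19 + 1 = 20
```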
When should it be used?
Given that creating an index requires additional disk space (277,778 blocks extra from the above example, a ~28% increase), and that too many indices can cause issues arising from the file system’s size limits, careful thought must be used to select the correct fields to index.
Since indices are only used to speed up the search for a matching field within the records, it stands to reason that indexing fields used only for output would simply be a waste of disk space and of processing time when doing an insert or delete operation, and thus should be avoided. Also, given the nature of a binary search, the cardinality or uniqueness of the data is important. Indexing on a field with a cardinality of 2 would split the data in half, whereas a cardinality of 1,000 would return approximately 5,000 records per value in the 5,000,000-row sample table. With such a low cardinality the effectiveness is reduced to that of a linear search, and the query optimizer will avoid using the index if the cardinality is less than 30% of the record number, effectively making the index a waste of space.
The first time I read this it was very helpful to me. Thank you.
Since then I have gained some insight into the downside of creating indexes: if you write into a table (UPDATE or INSERT) with one index, you actually have two write operations in the file system: one for the table data and another for the index data (plus the re-sorting of the index and, if it is clustered, the re-sorting of the table data). If the table and index are located on the same hard disk, this costs more time. Thus a table without an index (a heap) would allow for quicker write operations. (If you had two indexes you would end up with three write operations, and so on.)
However, defining two different locations on two different hard disks for index data and table data can decrease/eliminate the problem of increased cost of time. This requires definition of additional file groups with according files on the desired hard disks and definition of table/index location as desired.
Another problem with indexes is their fragmentation over time as data is inserted. REORGANIZE helps, but you must write routines to have it done.
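On SQL Server, such a maintenance routine could start from something like the sketch below. The table name dbo.YourTable is a placeholder, and any threshold for when to reorganize is left to the reader.

```sql
-- Inspect fragmentation for every index on the (placeholder) table.
SELECT i.name AS index_name, s.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.YourTable'), NULL, NULL, 'LIMITED') AS s
JOIN sys.indexes AS i
  ON i.object_id = s.object_id
 AND i.index_id = s.index_id;

-- Defragment the leaf level of all indexes on the table in place.
ALTER INDEX ALL ON dbo.YourTable REORGANIZE;
```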
In certain scenarios a heap is more helpful than a table with indexes, e.g. if you have lots of competing writes but only one nightly read outside business hours for reporting.
Also, a differentiation between clustered and non-clustered indexes is rather important.
Helped me:- What do Clustered and Non clustered index actually mean?
Classic example “Index in Books”
Consider a “Book” of 1000 pages, divided by 100 sections, each section with X pages.
Simple, huh?
Now, without an index page, to find a particular section that starts with letter “S”, you have no other option than scanning through the entire book. i.e: 1000 pages
But with an index page at the beginning, you are there. And more, to read any particular section that matters, you just need to look over the index page, again and again, every time. After finding the matching index you can efficiently jump to the section by skipping other sections.
But then, in addition to 1000 pages, you will need another ~10 pages to display the index page, so totally 1010 pages.
Thus, the index is a separate section that stores values of indexed column + pointer to the indexed row in a sorted order for efficient look-ups.
Things are simple in schools, isn’t it? :P
I am trying to accomplish the following sql statement but I want it to return all columns is this possible? Something like:
You’re looking for a group by:
Which can occasionally be written with a distinct on statement:
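The two forms referred to here could look like the following sketch. The table name my_table and the ordering column field2 are assumptions; field1 is the column whose values should be distinct.

```sql
-- GROUP BY form: accepted by MySQL when ONLY_FULL_GROUP_BY is disabled,
-- returning one arbitrary row per field1 value.
SELECT *
FROM my_table
GROUP BY field1;

-- DISTINCT ON form (PostgreSQL): keeps the first row per field1 according to ORDER BY.
SELECT DISTINCT ON (field1) *
FROM my_table
ORDER BY field1, field2;
```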
On most platforms, however, neither of the above will work because the behavior on the other columns is unspecified. (The first works in MySQL, if that’s what you’re using.)
You could fetch the distinct fields and stick to picking a single arbitrary row each time.
On some platforms (e.g. PostgreSQL, Oracle, T-SQL) this can be done directly using window functions:
select *
from (
select *,
row_number() over (partition by field1 order by field2) as row_number
from table
) as rows
where row_number = 1
On others (MySQL, SQLite), you’ll need to write subqueries that will make you join the entire table with itself (example), so not recommended.
From the phrasing of your question, I understand that you want to select the distinct values for a given field and, for each such value, to have all the other column values in the same row listed. Most DBMSs will not allow this with either DISTINCT or GROUP BY, because the result is not determined.
Think of it like this: if your field1 occurs more than once, what value of field2 will be listed (given that you have the same value for field1 in two rows but two distinct values of field2 in those two rows)?
You can, however, use aggregate functions (explicitly, for every field that you want to be shown) together with a GROUP BY instead of DISTINCT:
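A sketch of that approach, with hypothetical column names field2 to field4 and a placeholder table name; which aggregate suits each column depends on the data.

```sql
-- One row per field1; every other column must be wrapped in an aggregate.
SELECT field1,
       MAX(field2)   AS max_field2,
       COUNT(field3) AS count_field3,
       SUM(field4)   AS sum_field4
FROM my_table
GROUP BY field1;
```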
If I understood your problem correctly, it’s similar to one I just had. You want to be able to limit the usability of DISTINCT to a specified field, rather than applying it to all the data.
If you use GROUP BY without an aggregate function, whichever field you GROUP BY will be your DISTINCT field.
If you make your query:
It will show all your results based on a single instance of field1.
For example, if you have a table with name, address and city. A single person has multiple addresses recorded, but you just want a single address for the person, you can query as follows:
The result will be that only one instance of that name will appear with its address, and the other one will be omitted from the resulting table. Caution: if your fields have atomic values such as firstName and lastName, you want to group by both, because if two people have the same last name and you only group by lastName, one of those persons will be omitted from the results. You need to take those things into consideration. Hope this helps.
Possible Duplicate:
Difference between CTE and SubQuery?
I was trying to understand how to use the WITH clause and the purpose of the WITH clause.
All I understood was that the WITH clause is a replacement for normal sub-queries.
Can anyone explain this to me with a small example in detail ?
The SQL WITH clause was introduced by Oracle in the Oracle 9i release 2 database. The SQL WITH clause allows you to give a sub-query block a name (a process also called sub-query refactoring), which can be referenced in several places within the main SQL query. The name assigned to the sub-query is treated as though it was an inline view or table. The SQL WITH clause is basically a drop-in replacement to the normal sub-query.
Syntax For The SQL WITH Clause
The following is the syntax of the SQL WITH clause when using a single sub-query alias.
WITH <alias_name> AS (sql_subquery_statement)
SELECT column_list FROM <alias_name>[,table_name]
[WHERE <join_condition>]
When using multiple sub-query aliases, the syntax is as follows.
WITH <alias_name_A> AS (sql_subquery_statement),
<alias_name_B> AS(sql_subquery_statement_from_alias_name_A
or sql_subquery_statement )
SELECT <column_list>
FROM <alias_name_A>, <alias_name_B> [,table_names]
[WHERE <join_condition>]
In the syntax documentation above, each occurrence of alias_name is a meaningful name you would give to the sub-query after the AS clause. Each sub-query should be separated with a comma. The rest of the queries follow the standard formats for simple and complex SQL SELECT queries.
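A concrete sketch of the multiple-alias form, where the second sub-query reads from the first. The Employees table and its DepartmentId and Salary columns are assumptions made for illustration.

```sql
WITH dept_totals AS (
    -- First named sub-query: total salary per department.
    SELECT DepartmentId, SUM(Salary) AS TotalSalary
    FROM Employees
    GROUP BY DepartmentId
),
avg_total AS (
    -- Second named sub-query, built on top of the first one.
    SELECT AVG(TotalSalary) AS AvgTotal
    FROM dept_totals
)
-- Main query: departments whose total is above the average total.
SELECT d.DepartmentId, d.TotalSalary
FROM dept_totals d, avg_total a
WHERE d.TotalSalary > a.AvgTotal;
```

Without WITH, the dept_totals sub-query would have to be written out twice, once in the FROM clause and once inside the average calculation.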
For more information: http://www.brighthub.com/internet/web-development/articles/91893.aspx
This has been fully answered here.
See Oracle’s docs on SELECT to see how subquery factoring works, and Mark’s example:
WITH employee AS (SELECT * FROM Employees)
SELECT * FROM employee WHERE ID < 20
UNION ALL
SELECT * FROM employee WHERE Sex = 'M'
I was given a MySQL database file that I need to restore as a database on my Windows Server 2008 machine.
I tried using MySQL Administrator, but I got the following error:
The selected file was generated by mysqldump and cannot be restored by this application.
How do I get this working?
It should be as simple as running this:
If the dump is of a single database you may have to add a line at the top of the file:
If it was a dump of many databases, the use statements are already in there.
To run these commands, open up a command prompt (in Windows) and cd to the directory where the mysql.exe executable is (you may have to look around a bit for it; it’ll depend on how you installed MySQL, i.e. standalone or as part of a package like WAMP). Once you’re in that directory, you should be able to just type the command as I have it above.
If the database you want to restore doesn’t already exist, you need to create it first.
On the command-line, if you’re in the same directory that contains the dumped file, use these commands (with appropriate substitutions):
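Inside the mysql command-line client, the steps could look like this sketch. The database name mydb and the file name db_backup.dump are placeholders; source is a mysql client command rather than an SQL statement.

```sql
-- Run at the mysql> prompt, after starting the client from the directory holding the dump.
CREATE DATABASE mydb;
USE mydb;
source db_backup.dump
```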
You simply need to run this:
If the dump contains multiple databases you should omit the database name:
To run these commands, open up a command prompt (in Windows) and cd to the directory where the mysql.exe executable is (you may have to look around a bit for it; it’ll depend on how you installed MySQL, i.e. standalone or as part of a package like WAMP). Once you’re in that directory, you should be able to just type the command.
I want to use ROW_NUMBER() to get…
1. The max(ROW_NUMBER()) –> or I guess this would also be the count of all rows.
I tried doing:
but it didn’t seem to work…
2. The ROW_NUMBER() using a given piece of information, i.e. if I have a name and I want to know what row the name came from.
I assume it would be something similar to what I tried for #1
but this didn’t work either…
Any ideas?
For the first question, why not just use?
to get the count.
And for the second question, the primary key of the row is what should be used to identify a particular row. Don’t try and use the row number for that.
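The count suggested for the first question could be written as below; Users is a placeholder table name.

```sql
-- Total number of rows, without numbering them first.
SELECT COUNT(*) AS total_rows
FROM Users;
```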
If you returned Row_Number() in your main query,
Then when you want to go 5 rows back then you can take the current row number and use the following query to determine the row with currentrow -5
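A sketch of that idea in T-SQL. The Users table, the UserId column, and the sample value for @CurrentRow are assumptions.

```sql
-- @CurrentRow holds the row number already returned by the main query.
DECLARE @CurrentRow int;
SET @CurrentRow = 12;

-- Number the rows in a derived table, then pick the one five positions back.
SELECT us.UserId
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY UserId) AS RowNum, UserId
    FROM Users
) AS us
WHERE us.RowNum = @CurrentRow - 5;
```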
Though I agree with others that you could use count() to get the total number of rows, here is how you can use row_number():
1. To get the total number of rows:

```sql
with temp as (
select row_number() over (order by id) as rownum
from table_name
)
select max(rownum) from temp
```

2. To get the row numbers where name is Matt:

```sql
with temp as (
select name, row_number() over (order by id) as rownum
from table_name
)
select rownum from temp where name like 'Matt'
```

You can further use `min(rownum)` or `max(rownum)` to get the first or last row for Matt respectively.
These were very simple implementations of `row_number()`. You can use it for more complex grouping. Check out my response on [Advanced grouping without using a sub query](https://stackoverflow.com/questions/968305/advanced-grouping-without-using-a-sub-query/968983#968983).
### 68: SQL - Rounding off to 2 decimal places (score [791521](https://stackoverflow.com/q/10380197.html) in 2012)
#### Question
I need to convert minutes to hours, rounded off to 2 decimal places. I also need to display only up to 2 numbers after the decimal point. So if I have minutes as 650, then hours should be 10.83.
Here's what I have so far:
```sql
Select round(Minutes/60.0,2) from ....
```
But in this case, if my minutes is, say, 630, hours is 10.5000000. But I want it as 10.50 only (after rounding). How do I achieve this?
Could you not cast your result as numeric(x,2), where x <= 38?
Returns
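A sketch of the cast suggested above, using the 630 minutes from the question; the precision of 36 is an arbitrary choice within the limit of 38.

```sql
SELECT ROUND(630 / 60.0, 2)                         AS rounded_only,  -- 10.500000
       CAST(ROUND(630 / 60.0, 2) AS numeric(36, 2)) AS two_decimals;  -- 10.50
```

ROUND changes the value but keeps the scale of the expression, which is why the trailing zeros remain until the result is cast to a type with scale 2.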
As of SQL Server 2012, you can use the built-in format function:
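A minimal sketch using FORMAT with the standard numeric format string 'N2'; the value 630 is taken from the question.

```sql
-- FORMAT returns nvarchar, so this suits display rather than further arithmetic.
-- The result is 10.50 under an en-US culture; other cultures may use a different separator.
SELECT FORMAT(630 / 60.0, 'N2') AS hours_text;
```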
(just for further readings…)
you can use
I am trying to sort by multiple columns in SQL, and in different directions. column1 would be sorted descending, and column2 ascending.
How can I do this?
This sorts everything by column1 (descending) first, and then by column2 (ascending, which is the default) whenever the column1 fields for two or more rows are equal.
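The ordering described above can be written as follows; my_table is a placeholder table name.

```sql
-- DESC applies only to column1; ASC on column2 is optional because it is the default.
SELECT *
FROM my_table
ORDER BY column1 DESC, column2 ASC;
```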
The other answers lack a concrete example, so here it goes:
Given the following People table:
FirstName | LastName | YearOfBirth
----------------------------------------
Thomas | Alva Edison | 1847
Benjamin | Franklin | 1706
Thomas | More | 1478
Thomas | Jefferson | 1826
If you execute the query below:
The result set will look like this:
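A sketch of a query of this kind, assuming it sorts the People table by FirstName descending and then YearOfBirth ascending. The ordering in the comment is worked out from the four sample rows above.

```sql
SELECT FirstName, LastName, YearOfBirth
FROM People
ORDER BY FirstName DESC, YearOfBirth ASC;

-- Expected order for the four sample rows:
-- Thomas   | More        | 1478
-- Thomas   | Jefferson   | 1826
-- Thomas   | Alva Edison | 1847
-- Benjamin | Franklin    | 1706
```

The three rows named Thomas tie on the first sort key, so YearOfBirth decides their order.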
I’m looking for an efficient way to convert rows to columns in SQL server, I heard that PIVOT is not very fast, and I need to deal with lot of records.
This is my example:
-------------------------------
| Id | Value | ColumnName |
-------------------------------
| 1 | John | FirstName |
| 2 | 2.4 | Amount |
| 3 | ZH1E4A | PostalCode |
| 4 | Fork | LastName |
| 5 | 857685 | AccountNumber |
-------------------------------
This is my result:
---------------------------------------------------------------------
| FirstName |Amount| PostalCode | LastName | AccountNumber |
---------------------------------------------------------------------
| John | 2.4 | ZH1E4A | Fork | 857685 |
---------------------------------------------------------------------
How can I build the result?
There are several ways that you can transform data from multiple rows into columns.
PIVOT
In SQL Server you can use the PIVOT function to transform the data from rows to columns:
select Firstname, Amount, PostalCode, LastName, AccountNumber
from
(
select value, columnname
from yourtable
) d
pivot
(
max(value)
for columnname in (Firstname, Amount, PostalCode, LastName, AccountNumber)
) piv;
PIVOT with an unknown number of columnnames
If you have an unknown number of columnnames that you want to transpose, then you can use dynamic SQL:
DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX)
select @cols = STUFF((SELECT ',' + QUOTENAME(ColumnName)
from yourtable
group by ColumnName, id
order by id
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query = N'SELECT ' + @cols + N' from
(
select value, ColumnName
from yourtable
) x
pivot
(
max(value)
for ColumnName in (' + @cols + N')
) p '
exec sp_executesql @query;
If you do not want to use the PIVOT function, then you can use an aggregate function with a CASE expression:
select
max(case when columnname = 'FirstName' then value end) Firstname,
max(case when columnname = 'Amount' then value end) Amount,
max(case when columnname = 'PostalCode' then value end) PostalCode,
max(case when columnname = 'LastName' then value end) LastName,
max(case when columnname = 'AccountNumber' then value end) AccountNumber
from yourtable
This could also be completed using multiple joins, but you will need some column to associate each of the rows which you do not have in your sample data. But the basic syntax would be:
select fn.value as FirstName,
a.value as Amount,
pc.value as PostalCode,
ln.value as LastName,
an.value as AccountNumber
from yourtable fn
left join yourtable a
on fn.somecol = a.somecol
and a.columnname = 'Amount'
left join yourtable pc
on fn.somecol = pc.somecol
and pc.columnname = 'PostalCode'
left join yourtable ln
on fn.somecol = ln.somecol
and ln.columnname = 'LastName'
left join yourtable an
on fn.somecol = an.somecol
and an.columnname = 'AccountNumber'
where fn.columnname = 'Firstname'
As pivoting data is still a hot topic, I decided to add something from me. This is a method rather than just a single script, but it gives you many more possibilities. First of all, there are 3 scripts you need to deploy:
1) User-defined TABLE type [ColumnActionList] -> holds data as a parameter
2) SP [proc_PivotPrepare] -> prepares our data
3) SP [proc_PivotExecute] -> executes the script
CREATE TYPE [dbo].[ColumnActionList] AS TABLE(
[ID] [smallint] NOT NULL,
[ColumnName] [nvarchar](128) NOT NULL,
[Action] [nchar](1) NOT NULL
);
GO
CREATE PROCEDURE [dbo].[proc_PivotPrepare]
(
@DB_Name nvarchar(128),
@TableName nvarchar(128)
)
AS
----------------------------------------------------------------------------------------------------
-----| Author: Bartosz
----------------------------------------------------------------------------------------------------
SELECT @DB_Name = ISNULL(@DB_Name,db_name())
DECLARE @SQL_Code nvarchar(max)
DECLARE @MyTab TABLE (ID smallint identity(1,1), [Column_Name] nvarchar(128), [Type] nchar(1), [Set Action SQL] nvarchar(max));
SELECT @SQL_Code = 'SELECT [<| SQL_Code |>] = '' '' '
+ 'UNION ALL '
+ 'SELECT ''----------------------------------------------------------------------------------------------------'' '
+ 'UNION ALL '
+ 'SELECT ''-----| Declare user defined type [ID] / [ColumnName] / [PivotAction] '' '
+ 'UNION ALL '
+ 'SELECT ''----------------------------------------------------------------------------------------------------'' '
+ 'UNION ALL '
+ 'SELECT ''DECLARE @ColumnListWithActions ColumnActionList;'''
+ 'UNION ALL '
+ 'SELECT ''----------------------------------------------------------------------------------------------------'' '
+ 'UNION ALL '
+ 'SELECT ''-----| Set [PivotAction] (''''S'''' as default) to select dimentions and values '' '
+ 'UNION ALL '
+ 'SELECT ''-----|'''
+ 'UNION ALL '
+ 'SELECT ''-----| ''''S'''' = Stable column || ''''D'''' = Dimention column || ''''V'''' = Value column '' '
+ 'UNION ALL '
+ 'SELECT ''----------------------------------------------------------------------------------------------------'' '
+ 'UNION ALL '
+ 'SELECT ''INSERT INTO @ColumnListWithActions VALUES ('' + CAST( ROW_NUMBER() OVER (ORDER BY [NAME]) as nvarchar(10)) + '', '' + '''''''' + [NAME] + ''''''''+ '', ''''S'''');'''
+ 'FROM [' + @DB_Name + '].sys.columns '
+ 'WHERE object_id = object_id(''[' + @DB_Name + ']..[' + @TableName + ']'') '
+ 'UNION ALL '
+ 'SELECT ''----------------------------------------------------------------------------------------------------'' '
+ 'UNION ALL '
+ 'SELECT ''-----| Execute proc_PivotExecute with parameters: columns and dimentions and main table name'' '
+ 'UNION ALL '
+ 'SELECT ''----------------------------------------------------------------------------------------------------'' '
+ 'UNION ALL '
+ 'SELECT ''EXEC [dbo].[proc_PivotExecute] @ColumnListWithActions, ' + '''''' + @TableName + '''''' + ';'''
+ 'UNION ALL '
+ 'SELECT ''----------------------------------------------------------------------------------------------------'' '
EXECUTE SP_EXECUTESQL @SQL_Code;
GO
CREATE PROCEDURE [dbo].[proc_PivotExecute]
(
@ColumnListWithActions ColumnActionList ReadOnly
,@TableName nvarchar(128)
)
AS
--#######################################################################################################################
--###| Author: Bartosz
--#######################################################################################################################
--#######################################################################################################################
--###| Step 1 - Select our user-defined-table-variable into temp table
--#######################################################################################################################
IF OBJECT_ID('tempdb.dbo.#ColumnListWithActions', 'U') IS NOT NULL DROP TABLE #ColumnListWithActions;
SELECT * INTO #ColumnListWithActions FROM @ColumnListWithActions;
--#######################################################################################################################
--###| Step 2 - Preparing lists of column groups as strings:
--#######################################################################################################################
DECLARE @ColumnName nvarchar(128)
DECLARE @Destiny nchar(1)
DECLARE @ListOfColumns_Stable nvarchar(max)
DECLARE @ListOfColumns_Dimension nvarchar(max)
DECLARE @ListOfColumns_Variable nvarchar(max)
--############################
--###| Cursor for List of Stable Columns
--############################
DECLARE ColumnListStringCreator_S CURSOR FOR
SELECT [ColumnName]
FROM #ColumnListWithActions
WHERE [Action] = 'S'
OPEN ColumnListStringCreator_S;
FETCH NEXT FROM ColumnListStringCreator_S
INTO @ColumnName
WHILE @@FETCH_STATUS = 0
BEGIN
SELECT @ListOfColumns_Stable = ISNULL(@ListOfColumns_Stable, '') + ' [' + @ColumnName + '] ,';
FETCH NEXT FROM ColumnListStringCreator_S INTO @ColumnName
END
CLOSE ColumnListStringCreator_S;
DEALLOCATE ColumnListStringCreator_S;
--############################
--###| Cursor for List of Dimension Columns
--############################
DECLARE ColumnListStringCreator_D CURSOR FOR
SELECT [ColumnName]
FROM #ColumnListWithActions
WHERE [Action] = 'D'
OPEN ColumnListStringCreator_D;
FETCH NEXT FROM ColumnListStringCreator_D
INTO @ColumnName
WHILE @@FETCH_STATUS = 0
BEGIN
SELECT @ListOfColumns_Dimension = ISNULL(@ListOfColumns_Dimension, '') + ' [' + @ColumnName + '] ,';
FETCH NEXT FROM ColumnListStringCreator_D INTO @ColumnName
END
CLOSE ColumnListStringCreator_D;
DEALLOCATE ColumnListStringCreator_D;
--############################
--###| Cursor for List of Variable Columns
--############################
DECLARE ColumnListStringCreator_V CURSOR FOR
SELECT [ColumnName]
FROM #ColumnListWithActions
WHERE [Action] = 'V'
OPEN ColumnListStringCreator_V;
FETCH NEXT FROM ColumnListStringCreator_V
INTO @ColumnName
WHILE @@FETCH_STATUS = 0
BEGIN
SELECT @ListOfColumns_Variable = ISNULL(@ListOfColumns_Variable, '') + ' [' + @ColumnName + '] ,';
FETCH NEXT FROM ColumnListStringCreator_V INTO @ColumnName
END
CLOSE ColumnListStringCreator_V;
DEALLOCATE ColumnListStringCreator_V;
SELECT @ListOfColumns_Variable = LEFT(@ListOfColumns_Variable, LEN(@ListOfColumns_Variable) - 1);
SELECT @ListOfColumns_Dimension = LEFT(@ListOfColumns_Dimension, LEN(@ListOfColumns_Dimension) - 1);
SELECT @ListOfColumns_Stable = LEFT(@ListOfColumns_Stable, LEN(@ListOfColumns_Stable) - 1);
--#######################################################################################################################
--###| Step 3 - Preparing table with all possible connections between Dimension columns excluding NULLs
--#######################################################################################################################
DECLARE @DIM_TAB TABLE ([DIM_ID] smallint, [ColumnName] nvarchar(128))
INSERT INTO @DIM_TAB
SELECT [DIM_ID] = ROW_NUMBER() OVER(ORDER BY [ColumnName]), [ColumnName] FROM #ColumnListWithActions WHERE [Action] = 'D';
DECLARE @DIM_ID smallint;
SELECT @DIM_ID = 1;
DECLARE @SQL_Dimentions nvarchar(max);
IF OBJECT_ID('tempdb.dbo.##ALL_Dimentions', 'U') IS NOT NULL DROP TABLE ##ALL_Dimentions;
SELECT @SQL_Dimentions = 'SELECT ID = ROW_NUMBER() OVER (ORDER BY ' + @ListOfColumns_Dimension + '), ' + @ListOfColumns_Dimension
+ ' INTO ##ALL_Dimentions '
+ ' FROM (SELECT DISTINCT' + @ListOfColumns_Dimension + ' FROM ' + @TableName
+ ' WHERE ' + (SELECT [ColumnName] FROM @DIM_TAB WHERE [DIM_ID] = @DIM_ID) + ' IS NOT NULL ';
SELECT @DIM_ID = @DIM_ID + 1;
WHILE @DIM_ID <= (SELECT MAX([DIM_ID]) FROM @DIM_TAB)
BEGIN
SELECT @SQL_Dimentions = @SQL_Dimentions + 'AND ' + (SELECT [ColumnName] FROM @DIM_TAB WHERE [DIM_ID] = @DIM_ID) + ' IS NOT NULL ';
SELECT @DIM_ID = @DIM_ID + 1;
END
SELECT @SQL_Dimentions = @SQL_Dimentions + ' )x';
EXECUTE SP_EXECUTESQL @SQL_Dimentions;
--#######################################################################################################################
--###| Step 4 - Preparing table with all possible connections between Stable columns excluding NULLs
--#######################################################################################################################
DECLARE @StabPos_TAB TABLE ([StabPos_ID] smallint, [ColumnName] nvarchar(128))
INSERT INTO @StabPos_TAB
SELECT [StabPos_ID] = ROW_NUMBER() OVER(ORDER BY [ColumnName]), [ColumnName] FROM #ColumnListWithActions WHERE [Action] = 'S';
DECLARE @StabPos_ID smallint;
SELECT @StabPos_ID = 1;
DECLARE @SQL_MainStableColumnTable nvarchar(max);
IF OBJECT_ID('tempdb.dbo.##ALL_StableColumns', 'U') IS NOT NULL DROP TABLE ##ALL_StableColumns;
SELECT @SQL_MainStableColumnTable = 'SELECT ID = ROW_NUMBER() OVER (ORDER BY ' + @ListOfColumns_Stable + '), ' + @ListOfColumns_Stable
+ ' INTO ##ALL_StableColumns '
+ ' FROM (SELECT DISTINCT' + @ListOfColumns_Stable + ' FROM ' + @TableName
+ ' WHERE ' + (SELECT [ColumnName] FROM @StabPos_TAB WHERE [StabPos_ID] = @StabPos_ID) + ' IS NOT NULL ';
SELECT @StabPos_ID = @StabPos_ID + 1;
WHILE @StabPos_ID <= (SELECT MAX([StabPos_ID]) FROM @StabPos_TAB)
BEGIN
SELECT @SQL_MainStableColumnTable = @SQL_MainStableColumnTable + 'AND ' + (SELECT [ColumnName] FROM @StabPos_TAB WHERE [StabPos_ID] = @StabPos_ID) + ' IS NOT NULL ';
SELECT @StabPos_ID = @StabPos_ID + 1;
END
SELECT @SQL_MainStableColumnTable = @SQL_MainStableColumnTable + ' )x';
EXECUTE SP_EXECUTESQL @SQL_MainStableColumnTable;
--#######################################################################################################################
--###| Step 5 - Preparing table with all options ID
--#######################################################################################################################
DECLARE @FULL_SQL_1 NVARCHAR(MAX)
SELECT @FULL_SQL_1 = ''
DECLARE @i smallint
IF OBJECT_ID('tempdb.dbo.##FinalTab', 'U') IS NOT NULL DROP TABLE ##FinalTab;
SELECT @FULL_SQL_1 = 'SELECT t.*, dim.[ID] '
+ ' INTO ##FinalTab '
+ 'FROM ' + @TableName + ' t '
+ 'JOIN ##ALL_Dimentions dim '
+ 'ON t.' + (SELECT [ColumnName] FROM @DIM_TAB WHERE [DIM_ID] = 1) + ' = dim.' + (SELECT [ColumnName] FROM @DIM_TAB WHERE [DIM_ID] = 1);
SELECT @i = 2
WHILE @i <= (SELECT MAX([DIM_ID]) FROM @DIM_TAB)
BEGIN
SELECT @FULL_SQL_1 = @FULL_SQL_1 + ' AND t.' + (SELECT [ColumnName] FROM @DIM_TAB WHERE [DIM_ID] = @i) + ' = dim.' + (SELECT [ColumnName] FROM @DIM_TAB WHERE [DIM_ID] = @i)
SELECT @i = @i +1
END
EXECUTE SP_EXECUTESQL @FULL_SQL_1
--#######################################################################################################################
--###| Step 6 - Selecting final data
--#######################################################################################################################
DECLARE @STAB_TAB TABLE ([STAB_ID] smallint, [ColumnName] nvarchar(128))
INSERT INTO @STAB_TAB
SELECT [STAB_ID] = ROW_NUMBER() OVER(ORDER BY [ColumnName]), [ColumnName]
FROM #ColumnListWithActions WHERE [Action] = 'S';
DECLARE @VAR_TAB TABLE ([VAR_ID] smallint, [ColumnName] nvarchar(128))
INSERT INTO @VAR_TAB
SELECT [VAR_ID] = ROW_NUMBER() OVER(ORDER BY [ColumnName]), [ColumnName]
FROM #ColumnListWithActions WHERE [Action] = 'V';
DECLARE @y smallint;
DECLARE @x smallint;
DECLARE @z smallint;
DECLARE @FinalCode nvarchar(max)
SELECT @FinalCode = ' SELECT ID1.*'
SELECT @y = 1
WHILE @y <= (SELECT MAX([ID]) FROM ##FinalTab)
BEGIN
SELECT @z = 1
WHILE @z <= (SELECT MAX([VAR_ID]) FROM @VAR_TAB)
BEGIN
SELECT @FinalCode = @FinalCode + ', [ID' + CAST((@y) as varchar(10)) + '.' + (SELECT [ColumnName] FROM @VAR_TAB WHERE [VAR_ID] = @z) + '] = ID' + CAST((@y + 1) as varchar(10)) + '.' + (SELECT [ColumnName] FROM @VAR_TAB WHERE [VAR_ID] = @z)
SELECT @z = @z + 1
END
SELECT @y = @y + 1
END
SELECT @FinalCode = @FinalCode +
' FROM ( SELECT * FROM ##ALL_StableColumns)ID1';
SELECT @y = 1
WHILE @y <= (SELECT MAX([ID]) FROM ##FinalTab)
BEGIN
SELECT @x = 1
SELECT @FinalCode = @FinalCode
+ ' LEFT JOIN (SELECT ' + @ListOfColumns_Stable + ' , ' + @ListOfColumns_Variable
+ ' FROM ##FinalTab WHERE [ID] = '
+ CAST(@y as varchar(10)) + ' )ID' + CAST((@y + 1) as varchar(10))
+ ' ON 1 = 1'
WHILE @x <= (SELECT MAX([STAB_ID]) FROM @STAB_TAB)
BEGIN
SELECT @FinalCode = @FinalCode + ' AND ID1.' + (SELECT [ColumnName] FROM @STAB_TAB WHERE [STAB_ID] = @x) + ' = ID' + CAST((@y+1) as varchar(10)) + '.' + (SELECT [ColumnName] FROM @STAB_TAB WHERE [STAB_ID] = @x)
SELECT @x = @x +1
END
SELECT @y = @y + 1
END
SELECT * FROM ##ALL_Dimentions;
EXECUTE SP_EXECUTESQL @FinalCode;
--#######################################################################################################################
Executing the first query (passing the source DB and table name) gives you a pre-built execution query for the second SP. All you have to do is assign each column from your source to one of three roles: Stable, Value (will be used to concentrate values based on that) or Dim (the column you want to pivot by).
Names and datatypes will be defined automatically!
I can’t recommend it for any production environment, but it does the job for ad hoc BI requests.
What is the best way to get the IDENTITY of an inserted row?
I know about @@IDENTITY, IDENT_CURRENT and SCOPE_IDENTITY, but don’t understand the pros and cons attached to each.
Can someone please explain the differences and when I should be using each?
@@IDENTITY returns the last identity value generated for any table in the current session, across all scopes. You need to be careful here, since it’s across scopes. You could get a value from a trigger, instead of your current statement.
SCOPE_IDENTITY() returns the last identity value generated for any table in the current session and the current scope. Generally what you want to use.
IDENT_CURRENT('tableName') returns the last identity value generated for a specific table in any session and any scope. This lets you specify which table you want the value from, in case the two above aren’t quite what you need (very rare). Also, as @Guy Starbuck mentioned, “You could use this if you want to get the current IDENTITY value for a table that you have not inserted a record into.”
The OUTPUT clause of the INSERT statement will let you access every row that was inserted via that statement. Since it’s scoped to the specific statement, it’s more straightforward than the other functions above. However, it’s a little more verbose (you’ll need to insert into a table variable/temp table and then query that) and it gives results even in an error scenario where the statement is rolled back. That said, if your query uses a parallel execution plan, this is the only guaranteed method for getting the identity (short of turning off parallelism). However, it is executed before triggers and cannot be used to return trigger-generated values.
I believe the safest and most accurate method of retrieving the inserted id would be using the output clause.
For example (taken from an MSDN article):
USE AdventureWorks2008R2;
GO
DECLARE @MyTableVar table( NewScrapReasonID smallint,
Name varchar(50),
ModifiedDate datetime);
INSERT Production.ScrapReason
OUTPUT INSERTED.ScrapReasonID, INSERTED.Name, INSERTED.ModifiedDate
INTO @MyTableVar
VALUES (N'Operator error', GETDATE());
--Display the result set of the table variable.
SELECT NewScrapReasonID, Name, ModifiedDate FROM @MyTableVar;
--Display the result set of the table.
SELECT ScrapReasonID, Name, ModifiedDate
FROM Production.ScrapReason;
GO
I’m saying the same thing as the other guys, so everyone’s correct; I’m just trying to make it clearer.
@@IDENTITY returns the id of the last thing that was inserted by your client’s connection to the database.
Most of the time this works fine, but sometimes a trigger will go and insert a new row that you don’t know about, and you’ll get the ID from this new row instead of the one you want.
SCOPE_IDENTITY() solves this problem. It returns the id of the last thing that you inserted in the SQL code you sent to the database. If triggers go and create extra rows, they won’t cause the wrong value to get returned. Hooray!
IDENT_CURRENT returns the last ID that was inserted into the given table by anyone. If some other app happens to insert another row at an unfortunate time, you’ll get the ID of that row instead of your own.
If you want to play it safe, always use SCOPE_IDENTITY(). If you stick with @@IDENTITY and someone decides to add a trigger later on, all your code will break.
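The difference between the three functions is easiest to see with a trigger in play. The following is only a sketch: the tables dbo.DemoOrders and dbo.DemoOrderAudit and the trigger are hypothetical names made up for illustration, and the values in the comments assume freshly created tables.

```sql
-- Hypothetical demo tables; the audit table starts its identity at 1000
-- so that the two identity values are easy to tell apart.
CREATE TABLE dbo.DemoOrders     (OrderID int IDENTITY(1,1)    PRIMARY KEY, Item varchar(50));
CREATE TABLE dbo.DemoOrderAudit (AuditID int IDENTITY(1000,1) PRIMARY KEY, OrderID int);
GO
-- The trigger inserts into a second identity table, in its own scope.
CREATE TRIGGER trg_DemoOrders_Insert ON dbo.DemoOrders AFTER INSERT
AS
    INSERT INTO dbo.DemoOrderAudit (OrderID)
    SELECT OrderID FROM inserted;
GO
-- The INSERT and the SELECT must run in the same batch (same scope).
INSERT INTO dbo.DemoOrders (Item) VALUES ('Widget');

SELECT SCOPE_IDENTITY()                AS ScopeIdentity,   -- 1: generated by our own INSERT
       @@IDENTITY                      AS SessionIdentity, -- 1000: generated by the trigger
       IDENT_CURRENT('dbo.DemoOrders') AS TableIdentity;   -- 1: last value for this table, any session
```

Here @@IDENTITY reports the audit row created by the trigger, which is exactly the surprise described above, while SCOPE_IDENTITY() still reports the row we inserted ourselves.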
I am working with a query which contains “CASE” statement within “WHERE” clause. But SQL Server 2008 is giving some errors while executing it. Can anyone please help me with the correct query? Here is the query:
SELECT
tl.storenum 'Store #',
co.ccnum 'FuelFirst Card #',
co.dtentered 'Date Entered',
CASE st.reasonid
WHEN 1 THEN 'Active'
WHEN 2 THEN 'Not Active'
WHEN 0 THEN st.ccstatustypename
ELSE 'Unknown'
END 'Status',
CASE st.ccstatustypename
WHEN 'Active' THEN ' '
WHEN 'Not Active' THEN ' '
ELSE st.ccstatustypename
END 'Reason',
UPPER(REPLACE(REPLACE(co.personentered,'RT\\\\',''),'RACETRAC\\\\','')) 'Person Entered',
co.comments 'Comments or Notes'
FROM
comments co
INNER JOIN cards cc ON co.ccnum=cc.ccnum
INNER JOIN customerinfo ci ON cc.customerinfoid=ci.customerinfoid
INNER JOIN ccstatustype st ON st.ccstatustypeid=cc.ccstatustypeid
INNER JOIN customerstatus cs ON cs.customerstatuscd=ci.customerstatuscd
INNER JOIN transactionlog tl ON tl.transactionlogid=co.transactionlogid
LEFT JOIN stores s ON s.StoreNum = tl.StoreNum
WHERE
CASE LEN('TestPerson')
WHEN 0 THEN co.personentered = co.personentered
ELSE co.personentered LIKE '%TestPerson'
END
AND cc.ccnum = CASE LEN('TestFFNum')
WHEN 0 THEN cc.ccnum
ELSE 'TestFFNum'
END
AND CASE LEN('2011-01-09 11:56:29.327')
WHEN 0 THEN co.DTEntered = co.DTEntered
ELSE
CASE LEN('2012-01-09 11:56:29.327')
WHEN 0 THEN co.DTEntered >= '2011-01-09 11:56:29.327'
ELSE co.DTEntered BETWEEN '2011-01-09 11:56:29.327' AND '2012-01-09 11:56:29.327'
END
END
AND tl.storenum < 699
ORDER BY tl.StoreNum
First off, the CASE statement must be part of the expression, not the expression itself.
In other words, you can have:
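A minimal sketch of the valid form, in which CASE returns a value and that value is then compared with the column. The @DateFrom variable is a hypothetical stand-in for the literals in the question, and co.DTEntered is assumed to be a datetime column:

```sql
-- @DateFrom is a hypothetical parameter; an empty string means "no filter".
DECLARE @DateFrom varchar(30) = '2011-01-09 11:56:29.327';

SELECT co.ccnum, co.DTEntered
FROM comments co
WHERE co.DTEntered >= CASE
                          -- no filter requested: compare the column with itself
                          WHEN LEN(@DateFrom) = 0 THEN co.DTEntered
                          -- otherwise compare with the supplied date
                          ELSE CAST(@DateFrom AS datetime)
                      END;
```

Each THEN/ELSE branch yields a value, not a comparison, which is what makes this legal in T-SQL.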
But it won’t work the way you have written it, e.g.:
WHERE
CASE LEN('TestPerson')
WHEN 0 THEN co.personentered = co.personentered
ELSE co.personentered LIKE '%TestPerson'
END
You may have better luck using combined OR statements like this:
WHERE (
(LEN('TestPerson') = 0
AND co.personentered = co.personentered
)
OR
(LEN('TestPerson') <> 0
AND co.personentered LIKE '%TestPerson')
)
Although, either way I’m not sure how great of a query plan you’ll get. These types of shenanigans in a WHERE clause will often prevent the query optimizer from utilizing indexes.
This should solve your problem for the time being, but I must remind you that it isn’t a good approach:
WHERE
CASE LEN('TestPerson')
WHEN 0 THEN
CASE WHEN co.personentered = co.personentered THEN 1 ELSE 0 END
ELSE
CASE WHEN co.personentered LIKE '%TestPerson' THEN 1 ELSE 0 END
END = 1
AND cc.ccnum = CASE LEN('TestFFNum')
WHEN 0 THEN cc.ccnum
ELSE 'TestFFNum'
END
AND CASE LEN('2011-01-09 11:56:29.327')
WHEN 0 THEN CASE WHEN co.DTEntered = co.DTEntered THEN 1 ELSE 0 END
ELSE
CASE LEN('2012-01-09 11:56:29.327')
WHEN 0 THEN
CASE WHEN co.DTEntered >= '2011-01-09 11:56:29.327' THEN 1 ELSE 0 END
ELSE
CASE WHEN co.DTEntered BETWEEN '2011-01-09 11:56:29.327'
AND '2012-01-09 11:56:29.327'
THEN 1 ELSE 0 END
END
END = 1
AND tl.storenum < 699
How can I delete duplicate rows where no unique row id exists?
My table is
col1 col2 col3 col4 col5 col6 col7
john 1 1 1 1 1 1
john 1 1 1 1 1 1
sally 2 2 2 2 2 2
sally 2 2 2 2 2 2
I want to be left with the following after the duplicate removal:
I’ve tried a few queries but I think they depend on having a row id as I don’t get the desired result. For example:
I like CTEs and ROW_NUMBER, as the two combined allow us to see which rows are deleted (or updated): just change the DELETE FROM CTE... to SELECT * FROM CTE:
WITH CTE AS(
SELECT [col1], [col2], [col3], [col4], [col5], [col6], [col7],
RN = ROW_NUMBER()OVER(PARTITION BY col1 ORDER BY col1)
FROM dbo.Table1
)
DELETE FROM CTE WHERE RN > 1
DEMO (result is different; I assume that it’s due to a typo on your part)
This example determines duplicates by a single column col1 because of the PARTITION BY col1. If you want to include multiple columns simply add them to the PARTITION BY:
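As a sketch of that multi-column variant, assuming a row only counts as a duplicate when all seven columns of the question's table match:

```sql
WITH CTE AS (
    SELECT [col1], [col2], [col3], [col4], [col5], [col6], [col7],
           -- number the rows within each group of fully identical rows
           RN = ROW_NUMBER() OVER (
                    PARTITION BY col1, col2, col3, col4, col5, col6, col7
                    ORDER BY col1)
    FROM dbo.Table1
)
-- keep the first row of every group, delete the rest
DELETE FROM CTE WHERE RN > 1;
```

With the sample data from the question this should leave one john row and one sally row.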
I prefer a CTE for deleting duplicate rows from a SQL Server table, and strongly recommend this article: http://codaffection.com/sql-server-article/delete-duplicate-rows-in-sql-server/
Keeping the original:
WITH CTE AS
(
SELECT *,ROW_NUMBER() OVER (PARTITION BY col1,col2,col3 ORDER BY col1,col2,col3) AS RN
FROM MyTable
)
DELETE FROM CTE WHERE RN<>1
Without keeping the original:
WITH CTE AS
(SELECT *,R=RANK() OVER (ORDER BY col1,col2,col3)
FROM MyTable)
DELETE CTE
WHERE R IN (SELECT R FROM CTE GROUP BY R HAVING COUNT(*)>1)
I’ve got a table with two columns, ID and Value. I want to change a part of some strings in the second column.
Example of Table:
ID Value
---------------------------------
1 c:\temp\123\abc\111
2 c:\temp\123\abc\222
3 c:\temp\123\abc\333
4 c:\temp\123\abc\444
Now the 123\ in the Value string is not needed. I tried UPDATE and REPLACE:
When I execute the script SQL Server does not report an error, but it does not update anything either. Why is that?
You don’t need wildcards in the REPLACE - it just finds the string you enter for the second argument, so the following should work:
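A sketch of such a statement. The table name dbo.Paths is a hypothetical placeholder, since the question's own table name is not shown here; the ID and Value columns are taken from the question:

```sql
-- REPLACE takes literal strings, not LIKE patterns, so no % wildcards are used.
UPDATE dbo.Paths
SET    Value = REPLACE(Value, '123\', '')
WHERE  ID <= 4;
```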
(I also added the \ in the replace as I assume you don’t need that either)
Try removing the % chars, as below:
To make the query run faster in big tables where not every line needs to be updated, you can also choose to only update rows that will be modified:
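One way to sketch this, again using the hypothetical table name dbo.Paths: the wildcards belong in a LIKE filter in the WHERE clause, not in REPLACE.

```sql
-- Only rows that actually contain '123\' are touched by the UPDATE.
UPDATE dbo.Paths
SET    Value = REPLACE(Value, '123\', '')
WHERE  Value LIKE '%123\%';   -- the backslash is a literal here (no ESCAPE clause)
```

Rows that do not contain the substring are skipped entirely, so they are neither rewritten nor logged as modified.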
I have read the stuff on MS pivot tables and I am still having problems getting this correct.
I have a temp table that is being created, we will say that column 1 is a Store number, and column 2 is a week number and lastly column 3 is a total of some type. Also the Week numbers are dynamic, the store numbers are static.
Store Week xCount
------- ---- ------
102 1 96
101 1 138
105 1 37
109 1 59
101 2 282
102 2 212
105 2 78
109 2 97
105 3 60
102 3 123
101 3 220
109 3 87
I would like it to come out as a pivot table, like this:
Store numbers down the side and weeks across the top.
If you are using SQL Server 2005+, then you can use the PIVOT function to transform the data from rows into columns.
It sounds like you will need to use dynamic sql if the weeks are unknown but it is easier to see the correct code using a hard-coded version initially.
First up, here are some quick table definitions and data for use:
CREATE TABLE #yt
(
[Store] int,
[Week] int,
[xCount] int
);
INSERT INTO #yt
(
[Store],
[Week], [xCount]
)
VALUES
(102, 1, 96),
(101, 1, 138),
(105, 1, 37),
(109, 1, 59),
(101, 2, 282),
(102, 2, 212),
(105, 2, 78),
(109, 2, 97),
(105, 3, 60),
(102, 3, 123),
(101, 3, 220),
(109, 3, 87);
If your values are known, then you will hard-code the query:
select *
from
(
select store, week, xCount
from #yt
) src
pivot
(
sum(xcount)
for week in ([1], [2], [3])
) piv;
See SQL Demo
Then if you need to generate the week number dynamically, your code will be:
DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX)
select @cols = STUFF((SELECT ',' + QUOTENAME(Week)
from #yt
group by Week
order by Week
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query = 'SELECT store,' + @cols + ' from
(
select store, week, xCount
from #yt
) x
pivot
(
sum(xCount)
for week in (' + @cols + ')
) p '
execute(@query);
See SQL Demo.
The dynamic version generates the list of week numbers that should be converted to columns. Both give the same result:
Store 1 2 3
----- --- --- ---
101 138 282 220
102 96 212 123
105 37 78 60
109 59 97 87
This is for a dynamic number of weeks.
Full example here: SQL Dynamic Pivot
DECLARE @DynamicPivotQuery AS NVARCHAR(MAX)
DECLARE @ColumnName AS NVARCHAR(MAX)
--Get distinct values of the PIVOT Column
SELECT @ColumnName= ISNULL(@ColumnName + ',','') + QUOTENAME(Week)
FROM (SELECT DISTINCT Week FROM #StoreSales) AS Weeks
--Prepare the PIVOT query using the dynamic column list
SET @DynamicPivotQuery =
N'SELECT Store, ' + @ColumnName + '
FROM #StoreSales
PIVOT(SUM(xCount)
FOR Week IN (' + @ColumnName + ')) AS PVTTable'
--Execute the Dynamic Pivot Query
EXEC sp_executesql @DynamicPivotQuery
I’ve achieved the same thing before by using subqueries. So if your original table was called StoreCountsByWeek, and you had a separate table that listed the Store IDs, then it would look like this:
SELECT StoreID,
Week1=(SELECT ISNULL(SUM(xCount),0) FROM StoreCountsByWeek WHERE StoreCountsByWeek.StoreID=Store.StoreID AND Week=1),
Week2=(SELECT ISNULL(SUM(xCount),0) FROM StoreCountsByWeek WHERE StoreCountsByWeek.StoreID=Store.StoreID AND Week=2),
Week3=(SELECT ISNULL(SUM(xCount),0) FROM StoreCountsByWeek WHERE StoreCountsByWeek.StoreID=Store.StoreID AND Week=3)
FROM Store
ORDER BY StoreID
One advantage of this method is that the syntax is clearer, and it makes it easier to join to other tables to pull other fields into the results too.
My anecdotal results are that running this query over a couple of thousand rows completed in less than one second, and I actually had 7 subqueries. But as noted in the comments, it is more computationally expensive to do it this way, so be careful about using this method if you expect it to run on large amounts of data.
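If the cost of one subquery per week is a concern, a commonly used alternative is conditional aggregation, which reads the table once. This is a sketch rather than part of the answer above; it assumes the #yt temp table and sample data created earlier in this thread:

```sql
SELECT Store,
       -- each CASE keeps only the rows for one week; SUM collapses them per store
       Week1 = SUM(CASE WHEN [Week] = 1 THEN xCount ELSE 0 END),
       Week2 = SUM(CASE WHEN [Week] = 2 THEN xCount ELSE 0 END),
       Week3 = SUM(CASE WHEN [Week] = 3 THEN xCount ELSE 0 END)
FROM #yt
GROUP BY Store
ORDER BY Store;
-- With the sample data: 101 -> 138, 282, 220
--                       102 ->  96, 212, 123
--                       105 ->  37,  78,  60
--                       109 ->  59,  97,  87
```

Like the subquery version, the week numbers are hard-coded, so it does not replace the dynamic PIVOT when the weeks are unknown in advance.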
I would like to know the following:
I am planning to use this in my (for example - PHP) application, but don’t want to run multiple queries against the database, what options do I have to get data from multiple tables in a single query?
Note: I am writing this as I would like to be able to link to a well written guide on the numerous questions that I constantly come across in the PHP queue, so I can link to this for further detail when I post an answer.
The answers cover off the following:
This answer covers:
Part 1
Part 2
There are a number of ways to retrieve data from multiple tables in a database. In this answer, I will be using ANSI-92 join syntax. This may differ from a number of other tutorials out there which use the older ANSI-89 syntax, and if you are used to 89 it may seem much less intuitive at first, but all I can say is to try it: it is much easier to understand when the queries start getting more complex. Why use it? Is there a performance gain? The short answer is no, but it is easier to read once you get used to it, and it makes queries written by other folks easier to read as well.
I am also going to use the concept of a small caryard which has a database to keep track of what cars it has available. The owner has hired you as his IT Computer guy and expects you to be able to drop him the data that he asks for at the drop of a hat.
I have made a number of lookup tables that will be used by the final table. This will give us a reasonable model to work from. To start off, I will be running my queries against an example database that has the following structure. I will try to think of common mistakes that are made when starting out and explain what goes wrong with them - as well as of course showing how to correct them.
The first table is simply a color listing so that we know what colors we have in the car yard.
mysql> create table colors(id int(3) not null auto_increment primary key,
-> color varchar(15), paint varchar(10));
Query OK, 0 rows affected (0.01 sec)
mysql> show columns from colors;
+-------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+----------------+
| id | int(3) | NO | PRI | NULL | auto_increment |
| color | varchar(15) | YES | | NULL | |
| paint | varchar(10) | YES | | NULL | |
+-------+-------------+------+-----+---------+----------------+
3 rows in set (0.01 sec)
mysql> insert into colors (color, paint) values ('Red', 'Metallic'),
-> ('Green', 'Gloss'), ('Blue', 'Metallic'),
-> ('White', 'Gloss'), ('Black', 'Gloss');
Query OK, 5 rows affected (0.00 sec)
Records: 5 Duplicates: 0 Warnings: 0
mysql> select * from colors;
+----+-------+----------+
| id | color | paint |
+----+-------+----------+
| 1 | Red | Metallic |
| 2 | Green | Gloss |
| 3 | Blue | Metallic |
| 4 | White | Gloss |
| 5 | Black | Gloss |
+----+-------+----------+
5 rows in set (0.00 sec)
The brands table identifies the different brands of the cars our caryard could possibly sell.
mysql> create table brands (id int(3) not null auto_increment primary key,
-> brand varchar(15));
Query OK, 0 rows affected (0.01 sec)
mysql> show columns from brands;
+-------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+----------------+
| id | int(3) | NO | PRI | NULL | auto_increment |
| brand | varchar(15) | YES | | NULL | |
+-------+-------------+------+-----+---------+----------------+
2 rows in set (0.01 sec)
mysql> insert into brands (brand) values ('Ford'), ('Toyota'),
-> ('Nissan'), ('Smart'), ('BMW');
Query OK, 5 rows affected (0.00 sec)
Records: 5 Duplicates: 0 Warnings: 0
mysql> select * from brands;
+----+--------+
| id | brand |
+----+--------+
| 1 | Ford |
| 2 | Toyota |
| 3 | Nissan |
| 4 | Smart |
| 5 | BMW |
+----+--------+
5 rows in set (0.00 sec)
The models table covers the different types of cars; it is simpler here to use car types rather than actual car models.
mysql> create table models (id int(3) not null auto_increment primary key,
-> model varchar(15));
Query OK, 0 rows affected (0.01 sec)
mysql> show columns from models;
+-------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+----------------+
| id | int(3) | NO | PRI | NULL | auto_increment |
| model | varchar(15) | YES | | NULL | |
+-------+-------------+------+-----+---------+----------------+
2 rows in set (0.00 sec)
mysql> insert into models (model) values ('Sports'), ('Sedan'), ('4WD'), ('Luxury');
Query OK, 4 rows affected (0.00 sec)
Records: 4 Duplicates: 0 Warnings: 0
mysql> select * from models;
+----+--------+
| id | model |
+----+--------+
| 1 | Sports |
| 2 | Sedan |
| 3 | 4WD |
| 4 | Luxury |
+----+--------+
4 rows in set (0.00 sec)
And finally, to tie up all these other tables, the table that ties everything together. The ID field is actually the unique lot number used to identify cars.
mysql> create table cars (id int(3) not null auto_increment primary key,
-> color int(3), brand int(3), model int(3));
Query OK, 0 rows affected (0.01 sec)
mysql> show columns from cars;
+-------+--------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+--------+------+-----+---------+----------------+
| id | int(3) | NO | PRI | NULL | auto_increment |
| color | int(3) | YES | | NULL | |
| brand | int(3) | YES | | NULL | |
| model | int(3) | YES | | NULL | |
+-------+--------+------+-----+---------+----------------+
4 rows in set (0.00 sec)
mysql> insert into cars (color, brand, model) values (1,2,1), (3,1,2), (5,3,1),
-> (4,4,2), (2,2,3), (3,5,4), (4,1,3), (2,2,1), (5,2,3), (4,5,1);
Query OK, 10 rows affected (0.00 sec)
Records: 10 Duplicates: 0 Warnings: 0
mysql> select * from cars;
+----+-------+-------+-------+
| id | color | brand | model |
+----+-------+-------+-------+
| 1 | 1 | 2 | 1 |
| 2 | 3 | 1 | 2 |
| 3 | 5 | 3 | 1 |
| 4 | 4 | 4 | 2 |
| 5 | 2 | 2 | 3 |
| 6 | 3 | 5 | 4 |
| 7 | 4 | 1 | 3 |
| 8 | 2 | 2 | 1 |
| 9 | 5 | 2 | 3 |
| 10 | 4 | 5 | 1 |
+----+-------+-------+-------+
10 rows in set (0.00 sec)
This will give us enough data (I hope) to cover off the examples below of different types of joins and also give enough data to make them worthwhile.
So getting into the grit of it, the boss wants to know The IDs of all the sports cars he has.
This is a simple two table join. We have a table that identifies the model and the table with the available stock in it. As you can see, the data in the model column of the cars table relates to the ID column of the models table. Now, we know that the models table has an ID of 1 for Sports, so let’s write the join.
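The query itself is missing from the text at this point. A plausible reconstruction of such a first attempt, not necessarily the author's exact statement, leaves the column names unqualified:

```sql
-- First attempt: no table names in front of the columns.
select
    ID,
    model
from
    cars
        join models
            on model=ID;
-- MySQL rejects this with an "ambiguous column" error (error 1052), because
-- both cars and models have a column called ID and a column called model.
```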
So this query looks good, right? We have identified the two tables that contain the information we need and used a join that correctly identifies which columns to join on.
Oh noes! An error in our first query! Yes, and it is a plum. You see, the query has indeed got the right columns, but some of them exist in both tables, so the database gets confused about which column we actually mean and where. There are two ways to solve this. The first is nice and simple: we can use tableName.columnName to tell the database exactly what we mean, like this:
select
cars.ID,
models.model
from
cars
join models
on cars.model=models.ID
+----+--------+
| ID | model |
+----+--------+
| 1 | Sports |
| 3 | Sports |
| 8 | Sports |
| 10 | Sports |
| 2 | Sedan |
| 4 | Sedan |
| 5 | 4WD |
| 7 | 4WD |
| 9 | 4WD |
| 6 | Luxury |
+----+--------+
10 rows in set (0.00 sec)
The other is probably more often used and is called table aliasing. The tables in this example have nice and short simple names, but typing out something like KPI_DAILY_SALES_BY_DEPARTMENT would probably get old quickly, so a simple way is to nickname the table like this:
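A sketch of the aliased form, using the same a and b aliases that the later queries in this answer use:

```sql
-- "cars a" gives the cars table the alias a; "models b" gives models the alias b.
select
    a.ID,
    b.model
from
    cars a
        join models b
            on a.model=b.ID;
```

This returns the same ten rows as the tableName.columnName version; only the way the tables are referred to has changed.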
Now, back to the request. As you can see we have the information we need, but we also have information that wasn’t asked for, so we need to include a where clause in the statement to only get the Sports cars as was asked. As I prefer the table alias method rather than using the table names over and over, I will stick to it from this point onwards.
Clearly, we need to add a where clause to our query. We can identify Sports cars either by ID=1 or model='Sports'. As the ID is indexed and the primary key (and it happens to be less typing), let’s use that in our query.
select
a.ID,
b.model
from
cars a
join models b
on a.model=b.ID
where
b.ID=1
+----+--------+
| ID | model |
+----+--------+
| 1 | Sports |
| 3 | Sports |
| 8 | Sports |
| 10 | Sports |
+----+--------+
4 rows in set (0.00 sec)
Bingo! The boss is happy. Of course, being a boss and never being happy with what he asked for, he looks at the information, then says I want the colors as well.
Okay, so we have a good part of our query already written, but we need to use a third table which is colors. Now, our main information table cars stores the car color ID and this links back to the colors ID column. So, in a similar manner to the original, we can join a third table:
select
a.ID,
b.model
from
cars a
join models b
on a.model=b.ID
join colors c
on a.color=c.ID
where
b.ID=1
+----+--------+
| ID | model |
+----+--------+
| 1 | Sports |
| 3 | Sports |
| 8 | Sports |
| 10 | Sports |
+----+--------+
4 rows in set (0.00 sec)
Damn, although the table was correctly joined and the related columns were linked, we forgot to pull in the actual information from the new table that we just linked.
select
a.ID,
b.model,
c.color
from
cars a
join models b
on a.model=b.ID
join colors c
on a.color=c.ID
where
b.ID=1
+----+--------+-------+
| ID | model | color |
+----+--------+-------+
| 1 | Sports | Red |
| 8 | Sports | Green |
| 10 | Sports | White |
| 3 | Sports | Black |
+----+--------+-------+
4 rows in set (0.00 sec)
Right, that’s the boss off our back for a moment. Now, to explain some of this in a little more detail. As you can see, the from clause in our statement links our main table to the others (I often start from a table that contains information rather than a lookup or dimension table). The query would work just as well with the tables all switched around, but it would make less sense when we come back to read it in a few months’ time, so it is often best to write a query that will be nice and easy to understand: lay it out intuitively and use nice indenting so that everything is as clear as it can be. If you go on to teach others, try to instill these characteristics in their queries - especially if you will be troubleshooting them.
It is entirely possible to keep linking more and more tables in this manner.
select
a.ID,
b.model,
c.color
from
cars a
join models b
on a.model=b.ID
join colors c
on a.color=c.ID
join brands d
on a.brand=d.ID
where
b.ID=1
While I forgot to include a table where we might want to join more than one column in the join statement, here is an example. If the models table had brand-specific models and therefore also had a column called brand which linked back to the brands table on the ID field, it could be done as this:
select
a.ID,
b.model,
c.color
from
cars a
join models b
on a.model=b.ID
join colors c
on a.color=c.ID
join brands d
on a.brand=d.ID
and b.brand=d.ID
where
b.ID=1
You can see, the query above not only links the joined tables to the main cars table, but also specifies joins between the already joined tables. If this wasn’t done, the result is called a cartesian join - which is dba speak for bad. A cartesian join is one where the query doesn’t tell the database how to match rows between the tables, so every row of one table is paired with every row of the other.
So, to give an example of a cartesian join, let’s run the following query:
select
a.ID,
b.model
from
cars a
join models b
+----+--------+
| ID | model |
+----+--------+
| 1 | Sports |
| 1 | Sedan |
| 1 | 4WD |
| 1 | Luxury |
| 2 | Sports |
| 2 | Sedan |
| 2 | 4WD |
| 2 | Luxury |
| 3 | Sports |
| 3 | Sedan |
| 3 | 4WD |
| 3 | Luxury |
| 4 | Sports |
| 4 | Sedan |
| 4 | 4WD |
| 4 | Luxury |
| 5 | Sports |
| 5 | Sedan |
| 5 | 4WD |
| 5 | Luxury |
| 6 | Sports |
| 6 | Sedan |
| 6 | 4WD |
| 6 | Luxury |
| 7 | Sports |
| 7 | Sedan |
| 7 | 4WD |
| 7 | Luxury |
| 8 | Sports |
| 8 | Sedan |
| 8 | 4WD |
| 8 | Luxury |
| 9 | Sports |
| 9 | Sedan |
| 9 | 4WD |
| 9 | Luxury |
| 10 | Sports |
| 10 | Sedan |
| 10 | 4WD |
| 10 | Luxury |
+----+--------+
40 rows in set (0.00 sec)
Good god, that’s ugly. However, as far as the database is concerned, it is exactly what was asked for. In the query, we asked for the ID from cars and the model from models. However, because we didn’t specify how to join the tables, the database has matched every row from the first table with every row from the second table.
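The row count follows directly from the table sizes: 10 cars multiplied by 4 models gives the 40 rows above. A small sketch that makes the cartesian join explicit and checks the arithmetic (in MySQL, a join with no on clause behaves the same as cross join):

```sql
-- Every row of cars is paired with every row of models.
select count(*) as total_rows
from cars a
    cross join models b;
-- 10 rows in cars x 4 rows in models = 40
```

Writing cross join on purpose at least tells the next reader that the missing join condition was intended.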
Okay, so the boss is back, and he wants more information again. I want the same list, but also include 4WDs in it.
This, however, gives us a great excuse to look at two different ways to accomplish this. We could add another condition to the where clause like this:
select
a.ID,
b.model,
c.color
from
cars a
join models b
on a.model=b.ID
join colors c
on a.color=c.ID
join brands d
on a.brand=d.ID
where
b.ID=1
or b.ID=3
While the above will work perfectly well, let’s look at it differently: this is a great excuse to show how a union query works.
We know that the following will return all the Sports cars:
select
a.ID,
b.model,
c.color
from
cars a
join models b
on a.model=b.ID
join colors c
on a.color=c.ID
join brands d
on a.brand=d.ID
where
b.ID=1
And the following would return all the 4WDs:
select
a.ID,
b.model,
c.color
from
cars a
join models b
on a.model=b.ID
join colors c
on a.color=c.ID
join brands d
on a.brand=d.ID
where
b.ID=3
So by adding a union all clause between them, the results of the second query will be appended to the results of the first query.
select
a.ID,
b.model,
c.color
from
cars a
join models b
on a.model=b.ID
join colors c
on a.color=c.ID
join brands d
on a.brand=d.ID
where
b.ID=1
union all
select
a.ID,
b.model,
c.color
from
cars a
join models b
on a.model=b.ID
join colors c
on a.color=c.ID
join brands d
on a.brand=d.ID
where
b.ID=3
+----+--------+-------+
| ID | model | color |
+----+--------+-------+
| 1 | Sports | Red |
| 8 | Sports | Green |
| 10 | Sports | White |
| 3 | Sports | Black |
| 5 | 4WD | Green |
| 7 | 4WD | White |
| 9 | 4WD | Black |
+----+--------+-------+
7 rows in set (0.00 sec)
As you can see, the results of the first query are returned first, followed by the results of the second query.
In this example, it would of course have been much easier to simply use the first query, but union queries can be great for specific cases. They are a great way to return specific results from tables that aren’t easily joined together - or for that matter completely unrelated tables. There are a few rules to follow however.
Now, you might be wondering what the difference is between using union and union all. A union query will remove duplicates, while a union all will not. This does mean that there is a small performance hit when using union over union all, but the results may be worth it - I won’t speculate on that sort of thing here though.
On this note, a couple of additional points are worth mentioning here.
You can sort the combined results with an order by, but you can’t use the alias anymore. In the query above, appending an order by a.ID would result in an error - as far as the results are concerned, the column is called ID rather than a.ID - even though the same alias has been used in both queries.
There can be only one order by clause, and it must be the last clause of the whole query.
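To make the union versus union all difference concrete, here is a small sketch using Python's sqlite3 module with an in-memory database. SQLite and the four sample rows are assumptions for illustration. The two halves deliberately overlap so that duplicates appear, and the single order by at the end refers to the result column by its plain name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table cars (ID integer primary key, model integer);
    insert into cars values (1, 1), (3, 1), (5, 3), (7, 3);
""")

# Both halves match the cars with model 1, so those rows qualify twice.
overlapping = """
    select a.ID as ID from cars a where a.model = 1 or a.model = 3
    {op}
    select a.ID as ID from cars a where a.model = 1
    order by ID
"""
with_duplicates = conn.execute(overlapping.format(op="union all")).fetchall()
without_duplicates = conn.execute(overlapping.format(op="union")).fetchall()

print(with_duplicates)
print(without_duplicates)
```

The union all result lists cars 1 and 3 twice, while union returns each ID once.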
For the next examples, I am adding a few extra rows to our tables.
I have added Holden to the brands table. I have also added a row into cars that has the color value of 12 - which has no reference in the colors table.
Okay, the boss is back again, barking requests out - “I want a count of each brand we carry and the number of cars in it!” - Typical, we just get to an interesting section of our discussion and the boss wants more work.
Rightyo, so the first thing we need to do is get a complete listing of possible brands.
select
a.brand
from
brands a
+--------+
| brand |
+--------+
| Ford |
| Toyota |
| Nissan |
| Smart |
| BMW |
| Holden |
+--------+
6 rows in set (0.00 sec)
Now, when we join this to our cars table we get the following result:
select
a.brand
from
brands a
join cars b
on a.ID=b.brand
group by
a.brand
+--------+
| brand |
+--------+
| BMW |
| Ford |
| Nissan |
| Smart |
| Toyota |
+--------+
5 rows in set (0.00 sec)
Which is of course a problem - we aren’t seeing any mention of the lovely Holden brand I added.
This is because a join looks for matching rows in both tables. As there is no data in cars that is of type Holden, it isn’t returned. This is where we can use an outer join. This will return all the results from one table whether they are matched in the other table or not:
select
a.brand
from
brands a
left outer join cars b
on a.ID=b.brand
group by
a.brand
+--------+
| brand |
+--------+
| BMW |
| Ford |
| Holden |
| Nissan |
| Smart |
| Toyota |
+--------+
6 rows in set (0.00 sec)
Now that we have that, we can add a lovely aggregate function to get a count and get the boss off our backs for a moment.
select
a.brand,
count(b.id) as countOfBrand
from
brands a
left outer join cars b
on a.ID=b.brand
group by
a.brand
+--------+--------------+
| brand | countOfBrand |
+--------+--------------+
| BMW | 2 |
| Ford | 2 |
| Holden | 0 |
| Nissan | 1 |
| Smart | 1 |
| Toyota | 5 |
+--------+--------------+
6 rows in set (0.00 sec)
And with that, away the boss skulks.
Now, to explain this in some more detail, outer joins can be of the left or right type. The left or right defines which table is fully included. A left outer join will include all the rows from the table on the left, while (you guessed it) a right outer join brings all the results from the table on the right into the results.
Some databases will allow a full outer join which will bring back results (whether matched or not) from both tables, but this isn’t supported in all databases.
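The effect of the join type on a count can be sketched like this, using Python's sqlite3 module with an in-memory database. SQLite and the reduced sample data are assumptions. Only the join keyword changes between the two runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table brands (ID integer primary key, brand text);
    create table cars (ID integer primary key, brand integer);
    insert into brands values (1, 'Ford'), (2, 'Toyota'), (6, 'Holden');
    insert into cars values (1, 2), (2, 1), (5, 2), (7, 1), (8, 2);
""")

query = """
    select a.brand, count(b.ID) as countOfBrand
    from brands a
    {join_type} join cars b on a.ID = b.brand
    group by a.brand
    order by a.brand
"""
inner_counts = conn.execute(query.format(join_type="inner")).fetchall()
outer_counts = conn.execute(query.format(join_type="left outer")).fetchall()

print(inner_counts)
print(outer_counts)
```

The inner join drops Holden entirely. The left outer join keeps it, and because count(b.ID) only counts non-null values, Holden shows up with a count of 0.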
Now, at this point you are probably wondering whether or not you can mix join types in a query - and the answer is yes, you absolutely can.
select
b.brand,
c.color,
count(a.id) as countOfBrand
from
cars a
right outer join brands b
on b.ID=a.brand
join colors c
on a.color=c.ID
group by
a.brand,
c.color
+--------+-------+--------------+
| brand | color | countOfBrand |
+--------+-------+--------------+
| Ford | Blue | 1 |
| Ford | White | 1 |
| Toyota | Black | 1 |
| Toyota | Green | 2 |
| Toyota | Red | 1 |
| Nissan | Black | 1 |
| Smart | White | 1 |
| BMW | Blue | 1 |
| BMW | White | 1 |
+--------+-------+--------------+
9 rows in set (0.00 sec)
So, why are those not the results that were expected? Although we specified an outer join from cars to brands, the join to colors is a plain (inner) join - so it only brings back rows that match in both tables. That drops Holden (no cars, so no color to match) as well as the car whose color value of 12 has no entry in the colors table.
Here is the query that would work to get the results that we expected:
select
a.brand,
c.color,
count(b.id) as countOfBrand
from
brands a
left outer join cars b
on a.ID=b.brand
left outer join colors c
on b.color=c.ID
group by
a.brand,
c.color
+--------+-------+--------------+
| brand | color | countOfBrand |
+--------+-------+--------------+
| BMW | Blue | 1 |
| BMW | White | 1 |
| Ford | Blue | 1 |
| Ford | White | 1 |
| Holden | NULL | 0 |
| Nissan | Black | 1 |
| Smart | White | 1 |
| Toyota | NULL | 1 |
| Toyota | Black | 1 |
| Toyota | Green | 2 |
| Toyota | Red | 1 |
+--------+-------+--------------+
11 rows in set (0.00 sec)
As we can see, we have two outer joins in the query and the results are coming through as expected.
Now, how about those other types of joins you ask? What about intersections?
Well, not all databases support the intersect operator, but pretty much all databases will allow you to create an intersection through a join (or a well structured where statement at the least).
An intersection is a type of query somewhat similar to a union as described above - but the difference is that it only returns rows of data that are identical (and I do mean identical) between the various individual queries joined by the intersect. Only rows that are identical in every regard will be returned.
A simple example would be as such:
While a normal union query would return all the rows of the table (the first query returning anything with ID>2 and the second anything having ID<4), which would result in a full set, an intersect query would only return the row matching id=3, as it meets both criteria.
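A minimal sketch of that comparison, run through Python's sqlite3 module (SQLite does support intersect). The colors rows other than ID 3 are assumed sample data, and the two half-queries follow the ID>2 and ID<4 description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table colors (ID integer primary key, color text, paint text);
    insert into colors values
        (1, 'Red', 'Metallic'), (2, 'Green', 'Gloss'), (3, 'Blue', 'Metallic'),
        (4, 'White', 'Gloss'), (5, 'Black', 'Gloss');
""")

halves = """
    select ID, color, paint from colors where ID > 2
    {op}
    select ID, color, paint from colors where ID < 4
    order by ID
"""
union_rows = conn.execute(halves.format(op="union")).fetchall()
intersect_rows = conn.execute(halves.format(op="intersect")).fetchall()

print(len(union_rows))
print(intersect_rows)
```

The union covers the whole table, while the intersect leaves only the row with ID 3, the one row returned by both halves.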
Now, if your database doesn’t support an intersect query, the above can be easily accomplished with the following query:
select
a.ID,
a.color,
a.paint
from
colors a
join colors b
on a.ID=b.ID
where
a.ID>2
and b.ID<4
+----+-------+----------+
| ID | color | paint |
+----+-------+----------+
| 3 | Blue | Metallic |
+----+-------+----------+
1 row in set (0.00 sec)
If you wish to perform an intersection across two different tables using a database that doesn’t inherently support an intersection query, you will need to create a join on every column of the tables.
Ok, I found this post very interesting and I would like to share some of my knowledge on creating a query. Thanks for this, Fluffeh. Others who read this and feel that I’m wrong are 101% free to edit and criticise my answer. (Honestly, I would be very thankful to have my mistakes corrected.)
I’ll be posting some of the frequently asked questions in the MySQL tag.
Given this schema
CREATE TABLE MovieList
(
ID INT,
MovieName VARCHAR(25),
CONSTRAINT ml_pk PRIMARY KEY (ID),
CONSTRAINT ml_uq UNIQUE (MovieName)
);
INSERT INTO MovieList VALUES (1, 'American Pie');
INSERT INTO MovieList VALUES (2, 'The Notebook');
INSERT INTO MovieList VALUES (3, 'Discovery Channel: Africa');
INSERT INTO MovieList VALUES (4, 'Mr. Bean');
INSERT INTO MovieList VALUES (5, 'Expendables 2');
CREATE TABLE CategoryList
(
MovieID INT,
CategoryName VARCHAR(25),
CONSTRAINT cl_uq UNIQUE(MovieID, CategoryName),
CONSTRAINT cl_fk FOREIGN KEY (MovieID) REFERENCES MovieList(ID)
);
INSERT INTO CategoryList VALUES (1, 'Comedy');
INSERT INTO CategoryList VALUES (1, 'Romance');
INSERT INTO CategoryList VALUES (2, 'Romance');
INSERT INTO CategoryList VALUES (2, 'Drama');
INSERT INTO CategoryList VALUES (3, 'Documentary');
INSERT INTO CategoryList VALUES (4, 'Comedy');
INSERT INTO CategoryList VALUES (5, 'Comedy');
INSERT INTO CategoryList VALUES (5, 'Action');
QUESTION
Find all movies that belong to at least both the Comedy and Romance categories.
Solution
This question can be very tricky sometimes. It may seem that a query like this will be the answer:
SELECT DISTINCT a.MovieName
FROM MovieList a
INNER JOIN CategoryList b
ON a.ID = b.MovieID
WHERE b.CategoryName = 'Comedy' AND
b.CategoryName = 'Romance'
which is definitely wrong because it produces no result. The explanation is that each row holds only one value of CategoryName, so when the first condition is true for a row, the second is necessarily false. Because the AND operator requires both conditions to be true, no row can ever qualify. Another attempt looks like this:
SELECT DISTINCT a.MovieName
FROM MovieList a
INNER JOIN CategoryList b
ON a.ID = b.MovieID
WHERE b.CategoryName IN ('Comedy','Romance')
and the result is still incorrect, because it matches any movie that has at least one of the categories. The real solution is to count the number of matching rows per movie: the count must equal the number of values supplied in the condition.
SELECT a.MovieName
FROM MovieList a
INNER JOIN CategoryList b
ON a.ID = b.MovieID
WHERE b.CategoryName IN ('Comedy','Romance')
GROUP BY a.MovieName
HAVING COUNT(*) = 2
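The counting idea generalises to any number of categories. Below is a sketch using Python's sqlite3 module, with SQLite standing in for MySQL and a trimmed copy of the sample data. The helper name movies_in_all is hypothetical, and count(distinct ...) is used here as a defensive variation in case the table ever allowed repeated category rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table MovieList (ID int primary key, MovieName varchar(25) unique);
    create table CategoryList (MovieID int, CategoryName varchar(25),
                               unique (MovieID, CategoryName));
    insert into MovieList values (1, 'American Pie'), (2, 'The Notebook'),
                                 (4, 'Mr. Bean');
    insert into CategoryList values (1, 'Comedy'), (1, 'Romance'),
                                    (2, 'Romance'), (2, 'Drama'), (4, 'Comedy');
""")

def movies_in_all(categories):
    """Return the movies that belong to every category in `categories`."""
    placeholders = ", ".join("?" for _ in categories)
    sql = f"""
        select a.MovieName
        from MovieList a
        join CategoryList b on a.ID = b.MovieID
        where b.CategoryName in ({placeholders})
        group by a.MovieName
        having count(distinct b.CategoryName) = ?
        order by a.MovieName
    """
    # The last parameter is the number of categories that must all match.
    return [row[0] for row in conn.execute(sql, (*categories, len(categories)))]

both = movies_in_all(["Comedy", "Romance"])
romance_only = movies_in_all(["Romance"])
print(both, romance_only)
```

Only American Pie has both categories. Asking for Romance alone also returns The Notebook.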
Given schema,
CREATE TABLE Software
(
ID INT,
SoftwareName VARCHAR(25),
Descriptions VARCHAR(150),
CONSTRAINT sw_pk PRIMARY KEY (ID),
CONSTRAINT sw_uq UNIQUE (SoftwareName)
);
INSERT INTO Software VALUES (1,'PaintMe','used for photo editing');
INSERT INTO Software VALUES (2,'World Map','contains map of different places of the world');
INSERT INTO Software VALUES (3,'Dictionary','contains description, synonym, antonym of the words');
CREATE TABLE VersionList
(
SoftwareID INT,
VersionNo INT,
DateReleased DATE,
CONSTRAINT sw_uq UNIQUE (SoftwareID, VersionNo),
CONSTRAINT sw_fk FOREIGN KEY (SoftwareID) REFERENCES Software(ID)
);
INSERT INTO VersionList VALUES (3, 2, '2009-12-01');
INSERT INTO VersionList VALUES (3, 1, '2009-11-01');
INSERT INTO VersionList VALUES (3, 3, '2010-01-01');
INSERT INTO VersionList VALUES (2, 2, '2010-12-01');
INSERT INTO VersionList VALUES (2, 1, '2009-12-01');
INSERT INTO VersionList VALUES (1, 3, '2011-12-01');
INSERT INTO VersionList VALUES (1, 2, '2010-12-01');
INSERT INTO VersionList VALUES (1, 1, '2009-12-01');
INSERT INTO VersionList VALUES (1, 4, '2012-12-01');
QUESTION
Find the latest version of each software. Display the following columns: SoftwareName, Descriptions, LatestVersion (from the VersionNo column), DateReleased
Solution
Some SQL developers mistakenly use the MAX() aggregate function. They tend to write something like this:
SELECT a.SoftwareName, a.Descriptions,
MAX(b.VersionNo) AS LatestVersion, b.DateReleased
FROM Software a
INNER JOIN VersionList b
ON a.ID = b.SoftwareID
GROUP BY a.ID
ORDER BY a.ID
(Most RDBMS raise an error on this query, because some of the non-aggregated columns are missing from the group by clause.) The result has the correct LatestVersion for each software, but the DateReleased values are obviously incorrect. At the time of writing, MySQL did not support window functions or common table expressions as some RDBMS already did (both were added in MySQL 8.0). The workaround is to create a subquery which gets the maximum VersionNo for each software, and then join it back to the other tables.
SELECT a.SoftwareName, a.Descriptions,
b.LatestVersion, c.DateReleased
FROM Software a
INNER JOIN
(
SELECT SoftwareID, MAX(VersionNO) LatestVersion
FROM VersionList
GROUP BY SoftwareID
) b ON a.ID = b.SoftwareID
INNER JOIN VersionList c
ON c.SoftwareID = b.SoftwareID AND
c.VersionNO = b.LatestVersion
ORDER BY a.ID
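On a database that does have window functions, the same result can also be written with row_number(). The sketch below compares the two approaches using Python's sqlite3 module. It assumes SQLite 3.25 or newer, and it uses a shortened copy of the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Software (ID int primary key, SoftwareName varchar(25));
    create table VersionList (SoftwareID int, VersionNo int, DateReleased date);
    insert into Software values (1, 'PaintMe'), (2, 'World Map');
    insert into VersionList values
        (2, 2, '2010-12-01'), (2, 1, '2009-12-01'),
        (1, 3, '2011-12-01'), (1, 4, '2012-12-01');
""")

# Subquery approach: find the max version per software, then join back.
latest = conn.execute("""
    select a.SoftwareName, c.VersionNo, c.DateReleased
    from Software a
    join (
        select SoftwareID, max(VersionNo) as LatestVersion
        from VersionList
        group by SoftwareID
    ) b on a.ID = b.SoftwareID
    join VersionList c
        on c.SoftwareID = b.SoftwareID and c.VersionNo = b.LatestVersion
    order by a.ID
""").fetchall()

# Window-function approach: number the versions newest-first, keep row 1.
windowed = conn.execute("""
    select SoftwareName, VersionNo, DateReleased
    from (
        select a.ID, a.SoftwareName, b.VersionNo, b.DateReleased,
               row_number() over (partition by b.SoftwareID
                                  order by b.VersionNo desc) as rn
        from Software a
        join VersionList b on a.ID = b.SoftwareID
    )
    where rn = 1
    order by ID
""").fetchall()

print(latest)
print(windowed)
```

Both queries return one row per software, each carrying the release date that belongs to the highest version number.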
So that was it. I’ll post more as soon as I recall other FAQs from the MySQL tag. Thank you for reading this little article. I hope that you have gained at least a little knowledge from it.
UPDATE 1
Given Schema
CREATE TABLE userList
(
ID INT,
NAME VARCHAR(20),
CONSTRAINT us_pk PRIMARY KEY (ID),
CONSTRAINT us_uq UNIQUE (NAME)
);
INSERT INTO userList VALUES (1, 'Fluffeh');
INSERT INTO userList VALUES (2, 'John Woo');
INSERT INTO userList VALUES (3, 'hims056');
CREATE TABLE CONVERSATION
(
ID INT,
FROM_ID INT,
TO_ID INT,
MESSAGE VARCHAR(250),
DeliveryDate DATE
);
INSERT INTO CONVERSATION VALUES (1, 1, 2, 'hi john', '2012-01-01');
INSERT INTO CONVERSATION VALUES (2, 2, 1, 'hello fluff', '2012-01-02');
INSERT INTO CONVERSATION VALUES (3, 1, 3, 'hey hims', '2012-01-03');
INSERT INTO CONVERSATION VALUES (4, 1, 3, 'please reply', '2012-01-04');
INSERT INTO CONVERSATION VALUES (5, 3, 1, 'how are you?', '2012-01-05');
INSERT INTO CONVERSATION VALUES (6, 3, 2, 'sample message!', '2012-01-05');
QUESTION
Find the latest conversation between two users.
Solution
SELECT b.Name SenderName,
c.Name RecipientName,
a.Message,
a.DeliveryDate
FROM Conversation a
INNER JOIN userList b
ON a.From_ID = b.ID
INNER JOIN userList c
ON a.To_ID = c.ID
WHERE (LEAST(a.FROM_ID, a.TO_ID), GREATEST(a.FROM_ID, a.TO_ID), DeliveryDate)
IN
(
SELECT LEAST(FROM_ID, TO_ID) minFROM,
GREATEST(FROM_ID, TO_ID) maxTo,
MAX(DeliveryDate) maxDate
FROM Conversation
GROUP BY minFROM, maxTo
)
Okay, now the boss has burst in again - I want a list of all of our cars with the brand and a total of how many of that brand we have!
This is a great opportunity to use the next trick in our bag of SQL goodies - the subquery. If you are unfamiliar with the term, a subquery is a query that runs inside another query. There are many different ways to use them.
For our request, let’s first put a simple query together that will list each car and the brand:
Now, if we wanted to simply get a count of cars sorted by brand, we could of course write this:
select
b.brand,
count(a.ID) as countCars
from
cars a
join brands b
on a.brand=b.ID
group by
b.brand
+--------+-----------+
| brand | countCars |
+--------+-----------+
| BMW | 2 |
| Ford | 2 |
| Nissan | 1 |
| Smart | 1 |
| Toyota | 5 |
+--------+-----------+
So, we should be able to simply add in the count function to our original query right?
select
a.ID,
b.brand,
count(a.ID) as countCars
from
cars a
join brands b
on a.brand=b.ID
group by
a.ID,
b.brand
+----+--------+-----------+
| ID | brand | countCars |
+----+--------+-----------+
| 1 | Toyota | 1 |
| 2 | Ford | 1 |
| 3 | Nissan | 1 |
| 4 | Smart | 1 |
| 5 | Toyota | 1 |
| 6 | BMW | 1 |
| 7 | Ford | 1 |
| 8 | Toyota | 1 |
| 9 | Toyota | 1 |
| 10 | BMW | 1 |
| 11 | Toyota | 1 |
+----+--------+-----------+
11 rows in set (0.00 sec)
Sadly, no, we can’t do that. The reason is that when we add in the car ID (column a.ID) we have to add it into the group by - so now, when the count function works, each group holds exactly one car and the count is always 1.
This is where we can however use a subquery - in fact we can do two completely different types of subquery that will return the same results that we need for this. The first is to simply put the subquery in the select clause. This means each time we get a row of data, the subquery will run off, get a column of data and then pop it into our row of data.
select
a.ID,
b.brand,
(
select
count(c.ID)
from
cars c
where
a.brand=c.brand
) as countCars
from
cars a
join brands b
on a.brand=b.ID
+----+--------+-----------+
| ID | brand | countCars |
+----+--------+-----------+
| 2 | Ford | 2 |
| 7 | Ford | 2 |
| 1 | Toyota | 5 |
| 5 | Toyota | 5 |
| 8 | Toyota | 5 |
| 9 | Toyota | 5 |
| 11 | Toyota | 5 |
| 3 | Nissan | 1 |
| 4 | Smart | 1 |
| 6 | BMW | 2 |
| 10 | BMW | 2 |
+----+--------+-----------+
11 rows in set (0.00 sec)
And Bam!, this would do us. If you noticed though, this subquery will have to run for each and every single row of data we return. Even in this little example, we only have five different brands of car, but the subquery ran eleven times as we have eleven rows of data that we are returning. So, in this case, it doesn’t seem like the most efficient way to write code.
For a different approach, let’s run a subquery and pretend it is a table:
select
a.ID,
b.brand,
d.countCars
from
cars a
join brands b
on a.brand=b.ID
join
(
select
c.brand,
count(c.ID) as countCars
from
cars c
group by
c.brand
) d
on a.brand=d.brand
+----+--------+-----------+
| ID | brand | countCars |
+----+--------+-----------+
| 1 | Toyota | 5 |
| 2 | Ford | 2 |
| 3 | Nissan | 1 |
| 4 | Smart | 1 |
| 5 | Toyota | 5 |
| 6 | BMW | 2 |
| 7 | Ford | 2 |
| 8 | Toyota | 5 |
| 9 | Toyota | 5 |
| 10 | BMW | 2 |
| 11 | Toyota | 5 |
+----+--------+-----------+
11 rows in set (0.00 sec)
Okay, so we have the same results (ordered slightly differently - it seems the database wanted to return results ordered by the first column we picked this time) - but the same right numbers.
So, what’s the difference between the two - and when should we use each type of subquery? First, let’s make sure we understand how that second query works. We selected two tables in the from clause of our query, and then wrote a query and told the database that it was in fact a table instead - which the database is perfectly happy with. There can be some benefits to using this method (as well as some limitations). Foremost is that this subquery ran once. If our database contained a large volume of data, there could well be a massive improvement over the first method. However, as we are using this as a table, we have to bring in extra columns of data - so that its rows can actually be joined back to our rows of data. We also have to be sure that there are enough rows of data if we are going to use a simple join like in the query above. If you recall, the join will only pull back rows that have matching data on both sides of the join. If we aren’t careful, this could result in valid data not being returned from our cars table if there wasn’t a matching row in this subquery.
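To check that the two styles really are interchangeable here, the following sketch runs both against the same data with Python's sqlite3 module. In-memory SQLite and the reduced sample rows are assumptions. An explicit order by is added so the two results can be compared directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table brands (ID integer primary key, brand text);
    create table cars (ID integer primary key, brand integer);
    insert into brands values (1, 'Ford'), (2, 'Toyota'), (3, 'Nissan');
    insert into cars values (1, 2), (2, 1), (3, 3), (5, 2), (7, 1), (8, 2);
""")

# Subquery in the select clause: conceptually evaluated once per returned row.
correlated = conn.execute("""
    select a.ID, b.brand,
           (select count(c.ID) from cars c where c.brand = a.brand) as countCars
    from cars a
    join brands b on a.brand = b.ID
    order by a.ID
""").fetchall()

# Subquery in the from clause: aggregated once, then joined like a table.
derived = conn.execute("""
    select a.ID, b.brand, d.countCars
    from cars a
    join brands b on a.brand = b.ID
    join (
        select c.brand, count(c.ID) as countCars
        from cars c
        group by c.brand
    ) d on a.brand = d.brand
    order by a.ID
""").fetchall()

print(correlated == derived)
```

Which form is faster depends on the database and its optimizer, so it is worth measuring both on real data rather than assuming.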
Now, looking back at the first subquery, there are some limitations as well. Because we are pulling data back into a single row, we can ONLY pull back a single value (one row with one column). Subqueries used in the select clause of a query very often use only an aggregate function such as sum, count, max or another similar aggregate function. They don’t have to, but that is often how they are written.
So, before we move on, let’s have a quick look at where else we can use a subquery. We can use it in the where clause - now, this example is a little contrived as in our database, there are better ways of getting the following data, but seeing as it is only for an example, let’s have a look:
select
ID,
brand
from
brands
where
brand like '%o%'
+----+--------+
| ID | brand |
+----+--------+
| 1 | Ford |
| 2 | Toyota |
| 6 | Holden |
+----+--------+
3 rows in set (0.00 sec)
This returns us a list of brand IDs and brand names (the second column is only added to show us the brands) that contain the letter o in the name.
Now, we could use the results of this query in a where clause like this:
select
a.ID,
b.brand
from
cars a
join brands b
on a.brand=b.ID
where
a.brand in
(
select
ID
from
brands
where
brand like '%o%'
)
+----+--------+
| ID | brand |
+----+--------+
| 2 | Ford |
| 7 | Ford |
| 1 | Toyota |
| 5 | Toyota |
| 8 | Toyota |
| 9 | Toyota |
| 11 | Toyota |
+----+--------+
7 rows in set (0.00 sec)
As you can see, even though the subquery was returning the three brand IDs, our cars table only had entries for two of them.
In this case, for further detail, the subquery is working as if we wrote the following code:
select
a.ID,
b.brand
from
cars a
join brands b
on a.brand=b.ID
where
a.brand in (1,2,6)
+----+--------+
| ID | brand |
+----+--------+
| 1 | Toyota |
| 2 | Ford |
| 5 | Toyota |
| 7 | Ford |
| 8 | Toyota |
| 9 | Toyota |
| 11 | Toyota |
+----+--------+
7 rows in set (0.00 sec)
Again, you can see how a subquery vs manual inputs has changed the order of the rows when returning from the database.
While we are discussing subqueries, let’s see what else we can do with a subquery:
You can place a number of subqueries into a single query - a few in the select clause, some in the from clause and a couple more in the where clause - just remember that each one you put in is making your query more complex and likely to take longer to execute.
If you need to write some efficient code, it can be beneficial to write the query a number of ways and see (either by timing it or by using an explain plan) which is the optimal query to get your results. The first way that works may not always be the best way of doing it.
How do I clear the value from a cell and make it NULL?
If you’ve opened a table and you want to clear an existing value to NULL, click on the value, and press Ctrl+0.
I think @Zack properly answered the question but just to cover all the bases:
This would set the entire column to null as the Question Title asks.
To set a specific row on a specific column to null use:
This would set a specific cell to null as the inner question asks.
If you are using the table interface you can type in NULL (all caps); otherwise you can run an update statement where you could:
I have a SQL Server table that contains users & their grades. For simplicity’s sake, let’s just say there are 2 columns - name & grade. So a typical row would be Name: “John Doe”, Grade: “A”.
I’m looking for one SQL statement that will find the percentages of all possible answers. (A, B, C, etc…) Also, is there a way to do this without defining all possible answers (open text field - users could enter ‘pass/fail’, ‘none’, etc…)
The final output I’m looking for is A: 5%, B: 15%, C: 40%, etc…
I have tested the following and this does work. The answer by gordyii was close but had the multiplication by 100 in the wrong place and had some missing parentheses.
1. The most efficient (using over()).

```sql
select Grade, count(*) * 100.0 / sum(count(*)) over()
from MyTable
group by Grade
```

2. Universal (any SQL version).

```sql
select Grade, count(*) * 100.0 / (select count(*) from MyTable)
from MyTable
group by Grade;
```

3. With CTE, the least efficient.

```sql
with t(Grade, GradeCount)
as
(
    select Grade, count(*)
    from MyTable
    group by Grade
)
select Grade, GradeCount * 100.0/(select sum(GradeCount) from t)
from t;
```
#### Answer 3 (score 38)
Instead of using a separate CTE to get the total, you can use a window function without the "partition by" clause.
If you are using:
```sql
count(*)
```
to get the count for a group, you can sum those counts with a window function that has no partition by clause to get the total count.
For example:
It tends to be faster in my experience, but I think it might internally use a temp table in some cases (I’ve seen “Worktable” when running with “set statistics io on”).
EDIT: I’m not sure if my example query is what you are looking for, I was just illustrating how the windowing functions work.
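As a hedged illustration of the windowed total, the sketch below runs the over() form next to the scalar-subquery form using Python's sqlite3 module. It assumes SQLite 3.25 or newer in place of SQL Server, and the eight sample grades are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table MyTable (Name text, Grade text);
    insert into MyTable values
        ('u1', 'A'), ('u2', 'B'), ('u3', 'B'), ('u4', 'C'),
        ('u5', 'C'), ('u6', 'C'), ('u7', 'C'), ('u8', 'C');
""")

# Window function with an empty over(): the sum runs across all groups.
windowed = conn.execute("""
    select Grade,
           count(*) * 100.0 / sum(count(*)) over () as Percentage
    from MyTable
    group by Grade
    order by Grade
""").fetchall()

# Scalar subquery: the total is computed by a separate count over the table.
universal = conn.execute("""
    select Grade,
           count(*) * 100.0 / (select count(*) from MyTable) as Percentage
    from MyTable
    group by Grade
    order by Grade
""").fetchall()

print(windowed)
```

Multiplying by 100.0 rather than 100 keeps the division in floating point, and no list of possible grades has to be defined up front because the groups come from the data.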
I was trying to create a table as follows,
First I tried inserting values as below,
It has given error saying,
Cannot convert varchar to datetime
Then I tried the format below, as suggested in one of the posts on Stack Overflow,
insert into table1 values(convert(datetime,'21-02-2012 6:10:00 PM',5)
,convert(datetime,'01-01-2001 12:00:00 AM',5));
But I am still getting the error saying,
Conversion failed when converting date and/or time from character string
Any suggestions?
There are many formats supported by SQL Server - see the MSDN Books Online on CAST and CONVERT. Most of those formats are dependent on what settings you have - therefore, these formats might work sometimes - and sometimes not.
The way to solve this is to use the (slightly adapted) ISO-8601 date format that is supported by SQL Server - this format always works - regardless of your SQL Server language and dateformat settings.
The ISO-8601 format that is supported by SQL Server comes in two flavors:
YYYYMMDD for just dates (no time portion); note here: no dashes!, that’s very important! YYYY-MM-DD is NOT independent of the dateformat settings in your SQL Server and will NOT work in all situations!
or:
YYYY-MM-DDTHH:MM:SS for dates and times - note here: this format has dashes (but they can be omitted), and a fixed T as delimiter between the date and time portion of your DATETIME.
This is valid for SQL Server 2000 and newer.
So in your concrete case - use these strings:
and you should be fine (note: you need to use the international 24-hour format rather than 12-hour AM/PM format for this).
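If the values arrive in the application as day-first, 12-hour strings like the ones in the question, they can be turned into the YYYY-MM-DDTHH:MM:SS form before they reach the database. This is a client-side sketch in Python. The helper name to_iso8601 and the assumed input format are illustrative, not part of the original answer.

```python
from datetime import datetime

def to_iso8601(text, source_format="%d-%m-%Y %I:%M:%S %p"):
    """Convert a day-first, 12-hour date string to YYYY-MM-DDTHH:MM:SS."""
    parsed = datetime.strptime(text, source_format)
    # %H is the 24-hour clock, which is what the ISO-8601 form expects.
    return parsed.strftime("%Y-%m-%dT%H:%M:%S")

first = to_iso8601("21-02-2012 6:10:00 PM")
second = to_iso8601("01-01-2001 12:00:00 AM")
print(first, second)
```

Passing the result as a query parameter (or, better, passing a real datetime value through the driver) avoids depending on the server's language and dateformat settings.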
Alternatively: if you’re on SQL Server 2008 or newer, you could also use the DATETIME2 datatype (instead of plain DATETIME) and your current INSERT would just work without any problems! :-) DATETIME2 is a lot better and a lot less picky on conversions - and it’s the recommended date/time data type for SQL Server 2008 or newer anyway.
SELECT
CAST('02-21-2012 6:10:00 PM' AS DATETIME2), -- works just fine
CAST('01-01-2012 12:00:00 AM' AS DATETIME2) -- works just fine
Don’t ask me why this whole topic is so tricky and somewhat confusing - that’s just the way it is. But with the YYYYMMDD format, you should be fine for any version of SQL Server and for any language and dateformat setting in your SQL Server.
The conversion in SQL Server sometimes fails not because of the date or time format used, but merely because you are trying to store a value that is not a valid date.
Example:
Create Table MyTable (MyDate date);
Insert Into MyTable(MyDate) Values ('2015-02-29');
The SQL server will throw the following error:
Conversion failed when converting date and/or time from character string.
The reason for this error is simply that there is no such date: 2015 is not a leap year, so February 29 does not exist in it.
Simple answer - 5 is Italian “yy” and 105 is Italian “yyyy”. Therefore:
will work correctly, but
will give error.
Likewise,
will give error, where as
will work.
I want to add a row to a database table, but if a row exists with the same unique key I want to update the row.
For example,
Let’s say the unique key is id, and in my database there is a row with id = 1. In that case I want to update that row with these values. Normally this gives an error. If I use insert IGNORE it will ignore the error, but it still won’t update.
Use INSERT ... ON DUPLICATE KEY UPDATE
QUERY:
Check out REPLACE
When using batch insert use the following syntax:
INSERT INTO TABLE (id, name, age) VALUES (1, "A", 19), (2, "B", 17), (3, "C", 22)
ON DUPLICATE KEY UPDATE
name = VALUES (name),
...
I have a table (SQL Server) which references paths (UNC or otherwise), but now the path is going to change.
In the path column, I have many records and I need to change just a portion of the path, but not the entire path. And I need to change the same string to the new one, in every record.
How can I do this with a simple update?
It’s this easy:
I tried the above but it did not yield the correct result. The following one does:
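As a minimal sketch of this kind of partial update (not necessarily either answer's exact statement), the REPLACE string function can rewrite just the changed portion of each path. The example runs on in-memory SQLite through Python's sqlite3 module. SQLite's replace() takes the same three arguments as SQL Server's REPLACE, and the table name, column name and paths are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(r"""
    create table files (ID integer primary key, path text);
    insert into files values
        (1, '\\oldserver\share\reports\a.txt'),
        (2, '\\oldserver\share\b.txt'),
        (3, 'C:\local\c.txt');
""")

# Swap only the leading portion; the where clause skips unrelated rows.
conn.execute(
    "update files set path = replace(path, ?, ?) where path like ?",
    (r"\\oldserver\share", r"\\newserver\data", r"\\oldserver\share%"),
)
paths = [row[0] for row in conn.execute("select path from files order by ID")]
print(paths)
```

Rows 1 and 2 now point at the new server and row 3 is untouched. Note that replace() changes every occurrence of the search string within a value, so the search string should be specific enough to match only the part that is moving.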
Is it possible to use an IF clause within a WHERE clause in MS SQL?
Example:
Use a CASE statement
UPDATE: The previous syntax (as pointed out by a few people) doesn’t work. You can use CASE as follows:
WHERE OrderNumber LIKE
CASE WHEN IsNumeric(@OrderNumber) = 1 THEN
@OrderNumber
ELSE
'%' + @OrderNumber
END
Or you can use an IF statement like @N. J. Reed points out.
You should be able to do this without any IF or CASE
WHERE
(IsNumeric(@OrderNumber) = 1 AND
CAST(OrderNumber AS VARCHAR) = CAST(@OrderNumber AS VARCHAR))
OR
(IsNumeric(@OrderNumber) = 0 AND
OrderNumber LIKE ('%' + @OrderNumber))
Depending on the flavour of SQL you may need to tweak the casts on the order number to an INT or VARCHAR depending on whether implicit casts are supported.
This is a very common technique in a WHERE clause. If you want to apply some “IF” logic in the WHERE clause all you need to do is add the extra condition with a boolean AND to the section where it needs to be applied.
You don’t need an IF statement at all.
WHERE
(IsNumeric(@OrderNumber) = 1 AND OrderNumber = @OrderNumber)
OR (IsNumeric(@OrderNumber) = 0 AND OrderNumber LIKE '%' + @OrderNumber + '%')
A very easy one for someone. The following insert is giving me the
ORA-01722: invalid number
error. Why?
INSERT INTO CUSTOMER VALUES (1,'MALADY','Claire','27 Smith St Caulfield','0419 853 694');
INSERT INTO CUSTOMER VALUES (2,'GIBSON','Jake','27 Smith St Caulfield','0415 713 598');
INSERT INTO CUSTOMER VALUES (3,'LUU','Barry','5 Jones St Malvern','0413 591 341');
INSERT INTO CUSTOMER VALUES (4,'JONES','Michael','7 Smith St Caulfield','0419 853 694');
INSERT INTO CUSTOMER VALUES (5,'MALADY','Betty','27 Smith St Knox','0418 418 347');
An ORA-01722 error occurs when an attempt is made to convert a character string into a number, and the string cannot be converted into a number.
Without seeing your table definition, it looks like you’re trying to convert the numeric sequence at the end of your values list to a number, and the spaces that delimit it are throwing this error. But based on the information you’ve given us, it could be happening on any field (other than the first one).
Suppose the telephone number is defined as NUMBER; then the blanks cannot be converted into a number:
create table telephone_number (tel_number number);
insert into telephone_number values ('0419 853 694');
The above gives you a
ORA-01722: invalid number
I have this error message:
Msg 8134, Level 16, State 1, Line 1 Divide by zero error encountered.
What is the best way to write SQL code so that I will never see this error message again?
I could do either of the following:
Or
Is the best way to use a NULLIF clause?
Is there a better way, or how can this be enforced?
In order to avoid a “Division by zero” error we have programmed it like this:
But here is a much nicer way of doing it:
Now the only problem is to remember the NullIf bit, if I use the “/” key.
In case you want to return zero when a division by zero would happen, you can use:
For every divisor that is zero, you will get a zero in the result set.
Suppose you want to calculate the male–female ratios for various school clubs, but you discover that the following query fails and issues a divide-by-zero error when it tries to calculate ratio for the Lord of the Rings Club, which has no women:
You can use the function NULLIF to avoid division by zero. NULLIF compares two expressions and returns null if they are equal or the first expression otherwise.
Rewrite the query as:
Any number divided by NULL gives NULL, and no error is generated.
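The shape of the NULLIF expression can be sketched as follows, using Python's sqlite3 module. The clubs table, its columns and its numbers are assumptions. One caveat: unlike SQL Server, SQLite returns NULL for a division by zero without raising an error, so this sketch only shows what the expression evaluates to, not the error it prevents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table clubs (club text, males integer, females integer);
    insert into clubs values
        ('Chess Club', 6, 3),
        ('Lord of the Rings Club', 12, 0);
""")

ratios = conn.execute("""
    select club,
           -- nullif turns a zero divisor into NULL, so the ratio becomes NULL
           males * 1.0 / nullif(females, 0) as ratio,
           -- coalesce then maps that NULL to 0 when a number is required
           coalesce(males * 1.0 / nullif(females, 0), 0) as ratio_or_zero
    from clubs
    order by club
""").fetchall()

for row in ratios:
    print(row)
```

The second column keeps NULL for the club with no women, and the third column substitutes 0 for it. Which of the two is appropriate depends on whether "no ratio" and "a ratio of zero" mean the same thing to the reader of the report.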
Can you help me with SQL statements to find duplicates on multiple fields?
For example, in pseudo code:
select count(field1,field2,field3)
from table
where the combination of field1, field2, field3 occurs multiple times
and from the above statement if there are multiple occurrences I would like to select every record except the first one.
To get the list of field combinations for which there are multiple records, you can use:
select field1,field2,field3, count(*)
from table_name
group by field1,field2,field3
having count(*) > 1
Check this link for more information on how to delete the rows.
http://support.microsoft.com/kb/139444
Edit: As other users mentioned, there should be a criterion for deciding how you define “first rows” before you use the approach in the link above. Based on that, you’ll need to use an ORDER BY clause and a subquery if needed. If you can post some sample data, it would really help.
You mention “the first one”, so I assume that you have some kind of ordering on your data. Let’s assume that your data is ordered by some field ID.
This SQL should get you the duplicate entries except for the first one. It basically selects all rows for which another row with (a) the same fields and (b) a lower ID exists. Performance won’t be great, but it might solve your problem.
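The statement this answer refers to is not shown here. A sketch that matches the description (same fields, and a row with a lower ID exists) might look like this, assuming the table is called table_name and has an ID column:

```sql
-- Assumes an ID column that defines which row is "first" (lowest ID wins).
SELECT t1.*
FROM table_name t1
WHERE EXISTS (
    SELECT 1
    FROM table_name t2
    WHERE t2.field1 = t1.field1
      AND t2.field2 = t1.field2
      AND t2.field3 = t1.field3
      AND t2.ID < t1.ID      -- an earlier row with the same values exists
);
```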
This is a fun solution with SQL Server 2005 that I like. I’m going to assume that by “for every record except for the first one”, you mean that there is another “id” column that we can use to identify which row is “first”.
SELECT id
, field1
, field2
, field3
FROM
(
SELECT id
, field1
, field2
, field3
, RANK() OVER (PARTITION BY field1, field2, field3 ORDER BY id ASC) AS [rank]
FROM table_name
) a
WHERE [rank] > 1
I have an SQL statement that has a CASE from SELECT and I just can’t get it right. Can you guys show me an example of CASE where the cases are the conditions and the results are from the cases? For example:
Select xxx, yyy
case : desc case when bbb then 'blackberry';
when sss then 'samsung';
end
from (select ???? .....
where the results show
MSDN is a good reference for these types of questions regarding syntax and usage. This is from the Transact-SQL Reference - CASE page.
http://msdn.microsoft.com/en-us/library/ms181765.aspx
USE AdventureWorks2012;
GO
SELECT ProductNumber, Name, "Price Range" =
CASE
WHEN ListPrice = 0 THEN 'Mfg item - not for resale'
WHEN ListPrice < 50 THEN 'Under $50'
WHEN ListPrice >= 50 and ListPrice < 250 THEN 'Under $250'
WHEN ListPrice >= 250 and ListPrice < 1000 THEN 'Under $1000'
ELSE 'Over $1000'
END
FROM Production.Product
ORDER BY ProductNumber ;
GO
Another good site you may want to check out if you’re using SQL Server is SQL Server Central. This has a large variety of resources available for whatever area of SQL Server you would like to learn.
I think these could be helpful for you.
Using a SELECT statement with a simple CASE expression
Within a SELECT statement, a simple CASE expression allows for only an equality check; no other comparisons are made. The following example uses the CASE expression to change the display of product line categories to make them more understandable.
USE AdventureWorks2012;
GO
SELECT ProductNumber, Category =
CASE ProductLine
WHEN 'R' THEN 'Road'
WHEN 'M' THEN 'Mountain'
WHEN 'T' THEN 'Touring'
WHEN 'S' THEN 'Other sale items'
ELSE 'Not for sale'
END,
Name
FROM Production.Product
ORDER BY ProductNumber;
GO
Using a SELECT statement with a searched CASE expression
Within a SELECT statement, the searched CASE expression allows for values to be replaced in the result set based on comparison values. The following example displays the list price as a text comment based on the price range for a product.
USE AdventureWorks2012;
GO
SELECT ProductNumber, Name, "Price Range" =
CASE
WHEN ListPrice = 0 THEN 'Mfg item - not for resale'
WHEN ListPrice < 50 THEN 'Under $50'
WHEN ListPrice >= 50 and ListPrice < 250 THEN 'Under $250'
WHEN ListPrice >= 250 and ListPrice < 1000 THEN 'Under $1000'
ELSE 'Over $1000'
END
FROM Production.Product
ORDER BY ProductNumber ;
GO
Using CASE in an ORDER BY clause
The following examples use the CASE expression in an ORDER BY clause to determine the sort order of the rows based on a given column value. In the first example, the value in the SalariedFlag column of the HumanResources.Employee table is evaluated. Employees that have the SalariedFlag set to 1 are returned in order by the BusinessEntityID in descending order. Employees that have the SalariedFlag set to 0 are returned in order by the BusinessEntityID in ascending order. In the second example, the result set is ordered by the column TerritoryName when the column CountryRegionName is equal to ‘United States’ and by CountryRegionName for all other rows.
SELECT BusinessEntityID, SalariedFlag
FROM HumanResources.Employee
ORDER BY CASE SalariedFlag WHEN 1 THEN BusinessEntityID END DESC
,CASE WHEN SalariedFlag = 0 THEN BusinessEntityID END;
GO
SELECT BusinessEntityID, LastName, TerritoryName, CountryRegionName
FROM Sales.vSalesPerson
WHERE TerritoryName IS NOT NULL
ORDER BY CASE CountryRegionName WHEN 'United States' THEN TerritoryName
ELSE CountryRegionName END;
Using CASE in an UPDATE statement
The following example uses the CASE expression in an UPDATE statement to determine the value that is set for the column VacationHours for employees with SalariedFlag set to 0. When subtracting 10 hours from VacationHours results in a negative value, VacationHours is increased by 40 hours; otherwise, VacationHours is increased by 20 hours. The OUTPUT clause is used to display the before and after vacation values.
USE AdventureWorks2012;
GO
UPDATE HumanResources.Employee
SET VacationHours =
( CASE
WHEN ((VacationHours - 10.00) < 0) THEN VacationHours + 40
ELSE (VacationHours + 20.00)
END
)
OUTPUT Deleted.BusinessEntityID, Deleted.VacationHours AS BeforeValue,
Inserted.VacationHours AS AfterValue
WHERE SalariedFlag = 0;
Using CASE in a HAVING clause
The following example uses the CASE expression in a HAVING clause to restrict the rows returned by the SELECT statement. The statement returns the maximum hourly rate for each job title in the HumanResources.Employee table. The HAVING clause restricts the titles to those that are held by men with a maximum pay rate greater than 40 dollars or women with a maximum pay rate greater than 42 dollars.
USE AdventureWorks2012;
GO
SELECT JobTitle, MAX(ph1.Rate)AS MaximumRate
FROM HumanResources.Employee AS e
JOIN HumanResources.EmployeePayHistory AS ph1 ON e.BusinessEntityID = ph1.BusinessEntityID
GROUP BY JobTitle
HAVING (MAX(CASE WHEN Gender = 'M'
THEN ph1.Rate
ELSE NULL END) > 40.00
OR MAX(CASE WHEN Gender = 'F'
THEN ph1.Rate
ELSE NULL END) > 42.00)
ORDER BY MaximumRate DESC;
For a more detailed description of these examples, visit the source.
Also visit here and here for some examples with great details.
I have the following code in one of my SQL Server (2008) stored procedures, which executes perfectly fine:
CREATE PROCEDURE [dbo].[Item_AddItem]
@CustomerId uniqueidentifier,
@Description nvarchar(100),
@Type int,
@Username nvarchar(100)
AS
BEGIN
DECLARE @TopRelatedItemId uniqueidentifier;
SET @TopRelatedItemId =
(
SELECT top(1) RelatedItemId
FROM RelatedItems
WHERE CustomerId = @CustomerId
)
DECLARE @TempItem TABLE
(
ItemId uniqueidentifier,
CustomerId uniqueidentifier,
Description nvarchar(100),
Type int,
Username nvarchar(100),
TimeStamp datetime
);
INSERT INTO Item
OUTPUT INSERTED.* INTO @TempItem
SELECT NEWID(), @CustomerId, @Description, @Type, @Username, GETDATE()
SELECT
ItemId,
CustomerId,
@TopRelatedItemId,
Description,
Type,
Username,
TimeStamp
FROM
@TempItem
END
GO
So the question for you guys is: is there a possibility to do something along the lines of:
DECLARE @TempCustomer TABLE
(
CustomerId uniqueidentifier,
FirstName nvarchar(100),
LastName nvarchar(100),
Email nvarchar(100)
);
SELECT
CustomerId,
FirstName,
LastName,
Email
INTO
@TempCustomer
FROM
Customer
WHERE
CustomerId = @CustomerId
So that I could reuse this data from memory in other following statements? SQL Server throws a fit with the above statement; however, I don’t want to have to create separate variables and initialize each one of them via a separate SELECT statement against the same table…. UGH!!!
Any suggestions on how to achieve something along the lines without multiple queries against the same table?
You cannot SELECT .. INTO .. a TABLE VARIABLE. The best you can do is create it first, then insert into it. Your 2nd snippet has to be
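The corrected snippet is missing from this copy. A sketch of the "create it first, then insert into it" approach, reusing the names from the question, could look like the following. The @CustomerId declaration is only a placeholder so the batch runs on its own; in the stored procedure it would be the existing parameter:

```sql
DECLARE @CustomerId uniqueidentifier = NEWID();  -- placeholder value for illustration

DECLARE @TempCustomer TABLE
(
    CustomerId uniqueidentifier,
    FirstName  nvarchar(100),
    LastName   nvarchar(100),
    Email      nvarchar(100)
);

-- INSERT ... SELECT works with table variables; SELECT ... INTO does not.
INSERT INTO @TempCustomer (CustomerId, FirstName, LastName, Email)
SELECT CustomerId, FirstName, LastName, Email
FROM Customer
WHERE CustomerId = @CustomerId;
```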
If you wanted to simply assign some variables for later use, you can do them in one shot with something along these lines:
declare @var1 int,@var2 int,@var3 int;
select
@var1 = field1,
@var2 = field2,
@var3 = field3
from
table
where
condition
If that’s the type of thing you’re after
you can do this:
SELECT
CustomerId,
FirstName,
LastName,
Email
INTO #tempCustomer
FROM
Customer
WHERE
CustomerId = @CustomerId
Then later you can query #tempCustomer directly. You don’t need to declare the structure of #tempCustomer.
After having created a temporary table and declared the data types like so:
How do I then insert the relevant data which is already held on a physical table within the database?
To insert all data from all columns, just use this:
Don’t forget to DROP the temporary table after you have finished with it and before you try creating it again:
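The statements from this thread are not preserved here, so this is just an illustrative sketch. #TempTable, OriginalTable and the two columns are hypothetical; adapt the column list to your own definition:

```sql
-- #TempTable, OriginalTable, ID and Name are hypothetical names.
CREATE TABLE #TempTable (ID int, Name nvarchar(100));

-- Copy the rows from the physical table into the temporary table.
INSERT INTO #TempTable (ID, Name)
SELECT ID, Name
FROM OriginalTable;

-- ... work with #TempTable here ...

-- Clean up so the table can be created again later.
DROP TABLE #TempTable;
```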
What I’m trying to do is use more than one CASE WHEN condition for the same column.
Here is my code for the query:
SELECT Url='',
p.ArtNo,
p.[Description],
p.Specification,
CASE
WHEN 1 = 1 or 1 = 1
THEN 1
ELSE 0
END as Qty,
p.NetPrice,
[Status] = 0
FROM Product p (NOLOCK)
However, what I want to do is use more than one WHEN for the same column “qty”.
As in the following code:
There are two formats of CASE expression. You can do CASE with many WHEN as;
Or a Simple CASE expression
Or CASE within CASE as;
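The three examples that belonged to this answer are missing, so here is a sketch of each form. The columns Col1, Col2 and Col3 are made up for illustration and are not part of the question's Product table:

```sql
-- Col1, Col2 and Col3 are hypothetical columns.
SELECT
    -- Searched CASE with several WHEN branches
    CASE
        WHEN p.Col1 = 1 OR p.Col3 = 1 THEN 1
        WHEN p.Col1 = 2 THEN 2
        ELSE 0
    END AS Qty,
    -- Simple CASE: compares one expression against a list of values
    CASE p.Col1
        WHEN 1 THEN 11
        WHEN 2 THEN 21
        ELSE 13
    END AS QtySimple,
    -- CASE nested inside another CASE
    CASE
        WHEN p.Col1 < 2 THEN
            CASE p.Col2 WHEN 'X' THEN 10 ELSE 11 END
        WHEN p.Col1 = 2 THEN 2
        ELSE 0
    END AS QtyNested
FROM Product p;
```

The branches are evaluated in order and the first one that matches wins, so put the most specific conditions first.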
Just use this one; you have to use more WHEN clauses.
You can use the example below of CASE WHEN with multiple conditions.
SELECT
id,stud_name,
CASE
WHEN marks <= 40 THEN 'Bad'
WHEN (marks >= 40 AND
marks <= 100) THEN 'good'
ELSE 'best'
END AS Grade
FROM Result
How do you write a number with two decimal places in SQL Server?
Try this:
Use the Str() function. It takes three arguments (the number, the total number of characters to display, and the number of decimal places to display):
It displays '   12345.679' (3 spaces, the 5 digits 12345, a decimal point, and the three decimal digits 679), for a total of 12 characters with 3 to the right of the decimal point. It rounds if it has to truncate (unless the integer part is too large for the total size, in which case asterisks are displayed instead).
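The calls themselves are missing above. As a sketch, and assuming an input value of 12345.6789 (my guess at a number that produces the output described), the two approaches might look like this:

```sql
-- The input value 12345.6789 is an assumption chosen to match the described output.
SELECT STR(12345.6789, 12, 3) AS padded;                    -- '   12345.679'

-- For exactly two decimal places, casting to a decimal type also works.
SELECT CAST(12345.6789 AS decimal(10, 2)) AS two_places;    -- 12345.68
```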
What is the best way to remove all spaces from a string in SQL Server 2008?
LTRIM(RTRIM(' a b '))
would remove all spaces at the right and left of the string, but I also need to remove the space in the middle.
Simply replace it;
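The one-liner is missing here; a minimal sketch using the string from the question would be:

```sql
-- Removes every space, including the ones in the middle.
SELECT REPLACE(' a b ', ' ', '') AS no_spaces;   -- 'ab'
```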
Edit: Just to clarify, it’s a global replace; there is no need to trim() or worry about multiple spaces for either char or varchar:
create table #t (
c char(8),
v varchar(8))
insert #t (c, v) values
('a a' , 'a a' ),
('a a ' , 'a a ' ),
(' a a' , ' a a' ),
(' a a ', ' a a ')
select
'"' + c + '"' [IN], '"' + replace(c, ' ', '') + '"' [OUT]
from #t
union all select
'"' + v + '"', '"' + replace(v, ' ', '') + '"'
from #t
Result
I would use a REPLACE
If it is an update on a table all you have to do is run this update multiple times until it is affecting 0 rows.
I am running data.bat file with the following lines:
Rem Tis batch file will populate tables
cd\program files\Microsoft SQL Server\MSSQL
osql -U sa -P Password -d MyBusiness -i c:\data.sql
The contents of the data.sql file is:
There are 8 more similar lines for adding records.
When I run this with start > run > cmd > c:\data.bat, I get this error message:
1>2>3>4>5>....<1 row affected>
Msg 8152, Level 16, State 4, Server SP1001, Line 1
string or binary data would be truncated.
<1 row affected>
<1 row affected>
<1 row affected>
<1 row affected>
<1 row affected>
<1 row affected>
Also, I am a newbie obviously, but what do Level # and State # mean, and how do I look up error messages such as the one above: 8152?
From @gmmastros’s answer
Whenever you see the message “string or binary data would be truncated”, think to yourself: the field is NOT big enough to hold my data.
Check the table structure for the customers table. I think you’ll find that the length of one or more fields is NOT big enough to hold the data you are trying to insert. For example, if the Phone field is a varchar(8) field, and you try to put 11 characters in to it, you will get this error.
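One way to check the declared lengths is to query the INFORMATION_SCHEMA.COLUMNS view. This is only a sketch; 'Customers' is my assumption for the table name, so substitute the table your script inserts into:

```sql
-- 'Customers' is an assumed table name.
SELECT COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'Customers';
```

Compare CHARACTER_MAXIMUM_LENGTH against the longest value in each INSERT to spot the column that is too small.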
I had this issue although data length was shorter than the field length. It turned out that the problem was having another log table (for audit trail), filled by a trigger on the main table, where the column size also had to be changed.
In one of the INSERT statements you are attempting to insert a string that is too long into a string (varchar or nvarchar) column.
If it’s not obvious which INSERT is the offender from a mere look at the script, you could count the <1 row affected> lines that occur before the error message. The obtained number plus one gives you the statement number. In your case it seems to be the second INSERT that produces the error.
table1 (id, name)
table2 (id, name)
Query:
Q: What is happening here?
A: Conceptually, we select all rows from table1 and for each row we attempt to find a row in table2 with the same value for the name column. If there is no such row, we just leave the table2 portion of our result empty for that row. Then we constrain our selection by picking only those rows in the result where the matching row does not exist. Finally, we ignore all fields from our result except for the name column (the one we are sure exists, from table1).
While it may not be the most performant method possible in all cases, it should work in basically every database engine ever that attempts to implement ANSI 92 SQL.
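The query being explained is missing from this copy. Going by the description above, it would be along these lines (a sketch, using the table1 and table2 definitions given earlier):

```sql
SELECT t1.name
FROM table1 t1
LEFT JOIN table2 t2 ON t2.name = t1.name   -- keep every table1 row, matched or not
WHERE t2.name IS NULL;                     -- keep only the rows with no match
```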
You can either do
or
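The two alternatives are not shown here. A plausible sketch, assuming they were the usual NOT IN and NOT EXISTS forms, is:

```sql
-- NOT IN: the inner NULL filter is my addition, because a NULL in the
-- subquery result would otherwise make NOT IN return no rows at all.
SELECT name
FROM table1
WHERE name NOT IN (SELECT name FROM table2 WHERE name IS NOT NULL);

-- NOT EXISTS: not affected by NULLs in table2.
SELECT t1.name
FROM table1 t1
WHERE NOT EXISTS (SELECT 1 FROM table2 t2 WHERE t2.name = t1.name);
```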
See this question for 3 techniques to accomplish this
I don’t have enough rep points to vote up the 2nd answer. But I have to disagree with the comments on the top answer. The second answer:
Is FAR more efficient in practice. I don’t know why, but I’m running it against 800k+ records and the difference is tremendous with the advantage given to the 2nd answer posted above. Just my $0.02
I’ve been trying to figure out how I can make a query with MySQL that checks if the value (string $haystack) in a certain column contains certain data (string $needle), like this:
In PHP, the function is called substr($haystack, $needle), so maybe:
Quite simple actually:
The % is a wildcard that matches any sequence of characters (including none). Do note that this can get slow on very large datasets, so if your database grows you’ll need to use fulltext indices.
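The queries are missing from this thread, so here is a minimal MySQL sketch of the LIKE approach, plus INSTR as an alternative. The names my_table and my_column and the literal 'needle' are placeholders:

```sql
-- my_table, my_column and 'needle' are placeholders.
SELECT *
FROM my_table
WHERE my_column LIKE '%needle%';

-- Alternative: INSTR returns the 1-based position of the substring, or 0 if absent.
SELECT *
FROM my_table
WHERE INSTR(my_column, 'needle') > 0;
```

If the needle comes from user input, bind it as a parameter rather than concatenating it into the SQL string.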
Use:
Reference:
I have a big problem with an SQL statement in Oracle. I want to select the TOP 10 records ordered by STORAGE_GB which aren’t in a list from another select statement.
This one works fine for all records:
SELECT DISTINCT
APP_ID,
NAME,
STORAGE_GB,
HISTORY_CREATED,
TO_CHAR(HISTORY_DATE, 'DD.MM.YYYY') AS HISTORY_DATE
FROM HISTORY WHERE
STORAGE_GB IS NOT NULL AND
APP_ID NOT IN (SELECT APP_ID
FROM HISTORY
WHERE TO_CHAR(HISTORY_DATE, 'DD.MM.YYYY') = '06.02.2009')
But when I am adding
I’m getting some kind of “random” records. I think this is because the limit takes place before the ordering.
Does someone have a good solution? The other problem: this query is really slow (10k+ records).
You’ll need to put your current query in a subquery as below:
SELECT * FROM (
SELECT DISTINCT
APP_ID,
NAME,
STORAGE_GB,
HISTORY_CREATED,
TO_CHAR(HISTORY_DATE, 'DD.MM.YYYY') AS HISTORY_DATE
FROM HISTORY WHERE
STORAGE_GB IS NOT NULL AND
APP_ID NOT IN (SELECT APP_ID FROM HISTORY WHERE TO_CHAR(HISTORY_DATE, 'DD.MM.YYYY') ='06.02.2009')
ORDER BY STORAGE_GB DESC )
WHERE ROWNUM <= 10
Oracle assigns ROWNUM to rows as they are selected, before the ORDER BY is applied.
You need to limit the rows after they have been sorted, so a subquery is required. You can also use the RANK() function to get Top-N results.
For performance, try using NOT EXISTS in place of NOT IN. See this for more.
If you are using Oracle 12c, use:
FETCH NEXT N ROWS ONLY
SELECT DISTINCT
APP_ID,
NAME,
STORAGE_GB,
HISTORY_CREATED,
TO_CHAR(HISTORY_DATE, 'DD.MM.YYYY') AS HISTORY_DATE
FROM HISTORY WHERE
STORAGE_GB IS NOT NULL AND
APP_ID NOT IN (SELECT APP_ID FROM HISTORY WHERE TO_CHAR(HISTORY_DATE, 'DD.MM.YYYY') ='06.02.2009')
ORDER BY STORAGE_GB DESC
FETCH NEXT 10 ROWS ONLY
More info: http://docs.oracle.com/javadb/10.5.3.0/ref/rrefsqljoffsetfetch.html
With regards to the poor performance there are any number of things it could be, and it really ought to be a separate question. However, there is one obvious thing that could be a problem:
If HISTORY_DATE really is a date column and if it has an index then this rewrite will perform better:
This is because applying a conversion such as TO_CHAR to the column prevents the use of a regular B-Tree index on it.
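The rewritten predicate is missing here. A sketch of the idea is to convert the literal instead of the column. The range form below is my own assumption, chosen so that rows with a time-of-day component on that date still match:

```sql
-- Compare the DATE column directly; only the literal is converted.
SELECT APP_ID
FROM HISTORY
WHERE HISTORY_DATE >= TO_DATE('06.02.2009', 'DD.MM.YYYY')
  AND HISTORY_DATE <  TO_DATE('06.02.2009', 'DD.MM.YYYY') + 1;
```

This would take the place of the subquery inside the NOT IN (or NOT EXISTS) condition of the original statement.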
I have a string that is up to 3 characters long when it’s first created in SQL Server 2008 R2.
I would like to pad it with leading zeros, so if its original value was ‘1’ then the new value would be ‘001’. Or if its original value was ‘23’ the new value is ‘023’. Or if its original value is ‘124’ then new value is the same as original value.
I am using SQL Server 2008 R2. How would I do this using T-SQL?
If the field is already a string, this will work
If you want nulls to show as ‘000’
It might be an integer – then you would want
As required by the question, this answer only works if the length is <= 3; if you want something larger, you need to change the string constant and the two integer constants to the width needed, e.g. '0000' and VARCHAR(4)),4
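The three expressions this answer refers to are missing. Judging from the closing remark about the string constant and the two integer constants, they were probably of the following shape. Treat this as a sketch; the variables are hypothetical stand-ins for the column:

```sql
DECLARE @s varchar(3) = '1';   -- hypothetical string value
DECLARE @i int = 23;           -- hypothetical integer value

SELECT RIGHT('000' + @s, 3)                     AS padded_string,     -- '001'
       RIGHT('000' + ISNULL(@s, ''), 3)         AS padded_null_safe,  -- '001'; a NULL input would give '000'
       RIGHT('000' + CAST(@i AS varchar(3)), 3) AS padded_int;        -- '023'
```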
Although the question was for SQL Server 2008 R2, in case someone is reading this with version 2012 and above, since then it became much easier with FORMAT.
You can either pass a standard numeric format string or a custom numeric format string as the format argument (thanks to Vadim Ovchinnikov for this hint).
For this question, for example, code like
DECLARE @myInt INT = 1;
-- One way using a standard numeric format string
PRINT FORMAT(@myInt,'D3');
-- Other way using a custom numeric format string
PRINT FORMAT(@myInt,'00#');
outputs
001
001
The safe method:
This has the advantage of returning the string '***' for n < 0 or n > 999, which is a nice and obvious indicator of out-of-bounds input. The other methods listed here will fail silently by truncating the input to a 3-character substring.
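The "safe method" snippet is missing. Since the description mentions asterisks, it was presumably built on STR, which pads with spaces and returns asterisks when the value does not fit. A sketch under that assumption:

```sql
DECLARE @n int = 7;   -- hypothetical input

-- STR(@n, 3) right-aligns the number in 3 characters; spaces become zeros.
SELECT REPLACE(STR(@n, 3), ' ', '0') AS padded;   -- '007'
```

With @n = 1234 the STR call cannot fit the value in 3 characters and yields '***'.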
Which method provides the best performance when removing the time portion from a datetime field in SQL Server?
or
The second method does send a few more bytes either way but that might not be as important as the speed of the conversion.
Both also appear to be very fast, but there might be a difference in speed when dealing with hundreds-of-thousands or more rows?
Also, is it possible that there are even better methods to get rid of the time portion of a datetime in SQL?
Strictly, method a is the least resource intensive:
It was proven less CPU intensive for the same total duration over a million rows by someone with way too much time on their hands: Most efficient way in SQL Server to get date from date+time?
I saw a similar test elsewhere with similar results too.
I prefer the DATEADD/DATEDIFF because:
Edit, Oct 2011
For SQL Server 2008+, you can CAST to date. Or just use the date type so there is no time to remove.
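The snippets for these methods are missing in this copy. As a sketch, the DATEADD/DATEDIFF technique and the CAST to date mentioned above look like this:

```sql
DECLARE @dt datetime = GETDATE();

SELECT DATEADD(day, DATEDIFF(day, 0, @dt), 0) AS midnight_datetime,  -- still a datetime, time set to 00:00
       CAST(@dt AS date)                      AS date_only;          -- SQL Server 2008+
```

The first expression counts whole days since day 0 and adds them back, which drops the time portion without any string conversion.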
Edit, Jan 2012
A worked example of how flexible this is: Need to calculate by rounded time or date figure in sql server
Edit, May 2012
Do not use this in WHERE clauses and the like without thinking: adding a function or CAST to a column invalidates index usage. See number 2 here: http://www.simple-talk.com/sql/t-sql-programming/ten-common-sql-programming-mistakes/
Now, this does have an example of later SQL Server optimiser versions managing CAST to date correctly, but generally it will be a bad idea …
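To illustrate the WHERE clause point, here is a sketch that leaves the column untouched and filters with a half-open range instead. Orders and OrderDate are hypothetical names:

```sql
-- Orders and OrderDate are hypothetical names.
DECLARE @day date = '20120501';

SELECT *
FROM Orders
WHERE OrderDate >= @day
  AND OrderDate <  DATEADD(day, 1, @day);   -- no function applied to the column
```

Because nothing wraps OrderDate, an index on that column can still be used for a seek.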
Edit, Sep 2018, for datetime2
In SQL Server 2008, you can use:
Of course this is an old thread, but to make it complete:
From SQL 2008 you can use DATE datatype so you can simply do:
I’ve seen similar errors on SO, but I don’t find a solution for my problem. I have a SQL query like:
SELECT DISTINCT
a.maxa ,
b.mahuyen ,
a.tenxa ,
b.tenhuyen ,
ISNULL(dkcd.tong, 0) AS tongdkcd
FROM phuongxa a ,
quanhuyen b
LEFT OUTER JOIN ( SELECT maxa ,
COUNT(*) AS tong
FROM khaosat
WHERE CONVERT(DATETIME, ngaylap, 103) BETWEEN 'Sep 1 2011'
AND
'Sep 5 2011'
GROUP BY maxa
) AS dkcd ON dkcd.maxa = a.maxa
WHERE a.maxa <> '99'
AND LEFT(a.maxa, 2) = b.mahuyen
ORDER BY maxa;
When I execute this query, the error result is: The multi-part identifier “a.maxa” could not be bound. Why?
P.S.: If I divide the query into 2 individual queries, it runs OK.
SELECT DISTINCT
a.maxa ,
b.mahuyen ,
a.tenxa ,
b.tenhuyen
FROM phuongxa a ,
quanhuyen b
WHERE a.maxa <> '99'
AND LEFT(a.maxa, 2) = b.mahuyen
ORDER BY maxa;
and
You are mixing implicit joins with explicit joins. That is allowed, but you need to be aware of how to do that properly.
The thing is, explicit joins (the ones that are implemented using the JOIN keyword) take precedence over implicit ones (the ‘comma’ joins, where the join condition is specified in the WHERE clause).
Here’s an outline of your query:
You are probably expecting it to behave like this:
that is, the combination of tables a and b is joined with the table dkcd. In fact, what’s happening is
that is, as you may already have understood, dkcd is joined specifically against b and only b, then the result of the join is combined with a and filtered further with the WHERE clause. In this case, any reference to a in the ON clause is invalid: a is unknown at that point. That is why you are getting the error message.
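The outlines are missing above, so here is a rough sketch of the grouping being described, followed by one way to get the intended left-to-right evaluation. The CROSS JOIN variant and the direct join to khaosat are my own illustration, not the answer's original text:

```sql
-- Written:   FROM a, b LEFT JOIN dkcd ON dkcd.maxa = a.maxa
-- Parsed as: FROM a, (b LEFT JOIN dkcd ON dkcd.maxa = a.maxa)   -- a is not in scope inside the parentheses

-- Replacing the comma with an explicit join makes the joins evaluate left to right,
-- so a is visible in the ON clause.
SELECT a.maxa, b.mahuyen
FROM phuongxa a
CROSS JOIN quanhuyen b
LEFT JOIN khaosat dkcd ON dkcd.maxa = a.maxa
WHERE LEFT(a.maxa, 2) = b.mahuyen;
```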
If I were you, I would probably try to rewrite this query, and one possible solution might be:
SELECT DISTINCT
a.maxa,
b.mahuyen,
a.tenxa,
b.tenhuyen,
ISNULL(dkcd.tong, 0) AS tongdkcd
FROM phuongxa a
INNER JOIN quanhuyen b ON LEFT(a.maxa, 2) = b.mahuyen
LEFT OUTER JOIN (
SELECT
maxa,
COUNT(*) AS tong
FROM khaosat
WHERE CONVERT(datetime, ngaylap, 103) BETWEEN 'Sep 1 2011' AND 'Sep 5 2011'
GROUP BY maxa
) AS dkcd ON dkcd.maxa = a.maxa
WHERE a.maxa <> '99'
ORDER BY a.maxa
Here the tables a and b are joined first, then the result is joined to dkcd. Basically, this is the same query as yours, only using a different syntax for one of the joins, which makes a great difference: the reference a.maxa in dkcd’s join condition is now absolutely valid.
As @Aaron Bertrand has correctly noted, you should probably qualify maxa with a specific alias, probably a, in the ORDER BY clause.
Sometimes this error occurs when you use your schema (dbo) in your query in the wrong way.
For example, if you write:
you will get the error.
In this situation, change it to:
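Both snippets are missing from this answer. A sketch of the kind of mistake it seems to describe, using a hypothetical dbo.product table with a name column, would be:

```sql
-- dbo.product and its name column are hypothetical.
-- Fails: the table has the alias prd, and there is no object called dbo.prd.
-- SELECT dbo.prd.name FROM dbo.product prd;

-- Works: refer to the column through the alias only.
SELECT prd.name
FROM dbo.product prd;
```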
If you have given a table an alias, use the alias instead of the actual name.
For example:
SELECT
A.name,A.date
FROM [LoginInfo].[dbo].[TableA] as A
join
[LoginInfo].[dbo].[TableB] as B
on [LoginInfo].[dbo].[TableA].name=[LoginInfo].[dbo].[TableB].name;
change that to
SELECT
A.name,A.date
FROM [LoginInfo].[dbo].[TableA] as A
join
[LoginInfo].[dbo].[TableB] as B
on A.name=B.name;
I need the month+year from the datetime in SQL Server like ‘Jan 2008’. I’m grouping the query by month, year. I’ve searched and found functions like datepart, convert, etc., but none of them seem useful for this. Am I missing something here? Is there a function for this?
If you mean you want them back as a string, in that format;
Beginning with SQL Server 2012, you can use:
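The expressions for both answers are missing. As a sketch: CONVERT with style 100 starts with the abbreviated month name and style 120 starts with the four-digit year, so taking the first four characters of each yields the requested format. The variable @d is a hypothetical stand-in for the datetime column:

```sql
DECLARE @d datetime = '20080115';   -- hypothetical value

-- Any SQL Server version: 'Jan ' + '2008'
SELECT CONVERT(char(4), @d, 100) + CONVERT(char(4), @d, 120) AS month_year;   -- 'Jan 2008'

-- SQL Server 2012+: FORMAT with a custom pattern (month name depends on the session culture)
SELECT FORMAT(@d, 'MMM yyyy') AS month_year_2012;
```

When grouping, it is usually safer to GROUP BY YEAR(@d) and MONTH(@d) and build the display string only in the select list.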
For the various popular database systems, how do you list all the columns in a table?
For MySQL, use:
This also works for Oracle as long as you are using SQL*Plus, or Oracle’s SQL Developer.
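The command itself is missing here. Given the note about SQL*Plus and SQL Developer, it was most likely DESCRIBE; a minimal sketch with a placeholder table name:

```sql
-- my_table is a placeholder table name.
DESCRIBE my_table;
```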
For Oracle (PL/SQL)
For MySQL