What is the difference between inner and outer join? Explain with example.
Inner Join
Inner join is the most common type of Join which is used to combine the rows from two tables and create a result set containing only such records that are present in both the tables based on the joining condition (predicate).
Inner join returns rows when there is at least one match in both tables
If none of the record matches between two tables, then INNER JOIN will return a NULL set. Below is an example of INNER JOIN and the resulting set.
SELECT dept.name DEPARTMENT, emp.name EMPLOYEE FROM DEPT dept, EMPLOYEE emp WHERE emp.dept_id = dept.id
Department | Employee |
---|---|
HR | Inno |
HR | Privy |
Engineering | Robo |
Engineering | Hash |
Engineering | Anno |
Engineering | Darl |
Marketing | Pete |
Marketing | Meme |
Sales | Tomiti |
Sales | Bhuti |
Outer Join
Outer Join can be full outer or single outer
Outer Join, on the other hand, will return matching rows from both tables as well as any unmatched rows from one or both the tables (based on whether it is single outer or full outer join respectively).
Notice in our record set that there is no employee in the department 5 (Logistics). Because of this if we perform inner join, then Department 5 does not appear in the above result. However in the below query we perform an outer join (dept left outer join emp), and we can see this department.
SELECT dept.name DEPARTMENT, emp.name EMPLOYEE FROM DEPT dept, EMPLOYEE emp WHERE dept.id = emp.dept_id (+)
Department | Employee |
---|---|
HR | Inno |
HR | Privy |
Engineering | Robo |
Engineering | Hash |
Engineering | Anno |
Engineering | Darl |
Marketing | Pete |
Marketing | Meme |
Sales | Tomiti |
Sales | Bhuti |
Logistics |
The (+) sign on the emp side of the predicate indicates that emp is the outer table here. The above SQL can be alternatively written as below (will yield the same result as above):
SELECT dept.name DEPARTMENT, emp.name EMPLOYEE FROM DEPT dept LEFT OUTER JOIN EMPLOYEE emp ON dept.id = emp.dept_id
SQL LEFT JOIN Keyword
The LEFT JOIN keyword returns all rows from the left table (table1), with the matching rows in the right table (table2). The result is NULL in the right side when there is no match.
SQL LEFT JOIN Syntax
SELECT column_name(s)
FROM table1
LEFT JOIN table2
ON table1.column_name=table2.column_name;
or:
SELECT column_name(s)
FROM table1
LEFT OUTER JOIN table2
ON table1.column_name=table2.column_name;
PS! In some databases LEFT JOIN is called LEFT OUTER JOIN.
SELECT column_name(s)
FROM table1
LEFT JOIN table2
ON table1.column_name=table2.column_name;
FROM table1
LEFT JOIN table2
ON table1.column_name=table2.column_name;
SELECT column_name(s)
FROM table1
LEFT OUTER JOIN table2
ON table1.column_name=table2.column_name;
FROM table1
LEFT OUTER JOIN table2
ON table1.column_name=table2.column_name;
SQL RIGHT JOIN Keyword
The RIGHT JOIN keyword returns all rows from the right table (table2), with the matching rows in the left table (table1). The result is NULL in the left side when there is no match.
SQL RIGHT JOIN Syntax
SELECT column_name(s)
FROM table1
RIGHT JOIN table2
ON table1.column_name=table2.column_name;
or:
SELECT column_name(s)
FROM table1
RIGHT OUTER JOIN table2
ON table1.column_name=table2.column_name;
PS! In some databases RIGHT JOIN is called RIGHT OUTER JOIN.
SELECT column_name(s)
FROM table1
RIGHT JOIN table2
ON table1.column_name=table2.column_name;
FROM table1
RIGHT JOIN table2
ON table1.column_name=table2.column_name;
SELECT column_name(s)
FROM table1
RIGHT OUTER JOIN table2
ON table1.column_name=table2.column_name;
FROM table1
RIGHT OUTER JOIN table2
ON table1.column_name=table2.column_name;
What is the difference between JOIN and UNION?
SQL JOIN allows us to “lookup” records on other table based on the given conditions between two tables. For example, if we have the department ID of each employee, then we can use this department ID of the employee table to join with the department ID of department table to lookup department names.
UNION operation allows us to add 2 similar data sets to create resulting data set that contains all the data from the source data sets. Union does not require any condition for joining. For example, if you have 2 employee tables with same structure, you can UNION them to create one result set that will contain all the employees from both of the tables.
SELECT * FROM EMP1 UNION SELECT * FROM EMP2;
What is the difference between UNION and UNION ALL?
UNION and UNION ALL both unify for add two structurally similar data sets, but UNION operation returns only the unique records from the resulting data set whereas UNION ALL will return all the rows, even if one or more rows are duplicated to each other.
In the following example, I am choosing exactly the same employee from the emp table and performing UNION and UNION ALL. Check the difference in the result.
SELECT * FROM EMPLOYEE WHERE ID = 5 UNION ALL SELECT * FROM EMPLOYEE WHERE ID = 5
ID | MGR_ID | DEPT_ID | NAME | SAL | DOJ |
---|---|---|---|---|---|
5.0 | 2.0 | 2.0 | Anno | 80.0 | 01-Feb-2012 |
5.0 | 2.0 | 2.0 | Anno | 80.0 | 01-Feb-2012 |
SELECT * FROM EMPLOYEE WHERE ID = 5 UNION SELECT * FROM EMPLOYEE WHERE ID = 5
ID | MGR_ID | DEPT_ID | NAME | SAL | DOJ |
---|---|---|---|---|---|
5.0 | 2.0 | 2.0 | Anno | 80.0 | 01-Feb-2012 |
What is the difference between WHERE clause and HAVING clause?
WHERE and HAVING both filters out records based on one or more conditions. The difference is, WHERE clause can only be applied on a static non-aggregated column whereas we will need to use HAVING for aggregated columns.
To understand this, consider this example.
Suppose we want to see only those departments where department ID is greater than 3. There is no aggregation operation and the condition needs to be applied on a static field. We will use WHERE clause here:
Suppose we want to see only those departments where department ID is greater than 3. There is no aggregation operation and the condition needs to be applied on a static field. We will use WHERE clause here:
SELECT * FROM DEPT WHERE ID > 3
ID | NAME |
---|---|
4 | Sales |
5 | Logistics |
Next, suppose we want to see only those Departments where Average salary is greater than 80. Here the condition is associated with a non-static aggregated information which is “average of salary”. We will need to use HAVING clause here:
SELECT dept.name DEPARTMENT, avg(emp.sal) AVG_SAL FROM DEPT dept, EMPLOYEE emp WHERE dept.id = emp.dept_id (+) GROUP BY dept.name HAVING AVG(emp.sal) > 80
DEPARTMENT | AVG_SAL |
---|---|
Engineering | 90 |
As you see above, there is only one department (Engineering) where average salary of employees is greater than 80.
What is the difference among UNION, MINUS and INTERSECT?
UNION combines the results from 2 tables and eliminates duplicate records from the result set.
MINUS operator when used between 2 tables, gives us all the rows from the first table except the rows which are present in the second table.
INTERSECT operator returns us only the matching or common rows between 2 result sets.
To understand these operators, let’s see some examples. We will use two different queries to extract data from our emp table and then we will perform UNION, MINUS and INTERSECT operations on these two sets of data.
UNION
SELECT * FROM EMPLOYEE WHERE ID = 5 UNION SELECT * FROM EMPLOYEE WHERE ID = 6
ID | MGR_ID | DEPT_ID | NAME | SAL | DOJ |
---|---|---|---|---|---|
5 | 2 | 2.0 | Anno | 80.0 | 01-Feb-2012 |
6 | 2 | 2.0 | Darl | 80.0 | 11-Feb-2012 |
MINUS
SELECT * FROM EMPLOYEE MINUS SELECT * FROM EMPLOYEE WHERE ID > 2
ID | MGR_ID | DEPT_ID | NAME | SAL | DOJ |
---|---|---|---|---|---|
1 | 2 | Hash | 100.0 | 01-Jan-2012 | |
2 | 1 | 2 | Robo | 100.0 | 01-Jan-2012 |
INTERSECT
SELECT * FROM EMPLOYEE WHERE ID IN (2, 3, 5) INTERSECT SELECT * FROM EMPLOYEE WHERE ID IN (1, 2, 4, 5)
ID | MGR_ID | DEPT_ID | NAME | SAL | DOJ |
---|---|---|---|---|---|
5 | 2 | 2 | Anno | 80.0 | 01-Feb-2012 |
2 | 1 | 2 | Robo | 100.0 | 01-Jan-2012 |
What is Self Join and why is it required?
Self Join is the act of joining one table with itself.
Self Join is often very useful to convert a hierarchical structure into a flat structure
In our employee table example above, we have kept the manager ID of each employee in the same row as that of the employee. This is an example of how a hierarchy (in this case employee-manager hierarchy) is stored in the RDBMS table. Now, suppose if we need to print out the names of the manager of each employee right beside the employee, we can use self join. See the example below:
SELECT e.name EMPLOYEE, m.name MANAGER FROM EMPLOYEE e, EMPLOYEE m WHERE e.mgr_id = m.id (+)
EMPLOYEE | MANAGER |
---|---|
Pete | Hash |
Darl | Hash |
Inno | Hash |
Robo | Hash |
Tomiti | Robo |
Anno | Robo |
Privy | Robo |
Meme | Pete |
Bhuti | Tomiti |
Hash |
The only reason we have performed a left outer join here (instead of INNER JOIN) is we have one employee in this table without a manager (employee ID = 1). If we perform inner join, this employee will not show-up.
How can we transpose a table using SQL (changing rows to column or vice-versa) ?
The usual way to do it in SQL is to use CASE statement or DECODE statement.
How to generate row number in SQL Without ROWNUM
Generating a row number – that is a running sequence of numbers for each row is not easy using plain SQL. In fact, the method I am going to show below is not very generic either. This method only works if there is at least one unique column in the table. This method will also work if there is no single unique column, but collection of columns that is unique. Anyway, here is the query:
SELECT name, sal, (SELECT COUNT(*) FROM EMPLOYEE i WHERE o.name >= i.name) row_num FROM EMPLOYEE o order by row_num
NAME | SAL | ROW_NUM |
---|---|---|
Anno | 80 | 1 |
Bhuti | 60 | 2 |
Darl | 80 | 3 |
Hash | 100 | 4 |
Inno | 50 | 5 |
Meme | 60 | 6 |
Pete | 70 | 7 |
Privy | 50 | 8 |
Robo | 100 | 9 |
Tomiti | 70 | 10 |
The column that is used in the row number generation logic is called “sort key”. Here sort key is “name” column. For this technique to work, the sort key needs to be unique. We have chosen the column “name” because this column happened to be unique in our Employee table. If it was not unique but some other collection of columns was, then we could have used those columns as our sort key (by concatenating those columns to form a single sort key).
Also notice how the rows are sorted in the result set. We have done an explicit sorting on the row_num column, which gives us all the row numbers in the sorted order. But notice that name column is also sorted (which is probably the reason why this column is referred as sort-key). If you want to change the order of the sorting from ascending to descending, you will need to change “>=” sign to “<=” in the query.
As I said before, this method is not very generic. This is why many databases already implement other methods to achieve this. For example, in Oracle database, every SQL result set contains a hidden column called ROWNUM. We can just explicitly select ROWNUM to get sequence numbers.
How to select first 5 records from a table?
This question, often asked in many interviews, does not make any sense to me. The problem here is how do you define which record is first and which is second. Which record is retrieved first from the database is not deterministic. It depends on many uncontrollable factors such as how database works at that moment of execution etc. So the question should really be – “how to select any 5 records from the table?” But whatever it is, here is the solution:
In Oracle,
SELECT * FROM EMP WHERE ROWNUM <= 5;
In SQL Server,
SELECT TOP 5 * FROM EMP;
Generic solution,
I believe a generic solution can be devised for this problem if and only if there exists at least one distinct column in the table. For example, in our EMP table ID is distinct. We can use that distinct column in the below way to come up with a generic solution of this question that does not require database specific functions such as ROWNUM, TOP etc.
SELECT name FROM EMPLOYEE o WHERE (SELECT count(*) FROM EMPLOYEE i WHERE i.name < o.name) < 5
name |
---|
Inno |
Anno |
Darl |
Meme |
Bhuti |
I have taken “name” column in the above example since “name” is happened to be unique in this table. I could very well take ID column as well.
In this example, if the chosen column was not distinct, we would have got more than 5 records returned in our output.
Do you have a better solution to this problem? If yes, post your solution in the comment.
What is the difference between ROWNUM pseudo column and ROW_NUMBER() function?
ROWNUM is a pseudo column present in Oracle database returned result set prior to ORDER BY being evaluated. So ORDER BY ROWNUM does not work.
ROW_NUMBER() is an analytical function which is used in conjunction to OVER() clause wherein we can specify ORDER BY and also PARTITION BY columns.
Suppose if you want to generate the row numbers in the order of ascending employee salaries for example, ROWNUM will not work. But you may use ROW_NUMBER() OVER() like shown below:
SELECT name, sal, row_number() over(order by sal desc) rownum_by_sal FROM EMPLOYEE o
name | Sal | ROWNUM_BY_SAL |
---|---|---|
Hash | 100 | 1 |
Robo | 100 | 2 |
Anno | 80 | 3 |
Darl | 80 | 4 |
Tomiti | 70 | 5 |
Pete | 70 | 6 |
Bhuti | 60 | 7 |
Meme | 60 | 8 |
Inno | 50 | 9 |
Privy | 50 | 10 |
What are the differences among ROWNUM, RANK and DENSE_RANK?
ROW_NUMBER assigns contiguous, unique numbers from 1.. N to a result set.
RANK does not assign unique numbers—nor does it assign contiguous numbers. If two records tie for second place, no record will be assigned the 3rd rank as no one came in third, according to RANK. See below:
SELECT name, sal, rank() over(order by sal desc) rank_by_sal FROM EMPLOYEE o
name | Sal | RANK_BY_SAL |
---|---|---|
Hash | 100 | 1 |
Robo | 100 | 1 |
Anno | 80 | 3 |
Darl | 80 | 3 |
Tomiti | 70 | 5 |
Pete | 70 | 5 |
Bhuti | 60 | 7 |
Meme | 60 | 7 |
Inno | 50 | 9 |
Privy | 50 | 9 |
DENSE_RANK, like RANK, does not assign unique numbers, but it does assign contiguous numbers. Even though two records tied for second place, there is a third-place record. See below:
SELECT name, sal, dense_rank() over(order by sal desc) dense_rank_by_sal FROM EMPLOYEE o
name | Sal | DENSE_RANK_BY_SAL |
---|---|---|
Hash | 100 | 1 |
Robo | 100 | 1 |
Anno | 80 | 2 |
Darl | 80 | 2 |
Tomiti | 70 | 3 |
Pete | 70 | 3 |
Bhuti | 60 | 4 |
Meme | 60 | 4 |
Inno | 50 | 5 |
Privy | 50 | 5 |
Describe Oracle database's physical and logical structure ?
Physical : Data files, Redo Log files, Control file.
Logical : Tables, Views, Tablespaces, etc.
Can a view be updated/inserted/deleted? If Yes - under what conditions ?
A View can be updated/deleted/inserted if it has only one base table if the view is based on columns from one or more tables then insert, update and delete is not possible.
What is explain plan and how is it used ?
The EXPLAIN PLAN command is a tool to tune SQL statements. To use it you must have an explain_table generated in the user you are running the explain plan for. This is created using the utlxplan.sql script. Once the explain plan table exists you run the explain plan command giving as its argument the SQL statement to be explained. The explain_plan table is then queried to see the execution plan of the statement. Explain plans can also be run using tkprof.
The EXPLAIN PLAN command is a tool to tune SQL statements. To use it you must have an explain_table generated in the user you are running the explain plan for. This is created using the utlxplan.sql script. Once the explain plan table exists you run the explain plan command giving as its argument the SQL statement to be explained. The explain_plan table is then queried to see the execution plan of the statement. Explain plans can also be run using tkprof.
What is a mutating table error and how can you get around it ?
This happens with triggers. It occurs because the trigger is trying to update a row it is currently using. The usual fix involves either use of views or temporary tables so the database is selecting from one while updating the other.
Describe the use of %ROWTYPE and %TYPE in PL/SQL ?
%ROWTYPE allows you to associate a variable with an entire table row. The %TYPE associates a variable with a single column type.
Explain the difference between a data block, an extent and a segment ?
A data block is the smallest unit of logical storage for a database object. As objects grow they take chunks of additional storage that are composed of contiguous data blocks. These groupings of contiguous data blocks are called extents. All the extents that an object takes when grouped together are considered the segment of the database object.
|
Explain Database Pessimistic and Optimistic locking?
Transactional isolation is usually implemented by locking whatever is accessed in a transaction. There are two different approaches to transactional locking:
- Pessimistic locking
- Optimistic locking
The disadvantage of pessimistic locking is that a resource is locked from the time it is first accessed in a transaction until the transaction is finished, making it inaccessible to other transactions during that time. If most transactions simply look at the resource and never change it, an exclusive lock may be overkill as it may cause lock contention, and optimistic locking may be a better approach. With pessimistic locking, locks are applied in a fail-safe way. In the banking application example, an account is locked as soon as it is accessed in a transaction. Attempts to use the account in other transactions while it is locked will either result in the other process being delayed until the account lock is released, or that the process transaction will be rolled back. The lock exists until the transaction has either been committed or rolled back.
With optimistic locking, a resource is not actually locked when it is first is accessed by a transaction. Instead, the state of the resource at the time when it would have been locked with the pessimistic locking approach is saved. Other transactions are able to concurrently access to the resource and the possibility of conflicting changes is possible. At commit time, when the resource is about to be updated in persistent storage, the state of the resource is read from storage again and compared to the state that was saved when the resource was first accessed in the transaction. If the two states differ, a conflicting update was made, and the transaction will be rolled back. This is very important for application performance to use Optimistic locking.
Difference in Inner join and Outer join?
Inner Joins
An inner join (sometimes called a simple join) is a join of two or more tables that returns only those rows that satisfy the join condition.
Outer Joins
An outer join extends the result of a simple join. An outer join returns all rows that satisfy the join condition and also returns some or all of those rows from one table for which no rows from the other satisfy the join condition.
- To write a query that performs an outer join of tables A and B and returns all rows from A (a left outer join), use the LEFT [OUTER] JOIN syntax in the FROM clause, or apply the outer join operator (+) to all columns of B in the join condition in the WHERE clause. For all rows in A that have no matching rows in B, Oracle Database returns null for any select list expressions containing columns of B.
- To write a query that performs an outer join of tables A and B and returns all rows from B (a right outer join), use the RIGHT [OUTER] JOIN syntax in the FROM clause, or apply the outer join operator (+) to all columns of A in the join condition in the WHERE clause. For all rows in B that have no matching rows in A, Oracle returns null for any select list expressions containing columns of A.
To write a query that performs an outer join and returns all rows from A and B, extended with nulls if they do not satisfy the join condition (a full outer join), use the FULL [OUTER] JOIN syntax in the FROM clause.
Why Group by is only used with having clause and aggregative functions example?
HAVING is used to perform an action on groups created by GROUP BY similar to that of the WHERE clause on rows in a basic SQL statement. The WHERE clause limits the rows evaluated. The HAVING clause limits the grouped rows returned.
- AVG: returns the variance or variability of an expression.
- COUNT: returns the number of rows returned by a query.
- FIRST: returns the first value from an ordered sequence.
- LAST: returns the last value from an ordered sequence.
- MAX: returns the maximum value of a column.
- MIN: returns the minimum value of a column.
- STDDEV: returns the standard deviation of a set of numbers.
- SUM: returns the sum or total the values of a column.
- VARIANCE: returns the variance or variability of an expression
HAVING can be used only with the SELECT statement. HAVING is typically used in a GROUP BY clause. When GROUP BY is not used, HAVING behaves like a WHERE clause.
A HAVING clause is like a WHERE clause, but applies only to groups as a whole, whereas the WHERE clause applies to individual rows. A query can contain both a WHERE clause and a HAVING clause. The WHERE clause is applied first to the individual rows in the tables . Only the rows that meet the conditions in the WHERE clause are grouped. The HAVING clause is then applied to the rows in the result set. Only the groups that meet the HAVING conditions appear in the query output. You can apply a HAVING clause only to columns that also appear in the GROUP BY clause or in an aggregate function. Example of HAVING and WHERE in one query:
SELECT titles.pub_id, AVG( titles.price) FROM titles INNER JOIN publishers ON titles.pub_id = publishers.pub_id WHERE publishers.state = 'CA' GROUP BY titles.pub_id HAVING AVG(titles.price) > 10;
Sometimes you can specify the same set of rows using either a WHERE clause or a HAVING clause. In such cases, one method is not more or less efficient than the other. The optimizer always automatically analyzes each statement you enter and selects an efficient means of executing it. It is best to use the syntax that most clearly describes the desired result. In general, that means eliminating undesired rows in earlier clauses.
What is a Stored Procedure?
A stored procedure is a named group of SQL statements that have been previously created and stored in the server database. Stored procedures accept input parameters so that a single procedure can be used over the network by several clients using different input data. And when the procedure is modified, all clients automatically get the new version. Stored procedures reduce network traffic and improve performance. Stored procedures can be used to help ensure the integrity of the database.
e.g. sp_helpdb, sp_renamedb, sp_depends etc.
What is a Trigger?
A trigger is a SQL procedure that initiates an action when an event (INSERT, DELETE or UPDATE) occurs. Triggers are stored in and managed by the DBMS. Triggers are used to maintain the referential integrity of data by changing the data in a systematic fashion. A trigger cannot be called or executed; DBMS automatically fires the trigger as a result of a data modification to the associated table. Triggers can be considered to be similar to stored procedures in that both consist of procedural logic that is stored at the database level. Stored procedures, however, are not event-drive and are not attached to a specific table as triggers are. Stored procedures are explicitly executed by invoking a CALL to the procedure while triggers are implicitly executed. In addition, triggers can also execute stored procedures.
Nested Trigger: A trigger can also contain INSERT, UPDATE and DELETE logic within itself; so when the trigger is fired because of data modification, it can also cause another data modification, thereby firing another trigger. A trigger that contains data modification logic within itself is called a nested trigger.
What are the Different Types of Triggers?
There are two types of Triggers.
1) DML Trigger
There are two types of DML Triggers
1.Instead of Trigger
Instead of Triggers are fired in place of the triggering action such as an insert, update, or delete.
Instead of Triggers are fired in place of the triggering action such as an insert, update, or delete.
2. After Trigger
After triggers execute following the triggering action, such as an insert, update, or delete.
After triggers execute following the triggering action, such as an insert, update, or delete.
2) DDL Trigger
This type of trigger is fired against Drop Table, Create Table, Alter Table or Login events. DDL Triggers are always After Triggers.
What is a View?
A simple view can be thought of as a subset of a table. It can be used for retrieving data as well as updating or deleting rows. Rows updated or deleted in the view are updated or deleted in the table the view was created with. It should also be noted that as data in the original table changes, so does the data in the view as views are the way to look at parts of the original table. The results of using a view are not permanently stored in the database. The data accessed through a view is actually constructed using standard T-SQL select command and can come from one to many different base tables or even other views.
What is an Index?
An index is a physical structure containing pointers to the data. Indices are created in an existing table to locate rows more quickly and efficiently. It is possible to create an index on one or more columns of a table, and each index is given a name. The users cannot see the indexes; they are just used to speed up queries. Effective indexes are one of the best ways to improve performance in a database application. A table scan happens when there is no index available to help a query. In a table scan, the SQL Server examines every row in the table to satisfy the query results. Table scans are sometimes unavoidable, but on large tables, scans have a terrific impact on performance.
What is a Cursor?
A cursor is a database object used by applications to manipulate data in a set on a row-by-row basis, instead of the typical SQL commands that operate on all the rows in the set at one time.
In order to work with a cursor, we need to perform some steps in the following order:
- Declare cursor
- Open cursor
- Fetch row from the cursor
- Process fetched row
- Close cursor
- Deallocate cursor.
What is the Difference between a Function and a Stored Procedure?
UDF can be used in the SQL statements anywhere in the WHERE/HAVING/SELECT section, whereas Stored procedures cannot be. UDFs that return tables can be treated as another rowset. This can be used in JOINs with other tables. Inline UDF’s can be thought of as views that take parameters and can be used in JOINs and other Rowset operations.
What are Different Types of Join?
Cross Join
A cross join that does not have a WHERE clause produces the Cartesian product of the tables involved in the join. The size of a Cartesian product result set is the number of rows in the first table multiplied by the number of rows in the second table. The common example is when company wants to combine each product with a pricing table to analyze each product at each price.
Inner Join
A join that displays only the rows that have a match in both joined tables is known as inner Join. This is the default type of join in the Query and View Designer.
Outer Join
- A join that includes rows even if they do not have related rows in the joined table is an Outer Join. You can create three different outer join to specify the unmatched rows to be included:
- Left Outer Join: In Left Outer Join, all the rows in the first-named table, i.e. “left” table, which appears leftmost in the JOIN clause, are included. Unmatched rows in the right table do not appear.
- Right Outer Join: In Right Outer Join, all the rows in the second-named table, i.e. “right” table, which appears rightmost in the JOIN clause are included. Unmatched rows in the left table are not included.
- Full Outer Join: In Full Outer Join, all the rows in all joined tables are included, whether they are matched or not.
Self Join
This is a particular case when one table joins to itself with one or two aliases to avoid confusion. A self join can be of any type, as long as the joined tables are the same. A self join is rather unique in that it involves a relationship with only one table. The common example is when company has a hierarchal reporting structure whereby one member of staff reports to another. Self Join can be Outer Join or Inner Join.
What is the Correct Order of the Logical Query Processing Phases?
The correct order of the Logical Query Processing Phases is as follows:
1. FROM
2. ON
3. OUTER
4. WHERE
5. GROUP BY
6. CUBE | ROLLUP
7. HAVING
8. SELECT
9. DISTINCT
10. TOP
11. ORDER BY
2. ON
3. OUTER
4. WHERE
5. GROUP BY
6. CUBE | ROLLUP
7. HAVING
8. SELECT
9. DISTINCT
10. TOP
11. ORDER BY
Difference in clustered and unclustered index?
Non-clustered
The data is present in arbitrary order, but the logical ordering is specified by the index. The data rows may be spread throughout the table regardless of the value of the indexed column or expression. The non-clustered index tree contains the index keys in sorted order, with the leaf level of the index containing the pointer to the record (page and the row number in the data page in page-organized engines; row offset in file-organized engines). In a non-clustered index: The physical order of the rows is not the same as the index order. Typically created on non-primary key columns used in JOIN, WHERE, and ORDER BY clauses. There can be more than one non-clustered index on a database table.
Clustered
Clustering alters the data block into a certain distinct order to match the index, resulting in the row data being stored in order. Therefore, only one clustered index can be created on a given database table. Clustered indices can greatly increase overall speed of retrieval, but usually only where the data is accessed sequentially in the same or reverse order of the clustered index, or when a range of items is selected. Since the physical records are in this sort order on disk, the next row item in the sequence is immediately before or after the last one, and so fewer data block reads are required. The primary feature of a clustered index is therefore the ordering of the physical data rows in accordance with the index blocks that point to them. Some databases separate the data and index blocks into separate files, others put two completely different data blocks within the same physical file(s).
What are the Different Index Configurations a Table can have?
A table can have one of the following index configurations:
- No indexes
- A clustered index
- A clustered index and many non-clustered indexes
- A non-clustered index
- Many non-clustered indexes
What’s the Difference between a Primary Key and a Unique Key?
Both primary key and unique key enforce uniqueness of the column on which they are defined. But by default, the primary key creates a clustered index on the column, whereas unique key creates a non-clustered index by default. Another major difference is that primary key doesn’t allow NULLs, but unique key allows one NULL only.
What is Difference between DELETE and TRUNCATE Commands?
Delete command removes the rows from a table on the basis of the condition that we provide with a WHERE clause. Truncate will actually remove all the rows from a table, and there will be no data in the table after we run the truncate command.
TRUNCATE
- TRUNCATE is faster and uses fewer system and transaction log resources than DELETE. (Read all the points below)
- TRUNCATE removes the data by deallocating the data pages used to store the table’s data, and only the page deallocations are recorded in the transaction log.
- TRUNCATE removes all the rows from a table, but the table structure, its columns, constraints, indexes and so on remains. The counter used by an identity for new rows is reset to the seed for the column.
- You cannot use TRUNCATE TABLE on a table referenced by a FOREIGN KEY constraint.
- Using T-SQL – TRUNCATE cannot be rolled back unless it is used in TRANSACTION. OR TRUNCATE can be rolled back when used with BEGIN … END TRANSACTION using T-SQL.
- TRUNCATE is a DDL Command.
- TRUNCATE resets the identity of the table.
DELETE
- DELETE removes rows one at a time and records an entry in the transaction log for each deleted row.
- DELETE does not reset Identity property of the table.
- DELETE can be used with or without a WHERE clause
- DELETE activates Triggers if defined on table.
- DELETE can be rolled back.
- DELETE is DML Command.
- DELETE does not reset the identity of the table.
What are Different Types of Locks?
- Shared Locks: Used for operations that do not change or update data (read-only operations), such as a SELECT statement.
- Update Locks: Used on resources that can be updated. It prevents a common form of deadlock that occurs when multiple sessions are reading, locking, and potentially updating resources later.
- Exclusive Locks: Used for data-modification operations, such as INSERT, UPDATE, or DELETE. It ensures that multiple updates cannot be made to the same resource at the same time.
What are Pessimistic Lock and Optimistic Lock?
Optimistic Locking is a strategy where you read a record, take note of a version number and check that the version hasn’t changed before you write the record back. If the record is dirty (i.e. different version to yours), then you abort the transaction and the user can re-start it.
Pessimistic Locking is when you lock the record for your exclusive use until you have finished with it. It has much better integrity than optimistic locking but requires you to be careful with your application design to avoid Deadlocks.
When is the use of UPDATE_STATISTICS command?
This command is basically used when a large amount of data is processed. If a large amount of deletions, modifications or Bulk Copy into the tables has occurred, it has to update the indexes to take these changes into account. UPDATE_STATISTICS updates the indexes on these tables accordingly.
What is the Difference between a HAVING clause and a WHERE clause?
They specify a search condition for a group or an aggregate. But the difference is that HAVING can be used only with the SELECT statement. HAVING is typically used in a GROUP BY clause. When GROUP BY is not used, HAVING behaves like a WHERE clause. Having Clause is basically used only with the GROUP BY function in a query, whereas WHERE Clause is applied to each row before they are part of the GROUP BY function in a query.
What is CHECK Constraint?
A CHECK constraint is used to limit the values that can be placed in a column. The check constraints are used to enforce domain integrity.
What is NOT NULL Constraint?
A NOT NULL constraint enforces that the column will not accept null values. The not null constraints are used to enforce domain integrity, as the check constraints.
What is the difference between UNION and UNION ALL?
UNION
The UNION command is used to select related information from two tables, much like the JOIN command. However, when using the UNION command all selected columns need to be of the same data type. With UNION, only distinct values are selected.
The UNION command is used to select related information from two tables, much like the JOIN command. However, when using the UNION command all selected columns need to be of the same data type. With UNION, only distinct values are selected.
UNION ALL
The UNION ALL command is equal to the UNION command, except that UNION ALL selects all values.
The UNION ALL command is equal to the UNION command, except that UNION ALL selects all values.
The difference between UNION and UNION ALL is that UNION ALL will not eliminate duplicate rows, instead it just pulls all rows from all the tables fitting your query specifics and combines them into a table.
What is B-Tree?
The database server uses a B-tree structure to organize index information. B-Tree generally has following types of index pages or nodes:
- Root node: A root node contains node pointers to only one branch node.
- Branch nodes: A branch node contains pointers to leaf nodes or other branch nodes, which can be two or more.
- Leaf nodes: A leaf node contains index items and horizontal pointers to other leaf nodes, which can be many.
What are the Advantages of Using Stored Procedures?
- Stored procedure can reduced network traffic and latency, boosting application performance.
- Stored procedure execution plans can be reused; they staying cached in SQL Server’s memory, reducing server overhead.
- Stored procedures help promote code reuse.
- Stored procedures can encapsulate logic. You can change stored procedure code without affecting clients.
- Stored procedures provide better security to your data.
What is BCP? When is it Used?
BCP or BulkCopy is a tool used to copy huge amounts of data from tables and views. BCP does not copy the complete structures from source to destination. BULK INSERT command helps to import a data file into a database table or view in a user-specified format.
What is SQL Injection? How to Protect Against SQL Injection Attack?
SQL injection is an attack in which malicious code is inserted into strings that are later passed to an instance of SQL Server for parsing and execution. Any procedure that constructs SQL statements should be reviewed for injection vulnerabilities because SQL Server will execute all syntactically valid queries that it receives. Even parameterized data can be manipulated by a skilled and determined attacker.
Here are few methods which can be used to protect again SQL Injection attack:
- Use Type-Safe SQL Parameters
- Use Parameterized Input with Stored Procedures
- Use the Parameters Collection with Dynamic SQL
- Filtering Input parameters
- Use the escape character in LIKE clause
- Wrapping Parameters with QUOTENAME() and REPLACE()
What is CHECKPOINT Process in the SQL Server?
CHECKPOINT process writes all dirty pages for the current database to disk. Dirty pages are data pages that have been entered into the buffer cache and modified, but not yet written to disk.
What special Oracle feature allows you to specify how the cost based system treats a SQL statement?
The COST based system allows the use of HINTs to control the optimizer path selection. If they can give some example hints such as FIRST ROWS, ALL ROWS, USING INDEX, STAR, even better.
What is a Hint?
Hints are options and strong suggestions specified for enforcement by the SQL Server query processor on DML statements. The hints override any execution plan the query optimizer might select for a query
There are three different types of hints. Let us understand the basics of each of them separately.
Join Hint
This hint is used when more than one table is used in a query. Two or more tables can be joined using different types of joins. This hint forces the type of join algorithm that is used. Joins can be used in SELECT, UPDATE and DELETE statements.
Query Hint
This hint is used when certain kind of logic has to be applied to a whole query. Any hint used in the query is applied to the complete query as opposed to a part of it. There is no way to specify that only a certain part of a query should be used with the hint. After any query, the OPTION clause is specified to apply the logic to this query. A query always has any of the following statements: SELECT, UPDATE, DELETE, INSERT or MERGE (SQL 2K8); and this hint can be applied to all of them.
Example:
Example:
SELECT /*+ INDEX(i dcf_vol_prospect_ids_idx)*/
/*+ LEADING(i vol) */
/*+ ALL_ROWS */
i.id_number,
...
FROM i_table i
JOIN vol_table vol on vol.id_number = i.id_number
JOIN to_a_bunch_of_other_tables...
WHERE i.solicitor_id = '123'
AND vol.solicitable_ind = 1;
What is Database Partitioning?
Partitioning addresses key issues in supporting very large tables and indexes by letting you decompose them into smaller and more manageable pieces called partitions. SQL queries and DML statements do not need to be modified in order to access partitioned tables. However, after partitions are defined, DDL statements can access and manipulate individuals partitions rather than entire tables or indexes. This is how partitioning can simplify the manageability of large database objects. Also, partitioning is entirely transparent to applications.
Each partition of a table or index must have the same logical attributes, such as column names, datatypes, and constraints, but each partition can have separate physical attributes such as pctfree, pctused, and tablespaces.
Partitioning is useful for many different types of applications, particularly applications that manage large volumes of data. OLTP systems often benefit from improvements in manageability and availability, while data warehousing systems benefit from performance and manageability.
Oracle provides the following partitioning methods:
create table Book (
ID number not null,
ID number not null,
TITLE varchar2,
PUBID char,
PUBDATE date,
constraint lab6_pk primary key(ID))
Partition by range (pubdate)(
Partition p1 values less than (to_date('01-JAN-2000','DD-MON-YYYY')),
Partition p1 values less than (to_date('01-JAN-2000','DD-MON-YYYY')),
Partition p2 values less than (to_date('01-JAN-2010','DD-MON-YYYY')),
Partition p3 values less than (MAXVALUE)
)
You would insert and select data just the same way that you would for a non-partitioned table
INSERT INTO Book( id, title, pubid, pubdate )VALUES( 1, 'Something', 'FOO', sysdate );
SELECT rowid, id, title, pubid, pubdat FROM Book;
No comments:
Post a Comment