Thursday, January 14, 2010

Derived Table, Subquery, Correlated Subquery, etc.

I was recently attempting to explain to someone the difference between derived tables and subqueries. I didn't do a great job and came to the conclusion that I wasn't completely clear on the terminology. I don't like not knowing things so I did a little research and this is what I found . . .

A derived table is not what I thought it was. I always referred to a derived table as being a query in a FROM clause. For example:

    SELECT
        id AS EmployeeId,
        name AS EmployeeFullName,
        COUNT(*) AS PaychecksIssuedCount
    FROM
        (SELECT
             id,
             name
          FROM
              employee
          WHERE
               employee_type_cd = 101) emp
          INNER JOIN paycheck ps
              on emp.id = ps.emp_id

Apparently this is actually called a subquery and can be referred to as a subquery in a FROM clause. It's a bit of symantics but a derived table is actually the result of a table subquery.

For some reason people that use SQL Server (like me) tend to refer to this as a derived table. I guess I'll have to stop doing that now.

As for correlated subqueries, these can be found in the list of projected columns as well as the WHERE and HAVING clauses. For example:

SELECT
    id,
    name
FROM
    employee emp
WHERE
    years_employed =  
        (SELECT
            AVG(years_employed)
         FROM
            employee emp_avg_years
         WHERE
            emp_avg_years.business_unit = emp.business_unit)

As you can see, the subquery has a WHERE clause that refers (correlates) to the outer/parent query.

So there you have it. Stop saying derived tables! Most likely you're misusing it the way I have ever since I learned about the concept (of subqueries - not derived tables ;)). Or maybe you're not as anal as I am. However, if you've read this far I'd say you are.

4 comments:

  1. You convinced me, subqueries from now on!

    ReplyDelete
  2. Your referring to a derived table was correct. A derived table, although it may look like a sub query in most respects, is different. A derived table has a distinct usage. Mainly, it is a result set that is "derived" at run time that can be joined on. You can't join to a sub query. Derived tables (and the term itself) have been around for decades and haven't changed. So, referring to them as such is not incorrect. In fact, if you want to be "anal" about it, please call them derived tables! A derived table is a sub query with a huge distinction, it can be treated as an in-line table that can be joined to other tables, or even other derived tables. So, when you use the term derived table, people "in the know" will know what you mean. Using "sub query" to describe a derived table doesn't convey the same meaning. A CTE is yet another kind of derived table and it too is not a sub query; but that's another discussion.

    ReplyDelete
  3. Reading Hoffner's Modern Database Management book which states that a derived table is a subquery that is used in the FROM clause (as opposed to the WHERE or other clauses). It does not have to be joined on the outer query to be a derived table, but it can be.

    ReplyDelete
  4. Thanks a ton for this post!

    This helped me understand the concept as well:

    http://www.programmerinterview.com/index.php/database-sql/derived-table-vs-subquery/

    ReplyDelete