Skip to main content

Some subquery magic with MySQL

As part of a Rails project I had to gather up some data from a single table, and calculate various stats on the data

Problem was - had to group data by a field, then calculate aggregates in various ways not possible with a single GROUP BY query

Using the SQL cookbook tips, and some pondering, I learned that I could create a bunch of 'fake' tables from the same table using subqueries, then join these tables in a way that would yield a single resultset

An example:

Let us say you have a table of person, their test points whether they passed the test attempt, like (two people a and b tried a test at various times) :












NamePointsPassed
a10 1
a25 0
b21 1
a28 0
b23 1
a35 1
b31 1
a35 1
b41 1


You want a result like





Name Avg AgeNo. of attempts'passed'
a 30.3 5
b 25.1 4
(caution: These numbers are probably wrong just too lazy to fix them up!)

As you can see, it is not possible with a single query, since grouping by name and aggregating for average will give you column 2 of the results, but how about column 3? column 3 is the no. of records which indicate a 1 for passed for each Name

To achieve this in a single query, do the following


Create a subquery to hold the average age results

SELECT avg(Age) as avgage from table a GROUP BY Name


Create a subquery to hold the no. of passed points per Name

SELECT count(0) as passed FROM table WHERE Passed=1 GROUP BY Name

Now join the two subqueries
SELECT t.Name, q1.avgage, q2.passed FROM table t
LEFT JOIN
( SELECT avg(Age) as avgage from table a GROUP BY Name ) q1
ON q1.Name=t.Name

LEFT JOIN
( SELECT count(0) as passed FROM table WHERE Passed=1 GROUP BY Name ) q2
ON q2.Name=t.Name

ORDER BY Name

the left join is not obvious in this case, but you could end up with missed matchse in more complex queries (then it is useful to use IFNULL() MySQL function to detect nulls and set values other than null in those cases)

An example of a real query is :


SELECT xx.label, xx.avg_time, xx.min_time, xx.max_time, xx.std_time,
IFNULL(yy.cnt_s,0) AS s,IFNULL(zz.cnt_f,0) AS f,
(IFNULL(yy.cnt_s,0) + IFNULL(zz.cnt_f,0)) AS totalsamples
FROM
# table 1, averages
( SELECT label,
AVG(responseTime) AS avg_time,
MIN(responseTime) AS min_time,
MAX(responseTime) AS max_time,
STD(responseTime) AS std_time
FROM rawdatas
WHERE jmetersession_id = 1
GROUP BY label ) xx
LEFT JOIN
# table 2, successes
(SELECT b.label, COUNT(0) AS cnt_s FROM rawdatas b
WHERE b.jmetersession_id = 1 AND b.success=1 GROUP BY b.label ) yy
ON yy.label=xx.label
# table 3, failures
LEFT JOIN
(SELECT label, COUNT(0) AS cnt_f FROM rawdatas
WHERE jmetersession_id = 1 AND success=0 GROUP BY label ) zz
ON zz.label=xx.label


Hope you find this useful!

Comments