Sunday, 28 June 2015

What is the performance difference in MySQL relational division (IN AND instead of IN OR) implementations?

Because MySQL does not have a built-in relational division operator, programmers must implement their own. Two leading implementations can be found in this answer here.

For posterity I'll list them below:

Using GROUP BY/HAVING


SELECT t.documentid
FROM TABLE t
WHERE t.termid IN (1,2,3)
GROUP BY t.documentid
HAVING COUNT(DISTINCT t.termid) = 3

The caveat is that you have to use HAVING COUNT(DISTINCT ...), because a documentid with duplicate rows for one termid (for example, termid 2 stored twice) would otherwise be a false positive. The COUNT also has to equal the number of termid values in the IN clause.
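To make the duplicate caveat concrete, here is a minimal sketch that runs the GROUP BY/HAVING query with and without DISTINCT. It assumes a hypothetical table named doc_terms with made-up rows, and it uses Python's sqlite3 module as a stand-in for MySQL only so the example is self-contained; the query shape is the same.

```python
import sqlite3

# Hypothetical table name and sample rows; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE doc_terms (documentid INTEGER, termid INTEGER)")
conn.executemany(
    "INSERT INTO doc_terms VALUES (?, ?)",
    [
        (10, 1), (10, 2), (10, 3),  # document 10 has all three terms
        (20, 1), (20, 2), (20, 2),  # document 20 repeats termid 2 and lacks termid 3
    ],
)

query = """
SELECT documentid
FROM doc_terms
WHERE termid IN (1, 2, 3)
GROUP BY documentid
HAVING COUNT({expr}) = 3
ORDER BY documentid
"""

# Counting rows lets the duplicate (20, 2) row stand in for the missing termid 3.
without_distinct = [row[0] for row in conn.execute(query.format(expr="termid"))]
# Counting distinct termid values requires three different terms per document.
with_distinct = [row[0] for row in conn.execute(query.format(expr="DISTINCT termid"))]

print(without_distinct)  # [10, 20]
print(with_distinct)     # [10]
```

Document 20 slips through the non-DISTINCT version because its three matching rows cover only two different terms.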

Using JOINs


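-- Each alias must match on documentid so that all terms belong to the same document
SELECT DISTINCT t.documentid
FROM TABLE t
JOIN TABLE x ON x.documentid = t.documentid
              AND x.termid = 1
JOIN TABLE y ON y.documentid = t.documentid
              AND y.termid = 2
JOIN TABLE z ON z.documentid = t.documentid
              AND z.termid = 3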

But this one can be a pain to maintain when the criteria change a lot, because every additional termid needs another JOIN.
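One way to cope with changing criteria is to generate the JOIN query from the list of terms. The sketch below is only an illustration: build_join_query and the doc_terms table are hypothetical names, each alias is joined on documentid so that all terms must belong to the same document, and sqlite3 again stands in for MySQL so the example can run on its own.

```python
import sqlite3

def build_join_query(table, term_ids):
    """Hypothetical helper: one self-join per required termid.

    The table name is interpolated directly, so it must come from trusted code.
    """
    if not term_ids:
        raise ValueError("at least one termid is required")
    joins = [
        f"JOIN {table} t{i} ON t{i}.documentid = t0.documentid AND t{i}.termid = ?"
        for i in range(1, len(term_ids))
    ]
    sql = (
        f"SELECT DISTINCT t0.documentid FROM {table} t0 "
        + " ".join(joins)
        + " WHERE t0.termid = ? ORDER BY t0.documentid"
    )
    # Placeholders appear in the JOIN clauses first, then in WHERE.
    params = list(term_ids[1:]) + [term_ids[0]]
    return sql, params

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE doc_terms (documentid INTEGER, termid INTEGER)")
conn.executemany(
    "INSERT INTO doc_terms VALUES (?, ?)",
    [(10, 1), (10, 2), (10, 3), (20, 1), (20, 2), (30, 1), (30, 2), (30, 3), (30, 4)],
)

sql, params = build_join_query("doc_terms", [1, 2, 3])
all_three = [row[0] for row in conn.execute(sql, params)]

sql, params = build_join_query("doc_terms", [1, 2])
first_two = [row[0] for row in conn.execute(sql, params)]

print(all_three)  # [10, 30]
print(first_two)  # [10, 20, 30]
```

The generated statement still grows by one JOIN per term, so this only hides the maintenance cost; it says nothing about which technique is faster.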

Of these two implementation techniques, which one would offer better performance?
