Tuesday, September 11, 2012

filter, spurious tuples and distinct

In the Django ORM, we use filter() to query our models, and double underscores (__) to follow relationships between them, much like a JOIN in SQL (in fact, the lookup gets translated into JOIN statements). However, when a filter follows a relationship that can match several related rows, the join introduces 'spurious tuples': the same object comes back once per matching related row. This may or may not be a big deal, but it is good to know. Calling distinct() on the queryset eliminates those extra tuples.

For example, imagine a simple model, consisting of a person class, where each person may have more than one phone number.

Suppose we add data for a person with one phone number. We can then use filter() with a double-underscore lookup to get the people with a phone number in area '770', and that person is returned once.

However, if the same person has TWO phones with that area code, we get TWO rows for the person. Basically, the filter function with the double underscores is doing something equivalent to a plain SQL join between people and phones, where it should be selecting each matching person only once.

Adding a call to distinct() fixes it. Running our query both ways, with and without the adjustment for Django's idiosyncrasy, shows the different results returned: two rows for the person without distinct(), one row with it. For completeness, the only other step needed to reproduce this is adding the second phone number to the same person.
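The SQL side of this can be reproduced without Django at all. Below is a minimal sketch using Python's built-in sqlite3 module; the table names, column names and sample data are assumptions made up for illustration, not the schema Django would generate. With hypothetical models named Person and Phone, the two queries correspond roughly to Person.objects.filter(phone__number__startswith='770') and the same call followed by .distinct().

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE phone (
        id INTEGER PRIMARY KEY,
        person_id INTEGER REFERENCES person(id),
        number TEXT
    );
    INSERT INTO person (id, name) VALUES (1, 'Alice');
    -- The same person has TWO phones in area 770.
    INSERT INTO phone (person_id, number) VALUES (1, '770-555-0100');
    INSERT INTO phone (person_id, number) VALUES (1, '770-555-0199');
""")

# A plain join yields one row per matching phone, so the person repeats.
JOIN_SQL = """
    SELECT person.id, person.name
    FROM person
    JOIN phone ON phone.person_id = person.id
    WHERE phone.number LIKE '770%'
"""

# The same join with DISTINCT collapses the duplicate person rows.
DISTINCT_SQL = """
    SELECT DISTINCT person.id, person.name
    FROM person
    JOIN phone ON phone.person_id = person.id
    WHERE phone.number LIKE '770%'
"""

plain_rows = conn.execute(JOIN_SQL).fetchall()
distinct_rows = conn.execute(DISTINCT_SQL).fetchall()

print(plain_rows)     # [(1, 'Alice'), (1, 'Alice')]
print(distinct_rows)  # [(1, 'Alice')]
```

The duplicate only appears because both phone rows satisfy the WHERE clause; a person with a single matching phone comes back once either way.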
