I'm sure it's a simple question and I suppose it has been asked many times, but I just can't figure it out from other answers, sorry. I use recent versions of PostgreSQL and MySQL. I need to select the single most recent post for each author. The issue arises when there is more than one post from the same author with the same date: the returned result set then contains all such posts.

All the following queries give the same result (which one is the most efficient, by the way?):

SELECT p1.* ... LEFT JOIN posts p2 ON p1.author_id = p2.author_id AND p1.date < p2.date ...

... ON p1.author_id = p2.author_id AND p1.date = p2.max_date

... ) p1 INNER JOIN posts p2 USING (author_id, date)

How should I modify these queries to return exactly one post per author?

Thanks, both links provide answers to my question with some exception.

If your aim is to have queries with maximum efficiency, none of the above queries is really the best. Not always, at least.

Efficiency depends on many different things: the specific DBMS, the specific version (different versions bring different optimizer improvements and different available syntax), the types of the columns, the indexes available, the size of the tables and the distribution of values, the hardware the server is running on, the configuration settings, etc. You should always test various ways of writing the queries, on your tables, with the sizes and distribution you expect to have in production, with your hardware and configuration settings, to decide which rewritings of the queries should be kept.

This specific kind of query is often called greatest-n-per-group (there is even a tag for it!). Under certain assumptions, one of the many ways to write such queries is often quite efficient in both MySQL and PostgreSQL. It uses a LATERAL join in Postgres, which is available in versions 9.3+ (CROSS/OUTER APPLY in SQL Server lingo), and a simulation of this join in MySQL. The assumption is that the number of authors (the attribute we group by) is small compared to the number of posts (the table where we apply the group by). It's also best if there is an index or a table to find all the distinct author_id values, and an additional index on the posts table for the group by.

This solution to the greatest-n-per-group problem also matches your request about ties, as it always returns one result per group. If you want to be precise about which one (of the tied) will be returned, the ORDER BY in the subquery can be modified (to ORDER BY pi.date DESC, pi.id DESC or ORDER BY pi.date DESC, a.name, for example).

The useful index is on posts (author_id, date, id) for MySQL and/or on posts (author_id, date DESC) for Postgres.

Needless to say again, but before using any of the above, they should be tested in your environment and cross-tested against all the many other versions/rewritings of the query.

In Postgres, for example, the DISTINCT ON syntax can be used in versions older than 9.3. The resulting query is more compact than the LATERAL one and might be more efficient under different data distributions. Query: SELECT DISTINCT ON (author_id) p. ...

I went to my favourite MySQL "tips and tricks" site here and went to the common queries link and looked for the Top N per group section. The great thing about this site is that it tells you how to do stuff in MySQL for all versions - well, going back at least to MySQL 5.5 - and if you're still running that, well. I came up with the following, adapted from above (all of the DDL, DML and SQL below is available on the fiddle here). I used the DDL and DML from ... kudos to him (and +1): CREATE TABLE item ...

MySQL allows the use of user variables, which are a godsend when you don't have capabilities such as the ROW_NUMBER() window function, which would have made this query trivial. I would strongly urge you to upgrade to version 8; it has many other goodies - CTEs, CHECK constraints. Anyway, I'll demonstrate the steps, partly to explain them to you, and partly to explain them to myself! :-)

SELECT ... := item_symbol ... ORDER BY item_symbol, source_date DESC, price DESC

Result columns: item_symbol, price, source_date, my_rank

So, we have the items ('A', 'B') ordered by date DESC (most recent first) with the price. Note that 12 lines of that query could be replaced by one ROW_NUMBER() function line!

So, now we wrap that in a query, pulling out those results whose my_rank value is ...

... := ... := ... + 1 ...
ORDER BY item_symbol, source_date DESC - change this as required
ORDER BY item_symbol, source_date DESC, price DESC - in case of ties!

Result columns: item_symbol, price, source_date, my_rank
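The tie-breaking idea discussed above (ordering by date and then by a unique column such as id) can be tried out in a small, self-contained sketch. It makes several assumptions that are not in the original thread: the posts table has a unique id column, the sample rows are invented, and SQLite (through Python's sqlite3 module) stands in for MySQL or PostgreSQL. The first query extends the LEFT JOIN anti-join from the question so that a post with the same date but a higher id also counts as "later". The second is a ROW_NUMBER() version, which only runs if the bundled SQLite is 3.25 or newer.

```python
import sqlite3

# Hypothetical schema and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (
        id        INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL,
        date      TEXT    NOT NULL,
        title     TEXT    NOT NULL
    );
    INSERT INTO posts (id, author_id, date, title) VALUES
        (1, 1, '2020-01-01', 'author 1, older post'),
        (2, 1, '2020-02-01', 'author 1, tied date, lower id'),
        (3, 1, '2020-02-01', 'author 1, tied date, higher id'),
        (4, 2, '2020-01-15', 'author 2, only post');
""")

# Anti-join: keep p1 only when no "later" post p2 exists for the same author.
# "Later" means a newer date, or the same date with a higher id (tie-breaker).
ANTI_JOIN = """
    SELECT p1.id, p1.author_id, p1.date
    FROM posts p1
    LEFT JOIN posts p2
           ON p1.author_id = p2.author_id
          AND (p1.date < p2.date
               OR (p1.date = p2.date AND p1.id < p2.id))
    WHERE p2.id IS NULL
    ORDER BY p1.author_id
"""

# Window-function version: rank posts per author, newest first, id as tie-breaker.
ROW_NUMBER = """
    SELECT id, author_id, date
    FROM (
        SELECT p.*,
               ROW_NUMBER() OVER (PARTITION BY author_id
                                  ORDER BY date DESC, id DESC) AS my_rank
        FROM posts p
    ) AS ranked
    WHERE my_rank = 1
    ORDER BY author_id
"""

anti_join_rows = conn.execute(ANTI_JOIN).fetchall()
print(anti_join_rows)

# Window functions need SQLite 3.25+; skip the comparison on older builds.
window_rows = None
if sqlite3.sqlite_version_info >= (3, 25, 0):
    window_rows = conn.execute(ROW_NUMBER).fetchall()
    print(window_rows == anti_join_rows)
```

Both queries return one row per author because id is unique, so no two posts can tie on (date, id). As the thread stresses, which form is faster depends on the DBMS, the indexes and the data distribution, so this sketch only shows that the results agree and says nothing about performance.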