Gj’s SQL blog

April 19, 2009

The power of Views

Filed under: postgresql, views — Grzegorz Jaskiewicz @ 11:52 am

So, views.

What are they useful for?  how can we utilize such a power of views ?

Lets first get into how views are implemented in PostgreSQL, and any other professional DBE (Oracle, DB2, MicrosoftSQL). But I will focus specifically on Postgresql, of course.

So, views are just queries, with given name. They produce some results, but that’s not what’s the most important bit. The most importantly, they are a query.

Not getting too much into PostgreSQL internals, it will “compile” such a query into internal structure, and keep it like that under name you gave it.

Every single time view is used, and views are always used as part of a query, even if the query is “SELECT * FROM myview”, it is still a query – right?

Your query will be parsed, and represented by a structure in postgresql. If view is used, it will be simply merged into that structure.

You might ask, why ?

Well, the reason here is simple. That whole structure, than undergoes planning and optymalization. Think: C/C++ macros.

Personally, I am trying to find any commonly used parts of queries. It is very easy to find such, if you materialize your databases at least a bit (and I hope you do, but that’s a story for another entry).

Let’s say, we have a very simple database, where we keep customer’s account, a list of items customer purchased, and every item has specific properties.

So, again – to simplify.



CREATE TABLE customers( id bigserial, name text not null );

CREATE TABLE items_purchased( id bigserial, customerId bigint  not null, 
                           itemId bigint not null, 
                           purchase_time timestamp not null default current_timestamp );

CREATE TABLE items( id bigserial, name text not null );

Now, in application(s) using such database, one of the common tasks you’ll find is connecting items with customers, or the other way around.

That’s a three way JOIN (plus, on top of it probably many others).

For instance, you might in one place, want to list last 10 items purchased by client:

1. SELECT ip.id, ip.purchase_time, i.id AS item_Id, i.name AS item_name
            FROM customers c
            JOIN items_purchased ip ON ip.customerId = c.id
            JOIN items i            ON i.id          = ip.itemid
                    WHERE c.name = 'fred'
                      ORDER BY ip.purchase_time DESC
                      LIMIT 10;

Well, some others will prefer to get customer c.id from sub-query, personally – I prefer JOINs. For one, they are much more efficient, than sub-query returning more than one result – but for consistency reasons, all_joins rule apply here 🙂

The other thing you might want to do, is report 10 customers that lately purchased the item:

2. SELECT c.name, c.id, ip.id AS transactionId, ip.purchase_time
            FROM items i
            JOIN items_purchased ip ON ip.itemid = i.id
            JOIN customers c        ON c.id          = ip.customerId
                    WHERE i.name = 'keyboardX'
                      ORDER BY ip.purchase_time DESC
                      LIMIT 10;

Third example, might be 10 other items bought recently by someone who also bought item X.

This is only an example, but lets keep it real, and this time – lets add few items to our tables. Note however, that I will not create indices, nor any constraints – for clarity reasons. Plus the amount of data…:

insert into customers(name)
          values('fred'),('bob'), ('joe'), ('kate');
insert into items(name)
          values('keyboard'), ('desk'), ('keyboardX'), ('bookPc pro'),
                ('colshite'), ('jumper'), ('couch'), ('penX'), 
                ('transparent doors'),('gj''s blog'), ('boots'), 
                ('magazine Y'), ('mouse');

Now, lets create purchases. Again, power of postgresql allows us to do it easily. The result will be random, so results that follow will be different on your server:

INSERT INTO items_purchased(customerId, itemId)

SELECT  (random()*3)::int+1, (random()*12)::int+1

      FROM generate_series(1, 13*4*12);

UPDATE items_purchased SET purchase_time = 
  purchase_time - ((random()*1000)::int || ' seconds')::interval;

Now, some items were purchased multiple times by customers, and that’s alright.
We want now list 10 items purchased by other customers in our shop, to offer other possible choice of interest to customer browsing catalog. Say our customer is looking at desk.

First natural thing, is to get list of purchases, with their last time of purchase, number of purchases, etc:

SELECT COUNT(1) AS nr_of_purchases, max(purchase_time) AS max_purchase_time, itemId 
             FROM items_purchased GROUP BY itemId;
Now, if we want to base our criteria on number of purchases, we can do it like that:
SELECT itemId
   FROM (
     SELECT COUNT(1) AS nr_of_purchases, 
            max(ip.purchase_time) AS max_purchase_time, ip.itemId
              FROM items_purchased ip
              JOIN customers c ON ip.customerId = c.id
              JOIN items i ON ip.itemId = i.id
                  WHERE c.name <> 'bob'
                  GROUP BY itemId
    ) f
ORDER BY f.nr_of_purchases DESC
  LIMIT 10;

Brilliant, aye ? But, now we want to have a choice as well, and the other option is to base our suggestion on recent purchase time AND number of purchases. Easy, right ? But that’s another query. And if you want to lay it out nicely in code, to make it readable (and I love to)  – it takes quite few lines each time. Another thing, what if you need to add another element to the sub-query ? What if you need to add yet another option ? What if something doesn’t work, and you want to debug it.

Well, divide and conquer. And the best way to do it, and use power of postgreSql – is to create a VIEW based on the ‘core’, or commonly used element of query. Here is another reason why JOINs make more sense, than subquery. You will be able to use WHERE on all tables used in your view, which isn’t going to be possible if you use sub-query.

Also, remember to create views on most simple version of query, or part of query. That is, remover any unnecessary WHEREs, etc.

So view in for our two first queries, will look like that:

CREATE VIEW customersItems AS (
  SELECT c.name AS customer_name, c.id AS customer_id, ip.id AS transaction_Id, ip.purchase_time
         i.id AS item_id, i.name AS item_name
            FROM items i
            JOIN items_purchased ip ON ip.itemid = i.id
            JOIN customers c        ON c.id      = ip.customerId
);

Simple, isn’t it. Yet, it allows you to rewrite first query into something really simple, and yet perform as good as original:

SELECT transaction_id, purchase_time, item_id, item_name
 FROM customersItems  
   WHERE customer_name = 'fred' 
    ORDER BY purchase_time DESC                 
     LIMIT 10;

Notice however, that you can use the same view to grab name of customers who bought item X too. And it will be as powerful and fast as you would expect it.

The optimization of query, and right order will be choose by planner, based on many criteria – so you don’t have to worry about it!

Now, what about the ‘most recent’ query ? Well, it is simple as well. Here’s a view that I would create:

CREATE VIEW recent_purchases AS
(
  SELECT COUNT(1) AS nr_of_purchases, max(ip.purchase_time) AS max_purchase_time, ip.itemId
    FROM items_purchased ip
    JOIN customers c ON ip.customerId = c.id
    JOIN items i ON ip.itemId = i.id
      GROUP BY itemId
);

Easy, isn’t!

I hope that wasn’t too long, and hopefully you can see the power of views.
Please comment on, I am new to blogging, and not a very good writer – so any comments are appreciated.
There’s still quite few other subjects, that I would like to share my view on with you – so stay tuned.

(btw, I am starting to hate wysiwyg editor here, it seriously looses track of styles after few pages).
Advertisements

2 Comments »

  1. Very nice blog. 🙂

    Comment by a — April 19, 2009 @ 3:35 pm

    • thanks 🙂

      Comment by gryzman — April 19, 2009 @ 4:46 pm


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Create a free website or blog at WordPress.com.

%d bloggers like this: