Hi Team,

I have very good configuration server dedicated only for PostgresSql. Server is 16 Core 122 GB RAM on Linux machine.

select version();

PostgreSQL 9.3.4 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.2 20131212 (Red Hat 4.8.2-7), 64-bit

I have set the following parameter

show segment_size; (Set during installation)
32 GB

show block_size; (Set during installation)
16 KB

show shared_buffers;
30 GB

show temp_buffers;
8 GB

show effective_cache_size;
60 GB

show work_mem;
30 GB

show maintenance_work_mem;
8 GB

I have the fact table with 128M rows and join with small table with 100 rows.

explain analyze
SELECT s.store_code,
       s.store_description,
       SUM(sale_net_val) Sales_value,
       SUM(sale_tot_qty) Qty,
       COUNT(DISTINCT CUSTOMER_CODE_TY)
  FROM arc_dw_ca_etl.dim_store s,
       arc_dw_ca_etl.AGG_MCSP a
 WHERE s.store_key = a.store_key
   AND a.date_key BETWEEN 20110126 AND 20120126
 GROUP BY s.store_code,
          s.store_description;

QUERY PLAN

    GroupAggregate  (cost=6204854.15..6663869.67 rows=112 width=43) (actual time=127019.448..277891.979 rows=112 loops=1)
      ->  Sort  (cost=6204854.15..6281356.55 rows=30600960 width=43) (actual time=125818.967..131955.596 rows=30578360 loops=1)
            Sort Key: s.store_code, s.store_description
            Sort Method: quicksort  Memory: 3905194kB
            ->  Hash Join  (cost=614628.93..2400072.53 rows=30600960 width=43) (actual time=2552.168..27891.699 rows=30578360 loops=1)
                  Hash Cond: (a.store_key = s.store_key)
                  ->  Bitmap Heap Scan on agg_mly_cust_str_prod a  (cost=614624.41..1979304.81 rows=30600960 width=25) (actual time=2552.075..10305.247 rows=30578360 loops=1)
                        Recheck Cond: ((date_key >= 20110126) AND (date_key <= 20120126))
                        ->  Bitmap Index Scan on idx_amcsp_date_store_cust_code_ty  (cost=0.00..606974.17 rows=30600960 width=0) (actual time=2489.971..2489.971 rows=30578360 loops=1)
                              Index Cond: ((date_key >= 20110126) AND (date_key <= 20120126))
                  ->  Hash  (cost=3.12..3.12 rows=112 width=26) (actual time=0.073..0.073 rows=112 loops=1)
                        Buckets: 1024  Batches: 1  Memory Usage: 7kB
                        ->  Seq Scan on dim_store s  (cost=0.00..3.12 rows=112 width=26) (actual time=0.004..0.029 rows=112 loops=1)

    Total runtime: 278084.789 ms

If i have understand properly the data fetching is not taking much time(taking 10 sec's)[considered only upper limits].

But for sort operation(131 Sec's) and GroupAggregate(277 Sec's) it is taking most of the time.

Since the server is very high end I am expecting it should not take more than 10 Sec's to get the result.

Can someone help to resolve this performance issue?

Regards

--Noor

asked Jun 25 '14 at 07:54

Noor%20Ahmed's gravatar image

Noor Ahmed
1112

edited Jul 08 '14 at 11:42

Markus%20Winand's gravatar image

Markus Winand ♦♦
93651120

Is re-writing the query an option or is it generated by some tool?

(Jul 08 '14 at 11:54) Markus Winand ♦♦