hi!
The only thing that comes to my mind is to manually implement something like HierarchyID (trigger) and then search using an indexed LIKE
.
In your particular case, there is a constraint making parent + sortorder
unique (even for the top level nodes). That means that the sortorder
of the hierarchy along is enough to make an entry unique.
The "hierarchy" built be the trigger does therefore only need to concatenate the sortorder
. That string can then server as LIKE
filter (because of the uniqueness) and for the ORDER BY
.
Assuming PostgreSQL:
create table tree (
id int not null primary key,
parent_id int,
sortorder int not null,
hierarchy varchar(1000000),
foreign key (parent_id) references tree(id) on delete cascade
);
create unique index tree_unique_idx1
on tree(parent_id, sortorder) where parent_id is not null;
create unique index tree_unique_idx2 on tree(sortorder) where parent_id is null;
create index tree_hierarchy on tree(hierarchy varchar_pattern_ops);
create or replace function getHierarchy() returns trigger as $$
declare
h varchar;
begin
IF (TG_OP = 'INSERT' AND NEW.hierarchy is not null) THEN
RAISE EXCEPTION 'Hierarchy is set by a trigger, just leave it NULL';
END IF;
IF (TG_OP = 'UPDATE' AND (NEW.ID <> OLD.ID OR NEW.SORTORDER <> OLD.SORTORDER)) THEN
RAISE EXCEPTION 'holy shit. skipping that cases for this sample';
END IF;
IF (NEW.SORTORDER < 0) THEN
RAISE EXCEPTION 'negative sortorder not supported';
END IF;
select into h coalesce(hierarchy || '/' , '') from tree where id = NEW.parent_id;
NEW.hierarchy := coalesce(h, '') || lpad(to_hex(NEW.sortorder), 8, '0');
return NEW;
end
$$ language plpgsql;
create trigger setHierarchy
before insert or update
on tree for each row execute procedure getHierarchy();
So, that's the easy part ;)
Now, some test-data (lower the amount as you like):
insert into tree values (1, null, 1); -- root node
insert into tree values (2, 1, 1);
insert into tree values (3, 1, 3);
insert into tree values (4, 1, 2);
insert into tree values (5, 1, 4);
insert into tree values (6, 3, 1);
insert into tree values (7, 3, 3);
insert into tree values (8, 3, 2);
insert into tree select generate_series, floor(generate_series * 0.5), generate_series from generate_series(9, 100000);
The query is now very straight, for example:
explain analyze
select *
from tree
where hierarchy like '00000001/00000002/00000009/00000012/00000024/00000048/00000090%'
order by hierarchy using ~<~;
------- Simplified:
Index Scan using tree_hierarchy on tree (actual time=0.036..1.979 rows=1023 loops=1)
Index Cond: (((hierarchy)::text ~>=~ '00000001/[...]'::text)
AND ((hierarchy)::text ~<~ '00000001/[...]'::text))
Filter: ((hierarchy)::text ~~ '00000001/[...]%'::text)
Note that there is no Sort operation. Fetching the hierarchy pattern is easy using the PK:
select hierarchy from tree where id = 144;
But now, all the issues with that:
- sub-selecting the pattern doesn't work. Well, it works, from a functional perspective, but it will not use the index on hierarchy. The reason is that PostgreSQL doesn't know that the returned
LIKE
expression has a selective prefix. So, it assumes that it doesn't has one (e.g., like it would start with '%') and performs an full table scan (Seq Scan). That's yet another case where using bind parameters (and similar constructs) prevents proper index usage. Other databases might behave different.
- you can save space by "compressing" the hierarchy. e.g. by using a higher numeric base (I just took hex, because
to_hex
was at hand).
- The tree_hierarchy index uses
varchar_pattern_ops
so that the LIKE
matching works, regardless of the locale.
- The
ORDER BY
needs the USING ~<~
to work in connection with the varchar_pattern_ops
index
- A very similar approach is sill possible, if the constraints don't make the sortorder-hierarchy unique--But it would be required to encode the ID into the hierarchy as well.
- A similar approach should work in other DB's as well, However, if the tree grows deep, the hierachy might get long. So you might reach certain limitations--Such as Oracles infamous ORA-01450
- This sample implementation is probably buggy.
answered
Feb 09 '11 at 15:44
Markus Winand ♦♦
936●5●11●20