Friday, April 8, 2011

Some fancy postgresql query.Window functions.


  • I finally got my 360MB mysqldump file into my postgresql database, thanks to the sed utility I could change some naming and character escaping differences between both DBMS. The data loading took a couple of hours and I was getting this message:



regarding some more character escaping issues,but after all everything went ok. This is the table structure (yes it's only one big table with 391832 records):

I was expecting a more intricate database schema, but I don't want all this work go to waste so I thought of this query:

[sourcecode language="sql"]

select row_number() over(ORDER BY civiliankia desc) as id_killing,date, civiliankia as murdered_civilians,sum(civiliankia) over (ORDER BY civiliankia asc) as total_innocent_killed from redacted_war_diary_irq as A where civiliankia > 0 order by civiliankia desc;

[/sourcecode]

where "civiliankia" stands for civilians killed in action.

partial output:

Results that match with some reported figures:

http://www.iraqbodycount.org/analysis/numbers/warlogs/

No wonder why I just can't get to sleep...creepy.

  • The only new concept in this example is the window functions, recently integrated into postgresql that allow you to do some operations on the data generated by the current query (in this case the incremental numbering presented in the killing_id field and the cumulative dead toll->total_innocent_killed), they look pretty handy but I need to do more exercises to fully understand how they work.

  • There are also these amazing concepts: table inheritance and table partition. I'll review them tomorrow.


 

No comments:

Post a Comment