Friday, April 29, 2011

PostGIS - my first visual map


  • Today I learned a lot about PostGIS, but since I'm a bit tired at the moment, I'll post only this:


[caption id="attachment_278" align="aligncenter" width="300" caption="Fast food restaurants in America"][/caption]

The data was taken from public shapefiles available on the internet and loaded into PostgreSQL through PostGIS.

It only demonstrates the visualization of geographic data. One data layer draws the roads and another the POINT locations of fast food restaurants. The next stage in this learning path is to make spatial queries, i.e. queries that impose spatial restrictions such as proximity.
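To give an idea of what a proximity restriction means, here is a minimal sketch in plain Java, outside the database. The class name, the coordinates and the 5 km radius are made up for illustration; in PostGIS the same filter would be written in SQL with a spatial predicate rather than by hand like this.

```java
import java.util.ArrayList;
import java.util.List;

class ProximitySketch {
    static final double EARTH_RADIUS_KM = 6371.0; // mean Earth radius

    // Great-circle distance between two lat/lon points (haversine formula).
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // Returns the indexes of the points lying within radiusKm of the center.
    static List<Integer> within(double[][] points, double cLat, double cLon, double radiusKm) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < points.length; i++) {
            if (distanceKm(points[i][0], points[i][1], cLat, cLon) <= radiusKm) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical restaurant locations as {latitude, longitude} pairs.
        double[][] restaurants = {
            {40.7580, -73.9855}, {40.7484, -73.9857}, {34.0522, -118.2437}
        };
        System.out.println(within(restaurants, 40.7580, -73.9855, 5.0));
    }
}
```

A linear scan like this is fine for a handful of points. A spatial database is worth the trouble because it can answer the same question through an index instead of checking every row.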

I would love to make maps about interesting things such as the probable locations of clandestine U.S. prisons, U.S. interventions around the world, current world oil reserves and the like.

Thursday, April 28, 2011

Can't quit / Requirements Management Introduction Extended Map / OpenJump


  • I've been doing this for quite a long time, and it's not easy for me to stop just like that. This thing helps me control my learning goals, it's a good reference point for people to get to know me a little more, and the feeling of sharing knowledge is just too compelling. It might sound silly, but it also helps me deal with the issue of life's empty meaning.



  • Here's the extended version of my "Requirements management" mind map.


[caption id="attachment_271" align="aligncenter" width="296" caption="Requirement_management with_use cases_extended"][/caption]

The course has lasted longer than I thought. I feel really lucky to be acquainted with such distinguished professionals (all of them, coworkers and bosses alike, are fantastic individuals).

  • OpenJump is my Geographic Information Systems visualization and manipulation tool of choice. It's very lightweight and pretty functional. I'm learning to use it and I'll be posting some maps and spatial queries in the following days.


http://www.openjump.org/

Wednesday, April 27, 2011

No more spare time to write about the stuff I learn.

This is it. I think I've achieved a good level of writing proficiency and that was my main goal when I decided to create this blog. Let's fixate on something else from now on.

Tuesday, April 26, 2011

Requirements management with use cases / testdisk


  • Today at work we were given a short course on "Requirements management with use cases".

    [caption id="attachment_261" align="aligncenter" width="300" caption="Requirements management with use cases"][/caption]


I liked it very much. Much of this information is common sense but having it in a structured body makes things easier to understand.

The instructor is someone with a very good background and overall seems to me a really nice and clever guy.

  • I found this great utility for "undeleting" files, which works on linux and supports a broad array of file systems. The point here is that when we delete a file, its space is marked as available, but the actual data still lies there. As long as you don't overwrite that space there's a good chance that you can get your stuff back.


http://www.cgsecurity.org/wiki/TestDisk

Monday, April 25, 2011

Introduction to the RFCOMM protocol / Open laszlo / JfreeChart / postGIS / nevernote


  • Today I'm learning the main means of communication between bluetooth devices: the RFCOMM protocol.

    [caption id="attachment_255" align="aligncenter" width="300" caption="RFCOMMprotocol"][/caption]


And a brief introduction to the API usage (for left handed people):

[caption id="attachment_256" align="aligncenter" width="300" caption="RFCOMMAPI"][/caption]

I was being sarcastic about the "brief" introduction, gosh the API is huge!

  • I'm learning the inner workings of SpagoBI. The underlying technologies it uses for dashboards and charting are OpenLaszlo and JFreeChart, and gosh, those tools are so beautiful. With the former you use an XML-like language to create windows, forms, buttons, etc., and it's very intuitive. Of the latter I don't know much, but you can create quite stunning and versatile charts (though the developer guide is about 700 pages!!!!). As I mentioned, these are the underlying technologies...the issue is to get a grasp of the Spago Engine implementation.

  • I'm learning about GIS (Geographic Information Systems) and its open source implementations, and I'm in love with it (just like my boss). The possibilities for useful applications are endless!!! Take this example: forget about what a simple relational database can do and try to query something like this: "Give me the number of houses within two miles of the coastline requiring evacuation in the event of a hurricane". Not very straightforward, huh? But with GIS you can handle this kind of problem efficiently, and more, much more. PostGIS is the implementation for PostgreSQL.

  • Nevernote is a clone of Evernote, which is available on windows and mac but not on linux. It's useful for having a local notepad that you synchronize online so you can access it anywhere you need it. I take notes on everything most of the time (as you can see), so tools like these are a must for me.


http://nevernote.sourceforge.net/

http://www.evernote.com/

[caption id="attachment_257" align="aligncenter" width="300" caption="Nevernote"][/caption]

I used alien to transform the deb package into rpm for my opensuse, and so far, so good.

Sunday, April 24, 2011

Discovered Devices/ PDFxChange


  • My first device detection test. These are the devices that are going to be detected.

    [caption id="attachment_250" align="aligncenter" width="270" caption="Devices to be detected"][/caption]

  • The application performs the inquiry and informs us that everything went Ok (the current device state allowed the inquiry to be performed successfully).

    [caption id="attachment_251" align="aligncenter" width="132" caption="Inquiry Completed"][/caption]

  • And finally it shows the bluetooth addresses of the devices that it's found.

    [caption id="attachment_252" align="aligncenter" width="313" caption="Devices detected"][/caption]


And as I haven't received any comments, nor questions about the code I'm writing/modifying, maybe there's no use in posting it.

  • I got to know PDFxChange lite (which is free for non commercial use), which is useful for saving the data you fill into a PDF form. The other viewers won't save any information you enter in a form, they'll just let you print it, and it's kind of silly (IMHO) to refill a form from scratch each time you want something changed.


http://www.tracker-software.com/free_offer_lite.html

It won't work correctly on wine, so...yep, I switched to Windows (but only for a little while =-D)

Now to review some sql, I might find something interesting.

Saturday, April 23, 2011

Device Discovery via Inquiry / filename functions in linux


  • I'm learning the concepts required to understand discovery inquiries.


[caption id="attachment_243" align="aligncenter" width="300" caption="Device Discovery via Inquiry"][/caption]

  • I discovered some pretty neat functions to split a file path into its most used parts. You could also do that with regular expressions, but hey, why reinvent the wheel?


http://linux.die.net/man/3/filename

which helped me to open my pdf files with foxit reader in linux by double clicking on them. I like foxit reader even though I have to use wine, because it's fast and, among other interesting features, it lets me highlight my documents using the free version.

[sourcecode language="bash"]

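#!/bin/bash
# $1 is the absolute path of the pdf that was clicked.
# The quotes keep paths with spaces in one piece.
dir=$(dirname "$1")
cd "$dir" || exit 1
base=$(basename "$1")
wine /opt/foxit/FoxitReader431_enu_Setup.exe "$base"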

[/sourcecode]

This little script gets called each time I click on a pdf. The weird thing is that foxit under wine cannot open a file directly when it is given its absolute path, so you move to the directory where the file is located and then you launch it with its base name.
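For the record, the same split into directory and base name can be sketched in Java. The class and method names here are my own, and this simplified version assumes '/' separators and no trailing slash, so it's only a rough stand-in for what dirname and basename really do.

```java
class PathSplit {
    // Mirrors dirname for simple paths: everything before the last '/'.
    static String dirName(String path) {
        int slash = path.lastIndexOf('/');
        if (slash < 0) {
            return ".";  // no directory part: current directory
        }
        if (slash == 0) {
            return "/";  // file directly under the root
        }
        return path.substring(0, slash);
    }

    // Mirrors basename for simple paths: everything after the last '/'.
    static String baseName(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        String pdf = "/home/rene/docs/paper.pdf"; // hypothetical path
        System.out.println(dirName(pdf) + " | " + baseName(pdf));
    }
}
```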

  • Too good to be true....my OpenSUSE is owned by Microsoft (in practical terms), it sucks.



  • I'm learning python, but I don't know enough to make something interesting...yet.


Gosh, I can't get to sleep.

Friday, April 22, 2011

Mobile Device Capabilities / Simple Device Discovery / Alien


  • I'm testing what kind of characteristics the virtual devices provided in the J2ME framework have.


[sourcecode language="java"]

package org.rene;

import javax.microedition.midlet.*;

/**
* @author rene
*/
import com.sun.lwuit.Display;
import com.sun.lwuit.Form;
import com.sun.lwuit.Label;
import com.sun.lwuit.layouts.BorderLayout;
import com.sun.lwuit.layouts.BoxLayout;
import com.sun.lwuit.plaf.UIManager;
import com.sun.lwuit.util.Resources;
import javax.bluetooth.BluetoothConnectionException;
import javax.bluetooth.LocalDevice;
import javax.bluetooth.BluetoothStateException;
import javax.bluetooth.DiscoveryAgent;

public class DeviceProps extends MIDlet {

public void startApp() {
//init the LWUIT Display
Display.init(this);
try {
Resources r = Resources.open("/LWUITtheme.res");
UIManager.getInstance().setThemeProps(
r.getTheme(r.getThemeResourceNames()[0]));
} catch (java.io.IOException e) {
System.out.println("Theme not found");
}
Form f = new Form();
f.setTitle("Local Device Props");
f.setLayout(new BoxLayout(BoxLayout.Y_AXIS));
getBluetoothInfo(f);
f.show();
}

public void pauseApp() {
}

public void destroyApp(boolean unconditional) {
}

/*
* Displays the bluetooth device information on the screen
* @param f the form to display the information on
*/
private void getBluetoothInfo(Form f) {
LocalDevice local = null;
//Retrieve the local bluetooth device object
try {
local = LocalDevice.getLocalDevice();

} catch (BluetoothStateException e) {
f.addComponent(new Label("Failed to retrieve the local device ("
+ e.getMessage() + ")"));
//nothing else can be queried without the local device
return;
}
//Retrieve the bluetooth address
f.addComponent(new Label("Address: " + local.getBluetoothAddress()));
//watch out -> could be null
f.addComponent(new Label("Name: " + local.getFriendlyName()));
f.addComponent(new Label("API version: " + local.getProperty("bluetooth.api.version")));
int mode = local.getDiscoverable();
StringBuffer text = new StringBuffer("Discoverable Mode: ");
switch (mode) {
case DiscoveryAgent.GIAC:
text.append("General");
break;
case DiscoveryAgent.LIAC:
text.append("Limited");
break;
case DiscoveryAgent.NOT_DISCOVERABLE:
text.append("Not Discoverable");
break;
default:
text.append("0x");
text.append(Integer.toString(mode, 16));
break;
}
f.addComponent(new Label(text.toString()));
f.addComponent(new Label("Master Switch Supported : "
+ local.getProperty("bluetooth.master.switch")));
f.addComponent(new Label("Max Attributes : "
+ local.getProperty("bluetooth.sd.attr.retrievable.max")));
f.addComponent(new Label("Max Connected Devices : "
+ local.getProperty("bluetooth.connected.devices.max")));
f.addComponent(new Label("Max Receive MTU : "
+ local.getProperty("bluetooth.l2cap.receiveMTU.max")));
f.addComponent(new Label("Max Service Discovery Transactions : "
+ local.getProperty("bluetooth.sd.trans.max")));
f.addComponent(new Label("Connection options: "));
f.addComponent(new Label("Inquiry Scan Supported : "
+ local.getProperty("bluetooth.connected.inquiry.scan")));
f.addComponent(new Label("Page Scan Supported : "
+ local.getProperty("bluetooth.connected.page.scan")));
f.addComponent(new Label("Inquiry Supported : "
+ local.getProperty("bluetooth.connected.inquiry")));
f.addComponent(new Label("Page Supported : "
+ local.getProperty("bluetooth.connected.page")));

}
}

[/sourcecode]

[caption id="attachment_236" align="aligncenter" width="138" caption="Virtual Device Capabilities"][/caption]

  1. Discoverable mode: General. It means that this device responds only to general inquiries. (Whilst limited mode enabled devices respond to both general and limited inquiries)

  2. Master Switch Supported: true. The application can ask the device to change from master to slave and vice versa within a connection. Why does this matter? Well, once a master is defined it can form what's called a piconet (a small network) with up to 7 slaves which share the master's clock. This magic number suits me very well, as I'm planning to support up to eight players.

  3. Max Attributes: 10. This has to do with service records. I'll find out more when I study service discovery.

  4. Max Connected Devices : 7. Reference to the piconet.

  5. Max Receive MTU: 512. Maximum Transmission Unit. Some forums talk about setting this size. I need to do some research later on whether to leave it alone or tweak it for performance, if possible.

  6. Max Service Discovery Transactions : 5. keyword -> Concurrent, which is not bad at all.

  7. Inquiry Scan Supported: false. This device cannot respond to an inquiry request while it has established a link to another device.

  8. Page Scan Supported: false. It cannot accept a connection from a new remote device if it is already connected to another remote device.

  9. Inquiry Supported: false. It cannot start an inquiry while it's connected to another device.

  10. Page Supported: false. It cannot establish a connection to a remote device if it's already connected to another device.



  • So this device is very limited in the connectivity aspect. I'm adopting the pessimistic approach and I will assume that all devices are like this, in order to aim for a fail-proof communication architecture. This is where the engineering stuff comes into play, in the design and implementation of subtle workarounds.

  • Some concepts on establishing communication:


[caption id="attachment_237" align="aligncenter" width="300" caption="SimpleDeviceDiscovery"][/caption]

This is just an introduction, the real thing comes with inquiries but it's ok for  warming up.

  • I like xmind very much for making mind maps and conceptual maps, but its prepackaged form is only available in .deb format. I could build it from source, but I wanted to use a utility called alien, which transforms deb packages into rpm packages and vice versa.


http://forums.opensuse.org/deutsch-german/hilfe-und-helfen/anwendungen/454719-howto-xmind-auf-opensuse-installieren.html
this does the trick:
[sourcecode language="bash"]

alien -r -c xmind-3.2.1.201011212218_amd64.deb

[/sourcecode]

and then you just install the generated rpm package as usual.

Thursday, April 21, 2011

JABWT Understanding the local device /Religion & pets


  • For my videogame it will be of the utmost importance to be aware of the device capabilities at hand. The LocalDevice class provides a handful of methods to get acquainted with the device characteristics and its current state. I developed a simple conceptual map:





  • Device Discovery LocalDevice


It gives me the creeps to check all those variables and possible exceptions, but gosh...it's necessary.



  • I had a recent interview and yet again I was asked about my religion, and I was wondering...are religious people more trustworthy than atheistic people? Well that might not be the case:


Sociologists have long known that religious people are no more honest or trustworthy than the non-religious, and the new poll suggests that atheists and other non-believers are actually better informed about the religious world than the faithful themselves.

http://news.discovery.com/human/atheists-best-informed-about-religion.html

http://blog.practicalethics.ox.ac.uk/2010/05/should-believers-trust-atheists/

Where do atheistic people's principles come from then? In my case I just try to follow the golden rule: "One should treat others as one would like others to treat oneself". Bad deeds give you a bad reputation, and a bad reputation results in a diminishment of your trustworthiness and your social powers (in my opinion).

  • The other one that caught my attention : do you have pets?


http://www.cdc.gov/healthypets/health_benefits.htm

http://seedmagazine.com/content/article/the_human_animal/

Apparently having pets is a good thing for your health and identifies you as an empathetic person. I've never had one, I might start with a goldfish....but on the other hand, I think of a poor little thing being alone at home most of the day...nope, that wouldn't work, I think I'm more empathetic than I dare to admit...moreover, in my opinion reducing a poor animal to the confinement of four walls is a sort of animal cruelty...yikes! =-D

Tuesday, April 19, 2011

Device discovery with the Bluetooth Java API/ JNDI vs JDBC / Business Processes / yes

I'll be posting extracts of the research related to my terminal project.

JAVA Bluetooth API

  • The bluetooth specification separates discovery of devices and discovery of services into separate processes.

  • In bluetooth terms, device discovery is known as an inquiry. Devices respond to an inquiry depending on their discoverable mode.

  • Those devices respond with their Bluetooth address and class of device record. The class of device record determines what type of services are available on the device.

  • We have two types of inquiry: general and limited. One good analogy the author suggests is: a general inquiry is similar to asking all people in a room to say their names, and a limited inquiry is asking the same only if they're accountants.

  • The API lets you take full control of the device and service discovery process, or you can leave it up to the JABWT implementation (I'll be covering the first approach).

  • Steps to device discovery:



  1. You specify the type of inquiry to perform.

  2. The API returns each device found during an inquiry back to the application as the device is found via a deviceDiscovered() event. I guess it's the most appropriate way to handle device discovery as it happens.

  3. You get the device found and the device class record, which is composed of the major service class, the major device class and the minor device class. Basically I'm interested in knowing that a device has the major device class "phone", but I need to look up what the minor device classes for the major class "phone" entail. Regarding the available services, it might be worth asking whether the device's major service class is "Limited Discoverable".
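The three parts of the device class record can be pulled out of the 24-bit value with plain bit masks. This is a sketch based on the Class of Device layout as I understand it from the Bluetooth assigned numbers (service classes in bits 13-23, major device class in bits 8-12, minor device class in bits 2-7). The class name, helper names and the sample values are made up for illustration.

```java
class ClassOfDevice {
    // Masks for the three fields of the 24-bit class of device record.
    static final int MAJOR_SERVICE_MASK = 0xFFE000; // bits 13-23
    static final int MAJOR_DEVICE_MASK  = 0x001F00; // bits 8-12
    static final int MINOR_DEVICE_MASK  = 0x0000FC; // bits 2-7

    static final int MAJOR_PHONE = 0x0200;                 // major device class "phone"
    static final int SERVICE_LIMITED_DISCOVERABLE = 0x2000; // service class bit 13

    static int serviceClasses(int cod) { return cod & MAJOR_SERVICE_MASK; }
    static int majorDevice(int cod)    { return cod & MAJOR_DEVICE_MASK; }
    static int minorDevice(int cod)    { return cod & MINOR_DEVICE_MASK; }

    static boolean isPhone(int cod) {
        return majorDevice(cod) == MAJOR_PHONE;
    }

    static boolean isLimitedDiscoverable(int cod) {
        return (cod & SERVICE_LIMITED_DISCOVERABLE) != 0;
    }

    public static void main(String[] args) {
        int cod = 0x5A020C; // sample value, for illustration only
        System.out.println("phone: " + isPhone(cod)
                + ", minor: 0x" + Integer.toHexString(minorDevice(cod))
                + ", limited: " + isLimitedDiscoverable(cod));
    }
}
```

In JABWT itself the DeviceClass object handed to deviceDiscovered() already exposes these three fields, so masks like these would only be needed when working with the raw number.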


Now the cumbersome part:

A complete inquiry can take eight to ten seconds if it is to have a 95% chance of finding all the devices in the area. Moreover, a complete inquiry consumes a lot of power.

  • To overcome these issues the API provides two facilities to skip inquiries altogether. They depend on the concept of a predefined device, which is a device that's been discovered before and that you interact with frequently. And this makes perfect sense: if you communicate frequently with a device there's no reason to ask again for its address, service class and device class, which are for that matter immutable.


-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Spago BI

  • Now I know the difference between a JNDI (Java Naming and Directory Interface) connection and a simple JDBC connection in spago. With the former only one connection is made to a datasource and then shared between modules or reports, while a simple JDBC connection actually creates a separate connection for each module requesting information from a datasource.


-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Datawarehousing

  • I'm looking forward to learning the bus architecture in dimensional designs.

  • Dimensional modeling is concerned with representing entire business processes, it does not enforce departmental boundaries. Such distinction would entail unnecessary replication. So the main aspect we're interested in is always: business processes.

  • There are four steps involved in defining a dimensional model, I'll comment on those steps tomorrow, maybe with some made up examples.


-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Linux

  • There's an innocuous command called yes that prints the string you give it over and over, endlessly. It's useful when issuing interactive commands that require us to confirm certain actions, such as overwriting a file. For example


# yes n | cp -ir  folder/ /path/folder/

to do some kind of folder merging when you have two folders with the same name.

Wednesday, April 13, 2011

Simple Heapsort / Sun Studio ->C/C++ Profiler /Good bye Ubuntu


  • This is the simplest implementation of the heap sort algorithm:


[sourcecode language="cpp"]

#include <cstdlib>
#include <iostream>
#include <vector>
#include <ctime>
#include <algorithm>
#include <cmath>
using namespace std;
/*  Fills the list with `size` random integers between lowerBound and
upperBound (both included).
*/
template <class T>
void createData(vector<T> &list, int size,int lowerBound, int upperBound) {
T num;
int range = upperBound - lowerBound + 1;
for (int i = 0; i < size; i++) {
num = lowerBound + int(rand() / (RAND_MAX / range + 1.0));
list.push_back(num);
}
}

/*  We use this function each time we remove the root node, to restructure the
heap evenly.
*/
template <class T>
void fixHeap(vector<T> &list, int root, T key, int bound) {
//list  the list/heap being sorted
//root  the index of the root of the heap
//key   the key value that needs to be reinserted in the heap (the one that was
// taken from the tree's bottom )
//bound the upper limit (index) on the heap (or list size)
int vacant = root;
int largerChild;
while (2 * vacant <= bound) {
largerChild = 2 * vacant;
if (largerChild < bound && list[largerChild + 1] > list[largerChild])
largerChild++;
//does the key belong above this child?
if (key > list[largerChild])break;
else // no, move the larger child up
{
list[vacant] = list[largerChild];
vacant = largerChild;
}
}
list[vacant] = key;
}

void execSimpleHeapSort(int size,bool printResult) {
int max;
vector<int> list;
//slot 0 is just a placeholder so the heap can use 1-based indexes
//(the children of node i live at 2i and 2i+1)
list.push_back(0);
createData(list,size,0,10000);
const int N = list.size() - 1; //number of keys, stored in list[1..N]
//to create the heap structure
for (int i = N / 2; i >= 1; i--) {
fixHeap(list, i, list[i], N);
}
//here we remove the root element (the maximum value) N - 1 times
//and of course we need to fix the heap each time we do that
for (int i = N; i > 1; i--) {
max = list[1];
fixHeap(list, 1, list[i], i - 1);
list[i] = max;
}

if(printResult){
cout << " The sorted list : " << endl;
for (int i = 1; i <= N; i++) {
cout << list[i] << " ";
}
cout << endl;
}
}

void simpleSTLsort(int size){
vector<int> list;
createData(list,size, 0, 10000);
sort(list.begin(),list.end());
}


int main(int argc, char** argv) {
srand((unsigned) time(0));
execSimpleHeapSort(10000000,false);
//simpleSTLsort(10000000);
return 0;
}

[/sourcecode]

Based on the pseudocode presented in "Analysis of Algorithms, An Active Learning Approach" by Jeffrey J. McConnell.

I coded it in order to test the profiling tools available in the Sun Studio suite  v.12.2 for C/C++, which was impossible to install correctly in ubuntu without disrupting some critical configurations (I was interested in using the compilers also, not just the profiling tools). For the aforementioned reason and others I decided  to switch to CentOS once and for all, I grew very tired of all the cool stuff being only well tested on Suse and RedHat. Yes, my CentOS is a  second class Red Hat citizen but who cares.

  • This is the output generated by sorting a vector with 10 million integers using the simple heap-sort:

    [caption id="attachment_220" align="aligncenter" width="300" caption="Performance of simple heap-sort on 10 million keys"][/caption]

    It took about 32 seconds to complete the task and its memory footprint was around 39 MB. This report can give you information about memory leaks (i.e. when you reserve memory and never release it), hot spots regarding CPU utilization and thread synchronization issues, awesome!!!!!

  • And if you weren't impressed with that, take a look:

    [caption id="attachment_221" align="aligncenter" width="300" caption="Which functions used how many cpu cycles"][/caption]

    Yeahh, it can even give you a report on what functions were the most expensive to process.

  • How does the sort function included in the C++ STL compare with the simple heap-sort algorithm?

    [caption id="attachment_222" align="aligncenter" width="300" caption="The STL sorting algorithm rocks!"][/caption]

    8 seconds! Just a quarter of the time that it took to do the same task with the simple heap-sort algorithm, with basically the same memory footprint.

  • Conclusion: coding your own sorting functions is not worth it, period....unless you parallelize them, because as you can see the STL implementation, although highly optimized, is not parallel: the report depicts only one thread. I think the merge sort approach could allow such parallelism, but I'm not sure....

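About that last doubt: merge sort does split naturally into independent halves, so it can be parallelized. Here is a minimal sketch in Java using the fork/join framework. The class name and the threshold value are arbitrary choices of mine, and small slices simply fall back to Arrays.sort, so this illustrates the idea rather than being a tuned implementation.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

class ParallelMergeSort extends RecursiveAction {
    private static final int THRESHOLD = 1 << 13; // below this size, sort sequentially
    private final int[] data;
    private final int[] tmp;
    private final int lo;
    private final int hi; // this task sorts data[lo, hi)

    ParallelMergeSort(int[] data, int[] tmp, int lo, int hi) {
        this.data = data;
        this.tmp = tmp;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected void compute() {
        if (hi - lo <= THRESHOLD) {
            Arrays.sort(data, lo, hi);
            return;
        }
        int mid = (lo + hi) >>> 1;
        // The two halves touch disjoint ranges, so they can run in parallel.
        invokeAll(new ParallelMergeSort(data, tmp, lo, mid),
                  new ParallelMergeSort(data, tmp, mid, hi));
        merge(mid);
    }

    // Merges the sorted halves data[lo, mid) and data[mid, hi).
    private void merge(int mid) {
        System.arraycopy(data, lo, tmp, lo, hi - lo);
        int i = lo, j = mid, k = lo;
        while (i < mid && j < hi) {
            data[k++] = (tmp[i] <= tmp[j]) ? tmp[i++] : tmp[j++];
        }
        while (i < mid) {
            data[k++] = tmp[i++];
        }
        while (j < hi) {
            data[k++] = tmp[j++];
        }
    }

    static void sort(int[] data) {
        ForkJoinPool.commonPool().invoke(
                new ParallelMergeSort(data, new int[data.length], 0, data.length));
    }

    public static void main(String[] args) {
        int[] keys = {5, 3, 9, 1, 7};
        sort(keys);
        System.out.println(Arrays.toString(keys));
    }
}
```

The price is the extra temporary array, which roughly doubles the memory footprint compared with an in-place sort.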

Sunday, April 10, 2011

I definitively don't like SQL.


  • I downloaded the example database called pagila, which we're using in our datawarehouse-bi course, and installed it in my postgresql from here


To import a sql file into postgresql is as easy as this:

[sourcecode language="bash"]

psql -U username -d database -f file.sql

[/sourcecode]

This database depicts a small video club business and is used to train people in SQL. The schema is normalized to some extent (though some parts aren't normalized at all) and almost every attribute has its own table.

  • So I started to practice and realized how tough it is to get the reports you need in a schema structured this way. I thought of a simple query: "give me a list of movie names, quantity in stock, category, in which store I can find them, the principal actor and their language". I came up with this:


[sourcecode language="sql"]

select f.title as TITLE, ac.name as ACTOR,count(f.film_id) as in_stock,n.name as CATEGORY,l.name as LANGUAGE,i.store_id as STORE

from film as f  left join film_category as c on f.film_id = c.film_id left join language as l on l.language_id = f.language_id

left join category as n on c.category_id = n.category_id left join film_actor as a on a.film_id = f.film_id

left join actor as ac on ac.actor_id = a.actor_id left join inventory as i on f.film_id = i.film_id

group by f.title,n.name,l.name,i.store_id,a.actor_id,ac.name

order by in_stock desc;

[/sourcecode]

The output:



Something so simple becomes a rather complex thing. So it's not that hard to imagine why many people are looking for alternatives to sql (like nosql) and to the relational model. Or I don't know, maybe it's just that I don't like databases that much =-D.

 

Saturday, April 9, 2011

What's a distributed version control system? Mercurial in Netbeans


  • When I was working on my computer graphics project I realized I needed a way to keep track of the changes I was making to a working version of my virtual world, so that if those changes weren't good enough, or worse, they broke something that was already working, I could go back to the working version of my code. Of course you can make multiple backups of your work regularly, but it's not practical. So I got to know what a version control system was.

  • This is one great tutorial (there's no use in reexplaining, we call it re-usability ! =-D)  to learn what a version control system is, and how to use one of the best available: Mercurial. http://hginit.com/top/index.html (windows oriented but all the concepts apply to linux as well). Here's the official guide too: http://mercurial.selenic.com/guide/

  • As soon as I fully comprehend the above tutorial, I'll be posting more comments. In the meantime it's worth mentioning that I'm using the google code hosting service as my remote repository for mercurial. https://code.google.com/hosting/ , looks very nice (as almost everything else with google). I'll be posting some feedback later.


Mercurial with Netbeans.

  • First you need to install the mercurial client, in ubuntu: https://help.ubuntu.com/community/Mercurial , don't forget to modify your /etc/mercurial/hgrc file accordingly.

  • To set it up with Netbeans I'm using this tutorial: http://netbeans.org/kb/docs/ide/mercurial.html . The path you need to specify is the one to the hg binary file (for example /usr/bin/hg).

  • In the Netbeans 6.9 version you specify your repository in the Team Menu ->Mercurial->Properties (which can be a local or remote repository). In the user part, use the same user name you specified in the /etc/mercurial/hgrc file.


That's it. Let's play with it a little bit now.

 

 

Friday, April 8, 2011

Some fancy postgresql query.Window functions.


  • I finally got my 360MB mysqldump file into my postgresql database. Thanks to the sed utility I could fix some naming and character escaping differences between both DBMS. The data loading took a couple of hours and I was getting this message:



regarding some more character escaping issues, but in the end everything went ok. This is the table structure (yes, it's only one big table with 391832 records):

I was expecting a more intricate database schema, but I don't want all this work to go to waste, so I thought of this query:

[sourcecode language="sql"]

select row_number() over(ORDER BY civiliankia desc) as id_killing,date, civiliankia as murdered_civilians,sum(civiliankia) over (ORDER BY civiliankia asc) as total_innocent_killed from redacted_war_diary_irq as A where civiliankia > 0 order by civiliankia desc;

[/sourcecode]

where "civiliankia" stands for civilians killed in action.

partial output:

Results that match with some reported figures:

http://www.iraqbodycount.org/analysis/numbers/warlogs/

No wonder why I just can't get to sleep...creepy.

  • The only new concept in this example is window functions, recently integrated into postgresql, which allow you to do some operations on the rows generated by the current query (in this case the incremental numbering presented in the id_killing field and the cumulative death toll -> total_innocent_killed). They look pretty handy but I need to do more exercises to fully understand how they work.

  • There are also these amazing concepts: table inheritance and table partition. I'll review them tomorrow.
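To make the window function idea more concrete, this is roughly what row_number() and a running sum() compute, sketched in Java over an already ordered list of values. The class name and the sample numbers are invented. One simplification to keep in mind: as far as I understand the documentation, when you give an ORDER BY postgresql's default frame also includes the rows that tie with the current one, so tied rows would share the same total there, while this sketch adds one row at a time.

```java
import java.util.Arrays;

class RunningTotal {
    // For values already sorted in the window order, returns one row per value:
    // {row_number, value, running_sum}.
    static long[][] window(int[] sortedValues) {
        long[][] out = new long[sortedValues.length][3];
        long sum = 0;
        for (int i = 0; i < sortedValues.length; i++) {
            sum += sortedValues[i];
            out[i][0] = i + 1;           // row_number() starts at 1
            out[i][1] = sortedValues[i]; // the value itself
            out[i][2] = sum;             // sum of this row and every row before it
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up values, ordered ascending.
        System.out.println(Arrays.deepToString(window(new int[] {1, 2, 5})));
    }
}
```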


 

Thursday, April 7, 2011

(From MySQL to postgresql / sql in action) -> sed utility/regExp


  • You know I always speak my mind, so I asked professor Andrade today: who uses postgresql anyway? He named several public organizations and a couple of businesses here in México, so I was like: mmm, ok then, that sounds good to me . Here's the official list, containing the best known organizations using postgresql: http://www.postgresql.org/about/users, so postgresql is quite something for that matter. This link is  mind blowing: http://www.sun.com/bigadmin/features/articles/switching_to_postgres.jsp?cid=e4390

  • It's quite a coincidence but I happened to find the infamous iraq war logs on my disk drive (a mysql sqldump weighing about 360 MB). I have no idea how they got there (really!), but let's use them for didactic purposes.

  • The first thing I want to do is to transform this mysql dump file into a working postgresql database. So I'm using this guide: http://en.wikibooks.org/wiki/Converting_MySQL_to_PostgreSQL <- didn't work, too many errors

  • Ok let's give this perl script a try:  http://www.pgsql.com/download/mysql2pgsql.tar.gz, by postgres.  <- didn't work either


I'm learning to use the sed utility (to fix those idiomatic errors) , found a marvelous tutorial here:

http://www.grymoire.com/Unix/Sed.html#uh-0

and on regular expressions:

http://www.grymoire.com/Unix/Regular.html

I just love wonderful people making wonderful tutorials!

  • To finish this post, I'll say that from all the things we learned today the most important were:



  1. The CASE expression. We used it to say: if this guy is over 21, give me the word "adult", otherwise give me the word "minor", i.e.: SELECT *, CASE WHEN age > 21 THEN 'adult' ELSE 'minor' END FROM friends;

  2. 1::integer -> postgresql's shorthand for a cast, equivalent to CAST(1 AS integer)

  3. ILIKE. For case-insensitive pattern matching (it takes LIKE patterns, not regular expressions).

  4. COALESCE. Returns the first of its arguments that is not null. <- need to review it, I didn't quite get it =-D.
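
Since COALESCE was the one that did not click, here is its logic sketched in C++ (an analogy, not SQL): walk the arguments from left to right and return the first one that is not NULL. Null pointers stand in for SQL NULLs, and the function names are invented for the sketch. The CASE expression from point 1 is included as a plain conditional.

```cpp
#include <cstddef>
#include <string>

// COALESCE(a, b, c): first argument that is not NULL, or NULL if all are.
// A null pointer plays the role of SQL NULL here.
const std::string* coalesce(const std::string* a,
                            const std::string* b,
                            const std::string* c) {
    if (a != NULL) return a;
    if (b != NULL) return b;
    return c;  // may itself be NULL
}

// CASE WHEN age > 21 THEN 'adult' ELSE 'minor' END
std::string ageLabel(int age) {
    return age > 21 ? "adult" : "minor";
}
```

A typical use in a query would be something like COALESCE(nickname, name, 'anonymous'): show the nickname if there is one, fall back to the name, and fall back to a fixed text if both are null. The column names in that example are hypothetical.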

Wednesday, April 6, 2011

Postgresql / Simple chat


  • Today our course went a bit slow: teacher Andrade taught us basic postgresql commands, and we practiced some beginner SQL statements... gosh. Well, everybody deserves a break once in a while.

  • For what it's worth, I gave up trying to use svg images in my project. The xml parser in J2me is just too picky about the svg elements it allows, and I don't want to rack my brains analyzing the svg code of every single picture I use.

  • I'm setting up a chat screen to begin my bluetooth communication testing. It's a little too simple but, as I said, it's just for testing. To remove the numbering I need to define a custom cell renderer... don't think so.

    [caption id="attachment_198" align="aligncenter" width="136" caption="Freaking IEEE format!"][/caption]

  • Again, the code for this is extremely simple.


[sourcecode language="java"]

public void Scr_Chat(){
Form f = new Form("Chat");
// locals used inside the anonymous Command below must be declared final
final Vector dialog = new Vector();
List myList = new List(dialog);
final TextField InputText = new TextField();
f.setLayout(new BorderLayout());
f.addComponent(BorderLayout.SOUTH,InputText);
f.addComponent(BorderLayout.CENTER,myList);
f.addCommand(new Command("Send"){
public void actionPerformed(ActionEvent evt){
// append whatever was typed to the dialog shown in the list
dialog.addElement(InputText.getText());
}
});
f.show();
}

[/sourcecode]

  • Local variables that you access from an inner class have to be declared final.


http://geekexplains.blogspot.com/2008/06/why-inner-classes-can-access-only-local.html

  • Now I feel like doing real-life SQL. I might use the Iraq war logs published by wikileaks; let's see how many civilians the american terrorists killed back in those days.

Tuesday, April 5, 2011

What do your teachers teach you?? (Goddamn it!)

That's one question teacher Andrade (Business Intelligence Course) asked us today (I added the "Goddamn it!" part =-D). And I just laughed it off, out of shame and irony, because long ago I expected so much more from my Database Systems class.

Things I learned today:

  1. To systematize. Sometimes we say "to systematize" when we mean automating processes, but in fact we systematize something when we figure out the model that describes it, and only after that do we automate it.

  2. What is a model good for? To better understand the requirements; it's a basis for building information systems on top of it, a way to communicate with users, and it acts as a plan for developers. When building a model we don't care about technical or implementation details. The model has to have enough elements to form a good representation of the entity's requirements.

  3. Data Quality. Relevance: it's based on the context. Clarity: it's usual to make a glossary of terms with the client. Consistency: it's useful to make a comparison table of terms (measurement units, times, processes, etc.) and their discrepancies, and adopt one uniform term. Precision: related to granularity (again, always use the same measurement units).

  4. It's better to let the database manager handle most of the data processing, because the business model lies inside the database, so this helps reduce errors. (The opposite approach is to process the data programmatically.)

  5. What's an index? (this is the question that triggered this post's title). Conceptually, an index is a data structure that "copies" the values of a particular field, stores with each value a pointer to the record it belongs to, and keeps those values sorted so they can be found with binary search (one of the fastest searches you can do). The main issue here, as it was explained to us, is that records are stored as a linked list, and you know how troublesome looking up values in such structures is. Indexing is only wise under limited circumstances; if not done properly you're wasting precious space!

  6. The scheduler (the query planner, in postgresql terms). This is the part of the database that decides how to execute a query so that the time spent is minimal.

  7. Database statistics. When collected, they're used to determine the data access path that produces the fewest I/O operations in SELECT, UPDATE and DELETE operations. http://www.saptechies.com/faq-database-statistics/

  8. File system tweaking. Andrade told us about specific file system tweaking we can do in order to boost database performance, and guess what? It's not available in Windows, stupid Windows...
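
The index idea from point 5 can be sketched in a few lines of C++. This is a toy under simplifying assumptions: the table is a vector, the "pointer" is a position in it, and the index is a sorted array searched with binary search. The names Record, buildIndex, and findByAge are invented for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct Record {
    std::string name;
    int age;
};

// One index entry: a copy of the indexed value plus the record's position.
typedef std::pair<int, std::size_t> IndexEntry;

// Copies the age of every record, remembers where it came from, and sorts.
std::vector<IndexEntry> buildIndex(const std::vector<Record>& table) {
    std::vector<IndexEntry> index;
    for (std::size_t i = 0; i < table.size(); ++i) {
        index.push_back(IndexEntry(table[i].age, i));
    }
    std::sort(index.begin(), index.end());
    return index;
}

// Binary search on the index; returns the position of a matching record
// in the table, or table.size() if no record has that age.
std::size_t findByAge(const std::vector<Record>& table,
                      const std::vector<IndexEntry>& index, int age) {
    std::vector<IndexEntry>::const_iterator it =
        std::lower_bound(index.begin(), index.end(), IndexEntry(age, 0));
    if (it != index.end() && it->first == age) return it->second;
    return table.size();
}
```

The table itself stays in insertion order; only the index is sorted, which is where the extra space goes. Real databases use more elaborate structures (postgresql's default index type is a B-tree), but the trade-off is the same: faster lookups in exchange for space and for the work of keeping the index up to date.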


We learned a lot more but I just take notes on the things that go beyond the slides of the course.

  • I want my videogame to be multilingual, so  while playing with Layout Managers, buttons and other components,  I came up with this:


[caption id="attachment_193" align="aligncenter" width="141" caption="Choose your language"][/caption]

The code for this is as simple as it gets:

[sourcecode language="java"]

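import com.sun.lwuit.*;
import com.sun.lwuit.layouts.BoxLayout;
import com.sun.lwuit.plaf.UIManager;
import com.sun.lwuit.util.Resources;

public class HelloMidlet extends javax.microedition.midlet.MIDlet {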
public void startApp() {
Display.init(this);
Image deFlag = null;
Image mxFlag = null;
Image ukFlag = null;

try{
//setting the application theme
Resources r = Resources.open("/LWUITtheme.res");
UIManager.getInstance().setThemeProps(r.getTheme(r.getThemeResourceNames()[0]));
deFlag = Image.createImage("/deFlag.png").scaled(100, 50);
mxFlag = Image.createImage("/mxFlag.png").scaled(100, 50);
ukFlag = Image.createImage("/ukFlag.png").scaled(100, 50);
}
catch(java.io.IOException e){
System.out.println("Error al abrir el recurso");
}
Form f = new Form();
f.setTitle("Choose your language");
Button deFlagButton = new Button("Deutsch",deFlag);
Button ukFlagButton = new Button("English",ukFlag);
Button mxFlagButton = new Button("Español",mxFlag);
deFlagButton.setAlignment(Component.CENTER);
mxFlagButton.setAlignment(Component.CENTER);
ukFlagButton.setAlignment(Component.CENTER);
f.setLayout(new BoxLayout(BoxLayout.Y_AXIS));
f.addComponent(mxFlagButton);
f.addComponent(ukFlagButton);
f.addComponent(deFlagButton);
f.show();
}
public void pauseApp() {
}
public void destroyApp(boolean unconditional) {
}
}

[/sourcecode]

  • I was trying really hard to make svg images appear but just couldn't. I like svg images because they look sharp regardless of the screen resolution. I'll keep trying tomorrow.

  • I also need to figure out how to get the screen width and height, to resize my images accordingly.

  • Java provides a simple way to use localization, I'll implement the basis for it tomorrow.

  • I'll code the heapsort tomorrow as well.

Monday, April 4, 2011

Ass-kicking Business Intelligence Course and Merge Sort


  • Today our Information Management course finally started, and I'm very pleased with it. It's being given by a chemist (what are the odds of that!); he's an experienced DBA and open source software expert. Some topics we reviewed:



  1. Business intelligence (BI) focuses on interrelations and facts to act as a guide in decision making and ultimately to achieve goals.

  2. What really matters? Return on Investment. What's the relation between what I'm putting in and what I'm getting out? Typical information management systems don't provide that information. In general terms we're interested in KPIs (key performance indicators), e.g. ROI.

  3. BI is not appropriate for all enterprises. We're talking about big amounts of data (e.g. data generated by multinational enterprises, the government, etc.) and in general about enterprises with several information systems that have been operating for several years.

  4. BI is a set of abilities and techniques aimed at understanding the commercial and functional environment.

  5. Data Mart. It's the set of data pertaining to a specific area of the business (e.g. accountancy, warehouse, etc.).

  6. Data warehousing. Depends on acknowledgment, analysis and discovery of information. Provides historical and predictive views and alerts (warnings that come up when some deviation from goals occurs).

  7. The Business Intelligence Life Cycle. Data Sources -> Extract, Transform and Load (ETL) -> Data Marts / Data Warehouse -> Business Intelligence Tools -> Outputs (such as web reports). The most important phases in this cycle are the ETL and the data warehousing parts. For instance, in ETL we have a centralized system that gathers all the data from heterogeneous sources in an automated way.

  8. The dimensional model. In contrast with the relational model, it doesn't enforce normalization. We have two kinds of tables: fact tables, and dimension tables that provide services to fact tables. What do we understand by dimensions? Those variables we're interested in analyzing (e.g. I want a report by vendor, or by year, or by model, etc.). It implies high granularity, i.e. a very high level of detail; redundancy is not avoided but encouraged because it speeds up processing (weird!). We use surrogate keys (usually integers defined by the system). What's a fact? Numeric data. Long and descriptive fields are endorsed.

  9. Star Schema. Formed by a central fact table surrounded by dimension tables in an entity-relationship fashion. It's the basic form of a data warehouse.

  10. We were given some tips regarding new BI projects: have a data quality program (to reduce garbage); sell convenience and decision making support, not less work for the IT crew; don't over-dimension the project, start with one department at a time.
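
To make points 8 and 9 a bit more concrete, here is a tiny star schema sketched in C++: one fact table whose rows hold a numeric measure plus surrogate keys, and dimension tables that describe those keys. All table and field names are invented, and a real warehouse would of course live in the database, not in vectors and maps.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Dimension tables: surrogate key (a plain integer) -> descriptive fields.
struct Vendor { std::string name; };
struct Period { int year; };

// Fact table row: a numeric measure plus one surrogate key per dimension.
struct SaleFact {
    int vendorKey;  // points into the vendor dimension
    int periodKey;  // points into the period dimension
    double amount;  // the fact itself
};

// "Report by vendor": aggregates the measure along one dimension.
std::map<std::string, double> totalByVendor(
        const std::vector<SaleFact>& facts,
        const std::map<int, Vendor>& vendors) {
    std::map<std::string, double> totals;
    for (std::size_t i = 0; i < facts.size(); ++i) {
        std::map<int, Vendor>::const_iterator v =
            vendors.find(facts[i].vendorKey);
        if (v != vendors.end()) totals[v->second.name] += facts[i].amount;
    }
    return totals;
}
```

A report by year would have the same shape, only following periodKey into the Period dimension instead; that is the sense in which the dimensions are the variables we want to analyze by.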


And a few more things. After all, the course was worth the wait.

  • Today I also studied the merge sort algorithm again. There's not much to its implementation, and a phrase comes to mind, one often attributed to Leonardo da Vinci: "Simplicity is the ultimate sophistication"; in this regard the merge sort algorithm is highly sophisticated. Its only shortcoming is the extra space it needs, since it does not sort "in place", so for high volumes of data it might not be the best choice. I read that a recently modified implementation of the heap sort algorithm brings together the best of quick sort (in-place sorting) and heapsort (n log n performance); I'll study it tomorrow. Here is the code I took from Analysis of Algorithms: An Active Learning Approach by Jeffrey J. McConnell:


[sourcecode language="cpp"]

#include <cstdlib>
#include <iostream>
#include <ctime>
const int MAX_SIZE = 1000;
int datA[MAX_SIZE];
using namespace std;
// fills datA with random values in [lBound, hBound]; returns nothing, hence void
void createData(){
int lBound=0,hBound=1000;
int range = hBound - lBound + 1;
for(int i=0; i<MAX_SIZE; i++){
datA[i] = lBound + int(rand()/(RAND_MAX/range + 1.0));
}
}

// merges the sorted ranges e[start1..end1] and e[start2..end2] (both inclusive)
void mergeLists(int e[],int start1,int end1, int start2, int end2){
int result[MAX_SIZE];
int finalStart = start1;
int finalEnd = end2;
// result is filled from index 0, so a merge of the whole array fits in MAX_SIZE slots
int indexC = 0;
while(start1 <= end1 && start2<= end2){
if( e[start1] < e[start2]){
result[indexC] = e[start1];
start1++;
}
else{
result[indexC] = e[start2];
start2++;
}
indexC++;
}
//move the part of the list that is left over
if(start1<= end1){
for(int i= start1; i<= end1; i++){
result[indexC] = e[i];
indexC++;
}
}
else{
for(int i=start2; i<= end2; i++){
result[indexC] = e[i];
indexC++;
}
}
//now put the result back into the list
indexC=0;
for(int i=finalStart; i<=finalEnd; i++){
e[i] = result[indexC];
indexC++;
}
}
void mergeSort(int e[], int first, int last){
if(first < last){
int middle = (first + last) / 2;
mergeSort(e,first,middle);
mergeSort(e,middle + 1,last);
mergeLists(e,first,middle,middle+1,last);
}
}

int main(int argc, char** argv) {
srand((unsigned) time(0));
createData();
mergeSort(datA,0,MAX_SIZE-1);
for(int i=0; i<MAX_SIZE; i++)cout<<datA[i]<<" ";
cout<<endl;
return 0;
}

[/sourcecode]

What does it do? It divides the input into smaller and smaller lists through recursion. When the recursion bottoms out at lists of a single element, it starts merging the sorted sublists pairwise, copying each merged result back into the original array, until just one ordered list remains.

Sunday, April 3, 2011

Java GUI development . LWUIT for J2ME


  • Before learning to use the LWUIT library to develop mobile phone interfaces, the developer guide advises us to get some background first on the AWT and Swing APIs. So here are some concepts I grasped:



  1. You add components (buttons, check boxes, etc.) inside a container (panel / window), and each container can host other containers inside; that ability enables us to create complex interfaces.

  2. Each container has its own Layout Manager, which arranges the container's components automatically according to the screen resolution, resizing, etc., so we don't have to worry about placing our components manually. The default Layout Manager for the main containers is the Border Layout, which divides the gui space in five sections: north, south, east, west, and center.

  3. You've got to set up the arrangement and composition of your gui before you display it (yep, kind of obvious).

  4. For each component you register event handlers to do something according to the actions the user performs.

  5. To register such event handlers you need an object that implements the ActionListener interface, or some other listener according to the event source. The issue with interfaces is that you need to implement all the methods defined inside them (those you don't use are left with an empty body). A workaround is to use adapter classes, which implement all the interface methods for you, so you override only those you're interested in using.

  6. There's another new concept: anonymous classes. Such classes are declared inline and usually contain just a few lines, the lines needed to register the event handler. You can also declare an inner class (a class inside a class) for those purposes, which is the preferred method regarding event handler registration.

  7. The AWT API creates interfaces that look different between operating systems because it depends on native components, whereas the Swing API provides a consistent look across several platforms because it takes care of the drawing by itself.



  • And as is customary when using a new technology, here's my "Hello world" LWUIT application


[caption id="attachment_184" align="aligncenter" width="141" caption="My first LWUIT app"][/caption]

  • The code is practically the same as the one provided in the developer's guide, so there's no reason to post it.

Saturday, April 2, 2011

Kruskal algorithm, my implementation


  • Today I decided to code my own implementation of the Kruskal algorithm, based on the description found in professor Zaragoza's notes (slides 234 through 252).

  • This is the graph I'm using:


which I found here:







[sourcecode language="cpp"]
#include <iostream>
#include <algorithm>
#include <cstddef>
using namespace std;
const int NUM_VERTEXES = 5;
const int NUM_EDGES = 8;
int edgesA[NUM_EDGES][3] = {{1,2,7},{1,4,8},{1,5,9},{2,3,11},{2,4,3},{2,5,9},{3,4,2},{4,5,5}};
bool tree[NUM_EDGES] = {false};
struct vertex{
int vNum;
struct vertex* father;
int count;
}vertexes[NUM_VERTEXES];
struct edge{
int start;
int end;
int cost;
}edges[NUM_EDGES];
struct vertex* vPtrs[NUM_VERTEXES];
bool operator<(const edge a, const edge b){
return a.cost < b.cost;
}
void initStructure(){
for(int i=0; i<NUM_VERTEXES; i++){
vertexes[i].vNum = i+1;
vertexes[i].father = NULL;
vertexes[i].count = 1;
vPtrs[i] = &vertexes[i];
}
for(int i=0; i<NUM_EDGES; i++){
edges[i].start = edgesA[i][0];
edges[i].end = edgesA[i][1];
edges[i].cost = edgesA[i][2];
}
}
// follows the father pointers up to the root, which identifies the component
struct vertex* search(struct vertex* n){
vertex* temp;
temp = n;
while(temp->father != NULL){
temp = temp->father;
}
return temp;
}
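// if i and j are in different components, marks the edge as part of the tree
// and hangs the smaller component under the root of the larger one
void join(int i, int j,int edge){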
vertex* iPtr = search(vPtrs[i]);
vertex* jPtr = search(vPtrs[j]);
if(iPtr != jPtr){
tree[edge] = true;
if(iPtr->count <= jPtr->count){
iPtr->father = jPtr;
jPtr->count += iPtr->count;
}
else{
jPtr->father = iPtr;
iPtr->count += jPtr->count;
}
}
}
int main(){
initStructure();
sort(edges,edges+NUM_EDGES);
for(int i=0; i<NUM_EDGES; i++){
join(edges[i].start-1,edges[i].end-1,i);
}
for(int i=0; i<NUM_EDGES; i++)
if(tree[i]== true){
cout<<char(edges[i].start+64)<<" "<<char(edges[i].end+64)<<endl;
}
return 0;
}
[/sourcecode]

I'm using a customized format for the edges. One improvement would be to parse a typical adjacency matrix or adjacency list into that format, eliminating redundant edges.
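
That improvement could look roughly like this. It is a sketch, not part of the program above: it assumes an undirected graph given as a symmetric cost matrix where 0 means "no edge", and the names Edge and edgesFromMatrix are mine. Scanning only the upper triangle (j > i) keeps each edge once, so the redundant mirrored copy is dropped.

```cpp
#include <cstddef>
#include <vector>

struct Edge {
    int start;  // 1-based vertex number, as in the edge list above
    int end;
    int cost;
};

// Converts a symmetric adjacency matrix into an edge list.
// Assumption: matrix[i][j] == 0 means there is no edge between i and j.
std::vector<Edge> edgesFromMatrix(const std::vector<std::vector<int> >& matrix) {
    std::vector<Edge> result;
    for (std::size_t i = 0; i < matrix.size(); ++i) {
        // Starting at i + 1 skips the diagonal and the mirrored lower triangle.
        for (std::size_t j = i + 1; j < matrix[i].size(); ++j) {
            if (matrix[i][j] != 0) {
                Edge e;
                e.start = static_cast<int>(i) + 1;
                e.end = static_cast<int>(j) + 1;
                e.cost = matrix[i][j];
                result.push_back(e);
            }
        }
    }
    return result;
}
```

The resulting vector could then be sorted by cost and fed to the join step exactly like the hand-written edge array.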

Things I learned from this exercise:

  • To use consistent subscript ranges, gosh.. debugging the typical out of bounds error is the lamest thing in the world.

  • To better use a debugger.

  • Operator overloading. Read about it before, but never used it in a program.

  • The different flavors of the sort function in <algorithm>

  • A better understanding of pointers.
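
For reference, a small sketch of the std::sort flavors and the operator overloading mentioned in the list: the two-argument form relies on operator<, while the three-argument form takes a comparison function. The Edge struct mirrors the one in the program above; byCostDescending and the two wrapper functions are names made up for the example.

```cpp
#include <algorithm>
#include <vector>

struct Edge {
    int start;
    int end;
    int cost;
};

// Overloaded operator<: lets std::sort(first, last) order edges by cost.
bool operator<(const Edge& a, const Edge& b) {
    return a.cost < b.cost;
}

// Explicit comparator for the three-argument flavor of std::sort.
bool byCostDescending(const Edge& a, const Edge& b) {
    return a.cost > b.cost;
}

void sortAscending(std::vector<Edge>& edges) {
    std::sort(edges.begin(), edges.end());  // uses operator<
}

void sortDescending(std::vector<Edge>& edges) {
    std::sort(edges.begin(), edges.end(), byCostDescending);
}
```

std::stable_sort takes the same arguments and additionally keeps edges of equal cost in their original relative order, which std::sort does not promise.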


My goal for these holidays is to code all the algorithms we reviewed in the algorithms analysis class.

Friday, April 1, 2011

The greatest and most amazing four years of my life


  1. Today I took my last class at the University. I'm not finished with it yet, though: my final project and my social service are still in progress, but they won't take long, I hope. From the first day up to this one I was in love with the University, and those were the best spent four years of my life. Plans for the future... I intend to get a master's degree at the Cinvestav, and then I'll probably leave the country. Where to? I'm not sure yet, but I love Europe and particularly Germany and France. But those are my plans; a whole lot of things can happen in the middle.

  2. I wrote a little program to take advantage of my command line email sending script.


[sourcecode language="cpp"]

#include <unistd.h>
#include <cstdlib>

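int main(){
// argv for the script: program name, recipient, subject, body,
// plus the NULL terminator that execv requires
const char* args[] = {"/home/jocker/sendMail.sh","al207205823@alumnos.azc.uam.mx","Backup del sistema ha iniciado",
"El Backup del sistema ha comenzado siendo las : 22:02 hrs", NULL};
pid_t pid = fork();
if(pid == 0){
// child: replace this process with the mail script
execv(args[0], const_cast<char* const*>(args));
_exit(EXIT_FAILURE); // only reached if execv fails
}
else if (pid > 0){
// parent: wait two minutes, then send the second notification
sleep(120);
pid_t pid1 = fork();
if(pid1 == 0){
args[2] = "El Backup del sistema ha concluido";
args[3] = "El Backup del sistema ha terminado exitosamente a las 23:00 hrs";
execv(args[0], const_cast<char* const*>(args));
_exit(EXIT_FAILURE);
}
}
return 0;
}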

[/sourcecode]

I had never used those system calls in a program before (I didn't take the operating systems class with teacher Nava), so it was very exciting.

[caption id="attachment_174" align="aligncenter" width="300" caption="Automatic mail notification"][/caption]

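  • The next step is to use the real timestamp in the message. I tried several times, but I couldn't get execv to take a variable as an argument (it expects an array of C strings), and using strcat to append the time ended in a segmentation fault, most likely because the destination was a string literal, which lives in read-only memory.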
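
One way around both problems, sketched under the assumption that the script takes the message body as its last argument as in the program above: build the message in a std::string (which owns writable memory, unlike a string literal) and hand its c_str() to the argument array just before calling execv. The helper name messageWithTime is invented.

```cpp
#include <ctime>
#include <string>

// Builds "<prefix>HH:MM hrs" for the given moment, in local time.
std::string messageWithTime(const std::string& prefix, std::time_t when) {
    char buffer[16];
    std::tm* local = std::localtime(&when);
    if (local == 0) return prefix;  // conversion failed, send the bare prefix
    // strftime writes at most sizeof(buffer) bytes, terminator included.
    std::strftime(buffer, sizeof(buffer), "%H:%M", local);
    return prefix + buffer + " hrs";
}
```

With the argument array declared as const char* pointers, usage would be along the lines of: std::string body = messageWithTime("El Backup del sistema ha comenzado siendo las : ", std::time(0)); followed by args[3] = body.c_str(); the string has to stay alive until execv is called.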


3. Yesterday I attended a chemistry conference. Why? Because it was given in English, yay! And in the process I learned some things about scientific computing and first-principles calculations. The speaker was an extremely beautiful woman; her beauty was matched only by her intelligence. She is an expert Linux user and a skilled programmer too. What a mix! Sadly, she's way older than I am.