Sunday, July 15, 2007

Pseudo code and Mathematical Analysis

I have describe the cube generation algorithm based on HashMap. First we get the entire dimension in a vector. Then iterate over the vector and get the taransactionID and Amount for each dimension in the first while loop. And another HashMap is build with respect to DimensionID and previously build HashMap. Then we iterate through the second HashMap in the next while loop. And find the common transactionID which belongs to all the Map, and push them as final result.

Algorithm:

HashMap dimensionID2hashMapPointer;

Iterator dimensionIterator = dimension->begin();

while(dimensionIterator != dimension->end())

begin

HashMapPointer transactionID2Amount = new HashMap();

Query newQuery = BuildQuery();

ResultSet queryResult = get the query result which contain transactionID

and Amount;

Iterator dataIterator = queryResult->begin();

Repeat

tuple = dataIterator ->tuple();

transactionID2Amount = tuple;

dataIterator = dataIterator->next();

until (dataIterator != queryResult->end())

end

HashMapPointer firstDimension = dimensionID2hashMapPointer->begin();

Iterator mapIterator = firstDimension->begin();

While(mapIterator != firstDimension->end())

begin

Boolean found = true;

String resultString = DimensionID+”#”;

for all other dimension

begin

Boolean newfound = Is transactionID is in current HashMap;

found = found && newfound;

resultString += currentDimensionID+”#”;

end

If(found)

resultList->add(resultString+”TransactionID+amount”);

end


Complexity:

Let,

The query has Dimension Number = n;

The fact table has tuple count = t;

Average tuple count for any query = ta;

HashMap has constant searching time = s;

So, the first while loop has time = O(n x ta);

The second while loop has time = O( (ta) x (n-1) x s )

= O((ta) x (n-1)) [ as s is constant]

So, the total algorithm complexity = O(n x ta);

Time Complexity:

The cube access time is linear in this development. It should be polynomial as the dimension of the cube increases, but in this development Hash map algorithms were used to make the polynomial time linear. This can be considered as a great achievement in this system. So the cube generation time = O(n) here n is the number tuples in the fact table.

Space Complexity:

The space required for holding the cube is almost linear with respect to tuple size. Except if tuple size is too low. For very few tuple sizes the space will be constant and after a threshold value the space will increase linearly.

No comments:

Post a Comment

Please, no abusive word, no spam.