Friday, March 26, 2010

Fortune Cookie Says

It is better to have beans and bacon in peace than cakes and ale in fear.

Friday, February 19, 2010

Fortune Cookie Says

Your dream must be bigger than your fear.

nosql - hypertable

I've been looking at some of the NoSQL databases and have been playing around with Hypertable a little. I had some difficulty getting my head around it, probably due more to my long experience with RDBMSs than to Hypertable itself, so here are some notes about it.

Creating a table is analogous to creating a table in an RDBMS.

hypertable> create table foo ("bar", "bat");

Inserting data into the table is quite a bit different from what I am used to. First, each 'row' inserted into the table must have a row key; in effect there is an implied row id column in every table. Second, the insert is done as individual (row, column, value) tuples, or cells. Third, a single insert statement can insert into any number of cells for any number of rows of a single table. Fourth, the columns are not really columns in the RDBMS sense: they can hold data directly and can also hold multiple 'sub-columns', which can be defined dynamically.

hypertable> insert into foo values
         -> ("zap", "bar", "cat"),
         -> ("zap", "bat", "tap"),
         -> ("car", "bar:a", "nar"),
         -> ("car", "bar:b", "whale");
hypertable> select * from foo;
car    bar:a    nar
car    bar:b    whale
zap    bar    cat
zap    bat    tap

Each value inserted into the table is a tuple of the form '(' <row key> ',' <column name> [':' <column qualifier>] ',' <value> ')'.
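
To make the tuple format concrete, here is a small Ruby sketch that models the four cells above in memory. It is only a toy model, not Hypertable's API: the Cell struct and make_cell helper are names made up for illustration, and the sort order is simply the one the select output above displays.

```ruby
# Toy in-memory model of the inserted cells; not Hypertable's API.
# Cell and make_cell are hypothetical names used only for illustration.
Cell = Struct.new(:row, :family, :qualifier, :value)

# Split "bar:a" into column name "bar" and qualifier "a".
# A bare "bar" has no qualifier, so qualifier is nil.
def make_cell(row, column, value)
  family, qualifier = column.split(':', 2)
  Cell.new(row, family, qualifier, value)
end

cells = [
  make_cell('zap', 'bar',   'cat'),
  make_cell('zap', 'bat',   'tap'),
  make_cell('car', 'bar:a', 'nar'),
  make_cell('car', 'bar:b', 'whale')
]

# Order by row key, then column name, then qualifier (nil treated as "").
sorted = cells.sort_by { |c| [c.row, c.family, c.qualifier.to_s] }

sorted.each do |c|
  column = c.qualifier ? "#{c.family}:#{c.qualifier}" : c.family
  puts [c.row, column, c.value].join("\t")
end
```

The point of the sketch is that a 'row' is nothing more than the set of cells sharing a row key, and a qualifier is just the part of the column string after the colon.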

There is no update statement. Updates are simply inserts: Hypertable keeps a history of all changes to a cell and timestamps each one. Adding the REVS=1 option returns only the current revision of each cell, which is what a select in an RDBMS would give you.

hypertable> insert into foo values
         -> ("zap", "bat", "lap"); 
hypertable> select * from foo where ROW="zap" DISPLAY_TIMESTAMPS;
2010-02-19 17:47:32.535534003    zap    bar    cat
2010-02-19 17:58:22.743904001    zap    bat    lap
2010-02-19 17:47:32.535534004    zap    bat    tap


hypertable> select * from foo where ROW="zap" DISPLAY_TIMESTAMPS REVS=1;
2010-02-19 17:47:32.535534003    zap    bar    cat
2010-02-19 17:58:22.743904001    zap    bat    lap
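
The effect of REVS=1 can be sketched in Ruby as "keep the newest timestamp per (row, column)". This is a toy model under that assumption, not Hypertable code: the hash layout, the latest_revisions helper, and the small integer timestamps are all made up for illustration.

```ruby
# Toy model of cell revisions; timestamps are small integers here
# instead of the nanosecond timestamps Hypertable displays.
revisions = [
  { row: 'zap', column: 'bar', ts: 1, value: 'cat' },
  { row: 'zap', column: 'bat', ts: 2, value: 'tap' },
  { row: 'zap', column: 'bat', ts: 3, value: 'lap' } # later insert acting as an update
]

# Keep only the newest revision of each (row, column) cell,
# then order the survivors by row key and column name.
def latest_revisions(revisions)
  revisions
    .group_by { |r| [r[:row], r[:column]] }
    .map { |_key, revs| revs.max_by { |r| r[:ts] } }
    .sort_by { |r| [r[:row], r[:column]] }
end

latest_revisions(revisions).each do |r|
  puts [r[:row], r[:column], r[:value]].join("\t")
end
```

Without the filter all three revisions are visible, which is what the first select above shows.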

The where clause works quite differently in Hypertable than in an RDBMS. ROW stands for the row key and can be used in comparisons in much the same way as a column in a SQL boolean expression.

hypertable> select * from foo where ROW<="zap" DISPLAY_TIMESTAMPS REVS=1;
2010-02-19 17:47:32.535534001    car    bar:a    nar
2010-02-19 17:47:32.535534002    car    bar:b    whale
2010-02-19 17:47:32.535534003    zap    bar    cat
2010-02-19 17:58:22.743904001    zap    bat    lap

There is also a CELL option. I wanted to use this to select on column contents, but it does not work that way. It selects cells by row key and column name rather than by value. This probably has the most benefit if one uses column qualifiers (dynamically created sub-columns).

hypertable> select * from foo where "a", "bar:a" <= CELL <= "zz", "bar:z";
car    bar:a    nar
car    bar:b    whale
zap    bar    cat
zap    bat    lap
zap    bat    tap
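
This sets up a range query over cells. Judging by the output, which includes zap/bar and zap/bat, each bound appears to be compared against the (row key, column name) pair as a whole rather than against each part separately, so the query reads ("a", "bar:a") <= (ROW, column name) <= ("zz", "bar:z").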
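
A toy Ruby model of the CELL query above, with the five cells hard-coded. It assumes that each bound is compared against the [row key, column name] pair as a whole; that ordering is inferred from the output shown here, not taken from Hypertable's documentation. The per-part filter is included only for contrast, and cell_between? is a made-up helper name.

```ruby
# The cells as listed in the select output above: [row key, column, value].
cells = [
  ['car', 'bar:a', 'nar'],
  ['car', 'bar:b', 'whale'],
  ['zap', 'bar',   'cat'],
  ['zap', 'bat',   'lap'],
  ['zap', 'bat',   'tap']
]

# Compare [row, column] pairs element by element, first on row key and
# then on column name (Array#<=> does exactly this).
def cell_between?(key, low, high)
  (low <=> key) <= 0 && (key <=> high) <= 0
end

low  = ['a',  'bar:a']
high = ['zz', 'bar:z']

# Pair-wise reading: every cell whose [row, column] lies in the interval.
selected = cells.select { |row, column, _value| cell_between?([row, column], low, high) }

# Per-part reading, for contrast: row and column are range-checked separately.
per_part = cells.select do |row, column, _value|
  row.between?('a', 'zz') && column.between?('bar:a', 'bar:z')
end

puts "pair-wise: #{selected.length} cells, per-part: #{per_part.length} cells"
```

Only the pair-wise reading keeps the zap cells, because "bar" sorts before "bar:a" and "bat" sorts after "bar:z".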

Wednesday, February 10, 2010

Unpacking 7-bit ASCII

This (finally) completes my series of posts on processing CDMA SMS bearer data and user data fields. In my case the actual text of the SMS message is packed 7-bit ASCII, which corresponds to messages with an encoding flag of ENCODE_7BIT, or 0x10. A packed message then looks something like the following.

|aaaaaaab|bbbbbbcc|cccccddd|ddddeeee|eeefffff|ffgggggg|ghhhhhhh|


This packs 8 characters (a - h) into 7 octets. To unpack the message we have to pick out each individual character and make it an 8-bit ASCII character. It helps if we reorganize the data to look at each character individually, as follows.

|aaaaaaax| |xxxxxxxx|
|xxxxxxxb| |bbbbbbxx|
|xxxxxxcc| |cccccxxx|
|xxxxxddd| |ddddxxxx|
|xxxxeeee| |eeexxxxx|
|xxxfffff| |ffxxxxxx|
|xxgggggg| |gxxxxxxx|
|xhhhhhhh| |xxxxxxxx|

From this we see that a 7-bit ASCII character can be packed into octets in one of 8 patterns. So to unpack the characters we need to know the current octet, the next octet, and the packing pattern. With that we can apply some bit shifting and bit masks to create an unpacked 8-bit ASCII character. 
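
As a cross-check on the pattern table, here is a short Ruby sketch that packs and unpacks the same layout. It deliberately swaps the shift-and-mask approach for a slower but easy-to-verify bit-string approach, and pack_7bit and unpack_7bit are names made up for this example. Reading the table, pattern p corresponds to a starting bit offset of (8 - p) % 8 within the current octet.

```ruby
# Pack 7-bit characters back to back, most significant bit first,
# padding the last octet with zero bits.
def pack_7bit(text)
  bits = text.each_char.map { |c| format('%07b', c.ord & 0x7F) }.join
  bits += '0' * ((8 - bits.length % 8) % 8)
  bits.scan(/.{8}/).map { |octet| octet.to_i(2) }
end

# Unpack count characters from octets, starting start_bit bits into
# the first octet. Works on a string of '0'/'1' characters.
def unpack_7bit(octets, count, start_bit = 0)
  bits = octets.map { |o| format('%08b', o) }.join
  (0...count).map { |i| bits[start_bit + 7 * i, 7].to_i(2).chr }.join
end

packed = pack_7bit('ABCDEFGH')        # 8 characters -> 7 octets
puts packed.length
puts unpack_7bit(packed, 8)

# Starting at the 'd' row of the table (pattern 3): the first octet is
# |cccccddd|, so the text begins (8 - 3) % 8 = 5 bits in.
puts unpack_7bit(packed[2..-1], 5, 5)
```

A bit-string version like this is handy as a test oracle for a faster shift-and-mask decoder.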

I coded this so that the pattern associated with the character 'a' above is pattern 0. The user data headers cause the SMS message to start at packing pattern 3, which matches the character 'd' above. The routine to decode the SMS message then becomes a simple loop over the 7-bit ASCII characters while keeping track of the packing pattern.

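/* copyright (c) 2010 Steve Hill - All rights reserved */
#include <stdint.h>
#include <stdlib.h>

/*
 * Unpack len 7-bit ASCII characters from sms, starting at packing
 * pattern startPat (0 - 7). Returns a NUL-terminated string that the
 * caller must free(), or NULL if the allocation fails.
 */
char*
decode_7bit_ascii( const uint8_t *sms, uint8_t len, uint8_t startPat )
{
    size_t  buffLen = (size_t)len + 1;   // room for the terminating NUL
    char   *buff    = calloc( buffLen, sizeof(char) );

    if( buff == NULL )
        return NULL;

    char          *currChar = buff;          // current char in buff
    char          *lastChar = buff + len;    // one past the last char in buff
    const uint8_t *curr     = sms;           // current byte being converted
    const uint8_t *next     = curr + 1;      // next byte
    uint8_t        currPat  = startPat % 8;  // conversion pattern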

    while( currChar < lastChar )
    {
        switch( currPat )
        {
            case 0:     // aaaaaaax xxxxxxxx
                *currChar = ( *curr >> 1 ) & 0x7F;
                break;
            case 1:     // xxxxxxxa aaaaaaxx
                *currChar = (( *curr << 6 ) & 0x40 ) + 
                            (( *next >> 2 ) & 0x3F );
                break;
            case 2:     // xxxxxxaa aaaaaxxx
                *currChar = (( *curr << 5 ) & 0x60 ) + 
                            (( *next >> 3 ) & 0x1F );
                break;
            case 3:     // xxxxxaaa aaaaxxxx
                *currChar = (( *curr << 4 ) & 0x70 ) + 
                            (( *next >> 4 ) & 0x0F );
                break;
            case 4:     // xxxxaaaa aaaxxxxx
                *currChar = (( *curr << 3 ) & 0x78 ) + 
                            (( *next >> 5 ) & 0x07 );
                break;
            case 5:     // xxxaaaaa aaxxxxxx
                *currChar = (( *curr << 2 ) & 0x7C ) + 
                            (( *next >> 6 ) & 0x03 );
                break;
            case 6:     // xxaaaaaa axxxxxxx
                *currChar = (( *curr << 1 ) & 0x7E ) + 
                            (( *next >> 7 ) & 0x01 );
                break;
            case 7:     // xaaaaaaa xxxxxxxx
                *currChar = *curr & 0x7F;
                break;
        }

        currChar++;
        if( currPat ) // stay on current byte if pattern 0
        {
            curr++;
            next++;
        }
        currPat = ( currPat + 1 ) % 8;  // next pattern, wrapping 7 -> 0
    }

    return buff;
}

Fortune Cookie Says

You will make many changes before settling down happily.

Thursday, September 17, 2009

Extending Watir

My current project is my first project that is solely a web application. I was looking for a way to automate testing for acceptance and regression tests. After reviewing the available frameworks I chose Watir. Watir is built using Ruby, which fits well with my project as I am using Rake for my build system.

The app makes extensive use of divs and spans in the page templates. Watir allows divs and spans to be tested, but it requires a lot of repetitive coding. To test that a div exists and, if so, that it meets some criteria, you have to do something like this.

# validate the login error message is not displayed
#
divFound = false
@browser.divs.each{ |d|
  if d.id == 'ValidationSummary'
    assert( false ==
      d.text().include?('Enter a valid e-mail address.'),
      "Expected the error message: 'Enter a valid e-mail " +
      "address'. Received error message: #{d.text()}\n"
    )
    divFound = true
    break
  end
}
assert( divFound, "Did not find the ValidationSummary div\n")

Not a big deal, but it gets tedious if just about everything is in a div or span. To make life a bit easier I wrote routines that encapsulate the repetitive code. They test whether the div or span exists and, if so, call an optional block. Since classes and modules in Ruby are open, I coded these as an extension of Watir.

module Watir
  module Container

    # Search the list of divs for the one specified by id.
    # If the div is not found return false. If a block is
    # provided return its result. Otherwise return true.
    #
    # id - The id of the div to test for
    # block - The optional code block
    #
    def div?(id, &block )
      self.divs.each{ |d|
        if d.id == id
          return (block) ? yield(d) : true
        end
      }
      false
    end

    # Search the list of spans for the one specified by id.
    # If the span is not found return false. If a block is
    # provided return its result. Otherwise return true.
    #
    # id - The id of the span to test for
    # block - The optional code block
    #
    def span?(id, &block )
      self.spans.each{ |s|
        if s.id == id
          return (block) ? yield(s) : true
        end
      }
      false
    end
  end
end

We can then change the first example to:

# validate the login error message is not displayed
#
divFound = @browser.div?( 'ValidationSummary') { |d|
  assert( false ==
    d.text().include?('Enter a valid e-mail address.'),
    "Expected the error message: 'Enter a valid e-mail " +
    "address'. Received error message: #{d.text()}\n")
  true }
assert( divFound, "Did not find the ValidationSummary div\n")

If you just wanted to test for the existence of the div you would do this.

assert( @browser.div?( 'ValidationSummary'), "Did not find....")
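
To try the helper's three outcomes without a browser, here is a small stand-alone sketch. FakeDiv and FakeBrowser are hypothetical stand-ins invented for this example and are not part of Watir; the method body mirrors the helper's logic rather than reusing the Watir extension itself.

```ruby
# Hypothetical stand-ins for a Watir element and browser; not part of Watir.
FakeDiv = Struct.new(:id, :text)

class FakeBrowser
  attr_reader :divs

  def initialize(divs)
    @divs = divs
  end

  # Same logic as the helper: false if no div has this id, the block's
  # result if a block is given, true otherwise.
  def div?(id)
    divs.each do |d|
      next unless d.id == id
      return block_given? ? yield(d) : true
    end
    false
  end
end

browser = FakeBrowser.new([FakeDiv.new('ValidationSummary', 'Welcome back')])

exists    = browser.div?('ValidationSummary')   # div present, no block
missing   = browser.div?('NoSuchDiv')           # div absent
has_error = browser.div?('ValidationSummary') do |d|
  d.text.include?('Enter a valid e-mail address.')
end

puts "exists=#{exists} missing=#{missing} has_error=#{has_error}"
```

Note that when a block is given, its return value becomes the helper's result, which is why the assert example ends its block with an explicit true.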

I'm working on a patch submission to add these to the Watir project.

Saturday, May 2, 2009

Fortune Cookie Says

You have much skill in expressing yourself to be effective.