Sunday, June 22, 2008

Fortune Cookie Says

Endurance and persistence will be rewarded.

Wednesday, June 4, 2008

CDMA SMS User Data

In this post I will describe the decoding of the user data portion of the bearer data field of a CDMA SMS message. See my first post for the structure of the bearer data field.

The User Data subparameter is the portion of the Bearer Data field of a CDMA SMS message that contains the actual message or payload. The user data is made up of an integral number of octets and is 0 padded as needed. The user data subparameter is documented in section 4.5.2 of the IS-637 spec. A PDF of this spec is available on the 3GPP2's web site. A PDF of the spec is also available on the TIA's website. The structure of the user data subparameter is:
  • The subparameter ID of 8 bits which is the constant that identifies the start of this subparameter in the bearer data.
  • The subparameter Len of 8 bits which is the number of octets that make up the value portion of this subparameter.
These two portions of the user data are processed as part of the bearer data. The rest of the user data subparameter is processed separately. The remaining fields of the user data are:
  • The message encoding is a 5 bit value that indicates which encoding scheme was used for the message.
  • The message type is an optional 8 bit value that is used only if the encoding is an IS-91 extended protocol message. See the specification document for the details.
  • The num fields is the number of data elements, of the size specified by the encoding and message type, that the message contains.
  • The chari portion contains the actual text or payload of the message.
  • The final portion of the message is 0-7 bits of 0 padding as needed to fill the last octet.
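As a rough illustration of the layout listed above, here is a minimal sketch that pulls the encoding and num fields out of the value portion with a generic bit reader. The names read_bits, ud_header and parse_ud_header are made up for this example, and it assumes the encoding is not the IS-91 extended protocol (so there is no message type field). It uses the stdint.h types so that it compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read nbits (1..32) starting bitpos bits into buf, most significant
 * bit first. Hypothetical helper, used only for this illustration. */
static uint32_t read_bits(const uint8_t *buf, size_t bitpos, unsigned nbits)
{
    uint32_t value = 0;

    assert(buf != NULL && nbits >= 1 && nbits <= 32);
    for (unsigned i = 0; i < nbits; i++, bitpos++) {
        unsigned bit = (buf[bitpos / 8] >> (7 - bitpos % 8)) & 1u;
        value = (value << 1) | bit;
    }
    return value;
}

/* Leading fields of the user data value portion. */
struct ud_header {
    uint8_t encoding;    /* 5 bit message encoding */
    uint8_t num_fields;  /* 8 bit count of data elements */
    size_t  chari_bit;   /* bit offset at which the chari data starts */
};

/* Assumption: no 8 bit message type is present, so num fields
 * immediately follows the 5 bit encoding. */
static struct ud_header parse_ud_header(const uint8_t *ud)
{
    struct ud_header h;

    h.encoding   = (uint8_t)read_bits(ud, 0, 5);
    h.num_fields = (uint8_t)read_bits(ud, 5, 8);
    h.chari_bit  = 5 + 8;
    return h;
}
```

For the two octets 0x10 0x2E this gives an encoding of 2, a num fields of 5, and chari data starting 13 bits in, which leaves 3 bits of the second octet for the first character.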
The first step I took in decoding the user data was to write a function that determines the message encoding, the size of the data elements, and the starting byte of the message. To do this I start by defining some constants.
// masks and values for processing the user data fields
//
#define ENCODING_MASK 0xF8
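#define ENCODE_OCTET 0x00 // encoding 0: octet, unspecified
#define ENCODE_IS41 0x08 // encoding 1: IS-91 extended protocol message
#define ENCODE_7BIT 0x10 // encoding 2: 7 bit ASCII
#define ENCODE_IA5 0x18 // encoding 3: IA5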

#define MST_BYTE_1_MASK 0xE0
#define MST_BYTE_2_MASK 0x1F
#define NF_BYTE_1_MASK 0xE0
#define NF_BYTE_2_MASK 0x1F

// standard sized type definitions
//
typedef signed char sint_8; // plain char may be unsigned on some platforms
typedef short sint_16;
typedef int sint_32;
typedef long long sint_64;

typedef unsigned char uint_8;
typedef unsigned short uint_16;
typedef unsigned int uint_32;
typedef unsigned long long uint_64;


Then I write the routine to do the initial processing. This function is written to call another function that will handle the actual decoding of the message.
void
decode_user_data( uint_8 *userData, size_t sz )
{
    uint_8 *ud = userData;  // octet that holds the encoding
    uint_8 *nextByte;       // octet that follows ud

    uint_8 encoding;
    uint_8 mst;             // element size in bits, or the IS-91 message type
    uint_8 numFields;

    size_t i;

    // the encoding and num fields span at least two octets
    if( userData == NULL || sz < 2 ) {
        fprintf( stderr, "user data is missing or too short\n" );
        return;
    }
    nextByte = ud + 1;

    // dump the raw octets
    for( i = 0; i < sz; i++ ) {
        printf("%X\n", userData[i]);
    }

    mst = 7;
    encoding = *ud & ENCODING_MASK;
    switch( encoding ) {
    case ENCODE_OCTET:
        mst = 8;
        break;

    case ENCODE_IS41:
        // the message type pushes num fields into a third octet
        if( sz < 3 ) {
            fprintf( stderr, "user data is too short\n" );
            return;
        }
        // the message type straddles the first two octets
        mst = (( *ud << 5 ) & MST_BYTE_1_MASK ) +
              (( *nextByte >> 3 ) & MST_BYTE_2_MASK );
        ud++;
        nextByte++;
        break;

    case ENCODE_7BIT:
    case ENCODE_IA5:
        break;

    default:
        fprintf( stderr, "unknown encoding: %X\n", encoding );
        exit(1);
    }

    // num fields straddles the current octet and the next one
    numFields = (( *ud << 5 ) & NF_BYTE_1_MASK ) +
                (( *nextByte >> 3 ) & NF_BYTE_2_MASK );

    printf("numFields: %d\n", numFields );
    printf("first byte: %X\n", *nextByte);

    switch( encoding ) {
    case ENCODE_7BIT: {
        char *text = decode_7bit_ascii(
            nextByte, numFields, 3 );
        printf("The text message is: '%s'\n", text );
        free( text );
        break;
    }

    case ENCODE_OCTET:
    case ENCODE_IS41:
    case ENCODE_IA5:
        fprintf( stderr, "requested encoding is not implemented\n");
        return;
    }
}
The only encoding that I am concerned with here is 7 bit packed ASCII. I will show how to unpack this into a NUL terminated string in another post.
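In the meantime, here is a minimal sketch of one possible way to unpack 7 bit characters that are packed most significant bit first. It is not necessarily the approach the later post takes. The function name unpack_7bit and its parameters are made up for this example, and it uses the stdint.h types so that it compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Unpack count 7 bit characters, most significant bit first, starting
 * bit_offset bits into data. Returns a malloc'd NUL terminated string
 * that the caller must free, or NULL if the characters would run past
 * the end of data or the allocation fails. Hypothetical helper. */
static char *unpack_7bit(const uint8_t *data, size_t data_len,
                         size_t bit_offset, size_t count)
{
    char *text;
    size_t bitpos = bit_offset;

    assert(data != NULL);

    /* refuse to read past the end of the buffer */
    if (bit_offset + count * 7 > data_len * 8)
        return NULL;

    text = malloc(count + 1);
    if (text == NULL)
        return NULL;

    for (size_t i = 0; i < count; i++) {
        unsigned ch = 0;

        /* gather the next 7 bits, which may straddle two octets */
        for (int b = 0; b < 7; b++, bitpos++) {
            unsigned bit = (data[bitpos / 8] >> (7 - bitpos % 8)) & 1u;
            ch = (ch << 1) | bit;
        }
        text[i] = (char)ch;
    }
    text[count] = '\0';
    return text;
}
```

For the user data octets 0x10 0x14 0x8D 0x20 (encoding 2, num fields 2), calling unpack_7bit(data, 4, 13, 2) yields the string "Hi". The offset of 13 skips the 5 bit encoding and the 8 bit num fields.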

Monday, June 2, 2008

Source Code Formatter

I have found Blogger to be very frustrating to work with for posting source code. A quick search with Google turned up this source code formatter.

CDMA SMS Bearer Data

In December we tested the system I had been developing at Lucent's lab for the EARS project. This test was to prove the feasibility of sending broadcast SMS messages for emergency alerts. The testing was successful, for the most part. The one snag we encountered was with the Bearer Data portion of the SMS message. The bearer data carries the message that will be transmitted to the phone, and I naively thought that this was just the text. It turned out that the bearer data is encoded according to the IS-637 specification. With a set of hex dumps from Lucent's internal testing tool I set out to figure out how to decode the bearer data so I could learn how to encode it. Unfortunately, this line of work was stopped just when I had almost gotten everything figured out. I couldn't let all of that work go to waste, so I finished up the task on my own time and I present it to you now.

The structure of the SMS bearer data field in a CDMA system is defined in section 4.5 of the IS-637 spec. A PDF of this spec is available on the 3GPP2's web site. A PDF of the spec is also available on the TIA's website. In short, the bearer data field is a series of fields where each field is an integral number of octets and the fields are 0 padded if necessary. Each field takes the form parameter ID, parameter length, parameter value. The parameter ID identifies what data is being passed, the parameter length is the number of octets in the parameter value, and the value is of course the data that we need to provide.
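To make the ID, length, value layout concrete, here is a minimal sketch of a lookup that walks the fields and checks every length against the end of the buffer. The function name find_subparam is made up for this example, and it uses the stdint.h types so that it compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Find the first field whose parameter ID equals id. On success the
 * value length is stored in *len and a pointer to the value is
 * returned. Returns NULL if the ID is not present or a length runs
 * past the end of the buffer. Hypothetical helper. */
static const uint8_t *find_subparam(const uint8_t *bd, size_t sz,
                                    uint8_t id, uint8_t *len)
{
    size_t pos = 0;

    assert(bd != NULL && len != NULL);

    /* every field needs at least an ID octet and a length octet */
    while (pos + 2 <= sz) {
        uint8_t cur_id  = bd[pos];
        uint8_t cur_len = bd[pos + 1];

        if (pos + 2 + cur_len > sz)
            return NULL;            /* value is truncated */

        if (cur_id == id) {
            *len = cur_len;
            return bd + pos + 2;
        }
        pos += 2 + (size_t)cur_len; /* skip to the next field */
    }
    return NULL;
}
```

Given the octets 00 03 20 00 10 01 04 10 14 8D 20, looking up ID 0x01 returns a pointer to the octet 0x10 at offset 7 with a length of 4, while looking up ID 0x03 returns NULL.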

To decode the bearer data field I wrote a simple routine that loops through the data picking out all of the parameters. We were working with only a small subset of the available parameters. The first step is to define a set of constants that will be used in the routine.
// bearer data subparameter identifiers
//
#define BD_MESSAGE_ID 0x00
#define BD_USER_DATA 0x01
#define BD_USER_RESP_CD 0x02
#define BD_TIMESTAMP 0x03
#define BD_VALIDITY_PER_ABS 0x04
#define BD_VALIDITY_PER_REL 0x05
#define BD_DEFERRED_DELIVERY_ABS 0x06
#define BD_DEFERRED_DELIVERY_REL 0x07
#define BD_PRIORITY_IND 0x08
#define BD_PRIVACY_IND 0x09
#define BD_REPLY_OPT 0x0A
#define BD_NUM_MSGS 0x0B
#define BD_ALERT_ON_DEL 0x0C
#define BD_LANG_IND 0x0D
#define BD_CALLBACK_NUM 0x0E

// standard sized type definitions
//
typedef signed char sint_8; // plain char may be unsigned on some platforms
typedef short sint_16;
typedef int sint_32;
typedef long long sint_64;

typedef unsigned char uint_8;
typedef unsigned short uint_16;
typedef unsigned int uint_32;
typedef unsigned long long uint_64;

The routine to decode the bearer data just receives an array of octets (unsigned 8 bit integers) and the length of the array. It loops through the data and writes the received parameters to stdout.
void
decode_bearer_data( uint_8 *bearerData, size_t sz )
{
    uint_8 *bd = bearerData;            // current element
    uint_8 *lbd = bearerData + sz - 1;  // last element

    uint_32 msgID = 0;
    uint_8 userDataLen = 0;
    uint_8 *userData = NULL;
    uint_8 timestamp[6] = { 0 };        // YY MM DD hh mm ss
    uint_8 msgDelivery = 1;

    while( bd < lbd ){
        switch( *bd ){
        case BD_MESSAGE_ID:
            if( *(++bd) != 3 ){
                fprintf( stderr, "message ID Len is not 3\n");
            }
            // read the three value octets one statement at a time;
            // several ++bd in one expression is undefined behavior
            msgID = (uint_32)*(++bd) << 16;
            msgID += (uint_32)*(++bd) << 8;
            msgID += *(++bd);
            break;

        case BD_USER_DATA:
            // remember where the value starts and skip over it
            userDataLen = *(++bd);
            userData = bd + 1;
            bd += userDataLen;
            break;

        case BD_TIMESTAMP:
            if( *(++bd) != 6 ){
                fprintf( stderr, "timestamp len is not 6\n");
            }
            timestamp[0] = *(++bd);
            timestamp[1] = *(++bd);
            timestamp[2] = *(++bd);
            timestamp[3] = *(++bd);
            timestamp[4] = *(++bd);
            timestamp[5] = *(++bd);
            break;

        case BD_ALERT_ON_DEL:
            msgDelivery = *(++bd);
            break;

        case BD_USER_RESP_CD:
        case BD_VALIDITY_PER_ABS:
        case BD_VALIDITY_PER_REL:
        case BD_DEFERRED_DELIVERY_ABS:
        case BD_DEFERRED_DELIVERY_REL:
        case BD_PRIORITY_IND:
        case BD_PRIVACY_IND:
        case BD_REPLY_OPT:
        case BD_NUM_MSGS:
        case BD_LANG_IND:
        case BD_CALLBACK_NUM:
            printf(
                "sub parameter is not implemented: %X\n",
                *bd);
            exit(1);

        default:
            printf("unknown sub parameter: %X\n", *bd);
            exit(1);
        }
        bd++;   // step to the next subparameter ID
    }

    printf("BEARER DATA\n");
    printf("msgID: %x\n", msgID);
    printf("timestamp %x/%x/%x %x:%x:%x\n",
        timestamp[0], timestamp[1], timestamp[2],
        timestamp[3], timestamp[4], timestamp[5] );
    printf("alert on delivery: %x\n", msgDelivery);

    printf("\nPROCESSING USER DATA\n\n");
    if( userData != NULL ){
        decode_user_data( userData, userDataLen );
    }
}

The User Data field is the bearer data parameter that actually contains the message that we want to send. This field is additionally encoded and may contain data encoded in several different schemes. The text message data I was working with is encoded as packed 7 bit ASCII characters. I will cover this in another post.
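Since the point of decoding was to learn how to encode, here is a minimal sketch of the reverse direction: building a complete User Data parameter (ID, length, value) from a text string using the 7 bit ASCII encoding. The names put_bits and encode_user_data_7bit are made up for this example. It assumes the characters are packed most significant bit first and that the encoding value for 7 bit ASCII is 2. It uses the stdint.h types so that it compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Append the low nbits of value to buf, most significant bit first,
 * at bit offset *bitpos. buf must be zeroed beforehand. */
static void put_bits(uint8_t *buf, size_t *bitpos,
                     uint32_t value, unsigned nbits)
{
    for (unsigned i = nbits; i-- > 0; ) {
        if ((value >> i) & 1u)
            buf[*bitpos / 8] |= (uint8_t)(0x80u >> (*bitpos % 8));
        (*bitpos)++;
    }
}

/* Write the User Data parameter for text into out. Returns the total
 * number of octets written, or 0 if the text does not fit. */
static size_t encode_user_data_7bit(const char *text,
                                    uint8_t *out, size_t out_sz)
{
    size_t n, value_len, bitpos;

    assert(text != NULL && out != NULL);

    n = strlen(text);
    /* 5 bit encoding + 8 bit num fields + 7 bits per character,
     * rounded up to whole octets (the rest is 0 padding) */
    value_len = (5 + 8 + 7 * n + 7) / 8;
    if (n > 255 || value_len > 255 || value_len + 2 > out_sz)
        return 0;

    memset(out, 0, value_len + 2);
    out[0] = 0x01;                  /* user data parameter ID */
    out[1] = (uint8_t)value_len;    /* parameter length */

    bitpos = 16;                    /* value starts after ID and length */
    put_bits(out, &bitpos, 2, 5);   /* assumed: 2 means 7 bit ASCII */
    put_bits(out, &bitpos, (uint32_t)n, 8);
    for (size_t i = 0; i < n; i++)
        put_bits(out, &bitpos, (unsigned char)text[i] & 0x7Fu, 7);

    return value_len + 2;
}
```

Encoding the string "Hi" this way produces the six octets 01 04 10 14 8D 20.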