Wednesday, June 4, 2008

CDMA SMS User Data

In this post I will describe the decoding of the user data portion of the bearer data field of a CDMA SMS message. See my first post for the structure of the bearer data field.

The User Data subparameter is the portion of the Bearer Data field of a CDMA SMS message that contains the actual message or payload. It is made up of an integral number of octets and is zero-padded as needed. The subparameter is documented in section 4.5.2 of the IS-637 spec. PDFs of the spec are available on the 3GPP2 and TIA websites. The structure of the User Data subparameter is:
  • The subparameter ID of 8 bits which is the constant that identifies the start of this subparameter in the bearer data.
  • The subparameter Len of 8 bits which is the number of octets that make up the value portion of this subparameter.
These two portions of the user data are processed as part of the bearer data. The rest of the user data subparameter is processed separately. The remaining fields of the user data are:
  • The message encoding is a 5 bit value that indicates which encoding scheme was used for the message.
  • The message type is an optional 8 bit value that is used only if the encoding is an IS-91 extended protocol message. See the specification document for the details.
  • The num fields value is an 8 bit count of the data elements, of the size specified by the encoding and message type, that the message contains.
  • The chari portion contains the actual text or payload of the message.
  • The final portion of the message is 0-7 bits of zero padding as needed to fill the last octet.
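Because none of these fields line up on byte boundaries, it can help to picture the value portion as one long bit string read from the most significant bit of the first byte. The sketch below is only an illustration of that layout, not the code used in this post: `read_bits`, `parse_header` and `struct user_data_header` are hypothetical names, and it assumes an encoding that carries no message type, so num fields immediately follows the 5 bit encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read nbits (1..32) starting at bit_pos, counting from the most
 * significant bit of buf[0]. The caller must make sure the requested
 * range lies inside the buffer. */
static uint32_t read_bits(const uint8_t *buf, size_t bit_pos, unsigned nbits)
{
    uint32_t value = 0;

    assert(nbits >= 1 && nbits <= 32);
    while (nbits-- > 0) {
        uint8_t byte = buf[bit_pos / 8];
        unsigned shift = 7u - (unsigned)(bit_pos % 8);

        value = (value << 1) | ((byte >> shift) & 1u);
        bit_pos++;
    }
    return value;
}

/* Hypothetical container for the fixed fields of the value portion. */
struct user_data_header {
    uint8_t encoding;   /* 5 bit message encoding */
    uint8_t num_fields; /* 8 bit count of data elements */
    size_t  chari_bit;  /* bit offset at which chari begins */
};

/* Parse the fixed fields, assuming no message type is present. */
static struct user_data_header parse_header(const uint8_t *buf)
{
    struct user_data_header h;

    h.encoding   = (uint8_t)read_bits(buf, 0, 5);
    h.num_fields = (uint8_t)read_bits(buf, 5, 8);
    h.chari_bit  = 5 + 8;
    return h;
}
```

As a worked case, the two characters "Hi" in 7 bit ASCII occupy 5 + 8 + 14 = 27 bits plus 5 bits of padding, which packs into the four bytes 10 14 8D 20. Calling `parse_header` on those bytes gives an encoding of 2 and a num fields of 2, and `read_bits(buf, 13, 7)` returns the code for 'H'.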
The first step I took in decoding the user data was to write a function that determines the message encoding, the size of the data elements, and the starting byte of the message. To do this, I first define some constants.
// masks and values for processing the user data fields
#define ENCODING_MASK 0xF8
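// encoding values, already shifted into the top 5 bits of the first byte
#define ENCODE_OCTET 0x00
#define ENCODE_IS41 0x08 // the IS-91 extended protocol message encoding
#define ENCODE_7BIT 0x10
#define ENCODE_IA5 0x18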

#define MST_BYTE_1_MASK 0xE0
#define MST_BYTE_2_MASK 0x1F
#define NF_BYTE_1_MASK 0xE0
#define NF_BYTE_2_MASK 0x1F

// standard sized type definitions
typedef signed char sint_8; // plain char may be unsigned on some platforms
typedef short sint_16;
typedef int sint_32;
typedef long long sint_64;

typedef unsigned char uint_8;
typedef unsigned short uint_16;
typedef unsigned int uint_32;
typedef unsigned long long uint_64;

Then I write the routine to do the initial processing. This function is written to call another function that will handle the actual decoding of the message.
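#include <stdio.h>
#include <stdlib.h>

// covered in another post; this prototype is inferred from the call below
char *decode_7bit_ascii( const uint_8 *data, uint_8 numFields, int bits );

void decode_user_data( uint_8 *userData, size_t sz )
{
    uint_8 *ud = userData; // current element
    uint_8 *nextByte = ud + 1; // element after the current one
    uint_8 encoding;
    uint_8 mst; // element size in bits, or the message type for IS-91
    uint_8 numFields;
    size_t i;

    // the encoding and num fields span at least two bytes
    if( userData == NULL || sz < 2 ) {
        fprintf( stderr, "user data is too short\n" );
        return;
    }

    // assumption: the body of this loop did not survive in the listing,
    // so a hex dump of the input is used here
    for( i = 0; i < sz; i++ ) {
        printf( "%02X ", userData[i] );
    }
    printf( "\n" );

    mst = 7; // default element size
    encoding = *ud & ENCODING_MASK;
    switch( encoding ) {
    case ENCODE_OCTET:
        mst = 8;
        break;

    case ENCODE_IS41:
        if( sz < 3 ) {
            fprintf( stderr, "user data is too short\n" );
            return;
        }
        // the 8 bit message type straddles the first two bytes
        mst = (( *ud << 5 ) & MST_BYTE_1_MASK ) +
              (( *nextByte >> 3 ) & MST_BYTE_2_MASK );
        // num fields follows the message type, one byte further along
        ud++;
        nextByte++;
        break;

    case ENCODE_7BIT:
    case ENCODE_IA5:
        break;

    default:
        fprintf( stderr, "unknown parameters\n" );
        return;
    }

    // num fields also straddles two bytes: 3 bits, then 5 bits
    numFields = (( *ud << 5 ) & NF_BYTE_1_MASK ) +
                (( *nextByte >> 3 ) & NF_BYTE_2_MASK );

    printf( "element size or message type: %d\n", mst );
    printf( "numFields: %d\n", numFields );
    printf( "first byte: %X\n", *nextByte );

    switch( encoding ) {
    case ENCODE_7BIT: {
        // the text starts in the low 3 bits of nextByte
        char *text = decode_7bit_ascii( nextByte, numFields, 3 );
        if( text != NULL ) {
            printf( "The text message is: '%s'\n", text );
            free( text );
        }
        break;
    }

    case ENCODE_IS41:
    case ENCODE_IA5:
    default:
        fprintf( stderr, "requested encoding is not implemented\n" );
        break;
    }
}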
The only encoding that I am concerned with here is 7 bit packed ASCII. I will show how to unpack it into a NUL terminated string in another post.
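As a rough idea of what that unpacking involves, here is a minimal sketch. It is not the decode_7bit_ascii routine from the other post: the signature is only guessed from the call decode_7bit_ascii( nextByte, numFields, 3 ), and the third argument is assumed to be the number of unread low-order bits in the first byte. The caller is assumed to have checked that the buffer really holds numFields characters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Sketch only: unpack numFields 7 bit characters into a newly allocated,
 * NUL terminated string. bitsLeft is ASSUMED to be the number of unread
 * low-order bits in data[0]. The caller frees the result. */
char *decode_7bit_ascii(const unsigned char *data,
                        unsigned char numFields, int bitsLeft)
{
    char *text;
    size_t bitPos;
    int i, b;

    assert(bitsLeft >= 1 && bitsLeft <= 8);

    text = malloc((size_t)numFields + 1);
    if (text == NULL)
        return NULL;

    /* skip the bits of data[0] that belong to earlier fields */
    bitPos = (size_t)(8 - bitsLeft);

    for (i = 0; i < numFields; i++) {
        unsigned ch = 0;

        /* collect 7 bits, most significant bit first */
        for (b = 0; b < 7; b++) {
            unsigned bit = (data[bitPos / 8] >> (7 - bitPos % 8)) & 1u;

            ch = (ch << 1) | bit;
            bitPos++;
        }
        text[i] = (char)ch;
    }
    text[numFields] = '\0';
    return text;
}
```

For the three bytes 14 8D 20 with numFields of 2 and bitsLeft of 3, this yields the string "Hi". Reading one bit at a time is slow but easy to check by hand, and a production version would shift whole bytes instead.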

1 comment:

Anonymous said...

Please share the coding of decode_7bit_ascii