Wednesday, April 9, 2008

Parsing Socket Connections With Flex and Bison, part II

I left my first post on parsing socket connections with Flex and Bison with a note about a small memory leak. In this post I will show how to fix this leak.

I used Valgrind, which is a profiling and instrumenting application, to test for memory leaks. The command line I used to run the test server was:
valgrind --suppressions=./mysupps.supp --log-file-exactly=valgrind.log --leak-check=full --show-reachable=yes --leak-resolution=high --num-callers=40 -v ./stubserver

The process, as described in the first post, is that a statically allocated input buffer is populated every time the socket is read. Then yy_scan_string is called with the buffer so that Flex will start processing it. The yy_scan_string function returns a YY_BUFFER_STATE handle each time it is called (see chapter 12 of the Flex Manual). Each YY_BUFFER_STATE handle consists of 3 allocations (8, 36, and 48 bytes) totaling 92 bytes of memory, as can be seen in the Valgrind output file, which has been simplified for space:
searching for pointers to 3 not-freed blocks.
checked 59,892 bytes.

8 bytes in 1 blocks are still reachable in loss record 1 of 3
at 0x4022765: malloc
by 0x804D607: yyalloc
by 0x804D3D1: yy_scan_bytes
by 0x804D3B1: yy_scan_string
by 0x80491FE: yywrap
by 0x804C1ED: yylex
by 0x8049A9C: yyparse
by 0x8048EE2: main

36 bytes in 1 blocks are still reachable in loss record 2 of 3
at 0x4022862: realloc
by 0x804D621: yyrealloc
by 0x804D266: yyensure_buffer_stack
by 0x804CCD2: yy_switch_to_buffer
by 0x804D373: yy_scan_buffer
by 0x804D43C: yy_scan_bytes
by 0x804D3B1: yy_scan_string
by 0x80491FE: yywrap
by 0x804C1ED: yylex
by 0x8049A9C: yyparse
by 0x8048EE2: main

48 bytes in 1 blocks are still reachable in loss record 3 of 3
at 0x4022765: malloc
by 0x804D607: yyalloc
by 0x804D2E9: yy_scan_buffer
by 0x804D43C: yy_scan_bytes
by 0x804D3B1: yy_scan_string
by 0x80491FE: yywrap
by 0x804C1ED: yylex
by 0x8049A9C: yyparse
by 0x8048EE2: main

LEAK SUMMARY:
definitely lost: 0 bytes in 0 blocks.
possibly lost: 0 bytes in 0 blocks.
still reachable: 92 bytes in 3 blocks.
The problem is that unless we recover this memory, we leak 92 bytes every time yy_scan_string is called. To free the memory allocated for a YY_BUFFER_STATE handle, Flex provides the function yy_delete_buffer, which is also described in chapter 12 of the Flex Manual. To prevent the leak I created a buffer_state variable, declared as a void*, which is accessible to all of the parser source. This variable is initialized to NULL at application startup. The yywrap and parser_init functions then call yy_delete_buffer on buffer_state if it is not NULL. The yywrap function must also set buffer_state to NULL after deleting it, to ensure that it is not deleted twice. Finally, whenever either function calls yy_scan_string, it stores the return value in buffer_state. So the process, including the new memory management steps, is:
  • The application calls parser_init( connfd, connfd, NULL, NULL )
    • parser_init sets the input and output file descriptors
    • parser_init sets the callbacks to the defaults
    • if buffer_state is not NULL call yy_delete_buffer
    • parser_init initializes the buffer, calls yy_scan_string, and stores the result in buffer_state
  • the application calls yyparse to start the parser
    • flex finds that the input buffer is empty and calls yywrap
      • if buffer_state is not NULL
        • call yy_delete_buffer
        • set buffer_state to NULL
      • yywrap calls the read callback.
        • the read callback reads from the socket and populates the buffer.
      • if data was read
        • yywrap calls yy_scan_string and stores the result in buffer_state
        • yywrap returns 0
      • else return 1
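
The steps above can be sketched in C roughly as follows. This is only an illustration, not the code from the test server: so that it compiles without a Flex-generated scanner, YY_BUFFER_STATE, yy_scan_string and yy_delete_buffer are replaced by small stand-in stubs, and the names live_buffers, release_buffer, read_cb_t and demo_read are hypothetical helpers added for the example. parser_init is also given a simplified one-argument signature here (the real one takes the file descriptors and callbacks), and buffer_state is typed as YY_BUFFER_STATE rather than void*.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for the Flex-generated API so the sketch compiles alone.
   In a real scanner these come from the generated lexer. */
typedef struct yy_buffer_state { char *text; } *YY_BUFFER_STATE;

static int live_buffers = 0;  /* hypothetical counter of undeleted handles */

static YY_BUFFER_STATE yy_scan_string(const char *s)
{
    YY_BUFFER_STATE b = malloc(sizeof *b);
    assert(b != NULL);
    b->text = malloc(strlen(s) + 1);
    assert(b->text != NULL);
    strcpy(b->text, s);
    live_buffers++;
    return b;
}

static void yy_delete_buffer(YY_BUFFER_STATE b)
{
    if (b == NULL)
        return;
    free(b->text);
    free(b);
    live_buffers--;
}

/* Shared by the parser source; NULL until a buffer has been scanned. */
static YY_BUFFER_STATE buffer_state = NULL;
static char input_buffer[1024];

/* Hypothetical read callback: fills buf, returns bytes read, 0 if none. */
typedef int (*read_cb_t)(char *buf, size_t size);
static read_cb_t read_callback = NULL;

/* Delete the current handle, then clear it so it is never deleted twice. */
static void release_buffer(void)
{
    if (buffer_state != NULL) {
        yy_delete_buffer(buffer_state);
        buffer_state = NULL;
    }
}

/* Simplified parser_init: only the read callback is configurable here. */
void parser_init(read_cb_t cb)
{
    read_callback = cb;
    release_buffer();
    input_buffer[0] = '\0';
    buffer_state = yy_scan_string(input_buffer);
}

/* Called by the scanner when the current input buffer is exhausted. */
int yywrap(void)
{
    int n;

    release_buffer();
    n = read_callback ? read_callback(input_buffer, sizeof input_buffer - 1) : 0;
    if (n > 0) {
        input_buffer[n] = '\0';
        buffer_state = yy_scan_string(input_buffer);
        return 0;  /* more input is available */
    }
    return 1;      /* no more input */
}

/* Hypothetical callback for trying the sketch: two chunks, then no data. */
static int demo_read(char *buf, size_t size)
{
    static int calls = 0;
    const char msg[] = "PING\n";
    size_t n = sizeof msg - 1;

    if (calls >= 2)
        return 0;
    calls++;
    if (n > size)
        n = size;
    memcpy(buf, msg, n);
    return (int)n;
}
```

With the stubs in place, calling parser_init(demo_read) and then yywrap repeatedly never leaves more than one live handle, and after yywrap returns 1 there are none. Deleting and clearing the handle in one helper keeps the "do not delete twice" rule in a single place.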
