Quickpost: IDAPython script to identify unrecognized functions.

WhatTheFunct?

Hey folks! This time I’m gonna share with you a small IDAPython tool made by Federico Muttis (aka @acid_; maybe you remember him from the pretty awesome pidgin vulnerability or the WebEx one). This is one of those scripts that you end up using and reusing when working with obscure firmware, memory dumps or even unknown pieces of code. A lot of us have made something like this in the past. It’s a must. But I felt that we really needed something with a more generic approach, like Acid’s.

Let’s see what he has to say about it ;)

When reversing unknown binaries, such as firmware or any executable in a non-standard format (i.e. not ELF, PE, etc.), it’s pretty common that IDA doesn’t recognize most of the functions.

This is when I usually start hitting “C” whenever something looks like code, and then define everything that looks like functions using “P”.

Of course IDA helps a bit, e.g. when you find a function that jumps to another section of the file, it disassembles that part and defines some functions.

But sometimes the binary file is just too long, and even if IDA helps by defining such sections of the file as code/functions, there is a lot of undefined code as well.

This little IDAPython script goes through all your defined functions, takes the first instruction’s opcode (the script reads it as a 32-bit value) and searches for it in the rest of the file. If the opcode is found outside any defined function, it calls MakeCode, which is the same as hitting “C”, and then MakeFunction (the IDC equivalent of “P”).
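
To get a feel for the idea without opening IDA, here is a rough sketch that applies the same steps to a plain byte buffer. It is not the script itself: `first_words`, `find_candidates`, the toy `image` and the `PROLOGUE` bytes are all made up for illustration, and "outside any defined function" is simplified to "not one of the known start offsets".

```python
import struct

def first_words(image, known_starts):
    """Collect the first 32-bit little-endian word at each known function start."""
    return {struct.unpack_from("<L", image, ea)[0] for ea in known_starts}

def find_candidates(image, known_starts):
    """Return offsets where a known first word reappears outside the known starts."""
    candidates = []
    for word in sorted(first_words(image, known_starts)):
        needle = struct.pack("<L", word)
        ea = image.find(needle)
        while ea != -1:
            if ea not in known_starts:
                candidates.append(ea)
            # Resume 4 bytes past the match, like the script's "ea + 4".
            ea = image.find(needle, ea + 4)
    return sorted(candidates)

# Toy image (assumption): the same 4-byte pattern at offsets 0, 8 and 20.
PROLOGUE = bytes.fromhex("f0412de9")  # illustrative bytes only
image = PROLOGUE + b"\x00" * 4 + PROLOGUE + b"\x00" * 8 + PROLOGUE + b"\x00" * 4

# Only offset 0 is "defined"; the other two occurrences are reported.
print(find_candidates(image, known_starts={0}))  # [8, 20]
```

In IDA each reported offset would then get the MakeCode / MakeFunction treatment; here they are only listed.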

It’s worth mentioning that the script also filters which opcodes count as function prologues based on a set of common instructions (e.g. “STMFD” for ARM, “PUSH” and “MOV”).
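
The filter is a plain substring test on the disassembly text, so it is fairly permissive. The sketch below puts it next to a stricter variant that compares only the mnemonic. The helper names (`mnemonic`, `is_prologue_substring`, `is_prologue_mnemonic`) and the sample disassembly lines are assumptions made for this example, and the stricter check assumes the mnemonic is the first token of the line, which may not hold for every processor module.

```python
def mnemonic(dis_text):
    """Return the first whitespace-separated token of a disassembly line."""
    parts = dis_text.split()
    return parts[0] if parts else ""

def is_prologue_substring(dis_text, prologues):
    """The check the script uses: a listed string appears anywhere in the text."""
    return any(p in dis_text for p in prologues)

def is_prologue_mnemonic(dis_text, prologues):
    """Stricter variant: the mnemonic itself must be in the list (case-insensitive)."""
    return mnemonic(dis_text).upper() in {p.upper() for p in prologues}

prologues = ["STMFD", "push", "PUSH", "mov", "MOV"]

# Sample lines are made up; real GetDisasm output depends on the processor.
for line in ["STMFD SP!, {R4-R8,LR}", "push ebp", "cmovz eax, ebx"]:
    print(line, is_prologue_substring(line, prologues),
          is_prologue_mnemonic(line, prologues))
```

The substring version accepts "cmovz eax, ebx" because "mov" occurs inside "cmovz". The stricter version rejects it, but it would also reject variants such as "MOVS" unless you add them to the list, so which one suits you depends on the target.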

You should modify it to suit your needs.


import idc
import struct
import idautils

def find_all( opcode_str ):
    ret = []
    ea = idc.FindBinary(0, 1, opcode_str)
    while ea != idc.BADADDR:
        ret.append(ea)
        ea = idc.FindBinary(ea + 4, 1, opcode_str)
    return ret
    
def define_functions():
    # First collect the opening opcode of every function that is already
    # defined, then search for those opcodes in the rest of the file.
    #
    # You can extend this by adding more disassembled instructions that
    # you believe are function prologues.
    #
    # Obviously not every PUSH is a function start; this is only a filter
    # against erroneously defined functions. So if you define a function
    # that starts with another instruction (and you think there could be
    # other functions that start with that instruction), just add it here.
    prologues = ["STMFD", "push", "PUSH", "mov", "MOV"]
    
    print "Finding all signatures"
    ea = 0
    opcodes = set()
    for funcea in idautils.Functions(idc.SegStart(ea), idc.SegEnd(ea)):
        # Get the opcode
        start_opcode = idc.Dword(funcea)
        
        # Get the disassembled text
        dis_text = idc.GetDisasm(funcea)
        we_like_it = False
        
        # Filter possible errors on manually defined functions
        for prologue in prologues:
            if prologue in dis_text:
                we_like_it = True
        
        # If it passes the filter, add the opcode to the search list.
        if we_like_it:
            opcodes.add(start_opcode)
        
    print "# different opcodes: %d" % (len(opcodes))
    while len(opcodes) > 0:
        # Search for this opcode in the rest of the file
        opcode_bin = opcodes.pop()
        opcode_str = " ".join(x.encode("hex") for x in struct.pack("<L", opcode_bin))
        print "Searching for " + opcode_str
        matches = find_all( opcode_str )
        for matchea in matches:
            # If the opcode is found in a non-function
            if not idc.GetFunctionName(matchea):
                # Try to make code and function
                print "Defining function at " + hex(matchea)
                idc.MakeCode(matchea)
                idc.MakeFunction(matchea)

    print "We're done!"
    
define_functions()
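
One portability note: the line that builds opcode_str relies on str.encode("hex"), which only exists in Python 2. If your IDAPython runs on Python 3, the conversion could be done along these lines. This is just a sketch, and `opcode_to_pattern` is a name invented here, not part of the script or of IDAPython.

```python
import struct

def opcode_to_pattern(opcode):
    """Format a 32-bit value as space-separated little-endian hex bytes."""
    # Iterating over bytes yields integers in Python 3, so format each one.
    return " ".join("%02x" % b for b in struct.pack("<L", opcode))

print(opcode_to_pattern(0xE92D41F0))  # f0 41 2d e9
```

The result has the same "f0 41 2d e9" shape that the script passes to its binary search.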


This is an example of a firmware file with only user (and IDA) defined functions:

And this is after the script ran:

Obviously, blue means code within a function.

~ by aLS -- on December 6, 2011.
