Copyright | (c) Martin Grabmueller and Dirk Kleeblatt |
---|---|
License | BSD3 |
Maintainer | martin@grabmueller.de,klee@cs.tu-berlin.de |
Stability | provisional |
Portability | portable |
Safe Haskell | None |
Language | Haskell98 |
Disassembler for x86 machine code.
This is a disassembler for object code for the x86 architecture. It provides functions for disassembling byte arrays, byte lists and memory blocks containing raw binary code.
Features:
- Disassembles memory blocks, lists or arrays of bytes into lists of instructions.
- Abstract instructions provide as much information as possible about opcodes, addressing modes or operand sizes, allowing for detailed output.
- Provides functions for displaying instructions in Intel or AT&T style (like the GNU tools)
Differences to GNU tools, like gdb or objdump:
- Displacements are shown in decimal, with sign if negative.
Missing:
- LOCK and repeat prefixes are recognized, but not contained in the opcodes of instructions.
- Support for 16-bit addressing modes. Could be added when needed.
- Complete disassembly of all 64-bit instructions. I have tried to disassemble them properly but have been limited to the information in the docs, because I have no 64-bit machine to test on. This will probably change when I get GNU as to produce 64-bit object files.
- Not all MMX and SSESSE2SSE3 instructions are decoded yet. This is just a matter of missing time.
- segment override prefixes are decoded, but not appended to memory references
On the implementation:
This disassembler uses the Parsec parser combinators, working on byte lists. This proved to be very convenient, as the combinators keep track of the current position, etc.
- data Opcode
- data Operand
- data InstrOperandSize
- data Instruction
- = BadInstruction Word8 String Int [Word8]
- | PseudoInstruction Int String
- | Instruction { }
- data ShowStyle
- data Config = Config {
- confDefaultBitMode :: OperandSize
- confOperandBitMode :: OperandSize
- confAddressBitMode :: OperandSize
- confIn64BitMode :: Bool
- confStartAddr :: Word32
- disassembleBlock :: Ptr Word8 -> Int -> IO (Either ParseError [Instruction])
- disassembleList :: Monad m => [Word8] -> m (Either ParseError [Instruction])
- disassembleArray :: (Monad m, IArray a Word8, Ix i) => a i Word8 -> m (Either ParseError [Instruction])
- disassembleFile :: FilePath -> IO (Either ParseError [Instruction])
- disassembleBlockWithConfig :: Config -> Ptr Word8 -> Int -> IO (Either ParseError [Instruction])
- disassembleListWithConfig :: Monad m => Config -> [Word8] -> m (Either ParseError [Instruction])
- disassembleArrayWithConfig :: (Monad m, IArray a Word8, Ix i) => Config -> a i Word8 -> m (Either ParseError [Instruction])
- disassembleFileWithConfig :: Config -> FilePath -> IO (Either ParseError [Instruction])
- showIntel :: Instruction -> [Char]
- showAtt :: Instruction -> [Char]
- defaultConfig :: Config
Types
All operands are in one of the following locations:
- Constants in the instruction stream
- Memory locations
- Registers
Memory locations are referred to by on of several addressing modes:
- Absolute (address in instruction stream)
- Register-indirect (address in register)
- Register-indirect with displacement
- Base-Index with scale
- Base-Index with scale and displacement
Displacements can be encoded as 8 or 32-bit immediates in the instruction stream, but are encoded as Int in instructions for simplicity.
OpImm Word32 | Immediate value |
OpAddr Word32 InstrOperandSize | Absolute address |
OpReg String Int | Register |
OpFPReg Int | Floating-point register |
OpInd String InstrOperandSize | Register-indirect |
OpIndDisp String Int InstrOperandSize | Register-indirect with displacement |
OpBaseIndex String String Int InstrOperandSize | Base plus scaled index |
OpIndexDisp String Int Int InstrOperandSize | Scaled index with displacement |
OpBaseIndexDisp String String Int Int InstrOperandSize | Base plus scaled index with displacement |
data InstrOperandSize Source
Some opcodes can operate on data of several widths. This information is encoded in instructions using the following enumeration type..
data Instruction Source
The disassembly routines return lists of the following datatype. It encodes both invalid byte sequences (with a useful error message, if possible), or a valid instruction. Both variants contain the list of opcode bytes from which the instruction was decoded and the address of the instruction.
BadInstruction Word8 String Int [Word8] | Invalid instruction |
PseudoInstruction Int String | Pseudo instruction, e.g. label |
Instruction | Valid instruction |
Instructions can be displayed either in Intel or AT&T style (like in GNU tools).
Intel style:
- Destination operand comes first, source second.
- No register or immediate prefixes.
- Memory operands are annotated with operand size.
- Hexadecimal numbers are suffixed with
H
and prefixed with0
if necessary.
AT&T style:
- Source operand comes first, destination second.
- Register names are prefixes with
%
. - Immediates are prefixed with
$
. - Hexadecimal numbers are prefixes with
0x
- Opcodes are suffixed with operand size, when ambiguous otherwise.
IntelStyle | Show in Intel style |
AttStyle | Show in AT&T style |
Config | |
|
Functions
disassembleBlock :: Ptr Word8 -> Int -> IO (Either ParseError [Instruction]) Source
Disassemble a block of memory. Starting at the location pointed to by the given pointer, the given number of bytes are disassembled.
disassembleList :: Monad m => [Word8] -> m (Either ParseError [Instruction]) Source
Disassemble the contents of the given list.
disassembleArray :: (Monad m, IArray a Word8, Ix i) => a i Word8 -> m (Either ParseError [Instruction]) Source
Disassemble the contents of the given array.
disassembleFile :: FilePath -> IO (Either ParseError [Instruction]) Source
disassembleBlockWithConfig :: Config -> Ptr Word8 -> Int -> IO (Either ParseError [Instruction]) Source
disassembleListWithConfig :: Monad m => Config -> [Word8] -> m (Either ParseError [Instruction]) Source
disassembleArrayWithConfig :: (Monad m, IArray a Word8, Ix i) => Config -> a i Word8 -> m (Either ParseError [Instruction]) Source
disassembleFileWithConfig :: Config -> FilePath -> IO (Either ParseError [Instruction]) Source
showIntel :: Instruction -> [Char] Source
Show an instruction in Intel style.
showAtt :: Instruction -> [Char] Source
Show an instruction in AT&T style.