Home of the Dupe since 2001

 Post subject: Reverse Engineering Basics
PostPosted: Wed Aug 08, 2007 1:48 am 

Joined: Sun May 09, 2004 5:41 am
Reverse Engineering Basics
by Darawk

(All of the asm code examples here are taken from Diablo II's DLLs.)

If you already know a programming language or two, you should be able to follow this without too much trouble. If you don't, it's very hard to explain without you at least having a cursory knowledge of general programming concepts. This is not going to be a tutorial on how to reverse engineer (though I will point you to a bunch at the end). I'd just like to explain exactly what "reverse engineering" is for those who don't know, as there seem to be some misconceptions here.

When you write code and then compile it, the compiler takes the code that you wrote and converts it into assembly code. Assembly is the lowest-level language you can realistically write in: you deal directly with memory and CPU registers (the tuts I'll provide you with at the end should clear up what these are). So, the compiler turns your C/C++/VB/whatever language you code in into assembly code. Why does it do this? Well, because every assembly instruction corresponds directly to a short sequence of bytes, normally written as hexadecimal numbers (the encoding is not exactly straightforward, but it's not all that important to reversing things like d2, either). These bytes are what the processor actually understands. And that is the ultimate goal of the programmer: to create code that the processor understands and can execute.
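To make the "instruction = hex number" idea concrete, here is a small sketch in C. It holds the bytes that the instruction mov eax, 3 (used in the example further down) assembles to on 32-bit x86, and formats them the way a disassembler's hex column would. The names MOV_EAX_3 and bytes_to_hex are made up for this example.

```c
#include <stddef.h>
#include <stdio.h>

/* Machine code for "mov eax, 3" on 32-bit x86: opcode B8, then the
   32-bit value 3 stored lowest byte first. (Name is hypothetical.) */
const unsigned char MOV_EAX_3[5] = { 0xB8, 0x03, 0x00, 0x00, 0x00 };

/* Format len bytes as space-separated hex pairs into out.
   out must hold at least len * 3 characters, and at least 1. */
void bytes_to_hex(const unsigned char *bytes, size_t len, char *out)
{
    size_t pos = 0;
    for (size_t i = 0; i < len; i++)
        pos += (size_t)sprintf(out + pos, i ? " %02X" : "%02X", bytes[i]);
    out[pos] = '\0';
}
```

Passing MOV_EAX_3 and a 16-byte buffer gives the text "B8 03 00 00 00", which is all the processor ever sees of that instruction.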

However, the compilation process is, to some degree, irreversible. When you write code in a high-level language, you have things like "function names" and "variable names". These are stripped out by the compiler and are not included in your executable. Why? Because they take up space, and the processor doesn't need them to understand what your program is doing. So instead, all of those nice variables you had are simply referenced by memory address, and all of those cutely named functions you created are called with no more than a 4-byte hex number that specifies their address within your program.

Let's look at a standard example of this, from the 0x1E packet:

//put the 2nd arg into the register eax(2nd arg is the length of the packet)
mov     eax, [esp+arg_4]     

<...edited a bit to make it more clear...>

//compare eax, to 9
cmp     eax, 9

//not too important...
mov     ebx, edx       

//also not too important...
mov     edi, ecx

/* This is the "jump if equal/zero" instruction. For now, let's just assume it is "jump if equal": it means go to valid_packet if the last comparison performed came out equal (in other words, if eax==9) */
jz      valid_packet

//Fix something known as "the stack" so that we can exit the function
pop     edi

//more fixing
pop     esi

//more fixing           
pop     ebp

/* Set the return value to 3(all return values are placed in the eax register), a return value of 3 indicates an invalid length-packet */     
mov     eax, 3

//more fixing     
pop     ebx

//return from function         
retn    8             

/* I named it valid_packet to make it easier to understand...and also...can't be giving away code locations that easy, can I? =p */
valid_packet:

//arg_0 is a pointer to our 1E packet                         
mov     eax, [esp+10h+arg_0]

//don't worry about this for now
test    edi, edi

/* load the DWORD at arg_0+1 into ebp (remember, arg_0 is a pointer to the packet we sent, so this essentially skips that prefixed 1E byte, since we don't need it anymore; now all that's important are the 2 dwords that went with 1E) */
mov     ebp, [eax+1]

/* eax points to the beginning of our packet, so eax+5 points to the DWORD Position, whose low byte is put into al (the significance of it being al is beyond the scope of my little primer here, you'll learn all about it in the tuts I'll give later) */
mov     al, [eax+5]

/* place that byte of our DWORD Position in arg_4, which was our packet length; since we no longer need to worry about the length of our packet, it can be used for other things */
mov     byte ptr [esp+10h+arg_4], al
<...code continues on for quite a while ;) ...>

Ok, you may not have understood all of that...but I hope this will help to clarify. This is how that same ASM code might look in C/C++

/* a rough C equivalent, reconstructed from the asm above; the local name "first" is made up */
DWORD Packet_1E_Handler(BYTE *packet, DWORD len)
{
    /* remember, a return value of 3 indicates a packet of invalid length (as well as a few other occasions, but mostly invalid length) */
    if (len != 9)
        return 3;

    /* skip over the 1E byte to get to our first DWORD. You may have noticed that it was not actually placed into a local variable, but rather a CPU register (ebp). This is a compiler optimization that probably occurred because the variable was used so much as to warrant never actually storing it in a memory location. This is very common throughout d2 code. */
    DWORD first = *(DWORD *)(packet + 1);

    /* since len is no longer used, blizz decided to use it to store your dword position (the asm shown copies only its low byte) */
    *(BYTE *)&len = packet[5];

    <..code continues for a while...>
}

Maybe this helped you...maybe it didn't. It's really kinda hard to teach something like this in a single post. So, this is just to help you understand what reverse engineering is.
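If it helps, here is the packet layout from the walkthrough written out as a small, self-contained C sketch: one 1E id byte followed by two DWORDs, 9 bytes in total. This is only an illustration, not the game's code. The struct, the field names, the helper read_le32 and the "return 0 on success" convention are all assumptions of mine; only the length of 9, the offsets 1 and 5, and the return value 3 for a bad length come from the post.

```c
#include <stddef.h>
#include <stdint.h>

#define PACKET_1E_LEN  9u  /* 1 id byte + two 4-byte DWORDs */
#define ERR_BAD_LENGTH 3   /* what the handler returns for a bad length */

/* Hypothetical field names: the post only tells us the second DWORD
   is a position. */
struct packet_1e {
    uint32_t first;
    uint32_t position;
};

/* Read a 32-bit value stored lowest byte first, as x86 stores it. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success (my assumption) or ERR_BAD_LENGTH. */
int parse_packet_1e(const uint8_t *packet, size_t len, struct packet_1e *out)
{
    if (len != PACKET_1E_LEN)
        return ERR_BAD_LENGTH;
    out->first = read_le32(packet + 1);    /* skip the 1E id byte */
    out->position = read_le32(packet + 5); /* second DWORD */
    return 0;
}
```

Reading the bytes one at a time instead of casting the pointer keeps the sketch independent of alignment and of the machine it is compiled on.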

As for what you need to know to reverse code:

You should know a few different higher-level languages (C/C++ mostly) fairly well, as it will help you to identify control structures and other such features that are no longer obvious in compiled code.

You need to know x86 Intel-style assembly (for Windows reversing).

You need to know about how memory works, the stack, the heap, etc.

You need to know how compilers compile code, and all the different calling conventions for functions. You also need to know how local variables are referenced and other such things.

You also need to have an understanding of the hexadecimal and binary number systems.

Now, you might be saying, "ok, but wtf are all these things, and where can I find out how it all works?" Here, I'll try to provide you, to the best of my ability, with the best resources on the 'net.

Informational Resources

Very basic and easy to understand tutorial on assembly programming. This is probably the first thing you should read.

Awesome website, devoted more to cracking software protections, but that is what most reversers do. They have a ton of information, and great forums.

Another great site with excellent forums on reverse engineering. You should definitely read their FAQ, and browse the newbie forums a bit, as well as read some of the tutorials at the bottom of the page.

Another software cracking site, but lots can be learned about code reversing from cracking tutorials.

Very nice site, and eBook on windows assembly programming.

The famous home page of Iczelion, a master reverser and ASM coder. TONS of information, a must read.

Not the greatest tut on practical RE, but take a cursory glance at some of what it's got.

EXCELLENT website, lots of info. You definitely need to check this out.

Of course, +fravia's website of reversing. You cannot consider yourself a RE if you have not read some of the +HCU essays, or tutorials of +ORC. All of which are contained here.

... tified.asp Excellent paper explaining function calling conventions.

Very nice explanation of the windows PE (Portable Executable) file format.

Contains the PE specification written by microsoft, as well as the notes of a few legendary reversers. This is not so important for D2 reversing, but if you want to get into software cracking or other types of low level operations, then you will need to know this stuff.

hehe, you should all be occupied for quite some time now, eh? =p

k, well, now that you know everything, you need some tools.

First, an explanation of what each tool in a reverser's toolkit is:

The Dis-Assembler:

A dis-assembler shows you the assembly code of a program. It also shows the imported and exported functions of that program or dll file. Imported functions are functions your program calls that live in other dll's: Windows API functions like GetWindowText, or C runtime functions like malloc. Exported functions are usually only found in dll files; they are "exported" so that they can be called by programs that load those dll's. The disassembler also displays strings used by the program. For example, if you had the static text string

"hello world"

It would be stored in a data section of your PE file (the compiled program), typically .data or .rdata. The disassembler will show you all of the strings in this section, and commonly where in the code they are referenced.
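As a rough idea of how a tool can find those strings, here is a minimal C sketch that walks a block of bytes and counts the runs of printable ASCII characters that end in a zero byte. Real disassemblers are a lot smarter than this (they also follow references from the code, handle Unicode, and so on), and the function name count_strings is made up for this example.

```c
#include <stddef.h>

/* Count runs of printable ASCII (0x20..0x7E) that are at least min_len
   characters long and are terminated by a NUL byte. */
size_t count_strings(const unsigned char *data, size_t len, size_t min_len)
{
    size_t found = 0;
    size_t run = 0; /* length of the printable run we are currently in */

    for (size_t i = 0; i < len; i++) {
        if (data[i] >= 0x20 && data[i] <= 0x7E) {
            run++;
        } else {
            if (data[i] == '\0' && run >= min_len)
                found++;
            run = 0; /* any non-printable byte ends the run */
        }
    }
    return found;
}
```

The min_len cutoff matters because random code bytes often look like one or two printable characters by accident.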

More advanced disassemblers will perform some degree of code analysis for you, which will make the reversing process much easier.

The Debugger

A debugger is similar to a disassembler in that they both display executable code. However, the debugger goes one step further: it allows you to actually modify the program as it runs. A debugger allows you to set "breakpoints" on memory or code, so that when that memory is referenced or when that code is executed, the program freezes. At this point, the debugger allows you to dump the program's memory, examine its CPU registers, and even modify the code itself.

You also have the capability to do what's called "single-stepping" which is when you execute one instruction at a time, and it pauses after each instruction. This is VERY useful for code reversing. It allows you to examine the program at every step of the way as it processes your packet(in the case of d2). I'll provide you with some tutorials on how to use these programs at the end.

The specific tools that I suggest you use, and where to get them:


IDA Pro by Datarescue is by far the best dis-assembler out there. It is the most advanced dis-assembler I know of.

You can find it here:


When it comes to a debugger, you really have only 2 choices: SoftIce by Numega or Olly Debug. SoftIce is what's known as a ring0 debugger. It runs in kernel mode (what exactly this means is beyond the scope of this tut). Olly Debug, on the other hand, is a ring3 debugger, which runs in application mode.

When you are in SoftIce, you cannot interact with anything else. Your entire comp is essentially frozen, until you exit SoftIce. This is useful when cracking software for a few reasons, but I don't think that cracking software is what most of you will be doing. So, I don't really recommend that you use it for reversing D2, however if you would like to, you can get it here: (you must sign up, and get their FTP password, their FTP server contains...everything you could ever want as a reverser ) Please don't flood their forums though or their servers too much with useless downloads. They are an awesome site, but their bandwidth is somewhat limited, and I don't want them to have to close off membership again.

Olly Debug is a much better choice for most of your practical reversing situations. It runs in the application "ring"(ring3) so that it only freezes the program that you are debugging, and you can still interact with all of your other windows normally. It is a free program, that you can get here:

A few notes about Olly. When you attach to a process, you cannot detach without killing that process. Also, if you attach to one process, you must close Olly before attaching to another...this is just a bug in Olly that may or may not be resolved in later versions. There are many tutorials on how to use Olly available on the web, just google for "Ollydbg Tutorials". In fact, I'll even give you a google link: ... +tutorials

Also, the OllyDbg website(you know...that site you downloaded Olly from?) has a lot of good basic stuff on how to use it. And read the help in olly. I cannot stress enough how much you absolutely must read the help. Same with SoftIce, tons of tutorials on the web, if you google for them. It also comes with a HUGE PDF on how to use it.

As for the use of your dis-assembler, IDA: just read the help for that. It's fairly simple to use, just play around until ya get the hang of it. And of course, I'd be happy to answer any questions that you have, provided they aren't too obviously answerable by googling, or RTFM'ing.

Physco wrote:
Its all the same haha they try to disprove our religion but they cant, they just point out that theirs a lack of evidence and an overwhelming amount of evidence on their side.

 Post subject:
PostPosted: Wed Aug 08, 2007 2:32 am 

Joined: Sun Dec 08, 2002 4:39 pm

I like the example of the assembly then c++ code.

Good tutorial.

 Post subject:
PostPosted: Wed Aug 08, 2007 3:44 am 

Joined: Fri Jan 23, 2004 2:02 am
Whoa. Thanks :)

 Post subject:
PostPosted: Wed Aug 15, 2007 10:50 pm 

Joined: Sun Jan 14, 2007 3:49 pm
that was hawt. ty

 Post subject:
PostPosted: Wed Aug 15, 2007 11:57 pm 

Joined: Thu Jul 19, 2007 2:36 am
The and IDA Pro link don't work.

 Post subject:
PostPosted: Thu Aug 16, 2007 6:55 am 

Joined: Sun May 09, 2004 5:41 am
TemporaryUsername wrote:
The and IDA Pro link don't work.

I know, this is from a very very long time ago. I'm not going to link you directly to IDA Pro, but with some searching it's not too hard to find. As for the tutorial:


 Post subject:
PostPosted: Thu Aug 16, 2007 8:05 am 

Joined: Thu Jul 19, 2007 2:36 am
New link doesn't work either lol. Perhaps its just me?

 Post subject:
PostPosted: Thu Aug 16, 2007 9:06 am 

Joined: Sun May 09, 2004 5:41 am
TemporaryUsername wrote:
New link doesn't work either lol. Perhaps its just me?

Google cache to the rescue: ... =firefox-a

And i'll just quote it here, so it's saved:

This is Google's cache of the page as retrieved on Jul 16, 2007 14:38:54 GMT.


Sk00l m3 ASM!!#@$!@#

by Ralph
-AWC
Version: 0.841
Date: 7/23/00

NOTE: This thing is almost done, just gotta finish off the Win32 section, however I
started working on other shit so finishing this is kinda 10th on my priority
list. If you think you can convince me to finish it sooner, feel free to
contact me.


1. Introduction
-What is it?
-Why learn it?
-What will this tutorial teach you?

2. Memory
-Number Systems
-Bits, Nybbles, Bytes, Words, Double Words
-The Stack

3. Getting started
-Getting an assembler
-Program layout

4. Basic ASM
-Basic Register operations
-Stack operations
-Arithmetic operations
-Bit wise operation

5. Tools

6. More basics
-.COM file format
-Flow control operations
-String Operations
-User Input

7. Basics of Graphics
-Using interrupts
-Writing directly to the VRAM
-A line drawing program

8. Basics of File Operations
-File Handles
-Reading files
-Creating files
-Search operations

9. Basics of Win32
-A Message Box
-A Window

Appendix A

Appendix B
-Credits, Contact information, Other shit

1. Introduction

What is it?
Assembly language is a low-level programming language. The syntax is nothing like
C/C++, Pascal, Basic, or anything else you might be used to.

Why learn it?
If you ask someone these days what the advantage of assembly is, they will tell you it's
speed. That might have been true in the days of BASIC or Pascal, but today a C/C++
program compiled with an optimized compiler is as fast, or even faster than the same
algorithm in assembly. According to many people assembly is dead. So why bother
learning it?
1. Learning assembly will help you better understand just how a computer works.
2. If windows crashes, it usually returns the location/action that caused the error.
However, it doesn't return it in C/C++. Knowing assembly is the only way to track
down bugs/exploits and fix them.
3. How often do you wish you could just get rid of that stupid nag screen in that
shareware app you use? Knowing a high-level language won't get you very far when you
open the shit up in your decompiler and see something like CMP EAX, 7C0A
4. Certain low level and hardware situations still require assembly
5. If you need precise control over what your program is doing, a high level language
is seldom powerful enough.
6. Any way you put it, even the most optimized high level language compiler is still
just a general compiler, thus the code it produces is also general/slow code. If
you have a specific task, it will run faster in optimized assembly than in any other
language.
7. "Professional Assembly Programmer" looks damn good on a resume.
My personal reason why I think assembly is the best language is the fact that you're
in control. Yes all you C/C++/Pascal/Perl/etc coders out there, in all your fancy
high level languages you're still the passenger. The compiler and the language itself
limit you. In assembly you're only limited by the hardware you own. You control the
CPU and memory, not the other way around.

What will this tutorial teach you?
I tried to make this an introduction to assembly, so I'm starting from the beginning.
After you've read this you should know enough about assembly to develop graphics
routines, make something like a simple database application, accept user input,
make Win32 GUIs, use organized and reusable code, know about different data types
and how to use them, some basic I/O shit, etc.

2. Memory
In this chapter I will ask you to take a whole new look at computers. To many they
are just boxes that allow you to get on the net, play games, etc. Forget all that
today and think of them as what they really are, Big Calculators. All a computer does
is Bit Manipulation. That is, it can turn certain bits on and off. In principle, a
computer doesn't even need to do all arithmetic operations. All it has to do is add.
Subtraction is achieved by adding negative numbers, multiplication is repeated adding,
and dividing is repeated adding of negative numbers.

Number systems
All of you are familiar with at least one number system, Decimal. In this chapter I
will introduce you to 2 more, Binary and Hexadecimal.

Before we get into the other 2 systems, lets review the decimal system. The decimal
system is a base 10 system, meaning that it consists of 10 digits that are used to make
up all other numbers. These 10 digits are 0-9. Lets use the number 125 as an example:
         Hundreds  Tens    Units
Digit    1         2       5
Meaning  1x10^2    2x10^1  5x10^0
Value    100       20      5
NOTE: x^y means x to the power of y. ex. 13^3 means 13 to the power of 3 (2197)
Add the values up and you get 125.

Make sure you understand all this before going on to the binary system!

The binary system looks harder than decimal at first, but is in fact quite a bit easier
since it's only base 2 (0-1). Remember that in decimal you go "value x 10^position" to
get the real number, well in binary you go "value x 2^position" to get the answer.
Sounds more complicated than it is. To better understand this, lets do some converting.
Take the binary number 10110:
1 x 2^4 = 16
0 x 2^3 = 0
1 x 2^2 = 4
1 x 2^1 = 2
0 x 2^0 = 0
Answer: 22

NOTE: for the next example I already converted the Ax2^B stuff to the real value:
2^0 = 1
2^1 = 2
2^2 = 4
2^3 = 8
2^4 = 16
2^5 = 32

Lets use 111101:
1 x 32 = 32
1 x 16 = 16
1 x 8 = 8
1 x 4 = 4
0 x 2 = 0
1 x 1 = 1
Answer: 61
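
The two conversions above can be sketched in C like this. It is only an illustration; the function name bin_to_dec is made up. Instead of working out each 2^position separately, it doubles a running total for every digit it reads, which comes to the same sum.

```c
#include <stddef.h>

/* Convert a string of '0'/'1' characters to its value.
   Stops at the first character that is not a binary digit,
   so an empty string gives 0. */
unsigned long bin_to_dec(const char *bits)
{
    unsigned long value = 0;
    for (size_t i = 0; bits[i] == '0' || bits[i] == '1'; i++) {
        /* shifting left by one doubles the total so far,
           then the new digit is added in the lowest position */
        value = (value << 1) | (unsigned long)(bits[i] - '0');
    }
    return value;
}
```

Calling bin_to_dec("10110") gives 22 and bin_to_dec("111101") gives 61, matching the worked examples.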

Make up some binary numbers and convert them to decimal to practise this. It is very
important that you completely understand this concept. If you don't, check Appendix B
for links and read up on this topic BEFORE going on!

Now lets convert decimal to binary, take a look at the example below:
238 / 2 remainder: 0
119 / 2 remainder: 1
59 / 2 remainder: 1
29 / 2 remainder: 1
14 / 2 remainder: 0
7 / 2 remainder: 1
3 / 2 remainder: 1
1 / 2 remainder: 1
0 / 2 remainder: 0
Answer: 11101110

Lets go through this:
1. Divide the original number by 2. If it divides evenly, the remainder is 0.
2. Divide the answer from the previous calculation (119) by 2. If it won't
divide evenly, the remainder is 1.
3. Round the number from the previous calculation DOWN (59), and divide it by 2.
Answer: 29, remainder: 1
4. Repeat until you get to 0, then read the remainders from the bottom up.
The final answer should be 011101110. Notice how the answer given is missing the 1st 0?
That's because, just like in decimal, leading zeros have no value and can be omitted (023 = 23).

Practise this with some other decimal numbers, and check it by converting your answer
back to decimal. Again make sure you get this before going on!
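
Here is the same divide-by-2 method as a short C sketch. The function name dec_to_bin is made up for this example. The remainders come out lowest bit first, so they are collected and then copied out in reverse, which is the "read from the bottom up" step.

```c
#include <limits.h>
#include <stddef.h>

/* Convert n to binary text using repeated division by 2.
   out must have room for sizeof(unsigned) * CHAR_BIT + 1 characters. */
void dec_to_bin(unsigned n, char *out)
{
    char rem[sizeof(unsigned) * CHAR_BIT];
    size_t count = 0;

    do {
        rem[count++] = (char)('0' + n % 2); /* the remainder is the next bit */
        n /= 2;                             /* integer division rounds down */
    } while (n != 0);

    /* remainders were produced lowest bit first, so reverse them */
    for (size_t i = 0; i < count; i++)
        out[i] = rem[count - 1 - i];
    out[count] = '\0';
}
```

For 238 this writes "11101110". The loop stops as soon as the number reaches 0, so the leading zero from the table never shows up.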

A few additional things about binary:
* Usually 1 represents TRUE, and 0 FALSE
* When writing binary, keep the number of digits a multiple of 4
ex. DON'T write 11001, change it to 00011001, remember that the 0s in front
are not worth anything
* Usually you add a b after the number to signal the fact that it is a binary number
ex. 00011001 = 00011001b

Some of you may have noticed some consistency in things like RAM, for example. The sizes
always seem to be a power of 2. For example, it is common to have 128 megs of RAM, but
you won't find 127 anywhere. That's because computers like to use powers of 2: 2, 4, 8,
16, 32, 64 etc. That's where hexadecimal comes in. Since hexadecimal is base 16, and each
hexadecimal digit stands for exactly 4 binary digits, it is perfect for computers. If you
understood the binary section earlier, you should have no problems with this one. Look at
the table below, and try to memorize it. It's not as hard as it looks.
Hexadecimal  Decimal  Binary
0h           0        0000b
1h           1        0001b
2h           2        0010b
3h           3        0011b
4h           4        0100b
5h           5        0101b
6h           6        0110b
7h           7        0111b
8h           8        1000b
9h           9        1001b
Ah           10       1010b
Bh           11       1011b
Ch           12       1100b
Dh           13       1101b
Eh           14       1110b
Fh           15       1111b

NOTE: the h after each hexadecimal number stands for <insert guess here>

Now lets do some converting:
Hexadecimal to Decimal, using 2A4F as the example

F x 16^0 = 15 x 1 = 15
4 x 16^1 = 4 x 16 = 64
A x 16^2 = 10 x 256 = 2560
2 x 16^3 = 2 x 4096 = 8192
Answer: 10831

1. Write down the hexadecimal number starting from the last digit
2. Change each hexadecimal digit to decimal and multiply it by 16^position
3. Add all final numbers up

Confused? Lets do another example: DEAD
D x 1 = 13 x 1 = 13
A x 16 = 10 x 16 = 160
E x 256 = 14 x 256 = 3584
D x 4096 = 13 x 4096 = 53248
Answer: 57005

Practise this method until you get it, then move on.
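
A minimal C sketch of the same conversion follows. The names hex_digit_value and hex_to_dec are made up for this example. It reads the digits left to right and multiplies the running total by 16 each time, which gives the same result as adding up digit x 16^position from the last digit.

```c
#include <stddef.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Convert hex text such as "DEAD" to its value.
   Stops at the first character that is not a hex digit. */
unsigned long hex_to_dec(const char *hex)
{
    unsigned long value = 0;
    for (size_t i = 0; hex_digit_value(hex[i]) >= 0; i++)
        value = value * 16 + (unsigned long)hex_digit_value(hex[i]);
    return value;
}
```

hex_to_dec("2A4F") gives 10831 and hex_to_dec("DEAD") gives 57005, as in the examples. In real code the standard strtoul function with a base of 16 does the same job.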

Decimal to Hexadecimal
Study the following example:

1324 / 16 = 82.75
82 x 16 = 1312
1324 - 1312 = 12, converted to Hexadecimal: C

82 / 16 = 5.125
5 x 16 = 80
82 - 80 = 2, converted to Hexadecimal: 2

5 / 16 = 0.3125
0 x 16 = 0
5 - 0 = 5, converted to Hexadecimal: 5

Answer: 52C
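
The divide-by-16 method above can be sketched in C like this. The function name dec_to_hex is made up for this example. Each remainder becomes one hex digit, and the digits are reversed at the end because the lowest one comes out first.

```c
#include <limits.h>
#include <stddef.h>

/* Convert n to hexadecimal text using repeated division by 16.
   out must have room for sizeof(unsigned) * CHAR_BIT / 4 + 2 characters. */
void dec_to_hex(unsigned n, char *out)
{
    static const char digits[] = "0123456789ABCDEF";
    char rem[sizeof(unsigned) * CHAR_BIT / 4 + 1];
    size_t count = 0;

    do {
        rem[count++] = digits[n % 16]; /* remainder 0..15 picks the digit */
        n /= 16;                       /* integer division rounds down */
    } while (n != 0);

    /* remainders were produced lowest digit first, so reverse them */
    for (size_t i = 0; i < count; i++)
        out[i] = rem[count - 1 - i];
    out[count] = '\0';
}
```

For 1324 this writes "52C". If you only need to print a number in hex, printf with the %X format does it for you.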

I'd do another example, but it's too much of a pain in the ass, maybe some other time.

Learn this section you WILL need it!
This was already one of the hardest parts, the next sections should be a bit easier

Some additional things about hexadecimal
1. It's not uncommon to say "hex" instead of "hexadecimal" even though, technically speaking,
"hex" means 6, not 16.
2. Keep hexadecimal numbers in multiples of 4, adding zeros as necessary
3. Most assemblers can't handle numbers that start with a "letter" because they don't
know if you mean a label, instruction, etc. In that case there are a number of
other ways you can express the number. The most common are:
DEAD = 0DEADh (Usually used for DOS/Win)
DEAD = 0xDEAD (Usually used for *Nix based systems)
Consult your assembler's manual to see what it uses.

By the way, does anyone think I should add Octal to this...?

Bits, Nibbles, Bytes, Words, Double Words
Bits are the smallest unit of data on a computer. Each bit can only represent 2 numbers,
1 and 0. Bits are fairly useless because they're so damn small so we got the nibble.
A nibble is a collection of 4 bits. That might not seem very interesting, but remember
how all 16 hexadecimal numbers can be represented with a set of 4 binary numbers?
That's pretty much all a nibble is good for.

The most important data structure used by your computer is a Byte. A byte is the
smallest unit that can be accessed by your processor. It is made up of 8 bits, or
2 nibbles. Everything you store on your hard drive, send with your modem, etc is in
bytes. For example, lets say you store the number 170 on your hard drive, it would look
like this:

| 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
  7   6   5   4   3   2   1   0
   H.O Nibble   |   L.O Nibble

10101010 is 170 in binary. Since we can fit 2 nibbles in a byte, we can also refer
to bits 0-3 as the Low Order Nibble, and 4-7 as the High Order Nibble.

Next we got Words. A word is simply 2 bytes, or 16 bits. Say you store 43690, it would
look like this:

| 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
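
To tie the nibble, byte and word pictures together, here is a short C sketch that pulls the two nibbles out of a byte and glues two bytes into a word. The function names are made up for this example.

```c
#include <stdint.h>

/* Bits 0-3 of a byte: the Low Order Nibble. */
uint8_t low_nibble(uint8_t b)
{
    return (uint8_t)(b & 0x0F);
}

/* Bits 4-7 of a byte: the High Order Nibble, shifted down to 0..15. */
uint8_t high_nibble(uint8_t b)
{
    return (uint8_t)((b >> 4) & 0x0F);
}

/* Build a 16-bit word from a high byte and a low byte. */
uint16_t make_word(uint8_t high, uint8_t low)
{
    return (uint16_t)(((unsigned)high << 8) | low);
}
```

For the byte 170 (10101010b) both nibbles come out as 10, which is A in hexadecimal, so the byte is AAh. Two such bytes make the word 43690 (AAAAh).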


 Post subject:
PostPosted: Thu Aug 16, 2007 9:24 am 

Joined: Fri Jan 23, 2004 2:02 am
Darawk, it seems that last quote is cut off a bit.

Anyway, I found this plugin that allowed me to attach to Mythos.

Hide Debugger: ... e_Debugger

 Post subject: Just for a bit of updated info
PostPosted: Sun Sep 20, 2009 7:30 am 

Joined: Sat Dec 11, 2004 6:33 am
more modern asm tutorials and references

Paul Carter's x86 32-bit assembly textbook

NASM Cheat Sheet ... index.html

Ralf Brown's Interrupt List


Lena151's Reversing Tutorials along with lots of other good info can be downloaded from:

 Post subject: Re: Reverse Engineering Basics
PostPosted: Sat Mar 24, 2012 8:03 am 

Joined: Wed Aug 24, 2011 7:29 pm
Thanks for the post!
