Potentially this is another long section. Challenge crackits/crackmes vary considerably and there is a lot of ground to cover. At its simplest a password will be in plain view and can be found using only a hex editor. As an example of something more complex, I recently solved a level where I had to write a decryptor to strip five layers of encryption, then extract some of the assembly language within it, and finally write a brute forcer to find the right password. You may find it difficult to gauge how hard a particular crackme is but I will do what I can to point you in the right directions. I often think that many challenge designers do not know assembly particularly well, and levels have often been written in things like VB, VB P-code, MFC, and Delphi. All of these produce executables that can be difficult to debug and disassemble. They are often bloated and present great problems for newbies and no problems for experts. It's not unknown to find 10 different crackmes on a challenge site which are practically the same - VB with some string compare at the end. By far the most interesting levels tend to be hand written asm and when you find a hand written asm level you will know it - they are compact and neat and the code just flows right :)
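For the simplest case, where the password sits in plain view, a hex editor is all you need, but the same search can be scripted. What follows is only a minimal sketch in Python: the function name find_strings and the minimum length of 4 are assumptions made for illustration, not part of any particular tool. It lists runs of printable ASCII characters and also runs of UTF-16LE characters, since VB programs usually store their strings as unicode.

```python
import re

def find_strings(data: bytes, min_len: int = 4):
    """Return (offset, encoding, text) for printable strings in a byte buffer."""
    found = []
    # Runs of printable ASCII characters.
    for m in re.finditer(rb"[\x20-\x7e]{%d,}" % min_len, data):
        found.append((m.start(), "ascii", m.group().decode("ascii")))
    # Runs of printable characters each followed by a zero byte (UTF-16LE).
    for m in re.finditer(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len, data):
        found.append((m.start(), "utf-16le", m.group().decode("utf-16-le")))
    return sorted(found)

if __name__ == "__main__":
    import sys
    # Usage: python find_strings.py crackme.exe
    if len(sys.argv) > 1:
        with open(sys.argv[1], "rb") as fh:
            for offset, kind, text in find_strings(fh.read()):
                print(f"{offset:08x} {kind:9} {text}")
```

Anything that looks like a password or a success message in the output is worth trying before you reach for a debugger.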
Let's start with a quick talk about tools. You will soon find people talking about debuggers and disassemblers, but there are other tools that can be used in solving these challenges too. The main areas are:
If you are looking for introductory material and tutorials, etc then consider the old Fravia site and the Anticrack site. Both sites are absolutely huge but contain within their pages some invaluable lessons in reverse engineering, assembly, unpacking, coding, etc. What follows here are some specific tips on challenge levels.
Many challenge levels seem to be written in VB (if only challenge writers could learn to code ;)). When you compile a program in VB you can compile either to native x86 code or to P-code. Whichever you choose, the start of the program will be a call to MSVBVM60.DLL or similar. When you look through memory it will contain a lot of data after this call, relating to the VB form construction. Eventually, somewhere further down in memory, you should find some code (or P-code). If you suspect it is P-code then try the aforementioned tools. If it isn't P-code then these tools will soon tell you. Unfortunately there is NO documentation about P-code, and I mean there is NO documentation about it. You can often find little snippets in some tutorials but it is largely going to be guesswork. WKTVBDE can help greatly here. Let's assume you have x86 code in the program. If you scroll through the first section in Ollydbg then you will come across some code. This will normally be clearly demarcated into a number of routines, each starting with standard entry code (push ebp; mov ebp,esp; sub esp,xxh; etc) and finishing with standard exit code (mov esp,ebp; pop ebp; ret; etc). Each of these will either be some subroutine called from elsewhere, or it will be a handler for some event. Generally with challenge levels there will be a form for a password and a button to press. You might have one routine in the code and it will be the 'button pushed' event handler. It's easy enough to breakpoint the start of any or each of these routines.
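The standard entry code gives you a cheap way to list candidate routine starts without scrolling. The sketch below is an illustration only: it assumes the compiler emitted the common encoding 55 8B EC for push ebp; mov ebp,esp, and the name find_prologues and the base address in the usage example are made up. A plain byte search like this will also hit the same three bytes inside data or in the middle of another instruction, so treat the results as places to look rather than a definitive list.

```python
# Byte encoding assumed for "push ebp; mov ebp, esp".
PROLOGUE = bytes.fromhex("558bec")

def find_prologues(code: bytes, base: int = 0):
    """Return addresses (base + offset) where the prologue bytes occur."""
    hits = []
    pos = code.find(PROLOGUE)
    while pos != -1:
        hits.append(base + pos)
        pos = code.find(PROLOGUE, pos + 1)
    return hits

# Usage with a hypothetical dump of the code section loaded at 0x401000:
# nop; push ebp; mov ebp,esp; sub esp,10h; ret; push ebp; mov ebp,esp; leave; ret
sample = bytes.fromhex("90 558bec 83ec10 c3 558bec c9 c3")
for addr in find_prologues(sample, base=0x401000):
    print(hex(addr))
```

Each address printed is a candidate for a breakpoint on the start of a routine.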
Now when you look at VB code you will generally see a number of calls scattered throughout the code to the runtime MSVBVM. Some of these may be named and some will not be named. Often you will have something like:
    ....
    call __vbaVarTstEq
    ....
    je someaddr
    ....
    push goodstring
    ....
someaddr:
    ....
    push badstring
    ....
The obvious point to breakpoint is on the call. So you get to the call, and want to know what the two values being compared are. In a simple crackme this will probably be your entered serial/password and the correct serial/password. Simply peek at these and you have the right answer. So where are the strings ? Well, prior to this there should be some pushes to the stack and the top stack values will be pointers. So we move the data area to esp and write the pointers down and then look at the addresses by these pointers. Typically you will find something like:
00123456 08 00 00 00 12 13 12 00 15 14 16 00
Now that doesn't look like a string. In fact the '08' indicates the variable type, in this case a string (and you will mostly find '08' when looking at this type of level). The address where the string is actually stored is [00123456+8]. The four bytes there are 15 14 16 00, and since x86 stores values in little-endian order the string in this case is at 00161415. So now you would look and see what is at 00161415 and probably find a unicode string there. The fact that type=8 indicates a string and that the address is stored at [var+8] is a coincidence. If type=0 then this indicates that the variant has no value (like null).
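The decoding of the dump above can be checked with a few lines of Python. This is a sketch under assumptions: it treats the first two bytes as a 16-bit type field and the four bytes at offset 8 as a 32-bit little-endian pointer, which matches the usual COM variant layout, and the helper names decode_variant and read_wide_string are invented for the example.

```python
import struct

# Only the two type values mentioned in the text.
VT_NAMES = {0: "no value", 8: "string"}

def decode_variant(raw: bytes):
    """Return (type, data pointer) from the first 12 bytes of a variant."""
    vtype, = struct.unpack_from("<H", raw, 0)     # 16-bit type at offset 0
    data_ptr, = struct.unpack_from("<I", raw, 8)  # 32-bit pointer at offset 8
    return vtype, data_ptr

def read_wide_string(mem: bytes, offset: int) -> str:
    """Decode a UTF-16LE string from mem, stopping at a 00 00 terminator."""
    end = offset
    while end + 1 < len(mem) and mem[end:end + 2] != b"\x00\x00":
        end += 2
    return mem[offset:end].decode("utf-16-le")

# The twelve bytes shown at 00123456 in the dump above.
dump = bytes.fromhex("08 00 00 00 12 13 12 00 15 14 16 00")
vtype, ptr = decode_variant(dump)
print(VT_NAMES.get(vtype, "other"), f"{ptr:08X}")
```

In a real session you would then dump memory at the pointer and read the unicode string there, which is what read_wide_string does for a buffer you have already copied out.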
Many other VB calls work in a similar fashion. Often a variant pointer is returned in eax, or a value is returned in eax. It is often useful to step over the call and see which registers have changed, and then check out each one in turn if it refers to a memory address. After a while you should become familiar with the type of things that VB does, like freeing lists of temporary variables (which VB creates by the dozen when you do string manipulation). You will often find short sequences where a variable is checked to see if it has a value and if not then it is allocated some space, these are characterised by a call to '__vbaObjSet' followed by a short forward jump shortly afterwards.
Another popular language for challenge levels is Delphi. Delphi executable code has its own characteristics as well. You will find that the executables tend to be large with a lot of library code at the beginning of the code section, and the program entry code at the end of the code section. It is normally interspersed with tables and strings that indicate objects (Delphi stores object data and code in the same areas). Code tends to be what I would call dense in the sense that a lot of Delphi generated code consists of sequences that tend to push a few values and then make a call, as opposed to sparse code where a long sequence of code might manipulate values. If you try and trace Delphi code in a debugger by following all of the calls then you will soon be lost in a maze of code. Delphi subroutines often have separate exit stubs that are jumped to and all look very similar.
I take two approaches to Delphi levels. First of all is the need to find the appropriate code that we are interested in - start at the end of the code section and move backwards, and you should find it soon enough. Secondly, when tracing a Delphi program the general tip is: never trace into calls. Occasionally there may be a routine hidden away that you will need to look at; this will normally have a high address (i.e. it is a user piece of code and not a library piece of code). Delphi routines tend to have a lot of startup and exit code to them, so you need to look backwards from where you want to be and you will need to develop a good ignorance factor when trying to decipher them ;) Generally ignore any calls where it is not obvious what they do (check a call's arguments being pushed to the stack and check what was returned in registers) and try to work out what is happening at the topmost level before diving in any lower.
In these days of win32 PE files it can be quite disconcerting to find challenge levels that hark back to older days - DOS executables, NE executables or COM files. COM files are the easiest to explain: they are 16-bit executables that load at an offset of 0x100 and run in a single segment (oh, you remember segmented memory, don't you?). Many disassemblers will disassemble a COM file, and you may need recourse to a copy of Ralf Brown's interrupt list to resolve any interrupt calls being used. For those of you that are lost: interrupt calls were system calls to perform functions like reading/writing to files or screen, etc, in a broadly (very broadly) similar way that win32 API calls are used to accomplish these tasks today, but arguments were passed in registers and the value in AX (usually its high byte, AH) determined which function was performed. General DOS executables are similar although multi-segmented, and I think it's easy to forget these days that an address can be referred to in many ways through segmented addressing. NE executables were the original windows 16-bit executables. Faced with a challenge involving any of these files you will probably be tempted to grab your latest disassembler and try to decode it. Personally I have found that some disassemblers seem to have lost some of their ability to decipher these programs well. Older versions often do a much better job. Luckily there are few NE challenges around; you are more likely to find DOS and COM files. The only occasion I have seen an NE file recently was with a new CD protection scheme. Perhaps future challenge levels will try something similar?
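If segmented addressing has faded from memory, the arithmetic is short enough to sketch. In 16-bit real mode a segment:offset pair refers to the linear address segment*16 + offset, which is why the same byte can be named in many ways. The function names and the segment value 0x1234 below are made up for illustration, and address wrap-around at 1MB is ignored.

```python
COM_LOAD_OFFSET = 0x100  # a COM image starts at offset 0x100 in its segment

def linear(segment: int, offset: int) -> int:
    """Real-mode linear address for segment:offset (no 1MB wrap-around)."""
    return (segment << 4) + offset

def com_file_to_offset(file_pos: int) -> int:
    """Offset within the segment of the byte at file_pos in a COM file."""
    return file_pos + COM_LOAD_OFFSET

# The first byte of a COM file loaded in hypothetical segment 0x1234,
# and a second segment:offset pair that names the same byte.
entry = linear(0x1234, com_file_to_offset(0))
alias = linear(0x1244, 0x0000)
print(hex(entry), hex(alias))
```

When a disassembler shows you a far pointer in a multi-segment DOS executable, converting both sides to linear addresses like this is the quickest way to see whether two references point at the same place.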
Overall I would say that there are two classes of challenge crackit. The first class either has the correct password in the program, or it is generated at some point within the program and used in a compare. These are obviously the simplest levels, and anyone should be able to crack most of them with a debugger and some patience. The second class contains everything else. Perhaps you need to do something else so that a jump is made to the correct routine (for example creating a file with a certain name), or perhaps the password is manipulated in some way and then checked against an encrypted version. The second class of crackme requires a better understanding of what is going on than the first class. Sometimes you will be asked for a correct answer and when it is compared the program will simply output a message like 'congratulations the answer you need for the challenge is blah' where blah is a constant value. You can often circumvent the checking on these kinds of levels simply by changing values in a debugger like Ollydbg. It's always worth playing around and seeing what happens when a jump is or is not made. Double clicking on flags in Olly will change them, and when you are paused before a jump it is easy to change the path of execution. Having said that, don't be dismayed if the program crashes out, these things happen ;) When tracing code I do not just single step; I often place breakpoints and then see what happens when I run a program again and change values at those points, or change the flow of execution.
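For the second class, where the password is transformed and compared against a stored value, a brute forcer is sometimes the practical route once you have lifted the transform out of the disassembly. The sketch below shows only the shape of such a tool. The function toy_check is a hypothetical stand-in (a rolling 16-bit hash invented for this example), not the routine from any real crackme; in practice you would replace it with a reimplementation of whatever the level actually does.

```python
from itertools import product
import string

def toy_check(candidate: str) -> int:
    """Hypothetical stand-in for a crackme's transform: a rolling 16-bit hash."""
    h = 0
    for ch in candidate:
        h = (h * 31 + ord(ch)) & 0xFFFF
    return h

def brute_force(target: int, alphabet: str = string.ascii_lowercase,
                max_len: int = 4):
    """Try every string up to max_len and return the first that hits target."""
    for length in range(1, max_len + 1):
        for combo in product(alphabet, repeat=length):
            candidate = "".join(combo)
            if toy_check(candidate) == target:
                return candidate
    return None  # nothing found within the search space

# Usage: pretend the stored value was produced from some unknown password.
stored = toy_check("abc")
print(brute_force(stored))
```

With a weak transform like this one, several inputs can produce the same stored value, so the first hit is a password the check accepts and not necessarily the one the author had in mind.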
Of course the crackme itself could be a puzzle that you need to solve, and the crackme is merely playing that puzzle to you. If you want an example of this then see Along3x - Zebra crackme and My solution to it. If you play Electrica then expect to find more of these levels later on.
It's difficult to recommend books in this area, but I recently bought and read a book on IL assembly which is quite good - at least it is documented, unlike VB P-code. Personally I haven't read any of the recent assembly language books and so cannot comment on them. I have also put a link here to Aho and Ullman's classic compiler design book, "The dragon book" as it is more informally known. Once you have written your own compiler you will understand a hell of a lot more about reversing: assembly language will take on more structure and you will not find yourself bogged down in details so much. Of course I don't mean a fully fledged ISO compliant C++ compiler, I'm talking tiny pascal or some such here, from design to machine code with the help of say bison and flex. Reversing isn't just about understanding a few assembly language instructions; there is so much more that can make life so much easier for you. Don't understand the MS P-code engine? No information about it? If you understand compilation then the P-code engine starts to become a bit easier to understand, but that's not to say that even that is easy. I have also put a link to the undocumented file formats book - not for the casual novice, it details the LE and W3/W4 executable formats, which is information really hard to come by. Kauler's system programming book is a gem for the expert only.
I have recently added two books by Kris Kaspersky. The disassembly book is an excellent read if you have only basic assembler knowledge. The CD book is my current read and I am really enjoying it and looking forward to some interesting CD experiments.