HELLO EVERYBODY !!!!
In this tutorial we'll be covering loops, what they are, how to use them, and then comparing them to other types of loops you might find in higher level programming languages.
Preface:
Just some information/expectations for you guys:
1. I will be using Linux for these articles. No, they will not work on Windows. However, because of the nature of this site, I expect that most if not all people on here have access to, at the very least, a virtual machine they can use. If not, direct yourself over to VMware, and to Ubuntu.
2. I will be using Ubuntu 12.10 for these articles. Don't fret, the code will assemble on any Linux-based distro, but you may have to use a different package management system to download the assembler.
3. You need to know hexadecimal for these articles. You don't need to know all that much, just what it is, how to convert it, and just be generally comfortable seeing it.
4. This is not for people new to programming. In my opinion, you should learn assembly after you know a bit more about C/C++, or some other lower-level language. Because this is not for those who are new to programming, I will not explain certain programming concepts I expect most people with moderate knowledge of programming will know (functions, pointers, arrays, etc.).
5. After a recent upgrade, I now find myself using Ubuntu 12.10 x64. This means that I will be including the linking/assembling commands for that system as well. However, because x86-64 CPUs can still run 32-bit code, I will only be using 32-bit instructions and registers. This means that people on a 64-bit OS, such as myself, will have to use a slightly longer linking command than those on a 32-bit OS. Everything else should be the same.
6. I expect that you all have read and comprehended to some degree my previous tutorial. If not, you can find it here.
7. That's it! Let's get started!
Loops in NASM:
If you recall from last time, NASM has a kind of function-esque feature, the label, that looks something like this:
CODE :
MyFunction:
add eax, 8
sub ebx, 12
cmp eax, ebx
...
These little buggers, called labels, are an interesting part of the language, in that they are both similar to and different from functions in higher-level languages. You can jump to them at any given time and execute the code that follows, much like a function. On the other hand, as described in the last tutorial, they also bleed into each other: when execution reaches the end of one label's code, it simply falls through into the next. They also don't take any arguments (although you can pass arguments yourself through the stack). "So what's the use of them?" a silly person might ask, and although this person is reasonably silly, it's still a sound question. Functions that bleed into each other and don't accept arguments seem pretty pointless, except for maybe acting as comments in the code. But they are very important, in that they work perfectly for loops. A loop in NASM might look something like this:
CODE :
MyFunction:
add eax, 8
sub ebx, 12
cmp eax, ebx
je MyFunction
jmp somethingWentWrong
somethingWentWrong:
...
As you can see, there are two clearly defined labels, MyFunction and somethingWentWrong. In this example, we can also see that a comparison between eax and ebx is taking place (line 4 of the example). Furthermore, the result of this comparison determines whether we jump back to "MyFunction" or go on to "somethingWentWrong". To paint a clearer picture, I've commented the same example below to show the possible paths through the code.
CODE :
MyFunction:
add eax, 8 ;Adding 8 to EAX
sub ebx, 12 ;Subtracting 12 from EBX
cmp eax, ebx ;Comparing EAX and EBX. This is essentially the equivalent of subtracting EBX from EAX to see if the result is 0, i.e. they are the same number. Neither register is changed; only the CPU flags are set.
je MyFunction ;If the result of this comparison is 0 (the zero flag is set), they are in fact equal, and this moves us back to the top of "MyFunction". This is the "jump if equal" instruction.
jmp somethingWentWrong ;If we didn't take the previous jump, then they must not be equal. So in this example, we're going to error and jump to "somethingWentWrong"
somethingWentWrong:
...
"Alright, so that's sort of cool, you can make an 'if' statement in NASM (if this is equal to that, then do this; if it isn't, then do something else). But what about actual looping?"
Well, my friend, that can be accomplished relatively easily. But first, let's dissect actual loops in higher level languages.
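Before we do, for those who think more easily in C, here is a rough sketch of the control flow in the commented example. The function name run_my_function and the two pointer parameters standing in for EAX and EBX are made up for illustration; NASM itself gives you nothing of the sort.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper mirroring the MyFunction loop.
   eax and ebx stand in for the registers; the return value is the
   number of times the loop body ran. */
int run_my_function(int32_t *eax, int32_t *ebx)
{
    assert(eax != NULL && ebx != NULL);
    int passes = 0;
    do {
        *eax += 8;   /* add eax, 8 */
        *ebx -= 12;  /* sub ebx, 12 */
        passes++;
    } while (*eax == *ebx); /* cmp eax, ebx followed by je MyFunction */
    /* Leaving the loop corresponds to jmp somethingWentWrong. */
    return passes;
}
```

Starting from eax = 0 and ebx = 20, the first pass leaves both at 8, so the jump back is taken; the second pass leaves 16 and -4, so the loop ends.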
Loops in Higher Level Languages:
(Creative title, no?)
All loops in higher level languages are the same save for the syntax. A "for" loop repeats something for "x" amount of times. This can also be represented as a "while" loop that looks something like this:
CODE :
x = 0;
while(x < 30)
{
printf("%d", x);
x++;
}
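That does the same thing as a "for" loop, doesn't it? Or, take a "while" loop for example. A "while" loop repeats something for as long as its condition is true. This can also be represented as a "for" loop that looks something like this:
CODE :
conditions = true; /* Something inside the loop is expected to set this to false eventually */
for(x = 0; x < 20; x++)
{
printf("%d", x);
if(conditions == true)
{
x = 0; /* The condition still holds, so reset x and keep looping */
}
else
{
x = 20; /* The condition no longer holds, so push x to the limit and end the loop */
}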
}
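To make the comparison concrete, here is a small sketch of the same job written both ways. The function names sum_with_for and sum_with_while are invented for this example, and adding up the counter is just a stand-in for "doing something" on each pass.

```c
#include <assert.h>

/* Adds up 0, 1, ..., limit - 1 using a "for" loop. */
int sum_with_for(int limit)
{
    assert(limit >= 0);
    int total = 0;
    for (int x = 0; x < limit; x++) {
        total += x;
    }
    return total;
}

/* Does exactly the same work using a "while" loop. */
int sum_with_while(int limit)
{
    assert(limit >= 0);
    int total = 0;
    int x = 0;              /* the "for" loop's initializer */
    while (x < limit) {     /* the "for" loop's condition */
        total += x;
        x++;                /* the "for" loop's increment */
    }
    return total;
}
```

Calling either function with the same limit should give the same answer, since the three parts of the "for" header have simply been moved to different places.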
As you can see, each can accomplish what the other can do. Granted, it's easier to use them for their intended purposes, but besides syntax, they're the same thing. Now, let's extend these out to NASM:
The stack:
Ha, sure tricked you. You thought we were going straight to the code, well, no. We are, but not right now. Right now, we're going to talk about something called "the stack". However, this really shouldn't take that long and is critical to all future tutorials anyway so we might as well get it done now.
You can think of an operating system as a classroom. In this metaphor, the professor is the kernel, and each student is a program. Now, this is one of those super-hard classes that give you a massive amount of in-class work. So much so, that every time you enter the classroom you are handed a massive "stack" of papers (see what I did there?). Except, maybe this is a creative art class so each paper is blank and you have to write your own information on it. When you're done with this information, you set it to the side, neatly stacking each new piece of paper on top of the last.
This is exactly what the kernel does for its programs. It gives each one a large area in memory, known as "the stack". You can write to the stack whenever you want, but it is a FILO (first-in-last-out) data structure. Similar to your pile of finished papers, the first thing you write ends up on the bottom, and each consecutive write goes on top of the previous one. This means that if you want to get to your first piece of data (or paper), you're going to have to go through all the other ones to get to it.
To make this concrete, here is an example program in NASM with the state of the stack shown throughout:
CODE :
SECTION .text
global _start
_start:
mov eax, 8 ;Moving 8 into EAX
push eax ;Pushing the value of EAX (8) onto the stack. The stack currently contains the number 8 on the bottom, and nothing else.
mov ebx, 10 ;Moving 10 into EBX
push ebx ;Pushing the value of EBX (10) onto the stack. The stack currently contains the number 8 on the bottom, and the number 10 on top of it.
pop eax ;Taking the TOP value (10) off of the stack and moving it into EAX. This means EAX equals 10. The stack currently contains the number 8 on the bottom, and nothing else.
pop ebx ;Taking the TOP value (8) off of the stack and moving it into EBX. This means EBX equals 8. The stack currently contains nothing.
...
If you still aren't exactly sure, I implore you to look it up elsewhere on the Internet. This is vital to your understanding of lower-level concepts.
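If it helps to see the same idea outside of assembly, here is a minimal sketch of a stack in C. Everything in it (toy_stack, toy_push, toy_pop, and the capacity of 16) is made up for illustration; the real stack is just a region of memory managed with push and pop, not a C struct.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_CAPACITY 16 /* arbitrary size chosen for this sketch */

/* Hypothetical toy stack backed by a fixed-size array. */
struct toy_stack {
    int items[STACK_CAPACITY];
    size_t count; /* number of values currently stored */
};

/* Puts a value on top of the stack, like "push". */
void toy_push(struct toy_stack *s, int value)
{
    assert(s != NULL && s->count < STACK_CAPACITY);
    s->items[s->count++] = value;
}

/* Removes and returns the value on top of the stack, like "pop". */
int toy_pop(struct toy_stack *s)
{
    assert(s != NULL && s->count > 0);
    return s->items[--s->count];
}
```

Pushing 8 and then 10, as in the NASM example, means the first pop hands back 10 and the second hands back 8.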
High-level loops in NASM:
Two sections ago, we already saw everything we need to build our own high-level loop in NASM. We were able to compare numbers, we were able to add numbers, we were able to subtract numbers, and we were able to jump to different sections of code based on said numbers. Given these abilities, a "for" loop should be no problem at all. Let's see if we can make a "for" loop that repeats a message 20 times.
CODE :
SECTION .data
msg: db "Repeating this message!", 10 ;The message we're going to repeat (10 is the newline character)
len: equ $-msg ;The length of the message
SECTION .text
global _start
_start:
mov eax, 0 ;Clearing out EAX before we start our loop
jmp forLoop ;Off to our loop!
forLoop:
cmp eax, 20 ;Does EAX equal 20 yet?
je finished ;If it does, then we want to quit our loop.
push eax ;Putting what EAX currently equals on the top of the stack, to save it for later. The write below needs EAX for itself, and the kernel puts its return value there.
mov eax, 4 ;Telling our kernel that we're going to do a write
mov ebx, 1 ;Specifying that we're going to write to STDOUT
mov ecx, msg ;The message we're going to write to STDOUT
mov edx, len ;The length of the message
int 0x80 ;Calling the kernel
pop eax ;Getting our previous EAX value off of the stack and back into the register
add eax, 1 ;Adding one to EAX
jmp forLoop ;Repeating our loop
finished:
mov eax, 1 ;Telling our kernel that we're going to exit this program
mov ebx, 0 ;Our exit code (0)
int 0x80 ;Calling our kernel
Save this program as "looping.asm" (of course you can change the name).
To assemble this code:
nasm -f elf "looping.asm"
To link this program (for 32-bit OS users):
ld -s -o looping looping.o
To link this program (for 64-bit OS users):
ld -m elf_i386 -s -o looping looping.o
And, hey, look at that, we did it! The output of this program should be "Repeating this message!" printed 20 times in the terminal. We started out by clearing out EAX, then jumping to our forLoop, which compared EAX to 20 to see if it was equal or not. If it was, then we finished. If it wasn't, then we pushed the number we were at to the stack (the write needs EAX for its own purposes, so our counter would otherwise be lost), printed out our message, popped the current number back into EAX from the stack, added one to EAX and then repeated the loop.
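For comparison, the same loop can be sketched in C. This is only an approximation under a few assumptions: repeat_message is a made-up name, and instead of asking the kernel to write to STDOUT it copies the message into a buffer supplied by the caller so the result is easy to inspect.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends msg to out `times` times, mirroring forLoop.
   Returns how many copies were actually written. */
int repeat_message(char *out, size_t out_size, const char *msg, int times)
{
    assert(out != NULL && msg != NULL && out_size > 0);
    size_t len = strlen(msg);   /* plays the role of "len" */
    size_t used = 0;
    int counter = 0;            /* mov eax, 0 */
    out[0] = '\0';
    while (counter != times) {  /* cmp eax, 20 and je finished */
        if (used + len + 1 > out_size) {
            break;              /* sketch-only guard against overflowing the buffer */
        }
        memcpy(out + used, msg, len); /* stands in for the write system call */
        used += len;
        out[used] = '\0';
        counter++;              /* add eax, 1 */
    }
    return counter;
}
```

Notice that the C version has no push or pop: the counter lives in its own variable, whereas the NASM version has to park it on the stack because EAX is reused for the system call.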
If you have any questions AT ALL, I'm just a PM away!
Thank you all for reading, and there shall be more to come!
~Centip3de
P.S. Don't blame the crappy code formatting on me, blame it on HTS' inability to add tabs. However, in an effort to combat said bad formatting, I've bolded ALL code. Everything that isn't bolded is a comment and should be preceded by a ";". If it isn't, please change it, or PM me for the full code.