Assembly Language Chapter 1
Assembly language is one of the oldest programming languages, and of all languages it most closely resembles the native machine language of a computer.
It gives direct access to a computer’s hardware.
Learning it teaches you a great deal about your computer’s architecture and operating system.
Questions Related to the AL
What background should I have?
Experience with computer programming in a high-level language (C++, C#, Java, etc.)
What is an Assembler?
An assembler is like a translator for computers. It takes instructions written in a language that’s easy for humans to understand (Assembly Language) and converts them into a language that computers can directly follow (Machine Code).
MASM (Microsoft Assembler),
NASM (Netwide Assembler)
TASM (Borland Turbo Assembler)
What is a Linker?
A linker is an important utility program that takes the object files produced by the assembler and joins them into a single executable file. Linking can be static (library code is copied into the executable at build time) or dynamic (library code in a DLL is loaded at run time).
Both Microsoft linkers are used to create executable files (EXEs) or dynamic-link libraries (DLLs) from object files: LINK.EXE is the 16-bit linker and LINK32.EXE is the 32-bit linker.
What is a Debugger?
A debugger is a tool that allows you to examine the state of a running program, including the contents of memory locations and registers.
MASM CodeView, Turbo Debugger, and Visual Studio are debuggers that allow you to step through code, examine registers and memory, and set breakpoints.
What hardware/software do I need for assembly language?
A computer with an Intel386, Intel486 or one of the Pentium processors (IA-32 processor family).
OS: Microsoft Windows, MS-DOS, or Linux running a DOS emulator. Tools: an editor (VS Code, Sublime Text, Notepad++, etc.), an assembler, a linker, and a debugger.
What types of programs will I create?
16-Bit Real-Address Mode: In 16-bit real-address mode, you can create programs that are small and simple. These programs can access a maximum of 1 MB of memory and cannot use multitasking or memory protection features. e.g., text editors, calculators, games, device drivers, etc.
32-Bit Protected Mode: In 32-bit protected mode, you can create more complex programs that can access more memory and use multitasking and memory protection features. e.g., Operating systems, Word processors, Spreadsheets, Databases, Web browsers etc.
How does assembly language (AL) relate to machine language?
Assembly language has a one-to-one relationship with machine language. Each assembly language instruction corresponds directly to a specific machine language instruction that the computer’s central processing unit (CPU) can understand and execute.
What will I learn?
Basic Boolean logic: Learn the principles of Boolean algebra and how logic gates work.
Basic principle of computer architecture: Understand the fundamental components of a computer system and how they interact.
IA-32 Processors and Memory Management: Delve into the architecture of IA-32 processors and explore memory management modes.
High-Level Language Compilation: Discover how high-level programming languages (such as C++) are translated into machine code.
Machine-Level Debugging: Improve your skills in debugging programs at the machine level.
Interaction with the Operating System: Explore how application programs communicate with an operating system.
How do C++ and Java Relate to AL?
C++ and Java are high-level programming languages, which means they are designed to be more human-readable and easier to work with than assembly language and machine code. Programs written in high-level languages are translated into assembly code by compilers.
Is AL portable?
Assembly language (AL) is generally not portable across different computer architectures or processor families. Each computer architecture (e.g., x86, ARM, MIPS) has its own specific assembly language with unique instructions and syntax.
Why learn AL?
Assembly language is used to write programs for embedded systems, which are small, specialized computers that control devices like cars, appliances, and medical equipment.
It can be used to write highly optimized programs that are very fast and efficient.
It can help you to understand the inner workings of the computer hardware.
Assembly is often used to write device drivers, which are programs that control the hardware devices that are connected to a computer.
Are there any rules in AL?
Very few. Assembly language imposes almost no structural rules of its own; most constraints come from the target processor’s instruction set, i.e., which operations and operand combinations the hardware supports.
Assembly Language Application
It is rare to see large application programs written completely in assembly language because they would take too much time to write and maintain.
AL is used to optimize certain sections of application programs for speed and to access computer hardware. It can be used in
Business application for single & multiple platforms
Hardware device driver
Embedded systems & computer games.
Comparing ASM to High-Level Languages
Type of Application | High-Level Languages | Assembly Language |
Business application software, written for single platform, medium to large size | Easy to organize and maintain large sections of code. Portable between operating systems. | Difficult to maintain. Not portable between operating systems. |
Hardware device driver | May not provide for direct hardware access. Awkward coding techniques may be required. | Easy to access hardware directly. |
Business application written for multiple platforms (different operating systems) | Portable between operating systems. | Not portable between operating systems. |
Embedded systems and computer games requiring direct hardware access | Difficult to maintain. Not portable between operating systems. | Easy to access hardware directly. Ideal for small, efficient programs. |
Virtual Machine Concept
The virtual machine concept is an effective way to explain how a computer’s hardware and software are related.
In terms of programming languages:
Each computer has a native machine language (language L0) that runs directly on its hardware.
A more human-friendly language is usually constructed above machine language, called Language L1.
💡Programs written in L1 can run in two different ways:
Interpretation - an L0 program interprets and executes L1 instructions one by one.
Translation - the L1 program is completely translated into an L0 program, which then runs directly on the computer hardware.
In terms of a hypothetical computer:
VM1 for Language L1: First, you create a virtual machine (VM1) that understands and executes commands written in language L1.
VM2 for Language L2: Then, you create a virtual machine (VM2) that understands and executes commands written in language L2.
The process can repeat until a virtual machine VMn can be designed that supports a powerful, easy-to-use language. This language is easy for humans to write and understand while being able to perform complex tasks.
The Java programming language is based on the virtual machine concept.
A program written in the Java language is translated by a Java compiler into Java Byte code.
Java Byte Code: A low-level language that is quickly executed at run time by Java virtual machine (JVM).
The JVM has been implemented on many different computer systems, making Java programs relatively system-independent.
Specific Machine Levels
Level 0 - Digital Logic
Digital logic system is the lowest level of computer architecture, consisting of logic gates that are used to construct the CPU and other components.
Level 1 - Microarchitecture
Microarchitecture is the way that a computer's processor interprets and executes instructions. It is a lower level of abstraction than the instruction set architecture (ISA), which is the set of instructions that the processor can understand. The microarchitecture is typically a proprietary secret of the processor's manufacturer, and it is not generally possible for average users to write microinstructions.
Level 2 - Instruction Set Architecture
Instruction set architecture (ISA) is the set of instructions that a computer processor can understand and execute. It is a level of abstraction that is above the physical hardware, but below the programming languages that humans use.
Level 3 - Operating System
An operating system (OS) is a software program that manages the computer's hardware and software resources and provides common services for computer programs. It is a level of abstraction that sits between the hardware and the user.
Level 4 - Assembly Language
Assembly language is a low-level programming language that is used to control the hardware of a computer. It uses instruction mnemonics such as ADD, SUB, and MOV that are easily translated to the instruction set architecture level. Assembly language programs are usually translated into machine language before they begin to execute.
Level 5 - High-level Language
High-level languages are programming languages that are designed to be easy for humans to read and write. They are usually compiled into assembly language (level 4) before they can be executed. Some popular high-level languages include C++, C#, Java, and Python.
Data Representation
Binary Numbers
Binary digits are 1 and 0, representing true and false respectively.
MSB - most significant bit: the bit with the greatest weight. In a binary number, the bits are numbered from right to left, and the leftmost bit is the MSB.
LSB - least significant bit: the bit with the least weight. In a binary number, the LSB is the rightmost bit.
Each digit (bit) is either 1 or 0
Each bit represents a power of 2.
e.g., the binary number for 143 is 10001111. It has 8 bits, each of which represents a power of 2.
Bit | Value | Power of 2 |
7 | 1 | 2^7 = 128 |
6 | 0 | 2^6 = 64 |
5 | 0 | 2^5 = 32 |
4 | 0 | 2^4 = 16 |
3 | 1 | 2^3 = 8 |
2 | 1 | 2^2 = 4 |
1 | 1 | 2^1 = 2 |
0 | 1 | 2^0 = 1 |
The value is 128 + 8 + 4 + 2 + 1 = 143.
Binary to Decimal Conversion
To convert an unsigned binary number into a decimal number, multiply each bit by its power of 2 and sum the results:
bit1 * 2^0 + bit2 * 2^1 + . . . (where bit1 is the rightmost bit)
e.g., convert 101 into decimal.
1 * 2^0 + 0 * 2^1 + 1 * 2^2 = 5
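The same right-to-left sum can be sketched in Python (the helper name is just for illustration; Python is not part of these notes):

```python
def binary_to_decimal(bits: str) -> int:
    """Sum each bit times its power of 2, starting from the rightmost bit."""
    total = 0
    for power, bit in enumerate(reversed(bits)):
        total += int(bit) * 2 ** power
    return total

print(binary_to_decimal("101"))       # 5
print(binary_to_decimal("10001111"))  # 143
```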
Decimal to Binary Conversion
To convert a decimal number into binary, divide the number by 2 repeatedly until the quotient is 0, then arrange the remainders from bottom to top to get the binary number.
e.g, to convert 143
Number | Quotient | Remainder |
143 | 71 | 1 |
71 | 35 | 1 |
35 | 17 | 1 |
17 | 8 | 1 |
8 | 4 | 0 |
4 | 2 | 0 |
2 | 1 | 0 |
1 | 0 | 1 |
So the binary number of 143 is 10001111.
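The divide-by-2 procedure above can be sketched in Python (an illustrative helper, not from the notes):

```python
def decimal_to_binary(n: int) -> str:
    """Repeatedly divide by 2; the remainders read bottom-to-top form the binary number."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, 2)
        remainders.append(str(r))
    # Reversing the list arranges the remainders from bottom to top.
    return "".join(reversed(remainders))

print(decimal_to_binary(143))  # 10001111
```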
Binary Addition
Add bit by bit from the rightmost position, carrying a 1 whenever a column sums to 2 or more. e.g., 0101 + 0011: 1+1 = 10 (write 0, carry 1); 0+1+1 = 10 (write 0, carry 1); 1+0+1 = 10 (write 0, carry 1); 0+0+1 = 1. The result is 1000 (5 + 3 = 8).
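Binary addition with carry propagation can be sketched in Python (an illustrative helper, not from the notes):

```python
def binary_add(a: str, b: str) -> str:
    """Add two binary strings bit by bit from the right, propagating the carry."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    result, carry = [], 0
    for x, y in zip(reversed(a), reversed(b)):
        total = int(x) + int(y) + carry
        result.append(str(total % 2))   # bit written in this column
        carry = total // 2              # carry into the next column
    if carry:
        result.append("1")
    return "".join(reversed(result))

print(binary_add("0101", "0011"))  # 1000  (5 + 3 = 8)
```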
Integer Storage Sizes
A word is a standard unit of data for a certain processor architecture. The fundamental data types of the Intel Architecture are bytes, words, doublewords, and quadwords.
Storage Size | Number of bits | Range of unsigned integers |
Byte | 8 bits | 0 to 255 (2^8 - 1) |
Word | 16 bits | 0 to 65,535 (2^16 - 1) |
Double word | 32 bits | 0 to 4,294,967,295 (2^32 - 1) |
Quadword | 64 bits | 0 to 18,446,744,073,709,551,615 (2^64 - 1) |
Hexadecimal Integers
All values in memory are stored in binary. Because long binary numbers are hard to read, we use hexadecimal representation.
Decimal | Binary | Hexadecimal |
0 | 0000 | 0 |
1 | 0001 | 1 |
2 | 0010 | 2 |
3 | 0011 | 3 |
4 | 0100 | 4 |
5 | 0101 | 5 |
6 | 0110 | 6 |
7 | 0111 | 7 |
8 | 1000 | 8 |
9 | 1001 | 9 |
10 | 1010 | A |
11 | 1011 | B |
12 | 1100 | C |
13 | 1101 | D |
14 | 1110 | E |
15 | 1111 | F |
Binary to Hexadecimal
To convert binary to hexadecimal, make groups of 4 bits starting from the rightmost bit, then convert each group using the table above. e.g., convert 0001 0110 1010 0111 1001 0100 (groups listed from the right):
Bit group | Hexadecimal |
0100 | 4 |
1001 | 9 |
0111 | 7 |
1010 | A |
0110 | 6 |
0001 | 1 |
So 16A794 is its hexadecimal representation.
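The 4-bit-grouping procedure can be sketched in Python (an illustrative helper, not from the notes):

```python
def binary_to_hex(bits: str) -> str:
    """Pad to a multiple of 4 bits, then map each 4-bit group to one hex digit."""
    width = (len(bits) + 3) // 4 * 4
    bits = bits.zfill(width)
    digits = "0123456789ABCDEF"
    return "".join(digits[int(bits[i:i + 4], 2)] for i in range(0, len(bits), 4))

print(binary_to_hex("000101101010011110010100"))  # 16A794
```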
Hexadecimal to Decimal
You can apply this formula: digit1 * 16^0 + digit2 * 16^1 + . . . (where digit1 is the rightmost hexadecimal digit).
e.g., to convert 3BA4:
4 * 16^0 + A * 16^1 + B * 16^2 + 3 * 16^3
4 * 1 + 10 * 16 + 11 * 256 + 3 * 4096
4 + 160 + 2816 + 12288 = 15268
15268 is the decimal conversion of 3BA4.
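The digit-times-power-of-16 formula can be sketched in Python (an illustrative helper, not from the notes):

```python
def hex_to_decimal(hex_str: str) -> int:
    """Sum each hex digit times its power of 16, starting from the rightmost digit."""
    digits = "0123456789ABCDEF"
    total = 0
    for power, ch in enumerate(reversed(hex_str.upper())):
        total += digits.index(ch) * 16 ** power
    return total

print(hex_to_decimal("3BA4"))  # 15268
```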
Decimal to Hexadecimal
Divide the decimal number by 16 repeatedly until the quotient is 0, then arrange the remainders from bottom to top to get the hexadecimal number.
e.g., convert 422
Number | Quotient | Remainder |
422 | 26 | 6 |
26 | 1 | 10 (A) |
1 | 0 | 1 |
1A6 is the hexadecimal equivalent of 422.
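The divide-by-16 procedure can be sketched in Python (an illustrative helper, not from the notes):

```python
def decimal_to_hex(n: int) -> str:
    """Repeatedly divide by 16; the remainders read bottom-to-top form the hex number."""
    digits = "0123456789ABCDEF"
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, r = divmod(n, 16)
        out.append(digits[r])   # remainders 10..15 become A..F
    return "".join(reversed(out))

print(decimal_to_hex(422))  # 1A6
```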
Hexadecimal Addition
Add digit by digit from the right; when a column sums to 16 or more, subtract 16 and carry 1. e.g., 3A + 2C: A + C = 16h (write 6, carry 1); 3 + 2 + 1 = 6. The result is 66h (58 + 44 = 102).
Hexadecimal Subtraction
Subtract digit by digit from the right; when a digit is too small, borrow 16 from the next column. e.g., 75h - 2Fh: 5 - F needs a borrow, so 15h - F = 6; then 7 - 1 - 2 = 4. The result is 46h (117 - 47 = 70).
Signed Integers
The most significant bit (MSB) represents the sign of a binary number: 0 is for positive and 1 is for negative. The table below shows 4-bit sign-magnitude representation; note that it has both +0 and -0, a drawback that two's complement (covered next) avoids.
Decimal | Signed Binary |
0 | 0000 |
1 | 0001 |
2 | 0010 |
3 | 0011 |
4 | 0100 |
5 | 0101 |
6 | 0110 |
7 | 0111 |
-0 | 1000 |
-1 | 1001 |
-2 | 1010 |
-3 | 1011 |
-4 | 1100 |
-5 | 1101 |
-6 | 1110 |
-7 | 1111 |
Two's Complement
Two's complement is a way of representing signed integers in binary.
To negate a number, invert all the bits (change 0s to 1s and 1s to 0s) and then add 1.
00000110 is +6.
1. Invert all bits: 11111001
2. Add 1 to 11111001
11111001 + 00000001 = 11111010
11111010 is -6.
Note 00000110 + 11111010 = 00000000 (the carry out of the MSB is discarded).
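The invert-and-add-1 rule for a fixed bit width can be sketched in Python (an illustrative helper, not from the notes):

```python
def twos_complement(bits: str) -> str:
    """Invert every bit, then add 1, keeping the original width (overflow is discarded)."""
    width = len(bits)
    inverted = "".join("1" if b == "0" else "0" for b in bits)
    return format((int(inverted, 2) + 1) % (1 << width), f"0{width}b")

print(twos_complement("00000110"))  # 11111010  (i.e., -6)
print(twos_complement("11111010"))  # 00000110  (back to +6)
```

Applying the function twice returns the original value, which matches the note that a number plus its two's complement is zero.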
Binary Subtraction using 2's Complement
To subtract two binary numbers using 2's complement, we do the following:
Find the 2's complement of the subtrahend.
Add the 2's complement of the subtrahend to the minuend.
If there is a carry-out, ignore it; the result is positive. If there is no carry-out, the result is negative: take its 2's complement and attach a minus sign.
Solve 1011 - 0101
2's complement of the subtrahend 0101 = 1011
1011 + 1011 = 10110; ignoring the carry gives 0110
Therefore, the difference is 0110 (11 - 5 = 6).
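The subtract-by-adding-the-complement steps can be sketched in Python (an illustrative helper, not from the notes; both operands are assumed to have the same width):

```python
def binary_subtract(minuend: str, subtrahend: str) -> str:
    """Subtract by adding the subtrahend's two's complement; the carry-out is dropped."""
    width = len(minuend)
    comp = (~int(subtrahend, 2) + 1) % (1 << width)   # two's complement of subtrahend
    total = (int(minuend, 2) + comp) % (1 << width)   # the modulo discards the carry
    return format(total, f"0{width}b")

print(binary_subtract("1011", "0101"))  # 0110  (11 - 5 = 6)
```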
Learn how to do the following
Forming the two's complement of a hexadecimal integer
Convert the hexadecimal number to binary, take the 2's complement of that number, and convert it back to hexadecimal.
For example, the two's complement of the hexadecimal integer AB is 55:
AB = 10101011 (binary)
~AB = 01010100 (binary)
1 + ~AB = 01010101 (binary)
Converting signed binary to decimal
If the most significant bit (MSB) is 0, the number is positive. Otherwise, the number is negative.
If the number is positive, convert the binary number to decimal as usual.
If the number is negative, take the two's complement of the binary number (invert the bits and add 1), convert the result to decimal, and attach a minus sign.
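The MSB test can be sketched in Python (an illustrative helper, not from the notes):

```python
def signed_binary_to_decimal(bits: str) -> int:
    """If the MSB is 0, convert directly; otherwise the value is negative."""
    value = int(bits, 2)
    if bits[0] == "1":
        value -= 1 << len(bits)   # equivalent to negating the two's complement
    return value

print(signed_binary_to_decimal("01110001"))  # 113
print(signed_binary_to_decimal("11111010"))  # -6
```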
Converting signed decimal to binary
If the number is positive, convert it into binary as usual.
In the case of a negative number, first ignore the sign and convert the absolute value into binary (using enough bits that the MSB is 0), then take the 2's complement of that binary number.
e.g., Convert -10 to binary (8 bits).
10 is 00001010 in binary
2's complement of 00001010 is 11110110
11110110 is -10 in signed binary.
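Both cases collapse into one line in Python, because reducing modulo 2^width is exactly the two's-complement encoding (an illustrative helper, not from the notes):

```python
def signed_decimal_to_binary(n: int, width: int = 8) -> str:
    """Positive: convert as usual. Negative: two's complement of |n| in the given width."""
    return format(n % (1 << width), f"0{width}b")

print(signed_decimal_to_binary(10))   # 00001010
print(signed_decimal_to_binary(-10))  # 11110110
```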
Ranges of Signed Integers
Data Type | Minimum and Maximum | Power of 2 |
signed byte | -128 to 127 | -2^7 to 2^7-1 |
signed word | -32,768 to 32,767 | -2^15 to 2^15-1 |
signed double word | -2,147,483,648 to 2,147,483,647 | -2^31 to 2^31-1 |
signed quadword | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 | -2^63 to 2^63-1 |
Character Storage
- The character set is a collection of characters that can be used to represent text. The most common character set is ASCII (American Standard Code for Information Interchange), which uses 7 bits to represent each character. This means that there are a total of 128 characters in ASCII, including letters, numbers, punctuation marks, and control characters.
- Standard ASCII is the first 128 characters (codes 0 to 127) of the ASCII character set. It includes all of the characters that are necessary for basic text processing.
- Extended ASCII is the range of characters from 128 to 255. It includes characters that are not used in basic text processing, such as graphics symbols and Greek characters.
- A null-terminated string is a string of characters that is terminated by a special character called a null byte (value 0).
- An ASCII table is a reference table that maps each ASCII character to its corresponding numeric value.
- 💡String "ABC123" is stored as a sequence of 6 ASCII characters: 41h, 42h, 43h, 31h, 32h, and 33h.
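The byte values above can be verified in Python with `ord` (not part of the notes):

```python
# Show how the string "ABC123" is stored as a sequence of ASCII byte values.
text = "ABC123"
codes = [f"{ord(ch):02X}h" for ch in text]
print(codes)  # ['41h', '42h', '43h', '31h', '32h', '33h']
```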
Numeric Data Representation
Numeric data can be represented in two ways: pure binary and ASCII digit string.
Pure binary is a number stored in its raw binary format, which computers can calculate with directly.
An ASCII digit string is a human-readable string of ASCII characters that represents a number.
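The contrast between the two representations can be sketched in Python (an illustrative example, not from the notes):

```python
# The same number 255 stored two ways:
value = 255
pure_binary = value.to_bytes(1, "big")      # one raw byte: FFh
ascii_digits = str(value).encode("ascii")   # three ASCII bytes: 32h 35h 35h
print(len(pure_binary), len(ascii_digits))            # 1 3
print([f"{b:02X}h" for b in ascii_digits])            # ['32h', '35h', '35h']
```

Note the trade-off: the pure binary form is compact and directly usable in arithmetic, while the ASCII form is what you see in text files and console output.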
Boolean Algebra
Boolean algebra deals with logical operations on binary variables. It was developed by George Boole in the mid-19th century and is used in many areas of computer science, including digital logic and circuit design.
The basic operations of Boolean algebra are AND, OR, and NOT.
A truth table shows all the inputs and outputs of a Boolean function.
NOT Operation
Inverts a Boolean Value.
Input A | NOT A |
0 | 1 |
1 | 0 |
AND Operation
Gives true if both values are true and false otherwise.
Input A | Input B | Output |
0 | 0 | 0 |
0 | 1 | 0 |
1 | 0 | 0 |
1 | 1 | 1 |
OR Operation
Gives true if at least one value is true and false otherwise.
Input A | Input B | Output |
0 | 0 | 0 |
0 | 1 | 1 |
1 | 0 | 1 |
1 | 1 | 1 |
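The three truth tables map directly onto Python's bitwise operators on 0/1 values (an illustrative sketch, not from the notes):

```python
# Build the NOT, AND, and OR truth tables shown above.
not_table = {a: 1 - a for a in (0, 1)}                       # NOT inverts the bit
and_table = {(a, b): a & b for a in (0, 1) for b in (0, 1)}  # 1 only if both are 1
or_table = {(a, b): a | b for a in (0, 1) for b in (0, 1)}   # 1 if at least one is 1

print(not_table)
print(and_table)
print(or_table)
```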