What are pointers in a low-level language like C?
Problem
I was trying to understand pointers by watching YouTube videos. However, I could not understand:
- How do they work?
- Why do we use them?
- When do we use them?
Solution
In general, a pointer is a variable which holds the address of another variable. You can use https://godbolt.org/ to see what a pointer looks like in assembly. For example,

    void func(){
        int* pointer;
        int a = 3;
        pointer = &a;
    }

compiles to

    pushq %rbp
    movq %rsp, %rbp
    movl $3, -12(%rbp)
    leaq -12(%rbp), %rax
    movq %rax, -8(%rbp)
    nop
    popq %rbp
    ret

This is a full function so there is some useless stuff. The useful stuff is the following:

    movl $3, -12(%rbp)
    leaq -12(%rbp), %rax
    movq %rax, -8(%rbp)

The first line puts 3 in the variable a, which is allocated on the stack. The second line uses the lea instruction to put the address of the variable a in RAX. The third line puts RAX in the pointer called pointer.
Now for the stuff I called useless earlier. When you enter a function in C++/C, it will:
- Push RBP on the stack;
- Put RSP in RBP;
- Decrement RSP by the size of the stack space allocated for the function.
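After those 3 steps, the variables local to the function are accessed using negative offsets relative to RBP. In this case, the step that subtracts from RSP has been omitted: func calls no other function, so the compiler can use the area just below RSP (the "red zone" of the x86-64 System V ABI) without reserving it.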
This is how pointers work at the low level. They are allocated on the stack if they are local to the function. If they are global or static instead, they are allocated in the data segment of the executable. On x86-64, this segment is accessed by default using an offset relative to RIP (the instruction pointer).
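To make the stack versus data segment distinction concrete, here is a minimal sketch in C. The names global_value, set_to and read_through_pointer are made up for illustration and are not part of the original answer; the addresses printed will differ from run to run and from machine to machine.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical global: it lives in the data segment, not on the stack. */
int global_value = 7;

/* Writes through a pointer, so the caller's variable is modified. */
void set_to(int *target, int value) {
    assert(target != NULL);
    *target = value;          /* store value at the address held in target */
}

/* Mirrors func() from the answer, but returns what the pointer reads back. */
int read_through_pointer(void) {
    int a = 3;                /* local variable, allocated on the stack */
    int *pointer = &a;        /* pointer now holds the address of a */

    /* Print both addresses; they normally fall in very different ranges. */
    printf("a is at %p, global_value is at %p\n",
           (void *)pointer, (void *)&global_value);

    return *pointer;          /* dereference: read the int stored at that address */
}
```

Calling read_through_pointer() returns 3, and set_to(&global_value, 42) changes the global through its address, which is the same load and store pattern as the movl/leaq/movq sequence shown earlier.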
Note that all of this leaves out smart pointers and similar abstractions, which are best practice nowadays. A useful program (excluding libraries) is probably written in C++ and uses some standard or custom smart pointers. Otherwise, the developer is likely wasting time and ending up with a buggy or crashing program. As for libraries, most of the time the allocation is left to the user of the library, who calls allocation functions the library provides.
Why do we use them?
When do we use them?
I think those 2 questions ask the same thing. In general, pointers are used because they allow you to allocate memory dynamically. You cannot allocate memory dynamically in C++/C without receiving a pointer to the allocated memory. This is due to the way the underlying operating system works: the user-mode program (typically through its allocator, such as malloc) makes a system call asking for memory, and the OS returns the address of the beginning of the region it allocated.
This allows for dynamic allocation, where you don't know the size of everything at compile time.
The allocated space also survives leaving the local scope of the function where it was allocated. This is especially useful if you are working with several threads, which is often the case in real-world applications.
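As a rough sketch of both points (a size only known at run time, and memory that outlives the function that allocated it), consider the following C code. The helpers make_sequence and sum_sequence are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocates count ints on the heap and fills them
   with 0, 1, ..., count-1. Returns NULL if the allocation fails.
   The caller owns the memory and must release it with free(). */
int *make_sequence(size_t count) {
    int *values = malloc(count * sizeof *values);
    if (values == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < count; i++) {
        values[i] = (int)i;
    }
    return values;            /* the memory stays valid after we return */
}

/* Sums count ints starting at the address held in values. */
long sum_sequence(const int *values, size_t count) {
    assert(values != NULL || count == 0);
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}
```

A caller would write something like int *s = make_sequence(n); with n read at run time, use s for as long as needed (possibly from another function or thread), and finally call free(s). A local array declared inside make_sequence could not be returned this way, because its stack space is reclaimed when the function exits.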
I would tend to say that statically allocated pointers are used mostly with arrays. In most expressions, the name of an array in C++/C is converted to a pointer to the first element of the array. Otherwise, pointers are used for dynamic allocation, because it cannot be done differently due to the way the underlying OS works.
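The relationship between arrays and pointers can be checked with a small C sketch. The function names below are invented for this example; the only language rules relied on are that numbers[i] means *(numbers + i) and that sizeof applied to an array yields the size of the whole array rather than of a pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Sums count ints; the parameter is a plain pointer to the first element. */
int sum(const int *values, size_t count) {
    assert(values != NULL || count == 0);
    int total = 0;
    for (size_t i = 0; i < count; i++) {
        total += *(values + i);   /* same as values[i] */
    }
    return total;
}

/* Passes a local array by name: it is converted to &numbers[0]. */
int sum_of_local_array(void) {
    int numbers[4] = {10, 20, 30, 40};
    /* sizeof still sees the whole array here, so this yields 4 elements. */
    size_t count = sizeof numbers / sizeof numbers[0];
    return sum(numbers, count);
}

/* Returns 1 if indexing and pointer arithmetic agree for every element. */
int indexing_matches_pointer_arithmetic(void) {
    int numbers[4] = {10, 20, 30, 40};
    int *first = numbers;         /* array name converted to a pointer */
    for (size_t i = 0; i < 4; i++) {
        if (numbers[i] != *(first + i)) {
            return 0;
        }
    }
    return first == &numbers[0];
}
```

Inside sum, the array's length is no longer available from the pointer alone, which is why the element count is passed as a separate argument.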
How do they work?
This is hard to explain in one short answer. The statically allocated pointer is a variable allocated on the stack. It contains the address of another variable or of an array element. It can even contain an absolute address. For example, one can write:

    unsigned int* pointer = (unsigned int*)0x8000;

The pointer called pointer now points to address 0x8000. After executing *pointer = 3;, address 0x8000 contains 3. This should of course be avoided in a user-mode program, but it can be useful when writing an OS from scratch. It is undefined behaviour in a user-mode program because no one knows whether address 0x8000 in the virtual address space has been mapped for your program. Accessing this address could do anything, including triggering a page fault that kills your program.
Context
StackExchange Computer Science Q#145305, answer score: 6