My Memory Allocator
my own memory allocator in C
It seems like every C programmer builds their own memory allocator at some point. It’s almost like a rite of passage. I figured it was my turn. I had the opportunity of implementing one while I was tutoring for CSE29. One of the PA’s we had to guide students (PA3 was the malloc simulator). Since I had never taken the course before, one of my duties was to get familiar with the PA and if I had time, implement it myself. I probably should’ve but felt strange just filling out #TODO’s. I felt like I was cheating. If I was going to build my own memory allocator, it should be 100% my own. I think I’m just stubborn. Anyways I finally built my own, it’s not as elaborate but it’s mine. You can check it out on my Github :D
Why do we need Malloc?
C has a pretty rigid system for handling memory. To start of, lets focus on the different segments that make up a program’s virtual address space.
- Text Segment (also known as the code segment)
- a. Contains the instructions for executing the program.
- b. Read only, so nothing can change the instructions of the program while it’s running.
- c. Fixed size and read directly from the executable file on disk.
- d. Stores any literals from the program.
- Static memory
- a. Uninitialized (Also known as the BSS segment)
- i. Stores uninitialized global and static variables (i.e., holds values that are not known at compile time).
- ii. Fixed size.
- iii. Read and Write accessible.
- b. Initialized (also known as the data segment)
- i. Holds initialized global and static variables (i.e., holds values that are known at compile time).
- ii. Read directly from the executable.
- iii. Fixed size.
- iv. Read and Write accessible.
- a. Uninitialized (Also known as the BSS segment)
- Dynamic memory
- a. Heap
- i. Used for dynamic memory allocation (i.e.,
mallocandfree). - ii. Grows and shrinks as needed during runtime, for sizes that can’t be determined during compile time.
- iii. Grows upwards (towards higher memory addresses).
- i. Used for dynamic memory allocation (i.e.,
- b. Stack
- i. Used for function call management, local variables, and arguments.
- ii. Grows downwards (towards lower memory addresses).
- iii. Automatically managed by the system (push and pop operations).
- iv. Has a limited size, which can lead to stack overflow if exceeded.
- a. Heap
We need malloc for variables with unknown compile-time sizes or those requiring a lifecycle that extends beyond the scope of a single function. It’s for the situations where we don’t know “how much” and “how long” at compile time.
mmap()
To get memory for my heap, I decided to use mmap(). mmap() is a system call that creates a memory mapping in the virtual address space of the process. I saw different implementations of malloc using sbrk() and mmap(). I chose mmap() because it allows for more flexible memory management and can be used to allocate larger blocks of memory (which I needed). It also provides better performance for large allocations (which is nice).
I also saw other implementations that called mmap() every malloc() call, but I wanted to minimize the number of system calls for performance reasons, so I decided to call mmap() once to get a large block of memory and then manage that block for subsequent malloc() and free() calls.
My implementation!
Functions:
-
my_malloc: Asks for a block of memory and returns a pointer to it. -
my_calloc: Allocates memory for an array of elements and strictly initializes all of its bits to zero before handing it over. -
my_free: Takes a pointer back when you’re done and marks that specific segment of memory as available for reuse. -
my_realloc: Lets you resize a block of memory you already allocated, keeping the original data intact up to the new size.
How I organized memory:
-
Heap:A single, massive 10 MB region of memory I request from the OS exactly once using mmap. -
Block: A smaller piece carved out of that giant heap that I hand out whenever malloc or calloc is called.
Both of these levels need metadata to track what is currently happening in memory. I put a small header at the very beginning of the giant heap, and another header at the beginning of each individual block. After a single malloc call, the 10 MB memory map is essentially split into two pieces: a “Used” block containing the requested bytes, and a massive “Free” block containing the rest.
Here are the C structures I used to manage the heap and blocks plus the functions I built:
typedef struct s_block{
size_t size;
t_bool is_free;
struct s_block *next;
struct s_block *prev;
}t_block;
typedef struct s_heap{
size_t total_size;
size_t free_size;
size_t block_count;
}t_heap;
void *my_malloc(size_t size);
void my_free(void *ptr);
void *my_calloc(size_t num, size_t size);
void *my_realloc(void *ptr, size_t new_size);
By giving each block next and prev pointers, I effectively created a doubly linked list directly inside the memory arena. This lets the allocator walk through the giant heap to find free spots, or to easily peek at the neighbors of a block I want to free so they can be merged back together to prevent fragmentation.