As of this revision. We are able to run CPU2000int with refernece input
successfully! These benchmarks are linked against ptmalloc3 (not part of
this project) and the libadaptor.so.
Running CPU2000int is just a way to stress test this work.
and page initialization via zero-filling.
Suppose a allocated block B1, whose virtual address is [ad1, ad2], is going
to deallocated. One Linux, it seems the only way to deallocate the pages
associated with the block is to call madvise(..MADV_DONTNEED...) (
hereinafter, call it madvise() for short unless otherwise noted).
madvise() *immediately* remove all the pages involved, and invalidate the
related TLB entries. So, if later on we allocate a block overlapping with
B1 in virtual address; accessing to the overlapping space will result in
re-establishing TLB entries, and zero-fill-pages, which is bit expensive.
This cost can be reduced by keeping few blocks in memory, and re-use the
memory resident pages over and over again. This is the rationale behind the
"block cache". The "cache" here may be a misnomer; it dosen't cache any data,
it just provide a way to keep small sum of idle pages in memory to avoid
cost of TLB manipulation and page initialization via zero-filling.