Project Sileby

Project Sileby implements a new memory management API to share page tables between processes, reducing their per-process overheads.


int mshare(void *addr, size_t length, long flags);

mshare() creates a shared range of address space. The specified range may already contain mappings. Mappings within the range behave as if they were shared between threads, so a write to a MAP_PRIVATE mapping will create a page which is shared between all the sharers. Anonymous mappings will be shared and not COWed.

It returns a file descriptor which refers to this area of the creating process's address space. That file descriptor can then be shared with other processes, either by inheritance across fork() and exec() or by sending it in an SCM_RIGHTS message over a unix(7) socket. The receiving process can pass it to mmap().
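As a minimal sketch of the SCM_RIGHTS path, the helpers below pass one file descriptor over a connected AF_UNIX socket using the standard sendmsg()/recvmsg() ancillary-data interface. The names send_fd() and recv_fd() are hypothetical, and nothing here is specific to mshare(); any fd can be passed this way.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_ctl {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Send fd over a connected AF_UNIX socket. Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd)
{
    char byte = 0;  /* one byte of ordinary data so the message is not empty */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&msg, 0, sizeof(msg));
    memset(&ctl, 0, sizeof(ctl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one fd from a connected AF_UNIX socket. Returns the fd, or -1. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;

    /* Accept only a single SCM_RIGHTS header carrying exactly one fd. */
    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The received descriptor is a new fd number in the receiving process that refers to the same open file description as the sender's fd.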

Calls to mmap() will not, by default, place new mappings in the shared region. A mapping is placed in the shared region only when an address within that region is specified as the first argument to mmap(), whether or not MAP_FIXED is supplied.

Reads and writes to this file descriptor are not currently defined. Only mmap() and close() have defined meanings. Other operations may become useful.
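The intended call flow might look roughly like the sketch below. It rests on several assumptions that the text above does not settle: no kernel exports mshare(), so a stand-in that always fails with ENOSYS is defined here; the flags value 0, the PROT_READ | PROT_WRITE protection, the MAP_SHARED flag and the zero offset passed to mmap() are all guesses; and share_range() and attach_range() are hypothetical helper names.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Stand-in for the proposed system call. It is NOT a real implementation:
 * it always fails with ENOSYS so that the sketch compiles and runs. */
static int mshare(void *addr, size_t length, long flags)
{
    (void)addr;
    (void)length;
    (void)flags;
    errno = ENOSYS;
    return -1;
}

/* Creator side: turn [addr, addr + length) into a shared range.
 * Returns the fd describing the range, or -1 with errno set. */
int share_range(void *addr, size_t length)
{
    return mshare(addr, length, 0);  /* flags == 0 is an assumption */
}

/* Receiver side: attach a range described by an fd that was inherited
 * or received in an SCM_RIGHTS message. Protection, mapping flags and
 * offset are assumptions. Returns MAP_FAILED on error. */
void *attach_range(int fd, void *addr, size_t length)
{
    return mmap(addr, length, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}
```

With the stand-in in place, share_range() always returns -1 with errno set to ENOSYS; on a kernel implementing the proposal, the returned fd would be handed to the receiver, which would call attach_range() and later close() the fd.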

flags values:

(should we define our own namespace for these flags instead?) (what other flags might be useful?)

It is not possible to revoke access to the shared memory region. Recipients of the fd may unmap the entire area with a call to munmap() that matches the mmap() call which attached the shared region. An attempt to munmap() a subset of the attached region will be interpreted as an attempt to modify the shared region, which may or may not succeed depending on whether the region is O_RDONLY (see above).

Need to define what happens with overlapping calls to mshare(). Just error? Truncating the previous mapping seems hard to implement. Creating a sub-mapping might be useful.

Need to define the acceptable granularities of addr & length. Nothing smaller than a PMD (2MB) makes any sense. It does make sense to share on a 4MB boundary; we'd have two PMD entries to copy. But once we get to a 1GB boundary, it doesn't really make sense to allow 1GB + 2MB. For ease of implementation, allow n * 2^(9p) pages for n < 2^9 and p >= 1 (on x86).
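To make the proposed rule concrete, here is a small sketch of a length check. It assumes that the rule counts 4KB base pages, that each page table level has 2^9 entries (as on x86-64), and that n must be at least 1; the name mshare_size_ok() is hypothetical. Alignment of addr is left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT  12  /* assumption: 4KB base pages */
#define LEVEL_SHIFT 9   /* 2^9 entries per page table level on x86-64 */

/* True if bytes == n * 2^(9p) base pages with 1 <= n < 2^9 and p >= 1. */
bool mshare_size_ok(uint64_t bytes)
{
    uint64_t pages;
    unsigned int p = 0;

    /* Reject zero and anything that is not a whole number of pages. */
    if (bytes == 0 || (bytes & ((1ULL << PAGE_SHIFT) - 1)))
        return false;

    pages = bytes >> PAGE_SHIFT;

    /* Strip one page table level for every factor of 2^9. */
    while ((pages & ((1ULL << LEVEL_SHIFT) - 1)) == 0) {
        pages >>= LEVEL_SHIFT;
        p++;
    }

    /* What remains is n; it must fit in a single table at level p. */
    return p >= 1 && pages < (1ULL << LEVEL_SHIFT);
}
```

Under these assumptions 2MB, 4MB, 1GB and 3GB are accepted, while 4KB and 1GB + 2MB are rejected: 1GB + 2MB is 513 * 2MB, and 513 entries do not fit in a single table.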


This is a suggestion; other implementations are surely possible.

Calling mshare() creates a new mm_struct. The existing VMAs within the range are reparented to this mm_struct and a new VMA occupying the entire range is created. This mm_struct has its own page table tree, starting at mm->pgd. Depending on the granularity of the sharing, it will have p4ds, puds or pmds, which should be copied to the page tables of the receiving task on a call to mmap().

Why Sileby?

It's just a place name. It has no significance.