mirror of
https://github.com/torvalds/linux.git
synced 2024-11-24 13:11:40 +00:00
[PATCH] page migration: Update documentation
Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This commit is contained in:
parent
04e62a29bf
commit
8d3c138b77
@ -62,15 +62,15 @@ A. In kernel use of migrate_pages()
|
||||
It also prevents the swapper or other scans to encounter
|
||||
the page.
|
||||
|
||||
2. Generate a list of newly allocates page. These pages will contain the
|
||||
2. Generate a list of newly allocates pages. These pages will contain the
|
||||
contents of the pages from the first list after page migration is
|
||||
complete.
|
||||
|
||||
3. The migrate_pages() function is called which attempts
|
||||
to do the migration. It returns the moved pages in the
|
||||
list specified as the third parameter and the failed
|
||||
migrations in the fourth parameter. The first parameter
|
||||
will contain the pages that could still be retried.
|
||||
migrations in the fourth parameter. When the function
|
||||
returns the first list will contain the pages that could still be retried.
|
||||
|
||||
4. The leftover pages of various types are returned
|
||||
to the LRU using putback_to_lru_pages() or otherwise
|
||||
@ -93,83 +93,58 @@ Steps:
|
||||
|
||||
2. Insure that writeback is complete.
|
||||
|
||||
3. Make sure that the page has assigned swap cache entry if
|
||||
it is an anonyous page. The swap cache reference is necessary
|
||||
to preserve the information contain in the page table maps while
|
||||
page migration occurs.
|
||||
|
||||
4. Prep the new page that we want to move to. It is locked
|
||||
3. Prep the new page that we want to move to. It is locked
|
||||
and set to not being uptodate so that all accesses to the new
|
||||
page immediately lock while the move is in progress.
|
||||
|
||||
5. All the page table references to the page are either dropped (file
|
||||
backed pages) or converted to swap references (anonymous pages).
|
||||
This should decrease the reference count.
|
||||
4. The new page is prepped with some settings from the old page so that
|
||||
accesses to the new page will discover a page with the correct settings.
|
||||
|
||||
5. All the page table references to the page are converted
|
||||
to migration entries or dropped (nonlinear vmas).
|
||||
This decrease the mapcount of a page. If the resulting
|
||||
mapcount is not zero then we do not migrate the page.
|
||||
All user space processes that attempt to access the page
|
||||
will now wait on the page lock.
|
||||
|
||||
6. The radix tree lock is taken. This will cause all processes trying
|
||||
to reestablish a pte to block on the radix tree spinlock.
|
||||
to access the page via the mapping to block on the radix tree spinlock.
|
||||
|
||||
7. The refcount of the page is examined and we back out if references remain
|
||||
otherwise we know that we are the only one referencing this page.
|
||||
|
||||
8. The radix tree is checked and if it does not contain the pointer to this
|
||||
page then we back out because someone else modified the mapping first.
|
||||
page then we back out because someone else modified the radix tree.
|
||||
|
||||
9. The mapping is checked. If the mapping is gone then a truncate action may
|
||||
be in progress and we back out.
|
||||
9. The radix tree is changed to point to the new page.
|
||||
|
||||
10. The new page is prepped with some settings from the old page so that
|
||||
accesses to the new page will be discovered to have the correct settings.
|
||||
10. The reference count of the old page is dropped because the radix tree
|
||||
reference is gone. A reference to the new page is established because
|
||||
the new page is referenced to by the radix tree.
|
||||
|
||||
11. The radix tree is changed to point to the new page.
|
||||
11. The radix tree lock is dropped. With that lookups in the mapping
|
||||
become possible again. Processes will move from spinning on the tree_lock
|
||||
to sleeping on the locked new page.
|
||||
|
||||
12. The reference count of the old page is dropped because the radix tree
|
||||
reference is gone.
|
||||
12. The page contents are copied to the new page.
|
||||
|
||||
13. The radix tree lock is dropped. With that lookups become possible again
|
||||
and other processes will move from spinning on the tree lock to sleeping on
|
||||
the locked new page.
|
||||
13. The remaining page flags are copied to the new page.
|
||||
|
||||
14. The page contents are copied to the new page.
|
||||
14. The old page flags are cleared to indicate that the page does
|
||||
not provide any information anymore.
|
||||
|
||||
15. The remaining page flags are copied to the new page.
|
||||
15. Queued up writeback on the new page is triggered.
|
||||
|
||||
16. The old page flags are cleared to indicate that the page does
|
||||
not use any information anymore.
|
||||
|
||||
17. Queued up writeback on the new page is triggered.
|
||||
|
||||
18. If swap pte's were generated for the page then replace them with real
|
||||
ptes. This will reenable access for processes not blocked by the page lock.
|
||||
16. If migration entries were page then replace them with real ptes. Doing
|
||||
so will enable access for user space processes not already waiting for
|
||||
the page lock.
|
||||
|
||||
19. The page locks are dropped from the old and new page.
|
||||
Processes waiting on the page lock can continue.
|
||||
Processes waiting on the page lock will redo their page faults
|
||||
and will reach the new page.
|
||||
|
||||
20. The new page is moved to the LRU and can be scanned by the swapper
|
||||
etc again.
|
||||
|
||||
TODO list
|
||||
---------
|
||||
|
||||
- Page migration requires the use of swap handles to preserve the
|
||||
information of the anonymous page table entries. This means that swap
|
||||
space is reserved but never used. The maximum number of swap handles used
|
||||
is determined by CHUNK_SIZE (see mm/mempolicy.c) per ongoing migration.
|
||||
Reservation of pages could be avoided by having a special type of swap
|
||||
handle that does not require swap space and that would only track the page
|
||||
references. Something like that was proposed by Marcelo Tosatti in the
|
||||
past (search for migration cache on lkml or linux-mm@kvack.org).
|
||||
|
||||
- Page migration unmaps ptes for file backed pages and requires page
|
||||
faults to reestablish these ptes. This could be optimized by somehow
|
||||
recording the references before migration and then reestablish them later.
|
||||
However, there are several locking challenges that have to be overcome
|
||||
before this is possible.
|
||||
|
||||
- Page migration generates read ptes for anonymous pages. Dirty page
|
||||
faults are required to make the pages writable again. It may be possible
|
||||
to generate a pte marked dirty if it is known that the page is dirty and
|
||||
that this process has the only reference to that page.
|
||||
|
||||
Christoph Lameter, March 8, 2006.
|
||||
Christoph Lameter, May 8, 2006.
|
||||
|
||||
|
Loading…
Reference in New Issue
Block a user