Garbage Collection
Garbage collection (GC) runs in a single background thread that coalesces partially invalidated Super Blocks into a new Super Block so the partially invalid ones can be released. Releasing them makes them available for future writes by the QoS Domains in the same Virtual Device. GC uses the Super Block layer to learn which blocks are available as sources and to allocate the destination Super Block. The actual data movement is performed by the SEF Unit, but the decisions of when and what to move are made by GC. The priority used for copy I/O is selected outside of garbage collection. At the start of each source-block copy, a notification is sent that allows the block layer to modify the queue-weight overrides for the copy. The garbage collection thread also performs related housekeeping: wear leveling, running patrols, and releasing blocks marked for maintenance.
Garbage collection is started by calling GCTrigger(), which signals the GC thread to attempt a garbage collection. The thread loops, garbage collecting the domain until the number of allocated ADUs is below the GC trigger value. The GC trigger value is near the end of the over-provisioning set by SEFBlockConfig. When the SEF Library has indicated a patrol for the device, a call to SEFCheckSuperBlock() is issued for each Super Block returned by SEFGetCheckList(). With the collection cycle complete, the GC thread waits for the next trigger.
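The loop condition described above can be sketched in a few lines. This is a minimal illustration and not the SDK's code: DomainUsage, CollectOnceFn, gc_until_below_trigger() and release_fixed() are hypothetical names, and the early exit when a pass releases nothing is an added assumption to keep the sketch from spinning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-domain accounting; not the SDK's actual structure. */
typedef struct {
    uint64_t allocatedAdus;  /* ADUs currently allocated in the QoS Domain */
    uint64_t gcTriggerAdus;  /* GC trigger value derived from over-provisioning */
} DomainUsage;

/* Hypothetical callback: runs one collection pass and returns the number
 * of ADUs it released (0 when nothing could be collected). */
typedef uint64_t (*CollectOnceFn)(void *ctx);

/* Collect until the allocated ADU count is below the trigger value.
 * Returns the number of passes that released at least one ADU. */
static unsigned gc_until_below_trigger(DomainUsage *u, CollectOnceFn collect,
                                       void *ctx)
{
    assert(u != NULL && collect != NULL);
    unsigned passes = 0;
    while (u->allocatedAdus >= u->gcTriggerAdus) {
        uint64_t freed = collect(ctx);
        if (freed == 0)
            break;  /* assumption: give up when a pass makes no progress */
        u->allocatedAdus -= (freed < u->allocatedAdus) ? freed
                                                       : u->allocatedAdus;
        passes++;
    }
    return passes;
}

/* Illustrative callback that releases a fixed number of ADUs per pass. */
static uint64_t release_fixed(void *ctx)
{
    return *(const uint64_t *)ctx;
}
```

In the real thread, the work of each pass is the placement ID selection and copy described in the rest of this section; the callback here only stands in for it.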
Garbage collection is performed by gcQoSDomain(). It starts by enumerating all the collectable blocks allocated to the QoS Domain using the function SSBEnumBlocks(). Enumerated blocks are sorted by placement ID and ranked by gcBuildListOfCandidates(). The rank of a candidate source Super Block is simply the number of valid ADUs it has, so a lower rank means less data to copy. The rank is hard-coded to zero for Super Blocks marked for maintenance, which gives them the highest priority within a placement ID.
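The ordering described above can be sketched with a qsort() comparator. The Candidate structure and function names below are hypothetical and only mirror the description; the tie-break on block ID is an assumption added to make the order deterministic, and the layout used by gcBuildListOfCandidates() may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical candidate record; field names are illustrative only. */
typedef struct {
    uint32_t placementId;
    uint32_t blockId;
    uint32_t validAdus;    /* valid ADUs still held by the Super Block */
    bool     maintenance;  /* block is marked for maintenance */
} Candidate;

/* Rank is the number of valid ADUs, forced to zero for maintenance blocks. */
static uint32_t candidate_rank(const Candidate *c)
{
    return c->maintenance ? 0u : c->validAdus;
}

/* Order by placement ID, then by rank (lowest first), then by block ID. */
static int compare_candidates(const void *a, const void *b)
{
    const Candidate *x = a;
    const Candidate *y = b;
    if (x->placementId != y->placementId)
        return (x->placementId < y->placementId) ? -1 : 1;
    uint32_t rx = candidate_rank(x);
    uint32_t ry = candidate_rank(y);
    if (rx != ry)
        return (rx < ry) ? -1 : 1;
    if (x->blockId != y->blockId)  /* assumption: deterministic tie-break */
        return (x->blockId < y->blockId) ? -1 : 1;
    return 0;
}

static void sort_candidates(Candidate *list, size_t count)
{
    assert(list != NULL || count == 0);
    qsort(list, count, sizeof list[0], compare_candidates);
}
```

After sorting, the first entry of each placement ID group is the cheapest block to collect, and any maintenance block in the group sorts ahead of blocks that still hold valid data.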
Next, the placement ID to collect is selected by gcSelectPlacementIdToCollect(). For a placement ID to be collected, it must have a Super Block’s worth of invalid ADUs. It does not need to have a Super Block’s worth of valid ADUs; when it has fewer, the destination Super Block is closed before it is full. Placement IDs with Super Blocks marked for maintenance are processed first. Otherwise, the placement ID with the highest ratio of invalid ADUs to allocated blocks is selected.
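The selection rule can be sketched as below. PidSummary and select_placement_id() are hypothetical names, not the SDK's. Two points are assumptions that the text does not settle: that a placement ID with a maintenance block is picked even if it lacks a Super Block's worth of invalid ADUs, and that the first such placement ID wins when there are several.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-placement-ID totals gathered during enumeration. */
typedef struct {
    uint32_t invalidAdus;      /* invalid ADUs across the PID's Super Blocks */
    uint32_t allocatedBlocks;  /* Super Blocks allocated to the PID */
    bool     hasMaintenance;   /* at least one block marked for maintenance */
} PidSummary;

/* True if a has a higher invalid-ADU-to-block ratio than b.
 * Cross-multiplication avoids floating point and division by zero. */
static bool ratio_greater(const PidSummary *a, const PidSummary *b)
{
    return (uint64_t)a->invalidAdus * b->allocatedBlocks >
           (uint64_t)b->invalidAdus * a->allocatedBlocks;
}

/* Returns the index of the placement ID to collect, or -1 if none qualifies. */
static int select_placement_id(const PidSummary *pids, size_t count,
                               uint32_t adusPerSuperBlock)
{
    assert(adusPerSuperBlock > 0);
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        const PidSummary *p = &pids[i];
        if (p->hasMaintenance)
            return (int)i;  /* assumption: maintenance bypasses the threshold */
        if (p->allocatedBlocks == 0 || p->invalidAdus < adusPerSuperBlock)
            continue;       /* needs a Super Block's worth of invalid ADUs */
        if (best < 0 || ratio_greater(p, &pids[best]))
            best = (int)i;
    }
    return best;
}
```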
During the process of selecting a placement ID for GC, information about the top source candidates is cached in the sbList member of the struct gcPidStats in the struct gcContext. The number of candidates cached is set at compile time by the value of GC_SB_LIST_SIZE. It is a compromise between using the most up-to-date information and the CPU time required to re-enumerate the blocks in a QoS Domain. Because the cache can be smaller than the number of source blocks required to fill the destination, the struct gcContext has a bitmap, in its member srcBitmap, of which blocks have already been used as a source. This is used to prevent a Super Block from being used twice for the same destination.
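A bitmap of this kind can be sketched as follows. SrcBitmap, MAX_SUPER_BLOCKS and the helper functions are hypothetical; the actual type and size of srcBitmap in struct gcContext are not shown in this section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_SUPER_BLOCKS 1024u  /* hypothetical upper bound on block IDs */

/* One bit per Super Block: set once the block has been used as a source. */
typedef struct {
    uint64_t bits[MAX_SUPER_BLOCKS / 64];
} SrcBitmap;

/* Reset the bitmap, for example when a new destination is allocated. */
static void src_bitmap_clear(SrcBitmap *bm)
{
    memset(bm, 0, sizeof *bm);
}

static bool src_bitmap_test(const SrcBitmap *bm, uint32_t blockId)
{
    assert(blockId < MAX_SUPER_BLOCKS);
    return (bm->bits[blockId / 64] >> (blockId % 64)) & 1u;
}

/* Marks the block as used. Returns true if it was newly claimed and
 * false if it had already been used for this destination. */
static bool src_bitmap_claim(SrcBitmap *bm, uint32_t blockId)
{
    if (src_bitmap_test(bm, blockId))
        return false;
    bm->bits[blockId / 64] |= (uint64_t)1 << (blockId % 64);
    return true;
}
```

When the candidate cache is refilled by re-enumerating the domain, a block that was already copied may reappear in the list; checking the claim result before issuing a copy is what filters it out.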
With a placement ID selected for collection, gcToASuperBlock() issues a nameless copy to the SEF Unit for each source Super Block in rank order until the number of ADUs requested to copy is greater than the number of writable ADUs in the destination Super Block. As each nameless copy completes, it is queued to be processed after all the copies have completed and the destination Super Block is closed. gcPostProcessCopyIOCB() processes the completed nameless copies with SEFProcessAddressChangeRequests(), which generates a kAddressUpdate notification event for each copied ADU. The FTL’s QoS Domain notification handler, HandleSEFNotification(), calls SFTUpdate() to do a non-authoritative update of the LUT: an entry is changed to the new flash address only if it is unchanged since the copy. This is done so that whenever there is a race between an LBA being rewritten and garbage collection updating the LUT, the rewrite always wins. Once gcToASuperBlock() completes, the cycle starts again, evaluating the need for GC, selecting a placement ID to collect, and performing the garbage collection.
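The "rewrite always wins" rule amounts to a compare-and-swap on the LUT entry. The sketch below uses C11 atomics as a stand-in; it is an assumption about how such an update could be done and not the implementation of SFTUpdate(). LutEntry, lut_write() and lut_update_if_unchanged() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical LUT entry: the flash address currently mapped to one LBA. */
typedef _Atomic uint64_t LutEntry;

/* Authoritative update: a host write always installs its new address. */
static void lut_write(LutEntry *entry, uint64_t newAddr)
{
    assert(entry != NULL);
    atomic_store(entry, newAddr);
}

/* Non-authoritative update: install the copy's destination address only
 * if the entry still holds the address the data was copied from.
 * Returns false when a rewrite got there first and the entry is kept. */
static bool lut_update_if_unchanged(LutEntry *entry, uint64_t copiedFrom,
                                    uint64_t copiedTo)
{
    assert(entry != NULL);
    uint64_t expected = copiedFrom;
    return atomic_compare_exchange_strong(entry, &expected, copiedTo);
}
```

If the LBA was rewritten between the copy and the notification, the entry no longer matches copiedFrom, the swap fails, and the copied ADU in the destination Super Block is simply left unreferenced.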