public class DupConvert
extends java.lang.Object
Performs post-recovery conversion of all dup DBs during Environment
construction, when upgrading from JE 4.1 and earlier. In JE 5.0, duplicates
are represented by a two-part (key + data) key, and empty data. In JE 4.1
and earlier, the key and data were separate as with non-dup DBs.
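
To make the two-part key idea concrete, here is a minimal sketch that folds a key and its data into a single byte array and splits them apart again. The class name TwoPartKeySketch and the length-prefixed layout are assumptions chosen for illustration only; the encoding JE actually uses is defined by the DupKeyData class and is not reproduced here.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical illustration; not JE's DupKeyData and not its on-disk format.
final class TwoPartKeySketch {

    /** Builds [4-byte key length][key bytes][data bytes]. */
    static byte[] combine(byte[] key, byte[] data) {
        ByteBuffer buf = ByteBuffer.allocate(4 + key.length + data.length);
        buf.putInt(key.length).put(key).put(data);
        return buf.array();
    }

    /** Extracts the original key portion of a combined key. */
    static byte[] keyOf(byte[] twoPartKey) {
        int keyLen = ByteBuffer.wrap(twoPartKey).getInt();
        return Arrays.copyOfRange(twoPartKey, 4, 4 + keyLen);
    }

    /** Extracts the original data portion of a combined key. */
    static byte[] dataOf(byte[] twoPartKey) {
        int keyLen = ByteBuffer.wrap(twoPartKey).getInt();
        return Arrays.copyOfRange(twoPartKey, 4 + keyLen, twoPartKey.length);
    }
}
```

After combining, the record's data portion can be empty, because everything needed to recover it is carried in the key.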
Uses the DbTree.DUPS_CONVERTED_BIT to determine whether conversion of the
environment is necessary. When all databases are successfully converted,
this bit is set and the mapping tree is flushed. See
EnvironmentImpl.convertDupDatabases.
Uses DatabaseImpl.DUPS_CONVERTED to determine whether an individual database
has been converted, to handle the case where the conversion crashes and is
restarted later. When a database is successfully converted, this bit is set
and the entire database is flushed using Database.sync.
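
The interplay of the environment-wide bit and the per-database flag can be sketched as a restartable loop. This is a simplified in-memory model, not the JE implementation: the class RestartableConversionSketch, its fields, and the simulated crash parameter are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the two flag levels; not JE code.
final class RestartableConversionSketch {

    static final class Db {
        final String name;
        boolean dupsConverted;      // stands in for the per-database flag
        Db(String name) { this.name = name; }
    }

    boolean envDupsConverted;       // stands in for the environment-wide bit
    final List<Db> dbs = new ArrayList<>();
    final List<String> converted = new ArrayList<>(); // order in which DBs were converted

    /** Converts every unconverted DB; crashAt names a DB at which to simulate a crash, or null. */
    void convertAll(String crashAt) {
        if (envDupsConverted) {
            return;                 // nothing to do for this environment
        }
        for (Db db : dbs) {
            if (db.dupsConverted) {
                continue;           // finished before an earlier crash
            }
            if (db.name.equals(crashAt)) {
                throw new IllegalStateException("simulated crash at " + db.name);
            }
            converted.add(db.name); // the real work would happen here
            db.dupsConverted = true;
        }
        envDupsConverted = true;    // set only after all DBs succeed
    }
}
```

A run that fails partway leaves the environment-wide flag clear, so the next run skips the databases already marked and resumes with the first unconverted one.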
The conversion of each database is atomic -- either all INs or none are
converted and made durable. This is accomplished by putting the database
into Deferred Write mode so that splits are not logged and eviction is
provisional (eviction will not flush the root IN if it is dirty). The
Deferred Write mode is cleared after conversion is complete and
Database.sync has been called.
The memory budget is updated during conversion and daemon eviction is
invoked periodically. This provides support for arbitrarily large DBs.
Uses preload to load all dup trees (DINs/DBINs) prior to conversion, to
minimize random I/O. See EnvironmentConfig.ENV_DUP_CONVERT_PRELOAD_ALL.
The preload config does not specify loading of LNs, because we do not need
to load LNs from DBINs. The fact that DBIN LNs are not loaded is the main
reason that conversion is quick. LNs are converted lazily instead; see
LNLogEntry.postFetchInit. The DBIN LNs do not need to be loaded because the
DBIN slot key contains the LN 'data' that is needed to create the two-part
key.
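
Lazy conversion of an LN when it is fetched might look roughly like the sketch below. It is a toy model under stated assumptions: the class LazyLnConversionSketch, the string-based key and data, the "/" separator, and the oldFormat field are all hypothetical, and the real logic lives in LNLogEntry.postFetchInit.

```java
// Hypothetical model of convert-on-fetch; not the real LNLogEntry.
final class LazyLnConversionSketch {

    static final class LnEntry {
        String key;
        String data;
        boolean oldFormat;   // true if written with separate key and data

        LnEntry(String key, String data, boolean oldFormat) {
            this.key = key;
            this.data = data;
            this.oldFormat = oldFormat;
        }

        /** Invoked after the entry is read from the log; converts at most once. */
        void postFetch(boolean isDupDb) {
            if (isDupDb && oldFormat) {
                key = key + "/" + data;  // stand-in for the two-part key
                data = "";               // data is empty after conversion
                oldFormat = false;
            }
        }
    }
}
```

Because the check runs on every fetch, entries that are never read are never rewritten, and entries already in the new form pass through unchanged.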
Even when LN loading is not configured, it turns out that preload does load
BIN (not DBIN) LNs in a dup DB, which is what we want. The singleton LNs
must be loaded in order to get the LN data to create the two-part key. When
preload has not loaded a singleton LN, it will be fetched during conversion.
The DIN, DBIN and DupCountLN LSNs are counted obsolete during conversion using
a local utilization tracker. The tracker must not be flushed until the
conversion of a database is complete. Inexact counting can be used, because
DIN/DBIN/DupCountLN entries are automatically considered obsolete by the
cleaner. Since only totals are tracked, the memory overhead of the local
tracker is not substantial.
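
A totals-only local tracker can be pictured as a small map of counters that is merged into the shared totals once, at the end. The sketch below assumes counts are kept per log file number; that granularity, the class name LocalTrackerSketch, and its methods are assumptions for illustration rather than the JE utilization tracker API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical totals-only tracker; not JE's utilization tracker.
final class LocalTrackerSketch {

    // Obsolete entry count per log file; no per-entry offsets are kept.
    private final Map<Long, Integer> obsoleteByFile = new HashMap<>();

    /** Records one obsolete entry located in the given log file. */
    void countObsolete(long fileNumber) {
        obsoleteByFile.merge(fileNumber, 1, Integer::sum);
    }

    /** Merges the local totals into the shared totals, then resets. */
    void flushInto(Map<Long, Integer> global) {
        obsoleteByFile.forEach((file, n) -> global.merge(file, n, Integer::sum));
        obsoleteByFile.clear();
    }
}
```

Memory use grows with the number of distinct files touched, not with the number of obsolete entries, which is why holding the counts until a database finishes is inexpensive.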
Database Conversion Algorithm
-----------------------------
1. Set Deferred Write mode for the database. Preload the database, including
INs/BINs/DINs/DBINs, but not LNs except for singleton LNs (LNs with a BIN
parent).
2. Convert all IN/BIN keys to "prefix keys", which are defined by the
DupKeyData class. This allows tree searches and slot insertions to work
correctly as the conversion is performed.
3. Traverse through the BIN slots in forward order.
4. If a singleton LN is encountered, ensure it is loaded. IN.fetchTarget
automatically updates the slot key if the LNLogEntry's key is different
from the one already in the slot. Because LNLogEntry's key is converted
on the fly, a two-part key is set in the slot as a side effect of
fetching the LN.
5. If a DIN is encountered, first delete the BIN slot containing the DIN.
Then iterate through all LNs in the DBINs of this dup tree, assign each
a two-part key, and insert the slot into a BIN. The LSN and state flags
of the DBIN slot are copied to the new BIN slot.
6. If a deleted singleton (BIN) LN is encountered, delete the slot rather
than converting the key. If a deleted DBIN LN is encountered, simply
discard it.
7. Count the DIN and DupCountLN LSNs obsolete for each DIN encountered, using
a local utilization tracker.
8. When all BIN slots have been processed, set the
DatabaseImpl.DUPS_CONVERTED flag, call Database.sync to flush all INs and
the MapLN, clear Deferred Write mode, and flush the local utilization
tracker.
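
The core of steps 3 through 7 can be sketched over a simplified in-memory model. This is not the JE implementation: the classes OldSlot, Entry and Result are hypothetical, keys and data are plain strings, a two-part key is written as "key/data", and preloading, Deferred Write mode, flags and flushing (steps 1, 2 and 8) are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the per-slot conversion loop; not JE code.
final class DupConvertAlgorithmSketch {

    static final class Entry {
        final String data;
        final boolean deleted;
        Entry(String data, boolean deleted) { this.data = data; this.deleted = deleted; }
    }

    /** A pre-conversion BIN slot: a singleton LN or the root of a dup tree. */
    static final class OldSlot {
        final String key;
        final boolean dupTree;       // true if the slot pointed at a DIN
        final List<Entry> entries;   // exactly one entry for a singleton LN
        OldSlot(String key, boolean dupTree, Entry... entries) {
            this.key = key;
            this.dupTree = dupTree;
            this.entries = Arrays.asList(entries);
        }
    }

    static final class Result {
        final List<String> twoPartKeys = new ArrayList<>();
        int obsoleteDupTrees;        // totals only, as with the local tracker
    }

    static Result convert(List<OldSlot> binSlots) {
        Result result = new Result();
        for (OldSlot slot : binSlots) {               // step 3: forward order
            if (slot.dupTree) {
                result.obsoleteDupTrees++;            // step 7: count DIN obsolete
            }
            for (Entry e : slot.entries) {
                if (e.deleted) {
                    continue;                         // step 6: drop deleted LNs
                }
                // steps 4 and 5: each live LN gets its own two-part-key slot
                result.twoPartKeys.add(slot.key + "/" + e.data);
            }
        }
        return result;
    }
}
```

In this model a dup tree with three entries, one of them deleted, yields two slots in the output, and the slot that pointed at the dup tree disappears.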