Release 6.2.7
In order to perform write operations in such cases, the application must now call TransactionConfig.setLocalWrite(true) and use this configuration to create a Transaction for performing writes to the non-replicated database.
In addition, it is no longer possible to use a single transaction to write to both replicated and non-replicated databases. IllegalOperationException will be thrown if this is attempted.
These changes were necessary to prevent corruption when a transaction contains write operations for both replicated and non-replicated databases, and a failover occurs that causes a rollback of this transaction. The probability of corruption is low, but it can occur under the right conditions.
For more information see the javadoc for TransactionConfig.setLocalWrite(true), and the "Non-replicated Databases in a Replicated Environment" section of the ReplicatedEnvironment class javadoc.
One of two utility programs must be used. Both are available in the release package for JE 4.1.20, or a later release of JE 4.1. If you are currently running a release earlier than JE 4.1.20, then you must download the latest JE 4.1 release package in order to run these utilities.
The steps for upgrading are as follows.
Environment:
java -jar je-4.1.20.jar DbPreUpgrade_4_1 -h <dir>
If you are using a JE ReplicatedEnvironment:
java -jar je-4.1.20.jar DbRepPreUpgrade_4_1
-h <dir>
-groupName <group name>
-nodeName <node name>
-nodeHostPort <host:port>
The second step -- running the utility program -- does not perform data conversion. This step simply performs a special checkpoint to prepare the environment for upgrade. It should take no longer than an ordinary startup and shutdown.
During the last step -- when the application opens the JE environment using the
current release (JE 5 or later) -- all databases configured for duplicates will
automatically be converted before the Environment or
ReplicatedEnvironment constructor returns. Note that a database
might be explicitly configured for duplicates using
DatabaseConfig.setSortedDuplicates(true), or implicitly configured
for duplicates by using a DPL MANY_TO_XXX relationship
(Relationship.MANY_TO_ONE or
Relationship.MANY_TO_MANY).
The duplicate database conversion only rewrites internal nodes in the Btree, not leaf nodes. In a test with a 500 MB cache, conversion of a 10 million record data set (8 byte key and data) took between 1.5 and 6.5 minutes, depending on the number of duplicates per key. The high end of this range was measured with 10 duplicates per key; the low end with 1 million duplicates per key.
To make the duplicate database conversion predictable during deployment, users
should measure the conversion time on a non-production system before upgrading
a deployed system. When duplicates are converted, the Btree internal nodes are
preloaded into the JE cache. A new configuration option,
EnvironmentConfig.ENV_DUP_CONVERT_PRELOAD_ALL, can be set to false
to optimize this process if the cache is not large enough to hold the internal
nodes for all databases. For more information, see the javadoc for this
property.
If an application has no databases configured for duplicates, then the last step simply opens the JE environment normally, and no data conversion is performed.
If the user fails to run the DbPreUpgrade_4_1 or DbRepPreUpgrade_4_1 utility
program before opening an environment with JE 5 or later for the first time, an
exception such as the following will normally be thrown by the
Environment or ReplicatedEnvironment constructor:
com.sleepycat.je.EnvironmentFailureException: (JE 6.0.1) JE 4.1 duplicate DB
entries were found in the recovery interval. Before upgrading to JE 5.0, the
following utility must be run using JE 4.1 (4.1.20 or later):
DbPreUpgrade_4_1. See the change log.
UNEXPECTED_STATE: Unexpected internal state, may have side effects.
at com.sleepycat.je.EnvironmentFailureException.unexpectedState(EnvironmentFailureException.java:376)
at com.sleepycat.je.recovery.RecoveryManager.checkLogVersion8UpgradeViolations(RecoveryManager.java:2694)
at com.sleepycat.je.recovery.RecoveryManager.buildTree(RecoveryManager.java:549)
at com.sleepycat.je.recovery.RecoveryManager.recover(RecoveryManager.java:198)
at com.sleepycat.je.dbi.EnvironmentImpl.finishInit(EnvironmentImpl.java:610)
...
If the user fails to run the DbPreUpgrade_4_1 or DbRepPreUpgrade_4_1 utility
program, but no exception is thrown when the environment is opened with JE 5
or later, this is probably because the application performed an
Environment.sync before last closing the environment with JE 4.1
or earlier, and nothing else happened to be written (by the application or JE
background threads) after the sync operation. In this case, running the
upgrade utility is not necessary.
[#23775] (6.2.0)
For background and previous work in this area, see the changelog for the 6.1 release. In this release we have extended the set of CRUD operations that are performed in BIN-deltas, without the need to mutate them to full BINs (thus saving the disk reads that would be required to fetch the full BINs into memory). Specifically, the following additional operations can now exploit BIN-deltas:
Insertions and updates, when no tree node splits are required and the key of the record to be inserted/updated is found in a BIN-delta.
Blind operations: we say that a record operation (insertion, update, or deletion) is performed "blindly" in a BIN-delta, when the delta does not contain a slot with the operation's key and we don't need to access the full BIN to check whether such a slot exists there or to extract any information from the full-BIN slot, if it exists. The condition that no tree node splits are required applies to blind operations as well. The following operations can be performed blindly:
- Replay of insertions at replica nodes.
- Insertions during recovery redo.
- Updates and deletes during recovery redo, for databases with duplicates.
A new statistic has been added to count the number of blind operations performed,
including the blind put operations described below. This count can be obtained
via the EnvironmentStats.getNBINDeltaBlindOps() method.
[#23680] (6.2.0)
Normally, blind puts are not possible: we need to know whether the put is actually an update or an insertion, i.e., whether the key exists in the full BIN or not. Furthermore, in the case of an update we also need to know the location of the previous record version to make the current update abortable. However, it is possible to answer at least the key existence question by adding a small amount of extra information to the deltas. If we do so, puts that are actual insertions can be done blindly.
To answer whether a key exists in a full BIN or not, each BIN-delta stores a bloom filter, which is a very compact, approximate representation of the set of keys in the full BIN. Bloom filters can answer set membership questions with no false negatives and very low probability of false positives. As a result, put operations that are actual insertions can almost always be performed blindly.
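The membership test described above can be illustrated with a minimal, generic Bloom filter over byte-array keys. This is only a sketch under stated assumptions: the class name, the bit-array size, the number of hash functions, and the hash functions themselves are all hypothetical choices made for illustration, and none of it reflects how JE actually encodes the filter inside a BIN-delta.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Illustrative Bloom filter over byte-array keys (hypothetical; not JE's internal code). */
final class KeyBloomFilterSketch {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    KeyBloomFilterSketch(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    /** Records a key by setting numHashes bit positions derived from it. */
    void add(byte[] key) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(position(key, i));
        }
    }

    /**
     * Returns false only if the key was definitely never added (no false negatives).
     * Returns true if the key may have been added (false positives are possible).
     */
    boolean mightContain(byte[] key) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(position(key, i))) {
                return false;
            }
        }
        return true;
    }

    /** A put can be treated as a blind insertion only when the key is definitely absent. */
    boolean canInsertBlindly(byte[] key) {
        return !mightContain(key);
    }

    /** Double hashing: the i-th position is (h1 + i * h2) mod numBits. */
    private int position(byte[] key, int i) {
        int h1 = Arrays.hashCode(key);
        int h2 = fnv1a(key) | 1; // force an odd step so successive probes differ
        return Math.floorMod(h1 + i * h2, numBits);
    }

    /** 32-bit FNV-1a hash, used here only as a second, independent-looking hash. */
    private static int fnv1a(byte[] key) {
        int h = 0x811C9DC5;
        for (byte b : key) {
            h ^= (b & 0xff);
            h *= 0x01000193;
        }
        return h;
    }
}
```

In this sketch, a false positive only costs the optimization (the put falls back to the non-blind path), while the absence of false negatives is what keeps a blind insertion from overwriting an existing key unnoticed.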
To enable the blind put optimization in JE databases that use custom
BTree and/or duplicates comparators, these comparators must perform "binary
equality", that is, they must consider two keys (byte arrays) to be equal if
and only if they have the same length and are equal byte for byte. To
communicate to the JE engine that a comparator does binary equality, the
comparator must implement the new BinaryEqualityComparator tag
interface.
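As an illustration of the "binary equality" property, the following sketch shows a comparator that returns 0 if and only if the two byte arrays have the same length and identical bytes. To keep the example free of external dependencies, it uses a locally defined marker interface, BinaryEqualityTagSketch, as a hypothetical stand-in for JE's BinaryEqualityComparator; a real comparator would implement the JE interface instead. The class name and the unsigned byte ordering are also assumptions made for illustration.

```java
import java.util.Comparator;

/** Hypothetical local stand-in for JE's BinaryEqualityComparator tag interface. */
interface BinaryEqualityTagSketch {}

/**
 * Compares keys as unsigned bytes, left to right, with shorter keys sorting first
 * when one key is a prefix of the other. Returns 0 only for byte-identical keys.
 */
final class UnsignedBytesComparatorSketch
        implements Comparator<byte[]>, BinaryEqualityTagSketch {

    @Override
    public int compare(byte[] a, byte[] b) {
        int common = Math.min(a.length, b.length);
        for (int i = 0; i < common; i++) {
            int x = a[i] & 0xff; // treat bytes as unsigned values 0..255
            int y = b[i] & 0xff;
            if (x != y) {
                return x < y ? -1 : 1;
            }
        }
        // All shared bytes match; keys are equal only if the lengths also match.
        return Integer.compare(a.length, b.length);
    }
}
```

By contrast, a comparator that ignores case, trailing padding, or some portion of the key would treat byte-different keys as equal, and so would not satisfy the binary equality requirement.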
[#23768] (6.2.1)
[#23660] (6.2.1)
[#23326] (6.2.2)
[#23687] (6.2.2)
Exception in thread "main" com.sleepycat.je.DatabaseNotFoundException: (JE 6.1.5) Attempted to remove non-existent database ...
at com.sleepycat.je.dbi.DbTree.lockNameLN(DbTree.java:869)
at com.sleepycat.je.dbi.DbTree.doRemoveDb(DbTree.java:1130)
at com.sleepycat.je.dbi.DbTree.dbRemove(DbTree.java:1183)
at com.sleepycat.je.Environment$1.runWork(Environment.java:947)
at com.sleepycat.je.Environment$DbNameOperation.runOnce(Environment.java:1172)
at com.sleepycat.je.Environment$DbNameOperation.run(Environment.java:1155)
at com.sleepycat.je.Environment.removeDatabase(Environment.java:941)
...
A workaround for the problem in earlier releases is to avoid using read-committed for a transaction used to perform a DB remove or truncate operation.
[#23821] (6.2.3)
com.sleepycat.je.EnvironmentFailureException: Environment invalid because of
previous exception: (JE 6.1.0) ...
at com.sleepycat.je.EnvironmentFailureException.unexpectedException(EnvironmentFailureException.java:315)
at com.sleepycat.je.log.LogManager.serialLog(LogManager.java:477)
at com.sleepycat.je.log.LogManager.logItems(LogManager.java:419)
at com.sleepycat.je.log.LogManager.multiLog(LogManager.java:324)
at com.sleepycat.je.log.LogManager.log(LogManager.java:272)
at com.sleepycat.je.log.LogManager.log(LogManager.java:261)
at com.sleepycat.je.log.LogManager.log(LogManager.java:223)
at com.sleepycat.je.dbi.EnvironmentImpl.rewriteMapTreeRoot(EnvironmentImpl.java:1285)
at com.sleepycat.je.cleaner.FileProcessor.processFile(FileProcessor.java:701)
at com.sleepycat.je.cleaner.FileProcessor.doClean(FileProcessor.java:274)
at com.sleepycat.je.cleaner.FileProcessor.onWakeup(FileProcessor.java:137)
at com.sleepycat.je.utilint.DaemonThread.run(DaemonThread.java:148)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 111
at com.sleepycat.util.PackedInteger.writeInt(PackedInteger.java:188)
at com.sleepycat.je.log.LogUtils.writePackedInt(LogUtils.java:155)
at com.sleepycat.je.cleaner.DbFileSummary.writeToLog(DbFileSummary.java:79)
at com.sleepycat.je.dbi.DatabaseImpl.writeToLog(DatabaseImpl.java:2410)
at com.sleepycat.je.dbi.DbTree.writeToLog(DbTree.java:2050)
at com.sleepycat.je.log.entry.SingleItemEntry.writeEntry(SingleItemEntry.java:114)
at com.sleepycat.je.log.LogManager.marshallIntoBuffer(LogManager.java:745)
at com.sleepycat.je.log.LogManager.serialLogWork(LogManager.java:611)
at com.sleepycat.je.log.LogManager.serialLog(LogManager.java:461)
... 11 more
Another instance of the same problem with a slightly different stack trace is
below:
java.nio.BufferOverflowException UNEXPECTED_EXCEPTION_FATAL: Unexpected
internal Exception, unable to continue. Environment is invalid and must be
closed.
at com.sleepycat.je.EnvironmentFailureException.unexpectedException(EnvironmentFailureException.java:315)
at com.sleepycat.je.log.LogManager.serialLog(LogManager.java:481)
at com.sleepycat.je.log.LogManager.logItems(LogManager.java:423)
at com.sleepycat.je.log.LogManager.multiLog(LogManager.java:325)
at com.sleepycat.je.log.LogManager.log(LogManager.java:273)
at com.sleepycat.je.tree.LN.logInternal(LN.java:600)
at com.sleepycat.je.tree.LN.log(LN.java:411)
at com.sleepycat.je.cleaner.FileProcessor.processFoundLN(FileProcessor.java:1070)
at com.sleepycat.je.cleaner.FileProcessor.processLN(FileProcessor.java:884)
at com.sleepycat.je.cleaner.FileProcessor.processFile(FileProcessor.java:673)
at com.sleepycat.je.cleaner.FileProcessor.doClean(FileProcessor.java:278)
at com.sleepycat.je.cleaner.FileProcessor.onWakeup(FileProcessor.java:137)
at com.sleepycat.je.utilint.DaemonThread.run(DaemonThread.java:148)
Caused by: java.nio.BufferOverflowException
at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:189)
at java.nio.ByteBuffer.put(ByteBuffer.java:859)
at com.sleepycat.je.log.LogUtils.writeBytesNoLength(LogUtils.java:350)
at com.sleepycat.je.log.entry.LNLogEntry.writeBaseLNEntry(LNLogEntry.java:371)
at com.sleepycat.je.log.entry.LNLogEntry.writeEntry(LNLogEntry.java:333)
at com.sleepycat.je.log.entry.BaseReplicableEntry.writeEntry(BaseReplicableEntry.java:48)
at com.sleepycat.je.log.entry.LNLogEntry.writeEntry(LNLogEntry.java:52)
at com.sleepycat.je.log.LogManager.marshallIntoBuffer(LogManager.java:751)
at com.sleepycat.je.log.LogManager.serialLogWork(LogManager.java:617)
at com.sleepycat.je.log.LogManager.serialLog(LogManager.java:465)
[#23492] (6.2.3)
The bug shows up with cursors using read-committed isolation. Here is the specific scenario:
1. Cursor C1 in thread T1 reads a record R using Transaction X1. C1 creates a ReadCommittedLocker L1, with X1 as its buddy. L1 locks R.
2. Cursor C2 in thread T2 tries to write-lock R, using another Transaction X2. X2 waits for L1 (T2 waits for T1).
3. Cursor C3 in thread T1 tries to read R using X1. C3 creates a ReadCommittedLocker L3, with X1 as its buddy. L3 tries to lock R. L1 and L3 are not recognized as buddies, so L3 waits for X2 (T1 waits for T2).
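The circular wait in this scenario (T2 waits for T1, then T1 waits for T2) can be made concrete with a small wait-for graph. This is only an illustrative model, assuming each waiter blocks on a single owner; the class and method names are hypothetical and do not correspond to JE's lock manager.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative wait-for graph in which each waiter blocks on at most one owner. */
final class WaitForGraphSketch {
    private final Map<String, String> waitsFor = new HashMap<>();

    /** Records that 'waiter' is blocked until 'owner' releases a lock. */
    void addWait(String waiter, String owner) {
        waitsFor.put(waiter, owner);
    }

    /** Follows the wait chain from 'start'; returns true if the chain revisits a node. */
    boolean hasCycleFrom(String start) {
        Set<String> seen = new HashSet<>();
        String current = start;
        while (current != null) {
            if (!seen.add(current)) {
                return true; // node already visited: the chain loops, so no one can proceed
            }
            current = waitsFor.get(current);
        }
        return false; // chain ended at a node that is not waiting
    }
}
```

Replaying the scenario in this model, step 2 adds the edge T2 -> T1, which is harmless by itself. Step 3 adds T1 -> T2 because L3 is not recognized as a buddy of L1, and the chain starting at T1 then loops back to T1.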
[#23821] (6.2.4)