m4_comment([$Id: faq.so,v 10.11 2006/05/09 19:46:59 bostic Exp $]) m4_ref_title(m4_tam Applications, Transaction FAQ, @transaction FAQ, transapp/throughput, rep/intro) m4_nlistbegin m4_nlist([dnl m4_bold([What should a transactional program do when an error occurs?]) m4_p([dnl Any time an error occurs, such that a transactionally protected set of operations cannot complete successfully, the transaction must be aborted. While deadlock is by far the most common of these errors, there are other possibilities; for example, running out of disk space for the filesystem. In m4_db transactional applications, there are three classes of error returns: "expected" errors, "unexpected but recoverable" errors, and a single "unrecoverable" error. Expected errors are errors like m4_ref(DB_NOTFOUND), which indicates that a searched-for key item is not present in the database. Applications may want to explicitly test for and handle this error, or, in the case where the absence of a key implies the enclosing transaction should fail, simply call m4_ref(txn_abort). Unexpected but recoverable errors are errors like m4_ref(DB_LOCK_DEADLOCK), which indicates that an operation has been selected to resolve a deadlock, or a system error such as EIO, which likely indicates that the filesystem has no available disk space. Applications must immediately call m4_ref(txn_abort) when these returns occur, as it is not possible to proceed otherwise. The only unrecoverable error is m4_ref(DB_RUNRECOVERY), which indicates that the system must stop and recovery must be run.])]) m4_nlist([dnl m4_bold([How can hot backups work? Can't you get an inconsistent picture of the database when you copy it?]) m4_p([dnl First, m4_db is based on the technique of "write-ahead logging", which means that before any change is made to a database, a log record is written that describes the change. Further, m4_db guarantees that the log record that describes the change will always be written to stable storage (that is, disk) before the database page where the change was made is written to stable storage. Because of this guarantee, we know that any change made to a database will appear either in just a log file, or both the database and a log file, but never in just the database.]) m4_p([dnl Second, you can always create a consistent and correct database based on the log files and the databases from a database environment. So, during a hot backup, we first make a copy of the databases and then a copy of the log files. The tricky part is that there may be pages in the database that are related for which we won't get a consistent picture during this copy. For example, let's say that we copy pages 1-4 of the database, and then are swapped out. For whatever reason (perhaps because we needed to flush pages from the cache, or because of a checkpoint), the database pages 1 and 5 are written. Then, the hot backup process is re-scheduled, and it copies page 5. Obviously, we have an inconsistent database snapshot, because we have a copy of page 1 from before it was written by the other thread of control, and a copy of page 5 after it was written by the other thread. What makes this work is the order of operations in a hot backup. Because of the write-ahead logging guarantees, we know that any page written to the database will first be referenced in the log. If we copy the database first, then we can also know that any inconsistency in the database will be described in the log files, and so we know that we can fix everything up during recovery.])]) m4_nlist([dnl m4_bold([My application has m4_ref(DB_LOCK_DEADLOCK) errors. Is the normal, and what should I do?]) m4_p([dnl It is quite rare for a transactional application to be deadlock free. All applications should be prepared to handle deadlock returns, because even if the application is deadlock free when deployed, future changes to the application or the m4_db implementation might introduce deadlocks.]) m4_p([dnl Practices which reduce the chance of deadlock include: m4_bulletbegin m4_bullet([dnl Not using cursors which move backwards through the database (m4_ref(DB_PREV)), as backward scanning cursors can deadlock with page splits;]) m4_bullet([dnl Configuring m4_ref(DB_REVSPLITOFF) to turn off reverse splits in applications which repeatedly delete and re-insert the same keys, to minimize the number of page splits as keys are re-inserted;]) m4_bullet([dnl Not configuring m4_ref(DB_READ_UNCOMMITTED) as that flag requires write transactions upgrade their locks when aborted, which can lead to deadlock. Generally, m4_ref(DB_READ_COMMITTED) or non-transactional read operations are less prone to deadlock than m4_ref(DB_READ_UNCOMMITTED).]) m4_bulletend])]) m4_nlist([dnl m4_bold([How can I move a database from one transactional environment into another?]) m4_p([dnl Because database pages contain references to log records, databases cannot be simply moved into different database environments. To move a database into a different environment, dump and reload the database before moving it. If the database is too large to dump and reload, the database may be prepared in place using the m4_refT(dbenv_lsn_reset) or the m4_option(r) argument to the m4_link(M4RELDIR/utility/db_load, db_load) utility.])]) m4_nlist([dnl m4_bold([I'm seeing the error "log_flush: LSN past current end-of-log", what does that mean?]) m4_p([dnl The most common cause of this error is that a system administrator has removed all of the log files from a database environment. You should shut down your database environment as gracefully as possible, first flushing the database environment cache to disk, if that's possible. Then, dump and reload your databases. If the database is too large to dump and reload, the database may be reset in place using the m4_refT(dbenv_lsn_reset) or the m4_option(r) argument to the m4_link(M4RELDIR/utility/db_load, db_load) utility. However, if you reset the database in place, you should verify your databases before using them again. (It is possible for the databases to be corrupted by running after all of the log files have been removed, and the longer the application runs, the worse it can get.)])]) m4_nlistend m4_page_footer