Managing SQL Server Transaction Logs

A SQL Server database consists of two types of physical files: data files, which store the data and database objects such as tables and indexes, and transaction log files, in which SQL Server records every transaction performed on that database.

The SQL Server transaction log contains metadata about each recorded transaction, such as the transaction ID, the transaction start and end times, the changed pages, and the data changes themselves.

The SQL Server transaction log is an important database component that ensures the data of committed transactions is preserved and the changes of uncommitted transactions are rolled back. It is especially useful after a hardware or application failure, because it allows the database to be restored to a previous point in time: each log record is written to the log file before the corresponding data in the buffer cache is written to the data file.

During database recovery, any transaction that was committed but whose data changes had not yet reached the data files at the time of the failure is redone, because its log records were written to the transaction log file before the data was written to the data file. Conversely, any data changes made by uncommitted transactions are rolled back by reversing the operations recorded in the transaction log file. In this way, SQL Server guarantees database consistency.

Data is written to the database data files randomly, but log records are written to the transaction log file sequentially, one after another, which makes log writes faster. Having multiple transaction log files in a database brings no performance gain, because log records are still written to one file at a time. An additional log file is useful only when the existing transaction log file becomes full and you need to create another file on a different disk that has more space.
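As a minimal sketch of that last case, a second log file can be added with ALTER DATABASE. The logical file name, the path on drive E:, and the sizes below are assumptions made for illustration only; they do not come from the original article.

```sql
-- Sketch only: logical name, path, and sizes are hypothetical placeholders.
ALTER DATABASE SQLShackDemo
ADD LOG FILE
(
    NAME = SQLShackDemo_log2,                          -- hypothetical logical name
    FILENAME = 'E:\SQLLogs\SQLShackDemo_log2.ldf',     -- hypothetical path on a disk with free space
    SIZE = 1024MB,
    FILEGROWTH = 512MB
);
```

Because the extra file adds capacity rather than speed, it is usually treated as a temporary measure until the original log can be backed up and reused.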

It is also recommended to keep the transaction log file on a separate drive from the database data files, as placing both on the same drive can result in poor database performance.

Internally, the transaction log file consists of smaller units called Virtual Log Files (VLFs).

The process in which SQL Server marks VLFs as inactive, making them available for reuse by the database, is called truncation. A VLF can be marked as inactive only when it contains no active log records related to an open transaction, because those records may still be needed for an undo or redo operation. Log truncation does not reduce the size of the transaction log file; it only marks space as inactive so that it can be reused.

Once a transaction log record is inserted into the log file, it is identified by a unique value called the Log Sequence Number (LSN). The record with the lowest LSN that may still be needed for any database activity marks the start of the active log, and the record with the highest LSN marks its end. Any log record with an LSN lower than that minimum LSN is considered inactive.

Transaction log records that are needed by SQL Server Replication, Mirroring, Change Data Capture, or log backup processes remain active until those activities release them.

The database recovery model property determines how transactions are logged. A newly created database inherits the recovery model of the model system database by default. If the recovery model is SIMPLE, the data pages are written to the data files at each checkpoint, and all VLFs that contain no active log records are truncated. If the recovery model is FULL or BULK_LOGGED, VLFs are truncated only after the log is backed up, unless other activities still need those log records, as mentioned previously.
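To see which recovery model a database is currently using, and what new databases will inherit from model, the recovery_model_desc column of sys.databases can be queried. This is a small sketch that reuses the article's SQLShackDemo database name.

```sql
-- Compare the recovery model of the demo database with that of the model database.
SELECT name,
       recovery_model_desc      -- SIMPLE, FULL or BULK_LOGGED
FROM sys.databases
WHERE name IN (N'model', N'SQLShackDemo');
```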

If the transaction log file grows in small chunks, it ends up with a huge number of VLFs; if space is allocated in large chunks, the number of VLFs stays small. A huge number of VLFs hurts database performance, especially during database backup, restore, and recovery operations. It is therefore recommended to set appropriate initial size, maximum size, and autogrowth increment values for the transaction log, as every growth of that file is expensive.
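One possible way to apply these settings is ALTER DATABASE ... MODIFY FILE. The sketch below assumes the log file's logical name is SQLShackDemo_log (check sys.database_files for the real name), and the sizes are arbitrary illustrations rather than recommendations. Each property is changed in its own statement, since the documentation describes MODIFY FILE as changing one file property at a time.

```sql
-- Sketch only: the logical file name and all sizes are assumptions.
-- SIZE must be larger than the file's current size.
ALTER DATABASE SQLShackDemo
MODIFY FILE (NAME = SQLShackDemo_log, SIZE = 4096MB);

-- Cap the file so it cannot consume the whole disk.
ALTER DATABASE SQLShackDemo
MODIFY FILE (NAME = SQLShackDemo_log, MAXSIZE = 20480MB);

-- Grow in fixed, reasonably large increments instead of small or percentage steps.
ALTER DATABASE SQLShackDemo
MODIFY FILE (NAME = SQLShackDemo_log, FILEGROWTH = 512MB);
```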

You can easily check the number of VLFs in your database by running the simple DBCC command below, which returns one row per VLF:

 
USE SQLShackDemo 
GO
DBCC LOGINFO
 

In this example, the command returns only 4 rows, which means that the SQLShackDemo database log file contains only 4 VLFs.
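DBCC LOGINFO is an undocumented command. On SQL Server 2016 SP2 and later, the documented sys.dm_db_log_info function exposes the same per-VLF information, so the count can be obtained with an ordinary query. The sketch below assumes such a version is in use.

```sql
-- Requires SQL Server 2016 SP2 or later; each row returned by the function is one VLF.
SELECT COUNT(*)                      AS TotalVLFs,
       SUM(CAST(vlf_active AS int))  AS ActiveVLFs   -- VLFs that still hold active log
FROM sys.dm_db_log_info(DB_ID(N'SQLShackDemo'));
```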

Before space is allocated to the transaction log file, the operating system fills that space with zeroes. This prevents database corruption that could be caused by processing data previously written on that disk that happens to have the same parity bits as the transaction log records. The Instant File Initialization feature skips this zeroing step, but only for the database data files, not for the transaction log.

It is important to monitor transaction log file growth, because the log can expand until it takes all the free space on the disk. SQL Server then raises error 9002, reporting that the transaction log is full, and the database effectively becomes read-only because no further modifications can be logged.
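A simple way to keep an eye on log usage is sketched below. DBCC SQLPERF(LOGSPACE) reports every database on the instance, while sys.dm_db_log_space_usage (SQL Server 2012 and later) reports only the current database; both could be scheduled or wired into an alert.

```sql
-- Log size and percentage used for every database on the instance.
DBCC SQLPERF(LOGSPACE);
GO

-- The same information for the current database only.
USE SQLShackDemo;
GO
SELECT total_log_size_in_bytes / 1048576.0 AS TotalLogSizeMB,
       used_log_space_in_bytes / 1048576.0 AS UsedLogSpaceMB,
       used_log_space_in_percent           AS UsedLogSpacePercent
FROM sys.dm_db_log_space_usage;
```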

The first step in managing the transaction log is to review the transaction log file information, which can be retrieved from the sys.database_files catalog view as follows:

 
USE SQLShackDemo 
GO
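-- size, max_size and a fixed growth value are stored as numbers of 8-KB pages
SELECT name AS FileName,
  CAST(size AS bigint) * 8 AS FileSizeInKB,
  CASE WHEN max_size = -1 THEN NULL          -- NULL: grows until the disk is full
       ELSE CAST(max_size AS bigint) * 8
  END AS FileSizeLimitInKB,
  growth AS SizeIncrementAmount,             -- 8-KB pages, or a percentage when is_percent_growth = 1
  is_percent_growth
FROM sys.database_files
WHERE type_desc = 'LOG'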
 

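From the query result you can check the transaction log file's current size, the maximum size limit for the file, the growth increment, and whether that increment is a percentage or a fixed amount. Note that sys.database_files stores size, max_size, and a fixed growth value as numbers of 8-KB pages, and that a max_size of -1 means the file can grow until the disk is full.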

Next, we will go through the transaction log space consumers that prevent the log from being reused and lead to the transaction log full error.

The first transaction log space consumer is index maintenance: the index rebuild and index reorganize operations. Rebuilding indexes in a database with the FULL recovery model is fully logged, but under the SIMPLE and BULK_LOGGED recovery models the index rebuild is minimally logged, with only the allocations being recorded. The index reorganize operation, on the other hand, is fully logged regardless of the database recovery model.

If your database participates in one of the SQL Server availability or disaster recovery solutions, and you therefore cannot change the recovery model to SIMPLE before rebuilding the index and back to FULL afterwards, you need to increase the log backup frequency during the index rebuild so that inactive log records are truncated as soon as possible.

If the transaction log consumer is not obvious, the first step is to check what is preventing the transaction log from being reused by querying the sys.databases catalog view as below:

 
SELECT name AS Database_Name
,log_reuse_wait_desc
FROM sys.databases
 


The column of interest here is log_reuse_wait_desc, which shows what is preventing the transaction log from being reused.

The healthy value for the log_reuse_wait_desc column is NOTHING, which indicates that the transaction log is reusable.

If the log_reuse_wait_desc value is LOG_BACKUP, the database recovery model is FULL or BULK_LOGGED and the transaction log is waiting for a log backup before it can be truncated.

A full database backup does not truncate the transaction log. Under these recovery models, truncation is achieved only by taking a log backup. Without log backups, the transaction log file keeps growing and its space is never reused.

The first question to ask here is whether you really need the FULL recovery model, which depends on the disaster recovery solution you use and the company's business requirements. If it is not required, it is better to switch the database to the SIMPLE recovery model, where no log backup is needed because inactive log records are automatically marked as reusable at each checkpoint.
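If the decision is to move to SIMPLE, the switch itself is a single statement, sketched here for the article's SQLShackDemo database. Keep in mind that this ends the log backup chain, so point-in-time restore is no longer possible from that moment on.

```sql
-- Switch the demo database to the SIMPLE recovery model.
ALTER DATABASE SQLShackDemo SET RECOVERY SIMPLE;
GO

-- Confirm the change.
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = N'SQLShackDemo';
```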

If the database must operate under the FULL recovery model, you need to schedule a transaction log backup job whose frequency depends on how often the data changes and how much data loss is acceptable if a crash occurs.
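The statement such a job would run could look like the sketch below. The backup folder is a hypothetical path, and the time-stamped file name is only one possible naming convention; a log backup also requires that at least one full backup of the database already exists.

```sql
-- Sketch only: the backup folder E:\SQLBackups is a hypothetical location.
DECLARE @BackupFile nvarchar(260) =
    N'E:\SQLBackups\SQLShackDemo_log_'
    + CONVERT(nvarchar(8), GETDATE(), 112)                          -- yyyymmdd
    + N'_'
    + REPLACE(CONVERT(nvarchar(8), GETDATE(), 108), N':', N'')      -- hhmmss
    + N'.trn';

-- Back up the log; afterwards the inactive VLFs can be truncated and reused.
BACKUP LOG SQLShackDemo
TO DISK = @BackupFile;
```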

When the log_reuse_wait_desc value is ACTIVE_TRANSACTION, an uncommitted transaction has been running for a long time and is consuming the database log file space.

Such a transaction delays the truncation of every log record generated after it started, even the log records of changes that have already been committed.

In SQL Server, every single data modification statement that is not part of an explicit transaction runs as its own implicitly created (autocommit) transaction, in order to ensure data consistency.

Purging and archiving operations are common examples of transactions that take a long time and fill the transaction log file. The more records you try to delete from a table, the more log records are written to the transaction log and the longer the transaction takes.

The delete transaction becomes heavier if the table you are deleting from is referenced by FOREIGN KEY constraints defined with ON DELETE CASCADE, because transaction log records are also written for the rows removed by the cascading delete. The same happens if the table has a DELETE trigger: log records are written for the operations that the trigger performs.

Rather than purging the data with a single DELETE statement, you can limit the log space such an operation holds at any one time by deleting the data in batches, using the TOP operator with the DELETE statement. In this way, the large transaction is divided into multiple smaller transactions.
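A batched purge of this kind might be sketched as follows. The table dbo.AuditHistory, its CreatedDate column, the one-year cutoff, and the batch size of 5000 are all hypothetical and only illustrate the pattern.

```sql
-- Sketch only: table, column, cutoff and batch size are hypothetical.
DECLARE @BatchSize   int = 5000;
DECLARE @RowsDeleted int = 1;

WHILE @RowsDeleted > 0
BEGIN
    -- Each DELETE runs as its own short transaction.
    DELETE TOP (@BatchSize)
    FROM dbo.AuditHistory
    WHERE CreatedDate < DATEADD(YEAR, -1, GETDATE());

    -- Stop once a pass finds nothing left to delete.
    SET @RowsDeleted = @@ROWCOUNT;
END;
```

Because each batch commits separately, checkpoints (SIMPLE) or log backups (FULL) taken while the loop is running can free the log space used by earlier batches.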

Any transaction remains active in the database until it is committed or rolled back, or until the session loses its connection to SQL Server. If the application disconnects without the transaction being terminated properly, the connection may stay open on the database side with the transaction still active, which prevents transaction log truncation. As a result, the transaction log keeps growing and fills up the disk. Transactions that are left uncommitted on the database side while disconnected on the application side are called orphaned transactions.

The T-SQL script below lists the active transactions at the database level. You can use it to monitor active transactions and to find the long-running ones, which have the oldest database_transaction_begin_time values:

 
SELECT transaction_id , 
	   database_ID , 
	   database_Transaction_Begin_Time,
	   database_transaction_log_record_count,
	   database_transaction_begin_lsn ,
	   database_transaction_last_lsn,
	  CASE database_transaction_state
         WHEN 1 THEN 'The transaction has not been initialized.'
         WHEN 3 THEN 'The transaction has been initialized but has not generated any log records.'
         WHEN 4 THEN 'The transaction has generated log records.'
         WHEN 5 THEN 'The transaction has been prepared.'
         WHEN 10 THEN 'The transaction has been committed.'
         WHEN 11 THEN 'The transaction has been rolled back.'
         WHEN 12 THEN 'The transaction is being committed. In this state the log record is being generated, but it has not been materialized or persisted'
      END database_transaction_state
FROM   sys.dm_tran_database_transactions 
 


Another useful indicator from the previous query is the database_transaction_log_record_count column, which shows which transaction is writing the most records to the database transaction log file.
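For a quick check of a single database, DBCC OPENTRAN offers a shortcut: it reports the oldest active transaction, if there is one, together with its session ID and start time. A minimal example for the article's demo database:

```sql
-- Reports the oldest active transaction in the given database, if any.
DBCC OPENTRAN ('SQLShackDemo');
```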

To find the session that is running the transaction consuming your transaction log, query the sys.dm_tran_session_transactions view for the transaction_id value returned by the previous query, as follows:

 
DECLARE @TransID as bigint
SET @TransID = 0 -- placeholder: replace 0 with the transaction_id returned by the previous query
SELECT   session_id
FROM    sys.dm_tran_session_transactions
WHERE transaction_id = @TransID 
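The two lookups can also be combined into one query. The sketch below joins the transaction DMVs to sys.dm_exec_sessions so that the login, host, and program behind each open transaction are visible at once; filtering on SQLShackDemo is an assumption carried over from the earlier examples.

```sql
-- Open transactions in one database, oldest first, with the owning session's details.
SELECT st.session_id,
       s.login_name,
       s.host_name,
       s.program_name,
       dt.database_transaction_begin_time,
       dt.database_transaction_log_record_count,
       dt.database_transaction_log_bytes_used
FROM sys.dm_tran_database_transactions AS dt
JOIN sys.dm_tran_session_transactions  AS st ON st.transaction_id = dt.transaction_id
JOIN sys.dm_exec_sessions              AS s  ON s.session_id      = st.session_id
WHERE dt.database_id = DB_ID(N'SQLShackDemo')
ORDER BY dt.database_transaction_begin_time;
```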
 

To stop an orphaned transaction that is consuming space in your transaction log, you need to KILL the session returned by the previous script. The transaction is rolled back, and its log space becomes available for reuse after the next transaction log backup.
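As an illustration, with a made-up session ID of 57 standing in for the value found earlier, the commands would look like this. Rolling back a large transaction can itself take a long time, so the second statement is useful for checking progress.

```sql
-- 57 is a hypothetical session_id; substitute the one returned by your own query.
KILL 57;

-- Report the progress of the rollback started by the KILL above.
KILL 57 WITH STATUSONLY;
```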

If the log_reuse_wait_desc value is REPLICATION, log records have been waiting to be replicated for a long time and cannot be truncated, typically because of slow Log Reader Agent activity. You may face this problem if SQL Server Replication or Change Data Capture (CDC) is configured on your database.
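To tell which of the two features is involved, the replication and CDC flags in sys.databases can be checked next to the wait reason. This is a small sketch using documented columns of that view:

```sql
-- Databases whose log reuse is currently held up by replication or CDC.
SELECT name,
       is_published,          -- transactional or snapshot publication
       is_merge_published,
       is_cdc_enabled,
       log_reuse_wait_desc
FROM sys.databases
WHERE log_reuse_wait_desc = 'REPLICATION';
```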

To troubleshoot this type of issue, first make sure that the SQL Server Agent service is running, then check that the Log Reader Agent jobs are running. If the jobs are running and the issue persists, check Replication Monitor and the Log Reader Agent job history to find the cause of the delay. You may need to reinitialize the subscribers and create a new snapshot if a subscriber was inactive for longer than the configurable max_distretention period.

If the cause is CDC, check that the CDC capture job is running and tracking changes. If the capture job is running and the issue persists, check what exactly is delaying the log reading, which could be a change to a huge amount of data in a table that has CDC enabled. For such huge changes, consider disabling CDC beforehand and enabling it again afterwards.
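One way to check whether the capture job is keeping up is to look at its configuration and its recent log scan sessions. The sketch below assumes CDC is enabled on SQLShackDemo; the column selection is just one reasonable choice.

```sql
USE SQLShackDemo;
GO
-- Configuration of the CDC capture and cleanup jobs for this database.
EXEC sys.sp_cdc_help_jobs;
GO

-- Most recent log scan sessions; high latency suggests the capture job is falling behind.
SELECT TOP (10)
       session_id,
       start_time,
       end_time,
       scan_phase,
       tran_count,
       latency
FROM sys.dm_cdc_log_scan_sessions
ORDER BY start_time DESC;
```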

Once the replication or CDC issue is resolved, the transaction log is truncated and becomes available for reuse at the next log backup.

When the log_reuse_wait_desc value is DATABASE_MIRRORING, database mirroring is the main reason the transaction log cannot be truncated and reused.

SQL Server mirroring has two modes. In synchronous mirroring, a transaction is committed on the principal server only once its transaction log records have been copied to the mirror server. In asynchronous mirroring, the transaction is committed immediately, without waiting for its log records to be copied to the mirror server.

To troubleshoot the mirroring wait, first check the mirroring state on the principal server: mirroring may have been suspended after a sudden disconnection, in which case you need to resume it. If the mirroring state is disconnected, check whether the connection between the principal and mirror servers is broken. With synchronous mirroring, a slow connection between the two servers can also grow the transaction log file and consume disk space, because log records remain active until they are copied to the mirror server.
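The mirroring state can be read from the sys.database_mirroring catalog view on the principal, as sketched below. The commented-out RESUME statement applies only when the state is SUSPENDED, and the database name is again the article's demo database.

```sql
-- Mirroring role, state and safety level for every mirrored database on this server.
SELECT DB_NAME(database_id)        AS DatabaseName,
       mirroring_role_desc,         -- PRINCIPAL or MIRROR
       mirroring_state_desc,        -- e.g. SYNCHRONIZED, SUSPENDED, DISCONNECTED
       mirroring_safety_level_desc  -- FULL (synchronous) or OFF (asynchronous)
FROM sys.database_mirroring
WHERE mirroring_guid IS NOT NULL;

-- If the session is suspended, it can be resumed from the principal:
-- ALTER DATABASE SQLShackDemo SET PARTNER RESUME;
```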

Once the mirroring delay or disconnection problem is resolved, the transaction log is truncated and becomes available for reuse at the next log backup, because those records have already been copied to the mirror server and are no longer part of the active log.

The last log_reuse_wait_desc value we will look at in this article is ACTIVE_BACKUP_OR_RESTORE. It indicates that a long-running full or differential backup operation is the cause of the transaction log reuse issue.

During full and differential backup operations, SQL Server delays transaction log truncation so that the active part of the transaction log can be included in the backup, which keeps the database consistent when it is restored. You need to investigate what makes the backup slow, for example by checking the disk I/O subsystem, so that future backups do not run for a long time and hold up the transaction log.
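While a backup or restore is running, its progress can be followed through sys.dm_exec_requests. The query below is a small monitoring sketch; the estimate it returns is SQL Server's own approximation.

```sql
-- Progress of backup and restore commands currently running on the instance.
SELECT r.session_id,
       r.command,
       DB_NAME(r.database_id)                AS DatabaseName,
       r.start_time,
       r.percent_complete,
       r.estimated_completion_time / 60000.0 AS EstimatedMinutesRemaining
FROM sys.dm_exec_requests AS r
WHERE r.command LIKE 'BACKUP%'
   OR r.command LIKE 'RESTORE%';
```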

Conclusion

The SQL Server transaction log file is as important a component of the SQL Server database as the data file itself. SQL Server uses it to store records of all database modifications and transactions, to be used in case of disaster or corruption and to ensure data consistency and integrity.

As a DBA, you should maintain the transaction log and keep it healthy by monitoring it and managing its growth. You can use a monitoring tool such as Microsoft SCOM to create an alert that notifies you when the transaction log file free space reaches a specific threshold. Once you detect a transaction log issue, take it seriously and act immediately to resolve it, in order to prevent the side effects of uncontrolled growth.

Translated from: https://www.sqlshack.com/managing-sql-server-transaction-logs/
