Showing posts with label DBA Myths. Show all posts

Saturday, February 18, 2012

All about “Not for Replication”

“Not for Replication” is a property that can be set on objects such as Check constraints, Foreign Key constraints, Triggers and Identity columns when using SQL Server Replication. The feature is used when the DBA wants transactions to behave differently when changes are made by the Replication Agents as opposed to changes coming from a normal user transaction.
Let’s take a simple example.

A company has sales stores in different parts of the country. Each store takes an order from the end user and replicates the request to the company’s main production and dispatch unit.
The main production and dispatch unit server acts as the Publisher, while all the sales stores act as Subscribers in Merge Replication. Every sales store needs to know the current stock level of a product and should not accept new orders if the product is out of stock at the main production unit.
The tables participating in the Replication topology are as follows, with Products being the parent table, joined to the Orders table by a PK-FK constraint on the “Product_id” column. The Products table also has a column named “Stock_Available” which tracks the current available stock for each product.
[screenshot: schema diagram]
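The original screenshot is not available here, but based on the description, a minimal sketch of the two tables might look like this (column names other than Product_id, Quantity and Stock_Available, and all data types, are assumptions):

```sql
-- Hypothetical schema sketch; only Product_id, Quantity and
-- Stock_Available are named in the post, the rest is assumed.
CREATE TABLE dbo.Products
(
    Product_id      int          NOT NULL PRIMARY KEY,
    Product_Name    varchar(100) NOT NULL,    -- assumed column
    Stock_Available int          NOT NULL
);

CREATE TABLE dbo.Orders
(
    Order_id   int NOT NULL PRIMARY KEY,      -- assumed column
    Product_id int NOT NULL
        REFERENCES dbo.Products (Product_id), -- the PK-FK relationship
    Quantity   int NOT NULL
);
```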


Products:
[screenshot: Products table]


Orders:
[screenshot: Orders table]


In addition, there is an AFTER INSERT trigger (named “Update_Stock”) on the Orders table which checks whether sufficient stock is available for the product_id; if so, it updates the Products table by subtracting the ordered quantity from the available stock.

Sample Trigger:

CREATE TRIGGER [dbo].[Update_Stock]
ON [dbo].[Orders]
AFTER INSERT
AS
BEGIN
    DECLARE @quantity int
    DECLARE @product_id int
    DECLARE @stock_available int

    -- Note: for simplicity this demo trigger assumes single-row
    -- inserts; a multi-row INSERT would need set-based logic instead.
    SELECT @product_id = product_id, @quantity = quantity FROM inserted

    SELECT @stock_available = stock_available
    FROM Products WHERE product_id = @product_id

    IF (@quantity > @stock_available)
        ROLLBACK
    ELSE
        UPDATE Products
        SET stock_available = stock_available - @quantity
        WHERE product_id = @product_id
END

Now let’s assume a new order is placed for the product “Baseball Bats” (Product_id = 88) at one of the subscriber servers, let’s call it “Subscriber 1”. After the insert, the data at “Subscriber 1” will look like this:

Orders:
[screenshot: Orders table at Subscriber 1 after the insert]

This insert (3rd row) will fire the insert trigger, which will deduct 20 (100 – 20 = 80) from the “Stock_Available” column in the Products table for Product_id = 88.

Products:
[screenshot: Products table at Subscriber 1, Stock_Available now 80]

After this transaction, “Subscriber 1” synchronizes with the Publisher. Replication detects that two transactions in total were made on “Subscriber 1” and need to be uploaded to the main Publisher server:
  1. The “Insert” made on the Orders table.
  2. The “Update” made to the “Stock_Available” column of the Products table. (Replication does not care that the update came from a trigger; to Replication it is just another normal update made on the Products table.)

After the sync completes, we expect the data from “Subscriber 1” to merge with the main Publisher so that both tables look identical, but this is what you will get on the Publisher after the first sync completes.

Orders:
[screenshot: Orders table at the Publisher]

Products:
[screenshot: Products table at the Publisher, Stock_Available now 60]

Oops, what happened there?! The “Stock_Available” column at the Publisher should have been set to “80”, not “60”.
Let me explain.

Products is the parent table and Orders is the child table. By default, Merge processing order replicates transactions for the parent table first, followed by the child table. As said earlier, the update made by the insert trigger is treated as a normal user transaction; since Products is the parent table, that update flows to the Publisher first and sets the “Stock_Available” column to “80”. The following transaction, the insert on the Orders table, is then propagated to the Publisher, which fires the “AFTER INSERT” trigger on the Orders table at the Publisher. That trigger goes back to the Products table and updates the “Stock_Available” column again, to the current value minus the ordered quantity (80 – 20 = 60).
You guessed it right: if you run another sync after the first without making any further changes, both the Publisher and the Subscriber will converge on the same, but wrong, value for the “Stock_Available” column (i.e. 60).
Why do we need the “AFTER INSERT” trigger at the Publisher at all? Because orders may be entered directly at the production server (Publisher) in addition to the dispatch unit servers (Subscribers), and the trigger at the Publisher is required to correctly decrement the Products table in that case.
How to avoid this situation

This is where the “Not for Replication” property comes into the picture. Our intention is that if the INSERT comes from the replication Merge Agent, it should not fire the trigger; if the INSERT comes from a normal user connection, the trigger should fire and update the available stock level. We can achieve this by marking the trigger as “Not for Replication” at all the replicas.
You can do this by changing the trigger definition as below:

ALTER TRIGGER [dbo].[Update_Stock]
ON [dbo].[Orders]
AFTER INSERT
NOT FOR REPLICATION   -- marked as not for replication
AS
BEGIN
    -- ... trigger body unchanged ...
END

Side Note: In the above scenario, if Orders had been the parent table and Products the child table, you would *not* run into this issue. I think by now you know why. Try it!
The same “Not for Replication” setting applies to Check constraints, Foreign Key constraints and Identity columns on the Publisher and/or the Subscriber. For example, you may want to mark a Subscriber’s Check constraint as “Not for Replication” so that the constraint does not restrict what data the Replication Agent can replicate from the Publisher.
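As an illustrative sketch (the table and constraint names are hypothetical), NOT FOR REPLICATION can be declared inline when creating constraints and identity columns:

```sql
-- Hypothetical example: identity, FK and check constraint all marked
-- NOT FOR REPLICATION, so the replication agents bypass them while
-- normal user connections still enforce/use them.
CREATE TABLE dbo.Demo_Orders
(
    Order_id   int IDENTITY(1,1) NOT FOR REPLICATION PRIMARY KEY,
    Product_id int NOT NULL,
    CONSTRAINT FK_Demo_Orders_Products
        FOREIGN KEY (Product_id) REFERENCES dbo.Products (Product_id)
        NOT FOR REPLICATION,
    CONSTRAINT CK_Demo_Orders_Product
        CHECK NOT FOR REPLICATION (Product_id > 0)
);
```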

You can read more about NFR (“Not for Replication”) at the link below from SQL Server Books Online:
Controlling Constraints, Identities, and Triggers with NOT FOR REPLICATION
http://msdn.microsoft.com/en-us/library/ms152529.aspx

Stored Procedure schema changes are not replicated when using “initialize with backup”

SQL Replication allows you to replicate user-defined Stored Procedure code to Subscribers, similar to replicating tables. When the stored procedure is modified, the new schema/script is transferred to the Subscriber. This ensures all Subscribers have the most recent version of the SP.


Transactional Replication also allows you to set up a Subscriber using a backup of the Publisher. This option is called “initialize with backup”. We encountered a problem in SQL 2005/2008 replicating a stored procedure schema change when the Subscriber was initialized using the “initialize with backup” feature: the Log Reader would not distribute the SP schema change to the Distribution database, and therefore the Subscribers were not updated. We’re still investigating the root cause but wanted to provide these ways to avoid the problem.
sp_addsubscription . . . @sync_type = N'initialize with backup',
Solutions
-------------------
1.a) Create one subscriber (even on the same Publisher) using Snapshot Replication.

1.b) Drop the “snapshot” subscriber; the SP changes are then correctly replicated.

2) Create ALL subscribers using a Replication snapshot, or via Backup/Restore with @sync_type = N'replication support only'.

3) Don’t replicate user-defined SP schema modifications. Instead, use scripting or SSIS packages to move the modified SP code.
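Option 2 might look roughly like the sketch below (the publication, subscriber and database names are placeholders). With 'replication support only', you restore the Publisher backup at the Subscriber yourself; replication assumes the schema and data are already in place and delivers no snapshot:

```sql
-- Hypothetical sketch: add a subscription after manually restoring a
-- publisher backup at the subscriber. Names are placeholders.
EXEC sp_addsubscription
    @publication    = N'MyPublication',
    @subscriber     = N'SUBSCRIBER1',
    @destination_db = N'MySubscriberDB',
    @sync_type      = N'replication support only';
```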

Troubleshooting replication issues when the subscriber is initialized from a backup

When the Subscriber is initialized from a backup, many things can go wrong. Below are some of the error messages you might encounter during replication setup.

Msg 21408

Cannot create the subscription. You must specify a value of "Active" or "Subscribed" for the @status parameter. This is because the value specified for the @sync_type parameter is "initialize with backup" or "replication support only".


Msg 18786

The specified publication does not allow subscriptions to be initialized from a backup. To allow initialization from a backup, use sp_changepublication: set 'allow_initialize_from_backup' to 'true'.
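The fix suggested in this message might look like the following sketch (the publication name is a placeholder); it is run in the publication database:

```sql
-- Hypothetical example: enable 'initialize with backup' subscriptions
-- for an existing publication. 'MyPublication' is a placeholder.
EXEC sp_changepublication
    @publication = N'MyPublication',
    @property    = N'allow_initialize_from_backup',
    @value       = N'true';
```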


Msg 21407

Cannot create the subscription. If you specify a value of "initialize with backup" for the @sync_type parameter, you must subscribe to all articles in the publication by specifying a value of "all" for the @article parameter.


Msg 21399

The transactions required for synchronizing the subscription with the specified log sequence number (LSN) are unavailable at the Distributor. Specify a higher LSN.


Msg 21397

The transactions required for synchronizing the nosync subscription created from the specified backup are unavailable at the Distributor. Retry the operation again with a more up-to-date log, differential, or full database backup.


Quick Checks:

-- On the publisher database
SELECT allow_initialize_from_backup, min_autonosync_lsn, *
FROM syspublications

-- Expect: allow_initialize_from_backup = 1
-- min_autonosync_lsn = the LSN from which the commands should be
-- applied at the subscriber

-- On the distribution database
SELECT min_autonosync_lsn, *
FROM distribution.dbo.MSpublications

-- min_autonosync_lsn should match syspublications.min_autonosync_lsn

During the execution of sp_addsubscription, validation is performed in the sp_MSaddautonosyncsubscription procedure before the subscription is created.

-- On the distribution database
SELECT subscription_seqno, publisher_seqno, ss_cplt_seqno, *
FROM distribution.dbo.MSsubscriptions

-- min_autonosync_lsn should always be less than subscription_seqno

When we execute sp_addsubscription with @backupdevicename = 'backup device', SQL Server runs RESTORE HEADERONLY on the backup and takes the LastLSN of the latest backup if the backup set contains multiple backups. All this logic is handled in the sp_MSextractlastlsnfrombackup proc, which is executed inside the sp_MSaddautonosyncsubscription procedure.

You can run RESTORE HEADERONLY FROM DISK on the backup set yourself and get the LastLSN from the latest backup. Use the following SQL to convert that numeric LSN to its binary form:
DECLARE @numericlsn numeric(25,0)
DECLARE @high4bytelsncomponent bigint,
        @mid4bytelsncomponent bigint,
        @low2bytelsncomponent int

-- Set the LSN here (as reported by RESTORE HEADERONLY)
SET @numericlsn = 93000000070800001

-- Split the numeric LSN into its high:mid:low components
SELECT @high4bytelsncomponent = CONVERT(bigint, FLOOR(@numericlsn / 1000000000000000))
SELECT @numericlsn = @numericlsn - CONVERT(numeric(25,0), @high4bytelsncomponent) * 1000000000000000
SELECT @mid4bytelsncomponent = CONVERT(bigint, FLOOR(@numericlsn / 100000))
SELECT @numericlsn = @numericlsn - CONVERT(numeric(25,0), @mid4bytelsncomponent) * 100000
SELECT @low2bytelsncomponent = CONVERT(int, @numericlsn)

-- Concatenate the components into the binary LSN
SELECT CONVERT(binary(4), @high4bytelsncomponent)
     + CONVERT(binary(4), @mid4bytelsncomponent)
     + CONVERT(binary(2), @low2bytelsncomponent)


This is the LSN from which the Distribution Agent will pick up the commands from msrepl_transactions and apply them at the Subscriber.

Potential Problems

If the backup file contains multiple backup sets, the sp_addsubscription logic selects the first backup set, not the most recent one. As a result, the Subscriber may not have the most recent copy of the publication database. After all, the reason we use this method is to reduce the data transfer time to the Subscriber; we didn’t count on the backup file containing multiple copies of the published database. That rather defeats the point of trying to “save time” using backups.

Solution: When using “initialize from backup”, always write the Publisher backup to a NEW backup (*.bak) file.
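One way to guarantee a single backup set, sketched below with placeholder names, is to back up WITH INIT so any existing sets in the file are overwritten:

```sql
-- Hypothetical sketch: WITH INIT overwrites the destination file, so
-- the device holds exactly one backup set. Names are placeholders.
BACKUP DATABASE [MyPublishedDB]
TO DISK = N'C:\Backups\MyPublishedDB_for_init.bak'
WITH INIT, NAME = N'Publisher backup for subscriber initialization';
```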

Monday, February 13, 2012

Do you know the Maximum number of databases that can be mirrored?

Recently I came across an interesting scenario where SQL Server was frequently generating Deadlock Scheduler mini dumps on the mirrored SQL Server instance. In this post I put together various pieces of information, so I thought of sharing them with you.

Scenario: Suppose you are using SQL Server 2008 R2 Enterprise x64 edition installed on Windows Server 2008 SP1. You are using the SQL Server database mirroring feature and have set up database mirroring on 104 databases. You frequently get a Deadlock Scheduler dump (Error 17884) on the mirror site; even if you restart the SQL Server instance on the mirror site, the Error 17884 deadlock scheduler condition reappears within a few minutes.

I analyzed the Deadlock Scheduler mini dump generated on the very first occurrence of error 17884.

It is important to note that SQL Server generates a mini memory dump only on the first occurrence of Error 17884; for subsequent occurrences it simply logs an entry in the SQL Server error log, something like:

"New queries assigned to process on Node 0 have not been picked up by a worker thread in the last 360 seconds. Blocking or long-running queries can contribute to this condition, and may degrade client response time. Use the "max worker threads" configuration option to increase number of allowable threads, or optimize current running queries. SQL Process Utilization: 10%. System Idle: 88%."


In the dump I found the following:

- There were 464 workers performing parallel operations (i.e. query parallelism).

- No lock blocking was seen on the server.

- There were 372 workers performing DBM (database mirroring) redo operations.

Total workers: 710
Workers in RUNNABLE state: 0

As per the SQL Server configuration settings:

name               minimum  maximum  config_value  run_value
max worker threads 128      32767    704           704

Before I proceed further, let’s check the mirror server configuration:

Intel(R) Xeon(R) CPU           L5630  @ 2.13GHz. Bios Version is HP  - 2
16 x64 level 8664, 2 Mhz processor (s).

Total Physical Memory = 49141 MB           
Available Physical Memory = 45098 MB       

For a 64-bit operating system, if the total number of available logical CPUs > 4:

Max worker threads = 512 + ((logical CPUs - 4) * 16)

In our case, max worker threads = 512 + ((16 - 4) * 16) = 704, which is the default number of worker threads SQL Server will spawn.

You must be wondering: the Max Worker Threads setting is correctly set as per the above calculation, so what went wrong?

Before I comment on this, let’s talk about database mirroring requirements for worker threads:

- Principal server: 1 global thread and 2 threads per mirrored database.

- Mirror server (64-bit architectures): 1 global thread, 2 threads per mirrored database, plus one additional thread per mirrored database for every 4 processor cores.

In short, the number of threads needed on the mirror server is greater than on the principal, so you need to use the mirror server’s system configuration to calculate the number of databases that can be mirrored.

Number of threads required per mirrored database = 2 + 16/4 = 6 threads per database, where 16 is the number of logical CPU cores on the mirror server.

Maximum number of databases that can be mirrored = 704 / 6 = 117
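The arithmetic above can be sketched in T-SQL (these are simply the formula and per-database thread counts quoted in this post, applied to a 16-core mirror server):

```sql
-- Back-of-the-envelope calculation using the numbers from this post
DECLARE @cpus           int = 16;
DECLARE @max_workers    int = 512 + (@cpus - 4) * 16;  -- 704 (default)
DECLARE @threads_per_db int = 2 + @cpus / 4;           -- 6 per mirrored db

SELECT @max_workers / @threads_per_db AS max_mirrored_databases;  -- 117
```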

Now you must be wondering: in our scenario we are mirroring only 104 databases (fewer than the 117 maximum calculated above), so what could cause error 17884 (Deadlock Scheduler) to occur so frequently on the mirror server?

In summary, not all 704 workers are available to perform mirroring activities, as some threads are assigned to core database activity on the server. Hence, in this case, raising the max worker threads value to 1024 brought the mirror server back online and operational.

Additionally, you can run the query below to determine the number of threads currently running on your SQL Server:

select count(*) from sys.dm_os_threads

In my case it returned 759.

Monday, January 30, 2012

Does using the Bulk-Logged recovery model for bulk operations reduce the size of transaction log backups?

I've been working with SQL Server for more than 5 years and I keep coming across various SQL Server misconceptions (or, say, DBA myths). In this post I’ll demystify a common DBA myth around log shipping: "In a log shipping setup, to ship smaller T-log backups to the secondary, switch to the BULK_LOGGED recovery model before performing any bulk operation."

SCENARIO:

- You have configured log shipping for a high-OLTP database
- On the primary server, to optimize database performance, a daily re-indexing job is scheduled
- Every day, after the re-indexing job executes, the T-log grows huge and the subsequent T-log backup is large. It takes a lot of time (and of course network resources!!) to ship T-log backups to the secondary server
- So, to resolve this issue, the DBA came up with the plan below:

1. Change the recovery model of the database to BULK_LOGGED
2. Perform the re-indexing (which qualifies as a minimally logged operation)
3. Change the recovery model back to FULL


- The DBA believes that following these steps will reduce the size of the T-log backups and thereby minimize network resource usage. This is "THE MYTH"


EXPLANATION:

- Backups under the BULK_LOGGED recovery model include both log records and the data pages changed by bulk operations, which can actually make a log backup very large.

Here’s an extract from Books Online (http://technet.microsoft.com/en-us/library/ms190692(SQL.90).aspx). Note in particular:

Compared to the full recovery model, which fully logs all transactions, the bulk-logged recovery model minimally logs bulk operations, although fully logging other transactions. The bulk-logged recovery model protects against media failure and, for bulk operations, provides the best performance and least log space usage.

Under the bulk-logged recovery model, if a log backup covers any bulk operations, the log backup contains both log records and the data pages that were changed by bulk operations. This is necessary to capture the results of the bulk-logged operations. The incorporated data extents can make a log backup very large. Additionally, backing up the log requires access to the data files that contain the bulk-logged transactions. If any affected database file is inaccessible, the transaction log cannot be backed up and all operations committed in that log are lost.

LET’S BREAK THIS MYTH – With some testing!!

For purpose of demonstration, I will be using AdventureWorks sample database.

--1. Verify the recovery model of the AdventureWorks database


sp_helpdb 'AdventureWorks' 
Go 
AdventureWorks Recovery=FULL

--2. Verify current T-log usage

dbcc sqlperf(logspace) 

Database Name Log Size (MB) Log Space Used (%) Status
------------------ ------------- ------------------ -----------
AdventureWorks 257.9922 1.920442 0

--3. Perform re-indexing on AdventureWorks database

Use AdventureWorks 
Go 
exec sp_MSforeachtable "DBCC DBREINDEX ('?')" 
Go 

--4. Verify T-log usage after Re-Indexing
dbcc sqlperf(logspace) 
Database Name Log Size (MB) Log Space Used (%) Status
----------------- ------------- ------------------ -----------
AdventureWorks 257.9922 41.09579 0


--5. Backup T-Log in FULL recovery model

BACKUP LOG [AdventureWorks] TO  DISK = N'C:\Backups\LOG_BCKUP_IN_FULL_RECOV.trn' 
GO 
--6. Verify the size of T-log Backup in FULL recovery

[screenshot: backup file properties]
Size = 102.9 MB

--7. Change the Recovery Model to BULK_LOGGED


USE [master] 
GO 
ALTER DATABASE [AdventureWorks] SET RECOVERY BULK_LOGGED WITH NO_WAIT 
GO 
--8. Verify the recovery model of the AdventureWorks database

sp_helpdb 'AdventureWorks' 
Go 
AdventureWorks Recovery=BULK_LOGGED

--9. Verify current T-log usage

dbcc sqlperf(logspace) 
Database Name Log Size (MB) Log Space Used (%) Status
----------------- ------------- ------------------ -----------
AdventureWorks 257.9922 2.983337 0


--10. Perform re-indexing on AdventureWorks database

Use AdventureWorks 
Go 
exec sp_MSforeachtable "DBCC DBREINDEX ('?')" 
Go 
--11. Verify T-log usage after Re-Indexing

dbcc sqlperf(logspace) 
Database Name Log Size (MB) Log Space Used (%) Status
----------------- ------------- ------------------ -----------
AdventureWorks 257.9922 4.773189 0


--12. Backup T-Log in BULK_LOGGED recovery model


BACKUP LOG [AdventureWorks] TO  DISK = N'C:\Backups\ADW_REINDX_IN_BULK.trn' 
GO 
--13. Verify the size of T-log Backup in BULK_LOGGED

[screenshot: backup file properties]
Size = 109.7 MB

CONCLUSION: Myth Busted!!

- Bulk operations (like re-indexing) under the BULK_LOGGED recovery model are minimally logged, which reduces T-log file growth. However, this will NOT reduce the size of the subsequent T-log backups[1]. So this approach is no good for the scenario in question.

NEXT STEPS:
- Go and test this yourself :)

REFERENCE READ:    

Backup Under the Bulk-Logged Recovery Model
: http://technet.microsoft.com/en-us/library/ms190692(SQL.90).aspx

Overview of the Recovery Models : http://msdn.microsoft.com/en-us/library/ms189275(SQL.90).aspx

[1] A T-log backup under the Bulk-Logged recovery model contains both the modified data pages and the log records. As such, there are 2 additional factors that can influence the number of data pages being backed up, and consequently the overall size of a T-log backup, under the bulk-logged recovery model:
Page Fullness
– Derived from Average Bytes Free per Page and Average Page Density (full). The more free bytes per page, the less full the pages; so, depending on page fullness, the same transactions can touch more or fewer data pages.
Index Fill Factor
– The fill factor is the percentage of space in an index page that is used when an index is created or rebuilt. The purpose is to leave some free space in the index page to accommodate future inserts/updates and avoid page splits.
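As an illustrative sketch (the index and table names are placeholders), a fill factor is specified when an index is created or rebuilt:

```sql
-- Hypothetical example: rebuilding with FILLFACTOR = 80 fills each
-- leaf page to ~80%, leaving ~20% free for future inserts/updates.
ALTER INDEX PK_MyTable ON dbo.MyTable
REBUILD WITH (FILLFACTOR = 80);
```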