Pages

Saturday, February 18, 2012

All about “Not for Replication”

“Not for Replication” is a property which can be set for different objects like Check constraints, Foreign Key constraints, Triggers , Identity columns etc while using SQL Server Replication. Feature used when the DBA would like the transactions to behave differently when the changes are being made by the Replication Agents as compared to changes coming from a normal user transaction.
Let’s take a simple example

A company has various sales stores in different parts of the country, each sale store takes an order from the end user and replicates the request to the main production and dispatch unit of the company.
The main production and dispatch unit Server is acting as the Publisher whereas all the sale stores are acting as Subscribers in Merge Replication. Every Sale Store needs to have the information about the current stock level of a product and should not accept any new orders if the product is out of stock at the main production unit.
The tables participating in the Replication topology are as follows, with Products being the parent table and joined with PK-FK constraints with the Orders table on “Product_id” Column. The

Product table also has another column named as “Stock_Available” which keeps track of the current available stock for the particular product.
clip_image002


Products:
clip_image004


Orders:
clip_image006


In Addition to this there is also an AFTER INSERT Trigger (named as “Update_Stock”) on the Orders Table which checks if we have sufficient stock available for the product_id, if yes, it updates the Products table by subtracting the Quantity ordered from the Available Stock.

Sample Trigger:
create TRIGGER [dbo].[Update_Stock]

ON [dbo].[Orders]

AFTER INSERT

AS

BEGIN

declare @quantity as int

declare @product_id as int

declare @stock_available as int

select @product_id= product_id , @quantity = quantity from Inserted

select @stock_available= stock_available from Products where product_id=@product_id

if (@Quantity > @stock_available)

rollback

else

update Products set stock_available= stock_available- @quantity where product_id=@product_id

END

Now let’s assume that a new order is to be placed for the product “Baseball Bats” (Product_id=88) at one of the subscriber servers , let’s call it “Subscriber 1”, after the insert the data at the “Subscriber 1” will look like as below:-

Orders:
clip_image008

This insert (3rd row) will fire the insert trigger and will deduct 20 (100 – 20 =80 )from the “Stock_Available” column in the Products table for Product_id=88.

Products:
clip_image010

After this transaction at “Subscriber 1” , the “Subscriber 1” synchronizes with the Publisher. Now Replication detects that there has been in total two transactions made on “Subscriber 1” which needs to be uploaded to the main Publisher Server.
  1. “Insert” made on the table Orders .
  2. “Update” made for the table Products for the column “Stock_Available”. (Replication does not care if the update came from a trigger, for Replication it’s another normal update transaction made on the Products table).

After the Sync completes , we expect the data from “Subscriber 1” to get merged with the main Publisher and both the tables should look identical, but this what you will get on the Publisher after the first sync completes.

Orders:
clip_image008[1]

Products:
clip_image012

Oops, what happened there !!! The “Stock_Available” column at Publisher should have been set to “80” and not “60”.
Let me explain

Products is the parent table and Orders is the child table, now by default Merge processing order will replicate the transactions for parent table first followed by child table. As said earlier the update made by the insert trigger is also considered as a normal user transaction , Products table being the parent table, this update transaction flows to the Publisher first which sets the “Stock_Available” column to “80”. Now the following transaction, Insert on Orders table, is also propagated to the Publisher which further kicks off the “AFTER INSERT” trigger present on the Orders table at the Publisher and this trigger goes back to Products table and updates again the “Stock_Available” column value to Current value- Ordered Quantity (80-20=60).
You guessed it right, if you make another sync followed by the first sync without making any further changes, both the Publisher and Subscriber will have the same but wrong value set for “Stock_Available” column (i.e. 60).
Why do we need the “AFTER INSERT” trigger at the Publisher? What if orders can be entered directly at the production server (Publisher) in addition to dispatch unit servers (Subscribers)? The trigger at the Publisher would be required to correctly decrement the Products table.
How to avoid this situation

This is where the “Not for Replication” property comes into picture. Our intention is that if the INSERT is coming from the replication Merge Agent, it does not need to fire the trigger , if the INSERT is coming from a normal user connection, the trigger should get fired to update the available stock level. We can achieve this by marking the trigger as “Not for Replication” at all the replicas.
You can do this by changing the trigger definition as below:

ALTER TRIGGER [dbo].[Update_Stock]

ON [dbo].[Orders]

AFTER INSERT

NOT for Replication -- ‘marked as not for replication’

Side Note:  In the above scenario if Orders Table was the Parent table and Products the child table, you will *not* run into this issue. I think by now you know why. Try it!
The same “Not for Replication” setting applies for Check constraints, Foreign Key constraints and Identity columns on the Publisher and/or the Subscriber. For example, you may want to set a subscriber’s Check Constraint as “Not for Replication” to prevent Replication Agent from being restricted in what data it can replicated from the Publisher.

You can read more about NFR (“Not for Replication”) from the below link from SQL Books Online:-
Controlling Constraints, Identities, and Triggers with NOT FOR REPLICATION
http://msdn.microsoft.com/en-us/library/ms152529.aspx

Stored Procedure schema changes are not being replicated when using ”initialize with backup”

SQL Replication allows you to replicate user-defined Stored Procedures code to subscribers similar to replicating tables. When the sp is modified, the new schema/script is transferred to the subscriber. This ensures all subscribers have the most recent version of the SP.


Transactional Replication also allows you to setup a subscriber using a backup of the Publisher. This option is called ”initialize with backup”. We encountered a problem in SQL 2005/2008 replicating a stored procedure schema change if the subscriber was initialized using the ”initialize with backup” feature. The Log Reader would not distributed the SP schema change to the Distribution database and therefore not update the subscribers. We’re still investigating the root cause but wanted to provide these ways to avoid the problem.
sp_addsubscription . . . @sync_type = N'initialize with backup',
Solutions
-------------------
1.a) Create 1 subscriber (even if on a same publisher) using Snapshot Replication

1.b) Drop the "snapshot" subscriber, the SP changes are correctly replicated.

2) Create ALL subscribers using Replication Snapshot or Backup/Restore @sync_type = N'replication support only'.

3) Don’t Replicate user-defined SP schema modifications. Instead using Scripting to SSIS packages to move the modified SP code.

How to: Initialize a Transactional Subscription from a Backup with Multiple Backup Files

Below is a slight modification to the How to: Initialize a Transactional Subscription from a Backup steps to handle Publisher database backups to multiple BAK FILES.

1) Execute sp_addpublication (Transact-SQL) at the Publisher on the publication database. Specify a value of true for @allow_initialize_from_backup = N'true' and @immediate_sync = N'true'
2) Create a backup of the publication database using the BACKUP (Transact-SQL) statement.
a. Full Database Backup to multiple BAK files
b. Start copy of the multiple BAK files to the Subscriber (may take hours to perform)
c. Transaction Log backup to TRN file
d. Copy the Transaction log backup to the Subscriber
3) Restore the backup on the Subscriber using the RESTORE (Transact-SQL) statement.
a. Restore the Fullback from the BAK files using “no recovery”
b. Restore the Transaction log TRN backup file using “with recovery”
4) At the Publisher on the publication database, execute the stored procedure sp_addsubscription (Transact-SQL) with the following changes
5) Modify the parameters:
a. @sync_type = N'initialize with backup'
b. Add new parameters:
i. @backupdevicetype = 'Disk'
ii. @backupdevicename = 'C:\BACKUP\MyLogBackup.TRN'.

WATCH OUT:


For PUSH Distribution Agents, the TRN location (step 5.2) must be accessible from the Publisher by either
1) point to the original TRN file on the Publisher
-- or --
2) use UNC mapping to the TRN on the Subscriber.
If using @sync_type = N'initialize with backup' what do you plan to do about the user-defined Constraints, Identities, and Triggers? You have 2 options.
  1. Enable the “NOT FOR REPLICATION setting on the Publisher Constraints, Identities and Triggers before you create the Publication. Click HERE for details.
  2. Disable Constraints, Identities, and Triggers AFTER you restore on the Subscriber but before you enable the Subscription. For example:
EXEC sp_MSforeachtable @command1="ALTER TABLE ? DISABLE TRIGGER ALL"

Troubleshooting replication issues when the subscriber is initialized from the backup

When the subscriber is initialized from backup, there can be so many things that can go wrong. Below are some of the error messages that might encounter during replication setup.

Msg 21408

Cannot create the subscription. You must specify a value of "Active" or "Subscribed" for the @status parameter. This is because the value specified for the @sync_type parameter is "initialize with backup" or "replication support only".


Msg 18786

The specified publication does not allow subscriptions to be initialized from a backup. To allow initialization from a backup, use sp_changepublication: set 'allow_initialize_from_backup' to 'true'.


Msg 21407

Cannot create the subscription. If you specify a value of "initialize with backup" for the @sync_type parameter, you must subscribe to all articles in the publication by specifying a value of "all" for the @article parameter.


Msg 21399

The transactions required for synchronizing the subscription with the specified log sequence number (LSN) are unavailable at the Distributor. Specify a higher LSN.


Msg 21397

The transactions required for synchronizing the nosync subscription created from the specified backup are unavailable at the Distributor. Retry the operation again with a more up-to-date log, differential, or full database backup.


Quick Checks:

-- on the publisher database

select allow_initialize_from_backup, min_autonosync_lsn,* from syspublications

allow_initialize_from_backup = 1 min_autonosync_lsn = the lsn from which the commands should be applied at the subscriber

--on distribution database

select min_autonosync_lsn,* from distribution.dbo.mspublications

min_autonosync_lsn will be the same as the syspublications.min_autonosync_lsn

During the execution of sp_addsubscription, we perform validation in sp_MSaddautonosyncsubscription procedure before we create the subscription.

--on distribution database

select subscription_seqno,publisher_seqno,ss_cplt_seqno,* from distribution.dbo.mssubscriptions


min_autonosync_lsn should be always less than the subscription_seqno

When we execute sp_addsubscrption with @backupdevicename='backup device', we are executing the “RESTORE HEADERONLY” on the backup and getting the LASTLSN of the latest backup if the backup set contains multiple backups. All this logic is handled in sp_MSextractlastlsnfrombackup proc. This proc is executed inside sp_MSaddautonosyncsubscription procedure.

RESTORE HEADERONLY FROM DISK on the backup set and get the lastLSN from the latest backup. Use the following SQL to get the binary form of the lsn.
declare @numericlsn numeric(25,0)

declare @high4bytelsncomponent bigint,

@mid4bytelsncomponent bigint,

@low2bytelsncomponent int


--set the lsn here

set @numericlsn = 93000000070800001


select @high4bytelsncomponent = convert(bigint, floor(@numericlsn / 1000000000000000))

select @numericlsn = @numericlsn - convert(numeric(25,0), @high4bytelsncomponent) * 1000000000000000

select @mid4bytelsncomponent = convert(bigint,floor(@numericlsn / 100000))

select @numericlsn = @numericlsn - convert(numeric(25,0), @mid4bytelsncomponent) * 100000

select @low2bytelsncomponent = convert(int, @numericlsn)


SELECT convert(binary(4), @high4bytelsncomponent) + convert(binary(4), @mid4bytelsncomponent) + convert(binary(2), @low2bytelsncomponent)


This is the lsn from which the Distribution agent will pick up the commands from msrepl_transactions and apply at the subscriber.

Potential Problems

If the backup set contains multiple backup sets, the sp_addsubscrption logic select the first backup set and not the most recent backup set. As result, the Subscriber may not have the most recent copy of the Publication database. After all, reason we we’re doing this method is to reduce the data transfer time to the Subscriber. We didn’t count on the backup set having multiple copies of the Published database. Kinds of defeats trying to “save time” using Backups.

Solution: When using “allow initialization from backup files” always create the Publisher backup to a NEW backup (*.bak) file.

How to initialize the subscriber from backup

Here is the step by step procedure on how to initalize subscriber from backup

  1. Create the publication using user interface Replication Wizard.
  2. Under subscription options for the publication, set “allow initialization from backup files” to true
  3. Create a new full backup of the publisher database. If you have existing full backup of the publisher database, you can still use that backup set but we have take a new log backup or differential backup of the publisher database and restore at the subscriber.
  4. Replication UI does not allow the option to create the subscription to allow initialization from back. We have to TSQL when creating the subscription.
    exec sp_addsubscription @publication = N'Repl2000', …..
    @sync_type = N'initialize with backup',
    @backupdevicetype='Disk',
    @backupdevicename='C:\Repl2000_RestoreThis.bak'--this is the last backup used to restore on the subscriber that was taken after the publication was created
    go
    exec sp_addpushsubscription_agent …….
    go

  5. This should create the subscription and the Distribution agent should be replicating the commands successfully.

Friday, February 17, 2012

How to identify Foreign Key dependencies in SQL Server database?

To understand the database schema design especially to determine the referential integrity constraints and dependencies, here I’m sharing a sample script which will help you in following:
1)      To identify which table(s) are dependent on a given object
2)      Which columns are being referenced (have foreign keys defined on them)
3)      What action to perform in response to update/delete operations


Note: If you would like to run the below script to identify the forign key dependencies w.r.t to only one table e.g. ‘My Table’, uncomment the filter in the WHERE clause

SELECT ROW_NUMBER() OVER (ORDER BY OBJECT_NAME(fk.referenced_object_id), clm1.name) as 'S.No',fk.referenced_object_id as ReferencedObjID,
       constraint_column_id as ColumnID,
       OBJECT_NAME(fk.referenced_object_id) as [ReferencedTable(Parent)],
       SCHEMA_NAME (CAST(OBJECTPROPERTYEX(fk.referenced_object_id,N'SchemaId') AS bit)) as [ParentSchema],
       clm2.name as ReferencedColumnName,
       OBJECT_NAME(constraint_object_id) as ConstraintName,
       fk.parent_object_id as ReferencingObjID,
       OBJECT_NAME(fk.parent_object_id) as [ReferencingTable (Foreign)],
       clm1.name as ForeignKeyColumn,
       SCHEMA_NAME (CAST(OBJECTPROPERTYEX(fk.parent_object_id,N'SchemaId') AS bit)) as [ForeignSchema],
      [Action on Update] = CONVERT(varchar,CASE OBJECTPROPERTY(fk.referenced_object_id,'CnstIsUpdateCascade')
                           WHEN 1 THEN 'CASCADE'
                           ELSE 'NO_ACTION'
                           END),
       [Action on Delete] = CONVERT(varchar,CASE OBJECTPROPERTY(fk.referenced_object_id,'CnstIsDeleteCascade')
                           WHEN 1 THEN 'CASCADE'
                           ELSE 'NO_ACTION'
                           END)

FROM sys.foreign_key_columns fk
       JOIN sys.columns clm1
          ON fk.parent_column_id = clm1.column_id
          AND fk.parent_object_id = clm1.object_id
       JOIN sys.columns clm2
          ON fk.referenced_column_id = clm2.column_id
          AND fk.referenced_object_id= clm2.object_id
--WHERE OBJECT_NAME(fk.referenced_object_id) = 'My Table'  --- table name which is being referenced by other tables via Foreign Keys

ORDER BY OBJECT_NAME(fk.referenced_object_id)

Below is the output from AdventureWorks database






I hope this would help you at some point in carrying out your DBA or Developer task in understanding the database Schema design.

Monday, February 13, 2012

Do you know the Maximum number of databases that can be mirrored?

Recently I came across an interesting scenario where the SQL Server was frequently generating Deadlock Scheduler mini dumps on the mirrored SQL Server instance. In this post I put together various pieces of information, so thought of sharing with you.

Scenario: Suppose you are using SQL Server 2008 R2 Enterprise x64 edition installed on windows Server 2008 SP1. You are using SQL Server database mirroring feature and has setup database mirroring on 104 databases. You are frequently getting Deadlock Scheduler dump (Error 17884) on the mirror site, even though you restart SQL Server instance on the mirror site, still the Error 17884 deadlock scheduler appears within few minutes on the mirror site.

After analyzing the Deadlock Scheduler mini dump, generated on the very first occurrence of error 17884.

Important to note, SQL Server generate mini memory dump only on the first occurrence of Error 17884 and for the subsequent occurrences, it simply job an entry in the SQL Server error log something like

"New queries assigned to process on Node 0 have not been picked up by a worker thread in the last 360 seconds. Blocking or long-running queries can contribute to this condition, and may degrade client response time. Use the ""max worker threads"" configuration option to increase number of allowable threads, or optimize current running queries. SQL Process Utilization: 10%%. System Idle: 88%%."


I found that the following:

Ø  There were 464 workers performing Parallel operation (i.e. Query Parallelism).

Ø  No lock blocking seen at the Server.

Ø  There were 372 workers performing DBM redo operation.

Total workers: 710
Workers in RUNNABLE state: 0

As per SQL Server configuration settings

name
minimum
maximum
config_value
run_value
max worker threads
128
32767
704
704

Before I proceed further Lets check the Mirror Server configuration

Intel(R) Xeon(R) CPU           L5630  @ 2.13GHz. Bios Version is HP  - 2
16 x64 level 8664, 2 Mhz processor (s).

Total Physical Memory = 49141 MB           
Available Physical Memory = 45098 MB       

For 64 bit operating system:

If total # of available logical CPU’s > 4 :   

Max worker threads = 512 + ((logical CPUS’s - 4) * 16

In our case max worker threads = 512 + ((16 - 4) * 16=704  which is default no. of worker thread SQL Server will spawn.

You must be wondering Max Worker Threads setting is correctly set as per above calculation

Before I comment on this, let’s talk about Database mirroring requirements for worker threads
·        Principal server:

1 global thread and 2 threads per each of the mirrored databases.

·        Mirror server: 64 bit architectures:

1 global thread, 2 threads per mirrored databases and one additional thread for each mirrored database for every 4 processor cores.

In short, the number of thread needed on mirror server are greater than those on the primary, so you need to use the mirror server system configuration settings to calculate the no. of databases that can be mirrored.

Number of threads required per mirrored database = (2+16/4) = 6 threads per database. Where 16 is the number of logical CPU cores on the mirror server

Maximum number of databases that can be mirrored = 704/6 = 117

Now you must be wondering, in our scenario, we are just mirroring 104 databases (which is less than 117 i.e. Max databases that can be mirrored), then what could be the cause error 17884 (Deadlock Scheduler) occurring too frequently on the mirror server.

In Summary, not all the 704 workers would be available to perform mirroring activities as some threads are assigned to core database activity on the server. Hence you have to set the value of max server thread to 1024 and you will be able to bring your mirror server back online and operational.

Additionally, you can run the below query to determine the number of worker threads available on your SQL Server.

select count(*) from sys.dm_os_threads

In my case it returned 759

Wednesday, February 8, 2012

Does SQL Server replication support secure FTP (FTPS) protocol ?

Few days back I came across an intertesting scenario and was asked the above question, so I thought of sharing with you guys...

Scenario: Let’s say, you are using SQL Server replication to replicate data between two sites which are connected over the public network via VPN. You are able to successfully initialize the subscription using the initial snapshot delivered over the FTP. Due to security reasons, you would like to replace FTP by an encrypted solution (FTPS, SFTP, FTP over SSH). Now the question is which of the these encrypted solutions (FTPS,SFTP,FTP over SSH) is compliant with Microsoft SQL Server replication?

To answer the above question:  In windows, the Secured version of FTP is “FTPS”. Whereas in UNIX System, it was called “FTP over SSH” or “FTPS”. The "FTPS" abbreviation is often mistakenly used to specify some kind of Secure FTP, by which people most often mean FTPS. Another (similar) mistake is that FTPS is thought to be some kind of FTP over SSL. In fact, FTPS is an abbreviation of "SSH File Transfer Protocol." This is not FTP over SSL and not FTP over SSH (which is also technically possible, but very rare).

Now I’ve successfully configured my FTPS site for snapshot folder and ensured that the following conditions are met:
1.     The Publisher must be running SQL Server.
2.     An FTP/FTPS server must be enabled and configured to point to the C:\Program Files\Microsoft SQL Server\MSSQL\Repldata\Ftp directory (default) or network share which you specified while configuring your distributor or in the specific publication properties SNAPHOT page.
For more information about configuring an FTP/FTPS server, see the Internet Information Services (IIS) documentation.
3.     The publication is configured to use an FTP snapshot: FTP Snapshot page in publication properties.
Note: This option is for delivery of Snapshot to subscriber using the FTP site/address you specify, not for generation of snapshot on FTP/FTPS site.
But I’m unable to successfully configure SQL Server replication. Though I’m able to create either PUSH/PULL subscription but unable to initialize the subscription by delivering an initial snapshot over the FTPS.

The replication (Distribution/Merge) agent fails with the below error:

The process could not connect to FTP site '<ftp_site_address>' using port n (Source: MSSQL_REPL20032)
The operation times out.

You are able to access the FTPS site from both the distributor and subscriber machine but the initial snapshot delivery is still failing with above error.
 
You must be wondering what could be the reason? Am I missing sometime? Did I misconfiguration something? and so on….
 
The cause behind this error is, SQL Server replication, till the latest release of SQL Server 2008 R2 doesn’t have any infrastructure to support FTPS. So what are my options, I’m concerned about data security while replicating the data over the wire on public network.
 
You can only use FTP for an initial snapshot delivery on to the subscriber in SQL Server replication. So far the security of data is concerned, the solution is twofold:
 
1)      If you are concerned about the security of initial snapshot delivery via FTP using VPN on public network, You use the following workaround which will require some custom code to be implemented by you:
a.      Generate an initial snapshot.
b.      Compress the snapshot folder, and encrypt the zip file. Or you can write your own custom code to encrypt the snapshot folder or compressed snapshot folder.
c.      Push the encrypted zip file to the FTP and transfer it across to the subscriber.
d.      De-crypt the zip file and uncompress the snapshot folder locally on the subscriber.
e.      Use –AltSnapshotFolder distribution/Merge agent command line parameter by specifying the location of the snapshot folder copied locally on the subscriber (step d above)
f.       Run the Distribution/Merge agent to deliver an initial snapshot to initialize the subscription.
2)      To ensure secure replication of on-going changes (TDS Streams) after the initial snapshot delivery, Use SSL Encryption on the distribution server.

I hope this would help you and/or could have clarified your doubts or have answered questions that you may have related to use of secured version of FTP (FTPS) in SQL Server replication. I look forward to your comments.

Happy Learning..!