Post-it notes: Chapter 9 – Managing Replication

April 8, 2013

Flirting with the idea of taking exam [70-432: Microsoft SQL Server 2008, Implementation and Maintenance] I created loads of post-it notes based on the book: “The Real MCTS SQL Server 2008 Exam 70-432 Prep kit”. I’ve decided to re-visit these notes (before the stickiness wears off and I lose them) and transfer them onto this blog. This will ultimately be a great knowledge refresher and quick reference guide for me. Hopefully you’ll find them useful too. Thanks for reading.

Replication 

What?

  • Replication – copying and distributing data between databases and then synchronizing them to maintain consistency and integrity
  • 1 of 4 HA methods (mirroring, clustering and log shipping)
  • Advantage over others
    • can filter rows and columns
    • Can read from primary and secondary
    • Performance – have reports running off secondary as read-only
    • Availability – in DR – change connection manually to point to active node
    • Mobile devices – connect and synch then disconnect
    • Could use replication to load data from DW to staging area
    • Replication is both DR and HA solution
    • Replication components
      • Article = objects that are replicated – table, sp, view, some rows or some columns Publication – collection of articles. Articles can be in more than on pub
      • Subscription – request for a copy of the publication – where, what and push/pull
      • Publisher – instance that holds publications
      • Distributor – Store replication status, meta data, data queue, can be local/remote
      • Subscriber – Receives replicated data – pulled or pushed

Replication types

  • SNAPSHOT Replication
  • TRANSACTIONAL Replication
    • Transaction Replication with Updateable Subscriptions
    • Peer-to-Peer Replication
    • MERGE Replication

Replication Versions

  • All Replication – Distributor SQL Version must be no earlier than the publisher
  • Trans Replication – subscriber to pub, any version within 2 of publisher. E.g.
    • SQL 2000 Pub -> SQL 2008 Sub
    • SQL 2008 Pub -> SQL 2000 Sub
    • Merge Replication – subscriber can be any version no later than publisher

SNAPSHOT REPLICATION

  • Distributes data point in time
  • Good for small amount of data
  • Infrequent data changes
  • Acceptable out of date data
  • Primary key not required
  • Can write to subscription database, however each time it runs, will overwrite data
  • Locks held during snapshot replication

Components

  • SNAPSHOT AGENT (snapshot.exe)
    • Shared agent for all replication types
    • Initial schema and data scripts
    • Generates scripts for objects
    • Writes to snapshot folder (bcp)
    • Writes command to distributor
    • DISTRIBUTION AGENT (distrib.exe)
      • Read from the snapshot folder
      • Read from distribution database
      • Propagate commands to subscriber end
      • Delete commands from distribution table distribution..MSreplCommands

TRANSACTIONAL REPLICATION

  • Most used type
  • Replicate incremental changes not all data set
  • Use t-logs to replicate (however full or bulk logged not required)
  • Must have primary key
  • More scalable – less data to transfer
  • Good solution for large data volumes
  • Real time applications – minimal latency
  • Replicate between non SQL, e.g. Oracle. Oracle can be pub/sub – enterprise Ed only.

Components

  • SNAPSHOT AGENT (snapshot.exe)
    • Same as above
    • LOG READER AGENT (logread.exe)
      • Monitor changes in t-logs and propagate to subscribers
      • Data in T-logs moved into > distribution..MSrepl_transactions
      • Then generates commands to run on sub > distribution..MSrepl_commands
      • Commands in binary, use sp_browsereplcommands to read
      • DISTRIBUTION AGENT (distrib.exe)
        • Same as above

Two type of Trans Replication

  • Updateable subscriptions
    • Introduced SQL 2005
    • When configured (new column added unique identifier and a trigger)
    • Changes at subscriber are propagated back to publisher using MSDTC
    • Can have queued updating subscribers (queue reader agent) when few subscribers and infrequent changes at subs
    • Peer-to-Peer
      • All nodes subscribe and publish to each other (sub and ps on all nodes)
      • Enterprise/Developer Edition only
      • Scale out solution, multiple databases, multiple servers (up to 10 db’s)
      • Clients can connect to any node
      • Can remove databases or a node, others will be ok.
  • Can add new node without stopping others. SQL 2008 onwards! SQL 2005 have to suspend all activity
  • Peer-to-peer topology wizard or sp_addsubscription (specify LSN#)
  • Can detect conflicts, enabled by default, stops processing on one node
  • After config, you cannot disable peer-to-peer
  • No filtering allowed in peer-to-peer

MERGE REPLICATION

  • Start with snapshot
  • Changes at pub and sub tracked with triggers
  • Does not propagate intermediate data – performance enhancement
  • Working offline (mobile users) POS
  • When sub requires different partition of data
  • Replicating between different SQL Version (2000-2005-2008)
  • Supports conflict resolution (same a p2p)
  • Supports filtering
  • Merge replicate tables without primary keys

When you configure MERGE Replication

  • Creates insert/update/delete triggers in published tables
    • Msmerge_ins_guid
    • Msmerge_upd_guid
    • Msmerge_del_guid
    • Creates sp, views and conflict tables
    • Creates unique id field if it does not exist

CONFIGURING REPLICATION (Merge example)

  • Sys_admins have to enable a database for replication. Right click replication – publisher – properties
  • Db_owners – can create publications

Configuration of Distributor

  • Configure the distributor first! If you configure the publisher it will ask for details
  • Right click replication – create distributor
  • Select server
  • Configure snapshot folder
  • Configure distribution database
  • Add users (who can use distributor) from publisher

Configuration of Publisher

  • Right click local pubs – new publications
  • Select pub type – notice no p2p! Done differently – see next section
  • Select databases
  • Cannot replicate system databases, filestream or new data types in 2005 or DDL changes in SQL 2000.
  • Replicating views require underlying tables – either replicate or create manually
  • Trans replication – not allowed to select tables without primary key
  • Filtered columns must include primary key
  • Static filters – where clause
  • Dynamic filters – different data for different nodes (merge replication only)

Configuration of Subscriber

  • Right click local pubs – new subscription
  • Select publisher
  • Choose push or pull.  If it’s push, agents run on the distributor. Easier for admin. Pull run  agents on subscribers (reduce load)
  • Allocate subscriber database. If select existing db – will recreate all objects – careful.
  • Select users
  • Run on demand / Run continuously / Run at schedule
  • When to synch – now or on first load. Select now if  fewer loads on publisher

 

CONFIGURING REPLICATION (P2P example)

  • To configure P2P – you need to configure transactional replication first
  • Right click create tran replication, properties, subscription options – select p2p.
  • After you set to True, you cannot set back to False.
  • Right click transactional publisher now, option configure p2p

Conflict Resolution in MERGE REPLICATION

  • Merge replication, Merge agent detects conflict
  • Resolution based on set of rules
  • Default, publisher always wins
  • Can customize conflict rules
  • Default resolver is pub or sub with highest priority

Conflict Resolution in P2P REPLICATION

  • Previous versions no conflict resolution – introduced in SQL 2008!
  • Replication fails when conflict – fix manually – restart replication
  • 2008 – each row hidden column peer originator
  • Distributor agent compares these id’s – highest value wins

DDL Replication

  • Not supported in SQL 2000. If 2000 is one node, then change will not reflect
  • Ok in SQL 2005 by default (replicate schema changes set to true) – pub properties
  • DDL changes must be done at publisher
  • DDL Changes must be done by T-sql or SMO not SSMS. Cannot drop published objects.

Security

  • Snapshot agent – do not use same as sql agent service – rights too high. Needs read rights to snapshot folder
  • Connecting to publisher – recommend win authentication
  • Publication Access List (PAL) – additional security measure, any login that needs to read data must be in this group. Maintained within Publisher database
  • Account that connects to sub, must be db_owner
  • Account that connects to dist, must be db_owner and PAL

 Replication Retention

  • Trans Retention Period – minim time data kept in dist db (distributor)
  • Max Retention time – max time dat kept in dist db (distributor)
  • History retention – performance meta data (distributor)
  • Pub retention Period – after this time, non-active subs will be dropped (pub)

 Replication Performance

  • Initial snapshot can impact. Use backup / restore
  • Don’t replicate unnecessary objects and columns
  • Transactional replication uses t-logs to read, move to different drive
  • Distributor on a separate server
  • Snapshot agent moves data from pub to distrib – schedule at low peak times
  • Schedule regular intervals instead of continuously
  • Avoid text, filestream columns etc
  • Push vs pull (work is on distrib or on sub)

Monitoring

  • Replication Monitor (sqlmonitor.exe) and Perfmon
  • Perfmon can log data and run for long time
  • Sp_publication_validation, sp_article_validation, sp_table_validation
  • Tracer tokens used to measure latency (Pub to Dist) and (Dist to Sub)
  • Alerts
  • Performance monitor counters
    • Replication agents
      • Number of agents running
  • Replication distribution
    • Info about Dist and Sub
    • Performance between the two
    • Delivered commands, trans/sec and latency
  • Replication log read
    • Info about pub and dist
    • Performance, monitor trans replicaton
  • Replication Merge
    • Conflicts/sec, downloaded changes/sec, uploaded changes/sec
  • Replication Snapshot
    • Info about Pub and Dist
    • Delivered cmds/sec, delivered trans/sec
    • DMV’s – introduced in SQl 2005
    • Four DMV’s for replication
      • Sys.dm_repl_articles – what objects
      • Sys.dm_repl_schemas – what columns
      • Sys.dm_repl_tranhash – trans being replicated
      • Sys.dm_repl_traninfo – tran or cdc info

 

Thanks for reading.

 

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: