Post-it notes: Chapter 7 – Maintaining Your Database

March 6, 2013

Flirting with the idea of taking exam [70-432: Microsoft SQL Server 2008, Implementation and Maintenance] I created loads of post-it notes based on the book: “The Real MCTS SQL Server 2008 Exam 70-432 Prep kit”. I’ve decided to re-visit these notes (before the stickiness wears off and I lose them) and transfer them onto this blog. This will ultimately be a great knowledge refresher and quick reference guide for me. Hopefully you’ll find them useful too. Thanks for reading.

SQL Fragmentation

Data Collation

  • Allow to store Multiple languages in a single column
  • Specify languages for instance, db or column
  • Set of rules for sorting, storing and comparing text
  • Affects most DB operations
  • Affects query results, import/export, backup and restore
  • Model db collation will be used for new databases
  • Default is set to most appropriate based on win regional settings
  • Collation rules
    • Case sensitive (CS) – upper vs lower case in sorting order
    • Accent sensitive (AS) – vav = or != vӒv
    • Kanatype sensitive (KS)  –hiragana = or != katakana
    • Width sensitive (WS) –  char that is the same but 1byte vs 2 byte, same?
    • Default unless  specified – CI AI KI WI
  • Windows Collation
    • Based on regional  settings
    • Unicode and non-unicode sorted the same way (sort vchar = sort nvarchar)
    • Choose win collation if possible
  • SQL Server Collation
    • Backward compatibility
    • SQL 6.5 and 7
    • Unicode != non-unicode sorting
    • Prefixed with sql_***
  • Binary Collation
    • Sort and compare based on binary value
    • Improve query performance
    • Simpler than other collations
    • Suffix of BIN_ (backward compatibility) or BINX_ (new)
  • SQL 2008 has 80 new collations
  • Select * from fn_helpCollations – list of collations
  • Temp tables use the user db collation, not tempdb
  • Database level collation
    • Uses server as default if not specified
    • Default for columns
    • Used in any queries against db
    • T-SQL > CREATE  database
  • Column Level Collation
    • Uses db if not specified
    • Char, nchar, varchar, nvarchar, text, ntext
    • T-SQL > CREATE database
  • Expression level collation
    • Overrides db collation
    • T-SQL > COLLATE clause
  • How to decide on  Collation
    • Upgrading from 5.6/7 – use SQL
    • Languages not used, use win suggested
    • Many languages, choose the most common
  • Collation – Backup and Restore
    • DB always retain source collation

Data Compression (ROW | PAGE)

  • Save space
  • New to 2008 (enterprise and developer)
  • Decreases performance (writes)
  • Two types of compression Row or Page
  • Applied to tables and indexes, heaps and clustered indexes
  • Use compression on non-clustered (stored separately)
  • Can configure compression on individual partitions
  • Compression not for system tables
  • ROW Compression
    • Reduces space by converting fixed length to variable length
    • Allows you to fit more rows per page
    • No effect on tinyint, smalldatetime, date, unique identifier, text, image, xml, table, sql_variant, cursor
    • No effect on fields that are already variable
  • PAGE Compression
    • Does the following
      • Row
      • Prefix (repeating values in column, replaced by ref#)
      • Dictionary (repeating values in page, replaced by ref#)
    • Page compression goes further than row
    • Page compression profound effect on writes
    • Enable Page compression, enables row by default
  • Estimated space savings: sp_estimate_data_compression_savings

Sparse Columns

  • Reduce amount of space taken up by NULLS
  • Reads take longer on non-null values
  • Should only be used on columns with lots of NULLS
  • Column1 varchar (50) SPARSE NULL
  • ColumnSets is a calculated column, returns XML of all SPARSE columns
  • ColumnSets are updateable
  • Sparse columns appropriate for filtered indexes

Maintaining Indexes

  • Heap – table with no index
  • Clustered index – sorts the table, 1 per table, index within table
  • Non-clustered index – separate IDX – can have many.
  • If CREATE or DROP Clustered index, it rebuilds all non-clustered
  • If dropping lots of indexes, drop non-clustered first, then clustered
  • Cannot drop an index that is supporting a primary or unique constraint
  • Update row – index either in place update or delete and insert
  • Delete row – index corresponding row deleted. Either delete or mark for delete (ghosting) ghost records ignored. Used for fast rollback
  • SELECT INTO creates heaps – lower fill factor – more space on disk
  • CREATE CLUSTERED INDEX xxx on SALES (salesid, ordered) include (xxx, yyy)
    • Salesid – foreign key
    • Salesid and ordered – combined primary key
    • Salesid – good for where, groupby part of query
    • Include – good for select fields in query
  • Index objects
    • Sys.dm_db_index_usage_stats
    • Sys.dm_db_missing_index_details
    • Sys.dm_db_missing_index_columns
    • Sys.dm_exec_query_stats
    • Sys.dm_db_index_operational_stats
    • Dbcc showcontig(employees)
    • Fillfactor – desired page fullness
    • PadIndex – if fill factor impacts non-leaf level
    • Sort in tempdb – sorting while creating
    • Ignore dupkey – for unique contraints
    • Drop_existing – use free space of old one
    • Online – access while creating (dev, ent, eval only)
    • Maxdop – max cpu’s creating indexes
  • Internal vs External Fragmentation
    • External
      • Logical order of pages stretches across extents
      • SQL spends more time accessing drives
      • Split page, SQL must load 2 extents (16 pages)
    • Internal
      • Refers to empty space within pages (deleted data)
      • Improves write performance (insert without re-arranging pages)
      • Decreases read performance (load more extents)
  • Index Rebuild vs Re-organize
    • Rebuild
      • More than 30% fragged
      • Not accessible unless ‘ONLINE’ – alter index
      • Very intensive
      • Can specify fill factor
      • Rebuilding does update stats (no need to do separately)
    • Re-Organize
      • Less than 30%
      • Not as intensive
      • Users can access (use page locks)
      • Takes longer
      • Cannot specify fill factor
      • Can cancel – no rollback
      • Not as thorough
  • ALTER INDEX command
    • Can DISABLE index
    • Rebuild Index (REBUILD)
    • Re-Organize Index  (REORGANIZE)
    • Change Index Options
    • Cannot add or change columns must use CREATE INDEX DROP EXISTING
  • Index FILL FACTOR guidelines
    • Read only database – fill factor 100%
    • Mostly Writes – fill factor 50%-70%
    • Both Read and wrote – fill factor 80%-90%

SQL Indexes

DBCC Explained

  • Most DBCC commands are actioned on a snapshot of the master DB – then deleted after use
  • Avoids congestion on  main db
  • Processor intensive, don’t use on heavy load server
  • Consumes space in tempdb
    • NOINDEX command skips nonclustered index (speed up process)
    • Or specify the index ID
    • Cluster index, heaps are always checked
    • DB must be in single user mode for a repair to occur
    • TABLOCK – no snapshot, do on main table
    • ESTIMATEONLY – how much space it will use in tempdb
    • PHYSICAL_ONLY – physical structure of pages, rows etc
    • A DBCC CHECKDB runs the following commands on every object in the table:
      • Checkalloc
      • Checktable
      • Checkcatalog

TLOG Shrink vs TLOG Truncation 

    •  Truncating a log file will change one or more VLF file’s status from active to inactive
    • Occurs at checkpoint (Simple recovery mode) or when you backup the LOG (Full/Bulk recovery mode)
    • If full/bulk recovery, checkpoints simply write dirty pages to disk.
    • You have to perform a tlog backup to force truncation in full/bulk – cannot rely on checkpoint – does nothing.
    • Truncation – Does not change the physical size of the TLOG (just frees up VLF’s to be used).
    • There is an option in TLOG (Auto-Shrink) which would physically shrink the file periodically.
    • You can manually force log truncation by issuing backup log (NO_LOG or TRUNCATE_ONLY)
    • BACKUP LOG WITH TRUNCATE_ONLY or WITH NO_LOG is deprecated. Note:  When you use this command you can’t do regular log backup and point in time recovery too. This command will truncate the log files (VLF) and break the log chain. To maintain the log chain again you have to run full or differential backup asap.
    • Shrinking the log file will reduce the physical log file size.
    • Shrinking the log file is not best practice until you don’t have any other option.
    • How to avoid the shrinking?
    • Frequently run the log backup and maintain the log size as small as possible.
    • Alter the database and point the log file to another drive/Add new drive.
    • Ready to accept the data loss? – Then puts the database in simple recovery mode and run the differential backup every 2 to 3 hours.
    • You can manually shrink the log file by using: DBCC SHRINKFILE (Test_Log)
    • Important note: Shrinking the data/log file will increase the fragmentation. Try to truncate the log file instead of shrinking. This means running the log backup more frequently.
    • Sometimes your shrink command will not shrink the database. When there is an active portion of log is available OR the transaction is not closed (not yet committed).
      • EMPTYFILE – moves all data to another file in filegroup. Then you could delete file (alter database). SQL will not write to that file again.
      • TARGET_SIZE – Tries to shrink down to that size
      • NOTRUNCATE – Although data moved, not released to OS
      • TRUNCATEONLY – release space to OS from end of extent, no page rearrangment 

Database Recovery Modes

  • Recovery model determines how SQL Server uses the logs for a database
  • 3 recovery models
    • Simple – logs re-used as soon as full
    • Full – all logs kept, all transactions written to log, all data recoverable
    • Bulk Logged – similar to fill, but ignore non-logged operations
  • System tables in simply mode, except msdb can be full

Database Backups

  • Can be done online, however resource intensive
  • Full backup – backup all data + small portion of log while backup was running
  • Diff Backup – backup extents modified since last full backup
  • Partial Backup – backup file groups and partitions extents that were modified since last full or      partial backup
  • File/Filegroup backup – full backup of file or filegroup
  • Copy-only – same as full however does not disrupt LSN / transactional log chain
  • Transaction log – backup log entries since last full backup. Also truncates ldf file.
  • Tail log backup – backup t-log just before a restore operation

Compressed Backups

  • Enterprise Only, SQL 2008 only
  • Any edition can restore
  • Performance advantage
  • More processor intensive
  • Cannot be read by 2005, 2000, 7 etc

Restoring Databases

  • During full restore db is usually not available
  • Some cases we can restore only part, leave rest of db online
  • Online restore is default when performing file, filegroup, piecemeal restores
  • Restoring individual pages – check suspect_pages table in msdb
    • Suspect page in index – rebuild index
    • Suspect page in table – restore from backup
  • System tables DR – if you can start sql, restore system databases. If you cannot start SQL, then you need to rebuild system tables

SQL Server Agent

  • 3 Key concepts – jobs, alerts and operators
  • Info stored in msdb
  • There is a separate agent service for each instance
  • Dbmail uses smtp, not reliant on sql service
  • Maintenance plans – must be part of sysadmin to view
  • Maint plans – msdb..sysmaintplan_log and sysmaintplan_logdetail

Policy Based Management

  • Introduced 2008
  • View/configure settings multiple servers
  • Apply changes to many
  • Enforce standards – aka – explicit admin
  • Access under management > Policy Management
  • 3 concepts
    • Targets
      • Objects to be managed (db, instance, table, index etc)
      • Can have target sets (e.g. all db’s hr_*)
    • Facets
      • Properties of a target
      • server facet – collation, version etc
      • db facet – name, size collation
      • index facet – etc etc
    • Conditions
      • States allowed for target e.g. name like table_xxx
  • 4 Evaluation Modes
    • On demand
    • On change: persistant (ddl trigger)
    • On change: log only (ddl trigger)
    • On schedule: agent
    • Above are used to check if target complies with policy
    • Writes to event log
    • Sample policies in xml format available which can be imported [tools\policies\dbengine\1033]

Data Collector

  • Collect and consolidate performance data
  • Made up of agent jobs, data structures, RS  and tools
  • RS and SQL Agent is required
  • All editions except express
  • DC components
    • Management Data Warehouse stores data
    • Data providers – perfmon, trace, queries
    • Collection items – what to collect, how often. Collection sets (grouping)
    • Data Collection Reports (RS)
    • Proxy accounts – used       to access data sources

Thanks for reading.


One Response to “Post-it notes: Chapter 7 – Maintaining Your Database”

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

%d bloggers like this: