Fixing damaged pages using page restore or manual inserts
Published Mar 23 2019 04:49 AM 848 Views
Microsoft
First published on MSDN on Jan 17, 2007

Here's an interesting scenario that cropped up today. You have a database on a RAID array that failed and has zero'd out a page. How can you get the data back?

There are two ways to do it, depending on the database recovery model and version of SQL Server - single-page restore or manual insert/select - both of which rely on you having a backup of the database. You can use single-page restore if you're on SQL Server 2005 and the database is in Full or Bulk-Logged recovery mode, otherwise you need to use the manual method, and that will only work as long as you know the data being salvaged hasn't changed since the last backup.

Let's try them both. Here's a script to create a test database and make a backup of it:



-- Create the database.


USE

master;

GO


CREATE

DATABASE dbccpagetest;

GO


ALTER

DATABASE dbccpagetest SET RECOVERY FULL;

GO


-- Create a table to play with.


USE

dbccpagetest;

GO


CREATE

TABLE sales (

salesID

INT IDENTITY,

customerID

INT DEFAULT CONVERT (INT, 100000 * RAND ()),

salesDate

DATETIME DEFAULT GETDATE (),

salesAmount

MONEY);

CREATE

CLUSTERED INDEX salesCI ON sales (salesID);

GO


-- Populate the table


SET

NOCOUNT ON;

GO


DECLARE

@count INT

SELECT

@count = 0

WHILE

(@count < 5000)

BEGIN



INSERT INTO sales (salesAmount) VALUES (100 * RAND ());

SELECT @count = @count + 1

END

;

GO


-- Take a full backup.


BACKUP

DATABASE dbccpagetest TO DISK = 'C:dbccpagetest.bak' WITH INIT;

GO


I'm going to simulate our scenario by shutting down the database and using a hex editor to zero out page 158 of the database. (This translates to byte offset 1294336 of the file being zero'd for 8192 bytes).


Now if I run checkdb, I get the following:



Msg 8909, Level 16, State 1, Line 1


Table error: Object ID 0, index ID -1, partition ID 0, alloc unit ID 0 (type Unknown), page ID (1:158) contains an incorrect page ID in its page header. The PageId in the page header = (0:0).


CHECKDB found 0 allocation errors and 1 consistency errors not associated with any single object.


Msg 8928, Level 16, State 1, Line 1


Object ID 2073058421, index ID 1, partition ID 72057594038386688, alloc unit ID 72057594042384384 (type In-row data): Page (1:158) could not be processed. See other errors for details.


CHECKDB found 0 allocation errors and 1 consistency errors in table 'sales' (object ID 2073058421).


CHECKDB found 0 allocation errors and 2 consistency errors in database 'dbccpagetest'.


repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB (dbccpagetest).


What does the page look like?

DBCC

TRACEON (3604);

GO


DBCC

PAGE (dbccpagetest, 1, 158, 3);

GO



DBCC execution completed. If DBCC printed error messages, contact your system administrator.


PAGE: (0:0)


BUFFER:



BUF @0x02C0632C


bpage = 0x04C12000 bhash = 0x00000000 bpageno = (1:158)


bdbid = 9 breferences = 0 bUse1 = 37241


bstat = 0xc00009 blog = 0x89898989 bnext = 0x00000000


PAGE HEADER:


Page @0x04C12000


m_pageId = (0:0) m_headerVersion = 0 m_type = 0


m_typeFlagBits = 0x0 m_level = 0 m_flagBits = 0x200


m_objId (AllocUnitId.idObj) = 0 m_indexId (AllocUnitId.idInd) = 0 Metadata: AllocUnitId = 0


Metadata: PartitionId = 0 Metadata: IndexId = -1 Metadata: ObjectId = 0


m_prevPage = (0:0) m_nextPage = (0:0) pminlen = 0


m_slotCnt = 0 m_freeCnt = 0 m_freeData = 0


m_reservedCnt = 0 m_lsn = (0:0:0) m_xactReserved = 0


m_xdesId = (0:0) m_ghostRecCnt = 0 m_tornBits = 16777216


Allocation Status


GAM (1:2) = ALLOCATED SGAM (1:3) = NOT ALLOCATED


PFS (1:1) = 0x60 MIXED_EXT ALLOCATED 0_PCT_FULL DIFF (1:6) = NOT CHANGED


ML (1:7) = NOT MIN_LOGGED


Msg 2514, Level 16, State 5, Line 2


DBCC PAGE error: Invalid page type - dump style 3 not possible.


Note the error at the end of the output - DBCC PAGE can't do an in-depth dump because it doesn't know what page type the page is. Let's try a full page hex dump using dump style 2 instead:

DBCC

PAGE (dbccpagetest, 1, 158, 2);

GO


PAGE: (0:0)


<page header etc skipped for brevity>


DATA:


Memory Dump @0x44F3C000


44F3C000: 00000000 00020000 00000000 00000000 †................


44F3C010: 00000000 00000000 00000000 00000000 †................


<deleted for brevity>


44F3DFE0: 00000000 00000000 00000000 00000000 †................


44F3DFF0: 00000000 00000000 00000000 00000000 †................


DBCC execution completed. If DBCC printed error messages, contact your system administrator.

It really is all zero. First we'll fix it using page restore.


USE

master;

GO


RESTORE

DATABASE dbccpagetest PAGE = '1:158' FROM DISK = 'C:dbccpagetest.bak';

GO


Processed 1 pages for database 'dbccpagetest', file 'dbccpagetest' on file 1.


The roll forward start point is now at log sequence number (LSN) 32000000047000001. Additional roll forward past LSN 33000000001700001 is required to complete the restore sequence.


RESTORE DATABASE ... FILE=<name> successfully processed 1 pages in 0.176 seconds (0.046 MB/sec).

Isn't that cool? You can restore up to 1000 single pages from a backup at a time. For VLDBs, this cuts the recovery time WAY down. Now we need to roll forward the log. We don't have any more log backups so we can finish the roll forward by backing up and restoring the tail of the log.


-- Need to complete roll forward. Backup the log tail...


BACKUP

LOG dbccpagetest TO DISK = 'C:dbccpagetest_log.bak' WITH INIT;

GO


-- ... and restore it again.


RESTORE

LOG dbccpagetest FROM DISK = 'C:dbccpagetest_log.bak';

GO


Processed 5 pages for database 'dbccpagetest', file 'dbccpagetest_log' on file 1.


BACKUP LOG successfully processed 5 pages in 0.146 seconds (0.248 MB/sec).


Processed 0 pages for database 'dbccpagetest', file 'dbccpagetest' on file 1.


RESTORE LOG successfully processed 0 pages in 0.004 seconds (0.000 MB/sec).

And now we should have a clean database:


DBCC

CHECKDB (dbccpagetest) WITH NO_INFOMSGS;

GO


Command(s) completed successfully.

Easy. But what if we can't do a page restore? Assuming I've corrupted the database in exactly the same way again, the first thing is to do is make sure we can restore the backup and then see what data range is on that page:


RESTORE

DATABASE dbccpagetest_copy FROM DISK = 'C:dbccpagetest.bak' WITH

MOVE N'dbccpagetest' TO N'C:dbccpagetest_copy.mdf',

MOVE N'dbccpagetest_log' TO N'C:dbccpagetest_log.ldf',

REPLACE;

GO


DBCC

PAGE (dbccpagetest_copy, 1, 158, 3);

GO


Processed 184 pages for database 'dbccpagetest_copy', file 'dbccpagetest' on file 1.


Processed 2 pages for database 'dbccpagetest_copy', file 'dbccpagetest_log' on file 1.


RESTORE DATABASE successfully processed 186 pages in 0.361 seconds (4.205 MB/sec).


PAGE: (1:158)


BUFFER:


BUF @0x02BE8D38


bpage = 0x03FB4000 bhash = 0x00000000 bpageno = (1:158)


bdbid = 10 breferences = 1 bUse1 = 38283


bstat = 0xc00009 blog = 0x159a2159 bnext = 0x00000000


PAGE HEADER:


Page @0x03FB4000


m_pageId = (1:158) m_headerVersion = 1 m_type = 1


m_typeFlagBits = 0x4 m_level = 0 m_flagBits = 0x8200


m_objId (AllocUnitId.idObj) = 68 m_indexId (AllocUnitId.idInd) = 256


Metadata: AllocUnitId = 72057594042384384


Metadata: PartitionId = 72057594038386688 Metadata: IndexId = 1


Metadata: ObjectId = 2073058421 m_prevPage = (1:157) m_nextPage = (1:159)


pminlen = 28 m_slotCnt = 245 m_freeCnt = 11


m_freeData = 7691 m_reservedCnt = 0 m_lsn = (24:453:8)


m_xactReserved = 0 m_xdesId = (0:0) m_ghostRecCnt = 0


m_tornBits = -1020457745


Allocation Status


GAM (1:2) = ALLOCATED SGAM (1:3) = NOT ALLOCATED


PFS (1:1) = 0x60 MIXED_EXT ALLOCATED 0_PCT_FULL DIFF (1:6) = NOT CHANGED


ML (1:7) = NOT MIN_LOGGED


Slot 0 Offset 0x60 Length 31


Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP


Memory Dump @0x4542C060


00000000: 10001c00 d5030000 5bd30000 3f771101 †........[...?w..


00000010: b9980000 baa10a00 00000000 0500e0††††...............


UNIQUIFIER = [NULL]


Slot 0 Column 1 Offset 0x4 Length 4


salesID = 981


Slot 0 Column 2 Offset 0x8 Length 4


customerID = 54107


Slot 0 Column 3 Offset 0xc Length 8


salesDate = Jan 17 2007 4:35PM


Slot 0 Column 4 Offset 0x14 Length 8


salesAmount = 69.68


<deleted for brevity>


Slot 244 Offset 0x1dec Length 31


Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP


Memory Dump @0x4542DDEC


00000000: 10001c00 c9040000 bfa10000 57771101 †............Ww..


00000010: b9980000 c6b80500 00000000 0500e0††††...............


UNIQUIFIER = [NULL]


Slot 244 Column 1 Offset 0x4 Length 4


salesID = 1225


Slot 244 Column 2 Offset 0x8 Length 4


customerID = 41407


Slot 244 Column 3 Offset 0xc Length 8


salesDate = Jan 17 2007 4:35PM


Slot 244 Column 4 Offset 0x14 Length 8


salesAmount = 37.50


DBCC execution completed. If DBCC printed error messages, contact your system administrator.


So we're looking at salesID range 981 to 1225 inclusive. Before we can copy the rows back to the damaged database, we need to get rid of the corrupt page. Repair should delete the page for us. First I'm going to take another backup though - just in case something goes wrong!



BACKUP

DATABASE dbccpagetest TO DISK = 'C:dbccpagetest_corrupt.bak' WITH INIT;

GO


ALTER DATABASE dbccpagetest SET SINGLE_USER;

GO


DBCC

CHECKDB (dbccpagetest, REPAIR_ALLOW_DATA_LOSS) WITH NO_INFOMSGS;

GO


ALTER

DATABASE dbccpagetest SET MULTI_USER;

GO


Processed 184 pages for database 'dbccpagetest', file 'dbccpagetest' on file 1.


Processed 4 pages for database 'dbccpagetest', file 'dbccpagetest_log' on file 1.


BACKUP DATABASE successfully processed 188 pages in 0.380 seconds (4.052 MB/sec).


Msg 8909, Level 16, State 1, Line 1

Table error: Object ID 0, index ID -1, partition ID 0, alloc unit ID 0 (type Unknown), page ID (1:158) contains an incorrect page ID in its page header. The PageId in the page header = (0:0).


The error has been repaired.


CHECKDB found 0 allocation errors and 1 consistency errors not associated with any single object.


CHECKDB fixed 0 allocation errors and 1 consistency errors not associated with any single object.


Repair: The Clustered index successfully rebuilt for the object "dbo.sales" in database "dbccpagetest".


Repair: The page (1:158) has been deallocated from object ID 2073058421, index ID 1, partition ID 72057594038386688, alloc unit ID 72057594042384384 (type In-row data).


Msg 8945, Level 16, State 1, Line 1


Table error: Object ID 2073058421, index ID 1 will be rebuilt.


The error has been repaired.


Msg 8928, Level 16, State 1, Line 1


Object ID 2073058421, index ID 1, partition ID 72057594038386688, alloc unit ID 72057594042384384 (type In-row data): Page (1:158) could not be processed. See other errors for details.


The error has been repaired.


Msg 8976, Level 16, State 1, Line 1


Table error: Object ID 2073058421, index ID 1, partition ID 72057594038386688, alloc unit ID 72057594042384384 (type In-row data). Page (1:158) was not seen in the scan although its parent (1:154) and previous (1:157) refer to it. Check any previous errors.


The error has been repaired.


Msg 8978, Level 16, State 1, Line 1


Table error: Object ID 2073058421, index ID 1, partition ID 72057594038386688, alloc unit ID 72057594042384384 (type In-row data). Page (1:159) is missing a reference from previous page (1:158). Possible chain linkage problem.


The error has been repaired.


CHECKDB found 0 allocation errors and 3 consistency errors in table 'sales' (object ID 2073058421).


CHECKDB fixed 0 allocation errors and 3 consistency errors in table 'sales' (object ID 2073058421).


CHECKDB found 0 allocation errors and 4 consistency errors in database 'dbccpagetest'.


CHECKDB fixed 0 allocation errors and 4 consistency errors in database 'dbccpagetest'.

We should check the row count to see that the count has dropped from the initial 5000 rows we inserted:


USE

dbccpagetest;

GO


SELECT

COUNT (*) FROM SALES;

GO


SELECT

COUNT (*) FROM sales WHERE salesID > 980 AND salesID < 1226;

GO

And we're down to 4755 rows, as expected with zero rows in that range. All we need to do now is to copy the missing rows over from the restored copy. Remember, this will only work if you know that the data being salvaged hasn't changed since the backup was taken - otherwise you'll have old and new data mixed in the table which will play havoc with your business. Before we copy the rows, we know we're got an identity column we'd like to preserve so we set IDENTITY_INSERT on which tells the server not to generate new identity values for the inserted rows.


-- Make sure identity values survive.


SET

IDENTITY_INSERT sales ON;

GO


-- Insert the missing rows.


SET

NOCOUNT OFF;

GO


INSERT

INTO sales (salesID, customerID, salesDate, salesAmount)

SELECT * FROM dbccpagetest_copy.dbo.sales AS R

WHERE R.salesID > 980 AND R.salesID < 1226;

GO


-- Restore identity behavior.


SET

IDENTITY_INSERT sales OFF;

GO


(245 row(s) affected)

We copy over 245 rows and checking the row count again says we're back to 5000 rows.

I want to post a lot more scenarios like this from now on - this may mean the posts are a little less meaty than last year but should happen a lot more often. If there's any particular scenario you'd like to see covered, add a comment and let me know.

(First post of the year! The diving vacation was excellent - 100 foot visibility, 85F water, and amazing creatures. I managed to get in 28 dives in the 8 diving days I had. Here's a picture of me about 40ft underwater in Indonesia to close.)



Version history
Last update:
‎Mar 23 2019 04:49 AM
Updated by: