Monday, March 12, 2012

Log Shipping Fails after 45 minutes of Success

Greetings:
I have spent the past coupleof weeks investigating, testing, and now
deploying Log Shipping on our SQL server(s). I thought I had the
implementation process down, and had it working (well it was working fine on
the test environment), but I have run into something 'painful'...
I have configured several databases to 'Log Ship' from our primary server (a
Windows 2000 Advanced Server running SQL 2000 Enterprise w/ SP3a) to our
secondary server (a Windows 2003 Enterprise Ed. running SQL 2000 Enterprise
Ed. w/SP3a). At this point, I have successfully configured 7 databases to
'Log Ship' under this configuration, with different copy & load schedules.
Unfortunately, one of these databases, specifically the one called dbGlobal
that is configured to replicate the most frequently (every 5 minutes) has
displayed the following behavior BOTH times I attempted to configure it from
scratch using the EM Wizard.
The initial configuration, creation of the destination database, restoration
of data and initial transaction logs run no problem. As a matter of fact,
configured to run every 5 minutes, this database appears to do what it is
expected to for 45 minutes. After 45 minutes, the Maintenance Plan suddenly
fails to load additional transaction logs to the destination database. I
have noticed the following symptoms:
In the SQL Server logs, I continually see this sequence of messages, all in
the same second the Maintenance Plan is activated (both before and after the
failure):
Starting up database 'dbGlobal'
Bypassing Recovery for database 'dbGlobal' because it is marked IN LOAD.
Bypassing Recovery for database 'dbGlobal' because it is marked IN LOAD.
Starting up database 'dbGlobal'
Recovery is checkpointing database 'dbGlobal' (7)
Log Restored: Database dbGlobal....
Note that when I view the list of databases on the secondary server in EM,
it lists all Log Shipped databases, including dbGlobal, as Read-Only.
Secondly, and I think this is much more informative, is the information from
the Job history and SQL command. When reviewing the job history for the Log
Shipping Restore of this database on the secondary server, as I said, they
all appear successful for the first 45 minutes, and then suddenly just start
failing. When I copy the SQL command from the maintenance plan step to SQL
Query Analyzer, I get the following series of errors:
"[Microsoft SQL-DMO (ODBC SQLState: 42000)] Error 4305: [Microsoft][ODBC
SQL Server Driver][SQL Server]The log in this backup set begins at LSN
xxxxxxxxxxxxxxxx, which is too late to apply to the database. An earlier
log backup that includes LSN yyyyyyyyyyyyyyy can be restored."
In the above error message, there is a series of these. In each one,
xxxxx... and yyyyy... are different and not equal.
Finally, when viewing the Log Shipping Monitor (on a separate Windows 2000
box running SQL 2000 Standard Edition), the status indicates:
"Secondary out of sync"
And finally, one more anomaly I have noticed. Each database has its
transaction logs backed up to a separate directory. In the directory for
the dbGlobal transaction logs, a file with an extension of .tuf was created
each time I configured the Log Shipping Maintenance Plan. This was not
created in any of the other directories for the other databases. Could this
have something to do with it? And yes, it is very likely this database is
being modified during the initial Log Shipping configuration.
I was supposed to have this running and working by the beginning of
November, so needless to say I need some assistance as quickly as possible.
If anyone can help me identify why I can get this going, but not keep it
going on only this one database, I would be very appreciative. Thanks in
advance for your help, and I hope I provided enough but not too much info.
Sincerely,
Keith C. Jakobs, MCP
Keith,
It looks to me like there is another transaction log backup that is
"stepping on" your log shipping process. Your log shipping setup
expects all transaction logs from a given database, and if a backup of
the log is done outside of that log-shipping system, it essentially
leaves a gap that the log-shipping hangs on.
In my situation, I set up log-shipping on a 15-minute schedule, but
forgot that I also had a general maintenance plan set up for all user
databases that included a log backup. So, log-shipping worked until my
general maint plan took a log backup, and then it hung.
Remove any other jobs that are taking backups of the log in question,
re-build log-shipping, and you should be good to go.
Hope this helps.
Keith Jakobs, MCP wrote:
> Greetings:
> I have spent the past coupleof weeks investigating, testing, and now
> deploying Log Shipping on our SQL server(s). I thought I had the
> implementation process down, and had it working (well it was working fine on
> the test environment), but I have run into something 'painful'...
> I have configured several databases to 'Log Ship' from our primary server (a
> Windows 2000 Advanced Server running SQL 2000 Enterprise w/ SP3a) to our
> secondary server (a Windows 2003 Enterprise Ed. running SQL 2000 Enterprise
> Ed. w/SP3a). At this point, I have successfully configured 7 databases to
> 'Log Ship' under this configuration, with different copy & load schedules.
> Unfortunately, one of these databases, specifically the one called dbGlobal
> that is configured to replicate the most frequently (every 5 minutes) has
> displayed the following behavior BOTH times I attempted to configure it from
> scratch using the EM Wizard.
> The initial configuration, creation of the destination database, restoration
> of data and initial transaction logs run no problem. As a matter of fact,
> configured to run every 5 minutes, this database appears to do what it is
> expected to for 45 minutes. After 45 minutes, the Maintenance Plan suddenly
> fails to load additional transaction logs to the destination database. I
> have noticed the following symptoms:
> In the SQL Server logs, I continually see this sequence of messages, all in
> the same second the Maintenance Plan is activated (both before and after the
> failure):
> Starting up database 'dbGlobal'
> Bypassing Recovery for database 'dbGlobal' because it is marked IN LOAD.
> Bypassing Recovery for database 'dbGlobal' because it is marked IN LOAD.
> Starting up database 'dbGlobal'
> Recovery is checkpointing database 'dbGlobal' (7)
> Log Restored: Database dbGlobal....
> Note that when I view the list of databases on the secondary server in EM,
> it lists all Log Shipped databases, including dbGlobal, as Read-Only.
> Secondly, and I think this is much more informative, is the information from
> the Job history and SQL command. When reviewing the job history for the Log
> Shipping Restore of this database on the secondary server, as I said, they
> all appear successful for the first 45 minutes, and then suddenly just start
> failing. When I copy the SQL command from the maintenance plan step to SQL
> Query Analyzer, I get the following series of errors:
> "[Microsoft SQL-DMO (ODBC SQLState: 42000)] Error 4305: [Microsoft][ODBC
> SQL Server Driver][SQL Server]The log in this backup set begins at LSN
> xxxxxxxxxxxxxxxx, which is too late to apply to the database. An earlier
> log backup that includes LSN yyyyyyyyyyyyyyy can be restored."
> In the above error message, there is a series of these. In each one,
> xxxxx... and yyyyy... are different and not equal.
> Finally, when viewing the Log Shipping Monitor (on a separate Windows 2000
> box running SQL 2000 Standard Edition), the status indicates:
> "Secondary out of sync"
> And finally, one more anomaly I have noticed. Each database has its
> transaction logs backed up to a separate directory. In the directory for
> the dbGlobal transaction logs, a file with an extension of .tuf was created
> each time I configured the Log Shipping Maintenance Plan. This was not
> created in any of the other directories for the other databases. Could this
> have something to do with it? And yes, it is very likely this database is
> being modified during the initial Log Shipping configuration.
> I was supposed to have this running and working by the beginning of
> November, so needless to say I need some assistance as quickly as possible.
> If anyone can help me identify why I can get this going, but not keep it
> going on only this one database, I would be very appreciative. Thanks in
> advance for your help, and I hope I provided enough but not too much info.
> Sincerely,
> Keith C. Jakobs, MCP

No comments:

Post a Comment