Kevin Kempf's Blog

February 26, 2014

Controlling Data Guard Replication Lag

Filed under: Uncategorized — kkempf @ 12:35 pm


Tuning Standby Lag with Oracle Active Data Guard

We bought Active Data Guard because of increasing reporting demands on our primary database in our ERP environment. While Active Data Guard doesn’t play nicely with Apps 11i out of the box (it’s read-only, and just to establish a forms session you need to be able to write), it can fill a nice role offloading CPU load by serving up near-real-time reports for specific (in our case bolt-on) applications which only need to read tables.

What is Near-Real Time?

Once an SLA is established for the “oldest” data a report can return, you can tweak Oracle to honor it. The example below shows me going from no lag target to 300 seconds and finally to 600 seconds. Note that once you settle on a number, you’d be wise to add “scope=both” to the end of the alter system, so the setting is also written to the spfile and survives a restart.

$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Wed Feb 26 10:26:49 2014

Copyright (c) 1982, 2013, Oracle. All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> show parameter archive_lag_target;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
archive_lag_target                   integer     0

SQL> alter system set archive_lag_target = 300;

System altered.

SQL> alter system set archive_lag_target = 600;

System altered.

Lag Behavior

I found it interesting to note that, left to “its own devices”, the lag between archivelog ships (and therefore application on the other end) becomes an inverse function of how active your database is: the quieter the database, the longer each log takes to fill and ship. In other words, the recency of your data at your standby is related to how fast you fill your online redo logs (which is also a function of how big they are), plus the odd twist of system-driven logfile switches. Let’s say you had 200MB redo logs with a nearly idle system. Your lag can get huge if not tuned!
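To put rough numbers on that, here is a minimal back-of-the-envelope sketch in Python. The function names and the 10 KB/s redo rate are hypothetical, chosen only for illustration. It assumes a steady redo rate and ignores the system-driven logfile switches mentioned above, so treat the result as an upper bound rather than a measurement.

```python
def seconds_to_fill(log_size_mb, redo_rate_kb_per_sec):
    """Seconds needed to fill one online redo log at a steady redo rate."""
    if redo_rate_kb_per_sec <= 0:
        return float("inf")  # a completely idle database never fills the log
    return log_size_mb * 1024 / redo_rate_kb_per_sec


def switch_interval(log_size_mb, redo_rate_kb_per_sec, archive_lag_target=0):
    """Approximate longest wait between log switches, in seconds.

    archive_lag_target=0 means no target is set, so only the fill time counts.
    A positive target caps the wait, because a switch is forced at that point.
    """
    fill = seconds_to_fill(log_size_mb, redo_rate_kb_per_sec)
    if archive_lag_target > 0:
        return min(fill, archive_lag_target)
    return fill


if __name__ == "__main__":
    # Hypothetical near-idle system: 200 MB logs, 10 KB/s of redo.
    print(switch_interval(200, 10))       # untuned
    print(switch_interval(200, 10, 300))  # with archive_lag_target = 300
```

With those assumed numbers, the untuned case works out to 20480 seconds (roughly 5.7 hours) between ships, while a 300 second target caps it at 300.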

The graphic below captures the result of SQL:
select to_number(substr(value,instr(value,':',1,2)+1,length(value))) + 60 * to_number(substr(value,instr(value,':',1,1)+1,2)) seconds from v$dataguard_stats@apps_to_dataguard where name = 'apply lag';
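Note that the query above adds up only the minutes and seconds fields of the lag value, so a lag of an hour or more would be under-reported. If you post-process the raw value in a monitoring script instead, a small parser can account for every field. The sketch below is in Python and assumes the value arrives as a string shaped like `+DD HH:MI:SS` (for example `+00 00:05:23`); the function name `lag_to_seconds` is hypothetical.

```python
import re

# Assumed shape of the raw lag value: "+DD HH:MI:SS"
_LAG_PATTERN = re.compile(r"^\+(\d+) (\d{2}):(\d{2}):(\d{2})$")


def lag_to_seconds(value):
    """Convert a '+DD HH:MI:SS' lag string into total seconds."""
    match = _LAG_PATTERN.match(value.strip())
    if match is None:
        raise ValueError("unexpected lag format: %r" % value)
    days, hours, minutes, seconds = (int(part) for part in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


if __name__ == "__main__":
    print(lag_to_seconds("+00 00:05:23"))
    print(lag_to_seconds("+00 01:00:00"))
```

Raising on an unexpected format, rather than returning 0, keeps a malformed or missing value from looking like a perfectly caught-up standby.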

[Graph: Data Guard apply lag in seconds over time]

The left part of the graphic (before 10:21) shows data transport left untuned. From 10:21 to 11:21 I had the target set to 300 seconds, and from 11:21 onward it’s set to 600 seconds.

Near-Real Time on Steroids: Real Time Apply

Check your licensing; your mileage may vary. The easiest way to keep the standby up to date is to use real-time apply and standby redo logs. To create standby redo logs, go to your standby and cancel recovery:

alter database recover managed standby database cancel;

Next, add your standby logs. They need to be the same size as the online redo logs on the primary. Make N+1 of them on the standby, where N is the number of online redo log groups on the primary. The syntax looks like this:

alter database add standby logfile group 41 ('/usr/local/oracle/redo/log41b.dbf','/u04/appprod/proddata/log41a.dbf') size 100M;
alter database add standby logfile group 42 ('/usr/local/oracle/redo/log42b.dbf','/u04/appprod/proddata/log42a.dbf') size 100M;
alter database add standby logfile group 43 ('/usr/local/oracle/redo/log43b.dbf','/u04/appprod/proddata/log43a.dbf') size 100M;
alter database add standby logfile group 44 ('/usr/local/oracle/redo/log44b.dbf','/u04/appprod/proddata/log44a.dbf') size 100M;
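If you have more than a handful of groups, typing these by hand gets error-prone. The N+1 rule can be sketched as a small Python generator that prints the statements for you to review and paste into SQL*Plus. The function name and the `{group}` path templates are hypothetical. The example call assumes three online redo log groups on the primary, which is a guess based on the four statements above and not something the post states.

```python
def standby_logfile_ddl(first_group, online_group_count, size_mb, member_templates):
    """Build N+1 'add standby logfile' statements for N online redo log groups.

    member_templates are file paths containing a {group} placeholder.
    """
    statements = []
    for offset in range(online_group_count + 1):  # N + 1 standby groups
        group = first_group + offset
        members = ",".join(
            "'" + template.format(group=group) + "'" for template in member_templates
        )
        statements.append(
            "alter database add standby logfile group %d (%s) size %dM;"
            % (group, members, size_mb)
        )
    return statements


if __name__ == "__main__":
    templates = [
        "/usr/local/oracle/redo/log{group}b.dbf",
        "/u04/appprod/proddata/log{group}a.dbf",
    ]
    # Assumed: 3 online groups on the primary, 100 MB each.
    for statement in standby_logfile_ddl(41, 3, 100, templates):
        print(statement)
```

The script only builds text and never connects to a database, so nothing changes until you run the output yourself.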

Finally, restart your recovery with the real-time apply:

alter database recover managed standby database disconnect using current logfile;

Now your apply lag and transport lag drop to zero:

[Graph: Data Guard apply lag in seconds over time, including the switch to real-time apply]

In the graph above, you can see the final version of data apply rates to the standby.

8:58a-10:21a: unmanaged apply rate (mostly happening when a log on the primary got full)

10:21a-11:21a: honoring the 300-second target (alter system set archive_lag_target=300;)

11:21a-4:30p: honoring the 600-second target (alter system set archive_lag_target=600;)

4:30p-end: real-time apply started with standby logfile groups (lag effectively 0).

Thanks Roth!


February 24, 2014

Oracle RDBMS tier is Virtualized

Filed under: Uncategorized — kkempf @ 2:24 pm


Background

So we’ve been running non-Production versions of our Oracle 11i E-Business Suite environment on VMware since about 2006, and my Linux x86 PAE 11i front end on VMware since at least 2008. I never opened a ticket or had any issue with VMware affecting a guest OS in any way, whether the kernel was Red Hat, Oracle Linux “Red Hat compatible” or even now the UEK. Oracle protects itself with this (Doc ID 249212.1):

 Oracle has not certified any of its products on VMware virtualized environments. Oracle Support will assist customers running Oracle products on VMware in the following manner: Oracle will only provide support for issues that either are known to occur on the native OS, or can be demonstrated not to be as a result of running on VMware.

The truth is, they try to scare you away from it, but it runs just fine, in my experience. The big unknown was always the core database, running on 64-bit Linux. It really does work hard, and has lots of moving parts. As we began the project to convert it, we realized we had a lot of questions without firm answers, so we engaged a prominent third-party Oracle-to-VMware integrator. They did a great job managing the project, but to be honest, unbeknownst to us we’d already figured out most of the technical detail.

Project Flow

There was an extra wrinkle for our production go-live. The server was physically moving from one data center to another one about 50 miles up the road. We had good WAN links, but it still added some time. While we considered leveraging Data Guard to accomplish this, in the end we landed on an RMAN restore and Data Domain replication.

  • Create a VM with lots of CPUs and the same memory footprint as production. Put in all the patches, updates, kernel parameters, and set up huge pages for the 64 GB SGA.
  • Add disk from the SAN leveraging LVM and ext3. We could have used ext4, but the Linux OS was Oracle Linux 5.10 (UEK) and ext3 felt more “tried and true”.
  • Create a VMware template at this point (this was actually done several times, after many, many OS tweaks)
  • Install the Oracle RDBMS software (11.2.0.4 in my case) as well as deploying the EM agent
  • Use RMAN to bring the database to the new VM from the physical box.  In the case of test runs, this was an RMAN duplicate.  For the final, live production run it was a little trickier:
    • RMAN full backup of the RDBMS the night prior, so that full can get across the WAN and my incremental difference will be smaller
    • Shut down the 11i front end so users can’t get in
    • RMAN (hot) backup the RDBMS, shut it down
    • Wait for the Data Domain to replicate the RMAN backup and archivelogs to where the new server was
    • RMAN restore and recover the database
    • IP changes
      • Perform internal DNS changes to the new vlan
      • Bring up the 11i front end at the new location with a VEEAM restore
      • Re-IP the two machines, via /etc/hosts and /etc/sysconfig/network, as well as FND_NODES changes (see 751328.1)

Results & Advantages

Since the move to VMware, we’ve had no issues whatsoever. The database runs mostly in memory, and the users are none the wiser. VMware does bring some advantages to the table:

  • Unlike a physical machine, where I have to be in the data center at a console, or possibly over a DRAC card or the like, when I reboot my machine now, I can watch it in vCenter.
  • I don’t have to worry about drivers. I had a serious issue when our oddball 10 Gb NICs decided to stop working after a yum update on the physical box during a reboot.
  • If the physical server (which in the case of the RDBMS is effectively the same thing as the ESX host, as the host is entirely devoted to running Oracle) running the RDBMS breaks, overheats, has a bad memory chip, burns up a CPU, or gets struck by lightning, we can shut it down cold and storage vMotion it to another host in minutes.
  • It’s a mainstream, mature product. I did a lot of homework on this, and the general consensus was that Oracle VM wasn’t ready yet, but there’s LOTS of people running Oracle on VMware.
