Friday, October 16, 2015

Create a standby database using Active Data Guard

Database version: 11.2.0.x (11gR2) using ASM
OS: RedHat Enterprise Linux 6.x

One thing that is indispensable nowadays at any (reputable) company that works with IT is a copy of your primary database in a different datacenter or availability zone (cloud).

To do that, you have three options: write your own replication script (which can lead to an enormous number of problems); use uncertified third-party software (which can be dangerous too); or buy an Oracle Data Guard license (which can be expensive).

Any one of those three strategies can present considerable problems.
So, let's suppose that you have an Oracle Data Guard license. Now we can use the Data Guard broker to control all replication between our primary and our new standby database.

Before starting our tutorial, let me state one simple rule: you have to install the same Oracle software version on your new machine (which is going to host your standby database), and preferably keep the same storage solution (such as ASM; this is not mandatory, it just keeps things consistent).

Let's assume that we have a single instance primary database with a SID called "prod", that you want to create a single instance standby database with a SID called "dataguard", and that you have three disk groups: DATA, RECO and REDO. DATA will keep our datafiles, spfile and so on; RECO will keep our archived logs; and REDO will keep our redo log files.



1st: Create some standby logfile groups (the usual rule of thumb is one more group than your online redo log groups per thread; depending on your redo generation, you may have to increase these numbers). Standby groups share the numbering space with online redo log groups, so pick numbers that do not clash with the existing ones:


ALTER DATABASE ADD STANDBY LOGFILE GROUP 5 ('+REDO');
ALTER DATABASE ADD STANDBY LOGFILE GROUP 6 ('+REDO');
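Whatever group numbers you choose, each standby log should be the same size as your online redo logs; you can check both views like this (a sketch to run on the primary):

```sql
-- online redo log groups and their sizes
SELECT group#, thread#, bytes/1024/1024 AS mb FROM v$log;

-- standby redo log groups created so far
SELECT group#, bytes/1024/1024 AS mb, status FROM v$standby_log;
```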


2nd: Back up your primary database using RMAN. There are many ways to do that; for example:

$ rman target /

Recovery Manager: Release 11.2.0.4.0 - Production on Wed Oct 7 14:46:41 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: PROD (DBID=1276674215)
RMAN> backup as compressed backupset database format '/backup/BKP_%D%M%Y_%I_%s_%U.bkp' include current controlfile;

After that, transfer your backup (and archives) to your standby database machine (to the same directory).
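For the copy itself, something like this works (hostname and paths are illustrative; keep the same directory layout so RMAN finds the pieces at the same path on the standby):

```shell
$ rsync -av /backup/ oracle@dataguard.com.br:/backup/
```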

3rd: Configure your listener.ora, adding the following settings.


$ vi $ORACLE_HOME/network/admin/listener.ora

SID_LIST_LISTENER =
  (SID_LIST=
    (SID_DESC=
      (GLOBAL_DBNAME=dataguard)
      (ORACLE_HOME=/u01/app/oracle/product/11.2.0/db_1/)
      (SID_NAME=dataguard))
    (SID_DESC=
      (GLOBAL_DBNAME=dataguard_DGMGRL)
      (ORACLE_HOME=/u01/app/oracle/product/11.2.0/db_1/)
      (SID_NAME=dataguard))
    (SID_DESC=
      (GLOBAL_DBNAME=dataguard_DGB)
      (ORACLE_HOME=/u01/app/oracle/product/11.2.0/db_1/)
      (SID_NAME=dataguard)))


Change the ORACLE_HOME value to your own. The important thing here is to configure three services: dataguard, dataguard_DGMGRL and dataguard_DGB (they are essential to the broker).
Restart your listener.
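A reload also picks up the change without dropping existing connections; either way, check afterwards that the three static entries show up (a sketch, assuming the default listener name):

```shell
$ lsnrctl reload
$ lsnrctl status    # should list dataguard, dataguard_DGMGRL and dataguard_DGB
```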

4th: Configure your tnsnames.ora file on both machines (production and standby environment) with your database settings.


$ vi $ORACLE_HOME/network/admin/tnsnames.ora

dataguard =
(DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = dataguard.com.br)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = dataguard)))

prod =
(DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = prod.com.br)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = prod)))
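Before going further, you can sanity-check name resolution from both machines (tnsping only proves the listener answers, not that logins work):

```shell
$ tnsping prod
$ tnsping dataguard
```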


5th: At your primary database, set the following parameters:

ALTER SYSTEM SET LOG_ARCHIVE_CONFIG='dg_config=(prod,dataguard)';
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='service="dataguard", LGWR ASYNC NOAFFIRM delay=0 optional compression=disable max_failure=0 max_connections=5 reopen=15 db_unique_name="dataguard", valid_for=(all_logfiles,primary_role)';
ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2='DEFER';
ALTER SYSTEM SET DB_FILE_NAME_CONVERT='+DATA/dataguard','+DATA/prod' scope=spfile;
ALTER SYSTEM SET LOG_FILE_NAME_CONVERT='+DATA/dataguard','+DATA/prod','+RECO/dataguard','+RECO/prod','+REDO/dataguard','+REDO/prod' scope=spfile;
ALTER SYSTEM SET STANDBY_FILE_MANAGEMENT=AUTO;
ALTER SYSTEM SET LOG_ARCHIVE_MIN_SUCCEED_DEST=1;


A brief explanation of these parameters:
  - The LOG_ARCHIVE_CONFIG parameter needs to contain the DB_UNIQUE_NAME of both the production and the standby database. In this case, our SIDs and DB_UNIQUE_NAMEs are equal.
  - The LOG_ARCHIVE_DEST_n parameter configures a series of attributes of our replication, such as whether the synchronization is asynchronous or synchronous (for more information, see the documentation). The important thing to know is that the Data Guard broker will synchronize its configuration with this parameter: if we change any property later, it will be reflected here.
  - The LOG_ARCHIVE_DEST_STATE_n parameter simply enables or disables the corresponding LOG_ARCHIVE_DEST_n configuration.
  - The DB_FILE_NAME_CONVERT parameter is very important to configure correctly. Be careful: wrong conversions can lead to a bunch of problems. Here we specify the source pattern and the target pattern for datafile names: in this example, datafile names under +DATA/dataguard are converted to +DATA/prod. Configuring this parameter is essential if we plan to switch over the environments (promote the standby to primary and demote the primary to standby).
  - The LOG_FILE_NAME_CONVERT parameter is like DB_FILE_NAME_CONVERT, except that here we specify the redo log file destinations. To be safe, the example above lists the +DATA and +RECO patterns as well (in case some redo log files ended up in the wrong directories, for example).
  - The STANDBY_FILE_MANAGEMENT parameter automates file management on the standby. Instead of you defining each datafile or redo file name, the database takes care of it. So, when we create a datafile in +DATA, we do not need to specify the full file path (like '+DATA/prod/datafile/datafile_01.dbf'); just say it will be created in '+DATA', and Oracle will create a file with a generated name in the right location (like '+DATA/prod/datafile/tablespace_name.123.4567891011').
  - The LOG_ARCHIVE_MIN_SUCCEED_DEST parameter defines the minimum number of destinations that must archive successfully before the current redo log file becomes available for reuse. This prevents data loss if something goes wrong. If you configure more than one standby, and consequently more LOG_ARCHIVE_DEST_n parameters, consider increasing this number.

Like I said, to better understand these parameters, check the Oracle documentation.
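You can double-check what was actually set on the primary before moving on (a sketch; v$parameter shows the in-memory values):

```sql
SELECT name, value
  FROM v$parameter
 WHERE name IN ('log_archive_config', 'log_archive_dest_2',
                'log_archive_dest_state_2', 'standby_file_management',
                'db_file_name_convert', 'log_file_name_convert');
```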


6th: Create your standby pfile from your primary database's spfile, or write a new pfile with the minimum parameters needed to start the instance. For example:


$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Wed Oct 7 17:52:26 2015

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options

SQL> create pfile = '/tmp/dataguard_pfile.ora' from spfile;

...

$ vi /tmp/dataguard_pfile.ora

*.db_name='prod'
*.memory_target=5G
*.processes = 500
*.audit_file_dest='/u01/app/oracle/admin/dataguard/adump'
*.audit_trail ='db'
*.db_block_size=8192
*.db_domain=''
*.db_recovery_file_dest='+RECO'
*.db_recovery_file_dest_size=100G
*.diagnostic_dest='/u01/app/oracle'
*.open_cursors=300
*.remote_login_passwordfile='EXCLUSIVE'
*.undo_tablespace='UNDOTBS1'
*.control_files = '+DATA/dataguard/control01.ctl', '+REDO/dataguard/control02.ctl'
*.compatible ='11.2.0.4'
*.log_archive_config='dg_config=(prod,dataguard)'
*.log_archive_dest_1='location=USE_DB_RECOVERY_FILE_DEST valid_for=(ALL_LOGFILES,ALL_ROLES)'
*.log_archive_dest_2='service="prod", LGWR ASYNC NOAFFIRM delay=0 optional compression=disable max_failure=0 max_connections=5 reopen=15 db_unique_name="prod", valid_for=(all_logfiles,primary_role)'
*.log_archive_dest_state_1='enable'
*.log_archive_dest_state_2='defer'
*.db_file_name_convert='+DATA/prod','+DATA/dataguard'
*.log_file_name_convert='+DATA/prod','+DATA/dataguard','+RECO/prod','+RECO/dataguard','+REDO/prod','+REDO/dataguard'
*.standby_file_management=auto
*.log_archive_min_succeed_dest=1

*.db_unique_name='dataguard'

Remember two things: your db_name MUST be exactly the same as in your production environment, and you need to create the audit_file_dest directory (in this case, /u01/app/oracle/admin/dataguard/adump), otherwise you will receive a warning.
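The audit directory from the pfile can be created on the standby machine like this:

```shell
$ mkdir -p /u01/app/oracle/admin/dataguard/adump
```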

7th: Move your standby pfile to your standby machine, and then start the new instance in NOMOUNT mode.


$ export ORACLE_SID=dataguard
$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Thu Oct 8 16:47:22 2015

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup nomount pfile='/home/oracle/dataguard_pfile.ora';
...
SQL> create spfile='+DATA/dataguard/spfiledataguard.ora' from pfile='/home/oracle/dataguard_pfile.ora';
File created.

SQL> shutdown immediate;
ORA-01507: database not mounted


ORACLE instance shut down.
SQL> startup nomount;



8th: Create the Oracle password file (orapw) for the SYS user on both environments. Use the same password as prod: the standby's SYS password must match the primary's for redo transport to authenticate.


$ orapwd file=$ORACLE_HOME/dbs/orapwdataguard password=prod_sys_password
$ orapwd file=$ORACLE_HOME/dbs/orapwprod password=prod_sys_password


Do this on both environments (your prod machine will probably already have an orapwprod file; there is no problem whether you overwrite it or not, as long as the SYS password stays the same).

9th: Duplicate (restore) your dataguard database. Use the command below:


$ rman target sys/prod_sys_password@prod auxiliary sys/dataguard_sys_password@dataguard

Recovery Manager: Release 11.2.0.4.0 - Production on Thu Oct 8 17:14:34 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: PROD (DBID=12345678910)
connected to auxiliary database: PROD (not mounted)

RMAN> DUPLICATE TARGET DATABASE FOR STANDBY NOFILENAMECHECK DORECOVER;

Starting Duplicate Db at 08-OCT-15

contents of Memory Script:
{
   set until scn  54045879909;
   restore clone standby controlfile;
}
executing Memory Script

executing command: SET until clause

Starting restore at 08-OCT-15

channel C1: restoring control file
channel C1: copied control file copy
input file name=/backup/controlfile_bkp.bkp
output file name=+DATA/dataguard/control01.ctl
output file name=+REDO/dataguard/control02.ctl
Finished restore at 08-OCT-15

contents of Memory Script:
{
   sql clone 'alter database mount standby database';
}
executing Memory Script

sql statement: alter database mount standby database
RMAN-05529: WARNING: DB_FILE_NAME_CONVERT resulted in invalid ASM names; names changed to disk group only.

contents of Memory Script:
{
   set until scn  54045879909;
   set newname for tempfile  1 to
 "+data";

...

   restore
   clone database
   ;
}
executing Memory Script

executing command: SET until clause

executing command: SET NEWNAME

...

renamed tempfile 1 to +data in control file

...

executing command: SET NEWNAME

...

channel C1: starting datafile backup set restore
channel C1: specifying datafile(s) to restore from backup set
channel C1: restoring datafile 00001 to +data

...

channel C1: reading from backup piece /backup/BKP_06102015_1276674215_98210_t2qj01fp_1_1.bkp

...

Finished restore at 08-OCT-15

contents of Memory Script:
{
   switch clone datafile all;
}
executing Memory Script

...

released channel: C1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 10/14/2015 01:40:41
RMAN-05501: aborting duplication of target database
RMAN-03015: error occurred in stored script Memory Script
RMAN-20506: no backup of archived log found
RMAN-06053: unable to perform media recovery because of missing log

RMAN-06025: no backup of archived log for thread 1 with sequence 109545 and starting SCN of 54055457212 found to restore


Do not be afraid: those final RMAN errors are "normal" (unfortunately). They mean you do not have a backup of the archived log with that SCN, and it would be impossible to have it anyway, since that log was generated after the backup finished (to understand why, read more about archived logs and how they work).
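To see exactly how far the restore got, you can ask the standby controlfile which sequences it knows about (a sketch; run it on the standby with sqlplus / as sysdba):

```sql
-- highest archived log sequence known to the standby, per thread
SELECT thread#, MAX(sequence#) AS last_seq
  FROM v$archived_log
 GROUP BY thread#;
```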

10th: When you run the DUPLICATE DATABASE command with the DORECOVER clause, all archived logs are applied after the restore. But this can fail if something goes wrong. So, just to ensure that at least all the archived logs generated alongside the backup were applied, let's do it manually.


$ rman target /

Recovery Manager: Release 11.2.0.4.0 - Production on Thu Oct 8 22:51:52 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: PROD (DBID=1276674215, not open)

RMAN> catalog start with '/backup/archive';

Starting implicit crosscheck backup at 14-OCT-15
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=3395 device type=DISK
Crosschecked 173 objects
Finished implicit crosscheck backup at 14-OCT-15

Starting implicit crosscheck copy at 14-OCT-15
using channel ORA_DISK_1
Crosschecked 12 objects
Finished implicit crosscheck copy at 14-OCT-15

searching for all files in the recovery area
cataloging files...
no files cataloged

searching for all files that match the pattern /backup/archive

List of Files Unknown to the Database
=====================================
File Name: /backup/archive/ARCH_PROD_1276674215_99307_1_vbqjbmqr.BKP

...

Do you really want to catalog the above files (enter YES or NO)? YES
cataloging files...

cataloging done

List of Cataloged Files
=======================
File Name: /backup/archive/ARCH_PROD_1276674215_99307_1_vbqjbmqr.BKP

...

RMAN> recover database;

Starting recover at 14-OCT-15
using channel ORA_DISK_1

starting media recovery
channel ORA_DISK_1: starting archived log restore to default destination
channel ORA_DISK_1: restoring archived log
channel ORA_DISK_1: reading from backup piece /backup/archive/ARCH_PROD_1276674215_99508_1_5kqjdlul.BKP
channel ORA_DISK_1: piece handle=/backup/archive/ARCH_PROD_1276674215_99508_1_5kqjdlul.BKP tag=BACKUP_ARCHIVELOG_PROD

channel ORA_DISK_1: restored backup piece 1

...

unable to find archived log
archived log thread=1 sequence=109579
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 10/15/2015 21:20:06

RMAN-06054: media recovery requesting unknown archived log for thread 1 with sequence 109579 and starting SCN of 54055767606
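This final RMAN-06054 is the same expected situation as before: recovery is asking for a log that has not been generated or backed up yet. Once redo transport is running, you can confirm there is no real gap (a sketch; an empty result means no gap):

```sql
SELECT thread#, low_sequence#, high_sequence# FROM v$archive_gap;
```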


11th: Now we need to create our broker configuration. On your primary environment, run the dgmgrl command to access the broker and create a new configuration with the following commands:

$ dgmgrl /
DGMGRL for Linux: Version 11.2.0.4.0 - 64bit Production

Copyright (c) 2000, 2009, Oracle. All rights reserved.

Welcome to DGMGRL, type "help" for information.
Connected.
DGMGRL> create configuration MY_CONFIG as primary database is PROD connect identifier is PROD;

Configuration "MY_CONFIG" created with primary database "PROD"

DGMGRL> add database DATAGUARD as connect identifier is DATAGUARD maintained as physical;
Database "DATAGUARD" added
DGMGRL> show database DATAGUARD;

Database - DATAGUARD

  Role:            PHYSICAL STANDBY
  Intended State:  OFFLINE
  Transport Lag:   (unknown)
  Apply Lag:       (unknown)
  Apply Rate:      (unknown)
  Real Time Query: OFF
  Instance(s):
    DATAGUARD

Database Status:
DISABLED

DGMGRL> enable database DATAGUARD;
Enabled.
DGMGRL> edit database PROD set state='transport-on';
Succeeded.
DGMGRL> edit database DATAGUARD set state='apply-on';
Succeeded.
DGMGRL> show database verbose DATAGUARD;

Database - DATAGUARD

  Role:            PHYSICAL STANDBY
  Intended State:  APPLY-ON
  Transport Lag:  
5 days 16 hours 32 minutes 12 seconds (computed 1 second ago)
  Apply Lag:       5 days 20 hours 12 minutes 3 seconds (computed 1 second ago)
  Apply Rate:      (unknown)
  Real Time Query: ON
  Instance(s):
    DATAGUARD

  Database Warning(s):
    ORA-16855: transport lag has exceeded specified threshold

  Properties:
    DGConnectIdentifier             = 'dataguard'
    ObserverConnectIdentifier       = ''
    LogXptMode                      = 'ASYNC'
    DelayMins                       = '0'
    Binding                         = 'OPTIONAL'
    MaxFailure                      = '0'
    MaxConnections                  = '5'
    ReopenSecs                      = '15'
    NetTimeout                      = '900'
    RedoCompression                 = 'DISABLE'
    LogShipping                     = 'ON'
    PreferredApplyInstance          = ''
    ApplyInstanceTimeout            = '0'
    ApplyParallel                   = 'AUTO'
    StandbyFileManagement           = 'auto'
    ArchiveLagTarget                = '0'
    LogArchiveMaxProcesses          = '10'
    LogArchiveMinSucceedDest        = '1'
    DbFileNameConvert               = '+DATA/prod/, +DATA/dataguard/'
    LogFileNameConvert              = '+DATA/prod/, +DATA/dataguard/, +RECO/prod/, +RECO/dataguard/, +REDO/prod/, +REDO/dataguard/'
    FastStartFailoverTarget         = ''
    InconsistentProperties          = '(monitor)'
    InconsistentLogXptProps         = '(monitor)'
    SendQEntries                    = '(monitor)'
    LogXptStatus                    = '(monitor)'
    RecvQEntries                    = '(monitor)'
    ApplyLagThreshold               = '0'
    TransportLagThreshold           = '300'
    TransportDisconnectedThreshold  = '30'
    SidName                         = 'dataguard'
    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=dataguard.com.br)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=dataguard_DGMGRL)(INSTANCE_NAME=dataguard)(SERVER=DEDICATED)))'
    StandbyArchiveLocation          = 'USE_DB_RECOVERY_FILE_DEST'
    AlternateLocation               = ''
    LogArchiveTrace                 = '0'
    LogArchiveFormat                = 'arch_%t_%s_%r.dbf'
    TopWaitEvents                   = '(monitor)'

Database Status:
WARNING


Warning: if you get the error "ORA-01033: ORACLE initialization or shutdown in progress" right after trying to add your database to the configuration, it means your standby database is not open. Start it up (open read only) and then retry.

If you get the error "ORA-01017: invalid username/password; logon denied" right after trying to add your database to the configuration, it means your orapw password file is wrong. You can recreate it (8th step), or simply copy the orapw file from the primary environment to the standby environment (scp or rsync).

In my case, the WARNING database status is normal: I have just configured the broker, and the gap between the applied backup (plus applied archives) and my production database is 5 days and 20 hours. Since my transport lag threshold is 300 seconds (5 minutes), it has to raise an alarm.


There are some properties I like to configure manually, such as:


  edit database DATAGUARD set property LogXptMode='ASYNC';
  edit database DATAGUARD set property MaxConnections = 5;
  edit database DATAGUARD set property ReopenSecs = 15;
  edit database DATAGUARD set property TransportDisconnectedThreshold = 30;
  edit database DATAGUARD set property ArchiveLagTarget=120;
  edit database DATAGUARD set property TransportLagThreshold=300;
  edit database DATAGUARD set property NetTimeout=900;
  edit database DATAGUARD set property StaticConnectIdentifier='(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=dataguard.com.br)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=dataguard_DGMGRL)(INSTANCE_NAME=dataguard)(SERVER=DEDICATED)))';


To learn more about these properties, check the Oracle documentation.
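Once the properties are set, it is worth confirming that the broker sees the configuration as a whole as healthy (a sketch of the checks; eventually you want to see SUCCESS):

```
DGMGRL> show configuration;
DGMGRL> show database DATAGUARD;
```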

12th: If you don't have Oracle Grid Infrastructure configured, skip this step.

Now we are going to register the standby with the Server Control Utility (srvctl), running the following commands as the oracle user:

$ srvctl add database -d dataguard -o $ORACLE_HOME -r PHYSICAL_STANDBY -p "+DATA/dataguard/spfiledataguard.ora" -a "DATA","RECO","REDO"
$ srvctl modify database -d dataguard -s "open,read only"
$ srvctl config database -d dataguard
Database unique name: dataguard
Database name:
Oracle home: /u01/app/oracle/product/11.2.0/db_1
Oracle user: oracle
Spfile: +DATA/dataguard/spfiledataguard.ora
Domain:
Start options: open,read only
Stop options: immediate
Database role: PHYSICAL_STANDBY
Management policy: AUTOMATIC
Database instance: dataguard
Disk Groups: DATA,RECO,REDO
Services:


Now you can add services if you want, but remember: services are tied to your database role (PRIMARY, PHYSICAL_STANDBY, LOGICAL_STANDBY or SNAPSHOT_STANDBY), e.g.:

$ srvctl add service -d dataguard -s srv_dataguard -l PHYSICAL_STANDBY
$ srvctl config service -d dataguard -s srv_dataguard
Service name: srv_dataguard
Service is enabled
Cardinality: SINGLETON
Disconnect: false
Service role: PHYSICAL_STANDBY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: NONE
Failover method: NONE
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: NONE
Edition:
$ srvctl start service -d dataguard -s srv_dataguard


In this case, this service will be only available when your database is in physical standby mode. If you open it to snapshot standby, this service will not be available anymore.
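If you are unsure which role the instance currently has (and therefore which services will start), you can check it directly (a sketch):

```sql
SELECT database_role, open_mode FROM v$database;
```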

And it's done! You can use the broker to check whether your archived logs are being transported and applied, for example:


dgmgrl /
DGMGRL for Linux: Version 11.2.0.4.0 - 64bit Production

Copyright (c) 2000, 2009, Oracle. All rights reserved.

Welcome to DGMGRL, type "help" for information.
Connected.
DGMGRL> show database dataguard;

Database - dataguard

  Role:            PHYSICAL STANDBY
  Intended State:  APPLY-ON
  Transport Lag:   2 minutes 15 seconds (computed 1 second ago)
  Apply Lag:       0 seconds (computed 1 second ago)
  Apply Rate:      545.00 KByte/s
  Real Time Query: ON
  Instance(s):
    dataguard

Database Status:
SUCCESS


Considerations:
* Always check your alert_$ORACLE_SID.log file; it gives you a good picture of what is happening with your replication (on both nodes, i.e., the standby and the primary database).
* Whenever you change a parameter that is shared with the broker (e.g. db_file_name_convert), you need to change the equivalent broker property as well (in this case, DbFileNameConvert); otherwise your broker database status will be set to WARNING, and some replication can fail after, for example, a shutdown and startup.
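For example, a convert change has to be mirrored in both places, roughly like this (the values shown are the ones used in this tutorial):

```
SQL> ALTER SYSTEM SET DB_FILE_NAME_CONVERT='+DATA/prod','+DATA/dataguard' SCOPE=SPFILE;

DGMGRL> edit database DATAGUARD set property DbFileNameConvert = '+DATA/prod/, +DATA/dataguard/';
```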
