Monday, October 17, 2011

Data Protection Mode In Data Guard

Data Guard protection modes are simply a set of rules that the primary database must adhere to when running in a Data Guard configuration. A protection mode is only set on the primary database and defines the way Oracle Data Guard will maximize a Data Guard configuration for performance, availability, or protection in order to achieve the maximum amount of allowed data loss that can occur when the primary database or site fails

A Data Guard configuration will always run in one of the three protection modes listed above. Each of the three modes provide a high degree of data protection; however they differ with regards to data availability and performance of the primary database. When selecting a protection mode, always consider the one that best meets the needs of your business. Carefully take into account the need to protect the data against any loss vs. availability and performance expectations of the primary database

Data Guard can support multiple standby databases in a single configuration, they may or may not have the same protection mode settings depending on our requirements. The protection modes are 

1.) Maximum Performance  
2.) Maximum Availability    
3.) Maximum Protection      

1.)  Maximum Performance    This is the default mode, we get the highest performance but the lowest protection. This mode requires ASYNC redo transport so that the LGWR process never waits for acknowledgment from  the standby database for maximum performance.How much data we lose depends on the redo rate and how well our network can handle the amount of redo also known as transport lag. Even if we have a zero lag time we still will lose some data at fail-over time .

We can have up to 9 physical standby database in oracle 10g and 30 in oracle 11g and we will use the Asynchronous transport (ASYNC) with no affirmation of the standby I/O (NOAFFIRM). We can use this anywhere in the world but bear in mind the network latency and making sure it can support our redo rate .While it is not mandatory to have standby redo logs (SRL) in this mode, it is advise to do so. The SRL files need to be the same size as the online redo log files (ORL) . 

The following table describes the attributes that should be defined for the LOG_ARCHIVE_DEST_n initialization parameter for the standby database destination to participate in Maximum Performance mode. 
For example :   log_archive_dest_2='service=res ARCH  NOAFFIRM'        or
                       log_archive_dest_2='service=red LGWR ASYNC NOAFFIRM'

2.)  Maximum Availability   : Its first priority is to be available and  its second priority is zero loss protection, thus it requires the SYNC redo transport. This is the middle middle of the range, it offers maximum protection but not at the expense of causing problems with the primary database. However we must remember that it is possible to lose data, if our network was out for a period of time and the standby has not had a chance to re-synchronize and the primary went down then there will be data loss.

Again we can have up to  9 physical standby database in oracle 10g and 30 in oracle 11g  and we will use Synchronous transport (SYNC) with affirmation of the standby I/O (AFFIRM) and SRL files. In the event that the standby server is unavailable the primary will wait the specified time in the NET_TIMEOUT parameter before giving up on the standby server and allowing the primary to continue to process. Once the connection has been re-established the primary will automatically resynchronize the standby database.

When the NET_TIMEOUT expires the LGWR process disconnects from the LNS process, acknowledges the commit and proceeds without the standby, processing continues until the current ORL is complete and the LGWR cycles into a new ORL, a new LNS process is started and an attempt to connect to the standby server is made, if it succeeds the new ORL is sent as normal, if not then LGWR disconnects again until the next log switch, the whole process keeps repeating at every log switch, hopefully the standby database will become available at some point in time. Also in the background if we remember if any archive logs have been created during this time the ARCH process will continually ping the standby database waiting until it come online.

We might have noticed there is a potential loss of data if the primary goes down and the standby database has also been down for a period of time and here has been no resynchronization, this is similar to Maximum Performance but we do give the standby server a chance to respond using the timeout. The minimum requirements are described in the following table : 

For example  :   log_archive_dest_2='services=red LGWR SYNC AFFIRM



3.) Maximum Protection   : This offers the maximum protection even at the expense of the primary database, there is no data loss.  This mode uses the SYNC redo transport and the primary will not issue a commit acknowledgment to the application unless it receives an acknowledgment from at least one standby database, basically the primary will stall and eventually abort preventing any unprotected commits from occurring. This guarantees complete data protection, in this setup it is advised to have two separate standby databases at different locations with no Single Point Of Failures (SPOF's), they should not use the same network infrastructure as this would be a SPOF.


The minimum requirements are described in the following following table

For Example :   log_archive_dest_2='service=red LWGR SYNC AFFIRM'



Finally the protection mode will be changed from its default of Maximum Performance to Maximum Protection.The protection modes run in the order from highest (most data protection) to the lowest (least data protection)

Each of the Data Guard data protection modes require that at least one standby database in the configuration meet the minimum set of requirements listed in the table below.

















Reference  ::  http://www.datadisk.co.uk
                       http://www.idevelopment.info

For more detail   Click Here


Enjoy    :-) 


Data Guard Architecture Oracle 11g Part-III

The redo data transmitted from the primary database is written to the standby redo log on the standby database. Apply services automatically apply the redo data on the standby database to maintain consistency with the primary database. It also allows read-only access to the data.The main difference between physical and logical standby databases is the manner in which apply services apply the archived redo data.

There are two methods in which to apply redo  i.e,
1.)  Redo Apply (physical standby)     and
2.) SQL Apply   (logical standby).

They both have the same common features:
  • Both synchronize the primary database
  • Both can prevent modifications to the data
  • Both provide a high degree of isolation between the primary and the standby database
  • Both can quick transition the standby database into the primary database
  • Both offer a productive use of the standby database which will have no impact on the primary database

1.) Redo Apply (Physical Standby) :  Redo apply is basically a block-by-block physical replica of the primary database, redo apply uses media recovery to read records from the SRL into memory and apply change vectors directly to the standby database. Media recovery does parallel recovery for very high performance, it comprises a media recovery coordinator (MRP0) and multiple parallel apply rocesses(PR0?). The coordinator manages the recovery session, merges the redo by SCN from multiple instances (if in a RAC environment) and parses redo into change mappings partitioned by the apply process. The apply processes read data blocks, assemble redo changes from mappings and then apply redo changes to the data blocks.

This method allows us to be able to use the standby database in a read-only fashion, Active Data Guard solves the read consistency problem in previous releases by the use of a "query" SCN. The media recovery process on the standby database advances the query SCN after all dependant changes in a transaction have been fully applied. The query SCN is exposed to the user via the current_scn column of the v$databaseview  Read-only use will only be able to see data up to the query SCN and thus the standby database can be open in read-only mode while the media recovery is active, which make this an ideal reporting database.


We can use SYNC or ASYNC and is isolated from I/O physical corruptions, corruption-detections checks occur at the following key interfaces:

On the primary during redo transport - LGWR, LNS, ARCH use the DB_UTRA_SAFE parameter
On the standby during redo apply       - RFS, ARCH, MRP, DBWR use the DB_BLOCK_CHECKSUM and DB_LOST_WRITE_PROTECT parameters .

If Data Guard detects any corruption it will automatically fetch new copies of the data from the primary using gap resolution process in the hope of that the original data is free of corruption.The key features of this solution are
  • Complete application and data transparency - no data type or other restrictions
  • Very high performance, least managed complexity and fewest moving parts
  • End-to-end validation before applying, including corruptions due to lost writes
  • Able to be utilized for up-to-date read-only queries and reporting while providing DR
  • Able to execute rolling database upgrades beginning with Oracle Database 11g 

2.) SQL Apply (Logical Standby)  SQL apply uses the logical standby process (LSP) to coordinate the apply of changes to the standby database. SQL apply requires more processing than redo apply, the processes that make up SQL apply, read the SRL and "mine" the redo by converting it to logical change records and then building SQL transactions and applying SQL to the standby database and because there are more moving parts it requires more CPU, memory and I/O then redo apply .

SQL apply does not support all data types, such as XML in object relational format and Oracle supplied types such as Oracle spatial, Oracle inter-media and Oracle text .

The benefits to SQL apply is that the database is open to read-write while apply is active, while we can not make any changes to the replica data we can insert, modify and delete data from local tables and schemas that have been added to the database, we can even create materialized views and local indexes. This makes it ideal for reporting tools, etc to be used.

The key features of this solution are :
  • A standby database that is opened for read-write while SQL apply is active
  • A guard setting that prevents the modification of data that is being maintained by the SQL apply
  • Able to execute rolling database upgrades beginning with Oracle Database 11g using the KEEP IDENTITY clause


Click Here for Data Guard Architecture Oracle 11g Part-IV




Enjoy    :-)



Data Guard Architecture Oracle 11g Part-II

LNS (log-write network-server) and ARCH (archiver) processes running on the primary database select archived redo logs and send them to the standby database, where the RFS (remote file server) background process within the Oracle instance performs the task of receiving archived redo-logs originating from the primary database .

The LNS process support two modes  as
1.) Synchronous    and
2.) Asynchronous.

1.) Synchronous Mode :  Synchronous transport (SYNC) is also referred to as "zero data loss" method because the LGWR is not allowed to acknowledge a commit has succeeded until the LNS can confirm that the redo needed to recover the transaction has been written at the standby site. In the below diagram, the phases of a transaction are





The user commits a transaction creating a redo record in the SGA, the LGWR reads the redo record from the log buffer and writes it to the online redo log file and waits for confirmation from the LNS. The LNS reads the same redo record from the buffer and transmits it to the standby database using Oracle Net Services, the RFS receives the redo at the standby database and writes it to the SRL. When the RFS receives a write complete from the disk, it transmits an acknowledgment back to the LNS process on the primary database which in turns notifies the LGWR that the transmission is complete, the LGWR then sends a commit acknowledgment to the user.

This setup really does depend on network performance and can have a dramatic impact on the primary databases, low latency on the network will have a big impact on response times. The impact can be seen in the wait event "LNS wait on SENDREQ" found in the v$system_event dynamic performance view.

2.) Asynchronous  ModeAsynchronous transport (ASYNC) is different from SYNC in that it eliminates the requirement that the LGWR waits for a acknowledgment from the LNS, creating a "near zero" performance on the primary database regardless of distance between the primary and the standby locations. The LGWR will continue to acknowledge commit success even if the bandwidth prevents the redo of previous transaction from being sent to the standby database immediately. If the LNS is unable to keep pace and the log buffer is recycled before the redo is sent to the standby, the LNS automatically transitions to reading and sending from the log file instead of the log buffer in the SGA. Once the LNS has caught up it then switches back to reading directly from the buffer in the SGA .

The log buffer ratio is tracked via the view X$LOGBUF_READHIST a low hit ratio indicates that the LNS is reading from the log file instead of the log buffer, if this happens try increasing the log buffer size.

The drawback with ASYNC is the increased potential for data loss, if a failure destroys the primary database before the transport lag is reduced to zero, any committed transactions that are part of the transport lag are lost. So again make sure that the network bandwidth is adequate and that get the lowest latency possible.


A log file gap occurs whenever a primary database continues to commit transactions while the LNS process has ceased transmitting redo to the standby database (network issues). The primary database continues writing to the current log file, fills it, and then switches to a new log file, then archiving kicks in and archives the file, before we know it there are a number of archive and log files that need to be processed by the the LNS basically creating a large log file gap.

Data Guard uses an ARCH process on the primary database to continuously ping the standby database during the outage, when the standby database eventually comes back, the ARCH process queries the standby control file (via the RFS process) to determine the last complete log file that the standby received from the primary. The ARCH process will then transmit the missing files to the standby database using additional ARCH processes, at the very next log switch the LNS will attempt and succeed in making a connection to the standby database and will begin transmitting the current redo while the ACH processes resolve the gap in the background. Once the standby apply process is able to catch up to the current redo logs the apply process automatically transitions out of reading the archive redo logs and into reading the current SRL. The whole process can be seen in the diagram below  :





















Click Here for Data Guard Architecture Oracle 11g Part-III


Enjoy      :-)


Data Guard Architecture Oracle 11g Part-I

I have decided to post the Architecture of the Standby Database, although there are lots of  stuff on the Internet but most of them are lengthy and are not so juicy . I have read a good notes on Standby Database  Architecture and further decided to post it . Though, I have modified few topics to make it more clear , juicy and interesting .Hope you all find helpful  and enjoy this after reading. 

Oracle Data Guard is the most effective and comprehensive data availability, data protection and disaster recovery solution for enterprise databases. It provides a method for customers to actively utilize their disaster recovery configuration for read-only queries and reports while it is in standby role. Additionally, a standby database can be used to offload backups from production databases or for Quality Assurance and other test activities that require read-write access to an exact replica of production. These capabilities are unique to Oracle .

Oracle   Data  Guard is  the  management,  monitoring,  and  automation  software  infrastructure that creates,maintains, and monitors one or more standby databases  to  protect  enterprise  data  from  failures, disasters, errors, and corruptions.Data Guard is basically a ship redo and then apply redo, as we know redo is the information needed to recover a database transaction. A production database referred to as a primary database transmits redo to one or more independent replicas referred to as standby databases. Standby databases are in a continuous state of recovery, validating and applying redo to maintain synchronization with the primary database. A  standby database will also automatically re-synchronize if it becomes temporary disconnected to the primary due to power outages, network problems, etc.

The diagram below shows the overview of Data Guard, firstly the redo transport services transmits redo data from the primary to the standby as it is generated, secondly services apply the redo data and update the standby database files, thirdly independently of Data Guard the database writer process updates the primary database files and lastly Data Guard will automatically re-synchronize the standby database following power or network outages using redo data that has been archived at the primary.






Redo records contain all the information needed to reconstruct changes made to a database. During  recovery the database will read the  change vectors in the redo  records and apply  the changes  to  the relevant blocks.Redo  records are  buffered  in a circular fashion in  the redo log  buffer  of the SGA, the  log  writer  process (LGWR) is  the  background process  that handles redo log buffer  management. The LGWR at specific times writes redo log entries into a sequential file (online redo log file) to free space in the buffer, the LGWR writes the following.

1.) A commit record :   When ever a transaction is committed the LGWR writes the transaction redo records from the buffer to the log file and assigns a system change number (SCN), only when this process is complete is  the transaction said to be committed.

2.) Redo log buffers :  If the redo log becomes a third full or if 3 seconds have passed sine the last time the LGWR wrote to the log file, all redo entries in the buffer will be written to the log file. This means  that redo records can be written to the log file before the transaction has been committed and if  necessary media recovery will rollback these changes using undo that is also part of the redo entry.

Remember that the LGWR can write to the log file using "group" commits, basically entire list of redo entries of waiting transactions (not yet committed) can be written to disk in one operation, thus reducing I/O. Even through the data buffer cache has not been written to disk, Oracle guarantees that no transaction will be lost due to the redo log having successfully saved any changes.

Data Guard Redo Transport Services  coordinate the  transmission of redo from  the primary  database to the standby database, at the same time the LGWR  is  processing redo, a separate Data Guard process called the Log Network Server (LNS) is reading from the redo buffer in the SGA and passes redo to Oracle Net Services from transmission to a standby database, it is possible to direct the redo data to nine standby databases, we can also use Oracle RAC and they don't all need to be a RAC setup. The process Remote File Server  (RFS) receives the redo from LNS and writes it to a sequential file called a standby redo log file (SRL).


Click Here for Data Guard Architecture Oracle 11g Part-II


Enjoy   :-)


Friday, October 14, 2011

ORA-01000: Maximum Open Cursors Exceeded


Once our client report that they are facing error  “ORA-01000: maximum open cursors exceeded” while  running a application . As it seems from error that the error is related to cursors limits i.e, open cursors are exceeding from it's defaults values. To solving this issue ,let's have a look on the open_cursor i.e, what is open_cursor and how it impact into  database.

Open cursors take up space in the shared pool, in the library cache. To keep a renegade session from filling up the library cache, or clogging the CPU with millions of parse requests, we set the parameter OPEN_CURSORS.

OPEN_CURSORS sets the maximum number of cursors each session can have open, per session. For example, if OPEN_CURSORS is set to 1000, then each session can have up to 1000 cursors open at one time. If a single session has OPEN_CURSORS # of cursors open, it will get an ora-1000 error when it tries to open one more cursor.
To solve this issue we can either increase the no. of open_cursors or kill the inactive session which has open the large number of cursors. Now we connect to the database and check the open_cursors limits :

C:\>sqlplus sys/xxxx@noida as sysdba
SQL*Plus: Release 11.2.0.1.0 Production on Fri Oct 14 18:05:30 2011
Copyright (c) 1982, 2010, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> sho parameter open_cursors
NAME                          TYPE          VALUE
-----------------        ---------      -----------
open_cursors              integer            300

Since the no. of open_cursors is 300. So we list the top 10 sessions which are currently opening most cursors

SQL> select * from ( select ss.value, sn.name, ss.sid
 from v$sesstat ss, v$statname sn
 where ss.statistic# = sn.statistic#
 and sn.name like '%opened cursors current%'
 order by value desc) where rownum < 11 ;

VALUE           NAME                              SID
-----            -----------------                ----------
300         opened cursors current          131
300        opened cursors current          125
300        opened cursors current           143
300        opened cursors current          149
300        opened cursors current           17
300        opened cursors current           132
300        opened cursors current           23
300        opened cursors current           1
300        opened cursors current           9
300        opened cursors current          10
10 rows selected.

Now we check what make session 131 open to many cursors?

SQL>  select sid, status, event, seconds_in_wait state "wait(s)" , blocking_session "blk_sesn", prev_sql_id  "SQL_ID"  from v$session where sid=131;

SID  STATUS      EVENT                  WAIT(s)  STATE    BLK_SESN    SQL_ID
---   ----------   ------------------ --------     ---------   ---------    ---------------
131 INACTIVE rdbms ipc message   8745     WAITING                 6mqvntr9ytnga


Since the status of the cursor is INACTIVE so we can we kill the session by using the below command :

SQL> alter system kill session 'sid,serial#' immediate;

The other alternatives is to increase the no. of the open_cursors parameter as :

SQL> alter system set open_cursors=1500 scope=spfile;

In my case i have increased the values of the open_cursors and issue got solved.


Enjoy     :-)