Convert a Physical Standby to a Snapshot Standby Database in Oracle Data Guard

A snapshot standby database still receives redo data from the primary, but it does not apply that redo until it is converted back to a physical standby. Keep in mind that a snapshot standby database cannot be the target of a switchover or failover; it must be converted back to a physical standby before performing a role transition. Flashback Database technology is used in the conversion process, so the fast recovery area (formerly the flash recovery area) must be configured.

This document will detail the steps to manually convert a physical standby to a snapshot standby.
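Before starting, you can confirm that the fast recovery area is configured on the standby (a quick check; the values shown will depend on your environment):

SQL> show parameter db_recovery_file_dest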

Convert the Physical Standby Database into a Snapshot Standby Database

On the standby database, stop redo apply.

SQL> alter database recover managed standby database cancel;

Database altered.

SQL> 
Next, convert the standby database to a snapshot standby.


SQL> alter database convert to snapshot standby;

Database altered.

SQL> 
Once the conversion is complete, all that is left is to open the database.


SQL> alter database open;

Database altered.

SQL> 
You can verify the role change by querying DATABASE_ROLE in V$DATABASE.


SQL> select database_role from v$database;

DATABASE_ROLE
----------------
SNAPSHOT STANDBY

SQL> 
While the standby is in snapshot standby mode, you are free to run transactions against it.
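For example, you could create and populate a throwaway test table on the snapshot standby (an illustration only; the table name is made up):

SQL> create table test_snap (id number, note varchar2(50));

Table created.

SQL> insert into test_snap values (1, 'updated on the snapshot standby');

1 row created.

SQL> commit;

Commit complete.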

While in snapshot standby mode, the database still continues to receive redo data from the primary, but it does not apply it. You can verify redo transport by switching logs on the primary and checking the alert log on the standby.
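For example, force a log switch on the primary:

SQL> alter system switch logfile;

System altered.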


Thu Jun 17 11:15:41 2010
RFS[6]: Selected log 5 for thread 1 sequence 981 dbid 459961910 branch 719914169
Thu Jun 17 11:15:41 2010
Archived Log entry 1513 added for thread 1 sequence 980 ID 0x1b7c5492 dest 2:
RFS[6]: Selected log 4 for thread 1 sequence 982 dbid 459961910 branch 719914169
Thu Jun 17 11:15:42 2010
When the standby was converted to a snapshot standby, a guaranteed restore point was created. You can see this in the alert log for the standby.


Thu Jun 17 09:44:44 2010
RVWR started with pid=30, OS id=9171
Allocated 3981120 bytes in shared pool for flashback generation buffer
Created guaranteed restore point SNAPSHOT_STANDBY_REQUIRED_06/17/2010 09:44:44
krsv_proc_kill: Killing 3 processes (all RFS)
Begin: Standby Redo Logfile archival
End: Standby Redo Logfile archival

When the snapshot standby is converted back into a physical standby, this restore point is used to flash the standby back to its state prior to the conversion. Any operation performed on the snapshot standby that cannot be reversed with Flashback Database will prevent it from being converted back to a physical standby.
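You can also confirm the guaranteed restore point from SQL on the standby; the name will match the one recorded in the alert log:

SQL> select name, guarantee_flashback_database from v$restore_point;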

Convert the Snapshot Standby Database back to a Physical Standby Database

Shut down the snapshot standby database and bring it back up in the mount state.


SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount 
ORACLE instance started.

Total System Global Area  830930944 bytes
Fixed Size                  2217912 bytes
Variable Size             603981896 bytes
Database Buffers          222298112 bytes
Redo Buffers                2433024 bytes
Database mounted.
SQL> 

Next convert the snapshot to a physical standby.

SQL> alter database convert to physical standby;

Database altered.

SQL>

In the standby alert log you can see that Flashback restore completed and the restore point was dropped.


Thu Jun 17 11:48:29 2010
alter database convert to physical standby
ALTER DATABASE CONVERT TO PHYSICAL STANDBY (standby)
krsv_proc_kill: Killing 4 processes (all RFS)
Flashback Restore Start
Flashback Restore Complete
Stopping background process RVWR
Deleted Oracle managed file /u01/app/flash_recovery_area/STANDBY/flashback/o1_mf_61nf6w8g_.flb
Deleted Oracle managed file /u01/app/flash_recovery_area/STANDBY/flashback/o1_mf_61ngk05r_.flb
Guaranteed restore point  dropped
Clearing standby activation ID 461943091 (0x1b88b133)
The primary database controlfile was created using the
'MAXLOGFILES 16' clause.
There is space for up to 13 standby redo logfiles
Use the following SQL commands on the standby database to create
standby redo logfiles that match the primary database:
ALTER DATABASE ADD STANDBY LOGFILE 'srl1.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl2.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl3.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl4.f' SIZE 52428800;
Completed: alter database convert to physical standby

Shut down the database and bring it back to the mount state.


SQL> shutdown immediate
ORA-01507: database not mounted


ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.

Total System Global Area  830930944 bytes
Fixed Size                  2217912 bytes
Variable Size             603981896 bytes
Database Buffers          222298112 bytes
Redo Buffers                2433024 bytes
Database mounted.
SQL> 
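If redo apply does not resume automatically (for example, when the configuration is not managed by the Data Guard broker), you can restart it manually:

SQL> alter database recover managed standby database using current logfile disconnect;

Database altered.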
If you take a look at the alert log, you will see that the archive logs shipped while the standby was a snapshot standby are now applied.


Media Recovery Log /u01/app/oracle/oradata/standby/arch/1_970_719914169.dbf
Media Recovery Log /u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_971_719914169.dbf
Media Recovery Log /u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_972_719914169.dbf
Media Recovery Log /u01/app/oracle/oradata/standby/arch/1_973_719914169.dbf
Media Recovery Log /u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_974_719914169.dbf
Media Recovery Log /u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_975_719914169.dbf
Media Recovery Log /u01/app/oracle/product/11.2.0/dbhome_1/dbs/arch1_976_719914169.dbf
Media Recovery Log /u01/app/oracle/oradata/standby/arch/1_977_719914169.dbf

Using a snapshot standby, you can temporarily leverage your standby for testing or other special purposes while still protecting your primary database.

Oracle Data Guard Switchover and Failover

1. [PRIMARY] Switch log file on primary database.

SQL> alter system switch logfile;
2. [PRIMARY] Check switchover status before switching database.

SQL> select switchover_status from v$database;
You should see “TO_STANDBY” as the result.

3. [PRIMARY] Switch primary database to standby database.

SQL> alter database commit to switchover to physical standby with session shutdown;

SQL> shutdown immediate;

SQL> startup nomount;

SQL> alter database mount standby database;
4. [PRIMARY] Defer redo shipping to the standby, since the standby database has not yet been switched to the primary role.

SQL> alter system set log_archive_dest_state_2=defer;
5. [Standby] Switch standby database to primary. Check switchover status before switching database.

SQL> select switchover_status from v$database;
You should see “TO_PRIMARY” as the result. Now let’s switch:

SQL> alter database commit to switchover to primary;

SQL> shutdown immediate;

SQL> startup;
The switchover process is now successfully complete.
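At this point you can verify the new role on each side, and on the new primary make sure the redo transport destination pointing at the new standby is enabled (the destination number is an assumption; adjust it to your configuration):

SQL> select database_role, open_mode from v$database;

SQL> alter system set log_archive_dest_state_2=enable;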
6. [PRIMARY] Start the real-time redo apply process (the original primary is now the standby).

SQL> recover managed standby database using current logfile disconnect;
Finally, let’s open the database in “Read Only with Apply” mode:

SQL> recover managed standby database cancel;

SQL> alter database open;

SQL> recover managed standby database using current logfile disconnect;
FAILOVER:

In short, a failover is performed when the production (primary) database is lost and the standby database is activated as the new primary. It is not reversible: after a failover, the old primary must be re-created as a standby database. The steps to take in case of a failover are:

(Important note: [PRIMARY] denotes the primary server and [STANDBY] the standby server.)

1. [PRIMARY] If the primary database is still accessible and running, flush any unsent redo to the standby database (standby_db_name is the DB_UNIQUE_NAME of the target standby).

SQL> alter system flush redo to standby_db_name;

SQL> alter system archive log current;
If you don’t receive an error, you can continue with step 5; in this case, the system can be opened with zero data loss. If you receive an error, continue with step 2 to open the system with minimal data loss.

2. [STANDBY] Run the following query to find the highest received archive log sequence number for each thread.

SQL> SELECT UNIQUE THREAD# AS THREAD, MAX(SEQUENCE#) OVER (PARTITION BY thread#) AS LAST from V$ARCHIVED_LOG;
3. [PRIMARY to STANDBY] If you can access archive logs that were not shipped to the standby, copy them over. After copying, register the archive log files with the standby database. This must be done for every thread.

SQL> alter database register physical logfile '/oracle/ora11g/dbs/arch/TALIP_991834413_1_102.arc';
4. [STANDBY] Check the standby database for a redo gap. If there is a gap, copy the missing archive log files and register them.

SQL> SELECT THREAD#, LOW_SEQUENCE#, HIGH_SEQUENCE# FROM V$ARCHIVE_GAP;

SQL> alter database register physical logfile '/oracle/ora11g/dbs/arch/TALIP_991834413_1_101.arc';
Repeat until the gap query above returns no rows.

5. [STANDBY] Stop the redo apply process on the standby database.

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
6. [STANDBY] Finish applying the archive logs copied from the primary.

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH;
If you get an error, it means some redo logs were not applied; revisit steps 2 and 4. Alternatively, you can continue with the following command:

SQL> ALTER DATABASE ACTIVATE PHYSICAL STANDBY DATABASE;
In that case, you can open the database in step 8. If you get no error, continue with step 7.

7. [STANDBY] Switch the standby database to the primary role.

SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN;
8. [STANDBY] Open the database.

SQL> ALTER DATABASE OPEN;
After opening the standby database as the primary via failover, you must take a full backup.
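For example, a full database backup with RMAN (a minimal sketch):

RMAN> backup database plus archivelog;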

Oracle Data Guard Protection Modes

Maximum Protection
In Maximum Protection mode, a transaction is only confirmed as committed when its redo has been written both locally and to at least one standby redo log file. If the standby database or the network between the databases breaks down, transactions can no longer be performed and the primary database shuts down automatically. Oracle recommends using Maximum Protection mode only if at least two standby databases exist.

For the Maximum Protection mode the following parameters must be set for the Redo transport:

AFFIRM
SYNC
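A redo transport destination for Maximum Protection might be configured like the following sketch; the service name and DB_UNIQUE_NAME are placeholders for your environment, and raising the mode to Maximum Protection requires the primary to be mounted but not open:

SQL> alter system set log_archive_dest_2='SERVICE=standby_tns SYNC AFFIRM VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=standby' scope=both;

SQL> alter database set standby database to maximize protection;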
Maximum Availability
This mode is a compromise between data security and performance. At first, Maximum Availability mode works just like Maximum Protection mode: transactions are transmitted synchronously, and the commit is only confirmed when the redo is saved both locally and in at least one standby redo log file. With the Oracle 12c “Fast Sync” feature (SYNC NOAFFIRM), performance can be increased a bit by confirming the transaction as soon as the redo reaches the standby’s memory, before it is written to disk.

Unlike Maximum Protection mode, the primary database continues working after a short time if the standby breaks down or a network error occurs: it temporarily behaves like Maximum Performance mode, meaning transactions are committed immediately. Once the standby database is available again, redo transport automatically resynchronizes and synchronous operation resumes.

The following parameters are responsible for the redo log transport in Maximum Availability mode:

AFFIRM
SYNC
NET_TIMEOUT
The NET_TIMEOUT attribute (default 30) specifies the time in seconds the primary waits for an acknowledgement from the standby before it continues as in Maximum Performance mode.
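For Maximum Availability the destination looks similar, with NET_TIMEOUT added (again, service name and DB_UNIQUE_NAME are placeholders):

SQL> alter system set log_archive_dest_2='SERVICE=standby_tns SYNC AFFIRM NET_TIMEOUT=30 VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=standby' scope=both;

SQL> alter database set standby database to maximize availability;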

Maximum Performance
The Maximum Performance mode is used when the performance of the primary database must not be compromised. Transactions are confirmed as soon as they are saved in the local redo log files, and the redo is transmitted asynchronously to the standby database. In case of a breakdown of the primary database, you must therefore expect a loss of transactions.

The following parameters are responsible for the redo log transport here:

NOAFFIRM
ASYNC
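An asynchronous destination for Maximum Performance might look like this (placeholders as before), and you can confirm the active mode in V$DATABASE:

SQL> alter system set log_archive_dest_2='SERVICE=standby_tns ASYNC NOAFFIRM VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=standby' scope=both;

SQL> select protection_mode, protection_level from v$database;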

Oracle RAC Grid Daemons and Background Processes


Oracle Cluster Specific Daemons:
Crsd :
The CRS daemon (crsd) manages cluster resources based on configuration information that is stored in Oracle Cluster Registry (OCR) for each resource. This includes start, stop, monitor, and failover operations. The crsd process generates events when the status of a resource changes.

Cssd :
Cluster Synchronization Service (CSS): Manages the cluster configuration by controlling which nodes are members of the cluster and by notifying members when a node joins or leaves the cluster. If you are using certified third-party clusterware, then CSS interfaces with your clusterware to manage node membership information. CSS has three separate processes: the CSS daemon (ocssd), the CSS Agent (cssdagent), and the CSS Monitor (cssdmonitor). The cssdagent process monitors the cluster and provides input/output fencing. This service was formerly provided by the Oracle Process Monitor daemon (oprocd), also known as OraFenceService on Windows. A cssdagent failure results in Oracle Clusterware restarting the node.
Diskmon :
Disk Monitor daemon (diskmon): Monitors and performs input/output fencing for Oracle Exadata Storage Server. As Exadata storage can be added to any Oracle RAC node at any point in time, the diskmon daemon is always started when ocssd is started. 
Evmd :
Event Manager (EVM): Is a background process that publishes Oracle Clusterware events 
Mdnsd :
Multicast domain name service (mDNS): Allows DNS requests. The mDNS process is a background process on Linux and UNIX, and a service on Windows. 
Gnsd :
Oracle Grid Naming Service (GNS): Is a gateway between the cluster mDNS and external DNS servers. The GNS process performs name resolution within the cluster. 
Ons :
Oracle Notification Service (ONS): Is a publish-and-subscribe service for communicating Fast Application Notification (FAN) events 
Oraagent :
oraagent: Extends clusterware to support Oracle-specific requirements and complex resources. It runs server callout scripts when FAN events occur. This process was known as RACG in Oracle Clusterware 11g Release 1 (11.1). 
Orarootagent :
Oracle root agent (orarootagent): Is a specialized oraagent process that helps CRSD manage resources owned by root, such as the network, and the Grid virtual IP address 
Oclskd :
Cluster kill daemon (oclskd): Handles instance/node evictions requests that have been escalated to CSS .
Gipcd :
Grid IPC daemon (gipcd): Is a helper daemon for the communications infrastructure 
Ctssd :
Cluster Time Synchronization Service daemon (ctssd): Manages time synchronization between nodes, rather than depending on NTP.
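Most of these daemons run as OHASD-managed resources, and you can list their status with crsctl (run as root or the Grid Infrastructure owner):

$ crsctl stat res -t -init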

 RAC Background Process:

LMSn — Global Cache Service Process: Mainly handles the Cache Fusion part. It maintains the consistent copies of blocks that are transferred between instances and services lock requests received from LMD. It rolls back any uncommitted transactions. Up to 10 LMS processes can run, and they can be started dynamically if demand requires. It also handles global deadlock detection and monitors for lock conversion timeouts.

LMON    — Global Enqueue Service Monitor: This process manages the GES and maintains the consistency of GCS memory in case of process death. It is also responsible for cluster reconfiguration and lock reconfiguration.

LMD     — Global Enqueue Service Daemon: Manages enqueue manager service requests for the GCS. It also handles deadlock detection and remote resource requests from other instances.

LCK0    — Instance Enqueue Process: Manages instance resource requests and cross-instance call operations for shared resources. It builds a list of invalid lock elements and validates lock elements during recovery.

DIAG    — Diagnosability Daemon: Captures diagnostic data for instance and process failures.

GCS ensures a single system image of the data even though the data is accessed by multiple instances.

GES maintains or handles the synchronization of the dictionary cache, library cache, transaction locks, and DDL locks. In other words, GES manages enqueues other than data blocks. To synchronize access to the data dictionary cache, latches are used in exclusive (X) mode in single-node databases, while global enqueues are used in cluster database mode.

RAC/Grid Startup Sequence in Oracle 11gR2


OHASD has access to the OLR (Oracle Local Registry). OHASD reads the OLR contents and initializes accordingly.

OHASD brings up the GPnP daemon (ora.gpnpd) and the CSS daemon (ora.cssd).

The CSS daemon has access to the GPnP profile stored on the local file system. I even found a copy of the GPnP profile stored directly in the OLR (in Oracle 12c Release 2).

The voting file locations on ASM disks are accessed by CSSD via well-known pointers in the ASM disk headers, so CSSD is able to complete initialization and start or join an existing cluster.

OHASD starts an ASM instance. The ASM instance uses special code to locate the contents of the ASM SPFILE, if it is stored in a Diskgroup. 

With an ASM instance operating and its Diskgroups mounted, access to Clusterware’s OCR is available to CRS.

OHASD then starts the CRSD daemon (ora.crsd) with access to the OCR in an ASM diskgroup.

And thus Clusterware completes initialization and brings up other cluster managed resources defined in OCR.

Level 1: OHASD Spawns:
  • cssdagent – Agent responsible for spawning CSSD.
  • orarootagent – Agent responsible for managing all root owned ohasd resources.
  • oraagent – Agent responsible for managing all oracle owned ohasd resources.
  • cssdmonitor – Monitors CSSD and node health (along with the cssdagent).
Level 2: OHASD rootagent spawns:
  • CRSD – Primary daemon responsible for managing cluster resources.
  • CTSSD – Cluster Time Synchronization Services Daemon
  • Diskmon
  • ACFS (ASM Cluster File System) Drivers
Level 2: OHASD oraagent spawns:
  • MDNSD – Used for DNS lookup
  • GIPCD – Used for inter-process and inter-node communication
  • GPNPD – Grid Plug & Play Profile Daemon
  • EVMD – Event Monitor Daemon
  • ASM – Resource for monitoring ASM instances
Level 3: CRSD spawns:
  • orarootagent – Agent responsible for managing all root owned crsd resources.
  • oraagent – Agent responsible for managing all oracle owned crsd resources.
Level 4: CRSD rootagent spawns:
  • Network resource – To monitor the public network
  • SCAN VIP(s) – Single Client Access Name Virtual IPs
  • Node VIPs – One per node
  • ACFS Registry – For mounting ASM Cluster File System
  • GNS VIP (optional) – VIP for GNS
Level 4: CRSD oraagent spawns:
  • ASM Resource – ASM Instance(s) resource
  • Diskgroup – Used for managing/monitoring ASM diskgroups.
  • DB Resource – Used for monitoring and managing the DB and instances
  • SCAN Listener – Listener for single client access name, listening on SCAN VIP
  • Listener – Node listener listening on the Node VIP
  • Services – Used for monitoring and managing services
  • ONS – Oracle Notification Service
  • eONS – Enhanced Oracle Notification Service
  • GSD – For 9i backward compatibility
  • GNS (optional) – Grid Naming Service – Performs name resolution
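
Once all levels are up, you can verify the health of the stack and list the cluster-managed resources with crsctl (output varies by configuration):

$ crsctl check crs
$ crsctl stat res -t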