Oracle Data guard Protection modes

Maximum Protection
In the Maximum Protection mode a transaction is only confirmed as “Committed”, when the data has been written both local and into at least one Standby Redolog file. If the Standby Database or the network between the databases breaks down, transactions could not be performed any longer, the primary database shuts down automatically. Oracle recommends to only use Maximum Protection mode, if at least two Standby databases exist.

For the Maximum Protection mode the following parameters must be set for the Redo transport:

AFFIRM
SYNC
Maximum Availibility
This mode is a compromise between data security and performance. First the Maximum Availability mode works just like the Maximum Protection mode. That means all transactions are transmitted simultaneously and the commit of the transaction is only sent, when the transaction is saved both local and on at least one Standby Redolog file. With the 12c Oracle feature “FastSync” the performance can be increased a bit by already confirming the transaction, when the data reached the Standby side, so when it is located in that memory.

Unlike the Maximum Protection mode, the primary database continues working after a short time, if the Standby side breaks down or a network error occurs. By this it switches to the Maximum Performance mode automatically, that means transactions are committed immediately. Once the Standby database is available again it automatically switches back to the Maximum Protection mode.

Following parameters are responsible for the Redolog transport in Maximum Availability mode:

AFFIRM
SYNC
NET_TIMEOUT
The parameter NET_TIMEOUT (default 30) tells the time (in seconds) after that the database shall switch to the Maximum Performance mode.

Maximum Performance
The Maximum Performance mode is used, when the primary database must not be compromised. That means transactions are confirmed, once they are saved into the local Redolog files and asynchronously transmitted to the Standby database. So in case of a breakdown of the primary database you can expect a loss of transactions.

Following parameters are responsible for the Redolog transport here:

NOAFFIRM
ASYNC

Oracle RAC GRID Daemons and Background process


Oracle Cluster Specific Daemons:
Crsd :
The CRS daemon (crsd) manages cluster resources based on configuration information that is stored in Oracle Cluster Registry (OCR) for each resource. This includes start, stop, monitor, and failover operations. The crsd process generates events when the status of a resource changes.

Cssd :
 Cluster Synchronization Service (CSS): Manages the cluster configuration by controlling which nodes are members of the cluster and by notifying members when a node joins or leaves the cluster. If you are using certified third-party clusterware, then CSS processes interfaces with your clusterware to manage node membership information. CSS has three separate processes: the CSS daemon (ocssd), the CSS Agent (cssdagent), and the CSS Monitor (cssdmonitor). The cssdagent process monitors the cluster and provides input/output fencing. This service formerly was provided by Oracle Process Monitor daemon (oprocd), also known as OraFenceService on Windows. A cssdagent failure results in Oracle Clusterware restarting the node. 
Diskmon :
Disk Monitor daemon (diskmon): Monitors and performs input/output fencing for Oracle Exadata Storage Server. As Exadata storage can be added to any Oracle RAC node at any point in time, the diskmon daemon is always started when ocssd is started. 
Evmd :
Event Manager (EVM): Is a background process that publishes Oracle Clusterware events 
Mdnsd :
Multicast domain name service (mDNS): Allows DNS requests. The mDNS process is a background process on Linux and UNIX, and a service on Windows. 
Gnsd :
Oracle Grid Naming Service (GNS): Is a gateway between the cluster mDNS and external DNS servers. The GNS process performs name resolution within the cluster. 
Ons :
Oracle Notification Service (ONS): Is a publish-and-subscribe service for communicating Fast Application Notification (FAN) events 
Oraagent :
oraagent: Extends clusterware to support Oracle-specific requirements and complex resources. It runs server callout scripts when FAN events occur. This process was known as RACG in Oracle Clusterware 11g Release 1 (11.1). 
Orarootagent :
Oracle root agent (orarootagent): Is a specialized oraagent process that helps CRSD manage resources owned by root, such as the network, and the Grid virtual IP address 
Oclskd :
Cluster kill daemon (oclskd): Handles instance/node evictions requests that have been escalated to CSS .
Gipcd :
Grid IPC daemon (gipcd): Is a helper daemon for the communications infrastructure 
Ctssd :
Cluster time synchronisation daemon(ctssd) to manage the time syncrhonization between nodes, rather depending on NTP. 

 RAC Background Process:

LMSn — Global Cache Service Process: It can mainly handle the cache fusion part. It handles the consistent copies of blocks that are transferred between instances. It receives the request from the LMD to perform lock requests. It rolls back any uncommitted transactions. There can be up to 10 LMS process running and can be started dynamically if demand requires. It also handles the global deadlock detections and monitors for the lock conversion timeouts.

LMON    — Global Enqueue Service Monitor: This process manages the GES, it maintains the consistency of GCSmemory in case of process death. It also responsible for cluster reconfiguration and locks reconfiguration.

LMD     — Global Enqueue Service Daemon: This manages eneque manager service request for the GCS. It also handles the deadlock detections and remote resource requests the other instances.

LCK0    — Instance Enqueue Process: Manages the instance resource requests and cross instance call operations for shared resources. It builds a list of invalid lock elements and validates the lock elements during recovery.

DIAG    — Diagnosability Daemon

GCS ensures a single system image of the data even though the data is accessed by multiple instances.

GES maintains or handles the synchronization of the dictionary cache, library cache, transaction locks, and DDL locks. In other words, GES manages enqueues other than data blocks. To synchronize access to the data dictionary cache, latches are used in exclusive (X) mode and in single-node cluster databases. Global enqueues are used in cluster database mode

RAC/Grid startup sequence in Oracle 11gR2


OHASD has access to OLR (oracle local registry). OHASD then reads the OLR content and initialize accordingly. 

OHASD brings up GPnP (ora.gpnpd)Daemon and CSS (ora.cssd) Daemon. 

CSS Daemon has access to the GPNP Profile stored on the local file system. I even found a copy of GPNP Profile directly stored  in OLR (in Oracle 12c release 2)

The Voting Files locations on ASM Disks are accessed by CSSD with well-known pointers in the ASM Disk headers and CSSD is able to complete initialization and start or join an existing cluster.

OHASD starts an ASM instance. The ASM instance uses special code to locate the contents of the ASM SPFILE, if it is stored in a Diskgroup. 

With an ASM instance operating and its Diskgroups mounted, access to Clusterware’s OCR is available to CRS.

OHASD then starts CRSD (ora.crsd)damon with access to the OCR in an ASM Diskgroup.

And thus Clusterware completes initialization and brings up other cluster managed resources defined in OCR.

Level 1: OHASD Spawns:
  • cssdagent – Agent responsible for spawning CSSD.
  • orarootagent – Agent responsible for managing all root owned ohasd resources.
  • oraagent – Agent responsible for managing all oracle owned ohasd resources.
  • cssdmonitor – Monitors CSSD and node health (along wth the cssdagent).
Level 2: OHASD rootagent spawns:
  • CRSD – Primary daemon responsible for managing cluster resources.
  • CTSSD – Cluster Time Synchronization Services Daemon
  • Diskmon
  • ACFS (ASM Cluster File System) Drivers
Level 2: OHASD oraagent spawns:
  • MDNSD – Used for DNS lookup
  • GIPCD – Used for inter-process and inter-node communication
  • GPNPD – Grid Plug & Play Profile Daemon
  • EVMD – Event Monitor Daemon
  • ASM – Resource for monitoring ASM instances
Level 3: CRSD spawns:
  • orarootagent – Agent responsible for managing all root owned crsd resources.
  • oraagent – Agent responsible for managing all oracle owned crsd resources.
Level 4: CRSD rootagent spawns:
  • Network resource – To monitor the public network
  • SCAN VIP(s) – Single Client Access Name Virtual IPs
  • Node VIPs – One per node
  • ACFS Registery – For mounting ASM Cluster File System
  • GNS VIP (optional) – VIP for GNS
Level 4: CRSD oraagent spawns:
  • ASM Resouce – ASM Instance(s) resource
  • Diskgroup – Used for managing/monitoring ASM diskgroups.
  • DB Resource – Used for monitoring and managing the DB and instances
  • SCAN Listener – Listener for single client access name, listening on SCAN VIP
  • Listener – Node listener listening on the Node VIP
  • Services – Used for monitoring and managing services
  • ONS – Oracle Notification Service
  • eONS – Enhanced Oracle Notification Service
  • GSD – For 9i backward compatibility
  • GNS (optional) – Grid Naming Service – Performs name resolution

Long running queries troubleshooting in oracle database


In general application team come to DBA’s and complaint on one application query, last time or previous times it was done in few seconds now it is taking 1 minute or more time. For this issue no concreate answer. Every DBA has their own approaches to solve the issue.

Step:1
Find the long running query:
Long running session are currently running, then we can use v$session find the long running query sql_id, using sql_id get the sql using v$sqltext view.
Sql>select sql_id from v$session where sid=1234;
      Sql_id
 btdzd9ktsa55n
sql> select sql_text from v$sqltext where sql_id=’ btdzd9ktsa55n’ order by piece;
Sql_text
select name, accno, htown, amt, transactions from abd.transact where mobile_no=’9933445599’;
If in-case query already completed, then we can find the sql_id from AWR, ASH and dba_hist_active_session_history views on specific time interval.

Step2:
1.     Run the Sql_Tuning_advisor for the sql_id:
SET SERVEROUTPUT ON
DECLARE
  l_sql_tune_task_id  VARCHAR2(200);
BEGIN
  l_sql_tune_task_id := DBMS_SQLTUNE.create_tuning_task (
                          sql_id      => 'btdzd9ktsa55n',
                          scope       => DBMS_SQLTUNE.scope_comprehensive,
                          time_limit  => 60,
                          task_name   => 'btdzd9ktsa55n_tuning',
                          description => 'Tuning for btdzd9ktsa55n');
  DBMS_OUTPUT.put_line('l_sql_tune_task_id: ' || l_sql_tune_task_id);
END;
/
2.     Execute the DBMS_SQLTUNE using task_name
EXEC DBMS_SQLTUNE.execute_tuning_task(task_name => 'btdzd9ktsa55n_tuning');
3.     Once tuning task done, execute report tuning task.

SELECT DBMS_SQLTUNE.report_tuning_task('btdzd9ktsa55n_tuning') FROM dual;

It gives us few recommendations have to take the actions accordingly.
Recommendations could be gather statistics, creating indexes, drop indexes……..

Step 3:
Few time we didn’t get any recommendations from the Sql_tune, then that time we have to verify is there any plan changed for the query using dba_sqlstat, dba_hist_snapshot and DBA_HIST_SQL_PLAN.

select distinct b.BEGIN_INTERVAL_TIME as Snap, a.PLAN_HASH_VALUE as plan, a.EXECUTIONS_DELTA as EXECUTIONS, a.ELAPSED_TIME_DELTA/1000000 as ELAPSED_SEC, ROWS_PROCESSED_DELTA as "ROWS" , a.ROWS_PROCESSED_DELTA/CASE WHEN a.EXECUTION_DELTA = 0 THEN -1 ELSE a.EXECUTIONS_DELTA END "Avg Rows", a.ELAPSED_TIME_DELTA/1000000/CASE WHEN a. EXECUTION_DELTA = 0 THEN -1 ELSE a.EXECUTION_DELTA END "Avg Elapsed", a.optimizer_cost, a.SQL_PROFILE from DBA_HIST_SQLSTAT a, DBA_HIST_SNAPHOT b where a.SQL_ID = '&sqlid' and a.snap_id = b.snap_id order by b.BEGIN_INTERVAL_TIME;

or

SELECT DISTINCT sql_id, plan_hash_value
FROM dba_hist_sqlstat q,
    (
    SELECT /*+ NO_MERGE */ MIN(snap_id) min_snap, MAX(snap_id) max_snap
    FROM dba_hist_snapshot ss
    WHERE ss.begin_interval_time BETWEEN (SYSDATE - &No_Days) AND SYSDATE
    ) s
WHERE q.snap_id BETWEEN s.min_snap AND s.max_snap
  AND q.sql_id IN ( '&SQLID')
/

Above queries gives us sql paln from number of days, from the list we can choose the best plan and pin the best plan.
How to pin the sql plan:

pre and post migration overview of database migration to cloud Azure, AWS

As you prepare for migrating to the cloud, verify that your source environment is supported and the you have addressed any prerequisites.

1. Pre-Migration Overview:

  • Find the Database size, it helps us to how long the data copy will take.
  • How many schema's and tables going to migrate and check the large objects.
  • Database users, roles and privileges.
  • How busy/critical of source database.
  • Under standing the network, firewalls and security.
  • How much down time afford.
  • Limitations of source database.

2. Planing migration methods and Execution phase:

  • Are we have any additional resources available to assist with various data migration scenarios.
  • Relatively small databases that allow for downtime during migration, relatively large databases that can't afford downtime during migration.

3. Post migration overview:

  • Developing validation test on both source and target databases using same queries.
  • Set up test environment
  • Run validation tests against the source and the target and then analyze the result.
  • Run the performance tests against the source and target and then analyze and compare result.