Tuesday, May 26, 2015

ORA-09945: Unable to initialize the audit trail file, PRCD-1222 : Online relocation of database "ORALIN" failed but database was restored to its original state

I was testing the online relocation of a RAC One Node database from host08 to host09. The command was run from host08.

## Error

ORA-09945: Unable to initialize the audit trail file

### Full Error

[oracle@host08 trace]$ srvctl relocate database -d ORALIN -n host09 -v
Configuration updated to two instances
Online relocation failed, rolling back to original state
Configuration reverted back to one instance
PRCD-1222 : Online relocation of database "ORALIN" failed but database was restored to its original state
PRCD-1129 : Failed to start instance ORALIN_2 for database ORALIN
PRCR-1064 : Failed to start resource ora.ORALIN.db on node host09
CRS-5017: The resource action "ora.ORALIN.db start" encountered the following error:
ORA-09817: Write to audit file failed.
Linux-x86_64 Error: 28: No space left on device
Additional information: 12
ORA-09945: Unable to initialize the audit trail file
Linux-x86_64 Error: 28: No space left on device
. For details refer to "(:CLSN00107:)" in "/ofa/rac/app/oracle/grid/".
CRS-2674: Start of 'ora.ORALIN.db' on 'host09' failed
[oracle@host08 trace]$

#  Error Occurred

The error occurred while relocating the database instance of a RAC One Node database to a different server.

## Command Executed

srvctl relocate database -d ORALIN -n host09 -v

**************************************** Step By Step Analysis ******************************************************

# 1) Check the Full Error

The error stack above shows that the instance could not create its audit file: the write failed with Linux error 28 (No space left on device).

Let's check the oragent_oracle.log file on host09, since the problem occurred while relocating the instance to host09.

[oracle@host09 audit]$ vi "/ofa/rac/app/oracle/grid/"
[oracle@host09 audit]$
2015-05-26 16:57:41.499: [ USRTHRD][2234124608]{0:1:24} CrsCmd::destroy
2015-05-26 16:57:41.500: [ora.orawin.orawindev.svc][2234124608]{0:1:24} [check] clsnUtils::error Exception type=2 string=
CRS-5017: The resource action "ora.orawin.orawindev.svc check" encountered the following error:
ORA-09817: Write to audit file failed.
Linux-x86_64 Error: 28: No space left on device
Additional information: 12
. For details refer to "(:CLSN00109:)" in "/ofa/rac/app/oracle/grid/".


# 2) Reason for Failure

The online relocation command reported an audit-related error, but we shouldn't assume that the audit destination filled up because of the ORALIN database.

In the "oragent_oracle.log" excerpt above, the error was raised for the resource "ora.orawin.orawindev.svc", so the database with the problem is orawin.
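When the agent log is long, the same check can be scripted instead of read by eye. The following is a minimal sketch, assuming the log uses the CRS-5017 line format shown in the excerpt above. `failing_resources` is a hypothetical helper name, and the log path is left as an argument because it depends on the Grid home.

```shell
#!/bin/sh
# List the Clusterware resources that reported CRS-5017 in an agent log,
# with the number of occurrences per resource, most frequent first.
# $1: path to the agent log (for example oragent_oracle.log)
failing_resources() {
    grep 'CRS-5017' "$1" \
        | sed -n 's/.*resource action "\([^ "]*\) [^"]*".*/\1/p' \
        | sort | uniq -c | sort -rn \
        | awk '{ print $1, $2 }'   # normalize to "<count> <resource>"
}
```

Run on host09 against the agent log, this prints one line per resource, so a service such as ora.orawin.orawindev.svc that fails its check repeatedly stands out from the single failed start of ora.ORALIN.db.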


# 3) Check the audit location & Filesystem

If we can log in to the orawin database, check the audit location (show parameter audit).

In my case I was unable to log in to the database, so I checked the filesystems and found one at 100%.

[oracle@host09 bin]$ df -h /ofa/oracle_11.2.0.4_home
Filesystem            Size  Used Avail Use% Mounted on
                       20G   19G     0 100% /ofa/oracle_11.2.0.4_home
[oracle@host09 bin]$
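To scan every mounted filesystem rather than one path, the df output can be filtered by usage. This is a small sketch, not part of the original troubleshooting session. `full_filesystems` is a hypothetical helper name and the 90% default threshold is an arbitrary assumption. It expects `df -P` output, which keeps each filesystem on a single line even when the device name is long.

```shell
#!/bin/sh
# Print "<mount point> <use%>" for filesystems at or above a usage threshold.
# Reads `df -P` style output on stdin; $1 is the threshold in percent (default 90).
# Assumes mount points do not contain spaces.
full_filesystems() {
    threshold="${1:-90}"
    awk -v t="$threshold" '
        NR > 1 {                      # skip the header line
            use = $5
            sub(/%/, "", use)         # "100%" -> "100"
            if (use + 0 >= t) print $6, $5
        }'
}

# Usage on a live host:
#   df -P | full_filesystems 95
```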


# 4) Check which folder is using more space

[oracle@host09 oracle_11.2.0.4_home]$ du -sh *
14G     dbs

[oracle@host09 dbs]$ pwd
[oracle@host09 dbs]$ ls arch*
arch1_19_871216596.dbf  arch1_31_871216596.dbf  arch1_43_871216596.dbf  arch1_51_871216596.dbf  arch1_60_871216596.dbf  arch1_70_871216596.dbf
arch1_20_871216596.dbf  arch1_32_871216596.dbf  arch1_44_871216596.dbf  arch1_52_871216596.dbf  arch1_61_871216596.dbf  arch1_71_871216596.dbf
arch1_21_871216596.dbf  arch1_33_871216596.dbf  arch1_45_871216596.dbf  arch1_53_871216596.dbf  arch1_62_871216596.dbf  arch1_72_871216596.dbf
arch1_22_871216596.dbf  arch1_34_871216596.dbf  arch1_46_871216596.dbf  arch1_54_871216596.dbf  arch1_63_871216596.dbf  arch1_73_871216596.dbf
arch1_23_871216596.dbf  arch1_35_871216596.dbf  arch1_47_871216596.dbf  arch1_55_871216596.dbf  arch1_64_871216596.dbf  arch1_74_871216596.dbf
arch1_24_871216596.dbf  arch1_36_871216596.dbf  arch1_4_870194590.dbf   arch1_56_871216596.dbf  arch1_65_871216596.dbf  arch1_75_871216596.dbf
arch1_25_871216596.dbf  arch1_37_871216596.dbf  arch1_4_870252683.dbf   arch1_57_871216596.dbf  arch1_66_871216596.dbf  arch1_76_871216596.dbf
arch1_26_871216596.dbf  arch1_38_871216596.dbf  arch1_4_870864967.dbf   arch1_5_870194590.dbf   arch1_67_871216596.dbf  arch1_77_871216596.dbf
arch1_27_871216596.dbf  arch1_39_871216596.dbf  arch1_4_870965663.dbf   arch1_5_870864967.dbf   arch1_6_870194590.dbf   arch1_78_871216596.dbf
arch1_28_871216596.dbf  arch1_40_871216596.dbf  arch1_48_871216596.dbf  arch1_5_870965663.dbf   arch1_6_870864967.dbf
arch1_29_871216596.dbf  arch1_41_871216596.dbf  arch1_49_871216596.dbf  arch1_58_871216596.dbf  arch1_68_871216596.dbf
arch1_30_871216596.dbf  arch1_42_871216596.dbf  arch1_50_871216596.dbf  arch1_59_871216596.dbf  arch1_69_871216596.dbf
[oracle@host09 dbs]$

From the listing above we can see that the files are archive logs.
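The `du -sh *` step above is easy to read when one directory dominates, but human-readable sizes do not sort numerically. A hedged sketch of a sorted variant follows. `largest_entries` is a hypothetical helper name, and sizes are reported in KB so that `sort -rn` orders them correctly.

```shell
#!/bin/sh
# Show the N largest entries (files or directories) directly under a directory,
# sizes in KB, largest first.
# $1: directory to inspect (default: current directory)
# $2: number of entries to show (default: 5)
largest_entries() {
    dir="${1:-.}"
    count="${2:-5}"
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$dir" && du -sk -- * 2>/dev/null | sort -rn | head -n "$count" )
}

# Usage on a live host:
#   largest_entries /ofa/oracle_11.2.0.4_home 3
```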

## Solution

Back up and delete the archive log files. (As this is a development database, I removed them without a backup.)

[oracle@host09 dbs]$ rm arch*
[oracle@host09 dbs]$ df -h .
Filesystem            Size  Used Avail Use% Mounted on
                       20G  8.1G   11G  43% /ofa/oracle_11.2.0.4_home
[oracle@host09 dbs]$
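For a database where the archive logs still matter, the backup step should come before the `rm`. The sketch below is one way to do that, assuming a backup directory on a different filesystem with enough free space. `archive_and_remove` is a hypothetical helper name, and the `arch*.dbf` pattern is taken from the listing above. Each original is removed only after its copy has been verified.

```shell
#!/bin/sh
# Copy archive logs to a backup directory, verify each copy, then remove the original.
# $1: source directory holding arch*.dbf files
# $2: backup directory (created if missing)
archive_and_remove() {
    src="$1"
    dest="$2"
    mkdir -p "$dest" || return 1
    for f in "$src"/arch*.dbf; do
        [ -e "$f" ] || continue             # pattern matched nothing
        cp -p "$f" "$dest"/ || return 1      # stop on a failed copy
        # Remove the original only if the copy is byte-identical.
        if cmp -s "$f" "$dest/$(basename "$f")"; then
            rm -f "$f"
        else
            echo "copy mismatch, keeping $f" >&2
            return 1
        fi
    done
}

# Usage (the backup path is a placeholder):
#   archive_and_remove /ofa/oracle_11.2.0.4_home/dbs /backup/arch
```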

With space freed on the filesystem, the database relocation worked fine.

[oracle@host08 trace]$ srvctl relocate database -d ORALIN -n host09 -v
Configuration updated to two instances
Instance ORALIN_2 started
Services relocated
Waiting for up to 30 minutes for instance ORALIN_1 to stop ...
Instance ORALIN_1 stopped
Configuration updated to one instance
[oracle@host08 trace]$

Comments are always welcome.