Aug 07

RMAN-20036: invalid record order

Trying to register a database with my rman catalog I was getting the following error

$ rman target sys/xxx@DB01 catalog=RMAN_DB01/xxx@rmancat

Recovery Manager: Release 11.2.0.4.0 - Production on Mon Aug 7 11:42:37 2017

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: DB01 (DBID=114156133)
connected to recovery catalog database

RMAN> create catalog;

recovery catalog created

RMAN> register database;

starting full resync of recovery catalog
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of register command at 08/07/2017 11:42:44
RMAN-03014: implicit resync of recovery catalog failed
RMAN-03009: failure of full resync command on default channel at 08/07/2017 11:42:44
RMAN-20036: invalid record order

Poking around metalink I found note 113325.1

Basicly it says there is a mismatch in the number of standby redo logs, but this database does not have a standby. However it was cloned from a database that had a standby, and I apparently forgot to remove the standby redo logs after I cloned it.

SQL> select INST_ID,GROUP#,TYPE from gv$logfile;
         1          1         ONLINE
         1          2         ONLINE
         1          3         ONLINE
         1          4 INVALID STANDBY
         1          5         STANDBY
         2          6         ONLINE
         2          7         ONLINE
         2          8         ONLINE
         2          9 INVALID STANDBY
         2         10         STANDBY

So cleanup the standby logfiles

SQL> alter database drop standby logfile group 1 thread 4;
		 
Database altered.

SQL> alter database drop standby logfile group 1 thread 5;
		 
Database altered.

SQL> alter database drop standby logfile group 2 thread 9;
		 
Database altered.

SQL> alter database drop standby logfile group 2 thread 9;
		 
Database altered.

I dropped and re-created the catalog

RMAN> create catalog;

recovery catalog created

RMAN> register database;

database registered in recovery catalog
starting full resync of recovery catalog
full resync complete

 

Mar 28

Now on SSL

I have moved this site over to a secure connection. LetsEncrypt offers free SSL certificates for your website! The setup was a bit of a pain because I have a non-standard configuration for this server, but after a little poking and prodding, we are now running on https.

Feb 20

Targets.xml was rejected: loaded with a wrong agent token

I noticed an agent on one of my database servers was down this morning. Agents crash from time to time, no big deal. I will just restart it

$ ./emctl start agent
Oracle Enterprise Manager Cloud Control 12c Release 2
Copyright (c) 1996, 2012 Oracle Corporation.  All rights reserved.
Starting agent ............ failed.
Target Manager failed at Startup: targets.xml was rejected: loaded with a wrong agent token
Consult emctl.log and emagent.nohup in: /u01/app/oracle/product/12.1.0.4/agent12c/agent_inst/sysman/log

Well that sucks. Something is out of sync. Usually this will happen when your file system is filled up and a change cannot be written to the targets.xml. I wasn’t out of space though. I’m not quite sure what caused this, but I know how to fix it.

make a backup of the targets.xml file, blank out the file, and add the opening and closign targets tag

cd /u01/app/oracle/product/12.1.0.4/agent12c/agent_inst/sysman/emd/
mv targets.xml targets.xml.20170220
echo "<Targets></Targets>" > targets.xml

Then re-synchronize the agent from OMS, this will re-populate the targets.xml

emcli login -username=SYSMAN
emcli sync
emcli resyncAgent -agent="dbsrv01:3872"

it takes about another 10 minutes to resynchronize but while it is running the agent will look like this

$ ./emctl status agent
Oracle Enterprise Manager Cloud Control 12c Release 2
Copyright (c) 1996, 2012 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
Agent Version     : 12.1.0.2.0
OMS Version       : (unknown)
Protocol Version  : 12.1.0.1.0
Agent Home        : /u01/app/oracle/product/12.1.0.4/agent12c/agent_inst
Agent Binaries    : /u01/app/oracle/product/12.1.0.4/agent12c/core/12.1.0.2.0
Agent Process ID  : 794210
Parent Process ID : 793987
Agent URL         : https://dbsrv01:3872/emd/main/
Repository URL    : https://oem1:4900/empbs/upload
Started at        : 2017-02-20 11:56:00
Started by user   : oracle
Last Reload       : (none)
Last successful upload                       : (none)
Last attempted upload                        : (none)
Total Megabytes of XML files uploaded so far : 0
Number of XML files pending upload           : 95
Size of XML files pending upload(MB)         : 0.9
Available disk space on upload filesystem    : 73.85%
Collection Status                            : [RESYNC]
Heartbeat Status                             : Agent is blocked
Last attempted heartbeat to OMS              : 2017-02-20 11:59:03
Last successful heartbeat to OMS             : (none)
Next scheduled heartbeat to OMS              : 2017-02-20 12:02:03

After a little waiting around the agent is back to normal

$ ./emctl status agent
Oracle Enterprise Manager Cloud Control 12c Release 2
Copyright (c) 1996, 2012 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
Agent Version     : 12.1.0.2.0
OMS Version       : 12.1.0.4.0
Protocol Version  : 12.1.0.1.0
Agent Home        : /u01/app/oracle/product/12.1.0.4/agent12c/agent_inst
Agent Binaries    : /u01/app/oracle/product/12.1.0.4/agent12c/core/12.1.0.2.0
Agent Process ID  : 794210
Parent Process ID : 793987
Agent URL         : https://dbsrv01:3872/emd/main/
Repository URL    : https://oem1:4900/empbs/upload
Started at        : 2017-02-20 11:56:00
Started by user   : oracle
Last Reload       : (none)
Last successful upload                       : 2017-02-20 12:19:53
Last attempted upload                        : 2017-02-20 12:19:53
Total Megabytes of XML files uploaded so far : 2.15
Number of XML files pending upload           : 0
Size of XML files pending upload(MB)         : 0
Available disk space on upload filesystem    : 73.62%
Collection Status                            : Collections enabled
Heartbeat Status                             : Ok
Last attempted heartbeat to OMS              : 2017-02-20 12:19:04
Last successful heartbeat to OMS             : 2017-02-20 12:19:04
Next scheduled heartbeat to OMS              : 2017-02-20 12:20:04

---------------------------------------------------------------
Agent is Running and Ready

 

Feb 03

Oracle OEM 13c LDAP error

My new OEM 13c system has been having lots of issues, I have multiple bugs registered with oracle. This one is particularly odd, trying to create a new user I get an LDAP error

The user actually does get created, but when I try to assign additional privileges to the user I get the error again, and this time it doesn’t actually add them.

The kicker is, I am not using LDAP, the user is created as an internal user, no single sign on!

I am trying to give a user access to the performance page, so off to emcli!!

I need to find the privileges related to the performance tab. It’s better to copy and paste these since they are case sensitive.

$ emcli get_supported_privileges | grep -i perf
PERFORM_OPERATION_AS_ANY_AGENT  Resource Type    TARGET
DB_PERFSQL_OPT_ADMIN            Resource         TARGET                        ENTITY_GUID           TARGET_NAME:TARGET_TYPE
PERFORM_OPERATION               Resource         TARGET                        ENTITY_GUID           TARGET_NAME:TARGET_TYPE
DB_PERFORMANCE_HOME_VIEW        Resource         TARGET                        ENTITY_GUID           TARGET_NAME:TARGET_TYPE
DB_PERFORMANCE_ADMIN            Resource         TARGET                        ENTITY_GUID           TARGET_NAME:TARGET_TYPE
DB_PERFSQL_OPT_CONSOLE_VIEW     Resource         TARGET                        ENTITY_GUID           TARGET_NAME:TARGET_TYPE
PERFORM_OPERATION_ANYWHERE      Resource Type    TARGET
DB_PERFORMANCE_VIEW             Resource         TARGET                        ENTITY_GUID           TARGET_NAME:TARGET_TYPE
PERFORM_OPERATION_AS_AGENT      Resource         TARGET                        ENTITY_GUID           TARGET_NAME:TARGET_TYPE

I want to give myUser DB_PERFORMANCE_VIEW and DB_PERFORMANCE_HOME_VIEW. I have my databases assigned to groups, so lets do that grant

emcli grant_privs -name="myUser" -privilege="DB_PERFORMANCE_HOME_VIEW;TARGET_NAME=Dev:TARGET_TYPE=group"
emcli grant_privs -name="myUser" -privilege="DB_PERFORMANCE_VIEW;TARGET_NAME=Dev:TARGET_TYPE=group"
emcli modify_group -name="Dev" -privilege_propagation=true

I looped through my dev/test/prod groups

Checking inside OEM, the grants are now in place

I don’t understand what OEM is doing though the web interface that is difference from using emcli. I had another issue deploying plugins using the web interface. I was able to deploy them with emcli. I will have a post on that soon.

Jan 06

ohasd permissions

I was working kind of late last night, trying to get a new install working. I ran into a bug where the permissions on the ohasd were incorrect after patching GI. I went out to a working server to see what the permissions should be, build my chown and chmod statements and pasted them into my terminal window. Unfortunately I got them in the wrong terminal, and had managed to copy the wrong permissions. I changed the ownership on ohasd on the first node of my production RAC Cluster. Apparently the permissions are really important because the whole node went down.

A little bit of panic set in and I wasn’t sure what I had done. I didn’t realize i had pasted the permission statements into the wrong window, and the error messages weren’t very helpful.

[root@node1 bin]# ./crsctl stop crs -f
CRS-4639: Could not contact Oracle High Availability Services
CRS-4000: Command Stop failed, or completed with errors.
[root@node1 bin]# ps -ef | grep d.bin
root 33615 1 0 16:53 ? 00:00:00 /u01/app/12.1.0.2/grid/bin/ohasd.bin reboot
root 53203 1 0 16:57 ? 00:00:00 /u01/app/12.1.0.2/grid/bin/ohasd.bin reboot
root 79744 1 0 17:07 ? 00:00:00 /u01/app/12.1.0.2/grid/bin/ohasd.bin reboot
root 105598 103980 0 17:17 pts/2 00:00:00 grep d.bin
[root@node1 bin]# kill -9 33615 53203 79744
[root@node1 bin]# ps -ef | grep d.bin
root 106623 103980 0 17:17 pts/2 00:00:00 grep d.bin
[root@node1 bin]# date
Thu Jan 5 17:18:02 EST 2017
[root@node1 bin]# ./crsctl start crs
CRS-4124: Oracle High Availability Services startup failed.
CRS-4000: Command Start failed, or completed with errors.

I was getting nothing in the logs.

I figured it must be a permission issue, but I wasn’t quite sure what to reset them to.

Apparently I am not the first person to do this since Oracle has a document for fixing this!

How to check and fix file permissions on Grid Infrastructure environment (Doc ID 1931142.1)

I ran

./rootcrs.pl -init

rebooted the node, and all was right with the world, except for my ego being kind of damaged from making such a silly mistake.

Sep 30

myMYSQL Nightmare Continues

My fight this week with mySQL continues. I started getting really strange errors in the error log, and users couldn’t connect, not even root@localhost.

The error log showed

2016-09-30T15:04:08.790598Z 0 [Warning] System table 'time_zone_leap_second' is expected to be transactional.
2016-09-30T15:04:08.790625Z 0 [Warning] System table 'time_zone_name' is expected to be transactional.
2016-09-30T15:04:08.790628Z 0 [Warning] System table 'time_zone' is expected to be transactional.
2016-09-30T15:04:08.790643Z 0 [Warning] System table 'time_zone_transition_type' is expected to be transactional.
2016-09-30T15:04:08.790646Z 0 [Warning] System table 'time_zone_transition' is expected to be transactional.
2016-09-30T15:04:08.793672Z 0 [Warning] System table 'servers' is expected to be transactional.
2016-09-30T15:04:08.800142Z 0 [Warning] Info table is not ready to be used. Table 'mysql.slave_master_info' cannot be opened.
2016-09-30T15:04:08.803287Z 0 [Warning] Info table is not ready to be used. Table 'mysql.slave_relay_log_info' cannot be opened.

Followed by logins failing

2016-09-30T15:14:27.945521Z 4 [Note] Access denied for user 'root'@'localhost' (using password: YES)
2016-09-30T15:14:35.794051Z 5 [Note] Access denied for user 'root'@'localhost' (using password: YES)
2016-09-30T15:14:50.242206Z 6 [Note] Access denied for user 'root'@'localhost' (using password: YES)

What I think had happened, mySQL was upgraded on the OS by the linux admins. It doesn’t look like anything inside the database was upgrade.

The new version of mySQL uses the innodb engine (v5.6.15)

mysql> show create table time_zone\G
*************************** 1. row ***************************
       Table: time_zone
Create Table: CREATE TABLE `time_zone` (
  `Time_zone_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `Use_leap_seconds` enum('Y','N') NOT NULL DEFAULT 'N',
  PRIMARY KEY (`Time_zone_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 STATS_PERSISTENT=0 COMMENT='Time zones'
1 row in set (0.00 sec)

Whereas the older version of mySQL uses MyISAM (from another server v5.5.50)

[mysql]> show create table time_zone\G
*************************** 1. row ***************************
       Table: time_zone
Create Table: CREATE TABLE `time_zone` (
  `Time_zone_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `Use_leap_seconds` enum('Y','N') NOT NULL DEFAULT 'N',
  PRIMARY KEY (`Time_zone_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COMMENT='Time zones'

mysql was expecting INNODB and found MyISAM

This should be a simple fix, just run the mysql_upgrade script. However the database is either down, or I can’t log in when the database is up.

So I had to shutdown the database, start it up skipping permissions, and change my root password

mysqld --skip-grant-tables --skip-networking

in another session I was now able to connect to the database

mysql -u root

and update the root password

FLUSH PRIVILEGES;
SET PASSWORD FOR root@'localhost' = PASSWORD('password');

then killed off my instance, and started it back up

service mysqld start

run the mysql_upgrade script

msyql_upgrade -uroot -pPassword

and restart once more

service mysqld restart

I don’t know what wrong with the passwords at this point, but none of the users could log in, so I changed all of their passwords to their current passwords. Just rewriting the password seemed to do the trick.

CREATE USER 'dba_user'@'%' IDENTIFIED WITH 'mysql_native_password' AS '*2470C0C06DEE42FD1618BB99005ADCA2EC9D1E19' REQUIRE NONE PASSWORD EXPIRE DEFAULT ACCOUNT UNLOCK

Everything seems to be back to normal…

 

 

Sep 28

Open Source Rant

%98 of my time I spend as an Oracle DBA. I am not limited just to Oracle though. I have used mySQL and SQL server extensively. Something that always drives me crazy about mySQL is the documentation. It is horrible. The fact that it is done as an open forum where people post and try and provide information is a big red flag to me. Just trying to figure out how a simple command works is incredibly frustrating. It assumes you know what you are doing. I do know what I am doing, but can still get tripped up when they don’t provide command flow diagrams. I can’t imagine how frustrating trying to use mysql must be for a more average user.

Jun 17

Mind your spaces

I am trying to export a database, everything except for a single table. I need to send a copy of the database to the vendor, but they don’t need a table that takes up %90 of the space in the database due to it storing a massive number of BLOBs and CLOBs

This should be really simple

$ expdp "'/ as sysdba'" directory=A dumpfile=fullexport.dmp exclude=TABLE:"IN ('TABLEOFCLOBS')"

Export: Release 11.2.0.4.0 - Production on Fri Jun 17 10:37:01 2016

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options
ORA-39001: invalid argument value
ORA-39071: Value for EXCLUDE is badly formed.
ORA-00936: missing expression

That syntax looks correct to me! Maybe I copy/pasted in a special character or something…. I type out the full command by hand and get the same error. I am stumped!

idontgetit

I worked the googles on the internet machine, and see a bunch of examples that look pretty much identical to mine. Then I notice it. I have a space after the “IN” key word.

$ expdp "'/ as sysdba'" directory=A dumpfile=fullexport.dmp exclude=table:"IN('TABLEOFCLOBS')" FULL=Y

Export: Release 11.2.0.4.0 - Production on Fri Jun 17 10:38:14 2016

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options
Starting "SYS"."SYS_EXPORT_FULL_01":  "/******** AS SYSDBA" directory=A dumpfile=exportfile.dmp exclude=table:IN('TABLEOFCLOBS') FULL=Y
Estimate in progress using BLOCKS method...
Processing object type DATABASE_EXPORT/SCHEMA/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 53.28 GB

Thwarted by spaces again! It seems trival, and kind of incorrect since the “IN” in a sql statement needs that space there.

Jun 08

Windows 10 – When You Can’t Be Trusted

Unfortunately some things require windows. Old interfaces that ensure you are running IE6, old proprietary administration software, and other things I don’t use on a daily basis. Windows 10 is one of the most frustrating versions of windows I have ever run. It pretty much assumes you are too stupid to have any idea what you are doing. I just spend an hour fighting with it to let me download a file that builds boot disks. It decided it wasn’t safe. I don’t know what that means since I have been using this product for years! In fact Edge assumes you are so stupid, I can’t get on most websites because it doesn’t assume they are safe! It also doesn’t think it is safe for me to install chrome! How anyone could use the pile of crap for day to day work baffles me. I am glad my work laptop is still Windows 7, where I can get a virus if I am stupid, but doesn’t try to baby proof everything!

Jun 08

Rushing = Stupid Mistakes

I am currently being asked to refresh a database. The end users are pissed because they asked for the database to be refreshed at noon yesterday, this was the first they had asked for it, and they were doing “really important things” and by 3pm were pissed that I hadn’t responded to them, let alone completed the work. The thing was, I took a vacation. I took a whole two days off to spend some time with my girlfriend and wasn’t checking my email. Never mind that we have a whole team of DBAs that can refresh a database, they had emailed me. What they should have done was open a proper ticket, and someone would have helped out even though they hadn’t planned and were now in a panic.

So After escalating it to their boss because I wasn’t providing them with “good service”, who in turn talked to my boss, who informed him how they had dropped the ball, and I would deal with it today, that they were being completely unprofessional. (My boss is a good guy, and sticks up for me).

So here is the technical part of this post. Ever DBA has cloned a database, thousands and thousands of databases, but when you are rushing you can make stupid errors. I just fought this error for 20 minutes

Oracle instance shut down
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 06/08/2016 14:23:04
RMAN-05501: aborting duplication of target database
RMAN-03015: error occurred in stored script Memory Script
ORA-27191: sbtinfo2 returned error
Additional information: 3480

This should have been a no brainier, but in my rush I wasn’t fully paying attention to the error.

I was just issuing the command

duplicate target database to testdb from active database

My defaults for RMAN are using sbt ie tape, so for some reason the tape channels were being loaded. This causes an error. The solution is simple and should have been fairly immediate, just allocate disk channels.

run
{
ALLOCATE CHANNEL d1 TYPE DISK ;
ALLOCATE CHANNEL d2 TYPE DISK ;
ALLOCATE CHANNEL d3 TYPE DISK ;
ALLOCATE AUXILIARY CHANNEL a1 TYPE DISK ;
ALLOCATE AUXILIARY CHANNEL a2 TYPE DISK ;
ALLOCATE AUXILIARY CHANNEL a3 TYPE DISK ;
duplicate target database to testdb from active database;
}

Even simple things become very hard when you are rushing.

Older posts «

» Newer posts