Kevin Kempf's Blog

February 14, 2012

Annoying Agent Problems

Filed under: 11g, Enterprise Manager — kkempf @ 9:30 am

New PROD RDBMS host

We run the main ERP database on a physical machine; I’d love to virtualize, and probably will one day soon, but we couldn’t get to vSphere 5 (required because of CPU count) before the hardware refresh. So we migrated Oracle to a brand new spiffy Dell R610 and it’s smokin’ fast. The process was what is known as a physical-to-physical (P to P) server migration, and it went as well as can be expected. There was a bit of LVM manipulation required at the OS level, but for the most part we managed to bumble our way through it.

In the process of migrating to the new physical machine (from a rather reliable but ancient IBM blade server, incidentally), I took the plunge and cut over our production database hosts from RedHat 5.7 to Oracle Linux 5.7. I say “took the plunge”, but in truth the risk was a known quantity: it’s a RedHat-compatible kernel. What sparked this decision was 2 miserable, unresponsive tickets with RedHat support about high system CPU on my application server. Not to be funny about it, but if I can pay about half as much to get bad support, perhaps better, from Oracle, why wouldn’t I? Incidentally, the process of migrating from RH5 to OL5 (formerly OEL5; now they just call it OL5) is something I will put in a detailed post shortly.

Angry Agents

After bringing up the database on new hardware, the agent would not communicate with the OMS:

The Oracle Management Server (OMS) has blocked this agent because it has either been reinstalled or restored from a filesystem backup.  Please click on the Agent Resynchronization button to resync the agent.

Your agent is hopelessly confused

When I “clicked on the agent resynchronization button to resync the agent”, it failed with an error. You can bet your last dime, however, on the first thing Oracle Support asked in my ticket: “Did you try clicking the agent resynchronization button?” This is the subsequent message (see below as well):

Agent Operation completed with errors.  For those targets that could not be saved, please go to the target’s monitoring configuration page to save them.  All other targets have been saved successfully.  Agent has not been unblocked.

Error communicating with agent.  Exception message – oracle.sysman.emSDK.emd.comm.CommException: IOException in reading Response :: Connection reset

Your agent has double crossed you

Blocked Agents

If there’s one thing I hate, it’s blocked agents.  You bet I tried to unblock it, then resync it.  I tried command line updates like emctl status agent, emctl upload agent, emctl unsecure agent, emctl secure agent.  You name it.  Nada.
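For anyone retracing those steps, here is a minimal sketch that replays the same four emctl subcommands in order and reports which ones fail. The EMCTL variable is my own addition (not part of Enterprise Manager) so the sequence can be dry-run with echo; it assumes emctl from the agent home is on your PATH, and emctl secure agent will normally prompt for the registration password.

```shell
#!/bin/sh
# Sketch only: replays the emctl subcommands named in the post, in order.
# EMCTL is a hypothetical override for dry runs; it defaults to plain "emctl".
agent_checks() {
    emctl_cmd="${EMCTL:-emctl}"
    for args in "status agent" "upload agent" "unsecure agent" "secure agent"; do
        echo "== emctl $args"
        # $args is intentionally unquoted so it splits into two words
        $emctl_cmd $args || echo "   (emctl $args failed)"
    done
}
```

Calling EMCTL=echo agent_checks prints the commands without touching the agent; calling agent_checks on its own runs them for real.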

The Fix

I stumbled across Document ID 1307816.1 while my analyst was busy asking me things like “can you upload your log files”. In the end, as the sysman user, I ran this against my EM database:

exec mgmt_admin.cleanup_agent('problemhost.domainname.com:3872');

After that my agent was happy, could talk to the OMS, and life was good.
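Since curly quotes pasted from a web page will break the statement in SQL*Plus, here is a small sketch that builds it with plain ASCII quotes from a host name and port. The function name cleanup_agent_sql and the emrep connect identifier in the comment are placeholders of mine, and 3872 is simply the agent port from my case. As the comments on this post suggest, the call appears to remove the agent and its targets from the repository, so check the MOS note before running it on a host whose history you care about.

```shell
#!/bin/sh
# Sketch only: prints the cleanup statement with straight single quotes.
# cleanup_agent_sql is a hypothetical helper name, not an Oracle tool.
cleanup_agent_sql() {
    host="$1"
    port="${2:-3872}"   # agent port from the post; adjust for your agent
    printf "exec mgmt_admin.cleanup_agent('%s:%s');\n" "$host" "$port"
}

# Example (emrep is a placeholder connect identifier for the EM repository):
#   cleanup_agent_sql problemhost.domainname.com | sqlplus sysman@emrep
```

Printing the statement first, before piping it anywhere, gives you a chance to eyeball the host and port.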


5 Comments

  1. HI,

    We upgraded and migrated oracle discoverer 10g to 11g.

    Unfortunately, in some of the reports I am getting the error below:

    Database Error – ORA-00997: illegal use of LONG datatype

    Please suggest me some solution.

    Thanks in advance.

    Best,
    Ashwani

    Comment by Ashwani — March 12, 2012 @ 11:28 am

    • My solution is open a ticket with Oracle. This comment is not on topic with the post.

      Comment by kkempf — March 12, 2012 @ 12:17 pm

  2. i’d love to do this: exec mgmt_admin.cleanup_agent('problemhost.domainname.com:3872');

    But i believe this clears the agent history and all its targets? I wouldn’t want to lose all historical information via one command.

    Comment by Vikas Bardwaji — June 27, 2012 @ 4:32 am

    • Honestly, I can’t recall if I lost the history on the server I executed the command on. It wasn’t one I particularly cared about in terms of history, so it wasn’t especially relevant. That said, I’d open an SR or check other blogs because it wouldn’t surprise me if it did wipe history.

      Comment by kkempf — June 27, 2012 @ 7:19 pm

  3. Thank you for revealing this info. The command below appears to delete the agent from the OMS. However, I must admit it’s much easier to do this than to attempt to delete it from the OMS console.

    exec mgmt_admin.cleanup_agent('problemhost.domainname.com:3872');

    Comment by Melissa — July 24, 2012 @ 8:47 am

