EM13cR2 AWR Warehouse “Error communicating with agent” during transfer step with custom certificates

I recently noticed and resolved an issue in my EM13c R2 AWR Warehouse environment that I brought upon myself, so this post is for anyone else who might run into it. It also seems like a good time to release the scripts I use to generate and populate Oracle wallets for my EM13c agents.

After moving an AWRW source database from one EM13c managed server to a different EM13c managed server (same OS, same DB release), AWRW loads from that server began to fail. While debugging the issue, I first had to resolve an already-documented issue (see MOS note 2075341.1) where the source database had a NULL definition for the CAW_EXTR directory object, then fix up the data in the DBSNMP.CAW_EXTRACT_PROPERTIES table to reflect the CAW_EXTR directory. After resolving that, AWRW extracts ran successfully from the source database, but began to hang indefinitely during the CAW_RUN_ETL_NOW job in the transferAWR/transferFile job step, displaying only a cryptic error message:

An unhelpful error message

A helpful error message

I ran through many debugging steps (changing preferred credentials, bouncing the agents, checking for firewalls blocking connectivity), but none seemed to help. Eventually I realized which step I had missed when setting up the new managed server where the source database now runs: I had not generated an Oracle wallet for the agent on the new server, although I did have one for the agent on the previous, now-retired server. This mattered because I have secured the agent on my OMS host (where my AWRW repository database runs) with a custom third-party certificate. The new agent, lacking a wallet containing a trusted root certificate to which it could trace the repository agent’s certificate, could not initiate a connection from the AWRW source DB host agent to the AWRW repository DB host agent.
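One way to sanity-check this kind of trust problem from the source host is to ask OpenSSL whether the repository agent's certificate chains to the root certificate you hold. The sketch below is only an illustration under assumptions: the host name, the root certificate file name, and the port (3872 is the usual agent default, but check your own) are placeholders. It also tests OpenSSL's view of the chain, not the agent's wallet, and OpenSSL may consult the system trust store in addition to the file given.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when s_client output on stdin reports a
# fully verified certificate chain.
chain_verified() {
    grep -q "Verify return code: 0 (ok)"
}

# Hypothetical helper: connect to an agent's HTTPS port and report whether
# its certificate verifies against the given root certificate (PEM file).
check_agent_chain() {
    host=$1       # repository agent host (placeholder)
    port=$2       # agent port, e.g. 3872 (assumed default)
    ca_file=$3    # trusted root certificate in PEM format (placeholder)

    # </dev/null closes the session once the handshake has finished.
    if openssl s_client -connect "$host:$port" -CAfile "$ca_file" \
            </dev/null 2>/dev/null | chain_verified; then
        echo "chain from $host:$port verifies against $ca_file"
    else
        echo "chain from $host:$port does NOT verify against $ca_file"
        return 1
    fi
}
```

For example, `check_agent_chain oms.example.com 3872 root_ca.pem` would report whether the chain verifies. A failure here suggests the root certificate is wrong or missing, which is the same condition the agent's wallet has to satisfy.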

I generated a wallet for the new agent, added the trusted root certificate and a certificate for the host to the wallet, stopped the agent, deployed the wallet, and started the agent. After those steps, the AWRW load from this source database completed successfully. I believe that the missing trusted root certificate prevented the creation of a secure channel between the two agents. I probably did not need to add the host certificate to resolve this problem, but I consider it good practice anyway.
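The steps above can be sketched roughly as follows. This is a minimal illustration, not the scripts mentioned in this post: the function names, the `CN=<host>` distinguished name, the 2048-bit key size, the choice to deploy only `cwallet.sso`, and the deployment directory are all assumptions, and the right wallet location for your agent release should come from Oracle's documentation. `DRY_RUN` defaults to 1, so the commands are printed rather than executed.

```shell
#!/bin/sh
# Hypothetical sketch of the wallet workflow; nothing runs unless DRY_RUN=0.

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Phase 1: create an auto-login wallet and a certificate signing request.
# -pwd is omitted on purpose; orapki is expected to prompt for the password.
create_wallet_and_csr() {
    host=$1      # agent host name, used as the certificate CN (assumed DN)
    wallet=$2    # directory that will hold the wallet
    run orapki wallet create -wallet "$wallet" -auto_login
    run orapki wallet add -wallet "$wallet" -dn "CN=$host" -keysize 2048
    run orapki wallet export -wallet "$wallet" -dn "CN=$host" \
        -request "$wallet/$host.csr"
}

# Phase 2: once the CA has signed the request, import the certificates and
# swap the wallet into place while the agent is stopped.
import_and_deploy() {
    wallet=$1        # wallet directory from phase 1
    root_cert=$2     # CA's trusted root certificate (PEM)
    host_cert=$3     # signed certificate for this host (PEM)
    deploy_dir=$4    # agent wallet location (assumption, site-specific)
    run orapki wallet add -wallet "$wallet" -trusted_cert -cert "$root_cert"
    run orapki wallet add -wallet "$wallet" -user_cert -cert "$host_cert"
    run emctl stop agent
    run cp "$wallet/cwallet.sso" "$deploy_dir/"
    run emctl start agent
}
```

Calling `create_wallet_and_csr agent1.example.com /tmp/agent1_wallet` with the default `DRY_RUN` prints the three `orapki` commands for review. The trusted root certificate is imported before the host certificate because the host certificate has to chain to a root already in the wallet.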

If you read this far, you may find my create_agent_wallets.sh script useful to generate wallets and certificate signing requests for every agent in your environment. If you find the wallet creation script useful, you may also find my import_agent_wallets.sh script useful to populate those wallets with signed certificates received from your CA.


2 thoughts on “EM13cR2 AWR Warehouse “Error communicating with agent” during transfer step with custom certificates”

  1. Wayne Baltimore

    Just a heads up: after upgrading the repository database from to 12.2.0.1, we had some corruption in SYSMAN.MGMT_COLLECTIONS_E on both dev and prod. OEM version at Support had us back up, drop, recreate, and reinsert data on the table, and it fixed the issue.

