Grid Control
Contents
- 1 Definitions
- 1.1 Additional references:
- 1.2 A useful set of emcli commands
- 1.2.1 Login to EM on OMS
- 1.2.2 Log out of EM on OMS
- 1.2.3 Sync emcli with OMS
- 1.2.4 List promoted targets
- 1.2.5 Delete a specific target
- 1.2.6 Delete an agent and its targets
- 1.2.7 Follow a plugin deployment (on the OMS / on an agent)
- 1.2.8 Import an update (for example: a plugin update) into the software library
- 1.2.9 Deploy a plugin on the OMS
- 1.2.10 Deploy a plugin on EM agent(s)
- 1.2.11 List available agents in the library
- 1.2.12 Download an agent from the library (used for agentDeploy.sh script method)
- 1.2.13 Set monitoring credentials for a specific target (example given for an Oracle database instance)
- 1.3 Uninstall the agent Oracle home that is registered with the inventory
- 1.4 Get rid of an agent that won’t go away in the Grid screens
- 1.5 Get rid of targets that won’t go away in the Grid screens:
- 1.6 Re-discover targets
- 1.7 Agent Unreachable
- 1.8 Start a host level blackout from command line (blackout has to already have been created via OEM)
- 1.9 Start a host level blackout for a certain duration
- 1.10 Stop a blackout from command line
- 1.11 Backend WLS or EM application seems to be down
- 1.12 To list targets known to an agent:
- 1.13 Change sysman password
- 1.14 Change dbsnmp password
- 1.15 Quick Checklist
- 1.16 Some useful Metalink Master Documents related to Grid Control
Definitions
The 10g Enterprise Manager Grid Control includes the following components:
- OMS - Management Service - This is responsible for communicating with the user via a GUI and communicating with the OMR
- OMR - Management Repository - This is a collection of tables owned by the sysman schema that stores all the data collected by the OMA
- OMA - Management Agent - This is a perl program that runs on the database host (one for all databases on the host) and uploads data to the OMS for storage in the OMR
This document describes the troubleshooting steps to be followed when there is a communication problem between the Oracle Management Service (OMS) and the Grid Agent.
Additional references:
- Note 951076.1: How to Troubleshoot Communication From a Grid Agent to the Oracle Management Service (OMS) in 10g Enterprise Manager Grid Control?
- Note 1089443.1: How to Troubleshoot Communication From the Grid Console (UI) Machine to the Oracle Management Service (OMS) in 10g Enterprise Manager Grid Control?
- Note 1089693.1: How to Troubleshoot Communication From the Oracle Management Service (OMS) to the Grid Control Repository Database in 10g Enterprise Manager Grid Control?
A useful set of emcli commands
Login to EM on OMS
emcli login -username=sysman [-password=<sysmanpassword>]
Log out of EM on OMS
emcli logout
Sync emcli with OMS
emcli sync
List promoted targets
emcli get_targets
Delete a specific target
emcli delete_target -name="demo" -type="oracle_database"
Delete an agent and its targets
emcli delete_target -name="xxxlgc-prod2.mydomain.com:3872" -type="oracle_emd" -delete_monitored_targets
Follow a plugin deployment (on the OMS / on an agent)
Oracle Database plugin
emcli get_plugin_deployment_status -plugin_id=oracle.sysman.db
Oracle Fusion Middleware plugin
emcli get_plugin_deployment_status -plugin_id=oracle.sysman.emas
My Oracle Support plugin
emcli get_plugin_deployment_status -plugin_id=oracle.sysman.mos
Import an update (for example: a plugin update) into the software library
emcli import_update -file="p14018177_112000_Generic.zip" -omslocal
Deploy a plugin on the OMS
emcli deploy_plugin_on_server -plugin=oracle.sysman.db -sys_password=XXXXX
Deploy a plugin on EM agent(s)
emcli deploy_plugin_on_agent -plugin="oracle.sysman.db" -agent_names="xxxdb-prod1.mydomain.com:3872;xxxdb-prod2.mydomain.com:3872"
List available agents in the library
emcli get_supported_platforms
Download an agent from the library (used for agentDeploy.sh script method)
emcli get_agentimage -destination=/home/oracle -platform="Microsoft Windows x64 (64-bit)" -version="12.1.0.1.0"
Set monitoring credentials for a specific target (example given for an Oracle database instance)
emcli set_credential -target_type=oracle_database -target_name="prod1" -credential_set=DBCredsMonitoring -user=sysman -columns="Role:SYSDBA;UserName:sys;password:XXXXX" -monitoring
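The login, sync and logout commands from this section can be wrapped around any single verb, roughly as sketched below. This is only an illustration: the function name emcli_session and the EMCLI variable are made up here, and the sketch assumes emcli is on the PATH and prompts for the password when -password is omitted.

```shell
# Sketch: run one emcli verb inside a login/sync/logout session.
# EMCLI and emcli_session are illustrative names, not part of emcli itself.
EMCLI="${EMCLI:-emcli}"

emcli_session() {
    # $1 = EM user name; remaining arguments = an emcli verb and its options
    em_user="$1"
    shift
    # With no -password option, emcli is assumed to prompt for the password
    "$EMCLI" login -username="$em_user" || return 1
    "$EMCLI" sync && "$EMCLI" "$@"
    rc=$?
    # Log out even if sync or the verb failed
    "$EMCLI" logout
    return "$rc"
}
```

For example, `emcli_session sysman get_targets` would list the promoted targets and log out again afterwards.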
Uninstall the agent Oracle home that is registered with the inventory
If you get "agent home is already registered with the inventory", there is already an entry for this home in the Oracle inventory, and it must be removed before you can install into the same directory.
Agent homes are listed in /etc/oragchomelist (Linux, AIX) or /var/opt/oracle/oragchomelist (Solaris), but removing the entry from oragchomelist makes no difference.
The entry that matters is in the Oracle inventory, whose location is recorded in /etc/oraInst.loc (Linux, AIX) or /var/opt/oracle/oraInst.loc (Solaris). Detach the home from the inventory with:
export ORACLE_HOME="/oracle/cloud/core/12.1.0.1.0"
$ORACLE_HOME/oui/bin/runInstaller -silent -detachHome ORACLE_HOME="$ORACLE_HOME"
Get rid of an agent that won’t go away in the Grid screens
This has the effect of removing everything related to the host in question!
On host where agent is running...
emctl stop agent
On Grid server...
sqlplus / as sysdba
exec sysman.mgmt_admin.cleanup_agent('<host>:<port>');
exit
On host where agent is running...
emctl start agent
Get rid of targets that won’t go away in the Grid screens:
exec sysman.mgmt_admin.delete_target('target_name','target_type');
See the mgmt_targets table in the sysman schema for the list of known targets.
Re-discover targets
cd $ORACLE_HOME/bin
./agentca -d
Agent Unreachable
- Is the agent running?
From SQL*Plus:
select username, program from v$session where lower(program) like 'emagent%';
If no rows are selected, the agent is not running.
Or from Unix, set the agent environment first (how to see: "which databases are running on the machine"), then check the agent:
. oraenv agent11g
emctl status agent
ps -ef | grep [a]gent
The ps output should show a handful of agent processes. If there are only a few, kill them and restart the agent:
emctl start agent
Check the agent log in $ORACLE_HOME/sysman/log
Start a host level blackout from command line (blackout has to already have been created via OEM)
This could be included in a shell script before patching, for example...
emctl start blackout server_maint -nodeLevel
Start a host level blackout for a certain duration
This starts a blackout from now until now + 8 hours
emctl start blackout server_maint -nodeLevel -d 08:00
Stop a blackout from command line
When patching is done...
emctl stop blackout server_maint
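The start and stop commands above could be wrapped around a maintenance job roughly as follows. This is a sketch only: the function name with_blackout and the EMCTL variable are invented for illustration, and it assumes emctl is on the PATH of the user running the script.

```shell
# Sketch: run a command inside a host-level blackout.
# EMCTL and with_blackout are illustrative names, not part of Enterprise Manager.
EMCTL="${EMCTL:-emctl}"

with_blackout() {
    # $1 = blackout name, $2 = duration as hh:mm, rest = command to run
    bo_name="$1"
    bo_duration="$2"
    shift 2
    "$EMCTL" start blackout "$bo_name" -nodeLevel -d "$bo_duration" || return 1
    "$@"
    rc=$?
    # Always try to stop the blackout, even if the job failed
    "$EMCTL" stop blackout "$bo_name"
    return "$rc"
}
```

A call such as `with_blackout server_maint 08:00 ./patch.sh` (patch.sh being a placeholder for your own job) returns the job's exit status, so a failed patch run is still visible to the caller after the blackout has been stopped.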
Backend WLS or EM application seems to be down
To list targets known to an agent:
emctl config agent listtargets
It looks at the file $AGENT_HOME/sysman/emd/targets.xml
Manually add targets by editing this file and running:
emctl config agent addtargets $AGENT_HOME/sysman/emd/targets.xml
Check state and upload directories under $AGENT_HOME/sysman/emd for .err files
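The check for .err files can be scripted, for instance as below. The helper name list_err_files is hypothetical; the state and upload directory names are taken from the text above, and AGENT_HOME is assumed to be set when no argument is given.

```shell
# Sketch: list .err files under the agent's state and upload directories.
# list_err_files is an illustrative helper name.
list_err_files() {
    # $1 = agent home (defaults to $AGENT_HOME)
    emd_dir="${1:-$AGENT_HOME}/sysman/emd"
    for sub in state upload; do
        if [ -d "$emd_dir/$sub" ]; then
            find "$emd_dir/$sub" -type f -name '*.err' | sort
        fi
    done
}
```

No output means that neither directory contains an .err file.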
Change sysman password
This works until 11.1. For a more complete guide see metalink note 270516.1 or note 259379.1
For the DB Control Release 11.2 and higher, you need to set the environment variable ORACLE_UNQNAME to the value of the DB_UNIQUE_NAME database parameter.
Changing this password is easy as long as it is done correctly. It is a three step process.
Step 1. Change the password in the traditional manner.
SQL> alter user sysman identified by &new_password account unlock;
Step 2. Change the password in the emoms.properties file.
vi ${AGENT_HOME}/sysman/config/emoms.properties
Change the following 2 lines by entering the clear-text password in place of the encrypted one, and change True to False:
oracle.sysman.eml.mntr.emdRepPwd=<new password here>
oracle.sysman.eml.mntr.emdRepPwdEncrypted=False
Step 3. Restart the agent so that it picks up the new password (and encrypts it).
emctl restart agent
Change dbsnmp password
Changing this password is easy as long as it is done correctly. It is a three step process.
Change the password in the traditional manner
SQL> alter user dbsnmp identified by &new_password account unlock;
Change the password in the targets.xml file
vi ${AGENT_HOME}/sysman/emd/targets.xml
Change the following line by entering the clear-text password in place of the encrypted one, and set ENCRYPTED to FALSE
<Property NAME="password" VALUE="<new password>" ENCRYPTED="FALSE"/>
Restart the agent so that it picks up the new password (and encrypts it)
emctl restart agent
Quick Checklist
borrowed from oraxprt.com
Verify that the Agent on the target machine is up and running using:
cd <AGENT_HOME>/bin
emctl status agent
The command should return output such as:
emctl status agent
Oracle Enterprise Manager 10g Release 5 Grid Control 10.2.0.5.0.
Copyright (c) 1996, 2009 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
Agent Version : 10.2.0.5.0
OMS Version : 10.2.0.5.0
Protocol Version : 10.2.0.5.0
Agent Home : /home/oracle/OracleHomes/agent10g
Agent binaries : /home/oracle/OracleHomes/agent10g
Agent Process ID : 24465
Parent Process ID : 24449
Agent URL : https://agentmachine.domain:1830/emd/main/
Repository URL : https://omsmachine.domain:1159/em/upload
Started at : 2010-04-22 15:35:39
Started by user : oracle
….
---------------------------------------------------------------
which indicates that the Agent has started up fine. Also review the <AGENT_HOME>/sysman/log/emagent.nohup to ensure that the Agent is not re-starting frequently, which can affect the OMS to Agent communication.
Refer to Note 548928.1: Enterprise Manager Grid Control Agent 10g, Process Control (Start, Stop & Status) Troubleshooting Guide
Verify that the Agent’s URL, as seen in the Grid Console -> Setup -> Agent name page is the same as the value configured for the EMD_URL in the <AGENT_HOME>/sysman/config/emd.properties file.
Refer to Note 358953.1: What ports are used in communication between the Grid Control OMS and a Management Agent?
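To read the configured value for the EMD_URL comparison, something like the following would do. It is a sketch that assumes emd.properties uses plain KEY=value lines; get_emd_url is a made-up helper name.

```shell
# Sketch: print the EMD_URL value configured in an emd.properties file.
# get_emd_url is an illustrative helper name.
get_emd_url() {
    # $1 = path to emd.properties
    sed -n 's/^EMD_URL=//p' "$1" | head -n 1
}
```

For example, `get_emd_url <AGENT_HOME>/sysman/config/emd.properties` prints the URL to compare against the one shown in the Grid Console.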
1. OMS / Agent Component level issues
If the Agent machine is configured with DHCP and/or the IP address of the machine has recently changed, the OMS will not be able to communicate with the Agent.
Refer to Note 605009.1: Problem: OMS Cannot Communicate with Agent if IP Address of the Grid Agent Machine is Changed
If there is a rogue emagent process on the target machine, then the OMS log/trace files could show communication errors. Refer to Note 733879.1: Communication: OMS Log/Trace Files Show ‘ERROR eml.OMSHandshake processFailure’ for Agent Already Removed from Grid Console
If the Agent is not capable of accepting incoming connection requests from the OMS, then the communication will fail. Refer to Note 550452.1: Communication: OMS to Agent Communication Fails with ‘IOException in sending Request :: Broken pipe’
Verify whether there are multiple Agents installed / discovered from this machine. Refer to Note 435728.1: Communication: OMS to Agent Communication Fails with “Connection refused” if Multiple Agent Targets are Discovered
2. Hostname/IP Address Resolution Issues
If the OMS and Agent Components are located in separate machines, then the hostname/IP address resolution should work correctly from the OMS to the Agent machine.
Refer to Note 763844.1: How to Verify the Hostname/IP Address Resolution Between the 10g Enterprise Manager Grid Control Components?
If the OMS is unable to resolve the hostname / IP address of the Agent machine, the <OMS_HOME>/sysman/log/emoms.trc will show errors such as below, when trying to access the Agent Homepage in the Grid Console:
2010-04-26 12:01:51,405 [EMUI_12_01_26_/console/admin/rep/emdConfig/emdTargetsMain$target=agentmachine.domain_3A3872$type=oracle_emd] ERROR emdConfig.EmdConfigTargetsData getEmdUploadData.1732 - IOException in sending Request :: No route to host
To verify the hostname / IP address resolution from the OMS to the Agent machine, follow the steps below:
Collect the following details on the Agent machine:
Hostname and the corresponding IP Address on which the Agent is configured:
cd <AGENT_HOME>/bin
emctl status agent
Oracle Enterprise Manager 10g Release 5 Grid Control 10.2.0.5.0.
Copyright (c) 1996, 2009 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
Agent Version : 10.2.0.5.0
OMS Version : 10.2.0.5.0
Protocol Version : 10.2.0.5.0
Agent Home : /home/oracle/OracleHomes/agent10g
Agent binaries : /home/oracle/OracleHomes/agent10g
Agent Process ID : 24465
Parent Process ID : 24449
Agent URL : https://agentmachine.domain:1830/emd/main/
Repository URL : https://omsmachine.domain:1159/em/upload
Started at : 2010-04-22 15:35:39
Started by user : oracle
The hostname is the one seen in the ‘Agent URL’ field.
Obtain the IP address for this hostname using:
ping <agentmachine.domain>
Output of these commands:
ping <IP address of the Agent machine>
ping <hostname.domain of the Agent machine>
ping <hostname of the Agent machine>
nslookup <IP address of the Agent machine>
nslookup <hostname.domain of the Agent machine>
nslookup <hostname of the Agent machine>
Collect the following details from the OMS machine:
ping <IP address of the Agent machine>
ping <hostname.domain of the Agent machine>
ping <hostname of the Agent machine>
nslookup <IP address of the Agent machine>
nslookup <hostname.domain of the Agent machine>
nslookup <hostname of the Agent machine>
Compare the output of the above commands on OMS and Agent machines – the outputs should match. If there is a difference or an error, please enlist the help of your System / Network Administrator to correct the configuration in the hosts file or the DNS.
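One way to make this comparison less error-prone is to capture the output of the six commands to a file on each machine and then diff the two files. The sketch below is an assumption-laden illustration: the function names are invented, and it assumes a ping that accepts -c (as on Linux; other platforms use a different syntax).

```shell
# Sketch: collect ping/nslookup output for the Agent machine into one report.
# resolution_commands and collect_resolution are illustrative names.
resolution_commands() {
    # $1 = IP address, $2 = hostname.domain, $3 = short hostname of the Agent machine
    for target in "$1" "$2" "$3"; do
        echo "ping -c 1 $target"
    done
    for target in "$1" "$2" "$3"; do
        echo "nslookup $target"
    done
}

collect_resolution() {
    # Run each command and record its output under a header line
    resolution_commands "$@" | while read -r cmd; do
        echo "### $cmd"
        $cmd </dev/null 2>&1
    done
}
```

Running `collect_resolution 20.20.20.20 agentmachine.domain agentmachine > resolution.txt` on both the OMS and the Agent machine gives two files to compare. Round-trip times will naturally differ; what matters is that the resolved names and addresses agree.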
Note:
1. If all the above commands work fine but the OMS still fails to communicate with the Agent, then stop and restart the OMS once to reset the TCP caching:
cd <OMS_HOME>/opmn/bin
opmnctl stopall
opmnctl startall
2. If the Agent machine has multiple NIC cards / IP addresses, the Agent can be bound to a particular hostname / IP address combination using steps in:
Note 390444.1: How to: Tell the agent to listen to only one specific NIC Network Interface Card?
If the hostname / IP address resolution works fine from the OMS to the Agent but the communication still fails, then check for the presence of a Firewall or Proxy Server in the setup using the steps below.
3. Firewall Setup / Proxy Server Issues
For details about configuring the Firewall and using the Proxy Server for the EM components, refer to
Note 1088393.1: How to Verify the Communication Between the 10g Enterprise Manager Grid Control Components via Firewall/Proxy?
If the Agent port is blocked, then the <OMS_HOME>/sysman/log/emoms.trc will show:
2008-12-01 11:21:25,535 [EMUI_11_17_40_/console/admin/rep/emdConfig/emdTargetsMain$target=agentmachine.domain_3A3872$type=oracle_emd] ERROR emdConfig.EmdConfigTargetsData getEmdTargetsList.1767 - CommException:
Unable to get list of targets from emd-getEmdTargetsList()
2008-12-01 11:21:25,541 [EMUI_11_17_40_/console/admin/rep/emdConfig/emdTargetsMain$target=agentmachine.domain_3A3872$type=oracle_emd] ERROR emdConfig.EmdConfigTargetsData getEmdTargetsList.1769 - Connection timed out oracle.sysman.emSDK.emd.comm.CommException: Connection timed out
The following error is displayed when trying to look at the Targets -> Agent Host -> Performance page:
An error has occurred!
Unable to obtain data for target solaris.oracle.com. The target may be down. Switching to the last 24 hrs view
An incorrect Proxy server configuration on the OMS side can cause the problems described in
Note 395717.1: Communication: OMS to Agent Communication Fails With ‘Cannot Establish Proxy Connection’ Due to Proxy-Related Settings
To verify the communication from the OMS to the Agent machine when a Firewall / Proxy server is in use:
Identify the Agent port and URL using the steps in
Note 358953.1: What ports are used in communication between the Grid Control OMS and a Management Agent?
Test the connectivity to the Agent URL from the OMS machine, using one of the following methods:
Open a web browser on the OMS machine and try to access one of these URLs:
http://agentmachine.domain:agentport/emd/main
OR
https://agentmachine.domain:agentport/emd/main
The URL must return an output similar to:
EMAgent10.1.0.2.0
Congratulations, EMAgent is working!
Use telnet
telnet agentmachine.domain <agent port>
Sample output:
telnet agentmachine.domain 3872
Trying 20.20.20.20...
Connected to agentmachine.domain.
Escape character is '^]'.
If the access to the port is blocked due to a firewall, then the above command will fail with:
telnet agentmachine.domain 3872
Trying 20.20.20.20...
telnet: connect to address 20.20.20.20: Connection refused
Use wget
wget <agent http url>
OR
wget --no-check-certificate <agent https url>
If any of the above commands fail, please contact your Network Administrator to determine if there is a Firewall / Proxy Server in use and check the configuration.
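The host and port needed for the telnet test can be pulled out of the Agent URL reported by emctl status agent. The helper below is only a sketch with an invented name (agent_host_port), and it assumes the URL has the form http(s)://host:port/path.

```shell
# Sketch: split an Agent URL into "host port" for use with telnet.
# agent_host_port is an illustrative helper name.
agent_host_port() {
    # $1 = Agent URL, e.g. https://agentmachine.domain:1830/emd/main/
    echo "$1" | sed -n 's#^https\{0,1\}://\([^:/]*\):\([0-9][0-9]*\)/.*$#\1 \2#p'
}

# Possible use on the OMS machine:
#   set -- $(agent_host_port "https://agentmachine.domain:1830/emd/main/")
#   telnet "$1" "$2"
```

If the URL does not match the assumed form, the function prints nothing, so the caller can detect this before attempting a connection.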
References
- Note 235290.1: Understanding the Enterprise Manager Management Agent 10g ‘emd.properties’ File
- Note 358953.1: What ports are used in communication between the Grid Control OMS and a Management Agent?
- Note 471842.1: Understanding Proxy Settings in Enterprise Manager Grid Control
Some useful Metalink Master Documents related to Grid Control
- Master Index for Managing Oracle Database and Listener with Grid Control [ID 1304021.1]
- Master Note for 10g Grid Control Agent Process Control (Start, Stop & Status) & Configuration [ID 1082009.1]
- How to Run the RDA against a Grid Control Installation [ID 1057051.1]
- How to Run the RDA against a Grid Control Installation Release 11g [ID 1190193.1]
- Grid Control Target Maintenance: Steps to Diagnose Issues Related to "Agent Unreachable" Status [ID 271126.1]
- Master Note for 10g Grid Control Enterprise Manager Communication and Upload issues [ID 1086343.1]
- Master Note for Target Maintenance Through 10g Enterprise Manager Grid Control [ID 1202453.1]
- Receiving agent unreachable notification emails very often after 10.2.0.4 agent upgrade [ID 752296.1]
- Healthcheck Metric failing for a 10.2.0.4 Target Database with 10.2.0.4 Agent [ID 602633.1]