Administration
This article describes how to manage and perform administrative tasks related to OpsDash server and agent installation.
Files Used by OpsDash
The files installed, created or used by OpsDash are listed below:
Log Files
OpsDash log files are present under /var/log/opsdash. The OpsDash server
writes the log file opsdash-server.log and the agent opsdash-agent.log.
Note that both files may be present on the same machine if the agent is also
installed alongside the server.
Log rotation spec files for these files are installed under /etc/logrotate.d
during installation via RPM/deb packages. You can tweak the defaults (weekly
rotation for 4 weeks) if required.
Database Files
The OpsDash server maintains a set of database files under /var/lib/opsdash/db.
These files are updated constantly during regular operation, and should not be
manipulated directly.
These databases contain metric data, as well as configuration made via the web UI.
Configuration Files
The OpsDash server configuration file server.cfg and the agent
configuration file agent.cfg live in /etc/opsdash. Additionally, any
license files *.lic are also present in that directory.
Other Files
The three main OpsDash binaries, opsdash-server, opsdash-agent and
opsdash-admin, are present in /usr/sbin. The daemon scripts are
present under /etc (exact location depends on Linux distribution).
Note that the daemon names have a d suffix by convention. These scripts
should work with both systemd and non-systemd style inits. The directory
/usr/share/opsdash contains static files used by the OpsDash server.
Users and Groups
During installation of OpsDash via a .deb or .rpm package, the user opsdash
and its primary group opsdash are created. The OpsDash server and agent
daemons run as this user.
The user and group are not created if they exist already, so that pre-allocated UIDs can be used.
Security
In general, OpsDash is meant to be a tool for internal use by operations and development teams. It is not designed to be secure enough to be exposed to the public internet.
Securing the OpsDash Server
The OpsDash server web UI (that runs by default on port 8080) exposes details
of your servers and configuration of services (passwords are not passed
on to the web UI from the OpsDash server backend, however) and notification
integrations (especially tokens for HipChat, Slack and PagerDuty). This UI
should therefore be accessible only to authorized personnel, and should be
secured behind reverse proxies, firewalls or VPN as necessary. The configuration
entry listen.web-ui in server.cfg can be used to make OpsDash listen only
on an internal-network interface.
There are two ports (6273/tcp and 6273/udp) that are used by agents to report
metric and dashboard data. If your agents and the server are both connected
by an internal network, then set the entries listen.agent-metrics
and listen.agent-data in server.cfg to accept incoming connections on
these ports only via the internal network.
The configuration file /etc/opsdash/server.cfg may contain sensitive
information. This file should be readable only by the root and opsdash
users.
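One way to apply this restriction is sketched below. The group-based approach (mode 640, with the file's group set to opsdash) is an assumption about how you want to grant the opsdash user read access, and restrict_cfg is a hypothetical helper name, not something shipped with OpsDash.

```shell
# restrict_cfg FILE [GROUP]
# Hypothetical helper: leaves FILE readable by its owner and group only
# (mode 640), so "other" users have no access. If GROUP is given, the
# file's group is changed first (this normally requires root).
restrict_cfg() {
    cfg="$1"
    group="${2:-}"
    if [ -n "$group" ]; then
        chgrp "$group" "$cfg" || return 1
    fi
    chmod 640 "$cfg"
}

# Typical use, as root (not run here):
#   chown root /etc/opsdash/server.cfg
#   restrict_cfg /etc/opsdash/server.cfg opsdash
```

With owner root and group opsdash, only root and members of the opsdash group can read the file.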
Note that it is not possible to secure the installation if root cannot be
trusted.
Backup and Restore
The data stored internally by the OpsDash server can be backed up using the
opsdash-admin command-line utility (explained in detail in the sections
below).
The backup can be performed while the OpsDash server is running, and does not impact the performance of the server while it is in progress. The command backs up the metric time-series data, as well as the data for the configuration done via the web UI. We recommend that you perform this backup each day, and retain 7 days' worth of backups on a rolling basis.
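A minimal sketch of such a daily, rolling backup follows. It assumes one directory per weekday, so that each run overwrites the copy from a week earlier and seven copies are kept. The backup root /mnt/backups/opsdash and the function name daily_backup are illustrative choices, not OpsDash defaults.

```shell
# Assumed backup location; adjust to your environment.
BACKUP_ROOT="${BACKUP_ROOT:-/mnt/backups/opsdash}"
# Overridable so the function can be exercised without OpsDash installed.
OPSDASH_ADMIN="${OPSDASH_ADMIN:-opsdash-admin}"

# daily_backup: back up into $BACKUP_ROOT/<weekday>, e.g. .../Monday.
daily_backup() {
    day="$(date +%A)"
    target="$BACKUP_ROOT/$day"
    # The backup directory must be absent or empty, so remove the
    # copy left there by last week's run first.
    rm -rf "$target"
    "$OPSDASH_ADMIN" backup "$target"
}
```

The function would typically be called once a day from a scheduled job running as root or as the opsdash user. Note that the previous copy for that weekday is removed before the new backup starts, so a failed run leaves six usable copies rather than seven.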
Additionally, the contents of /etc/opsdash should be backed up whenever it is changed. This is not done by the opsdash-admin tool.
To restore from a backup, use the following procedure:
Install the same version of OpsDash server from which the data was backed up.
Ensure the daemon opsdash-serverd is not running.
Replace the contents of /etc/opsdash from the backup. After restore this directory should contain server.cfg and any *.lic license files.
Replace the contents of /var/lib/opsdash/db from the backup. After restore this directory should contain main.db, alert.db and various 2-digit directories.
Start the server daemon and check log files and web UI.
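The procedure above, apart from installing the package, can be sketched as a script. This is only an illustration under stated assumptions: the function names are hypothetical, the "stop" action of the daemon script is assumed to behave like a conventional init script, and the chown of the restored database files to opsdash is an assumption based on the daemon running as that user.

```shell
SERVICE_CMD="${SERVICE_CMD:-service}"
CFG_DIR="${CFG_DIR:-/etc/opsdash}"
DB_DIR="${DB_DIR:-/var/lib/opsdash/db}"
# Assumption: database files should belong to the opsdash user.
# Set OWNER to an empty string to skip the chown.
OWNER="${OWNER-opsdash:opsdash}"

# replace_contents SRC DST: empty DST, then copy everything from SRC into it.
replace_contents() {
    src="$1"
    dst="$2"
    mkdir -p "$dst" || return 1
    find "$dst" -mindepth 1 -delete || return 1
    cp -Rp "$src"/. "$dst"/
}

# restore_opsdash CFG_BACKUP DB_BACKUP
#   CFG_BACKUP: your own copy of /etc/opsdash
#   DB_BACKUP:  a directory written by "opsdash-admin backup"
restore_opsdash() {
    cfg_backup="$1"
    db_backup="$2"
    # Make sure the server daemon is not running (ignore "already stopped").
    "$SERVICE_CMD" opsdash-serverd stop || true
    replace_contents "$cfg_backup" "$CFG_DIR" || return 1
    replace_contents "$db_backup" "$DB_DIR" || return 1
    if [ -n "$OWNER" ]; then
        chown -R "$OWNER" "$DB_DIR" || return 1
    fi
    "$SERVICE_CMD" opsdash-serverd start
}
```

After the function returns, the log files and the web UI still need to be checked by hand.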
We recommend that you try out the restore procedure once before you really need to do it.
There is no data to be backed up at the agent-side. You may want to make the configuration file /etc/opsdash/agent.cfg part of a standard template/image, or document the edits needed (typically only the IP/hostname of the OpsDash server).
Upgrading
This section describes how to upgrade a working OpsDash server or agent installation to a newer version.
Note: If you previously installed via packages, it is possible to set up the system to use the RapidLoop yum or apt repository, and then upgrade to a newer version through these repositories. See this section to learn more about the RapidLoop yum and apt repositories.
Upgrading via the yum or apt Repository
If you have set up the system to use the RapidLoop yum or apt repository, you can receive updates for the OpsDash server and agent packages generally with:
# for RHEL-based systems
yum update
# for Debian-based systems
apt-get update && apt-get upgrade
or specifically for the OpsDash packages with:
# for RHEL-based systems
yum install opsdash-server
yum install opsdash-agent
# for Debian-based systems
apt-get install opsdash-server
apt-get install opsdash-agent
Note that the subcommands for yum and apt-get are both called install, but they result in either install or upgrade as required.
You should check the state of the service after upgrade, and verify that it is running:
# check service status (both RHEL and Debian)
service opsdash-serverd status
service opsdash-agentd status
# ensure service is running (both RHEL and Debian)
service opsdash-serverd start
service opsdash-agentd start
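The check-then-start sequence can be wrapped in a small helper, sketched here. It assumes that the "status" action of the daemon script exits with a non-zero code when the daemon is not running, which is conventional for init scripts but is not stated in this document. The name ensure_running is hypothetical.

```shell
# Overridable so the helper can be tried out with a stand-in command.
SERVICE_CMD="${SERVICE_CMD:-service}"

# ensure_running NAME: start the daemon NAME unless "status" reports success.
ensure_running() {
    name="$1"
    if "$SERVICE_CMD" "$name" status >/dev/null 2>&1; then
        echo "$name is running"
    else
        echo "$name is not running, starting it"
        "$SERVICE_CMD" "$name" start
    fi
}

# Typical use after an upgrade, as root (not run here):
#   ensure_running opsdash-serverd
#   ensure_running opsdash-agentd
```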
Upgrading via Packages
If you are using downloaded packages, you can upgrade with the commands:
# for RHEL-based systems
rpm -Uhv opsdash-agent-1.7-1.x86_64.rpm
# for Debian-based systems
dpkg -i opsdash-agent_1.7_amd64.deb
You should check the state of the service after upgrade, and verify that it is running:
# check service status (both RHEL and Debian)
service opsdash-serverd status
service opsdash-agentd status
# ensure service is running (both RHEL and Debian)
service opsdash-serverd start
service opsdash-agentd start
Configuration Files During Upgrade
Starting in v1.7 of OpsDash, the handling of configuration files during upgrade requires less operator intervention.
On Debian-based systems, in case the configuration files have been modified, apt-get will show a prompt like this:
(Reading database ... 67111 files and directories currently installed.)
Preparing to unpack .../opsdash-server_1.7_amd64.deb ...
Unpacking opsdash-server (1.7) over (1.6.3) ...
Setting up opsdash-server (1.7) ...
Configuration file '/etc/opsdash/server.cfg'
==> File on system created by you or by a script.
==> File also in package provided by package maintainer.
What would you like to do about it ? Your options are:
Y or I : install the package maintainer's version
N or O : keep your currently-installed version
D : show the differences between the versions
Z : start a shell to examine the situation
The default action is to keep your current version.
*** server.cfg (Y/I/N/O/D/Z) [default=N] ?
We recommend that you accept the default, which brings up the new version of OpsDash with the old configuration files. The new version of OpsDash will still work in this situation. The new configuration file will be saved as /etc/opsdash/server.cfg.dpkg-dist. You should review this file, copy over any relevant changes into /etc/opsdash/server.cfg and restart the OpsDash daemon. The agent package also follows the same process.
On RHEL-based systems, in case the configuration files have been modified and there is a newer version of the configuration file, yum will save the new file as /etc/opsdash/server.cfg.rpmnew and bring up the new version of OpsDash with the old configuration files. You should review this file, copy over any relevant changes into /etc/opsdash/server.cfg and restart the OpsDash daemon. The agent package also follows the same process.
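To review the files left behind by either package manager, a helper along the following lines may be useful. It is a sketch only: review_new_configs is a hypothetical name, and it simply prints a unified diff between each current configuration file and the .dpkg-dist or .rpmnew file saved next to it.

```shell
# review_new_configs [DIR]: show what changed in packaged config files.
# DIR defaults to /etc/opsdash.
review_new_configs() {
    dir="${1:-/etc/opsdash}"
    for new in "$dir"/*.dpkg-dist "$dir"/*.rpmnew; do
        # An unmatched glob stays literal; skip it.
        [ -e "$new" ] || continue
        # Strip the final suffix: server.cfg.rpmnew -> server.cfg
        current="${new%.*}"
        echo "== $current vs $new"
        # diff exits with 1 when the files differ; that is expected here.
        diff -u "$current" "$new" || true
    done
}
```

Lines prefixed with "+" in the output exist only in the packaged file and are candidates for copying into the live configuration.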
In general, the upgrade process tries to minimize the downtime of the daemon and tries to bring it back up as quickly as possible, even with older configuration files if needed. However, you should check that the daemon (the server or the agent, as the case may be) is in a running state after the update using the commands in the previous section.
Uninstallation
Uninstallation of OpsDash server or agent can be done using standard package manager commands:
# for Debian-based distros
sudo apt-get remove opsdash-server
# for RHEL-based distros
sudo yum remove opsdash-server
By default, some files and configuration are left behind so that it is possible to re-install the same or a newer version of OpsDash without loss of data. If you wish to remove all traces of OpsDash from the system, do the following additional steps:
Remove contents of /var/lib/opsdash and /var/log/opsdash
Remove the user opsdash and the group opsdash
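These additional steps can be sketched as follows. Because they are destructive, the sketch only prints the commands by default; the RUN variable and the purge_opsdash name are illustrative. userdel and groupdel are the standard Linux account tools, and on some distributions userdel may already remove the user's primary group, in which case the groupdel step reports that the group does not exist.

```shell
# RUN is prefixed to every command. The default "echo" gives a dry run;
# set RUN=sudo (or RUN=env when already root) to actually execute.
RUN="${RUN:-echo}"

purge_opsdash() {
    # Remove the contents of the data and log directories.
    $RUN find /var/lib/opsdash /var/log/opsdash -mindepth 1 -delete
    # Remove the user and the group.
    $RUN userdel opsdash
    $RUN groupdel opsdash
}
```

Calling purge_opsdash with the default settings lists the three commands without changing anything on the system.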
The opsdash-admin CLI
The OpsDash server package comes with a command-line utility,
called opsdash-admin, for performing administrative tasks:
$ opsdash-admin
opsdash-admin - tool for administering OpsDash server
Usage:
opsdash-admin command [arguments]
The commands are:
add-service add a new service
backup backup the OpsDash database
delete-server delete a server and all associated data
delete-service delete a service and all associated data
help show usage information
version print OpsDash version
Use "opsdash-admin help [command]" for more information about a command.
The following sections describe each subcommand (except help and version) in more detail.
opsdash-admin add-service
A new service can be added using the “add-service” subcommand:
$ opsdash-admin add-service
opsdash-admin add-service - add a new service
Usage:
opsdash-admin add-service name type config
This command creates a new service, and is identical to creating one through
the OpsDash web UI.
"name" should be a string of up to 32 characters of less, containing only
alphanumeric characters, "-" and "."
"type" is the type of the service, one of:
"mysql" - MySQL
"postgresql" - PostgreSQL
"mongodb" - MongoDB
"memcache" - memcache
"redis" - Redis
"httpurl" - HTTP URL
"elasticsearch" - Elasticsearch Cluster
"elasticsearch-index" - Elasticsearch Index
"elasticsearch-node" - Elasticsearch Node
"docker" - Docker
"php-fpm" - PHP FPM
"haproxy" - HAProxy
"config" is a JSON-encoded string with the configuration for the service.
See https://www.opsdash.com/docs for more info.
Example:
opsdash-admin add-service db-srv-3 mysql '{"address":"10.0.3.34:3306"}'
The parameter config is a JSON object literal. The keys that it should
contain depend on the type of the service being created.
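When services are added from scripts, it can help to check the name against the rule in the usage text (up to 32 characters, only alphanumerics, "-" and ".") before calling the tool. The following is a sketch; both function names are hypothetical, and rejecting an empty name is an assumption, since the usage text does not say how an empty name is handled.

```shell
# Overridable so the wrapper can be tried without OpsDash installed.
OPSDASH_ADMIN="${OPSDASH_ADMIN:-opsdash-admin}"

# valid_service_name NAME: succeed if NAME is 1 to 32 characters long and
# contains only letters, digits, "-" and ".".
valid_service_name() {
    name="$1"
    [ -n "$name" ] || return 1
    [ "${#name}" -le 32 ] || return 1
    case "$name" in
        *[!A-Za-z0-9.-]*) return 1 ;;
    esac
    return 0
}

# add_service_checked NAME TYPE CONFIG: validate NAME, then add the service.
add_service_checked() {
    svc_name="$1"
    svc_type="$2"
    svc_config="$3"
    if ! valid_service_name "$svc_name"; then
        echo "invalid service name: $svc_name" >&2
        return 2
    fi
    "$OPSDASH_ADMIN" add-service "$svc_name" "$svc_type" "$svc_config"
}
```

The JSON config is passed through unchanged as a single argument, so it should be quoted by the caller exactly as in the examples in this section.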
MySQL service:
Here is an example of creating a MySQL service:
opsdash-admin add-service db-srv-1 mysql \
'{ "address": "10.1.2.3:3306", "user": "someuser", "pass": "plainpass" }'
The keys for the MySQL service JSON object are:
|
Required, string. |
|
Optional, string. |
|
Optional, string. |
PostgreSQL service:
Here is an example of creating a PostgreSQL service:
opsdash-admin add-service db-srv-2 postgresql \
'{ "address": "10.1.2.3:5432", "user": "someuser", "pass": "plainpass", "ssl": true }'
The keys for the PostgreSQL service JSON object are:
|
Required, string. |
|
Optional, string. |
|
Optional, string. |
|
Optional, boolean. |
MongoDB service:
Here is an example of creating a MongoDB service:
opsdash-admin add-service mongo-1 mongodb \
'{ "address": "10.9.0.1:27017", "user": "someuser", "pass": "plainpass" }'
The keys for the MongoDB service JSON object are:
|
Required, string. |
|
Optional, string. |
|
Optional, string. |
Memcache service:
Here is an example of creating a Memcache service:
opsdash-admin add-service cache-42 memcache \
'{ "address": "10.1.2.3:11211" }'
The keys for the Memcache service JSON object are:
|
Required, string. |
Redis service:
Here is an example of creating a Redis service:
opsdash-admin add-service redis-1 redis \
'{ "address": "10.1.2.3:6379" }'
The keys for the Redis service JSON object are:
|
Required, string. |
|
Optional, string. |
HTTP URL service:
Here is a simple example of creating an HTTP(S) URL service:
opsdash-admin add-service cust10-web httpurl \
'{ "url": "https://www.example.com/", "method": "GET" }'
A more complicated example involving POST-ing to presumably an internal queueing service:
$ cat params.json
{
"url": "http://queuesrv.int/",
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/x-www-form-urlencoded"
}
],
"body": "ping=1"
}
$ opsdash-admin add-service api-2 httpurl "$(cat params.json)"
service "api-2" created successfully.
The keys for the HTTP URL service JSON object are:
|
Required, string. |
|
Required, string. |
|
Optional, array of objects, each with “key” and “value”
fields. |
|
Optional, string. |
Elasticsearch service:
Here is an example of creating an Elasticsearch service:
opsdash-admin add-service es-clus-1 elasticsearch \
'{ "address": "10.1.2.3:9200" }'
This creates a service of type “Elasticsearch Cluster”. The keys for the Elasticsearch service JSON object are:
|
Required, string. |
|
Optional, boolean. |
|
Optional, string. |
|
Optional, string. |
Elasticsearch Index service:
Here is an example of creating an Elasticsearch Index service:
opsdash-admin add-service es-index-books elasticsearch-index \
'{ "address": "10.1.2.3:9200", "index": "books" }'
This creates a service of type “Elasticsearch Index”. The keys for the Elasticsearch Index service JSON object are:
|
Required, string. |
|
Required, string. |
|
Optional, boolean. |
|
Optional, string. |
|
Optional, string. |
Elasticsearch Node service:
Here is an example of creating an Elasticsearch Node service:
opsdash-admin add-service es-node-p3 elasticsearch-node \
'{ "address": "10.1.2.3:9200", "node": "es-staging1-n3" }'
This creates a service of type “Elasticsearch Node”. The keys for the Elasticsearch Node service JSON object are:
|
Required, string. |
|
Required, string. |
|
Optional, boolean. |
|
Optional, string. |
|
Optional, string. |
Docker service:
Here is an example of creating a Docker service:
opsdash-admin add-service docker-1 docker \
'{ "url": "unix:///var/run/docker.sock" }'
This creates a service of type “Docker”. The keys for the Docker service JSON object are:
|
Required, string. |
PHP FPM service:
Here is an example of creating a PHP FPM service:
opsdash-admin add-service fpm-www-pool php-fpm \
'{ "url": "http://10.1.0.2/status" }'
This creates a service of type “PHP-FPM”. The keys for the PHP FPM service JSON object are:
|
Required, string. |
HAProxy service:
Here is an example of creating an HAProxy service:
opsdash-admin add-service my-haproxy-1 haproxy \
'{ "url": "http://10.1.0.2/stats" }'
This creates a service of type “HAProxy”. The keys for the HAProxy service JSON object are:
|
Required, string. |
opsdash-admin backup
The “backup” subcommand can be used to take a hot-backup of the OpsDash server databases. The OpsDash server daemon will continue to run and function normally during this operation. The backup includes configuration data (made using the web UI) as well as the metric time-series data.
$ opsdash-admin backup
opsdash-admin backup - backup the OpsDash database
Usage:
opsdash-admin backup dir
This command creates a full backup of all the OpsDash database files into the
specified directory "dir". This directory should either be absent, in which
case it will be created, or should be present but empty.
This command must be run as "root" or the user "opsdash" for it to be able
to open the database.
The progress of the backup is logged to stderr. The command will exit with
exit code 0 if the backup completed successfully, and with 1 otherwise.
Example:
sudo opsdash-admin backup /mnt/backups/opsdash/Friday
Here is a sample invocation of the backup:
$ rm -rf monday
$ sudo opsdash-admin backup monday
2015/08/24 02:40:18 successfully created backup directory [monday/]
2015/08/24 02:40:18 successfully backed up [/var/lib/opsdash/db/main.db] -> [monday/main.db]
2015/08/24 02:40:18 successfully backed up [/var/lib/opsdash/db/alert.db] -> [monday/alert.db]
2015/08/24 02:40:18 successfully backed up [/var/lib/opsdash/db/01/00001.db] -> [monday/01/00001.db]
2015/08/24 02:40:18 successfully backed up [/var/lib/opsdash/db/02/00002.db] -> [monday/02/00002.db]
..
2015/08/24 02:40:19 successfully backed up [/var/lib/opsdash/db/53/00053.db] -> [monday/53/00053.db]
2015/08/24 02:40:19 successfully backed up [/var/lib/opsdash/db/54/00054.db] -> [monday/54/00054.db]
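Since progress goes to stderr and the exit code signals success or failure, an unattended backup can keep the progress log and react to the result. The wrapper below is a sketch; backup_with_log and the log file argument are illustrative and not part of opsdash-admin.

```shell
# Overridable so the wrapper can be tried without OpsDash installed.
OPSDASH_ADMIN="${OPSDASH_ADMIN:-opsdash-admin}"

# backup_with_log DIR LOGFILE: run the backup into DIR, append the progress
# messages (stderr) to LOGFILE, and report the outcome.
backup_with_log() {
    target="$1"
    log="$2"
    if "$OPSDASH_ADMIN" backup "$target" 2>>"$log"; then
        echo "backup ok: $target"
    else
        echo "backup FAILED: $target (see $log)" >&2
        return 1
    fi
}
```

The non-zero return value on failure lets a calling script or scheduler raise its own notification.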
opsdash-admin delete-server
The “delete-server” subcommand can be used to delete all data associated with the specified server. This includes metric time-series data and associated alerts. Source groups that this server was a part of will be updated to reflect the removal. However, the alert history of this server is not deleted.
This operation permanently deletes data and cannot be undone. Use with caution.
$ opsdash-admin delete-server
opsdash-admin delete-server - delete a server and all associated data
Usage:
opsdash-admin delete-server [-y] name
This command deletes the server called "name" and all associated data
irrecoverably. It is not possible to undo this.
Note that it is not possible to do this operation via the OpsDash web UI.
-y Do not ask for delete confirmation; assume "yes"
Example:
opsdash-admin delete-server web-node-43
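When several decommissioned servers have to be removed from a script, the -y flag avoids the confirmation prompt. The loop below is a sketch; delete_servers is a hypothetical name, and it assumes that opsdash-admin exits with a non-zero code when a deletion fails, which the usage text does not state. The deletions cannot be undone, so the list of names should be checked carefully first.

```shell
# Overridable so the loop can be tried without OpsDash installed.
OPSDASH_ADMIN="${OPSDASH_ADMIN:-opsdash-admin}"

# delete_servers NAME...: delete each named server without prompting.
# Returns 1 if any deletion failed, 0 otherwise.
delete_servers() {
    failed=0
    for name in "$@"; do
        if ! "$OPSDASH_ADMIN" delete-server -y "$name"; then
            echo "could not delete server: $name" >&2
            failed=1
        fi
    done
    return "$failed"
}

# Typical use (not run here):
#   delete_servers web-node-43 web-node-44
```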
opsdash-admin delete-service
The “delete-service” subcommand can be used to delete all data associated with the specified service. This includes metric time-series data and associated alerts. Source groups that this service was a part of will be updated to reflect the removal. However, the alert history of this service is not deleted.
This operation permanently deletes data and cannot be undone. Use with caution.
$ opsdash-admin delete-service
opsdash-admin delete-service - delete a service and all associated data
Usage:
opsdash-admin delete-service [-y] name
This command deletes the service called "name" and all associated data
irrecoverably. It is not possible to undo this.
-y Do not ask for delete confirmation; assume "yes"
Example:
opsdash-admin delete-service db-srv-3