PSTORAGE−STAT

NAME

pstorage-stat, pstorage-top − Parallels Cloud Storage cluster monitoring tools

SYNOPSIS

pstorage top [−n] [−t time] [{−s|−S|−O} key1[=inv][,key2...]]

pstorage stat [−nX] [{−s|−S|−O} key1[=inv][,key2...]]

DESCRIPTION

These commands are used to monitor Pstorage cluster status and health.

pstorage stat retrieves the current Pstorage cluster statistics and prints them in text format.

pstorage top monitors Pstorage clusters in real time.

OPTIONS

−n, −−numeric−addrs

Do not resolve IP addresses to host names.

−T, −−refresh−time=time

Set the refresh interval, in ms.

−s, −−mds−sort=key1[=inv][,key2...]

Sort MDSs first by key1, then by key2, and so on. Specify inv after a key to invert the sort order.

−S, −−cs−sort=key1[=inv][,key2...]

Sort CSs first by key1, then by key2, and so on. Specify inv after a key to invert the sort order.

−O, −−clnt−sort=key1[=inv][,key2...]

Sort clients first by key1, then by key2, and so on. Specify inv after a key to invert the sort order.

−X, −−xml

Print cluster statistics in XML format.

Note: All the above options except −−xml can also be set in the CONFIGURATION FILE described below.
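The sort specification accepted by −s, −S, and −O can be illustrated with a small parser. This is only a sketch of how such a specification might be interpreted, not the tool's own implementation. It assumes that inversion is written as the literal suffix "=inv". The key names "avail" and "id" in the usage note are hypothetical, since this page does not list the valid keys.

```python
from typing import Dict, List, Tuple


def parse_sort_spec(spec: str) -> List[Tuple[str, bool]]:
    """Parse 'key1[=inv][,key2...]' into (key, inverted) pairs."""
    parsed = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas
        key, sep, modifier = item.partition("=")
        if sep and modifier != "inv":
            raise ValueError(f"unknown modifier in {item!r}")
        parsed.append((key, modifier == "inv"))
    return parsed


def sort_rows(rows: List[Dict], spec: str) -> List[Dict]:
    """Sort rows by key1 first, then key2, and so on."""
    result = list(rows)
    # Apply keys from least to most significant; sorted() is stable,
    # so earlier keys in the spec end up taking priority.
    for key, inverted in reversed(parse_sort_spec(spec)):
        result = sorted(result, key=lambda row: row[key], reverse=inverted)
    return result
```

With rows that carry the hypothetical fields "avail" and "id", sort_rows(rows, "avail=inv,id") orders them by descending "avail" and breaks ties by ascending "id".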

CONFIGURATION FILE

The configuration file /etc/pstorage/pstorage.conf allows you to set the following options:

numeric.addrs

Do not resolve IP addresses to hostnames.

refresh.time

Set the refresh interval, in ms.

freespace.threshold.mb

Set the absolute free space threshold, in MB. If free space drops below this value, the status will change to Warning (see Space Statuses). Default value: 1024.

freespace.threshold.percent

Set the relative free space threshold, in percent. If free space percentage drops below this value, the status will change to Warning (see Space Statuses). Default value: 20.

mds.sort

Set MDS sorting options, see OPTIONS.

cs.sort

Set CS sorting options, see OPTIONS.

clnt.sort

Set client sorting options, see OPTIONS.

KEYBOARD CONTROLS

Show all tables (MDSs, CSs, clients, event log).

Show the MDS table only.

Show the CS table only.

Cycle CS table views.

Show the client table only.

Show the event log only.

Toggle between hostnames and IP addresses.

Show the locations table.

Show the hosts table.

Show the disks table.

Cycle 5s, 1m, 5m and 15m average intervals for all displayed rates.

Select sort fields for the selected table (MDS, CS, or client).

Toggle additional information in the header.

ENTER and SPACE

Refresh output.

q, ESC and Ctrl−C

Quit.

h, ?

Show help.

TABLE FIELDS

This section explains the header and table fields you can browse (MDS, CS, and event log).

Header Fields
The header shows the overall cluster−wide status and statistics:

• cluster name and status (see Cluster Statuses)

• total physical/logical disk space

• the number of MDS nodes and time since the last master change

• the number of CS nodes

• license status

• replication settings (the normal number of chunk replicas and the minimum number after which a chunk gets blocked until recovered)

• the number of chunks in the following states:

HEALTHY

Chunks with the normal number of replicas.

STANDBY

Chunks with standby replicas. Standby replicas are slightly out of sync (e.g., due to brief network disconnects) and can be resynchronized quickly without full replication.

DEGRADED

Chunks with the number of replicas between the minimum and normal (if the minimum is lower than the normal number of replicas).

URGENT

Chunks with exactly the minimum number of replicas (if the minimum is lower than the normal number of replicas).

BLOCKED

Chunks with fewer than the minimum number of replicas, so writing to them is blocked until more replicas are created.

PENDING

Chunks in the top−priority queue for replication and thus blocked.

OFFLINE

Chunks with no healthy replicas.

REPLICATING

Chunks being replicated at the moment.

OVERCOMMITTED

Chunks with more than the normal number of replicas.

DELETING

Chunks queued for deletion.

VOID

Unused chunks.

• the total number of files, inodes, file maps, chunks and chunk replicas

• I/O speeds, in bytes and operations per second (excluding replication I/O)

• replication I/O speed and estimated time to replicate

• rate of syncs and datasyncs, in operations per second
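The replica-count rules behind the HEALTHY, DEGRADED, URGENT, BLOCKED, OFFLINE, and OVERCOMMITTED chunk states can be sketched as a single function. This is a simplified illustration that assumes each chunk falls into exactly one state and that every counted replica is healthy. The real counters may overlap, and states such as STANDBY, PENDING, REPLICATING, DELETING, and VOID depend on information not modeled here.

```python
def classify_chunk(replicas: int, normal: int, minimum: int) -> str:
    """Map a healthy-replica count to a chunk state name.

    normal  - the normal number of replicas from the replication settings
    minimum - the minimum number of replicas from the replication settings
    """
    if minimum > normal:
        raise ValueError("minimum must not exceed normal")
    if replicas <= 0:
        return "OFFLINE"        # no healthy replicas
    if replicas < minimum:
        return "BLOCKED"        # fewer than the minimum
    if replicas > normal:
        return "OVERCOMMITTED"  # more than the normal number
    if replicas == normal:
        return "HEALTHY"
    if replicas == minimum:
        return "URGENT"         # exactly the minimum, and minimum < normal
    return "DEGRADED"           # strictly between minimum and normal
```

For example, with a normal count of 3 and a minimum of 2, a chunk with two replicas is classified as URGENT and a chunk with one replica as BLOCKED.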

MDS Fields
MDSID

Global MDS identifier.

STATUS

Current MDS status (see MDS Statuses).

%CTIME

Total time spent writing to the local journal.

COMMITS

Local journal commit rate.

%CPU

MDS activity time.

MEM

The number of pages a process has in real memory.

UPTIME

Time since program startup.

HOST

MDS hostname or IP address.

CS Fields
CSID

Global CS identifier.

STATUS

Current CS status (see CS Statuses).

SPACE

Total space on a CS.

AVAIL

Available space on a CS.

REPLICAS

The number of chunk replicas stored on a CS.

IOWAIT

The percentage of time spent waiting for I/O operations. Includes SWAIT (sync wait time).

IOLAT(ms)

The average/maximum latency of I/O operations (excluding sync operations).

SLAT(ms)

The average/maximum latency of sync operations.

READ

Current read speed in B/s.

WRITE

Current write speed in B/s.

RD_OPS

Current read speed in ops/s.

WR_OPS

Current write speed in ops/s.

MAP_OPS

The number of map operations per second.

SYNC

The number of sync operations per second.

DATASYNC

The number of data sync operations per second.

HOST

CS hostname or IP address.

VOID

The number of unused chunks with replicas on this CS.

BLOCKED

The number of blocked chunks with replicas on this CS.

URGENT

The number of urgent chunks with replicas on this CS.

DEGRDED

The number of degraded chunks with replicas on this CS.

HEALTHY

The number of healthy chunks with replicas on this CS.

OVERCMT

The number of overcommitted chunks with replicas on this CS.

REPL

The number of replicating chunks with replicas on this CS.

OFFLINE

The number of offline chunks with replicas on this CS.

DELETNG

The number of chunks to be deleted with replicas on this CS.

COST

The cost of allocating a chunk on this CS.

ERR

CS error status. If not "None", the CS is not used for chunk allocation at the moment.

LAST_ERR

The previous CS local error status and the time since it was last observed.

LAST_LINK_ERR

The last CS link error status and the time since it was last observed.

FLAGS

The list of enabled CS features:

"J" − SSD journal is present

"C" − data checksumming

"D" − Direct I/O (normal state of CS without SSD journal)

CS status:

"c" − SSD journal is clean, nothing to commit from SSD to HDD

The list of reasons why CS failed:

"H" − HDD failed (returned I/O error)

"h" − HDD data checksum failed

"S" − SSD failed (returned I/O error)

"s" − SSD data checksum failed

"R" − broken repository. CS couldn’t find its repository

"T" − I/O request timeout

TIER

Storage tier assigned to the CS.

JRN_FULL

The percentage of the SSD journal waiting to be written to HDD. The smaller, the better. 100% means overload.

RMW

The number of read−modify−write sequences due to unaligned I/O. May happen with old OS guests due to misaligned partitions. Such sequences reduce performance.

JRMW

The number of read−modify−write sequences served from the SSD journal. If the journal is not configured, this field accounts for unaligned I/O. In this case, performance will depend on the physical HDD sector size.

QDEPTH

Average CS I/O queue depth.

SWAIT

The percentage of time spent waiting for data sync operations.
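The letters documented for the FLAGS field can be decoded with a lookup table. The sketch below assumes that the field is displayed as a run of single-letter codes such as "JCc". This page does not specify the exact layout, so that format is an assumption, and the function name is illustrative only.

```python
from typing import Dict, List

# Letter -> (category, meaning), taken from the FLAGS description.
FLAG_MEANINGS = {
    "J": ("feature", "SSD journal is present"),
    "C": ("feature", "data checksumming"),
    "D": ("feature", "Direct I/O"),
    "c": ("status", "SSD journal is clean"),
    "H": ("failure", "HDD failed (returned I/O error)"),
    "h": ("failure", "HDD data checksum failed"),
    "S": ("failure", "SSD failed (returned I/O error)"),
    "s": ("failure", "SSD data checksum failed"),
    "R": ("failure", "broken repository"),
    "T": ("failure", "I/O request timeout"),
}


def decode_flags(flags: str) -> Dict[str, List[str]]:
    """Group the meanings of each flag letter by category."""
    decoded = {"feature": [], "status": [], "failure": [], "unknown": []}
    for letter in flags:
        if letter.isspace():
            continue  # ignore padding in the column
        if letter in FLAG_MEANINGS:
            category, meaning = FLAG_MEANINGS[letter]
            decoded[category].append(meaning)
        else:
            decoded["unknown"].append(letter)  # keep unrecognized letters visible
    return decoded
```

A non-empty "failure" list in the result indicates that the CS reported at least one of the documented failure reasons.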

Event Log Fields
TIME

Log message time stamp.

SYS

Log message subsystem.

SEV

Log message severity.

MESSAGE

Log message.

Locations Table Fields
LOCATION

Location identifier.

HOSTS

Hosts at this location.

Number of chunk servers at this location.

TOTAL

Total disk space of this location.

FREE

Free disk space of this location.

Disks Table Fields
DISK

Disk name assigned by the operating system.

SMART

Disk’s S.M.A.R.T. status. Can be either OK or Warn. The Warn status means that at least one of these S.M.A.R.T. counters is non−zero:

"005" − Reallocated Sector Count

"196" − Reallocated Event Count

"197" − Current Pending Sector Count

"198" − Offline Uncorrectable

TEMP

Disk temperature in Celsius reported by S.M.A.R.T.

CAPACITY

Disk capacity.

SERIAL

Disk serial number reported by S.M.A.R.T.

MODEL

Disk model reported by S.M.A.R.T.

HOST

Disk’s host computer address.
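The rule for the SMART field can be expressed as a check over the four watched counters. In this sketch the counters are assumed to arrive as a mapping from attribute ID to raw value. How those values are collected from a disk is outside the scope of this page, and the function name is hypothetical.

```python
from typing import Mapping

# Attribute IDs that trigger the Warn status when non-zero.
WATCHED_COUNTERS = {
    5: "Reallocated Sector Count",
    196: "Reallocated Event Count",
    197: "Current Pending Sector Count",
    198: "Offline Uncorrectable",
}


def smart_status(counters: Mapping[int, int]) -> str:
    """Return 'Warn' if any watched counter is non-zero, else 'OK'."""
    for attribute_id in WATCHED_COUNTERS:
        # A counter missing from the mapping is treated as zero.
        if counters.get(attribute_id, 0) != 0:
            return "Warn"
    return "OK"
```

Counters other than the four listed above, such as a temperature attribute, do not affect the result.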

Hosts Table Fields
HOSTID

Host identifier.

LOCATION

Location identifier.

MDSES

Number of active MDSes per number of all MDSes available on the host.

CSES

Number of active CSes per number of all CSes available on the host.

CLIENTS

Number of clients available on the host.

SPACE

Total disk space of this host.

RETRANS

Total number of retransmits on the host.

LAT_MAX

Maximum latency between all CSes on the host.

HOST

Host name or IP address.

STATUSES

This section explains MDS, CS, and Pstorage cluster statuses.

Cluster Statuses
healthy

All CSs are active.

unknown

Not enough information yet. The MDS is either not a master or has become a master only recently.

degraded

Some CSs are inactive.

failure

Too many CSs are inactive. Automatic replication is disabled.

SMART warning

Some disks reported SMART errors.

Space Statuses
OK

Space usage is normal.

Warning

Free space is below either 20% of the total or 1024 MB (the default thresholds).
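The interaction between the two free space thresholds can be sketched as follows. The defaults mirror the documented values of freespace.threshold.mb and freespace.threshold.percent. The function itself is an illustration of the rule, not code taken from the tool.

```python
def space_status(free_mb: float, total_mb: float,
                 threshold_mb: float = 1024,
                 threshold_percent: float = 20) -> str:
    """Return 'Warning' if free space is below either threshold, else 'OK'."""
    if total_mb <= 0:
        raise ValueError("total_mb must be positive")
    free_percent = 100.0 * free_mb / total_mb
    # Dropping below either the absolute or the relative threshold is enough.
    if free_mb < threshold_mb or free_percent < threshold_percent:
        return "Warning"
    return "OK"
```

A small cluster can therefore report Warning with plenty of relative free space (for example, 512 MB free out of 2000 MB), because the absolute threshold is checked independently of the percentage.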

MDS Statuses
avail

MDS is running and responds to requests.

stale

MDS has not responded to requests for some time.

unavail

MDS has terminated and broken the connection to the master.

CS Statuses
active

CS responds to requests.

inactive

CS has not responded for some time. Replication has not started yet.

offline

CS has not responded for quite some time; its chunks are being replicated.

EXIT STATUS

0

Success

Non−zero

Failure (syntax or usage error; configuration error; cluster failure; unexpected error).
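A monitoring script can rely on the exit status alone to detect failure. The helper below is a minimal sketch that runs an arbitrary command and reports whether it exited with status 0. The helper name is hypothetical, and treating a missing executable as a failure is a choice made for this example.

```python
import subprocess
from typing import Sequence


def succeeded(command: Sequence[str]) -> bool:
    """Run a command quietly and report whether its exit status was 0."""
    try:
        completed = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # The executable is missing or cannot be started.
        return False
    return completed.returncode == 0
```

Following the SYNOPSIS, a call such as succeeded(["pstorage", "stat"]) would return False on any of the failure causes listed above. Additional site-specific arguments may be needed in practice.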