We added 7 drives to each of the 3 nodes early last week.
The *.ring.gz files were re-created and copied to all 3 nodes; they are identical on all 3 nodes. We did a rebalance at that time.
The swift_hash_path_prefix and swift_hash_path_suffix values in /etc/swift.conf are unchanged and identical on all nodes.
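For anyone who wants to repeat the hash prefix/suffix comparison, here is a minimal Python sketch of one way to do it. It assumes the contents of each node's swift.conf have already been collected into strings (how they are fetched is left out), and that the two options live in the [swift-hash] section. The helper names and the sample values are hypothetical and are not taken from our cluster.

```python
import configparser


def hash_settings(conf_text):
    """Extract (prefix, suffix) from the [swift-hash] section of a swift.conf body."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    section = parser["swift-hash"]
    return (
        section.get("swift_hash_path_prefix", fallback=""),
        section.get("swift_hash_path_suffix", fallback=""),
    )


def settings_match(conf_by_node):
    """Return True when every node reports the same (prefix, suffix) pair."""
    return len({hash_settings(text) for text in conf_by_node.values()}) == 1


# Hypothetical sample data; real values would come from each node's swift.conf.
sample = """[swift-hash]
swift_hash_path_prefix = examplePrefix
swift_hash_path_suffix = exampleSuffix
"""
nodes = {"node1": sample, "node2": sample, "node3": sample}
print(settings_match(nodes))
```

A mismatch on any single node makes settings_match return False, which is the case worth ruling out after a ring change.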
Here is the swift-recon output.
[root@swift-r1p1 ~]# swift-recon --all
===============================================================================
--> Starting reconnaissance on 3 hosts
===============================================================================
[2017-06-26 11:41:04] Checking async pendings
[async_pending] - No hosts returned valid data.
===============================================================================
[2017-06-26 11:41:04] Checking on replication
[replication_time] low: 34, high: 36, avg: 35.2, total: 105, Failed: 0.0%, no_result: 0, reported: 3
Oldest completion was 2017-06-21 23:15:48 (4 days ago) by 192.168.10.236:6000.
Most recent completion was 2017-06-21 23:16:23 (4 days ago) by 10.11.12.76:6000.
===============================================================================
[2017-06-26 11:41:04] Checking auditor stats
[ALL_audit_time_last_path] low: 400766, high: 404103, avg: 402302.9, total: 1206908, Failed: 0.0%, no_result: 0, reported: 3
[ALL_quarantined_last_path] low: 0, high: 0, avg: 0.0, total: 0, Failed: 0.0%, no_result: 0, reported: 3
[ALL_errors_last_path] low: 0, high: 0, avg: 0.0, total: 0, Failed: 0.0%, no_result: 0, reported: 3
[ALL_passes_last_path] low: 47289, high: 48313, avg: 47631.3, total: 142894, Failed: 0.0%, no_result: 0, reported: 3
[ALL_bytes_processed_last_path] low: 34480426331, high: 35853409281, avg: 35055774061.7, total: 105167322185, Failed: 0.0%, no_result: 0, reported: 3
[ZBF_audit_time_last_path] low: 391647, high: 396734, avg: 394729.9, total: 1184189, Failed: 0.0%, no_result: 0, reported: 3
[ZBF_quarantined_last_path] low: 0, high: 0, avg: 0.0, total: 0, Failed: 0.0%, no_result: 0, reported: 3
[ZBF_errors_last_path] low: 0, high: 0, avg: 0.0, total: 0, Failed: 0.0%, no_result: 0, reported: 3
[ZBF_bytes_processed_last_path] low: 0, high: 0, avg: 0.0, total: 0, Failed: 0.0%, no_result: 0, reported: 3
===============================================================================
[2017-06-26 11:41:04] Checking updater times
[updater_last_sweep] low: 9, high: 32, avg: 17.9, total: 53, Failed: 0.0%, no_result: 0, reported: 3
===============================================================================
[2017-06-26 11:41:04] Checking on expirers
[object_expiration_pass] - No hosts returned valid data.
[expired_last_pass] - No hosts returned valid data.
===============================================================================
[2017-06-26 11:41:04] Getting unmounted drives from 3 hosts...
===============================================================================
[2017-06-26 11:41:04] Checking load averages
[5m_load_avg] low: 6, high: 7, avg: 6.8, total: 20, Failed: 0.0%, no_result: 0, reported: 3
[15m_load_avg] low: 6, high: 7, avg: 7.1, total: 21, Failed: 0.0%, no_result: 0, reported: 3
[1m_load_avg] low: 5, high: 6, avg: 5.8, total: 17, Failed: 0.0%, no_result: 0, reported: 3
===============================================================================
[2017-06-26 11:41:04] Checking disk usage now
Distribution Graph:
  0%    3 ****
  7%    1 *
  8%    2 **
 10%    1 *
 11%    8 ***********
 14%    2 **
 15%    4 *****
 16%    3 ****
 17%    1 *
 29%    3 ****
 64%    1 *
 68%    1 *
 70%    1 *
 71%    2 **
 72%    1 *
 73%    1 *
 74%   47 *********************************************************************
 75%    3 ****
Disk usage: space used: 223125483429888 of 477676577447936
Disk usage: space free: 254551094018048 of 477676577447936
Disk usage: lowest: 0.03%, highest: 75.12%, avg: 46.7105765625%
===============================================================================
[2017-06-26 11:41:05] Checking ring md5sums
3/3 hosts matched, 0 error[s] while checking hosts.
===============================================================================
[2017-06-26 11:41:05] Checking quarantine
[quarantined_objects] low: 0, high: 0, avg: 0.0, total: 0, Failed: 0.0%, no_result: 0, reported: 3
[quarantined_accounts] low: 0, high: 0, avg: 0.0, total: 0, Failed: 0.0%, no_result: 0, reported: 3
[quarantined_containers] low: 0, high: 0, avg: 0.0, total: 0, Failed: 0.0%, no_result: 0, reported: 3
===============================================================================
[2017-06-26 11:41:05] Checking socket usage
[orphan] low: 15, high: 383, avg: 138.0, total: 414, Failed: 0.0%, no_result: 0, reported: 3
[tcp_in_use] low: 156, high: 540, avg: 290.3, total: 871, Failed: 0.0%, no_result: 0, reported: 3
[time_wait] low: 568, high: 10321, avg: 6383.3, total: 19150, Failed: 0.0%, no_result: 0, reported: 3
[tcp6_in_use] low: 4, high: 4, avg: 4.0, total: 12, Failed: 0.0%, no_result: 0, reported: 3
[tcp_mem_allocated_bytes] low: 2777088, high: 4067328, avg: 3594922.7, total: 10784768, Failed: 0.0%, no_result: 0, reported: 3
===============================================================================
[2017-06-26 11:41:05] Validating server type 'object' on 3 hosts...
3/3 hosts ok, 0 error[s] while checking hosts.
===============================================================================
[2017-06-26 11:41:05] Checking drive-audit errors
[drive_audit_errors] - No hosts returned valid data.
===============================================================================
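To make the imbalance in the disk usage section easier to see, here is a small Python sketch that tallies the distribution graph from the output above. It assumes each graph line has the form "<percent>% <drive count> <bar>". The 50% cut-off and the function names are my own choices for illustration and are not part of swift-recon.

```python
import re

# Percent-full buckets and drive counts copied from the distribution graph above.
GRAPH = """\
0% 3 ****
7% 1 *
8% 2 **
10% 1 *
11% 8 ***********
14% 2 **
15% 4 *****
16% 3 ****
17% 1 *
29% 3 ****
64% 1 *
68% 1 *
70% 1 *
71% 2 **
72% 1 *
73% 1 *
74% 47 *******
75% 3 ****
"""


def parse_distribution(text):
    """Return a list of (percent_full, drive_count) tuples, one per graph line."""
    buckets = []
    for line in text.splitlines():
        match = re.match(r"\s*(\d+)%\s+(\d+)\s+\*+", line)
        if match:
            buckets.append((int(match.group(1)), int(match.group(2))))
    return buckets


def split_by_threshold(buckets, threshold):
    """Count drives strictly below the threshold and drives at or above it."""
    low = sum(count for pct, count in buckets if pct < threshold)
    high = sum(count for pct, count in buckets if pct >= threshold)
    return low, high


buckets = parse_distribution(GRAPH)
low, high = split_by_threshold(buckets, 50)  # 50% is an arbitrary cut-off

# Raw byte totals from the "Disk usage" lines above.
used, total = 223125483429888, 477676577447936
print("drives below 50%%: %d, at or above 50%%: %d" % (low, high))
print("cluster-wide usage: %.2f%%" % (100.0 * used / total))
```

With the numbers above, this splits the 85 reported drives into 28 below the cut-off and 57 at or above it, and nothing falls between 29% and 64%.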