Swift opens/closes a huge number of sockets and uses a lot of CPU in a deployment, even without VMs

Bug #1819115 reported by Candido Campos Rivas
This bug affects 1 person
Affects: OpenStack Object Storage (swift)
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

Test info:

 OSP13 (the same behavior is seen in OSP14)
 3 controllers - 3 computes
 1 external network + 1 tenant network

 VMs were created and deleted throughout the night with the following script:

 while true; do
     # delete the test VMs
     for i in $(seq 2 10); do openstack server delete provider-instance$i > h; done
     date; sleep 240
     # recreate the test VMs
     for i in $(seq 2 10); do openstack server create --flavor cirros --image cirros --nic net-id=4252a7a3-fff4-42fc-80f5-4d61525de6bb --key-name mykey provider-instance$i > h; done
     date; sleep 240
     echo "LOOP" >> log_test; date >> log_test
 done

 I stopped the test this morning after 88 loops; there are no VMs now:

(overcloud) [stack@undercloud-0 ~]$ cat log_test | grep LOOP | wc -l
88
(overcloud) [stack@undercloud-0 ~]$ openstack server list

(overcloud) [stack@undercloud-0 ~]$ cat log_test | tail -1
Thu Mar 7 04:48:34 EST 2019
(overcloud) [stack@undercloud-0 ~]$ date
Thu Mar 7 05:42:16 EST 2019

 The conclusions: the number of sockets grows and never shrinks while VMs are deleted, but the growth stops once the open/close churn reaches about 43-50k sockets every 2 minutes.
 I monitored another chassis during the whole night, with no VMs and no activity, and the socket usage stayed more or less constant.

The resource usage seems excessive, but sysctl options can be tuned to reduce the number of lingering sockets/conntrack entries in the Swift containers, for example:

net.netfilter.nf_conntrack_tcp_timeout_time_wait = 120 -> reducing it to 15 seconds brings the count down to 5-6k
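
A minimal sketch of how that change could be applied on a controller (the /etc/sysctl.d filename is only an illustration; the setting has to land in the network namespace the Swift containers actually use):

 # apply immediately
 sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=15
 # persist across reboots
 echo "net.netfilter.nf_conntrack_tcp_timeout_time_wait = 15" > /etc/sysctl.d/99-conntrack-time-wait.conf
 sysctl --system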

 Script to monitor the sockets:

while true; do
    conntrack -L | grep TIME_WAIT | egrep "6000|6001" | wc -l      # TIME_WAIT entries on ports 6000/6001 (Swift object/container servers)
    conntrack -L | grep TIME_WAIT | wc -l                          # all TIME_WAIT entries
    conntrack -L | grep TIME_WAIT | egrep -v "6000|6001" | wc -l   # TIME_WAIT entries on other ports
    conntrack -L | grep -v TIME_WAIT | wc -l                       # entries not in TIME_WAIT
    sleep 120
done
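
For reference, the current conntrack table usage can be compared against its limit with the standard nf_conntrack sysctls:

 sysctl net.netfilter.nf_conntrack_count   # entries currently tracked
 sysctl net.netfilter.nf_conntrack_max     # maximum table size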

Info from test bed 1 (no activity during the night):

()[root@controller-0 /]# date
Thu Mar 7 10:49:11 UTC 2019
()[root@controller-0 /]# tail -10 log2
19691
22189
2457
983
Thu Mar 7 10:45:25 UTC 2019
19375
21643
2334
969
Thu Mar 7 10:47:26 UTC 2019
()[root@controller-0 /]# tail -300 log2
19062
21456
2410
1029
Thu Mar 7 08:50:43 UTC 2019
20039
22443
2379
929
Thu Mar 7 08:52:44 UTC 2019
19462
21828
2428
1032
Thu Mar 7 08:54:45 UTC 2019
18471
20787
2317
964
Thu Mar 7 08:56:45 UTC 2019
19385
21760
2336
937
Thu Mar 7 08:58:46 UTC 2019
20250
22564
2369
994
Thu Mar 7 09:00:47 UTC 2019
18453
20755
2302
983
Thu Mar 7 09:02:48 UTC 2019
21218
23685
2479
1011
Thu Mar 7 09:04:48 UTC 2019
18533
20889
2354
975
Thu Mar 7 09:06:49 UTC 2019
18768
21190
2399
964
Thu Mar 7 09:08:50 UTC 2019
20345
22747
2360
989
Thu Mar 7 09:10:50 UTC 2019
18655
20828
2250
960
Thu Mar 7 09:12:51 UTC 2019
19264
21792
2529
1020
Thu Mar 7 09:14:52 UTC 2019
20516
22941
2423
942
Thu Mar 7 09:16:53 UTC 2019
18839
21252
2396
962
Thu Mar 7 09:18:53 UTC 2019
19480
21808
2351
1005
Thu Mar 7 09:20:54 UTC 2019
19718
22056
2340
955
Thu Mar 7 09:22:55 UTC 2019
19445
22025
2576
1011
Thu Mar 7 09:24:55 UTC 2019
19593
21930
2298
988
Thu Mar 7 09:26:56 UTC 2019
18951
21300
2387
964
Thu Mar 7 09:28:57 UTC 2019
19508
21975
2460
1015
Thu Mar 7 09:30:57 UTC 2019
20195
22450
2272
967
Thu Mar 7 09:32:58 UTC 2019
19426
21956
2527
1020
Thu Mar 7 09:34:59 UTC 2019
19001
21382
2384
959
Thu Mar 7 09:37:00 UTC 2019
19785
22152
2355
1002
Thu Mar 7 09:39:00 UTC 2019
19485
21934
2456
1034
Thu Mar 7 09:41:01 UTC 2019
19160
21477
2304
989
Thu Mar 7 09:43:02 UTC 2019
20227
22668
2487
1025
Thu Mar 7 09:45:02 UTC 2019
18999
21324
2330
977
Thu Mar 7 09:47:03 UTC 2019
19717
22078
2364
967
Thu Mar 7 09:49:04 UTC 2019
19401
21767
2393
1026
Thu Mar 7 09:51:05 UTC 2019
19001
21352
2350
960
Thu Mar 7 09:53:05 UTC 2019
20176
22653
2456
1022
Thu Mar 7 09:55:06 UTC 2019
19555
21910
2371
964
Thu Mar 7 09:57:07 UTC 2019
19023
21474
2403
967
Thu Mar 7 09:59:07 UTC 2019
20048
22471
2464
990
Thu Mar 7 10:01:08 UTC 2019
18960
21349
2388
961
Thu Mar 7 10:03:09 UTC 2019
19395
21930
2483
1002
Thu Mar 7 10:05:10 UTC 2019
19620
21909
2310
972
Thu Mar 7 10:07:11 UTC 2019
18912
21279
2389
977
Thu Mar 7 10:09:11 UTC 2019
19675
22100
2405
995
Thu Mar 7 10:11:12 UTC 2019
19994
22344
2398
932
Thu Mar 7 10:13:13 UTC 2019
18733
21263
2537
981
Thu Mar 7 10:15:14 UTC 2019
20673
23091
2399
959
Thu Mar 7 10:17:15 UTC 2019
18735
21216
2479
974
Thu Mar 7 10:19:15 UTC 2019
18815
21241
2436
1009
Thu Mar 7 10:21:16 UTC 2019
20407
22918
2461
895
Thu Mar 7 10:23:17 UTC 2019
19180
21593
2506
986
Thu Mar 7 10:25:18 UTC 2019
19098
21446
2348
952
Thu Mar 7 10:27:18 UTC 2019
19221
21819
2533
960
Thu Mar 7 10:29:19 UTC 2019
19744
22130
2428
978
Thu Mar 7 10:31:20 UTC 2019
18834
21132
2312
927
Thu Mar 7 10:33:21 UTC 2019
20131
22618
2435
1007
Thu Mar 7 10:35:21 UTC 2019
19430
21651
2253
967
Thu Mar 7 10:37:22 UTC 2019
19001
21471
2467
991
Thu Mar 7 10:39:23 UTC 2019
20169
22485
2345
1001
Thu Mar 7 10:41:24 UTC 2019
18878
21264
2395
954
Thu Mar 7 10:43:24 UTC 2019
19691
22189
2457
983
Thu Mar 7 10:45:25 UTC 2019
19375
21643
2334
969
Thu Mar 7 10:47:26 UTC 2019
19064
21441
2386
975
Thu Mar 7 10:49:26 UTC 2019
()[root@controller-0 /]#

top output, sorted by CPU time:

()[root@controller-0 /]# top

top - 10:51:15 up 1 day, 1:22, 0 users, load average: 3.35, 3.43, 3.51
Tasks: 449 total, 6 running, 443 sleeping, 0 stopped, 0 zombie
%Cpu(s): 30.8 us, 10.3 sy, 0.0 ni, 56.7 id, 0.1 wa, 0.0 hi, 2.1 si, 0.0 st
KiB Mem : 16266388 total, 307284 free, 8673284 used, 7285820 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 6664336 avail Mem

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
 117649 swift 20 0 1643960 32864 5464 R 36.2 0.2 208:05.46 swift-container
 116885 swift 20 0 1655744 35276 2628 S 26.9 0.2 127:33.50 swift-container
 116887 swift 20 0 1655900 35552 2692 R 26.6 0.2 127:07.75 swift-container
 116827 swift 20 0 1649008 41544 4944 S 0.0 0.3 63:18.12 swift-object-re
 116355 ceilome+ 20 0 2859060 83336 2724 S 4.3 0.5 52:48.80 ceilometer-agen
 114916 swift 20 0 1648576 34228 2172 S 3.0 0.2 48:25.65 swift-object-se
 114915 swift 20 0 1648352 34140 2172 R 2.3 0.2 48:24.46 swift-object-se
  53958 rabbitmq 20 0 2427244 146040 4232 S 2.7 0.9 36:29.41 beam.smp
 136741 root 20 0 675924 117160 2480 S 2.0 0.7 28:55.71 cinder-volume
  61670 mysql 20 0 2906308 566676 146200 S 1.3 3.5 26:05.31 mysqld
  79873 haproxy 20 0 62232 13240 668 S 1.3 0.1 19:19.58 haproxy
 116549 heat 20 0 337380 97024 8212 S 1.0 0.6 16:55.44 heat-engine
 136320 gnocchi 20 0 417100 51784 11468 S 1.0 0.3 16:53.69 gnocchi-metricd
 117139 nova 20 0 297384 97932 6324 S 1.0 0.6 16:33.96 nova-scheduler
 114646 nova 20 0 292940 93376 6240 S 1.0 0.6 16:33.02 nova-conductor
 114575 aodh 20 0 234480 56968 6296 S 0.7 0.4 13:35.41 aodh-evaluator:
 115896 ceilome+ 20 0 260932 57728 7532 S 0.7 0.4 13:32.96 ceilometer-agen
 118226 aodh 20 0 234496 56976 6296 S 0.7 0.4 13:30.97 aodh-listener:
 119368 neutron 20 0 495952 274776 7340 S 1.0 1.7 13:25.78 neutron-dhcp-ag
 115345 aodh 20 0 234496 56972 6296 R 0.7 0.4 13:23.35 aodh-notifier:
 115120 ceilome+ 20 0 362632 62088 10124 S 1.0 0.4 13:18.16 ceilometer-poll
 118763 swift 20 0 246920 52208 7176 S 0.7 0.3 13:05.95 swift-proxy-ser
 117237 swift 20 0 170320 26412 4908 S 0.7 0.2 13:04.28 swift-account-s
 116666 swift 20 0 171060 27260 4904 S 0.7 0.2 12:57.55 swift-container
 114769 swift 20 0 164956 26036 4768 S 0.7 0.2 12:48.97 swift-object-se
 118859 glance 20 0 527040 111972 12876 S 0.7 0.7 12:33.64 glance-api
 118702 keystone 20 0 670144 104588 7288 S 0.0 0.6 12:07.22 httpd
 118706 keystone 20 0 670400 102872 7288 S 0.3 0.6 12:04.21 httpd
  18916 root 20 0 2623300 76740 16592 S 0.3 0.5 12:00.64 dockerd-current

 Info from test bed 2 (VMs created/deleted during the night):

conntrack v1.4.4 (conntrack-tools): 44191 flow entries have been shown.
43088
conntrack v1.4.4 (conntrack-tools): 44205 flow entries have been shown.
3514
conntrack v1.4.4 (conntrack-tools): 44222 flow entries have been shown.
1112
conntrack v1.4.4 (conntrack-tools): 47609 flow entries have been shown.
43552
conntrack v1.4.4 (conntrack-tools): 47556 flow entries have been shown.
46482
conntrack v1.4.4 (conntrack-tools): 47522 flow entries have been shown.
2980
conntrack v1.4.4 (conntrack-tools): 47447 flow entries have been shown.
1079
conntrack v1.4.4 (conntrack-tools): 50484 flow entries have been shown.
46181
conntrack v1.4.4 (conntrack-tools): 50540 flow entries have been shown.
49504
conntrack v1.4.4 (conntrack-tools): 50644 flow entries have been shown.
3271
conntrack v1.4.4 (conntrack-tools): 50745 flow entries have been shown.
1042
conntrack v1.4.4 (conntrack-tools): 48821 flow entries have been shown.
44426
conntrack v1.4.4 (conntrack-tools): 48682 flow entries have been shown.
47640
conntrack v1.4.4 (conntrack-tools): 48625 flow entries have been shown.
3347
conntrack v1.4.4 (conntrack-tools): 48547 flow entries have been shown.
1045
conntrack v1.4.4 (conntrack-tools): 51374 flow entries have been shown.
47343
conntrack v1.4.4 (conntrack-tools): 51345 flow entries have been shown.
50305
conntrack v1.4.4 (conntrack-tools): 51260 flow entries have been shown.
2990
conntrack v1.4.4 (conntrack-tools): 51183 flow entries have been shown.
1032
conntrack v1.4.4 (conntrack-tools): 51349 flow entries have been shown.
47435
conntrack v1.4.4 (conntrack-tools): 51357 flow entries have been shown.
50330
conntrack v1.4.4 (conntrack-tools): 51335 flow entries have been shown.
2898
conntrack v1.4.4 (conntrack-tools): 51322 flow entries have been shown.
1022
conntrack v1.4.4 (conntrack-tools): 51851 flow entries have been shown.
47994
conntrack v1.4.4 (conntrack-tools): 51863 flow entries have been shown.
50817
conntrack v1.4.4 (conntrack-tools): 51853 flow entries have been shown.
2814
conntrack v1.4.4 (conntrack-tools): 51843 flow entries have been shown.
1041
conntrack v1.4.4 (conntrack-tools): 51206 flow entries have been shown.
47547
conntrack v1.4.4 (conntrack-tools): 51208 flow entries have been shown.
50205
conntrack v1.4.4 (conntrack-tools): 51216 flow entries have been shown.
2666
conntrack v1.4.4 (conntrack-tools): 51218 flow entries have been shown.
1003
conntrack v1.4.4 (conntrack-tools): 51003 flow entries have been shown.
47333
conntrack v1.4.4 (conntrack-tools): 51034 flow entries have been shown.
50032
conntrack v1.4.4 (conntrack-tools): 51077 flow entries have been shown.
2661
conntrack v1.4.4 (conntrack-tools): 51097 flow entries have been shown.
1002
conntrack v1.4.4 (conntrack-tools): 50155 flow entries have been shown.
46459
conntrack v1.4.4 (conntrack-tools): 50211 flow entries have been shown.
49194
conntrack v1.4.4 (conntrack-tools): 50251 flow entries have been shown.
2675
conntrack v1.4.4 (conntrack-tools): 50297 flow entries have been shown.
1003
conntrack v1.4.4 (conntrack-tools): 49422 flow entries have been shown.
45782
conntrack v1.4.4 (conntrack-tools): 49422 flow entries have been shown.
48413
conntrack v1.4.4 (conntrack-tools): 49428 flow entries have been shown.
2634
conntrack v1.4.4 (conntrack-tools): 49411 flow entries have been shown.
1012
conntrack v1.4.4 (conntrack-tools): 49631 flow entries have been shown.
46013
conntrack v1.4.4 (conntrack-tools): 49600 flow entries have been shown.
48581
conntrack v1.4.4 (conntrack-tools): 49555 flow entries have been shown.
2595
conntrack v1.4.4 (conntrack-tools): 49518 flow entries have been shown.
1011
conntrack v1.4.4 (conntrack-tools): 49795 flow entries have been shown.
46101
conntrack v1.4.4 (conntrack-tools): 49674 flow entries have been shown.
48662
conntrack v1.4.4 (conntrack-tools): 49605 flow entries have been shown.
2685
conntrack v1.4.4 (conntrack-tools): 49533 flow entries have been shown.
1004
conntrack v1.4.4 (conntrack-tools): 50747 flow entries have been shown.
46991
conntrack v1.4.4 (conntrack-tools): 50748 flow entries have been shown.
49712
conntrack v1.4.4 (conntrack-tools): 50739 flow entries have been shown.
2722
conntrack v1.4.4 (conntrack-tools): 50730 flow entries have been shown.
1030
conntrack v1.4.4 (conntrack-tools): 50556 flow entries have been shown.
46842
conntrack v1.4.4 (conntrack-tools): 50420 flow entries have been shown.
49413
conntrack v1.4.4 (conntrack-tools): 50367 flow entries have been shown.
2710
conntrack v1.4.4 (conntrack-tools): 50309 flow entries have been shown.
994

top output, sorted by CPU time:

[root@controller-0 heat-admin]# top

top - 10:54:12 up 18:52, 2 users, load average: 6.20, 6.31, 6.18
Tasks: 455 total, 10 running, 445 sleeping, 0 stopped, 0 zombie
%Cpu0 : 43.3 us, 11.5 sy, 0.0 ni, 38.5 id, 0.0 wa, 0.0 hi, 6.7 si, 0.0 st
%Cpu1 : 31.7 us, 31.7 sy, 0.0 ni, 35.6 id, 0.0 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu2 : 28.4 us, 12.7 sy, 0.0 ni, 57.8 id, 0.0 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu3 : 33.7 us, 23.1 sy, 0.0 ni, 42.3 id, 0.0 wa, 0.0 hi, 1.0 si, 0.0 st
KiB Mem : 16266508 total, 291048 free, 8354744 used, 7620716 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 6851364 avail Mem

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
 114858 42445 20 0 166296 29580 5468 S 42.5 0.2 377:23.73 swift-container
 114055 42445 20 0 173944 27900 2564 R 10.4 0.2 106:40.41 swift-container
 114061 42445 20 0 173800 27716 2564 R 11.3 0.2 106:36.26 swift-container
 114063 42445 20 0 173676 27568 2564 R 14.2 0.2 106:27.33 swift-container
 114054 42445 20 0 173656 27572 2564 R 9.4 0.2 106:18.36 swift-container
 113954 42445 20 0 1651288 42628 4940 S 0.0 0.3 76:54.84 swift-object-re
 113542 42405 20 0 2858884 86800 2684 S 3.8 0.5 41:19.61 ceilometer-agen
  47489 42439 20 0 2441532 154284 4236 S 4.7 0.9 39:03.29 beam.smp
  55203 42434 20 0 2903028 563912 146160 S 0.9 3.5 38:59.74 mysqld
 112235 42445 20 0 1649600 36072 2160 S 0.0 0.2 35:52.37 swift-object-se
 112236 42445 20 0 1649764 36144 2164 S 0.0 0.2 35:50.67 swift-object-se
 112240 42445 20 0 1650104 36600 2160 S 0.0 0.2 35:47.59 swift-object-se
 112237 42445 20 0 1650440 36880 2160 S 0.0 0.2 35:45.92 swift-object-se
  73306 42454 20 0 61416 12456 664 S 0.0 0.1 23:56.55 haproxy
 116041 42425 20 0 761004 107160 7528 S 0.0 0.7 23:12.11 httpd
 116040 42425 20 0 679252 103500 7400 S 0.0 0.6 22:10.96 httpd
 134379 root 20 0 674260 115560 2488 R 1.9 0.7 18:54.03 cinder-volume
 134138 42416 20 0 856548 78936 5688 S 0.9 0.5 17:39.72 gnocchi-metricd
 134137 42416 20 0 856548 79100 5692 S 0.9 0.5 16:38.75 gnocchi-metricd
 118796 42435 20 0 373752 139376 2272 S 0.0 0.9 15:00.04 neutron-server
 118795 42435 20 0 374212 139756 2272 S 0.0 0.9 14:59.12 neutron-server
 116043 42425 20 0 680060 101388 7444 S 0.9 0.6 13:07.37 httpd
 116042 42425 20 0 681340 102596 7436 S 0.0 0.6 13:00.09 httpd
 113720 42418 20 0 334472 94036 8288 S 0.9 0.6 12:57.55 heat-engine
 133889 42416 20 0 416112 50848 11536 S 1.9 0.3 12:55.88 gnocchi-metricd
 114265 42436 20 0 292640 91088 6316 S 0.9 0.6 12:49.95 nova-conductor
 116435 42436 20 0 291452 89952 6304 S 0.9 0.6 11:53.18 nova-api-metada
 115488 42402 20 0 233348 55888 6368 R 0.9 0.3 10:15.50 aodh-listener:
 112567 42405 20 0 361112 61020 10212 S 0.9 0.4 10:12.50 ceilometer-poll
 112793 42402 20 0 233348 55888 6368 S 0.0 0.3 10:09.20 aodh-notifier:
 116088 42445 20 0 220396 47648 6016 R 0.9 0.3 10:07.69 swift-proxy-ser
 114591 42402 20 0 233348 55892 6368 S 0.0 0.3 10:04.55 aodh-evaluator:
 113242 42405 20 0 259736 56584 7616 S 0.9 0.3 10:02.05 ceilometer-agen
 114404 42445 20 0 170156 26252 4904 S 1.9 0.2 9:56.52 swift-account-s
 112041 42445 20 0 164616 25776 4768 S 0.9 0.2 9:39.96 swift-object-se
 118798 42435 20 0 368048 133560 2268 S 0.9 0.8 9:37.28 neutron-server
 116242 42415 20 0 509036 109308 12676 S 0.9 0.7 9:37.23 glance-api
 116692 42435 20 0 310184 94312 7232 S 0.9 0.6 9:35.14 neutron-dhcp-ag
 113834 42445 20 0 170812 27000 4912 S 0.0 0.2 9:28.43 swift-container
 133430 root 20 0 669444 125528 17288 S 0.0 0.8 9:19.82 cinder-volume
  23062 root 20 0 2662468 74140 16716 S 0.0 0.5 9:00.33 dockerd-current
 118797 42435 20 0 366636 132204 2240 S 0.0 0.8 8:38.15 neutron-server

Swift logs after one day without VMs:

[root@controller-1 heat-admin]# date
Fri Mar 8 07:33:58 UTC 2019
[root@controller-1 heat-admin]# tail -f //var/log/containers/swift/swift.log
Mar 8 02:33:46 controller-1 account-server: Replication run OVER
Mar 8 02:33:46 controller-1 account-server: Attempted to replicate 1 dbs in 0.01374 seconds (72.80587/s)
Mar 8 02:33:46 controller-1 account-server: Removed 0 dbs
Mar 8 02:33:46 controller-1 account-server: 2 successes, 0 failures
Mar 8 02:33:46 controller-1 account-server: diff:0 diff_capped:0 empty:0 hashmatch:0 no_change:2 remote_merge:0 rsync:0 ts_repl:0
Mar 8 02:33:56 controller-1 object-server: 1024/1024 (100.00%) partitions replicated in 12.98s (78.91/sec, 0s remaining)
Mar 8 02:33:56 controller-1 object-server: 2048 successes, 0 failures
Mar 8 02:33:56 controller-1 object-server: 75806 suffixes checked - 0.00% hashed, 0.00% synced
Mar 8 02:33:56 controller-1 object-server: Partition times: max 0.0515s, min 0.0061s, med 0.0113s
Mar 8 02:33:56 controller-1 object-server: Object replication complete. (0.22 minutes)
Mar 8 02:34:00 controller-1 container-server: Replication run OVER
Mar 8 02:34:00 controller-1 container-server: Attempted to replicate 12212 dbs in 160.74059 seconds (75.97334/s)
Mar 8 02:34:00 controller-1 container-server: Removed 0 dbs
Mar 8 02:34:00 controller-1 container-server: 24424 successes, 0 failures
Mar 8 02:34:00 controller-1 container-server: diff:0 diff_capped:0 empty:0 hashmatch:0 no_change:24424 remote_merge:0 rsync:0 ts_repl:0
Mar 8 02:34:00 controller-1 container-server: Beginning replication run
Mar 8 02:34:16 controller-1 account-server: Beginning replication run
Mar 8 02:34:16 controller-1 account-server: Replication run OVER
Mar 8 02:34:16 controller-1 account-server: Attempted to replicate 1 dbs in 0.01143 seconds (87.47351/s)
Mar 8 02:34:16 controller-1 account-server: Removed 0 dbs
Mar 8 02:34:16 controller-1 account-server: 2 successes, 0 failures
Mar 8 02:34:16 controller-1 account-server: diff:0 diff_capped:0 empty:0 hashmatch:0 no_change:2 remote_merge:0 rsync:0 ts_repl:0
Mar 8 02:34:26 controller-1 object-server: Starting object replication pass.
Mar 8 02:34:28 controller-1 account-server: 172.17.4.19 - - [08/Mar/2019:07:34:28 +0000] "HEAD /d1/815/.expiring_objects" 404 - "HEAD http://localhost/v1/.expiring_objects?format=json" "tx0221871e961d449895f43-005c821b04" "proxy-server 1" 0.0004 "-" 26 -
Mar 8 02:34:28 controller-1 account-server: 172.17.4.19 - - [08/Mar/2019:07:34:28 +0000] "GET /d1/815/.expiring_objects" 404 - "GET http://localhost/v1/.expiring_objects?marker=&prefix=&end_marker=&format=json" "tx51e5a0716f37430aa6a03-005c821b04" "proxy-server 1" 0.0005 "-" 23 -
Mar 8 02:34:30 controller-1 container-server: Container update sweep completed: 280.37s
Mar 8 02:34:38 controller-1 object-server: 1024/1024 (100.00%) partitions replicated in 11.48s (89.23/sec, 0s remaining)
Mar 8 02:34:38 controller-1 object-server: 2048 successes, 0 failures
Mar 8 02:34:38 controller-1 object-server: 75806 suffixes checked - 0.00% hashed, 0.00% synced
Mar 8 02:34:38 controller-1 object-server: Partition times: max 0.0371s, min 0.0059s, med 0.0101s
Mar 8 02:34:38 controller-1 object-server: Object replication complete. (0.19 minutes)
Mar 8 02:34:46 controller-1 account-server: Beginning replication run
Mar 8 02:34:46 controller-1 account-server: Replication run OVER
Mar 8 02:34:46 controller-1 account-server: Attempted to replicate 1 dbs in 0.01863 seconds (53.69093/s)
Mar 8 02:34:46 controller-1 account-server: Removed 0 dbs
Mar 8 02:34:46 controller-1 account-server: 2 successes, 0 failures
Mar 8 02:34:46 controller-1 account-server: diff:0 diff_capped:0 empty:0 hashmatch:0 no_change:2 remote_merge:0 rsync:0 ts_repl:0
Mar 8 02:34:50 controller-1 container-server: Begin container update sweep
Mar 8 02:35:08 controller-1 object-server: Starting object replication pass.
Mar 8 02:35:09 controller-1 object-server: Begin object update sweep
Mar 8 02:35:09 controller-1 object-server: Object update sweep starting on /srv/node/d1 (pid: 508)
Mar 8 02:35:09 controller-1 object-server: Object update sweep of d1 completed: 0.00s, 0 successes, 0 failures, 0 quarantines, 0 unlinks, 0 errors
Mar 8 02:35:09 controller-1 object-server: Object update sweep completed: 0.06s
Mar 8 02:35:16 controller-1 account-server: Beginning replication run
Mar 8 02:35:16 controller-1 account-server: Replication run OVER
Mar 8 02:35:16 controller-1 account-server: Attempted to replicate 1 dbs in 0.02329 seconds (42.93289/s)
Mar 8 02:35:16 controller-1 account-server: Removed 0 dbs
Mar 8 02:35:16 controller-1 account-server: 2 successes, 0 failures
Mar 8 02:35:16 controller-1 account-server: diff:0 diff_capped:0 empty:0 hashmatch:0 no_change:2 remote_merge:0 rsync:0 ts_repl:0
Mar 8 02:35:20 controller-1 object-server: 1024/1024 (100.00%) partitions replicated in 12.04s (85.06/sec, 0s remaining)
Mar 8 02:35:20 controller-1 object-server: 2048 successes, 0 failures
Mar 8 02:35:20 controller-1 object-server: 75806 suffixes checked - 0.00% hashed, 0.00% synced
Mar 8 02:35:20 controller-1 object-server: Partition times: max 0.0564s, min 0.0059s, med 0.0102s
Mar 8 02:35:20 controller-1 object-server: Object replication complete. (0.20 minutes)
Mar 8 02:35:46 controller-1 account-server: Beginning replication run
Mar 8 02:35:46 controller-1 account-server: Replication run OVER
Mar 8 02:35:46 controller-1 account-server: Attempted to replicate 1 dbs in 0.01103 seconds (90.68487/s)
Mar 8 02:35:46 controller-1 account-server: Removed 0 dbs
Mar 8 02:35:46 controller-1 account-server: 2 successes, 0 failures
Mar 8 02:35:46 controller-1 account-server: diff:0 diff_capped:0 empty:0 hashmatch:0 no_change:2 remote_merge:0 rsync:0 ts_repl:0
Mar 8 02:35:50 controller-1 object-server: Starting object replication pass.
Mar 8 02:35:56 controller-1 account-server: 172.17.4.30 - - [08/Mar/2019:07:35:56 +0000] "HEAD /d1/815/.expiring_objects" 404 - "HEAD http://localhost/v1/.expiring_objects?format=json" "txddc7a377cca7498c9ee80-005c821b5c" "proxy-server 1" 0.0004 "-" 24 -
Mar 8 02:35:56 controller-1 account-server: 172.17.4.30 - - [08/Mar/2019:07:35:56 +0000] "GET /d1/815/.expiring_objects" 404 - "GET http://localhost/v1/.expiring_objects?marker=&prefix=&end_marker=&format=json" "txd2ea61e134a548ecaa538-005c821b5c" "proxy-server 1" 0.0004 "-" 23 -
Mar 8 02:36:01 controller-1 account-server: 172.17.4.10 - - [08/Mar/2019:07:36:01 +0000] "HEAD /d1/815/.expiring_objects" 404 - "HEAD http://localhost/v1/.expiring_objects?format=json" "txec5a7fd9d7034ef693d5b-005c821b61" "proxy-server 1" 0.0004 "-" 25 -
Mar 8 02:36:01 controller-1 object-expirer: Pass beginning; 0 possible containers; 0 possible objects (txn: txec5a7fd9d7034ef693d5b-005c821b61)
Mar 8 02:36:01 controller-1 account-server: 172.17.4.10 - - [08/Mar/2019:07:36:01 +0000] "GET /d1/815/.expiring_objects" 404 - "GET http://localhost/v1/.expiring_objects?marker=&prefix=&end_marker=&format=json" "txcfb626155ea5413db6118-005c821b61" "proxy-server 1" 0.0005 "-" 26 -
Mar 8 02:36:01 controller-1 object-expirer: Pass completed in 0s; 0 objects expired (txn: txcfb626155ea5413db6118-005c821b61)
Mar 8 02:36:02 controller-1 object-server: 1024/1024 (100.00%) partitions replicated in 11.72s (87.37/sec, 0s remaining)
Mar 8 02:36:02 controller-1 object-server: 2048 successes, 0 failures
Mar 8 02:36:02 controller-1 object-server: 75806 suffixes checked - 0.00% hashed, 0.00% synced
Mar 8 02:36:02 controller-1 object-server: Partition times: max 0.0386s, min 0.0059s, med 0.0100s
Mar 8 02:36:02 controller-1 object-server: Object replication complete. (0.20 minutes)

clayg (clay-gerrard) wrote:

So, did dialing back OS-level connection tracking resolve the issue for you?

Candido Campos Rivas (ccamposr) wrote:

No, it reduces the resource usage. But I think the solution should be to keep connections open and reuse the sockets, because I don't understand the reason for using so many resources even without VMs or any activity in the test bed.

clayg (clay-gerrard) wrote:

Yes, connection pools could keep those values down. The activity you're noticing is most likely (broadly speaking) part of Swift's consistency engine's active auditing.

On a small Swift cluster you certainly have the option of increasing the interval options for the various daemons - most checks happen every 30s; maybe you'd prefer that done every 24 hours?
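
For illustration, the kind of change clayg describes would look roughly like the following in the Swift server configs (the section and option names follow the standard sample configs; 86400 seconds, i.e. once a day, is only an example value):

 # /etc/swift/container-server.conf (similar interval options exist in object-server.conf and account-server.conf)
 [container-replicator]
 interval = 86400

 [container-auditor]
 interval = 86400

 [container-updater]
 interval = 86400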
