Cache trimming is now throttled. Dropping the MDS cache via the “ceph tell mds.<foo> cache drop” command or large reductions in the cache size will no longer cause service unavailability.
Cap recall behavior has been significantly improved: the MDS no longer attempts to recall too many caps at once, which previously led to instability. MDS daemons with a large cache (64GB+) should be more stable.
MDS now provides a config option “mds_max_caps_per_client” (default: 1M) to limit the number of caps a client session may hold. Long-running client sessions with a large number of caps have been a source of instability in the MDS when all of these caps need to be processed during certain session events. Increasing this value unnecessarily is not recommended.
The “mds_recall_state_timeout” config parameter has been removed. Late client recall warnings are now generated based on the number of caps the MDS has recalled which have not been released. The new config parameters “mds_recall_warning_threshold” (default: 32K) and “mds_recall_warning_decay_rate” (default: 60s) set the threshold for this warning.
The “cache drop” admin socket command has been removed. The “ceph tell mds.X cache drop” command remains.
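The recall warning described above can be pictured as a counter that decays over time and is compared against a threshold. The following is a minimal sketch of that idea only, not the MDS implementation. It assumes that “mds_recall_warning_decay_rate” acts as a half-life in seconds and that 32K means 32768; the class name DecayingCounter and the explicit timestamps are hypothetical, chosen to keep the example deterministic.

```python
class DecayingCounter:
    """Counter whose value halves every `half_life_s` seconds (assumed model)."""

    def __init__(self, half_life_s: float) -> None:
        self.half_life_s = half_life_s
        self.value = 0.0
        self.last = 0.0  # timestamp of the last update, in seconds

    def _decay(self, now: float) -> None:
        # Apply exponential decay for the time elapsed since the last update.
        elapsed = now - self.last
        if elapsed > 0:
            self.value *= 0.5 ** (elapsed / self.half_life_s)
            self.last = now

    def hit(self, now: float, amount: float) -> float:
        # Positive amounts model recalled caps; negative amounts model releases.
        self._decay(now)
        self.value += amount
        return self.value

    def get(self, now: float) -> float:
        self._decay(now)
        return self.value


WARNING_THRESHOLD = 32 * 1024   # assumed meaning of the "32K" default
DECAY_HALF_LIFE_S = 60.0        # assumed meaning of the "60s" default

recalled = DecayingCounter(DECAY_HALF_LIFE_S)
recalled.hit(now=0.0, amount=40000)          # MDS recalls 40000 caps at t=0
warn_at_start = recalled.get(0.0) > WARNING_THRESHOLD
warn_after_one_half_life = recalled.get(60.0) > WARNING_THRESHOLD
```

In this model a burst of recalled caps raises a warning only while the decayed count stays above the threshold, so a client that releases caps promptly, or a burst that is long past, does not keep the warning active.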
OSD:
A health warning is now generated if the average OSD heartbeat ping time exceeds a configurable threshold for any of the computed intervals. The OSD computes 1 minute, 5 minute and 15 minute intervals with average, minimum and maximum values. The new configuration option “mon_warn_on_slow_ping_ratio” specifies a percentage of “osd_heartbeat_grace” to determine the threshold. A value of zero disables the warning. The new configuration option “mon_warn_on_slow_ping_time”, specified in milliseconds, overrides the computed value, causing a warning when OSD heartbeat pings take longer than the specified amount. A new admin command “ceph daemon mgr.# dump_osd_network [threshold]” lists all connections whose average ping time, for any of the 3 intervals, is longer than the specified threshold or the value determined by the config options. A new admin command “ceph daemon osd.# dump_osd_network [threshold]” does the same but only includes heartbeats initiated by the specified OSD.
The default value of the “osd_deep_scrub_large_omap_object_key_threshold” parameter has been lowered so that objects with a large number of omap keys are detected more easily.
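The way the slow heartbeat ping threshold is derived from the options above can be sketched as follows. This is an illustration under stated assumptions, not the monitor's code: it assumes “osd_heartbeat_grace” is given in seconds, that “mon_warn_on_slow_ping_ratio” is expressed as a fraction (for example 0.05 for 5%), and that a non-zero “mon_warn_on_slow_ping_time” takes precedence. The function names and sample values are hypothetical.

```python
from typing import Dict, List, Optional


def slow_ping_threshold_ms(
    heartbeat_grace_s: float,
    slow_ping_ratio: float,
    slow_ping_time_ms: float = 0.0,
) -> Optional[float]:
    """Return the warning threshold in milliseconds, or None if disabled."""
    # An explicit time in milliseconds overrides the computed value.
    if slow_ping_time_ms > 0:
        return slow_ping_time_ms
    # A ratio of zero disables the warning.
    if slow_ping_ratio <= 0:
        return None
    # Otherwise the threshold is a fraction of the heartbeat grace period.
    return heartbeat_grace_s * 1000.0 * slow_ping_ratio


def slow_intervals(
    averages_ms: Dict[str, float], threshold_ms: Optional[float]
) -> List[str]:
    """List the intervals whose average ping time exceeds the threshold."""
    if threshold_ms is None:
        return []
    return [name for name, avg in averages_ms.items() if avg > threshold_ms]


# Hypothetical sample: average ping times for the three intervals.
averages = {"1min": 1500.0, "5min": 800.0, "15min": 400.0}
threshold = slow_ping_threshold_ms(heartbeat_grace_s=20.0, slow_ping_ratio=0.05)
flagged = slow_intervals(averages, threshold)
```

With these sample values only the 1 minute average exceeds the computed threshold, which is enough to trigger the warning because any single interval suffices.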
RGW:
radosgw-admin introduces two subcommands for managing expire-stale objects that might be left behind after a bucket reshard in earlier versions of RGW. One subcommand lists such objects and the other deletes them. Read the troubleshooting section of the dynamic resharding docs for details.
cephfs: mds: delay exporting directory whose pin value exceeds max rank id (issue#40603, pr#29940, Zhi Zhang)
cephfs: mds: destroy reconnect msg when it is from non-existent session to avoid memory leak (issue#40588, pr#28796, Shen Hang)
cephfs: mds: evict an unresponsive client only when another client wants its caps (pr#30239, Rishabh Dave)
cephfs: mds: fix SnapRealm::resolve_snapname for long name (issue#39472, pr#28186, “Yan, Zheng”)
cephfs: mds: fix corner case of replaying open sessions (pr#28579, “Yan, Zheng”)
cephfs: mds: high debug logging with many subtrees is slow (issue#38875, pr#29219, Rishabh Dave)
cephfs: mds: make MDSIOContextBase delete itself when shutting down (pr#30417, Xuehan Xu)
cephfs: mds: mds_cap_revoke_eviction_timeout is not used to initialize Server::cap_revoke_eviction_timeout (issue#38844, issue#39210, pr#29220, simon gao)
rbd: use the ordered throttle for the export action (issue#40435, pr#30178, Jason Dillaman)
restful: Query nodes_by_id for items (pr#31273, Boris Ranto)
rgw admin: disable stale instance delete in a multisite env (pr#30340, Abhishek Lekshmanan)
rgw/OutputDataSocket: append_output(buffer::list&) says it will (but does not) discard output at data_max_backlog (issue#40178, issue#40351, pr#29279, Matt Benjamin)
rgw/cls: keep issuing bilog trim ops after reset (issue#40187, pr#30074, Casey Bodley)
rgw/multisite: Don’t allow certain radosgw-admin commands to run on non-master zone (issue#39548, pr#30133, Shilpa Jagannath)
rgw/rgw_op: Remove get_val from hotpath via legacy options (pr#30141, Mark Nelson)
rgw: Add support for --bypass-gc flag of radosgw-admin bucket rm command in RGW Multi-site (issue#39748, issue#24991, pr#29262, Casey Bodley)
rgw: Don’t crash on copy when metadata directive not supplied (issue#40416, pr#29500, Adam C. Emerson)
rgw: Fix bucket versioning vs. swift metadata bug (pr#30140, Marcus Watts)
rgw: Fix rgw decompression log-print (pr#30156, Han Fengzhe)
rgw: Multisite sync corruption for large multipart obj (issue#40144, pr#29273, Casey Bodley, Tianshan Qu, Xiaoxi CHEN)