v10.0.2 released

sage

This development release includes a raft of changes and improvements for Jewel. Key additions include CephFS scrub/repair improvements, an AIX and Solaris port of librados, many librbd journaling additions and fixes, extended per-pool options, an NBD driver for RBD (rbd-nbd) that allows librbd to present a kernel-level block device on Linux, multitenancy support for RGW, RGW bucket lifecycle support, RGW support for Swift static large objects (SLO), and RGW support for Swift bulk delete.

There are also many smaller optimizations and performance fixes going in all over the tree, particularly in the OSD and common code.

NOTABLE CHANGES

  • auth: fail if rotating key is missing (do not spam log) (pr#6473, Qiankun Zheng)
  • auth: fix crash when bad keyring is passed (pr#6698, Dunrong Huang)
  • auth: make keyring without mon entity type return -EACCES (pr#5734, Xiaowei Chen)
  • buffer: make usable outside of ceph source again (pr#6863, Josh Durgin)
  • build: cmake check fixes (pr#6787, Orit Wasserman)
  • build: fix bz2-dev dependency (pr#6948, Samuel Just)
  • build: Gentoo: _FORTIFY_SOURCE fix. (issue#13920, pr#6739, Robin H. Johnson)
  • build/ops: systemd ceph-disk unit must not assume /bin/flock (issue#13975, pr#6803, Loic Dachary)
  • ceph-detect-init: Ubuntu >= 15.04 uses systemd (pr#6873, James Page)
  • cephfs-data-scan: scan_frags (pr#5941, John Spray)
  • cephfs-data-scan: scrub tag filtering (#12133 and #12145) (issue#12133, issue#12145, pr#5685, John Spray)
  • ceph-fuse: add process to ceph-fuse --help (pr#6821, Wei Feng)
  • ceph-kvstore-tool: handle bad out file on command line (pr#6093, Kefu Chai)
  • ceph-mds: add --help/-h (pr#6850, Cilang Zhao)
  • ceph_objectstore_bench: fix race condition, bugs (issue#13516, pr#6681, Igor Fedotov)
  • ceph.spec.in: add BuildRequires: systemd (issue#13860, pr#6692, Nathan Cutler)
  • client: a better check for MDS availability (pr#6253, John Spray)
  • client: close mds sessions in shutdown() (pr#6269, John Spray)
  • client: don’t invalidate page cache when inode is no longer used (pr#6380, Yan, Zheng)
  • client: modify a word in log (pr#6906, YongQiang He)
  • cls/cls_rbd.cc: fix misused metadata_name_from_key (issue#13922, pr#6661, Xiaoxi Chen)
  • cmake: Add common/PluginRegistry.cc to CMakeLists.txt (pr#6805, Pete Zaitcev)
  • cmake: add rgw_basic_types.cc to librgw.a (pr#6786, Orit Wasserman)
  • cmake: add TracepointProvider.cc to libcommon (pr#6823, Orit Wasserman)
  • cmake: define STRERROR_R_CHAR_P for GNU-specific strerror_r (pr#6751, Ilya Dryomov)
  • cmake: update for recent librbd changes (pr#6715, John Spray)
  • cmake: update for recent rbd changes (pr#6818, Mykola Golub)
  • common: add generic plugin infrastructure (pr#6696, Sage Weil)
  • common: add latency perf counter for finisher (pr#6175, Xinze Chi)
  • common: buffer: add cached_crc and cached_crc_adjust counts to perf dump (pr#6535, Ning Yao)
  • common: buffer: remove unneeded list destructor (pr#6456, Michal Jarzabek)
  • common/ceph_context.cc: fix order of initialisers (pr#6838, Michal Jarzabek)
  • common: don’t reverse hobject_t hash bits when zero (pr#6653, Piotr Dałek)
  • common: log: Assign LOG_DEBUG priority to syslog calls (issue#13993, pr#6815, Brad Hubbard)
  • common: log: predict log message buffer allocation size (pr#6641, Adam Kupczyk)
  • common: optimize debug logging code (pr#6441, Adam Kupczyk)
  • common: perf counter for bufferlist history total alloc (pr#6198, Xinze Chi)
  • common: reduce CPU usage by making stringstream in stringify function thread local (pr#6543, Evgeniy Firsov)
  • common: re-enable backtrace support (pr#6771, Jason Dillaman)
  • common: SubProcess: fix multiple definition bug (pr#6790, Yunchuan Wen)
  • common: use namespace instead of subclasses for buffer (pr#6686, Michal Jarzabek)
  • configure.ac: macro fix (pr#6769, Igor Podoski)
  • doc: admin/build-doc: add lxml dependencies on debian (pr#6610, Ken Dreyer)
  • doc/cephfs/posix: update (pr#6922, Sage Weil)
  • doc: CodingStyle: fix broken URLs (pr#6733, Kefu Chai)
  • doc: correct typo ‘restared’ to ‘restarted’ (pr#6734, Yilong Zhao)
  • doc/dev/index: refactor/reorg (pr#6792, Nathan Cutler)
  • doc/dev/index.rst: begin writing Contributing to Ceph (pr#6727, Nathan Cutler)
  • doc/dev/index.rst: fix headings (pr#6780, Nathan Cutler)
  • doc: dev: introduction to tests (pr#6910, Loic Dachary)
  • doc: file must be empty when writing its layout fields with “setfattr” (pr#6848, Cilang Zhao)
  • doc: Fixed incorrect name of a “List Multipart Upload Parts” Response Entity (issue#14003, pr#6829, Lenz Grimmer)
  • doc: Fixes a spelling error (pr#6705, Jeremy Qian)
  • doc: fix typo in cephfs/quota (pr#6745, Drunkard Zhang)
  • doc: fix typo in developer guide (pr#6943, Nathan Cutler)
  • doc: INSTALL redirect to online documentation (pr#6749, Loic Dachary)
  • doc: little improvements for troubleshooting scrub issues (pr#6827, Mykola Golub)
  • doc: Modified a note section in rbd-snapshot doc. (pr#6908, Nilamdyuti Goswami)
  • doc: note that cephfs auth stuff is new in jewel (pr#6858, John Spray)
  • doc: osd: s/schedued/scheduled/ (pr#6872, Loic Dachary)
  • doc: remove unnecessary period in headline (pr#6775, Marc Koderer)
  • doc: rst style fix for pools document (pr#6816, Drunkard Zhang)
  • doc: Update list of admin/build-doc dependencies (issue#14070, pr#6934, Nathan Cutler)
  • init-ceph: do umount when the path exists. (pr#6866, Xiaoxi Chen)
  • journal: disconnect watch after watch error (issue#14168, pr#7113, Jason Dillaman)
  • journal: fire replay complete event after reading last object (issue#13924, pr#6762, Jason Dillaman)
  • journal: support replaying beyond skipped splay objects (pr#6687, Jason Dillaman)
  • librados: aix gcc librados port (pr#6675, Rohan Mars)
  • librados: avoid malloc(0) (which can return NULL on some platforms) (issue#13944, pr#6779, Dan Mick)
  • librados: clean up Objecter.h (pr#6731, Jie Wang)
  • librados: include/rados/librados.h: fix typo (pr#6741, Nathan Cutler)
  • librbd: automatically flush IO after blocking write operations (issue#13913, pr#6742, Jason Dillaman)
  • librbd: better handling of exclusive lock transition period (pr#7204, Jason Dillaman)
  • librbd: check for presence of journal before attempting to remove (issue#13912, pr#6737, Jason Dillaman)
  • librbd: clear error when older OSD doesn’t support image flags (issue#14122, pr#7035, Jason Dillaman)
  • librbd: correct include guard in RenameRequest.h (pr#7143, Jason Dillaman)
  • librbd: correct issues discovered during teuthology testing (issue#14108, issue#14107, pr#6974, Jason Dillaman)
  • librbd: correct issues discovered when cache is disabled (issue#14123, pr#6979, Jason Dillaman)
  • librbd: correct race conditions discovered during unit testing (issue#14060, pr#6923, Jason Dillaman)
  • librbd: disable copy-on-read when not exclusive lock owner (issue#14167, pr#7129, Jason Dillaman)
  • librbd: do not ignore self-managed snapshot release result (issue#14170, pr#7043, Jason Dillaman)
  • librbd: ensure copy-on-read requests are complete prior to closing parent image (pr#6740, Jason Dillaman)
  • librbd: ensure librados callbacks are flushed prior to destroying (issue#14092, pr#7040, Jason Dillaman)
  • librbd: fix journal iohint (pr#6917, Jianpeng Ma)
  • librbd: fix known test case race condition failures (issue#13969, pr#6800, Jason Dillaman)
  • librbd: fix merge-diff for >2GB diff-files (issue#14030, pr#6889, Yunchuan Wen)
  • librbd: fix test case race condition for journaling ops (pr#6877, Jason Dillaman)
  • librbd: fix tracepoint parameter in diff_iterate (pr#6892, Yunchuan Wen)
  • librbd: image refresh code paths converted to async state machines (pr#6859, Jason Dillaman)
  • librbd: include missing header for bool type (pr#6798, Mykola Golub)
  • librbd: initial collection of state machine unit tests (pr#6703, Jason Dillaman)
  • librbd: integrate journaling for maintenance operations (pr#6625, Jason Dillaman)
  • librbd: journaling-related lock dependency cleanup (pr#6777, Jason Dillaman)
  • librbd: not necessary to hold owner_lock while releasing snap id (issue#13914, pr#6736, Jason Dillaman)
  • librbd: only send signal when AIO completions queue empty (pr#6729, Jianpeng Ma)
  • librbd: optionally validate new RBD pools for snapshot support (issue#13633, pr#6925, Jason Dillaman)
  • librbd: partial revert of commit 9b0e359 (issue#13969, pr#6789, Jason Dillaman)
  • librbd: properly handle replay of snap remove RPC message (issue#14164, pr#7042, Jason Dillaman)
  • librbd: reduce verbosity of common error condition logging (issue#14234, pr#7114, Jason Dillaman)
  • librbd: simplify IO method signatures for 32bit environments (pr#6700, Jason Dillaman)
  • librbd: support eventfd for AIO completion notifications (pr#5465, Haomai Wang)
  • mailmap: add UMCloud affiliation (pr#6820, Jiaying Ren)
  • mailmap: Jewel updates (pr#6750, Abhishek Lekshmanan)
  • makefiles: remove bz2-dev from dependencies (issue#13981, pr#6939, Piotr Dałek)
  • mds: add ‘p’ flag in auth caps to control setting pool in layout (pr#6567, John Spray)
  • mds: fix client capabilities during reconnect (client.XXXX isn’t responding to mclientcaps(revoke)) (issue#11482, pr#6432, Yan, Zheng)
  • mds: fix setvxattr (broken in a536d114) (issue#14029, pr#6941, John Spray)
  • mds: repair the command option “--hot-standby” (pr#6454, Wei Feng)
  • mds: tear down connections from tell commands (issue#14048, pr#6933, John Spray)
  • mon: fix ceph df pool available calculation for 0-weighted OSDs (pr#6660, Chengyuan Li)
  • mon: fix routed_request_tids leak (pr#6102, Ning Yao)
  • mon: support min_down_reporter by subtree level (default by host) (pr#6709, Xiaoxi Chen)
  • mount.ceph: memory leaks (pr#6905, Qiankun Zheng)
  • osd: add osd op queue latency perfcounter (pr#5793, Haomai Wang)
  • osd: Allow repair of history.last_epoch_started using config (pr#6793, David Zafman)
  • osd: avoid duplicate op->mark_started in ReplicatedBackend (pr#6689, Jacek J. Łakis)
  • osd: cancel failure reports if we fail to rebind network (pr#6278, Xinze Chi)
  • osd: correctly handle small osd_scrub_interval_randomize_ratio (pr#7147, Samuel Just)
  • osd: defer decoding of MOSDRepOp/MOSDRepOpReply (pr#6503, Xinze Chi)
  • osd: don’t update epoch and rollback_info objects attrs if there is no need (pr#6555, Ning Yao)
  • osd: dump number of missing objects for each peer with pg query (pr#6058, Guang Yang)
  • osd: enable perfcounters on sharded work queue mutexes (pr#6455, Jacek J. Łakis)
  • osd: FileJournal: reduce locking scope in write_aio_bl (issue#12789, pr#5670, Zhi Zhang)
  • osd: FileStore: remove __SWORD_TYPE dependency (pr#6263, John Coyle)
  • osd: fix FileStore::_destroy_collection error return code (pr#6612, Ruifeng Yang)
  • osd: fix incorrect throttle in WBThrottle (pr#6713, Zhang Huan)
  • osd: fix MOSDRepScrub reference counter in replica_scrub (pr#6730, Jie Wang)
  • osd: fix rollback_info_trimmed_to before index() (issue#13965, pr#6801, Samuel Just)
  • osd: fix trivial scrub bug (pr#6533, Li Wang)
  • osd: KeyValueStore: don’t queue NULL context (pr#6783, Haomai Wang)
  • osd: make backend and block device code a bit more generic (pr#6759, Sage Weil)
  • osd: move newest decode version of MOSDOp and MOSDOpReply to the front (pr#6642, Jacek J. Łakis)
  • osd: pg_pool_t: add dictionary for pool options (issue#13077, pr#6081, Mykola Golub)
  • osd: reduce memory consumption of some structs (pr#6475, Piotr Dałek)
  • osd: release the message throttle when OpRequest unregistered (issue#14248, pr#7148, Samuel Just)
  • osd: remove __SWORD_TYPE dependency (pr#6262, John Coyle)
  • osd: slightly reduce actual size of pg_log_entry_t (pr#6690, Piotr Dałek)
  • osd: support pool level recovery_priority and recovery_op_priority (pr#5953, Guang Yang)
  • osd: use pg id (without shard) when referring the PG (pr#6236, Guang Yang)
  • packaging: add build dependency on python devel package (pr#7205, Josh Durgin)
  • pybind/cephfs: add symlink and its unit test (pr#6323, Shang Ding)
  • pybind: decode empty string in conf_parse_argv() correctly (pr#6711, Josh Durgin)
  • pybind: Implementation of rados_ioctx_snapshot_rollback (pr#6878, Florent Manens)
  • pybind: port the rbd bindings to Cython (issue#13115, pr#6768, Hector Martin)
  • pybind: support ioctx:exec (pr#6795, Noah Watkins)
  • qa: erasure-code benchmark plugin selection (pr#6685, Loic Dachary)
  • qa/krbd: Expunge generic/247 (pr#6831, Douglas Fuller)
  • qa/workunits/cephtool/test.sh: false positive fail on /tmp/obj1. (pr#6837, Robin H. Johnson)
  • qa/workunits/cephtool/test.sh: no ./ (pr#6748, Sage Weil)
  • qa/workunits/rbd: rbd-nbd test should use sudo for map/unmap ops (issue#14221, pr#7101, Jason Dillaman)
  • rados: bench: fix off-by-one to avoid writing past object_size (pr#6677, Tao Chang)
  • rbd: add --object-size option, deprecate --order (issue#12112, pr#6830, Vikhyat Umrao)
  • rbd: add RBD pool mirroring configuration API + CLI (pr#6129, Jason Dillaman)
  • rbd: fix build with “--without-rbd” (issue#14058, pr#6899, Piotr Dałek)
  • rbd: journal: configuration via conf, cli, api and some fixes (pr#6665, Mykola Golub)
  • rbd: merge_diff test should use new --object-size parameter instead of --order (issue#14106, pr#6972, Na Xie, Jason Dillaman)
  • rbd-nbd: network block device (NBD) support for RBD (pr#6657, Yunchuan Wen, Li Wang)
  • rbd: output formatter may not be closed upon error (issue#13711, pr#6706, xie xingguo)
  • rgw: add a missing cap type (pr#6774, Yehuda Sadeh)
  • rgw: add an inspection to the field of type when assigning user caps (pr#6051, Kongming Wu)
  • rgw: add LifeCycle feature (pr#6331, Ji Chen)
  • rgw: add support for Static Large Objects of Swift API (issue#12886, issue#13452, pr#6643, Yehuda Sadeh, Radoslaw Zarzynski)
  • rgw: fix a glaring syntax error (pr#6888, Pavan Rallabhandi)
  • rgw: fix the build failure (pr#6927, Kefu Chai)
  • rgw: multitenancy support (pr#6784, Yehuda Sadeh, Pete Zaitcev)
  • rgw: Remove unused code in PutMetadataAccount:execute (pr#6668, Pete Zaitcev)
  • rgw: remove unused variable in RGWPutMetadataBucket::execute (pr#6735, Radoslaw Zarzynski)
  • rgw/rgw_resolve: fallback to res_query when res_nquery not implemented (pr#6292, John Coyle)
  • rgw: static large objects (Radoslaw Zarzynski, Yehuda Sadeh)
  • rgw: swift bulk delete (Radoslaw Zarzynski)
  • systemd: start/stop/restart ceph services by daemon type (issue#13497, pr#6276, Zhi Zhang)
  • sysvinit: allow custom cluster names (pr#6732, Richard Chan)
  • test/encoding/readable.sh fix (pr#6714, Igor Podoski)
  • test: fix osd-scrub-snaps.sh (pr#6697, Xinze Chi)
  • test/librados/test.cc: clean up EC pools’ crush rules too (issue#13878, pr#6788, Loic Dachary, Dan Mick)
  • tests: allow object corpus readable test to skip specific incompat instances (pr#6932, Igor Podoski)
  • tests: ceph-helpers assert success getting backfills (pr#6699, Loic Dachary)
  • tests: ceph_test_keyvaluedb_iterators: fix broken test (pr#6597, Haomai Wang)
  • tests: fix failure for osd-scrub-snap.sh (issue#13986, pr#6890, Loic Dachary, Ning Yao)
  • tests: fix race condition testing auto scrub (issue#13592, pr#6724, Xinze Chi, Loic Dachary)
  • tests: flush op work queue prior to destroying MockImageCtx (issue#14092, pr#7002, Jason Dillaman)
  • tests: --osd-scrub-load-threshold=2000 for more consistency (issue#14027, pr#6871, Loic Dachary)
  • tests: osd-scrub-snaps.sh to display full osd logs on error (issue#13986, pr#6857, Loic Dachary)
  • test: use sequential journal_tid for object cacher test (issue#13877, pr#6710, Josh Durgin)
  • tools: add cephfs-table-tool ‘take_inos’ (pr#6655, John Spray)
  • tools: Fix layout handling in cephfs-data-scan (#13898) (pr#6719, John Spray)
  • tools: support printing part of the cluster map in readable fashion (issue#13079, pr#5921, Bo Cai)
  • vstart.sh: add mstart, mstop, mrun wrappers for running multiple vstart-style test clusters out of src tree (pr#6901, Yehuda Sadeh)
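
One change in the list above affects existing scripts: rbd gains an --object-size option and --order is deprecated. The order is the base-2 logarithm of the object size in bytes, so the two values convert directly. The sketch below shows that conversion for anyone updating scripts. The helper names are hypothetical and not part of any Ceph API, and the limits rbd itself places on the order are not enforced here.

```python
def order_to_object_size(order: int) -> int:
    """Return the object size in bytes for an RBD order (size = 2**order)."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return 1 << order


def object_size_to_order(size_bytes: int) -> int:
    """Return the order for an object size, which must be a power of two."""
    if size_bytes <= 0 or size_bytes & (size_bytes - 1):
        raise ValueError("object size must be a positive power of two")
    return size_bytes.bit_length() - 1


if __name__ == "__main__":
    # Order 22 corresponds to 4 MiB objects.
    print(order_to_object_size(22))
    print(object_size_to_order(4 * 1024 * 1024))
```

For example, a script that previously passed an order of 22 would now pass an object size of 4194304 bytes.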

GETTING CEPH