Apache Celeborn™ 0.6.1 Release Notes
Highlight
- Support to register application info with user identifier and extra info
- Support celeborn.client.push.maxBytesSizeInFlight
- Fix the issue where reading replica partition that returns zero chunk causes tasks to hang
Improvement
- [CELEBORN-1258] Support to register application info with user identifier and extra info
- [CELEBORN-1793] Add netty pinned memory metrics
- [CELEBORN-1844][FOLLOWUP] Always try to use memory storage if available
- [CELEBORN-1917] Support celeborn.client.push.maxBytesSizeInFlight
- [CELEBORN-2044] Proactively cleanup stream state from ChunkStreamManager when the stream ends
- [CELEBORN-2056] Make the wait time for the client to read non shuffle partitions configurable
- [CELEBORN-2061] Introduce metrics to count the amount of data flushed into different storage types
- [CELEBORN-2070][CIP-14] Support MapperEnd/Response in CppClient
- [CELEBORN-2072] Add missing instance filter to grafana dashboard
- [CELEBORN-2077] Improve toString by JEP-280 instead of ToStringBuilder
- [CELEBORN-2081] PushDataHandler onFailure log shuffle key
- [CELEBORN-2082] Add the log of excluded workers with high workloads
- [CELEBORN-2083] For WorkerStatusTracker, log error for recordWorkerFailure
- [CELEBORN-2085] Use a fixed buffer for flush copying to reduce GC
- [CELEBORN-2090] Support Lz4 Decompression in CppClient
- [CELEBORN-2092] Inc COMMIT_FILES_FAIL_COUNT when TimerWriter::close timeout
- [CELEBORN-2102] Introduce SorterCacheHitRate metric to monitor the hit rate of index cache for sorter
- [CELEBORN-2104] Clean up sources of NettyRpcEnv, Master and Worker to avoid thread leaks
- [CELEBORN-2106] CommitFile/Reserved location shows detail primary location UniqueId
- [CELEBORN-2108] Remove redundant PartitionType
- [CELEBORN-2112] Introduce PausePushDataStatus and PausePushDataAndReplicateStatus metric to record status of pause push data
- [CELEBORN-2117] Use git submodules for Chart Actions
- [CELEBORN-2118] Introduce IsHighWorkload metric to monitor worker overload status
- [CELEBORN-2122] Avoiding multiple accesses to HDFS when retrieving indexes in DfsPartitionReader
- [CELEBORN-2123] Add log for commit file size
- [CELEBORN-2125] Improve PartitionFilesSorter sort timeout log
- [CELEBORN-2128] Close hadoopFs FileSystem when worker is closed
- [CELEBORN-2129] CelebornBufferStream should invoke openStreamInternal in moveToNextPartitionIfPossible to avoid client creation timeout
- [CELEBORN-2133] LifecycleManager should log stack trace of Throwable for invoking appShuffleTrackerCallback
Stability and Bug Fix
- [CELEBORN-1792][FOLLOWUP] Add missing break in resumeByPinnedMemory
- [CELEBORN-1844][FOLLOWUP] Fix the condition of StoragePolicy that worker uses memory storage
- [CELEBORN-2052] Fix unexpected warning logs in Flink caused by duplicate BufferStreamEnd messages
- [CELEBORN-2064] Fix the issue where reading replica partition that returns zero chunk causes tasks to hang
- [CELEBORN-2068] TransportClientFactory should close channel explicitly to avoid resource leak for timeout or failure
- [CELEBORN-2071] Fix the issue where some gauge metrics were not registered to the metricRegistry
- [CELEBORN-2073] Fix PartitionFileSizeBytes metrics
- [CELEBORN-2075] Fix OpenStreamTime metrics for PbOpenStreamList request
- [CELEBORN-2078] Fix wrong grafana metrics units
- [CELEBORN-2086] S3FlushTask and OssFlushTask should close ByteArrayInputStream to avoid resource leak
- [CELEBORN-2088] Fix NPE if celeborn.client.spark.fetch.cleanFailedShuffle enabled
- [CELEBORN-2100] Fix performance issue on readToReadOnlyBuffer
- [CELEBORN-2119] DfsTierWriter should close s3MultipartUploadHandler and ossMultipartUploadHandler for close resource
- [CELEBORN-2139] Fix the condition for using OSS storage
Documentation
- [CELEBORN-2135] Rename Blaze to Auron
- [CELEBORN-2087] Refine the docs configuration table view
Dependencies
- [CELEBORN-2080] Bump Flink from 1.19.2, 1.20.1 to 1.19.3, 1.20.2
Credits
Thanks to the following contributors who helped review and commit changes for Apache Celeborn 0.6.1:
| Contributors | | | | | |
|---|---|---|---|---|---|
| Ethan Feng | Hao Duan | Jiaming Xie | Mridul Muralidharan | Nicholas Jiang | Rui Zhuo |
| Shaoyun Chen | Wang Fei | Xian Zhuang | Xinyu Wang | Xu Huang | Yang Liu |
| Zhaohui Xu | Zhengqi Zhang | | | | |