celeborn.client.adaptive.optimizeSkewedPartitionRead.enabled |
false |
false |
If this is true, Celeborn will adaptively split skewed partitions instead of reading them by Spark map range. Please note that this feature requires the Celeborn-Optimize-Skew-Partitions-spark3_3.patch . |
0.6.0 |
|
celeborn.client.application.heartbeatInterval |
10s |
false |
Interval for client to send heartbeat message to master. |
0.3.0 |
celeborn.application.heartbeatInterval |
celeborn.client.application.unregister.enabled |
true |
false |
When true, Celeborn client will inform celeborn master the application is already shutdown during client exit, this allows the cluster to release resources immediately, resulting in resource savings. |
0.3.2 |
|
celeborn.client.application.uuidSuffix.enabled |
false |
false |
Whether to add UUID suffix for application id for unique. When true , add UUID suffix for unique application id. Currently, this only applies to Spark and MR. |
0.6.0 |
|
celeborn.client.chunk.prefetch.enabled |
false |
false |
Whether to enable chunk prefetch when creating CelebornInputStream. |
0.5.1 |
|
celeborn.client.closeIdleConnections |
true |
false |
Whether client will close idle connections. |
0.3.0 |
|
celeborn.client.commitFiles.ignoreExcludedWorker |
false |
false |
When true, LifecycleManager will skip workers which are in the excluded list. |
0.3.0 |
|
celeborn.client.eagerlyCreateInputStream.threads |
32 |
false |
Threads count for streamCreatorPool in CelebornShuffleReader. |
0.3.1 |
|
celeborn.client.excludePeerWorkerOnFailure.enabled |
true |
false |
When true, Celeborn will exclude partition's peer worker on failure when push data to replica failed. |
0.3.0 |
|
celeborn.client.excludedWorker.expireTimeout |
180s |
false |
Timeout time for LifecycleManager to clear reserved excluded worker. Default to be 1.5 * celeborn.master.heartbeat.worker.timeout to cover worker heartbeat timeout check period |
0.3.0 |
celeborn.worker.excluded.expireTimeout |
celeborn.client.fetch.buffer.size |
64k |
false |
Size of reducer partition buffer memory for shuffle reader. The fetched data will be buffered in memory before consuming. For performance consideration keep this buffer size not less than celeborn.client.push.buffer.max.size . |
0.4.0 |
|
celeborn.client.fetch.dfsReadChunkSize |
8m |
false |
Max chunk size for DfsPartitionReader. |
0.3.1 |
|
celeborn.client.fetch.excludeWorkerOnFailure.enabled |
false |
false |
Whether to enable shuffle client-side fetch exclude workers on failure. |
0.3.0 |
|
celeborn.client.fetch.excludedWorker.expireTimeout |
<value of celeborn.client.excludedWorker.expireTimeout> |
false |
ShuffleClient is a static object, it will be used in the whole lifecycle of Executor, We give a expire time for excluded workers to avoid a transient worker issues. |
0.3.0 |
|
celeborn.client.fetch.maxReqsInFlight |
3 |
false |
Amount of in-flight chunk fetch request. |
0.3.0 |
celeborn.fetch.maxReqsInFlight |
celeborn.client.fetch.maxRetriesForEachReplica |
3 |
false |
Max retry times of fetch chunk on each replica |
0.3.0 |
celeborn.fetch.maxRetriesForEachReplica,celeborn.fetch.maxRetries |
celeborn.client.fetch.timeout |
600s |
false |
Timeout for a task to open stream and fetch chunk. |
0.3.0 |
celeborn.fetch.timeout |
celeborn.client.flink.compression.enabled |
true |
false |
Whether to compress data in Flink plugin. |
0.3.0 |
remote-shuffle.job.enable-data-compression |
celeborn.client.flink.inputGate.concurrentReadings |
2147483647 |
false |
Max concurrent reading channels for a input gate. |
0.3.0 |
remote-shuffle.job.concurrent-readings-per-gate |
celeborn.client.flink.inputGate.memory |
32m |
false |
Memory reserved for a input gate. |
0.3.0 |
remote-shuffle.job.memory-per-gate |
celeborn.client.flink.inputGate.supportFloatingBuffer |
true |
false |
Whether to support floating buffer in Flink input gates. |
0.3.0 |
remote-shuffle.job.support-floating-buffer-per-input-gate |
celeborn.client.flink.metrics.scope.shuffle |
<host>.taskmanager.<tm_id>.<job_name>.<task_name>.<subtask_index>.<shuffle_id> |
false |
Defines the scope format string that is applied to all metrics scoped to a shuffle. Only effective when a identifier-based reporter is configured |
0.6.0 |
|
celeborn.client.flink.partitionConnectionException.enabled |
false |
false |
If enabled, org.apache.flink.runtime.io.network.partition.consumer.PartitionConnectionException would be thrown when RemoteBufferStreamReader finds that the current exception is about connection failure, then Flink can be aware of the lost Celeborn server side nodes and be able to re-compute affected data. |
0.6.0 |
|
celeborn.client.flink.resultPartition.memory |
64m |
false |
Memory reserved for a result partition. |
0.3.0 |
remote-shuffle.job.memory-per-partition |
celeborn.client.flink.resultPartition.supportFloatingBuffer |
true |
false |
Whether to support floating buffer for result partitions. |
0.3.0 |
remote-shuffle.job.support-floating-buffer-per-output-gate |
celeborn.client.flink.shuffle.fallback.policy |
AUTO |
false |
Celeborn supports the following kind of fallback policies. 1. ALWAYS: always use flink built-in shuffle implementation; 2. AUTO: prefer to use celeborn shuffle implementation, and fallback to use flink built-in shuffle implementation based on certain factors, e.g. availability of enough workers and quota; 3. NEVER: always use celeborn shuffle implementation, and fail fast when it it is concluded that fallback is required based on factors above. |
0.6.0 |
|
celeborn.client.inputStream.creation.window |
16 |
false |
Window size that CelebornShuffleReader pre-creates CelebornInputStreams, for coalesced scenario where multiple Partitions are read |
0.5.1 |
|
celeborn.client.mr.pushData.max |
32m |
false |
Max size for a push data sent from mr client. |
0.4.0 |
|
celeborn.client.partition.reader.checkpoint.enabled |
false |
false |
Whether or not checkpoint reads when re-creating a partition reader. Setting to true minimizes the amount of unnecessary reads during partition read retries |
0.6.0 |
|
celeborn.client.push.buffer.initial.size |
8k |
false |
|
0.3.0 |
celeborn.push.buffer.initial.size |
celeborn.client.push.buffer.max.size |
64k |
false |
Max size of reducer partition buffer memory for shuffle hash writer. The pushed data will be buffered in memory before sending to Celeborn worker. For performance consideration keep this buffer size higher than 32K. Example: If reducer amount is 2000, buffer size is 64K, then each task will consume up to 64KiB * 2000 = 125MiB heap memory. |
0.3.0 |
celeborn.push.buffer.max.size |
celeborn.client.push.excludeWorkerOnFailure.enabled |
false |
false |
Whether to enable shuffle client-side push exclude workers on failures. |
0.3.0 |
|
celeborn.client.push.limit.inFlight.sleepInterval |
50ms |
false |
Sleep interval when check netty in-flight requests to be done. |
0.3.0 |
celeborn.push.limit.inFlight.sleepInterval |
celeborn.client.push.limit.inFlight.timeout |
<undefined> |
false |
Timeout for netty in-flight requests to be done. Default value should be celeborn.client.push.timeout * 2 . |
0.3.0 |
celeborn.push.limit.inFlight.timeout |
celeborn.client.push.limit.strategy |
SIMPLE |
false |
The strategy used to control the push speed. Valid strategies are SIMPLE and SLOWSTART. The SLOWSTART strategy usually works with congestion control mechanism on the worker side. |
0.3.0 |
|
celeborn.client.push.maxReqsInFlight.perWorker |
32 |
false |
Amount of Netty in-flight requests per worker. Default max memory of in flight requests per worker is celeborn.client.push.maxReqsInFlight.perWorker * celeborn.client.push.buffer.max.size * compression ratio(1 in worst case): 64KiB * 32 = 2MiB. The maximum memory will not exceed celeborn.client.push.maxReqsInFlight.total . |
0.3.0 |
|
celeborn.client.push.maxReqsInFlight.total |
256 |
false |
Amount of total Netty in-flight requests. The maximum memory is celeborn.client.push.maxReqsInFlight.total * celeborn.client.push.buffer.max.size * compression ratio(1 in worst case): 64KiB * 256 = 16MiB |
0.3.0 |
celeborn.push.maxReqsInFlight |
celeborn.client.push.queue.capacity |
512 |
false |
Push buffer queue size for a task. The maximum memory is celeborn.client.push.buffer.max.size * celeborn.client.push.queue.capacity , default: 64KiB * 512 = 32MiB |
0.3.0 |
celeborn.push.queue.capacity |
celeborn.client.push.replicate.enabled |
false |
false |
When true, Celeborn worker will replicate shuffle data to another Celeborn worker asynchronously to ensure the pushed shuffle data won't be lost after the node failure. It's recommended to set false when HDFS is enabled in celeborn.storage.availableTypes . |
0.3.0 |
celeborn.push.replicate.enabled |
celeborn.client.push.retry.threads |
8 |
false |
Thread number to process shuffle re-send push data requests. |
0.3.0 |
celeborn.push.retry.threads |
celeborn.client.push.revive.batchSize |
2048 |
false |
Max number of partitions in one Revive request. |
0.3.0 |
|
celeborn.client.push.revive.interval |
100ms |
false |
Interval for client to trigger Revive to LifecycleManager. The number of partitions in one Revive request is celeborn.client.push.revive.batchSize . |
0.3.0 |
|
celeborn.client.push.revive.maxRetries |
5 |
false |
Max retry times for reviving when celeborn push data failed. |
0.3.0 |
|
celeborn.client.push.sendBufferPool.checkExpireInterval |
30s |
false |
Interval to check expire for send buffer pool. If the pool has been idle for more than celeborn.client.push.sendBufferPool.expireTimeout , the pooled send buffers and push tasks will be cleaned up. |
0.3.1 |
|
celeborn.client.push.sendBufferPool.expireTimeout |
60s |
false |
Timeout before clean up SendBufferPool. If SendBufferPool is idle for more than this time, the send buffers and push tasks will be cleaned up. |
0.3.1 |
|
celeborn.client.push.slowStart.initialSleepTime |
500ms |
false |
The initial sleep time if the current max in flight requests is 0 |
0.3.0 |
|
celeborn.client.push.slowStart.maxSleepTime |
2s |
false |
If celeborn.client.push.limit.strategy is set to SLOWSTART, push side will take a sleep strategy for each batch of requests, this controls the max sleep time if the max in flight requests limit is 1 for a long time |
0.3.0 |
|
celeborn.client.push.sort.randomizePartitionId.enabled |
false |
false |
Whether to randomize partitionId in push sorter. If true, partitionId will be randomized when sort data to avoid skew when push to worker |
0.3.0 |
celeborn.push.sort.randomizePartitionId.enabled |
celeborn.client.push.stageEnd.timeout |
<value of celeborn.<module>.io.connectionTimeout> |
false |
Timeout for waiting StageEnd. During this process, there are celeborn.client.requestCommitFiles.maxRetries times for retry opportunities for committing files and 1 times for releasing slots request. User can customize this value according to your setting. By default, the value is the max timeout value celeborn.<module>.io.connectionTimeout . |
0.3.0 |
celeborn.push.stageEnd.timeout |
celeborn.client.push.takeTaskMaxWaitAttempts |
1 |
false |
Max wait times if no task available to push to worker. |
0.3.0 |
|
celeborn.client.push.takeTaskWaitInterval |
50ms |
false |
Wait interval if no task available to push to worker. |
0.3.0 |
|
celeborn.client.push.timeout |
120s |
false |
Timeout for a task to push data rpc message. This value should better be more than twice of celeborn.<module>.push.timeoutCheck.interval |
0.3.0 |
celeborn.push.data.timeout |
celeborn.client.readLocalShuffleFile.enabled |
false |
false |
Enable read local shuffle file for clusters that co-deployed with yarn node manager. |
0.3.1 |
|
celeborn.client.readLocalShuffleFile.threads |
4 |
false |
Threads count for read local shuffle file. |
0.3.1 |
|
celeborn.client.registerShuffle.maxRetries |
3 |
false |
Max retry times for client to register shuffle. |
0.3.0 |
celeborn.shuffle.register.maxRetries |
celeborn.client.registerShuffle.retryWait |
3s |
false |
Wait time before next retry if register shuffle failed. |
0.3.0 |
celeborn.shuffle.register.retryWait |
celeborn.client.requestCommitFiles.maxRetries |
4 |
false |
Max retry times for requestCommitFiles RPC. |
0.3.0 |
|
celeborn.client.reserveSlots.maxRetries |
3 |
false |
Max retry times for client to reserve slots. |
0.3.0 |
celeborn.slots.reserve.maxRetries |
celeborn.client.reserveSlots.rackaware.enabled |
false |
false |
Whether need to place different replicates on different racks when allocating slots. |
0.3.1 |
celeborn.client.reserveSlots.rackware.enabled |
celeborn.client.reserveSlots.retryWait |
3s |
false |
Wait time before next retry if reserve slots failed. |
0.3.0 |
celeborn.slots.reserve.retryWait |
celeborn.client.rpc.cache.concurrencyLevel |
32 |
false |
The number of write locks to update rpc cache. |
0.3.0 |
celeborn.rpc.cache.concurrencyLevel |
celeborn.client.rpc.cache.expireTime |
15s |
false |
The time before a cache item is removed. |
0.3.0 |
celeborn.rpc.cache.expireTime |
celeborn.client.rpc.cache.size |
256 |
false |
The max cache items count for rpc cache. |
0.3.0 |
celeborn.rpc.cache.size |
celeborn.client.rpc.commitFiles.askTimeout |
<value of celeborn.rpc.askTimeout> |
false |
Timeout for CommitHandler commit files. |
0.4.1 |
|
celeborn.client.rpc.getReducerFileGroup.askTimeout |
<value of celeborn.rpc.askTimeout> |
false |
Timeout for ask operations during getting reducer file group information. During this process, there are celeborn.client.requestCommitFiles.maxRetries times for retry opportunities for committing files and 1 times for releasing slots request. User can customize this value according to your setting. |
0.2.0 |
|
celeborn.client.rpc.maxRetries |
3 |
false |
Max RPC retry times in client. |
0.3.2 |
|
celeborn.client.rpc.registerShuffle.askTimeout |
<value of celeborn.rpc.askTimeout> |
false |
Timeout for ask operations during register shuffle. During this process, there are two times for retry opportunities for requesting slots, one request for establishing a connection with Worker and celeborn.client.reserveSlots.maxRetries times for retry opportunities for reserving slots. User can customize this value according to your setting. |
0.3.0 |
celeborn.rpc.registerShuffle.askTimeout |
celeborn.client.rpc.requestPartition.askTimeout |
<value of celeborn.rpc.askTimeout> |
false |
Timeout for ask operations during requesting change partition location, such as reviving or splitting partition. During this process, there are celeborn.client.reserveSlots.maxRetries times for retry opportunities for reserving slots. User can customize this value according to your setting. |
0.2.0 |
|
celeborn.client.rpc.reserveSlots.askTimeout |
<value of celeborn.rpc.askTimeout> |
false |
Timeout for LifecycleManager request reserve slots. |
0.3.0 |
|
celeborn.client.rpc.retryWait |
1s |
false |
Client-specified time to wait before next retry on RpcTimeoutException. |
0.5.4 |
|
celeborn.client.rpc.shared.threads |
16 |
false |
Number of shared rpc threads in LifecycleManager. |
0.3.2 |
|
celeborn.client.shuffle.batchHandleChangePartition.interval |
100ms |
false |
Interval for LifecycleManager to schedule handling change partition requests in batch. |
0.3.0 |
celeborn.shuffle.batchHandleChangePartition.interval |
celeborn.client.shuffle.batchHandleChangePartition.partitionBuckets |
256 |
false |
Max number of change partition requests which can be concurrently processed. |
0.5.0 |
|
celeborn.client.shuffle.batchHandleChangePartition.threads |
8 |
false |
Threads number for LifecycleManager to handle change partition request in batch. |
0.3.0 |
celeborn.shuffle.batchHandleChangePartition.threads |
celeborn.client.shuffle.batchHandleCommitPartition.interval |
5s |
false |
Interval for LifecycleManager to schedule handling commit partition requests in batch. |
0.3.0 |
celeborn.shuffle.batchHandleCommitPartition.interval |
celeborn.client.shuffle.batchHandleCommitPartition.threads |
8 |
false |
Threads number for LifecycleManager to handle commit partition request in batch. |
0.3.0 |
celeborn.shuffle.batchHandleCommitPartition.threads |
celeborn.client.shuffle.batchHandleReleasePartition.interval |
5s |
false |
Interval for LifecycleManager to schedule handling release partition requests in batch. |
0.3.0 |
|
celeborn.client.shuffle.batchHandleReleasePartition.threads |
8 |
false |
Threads number for LifecycleManager to handle release partition request in batch. |
0.3.0 |
|
celeborn.client.shuffle.batchHandleRemoveExpiredShuffles.enabled |
false |
false |
Whether to batch remove expired shuffles. This is an optimization switch on removing expired shuffles. |
0.6.0 |
|
celeborn.client.shuffle.checkWorker.enabled |
true |
false |
When true, before registering shuffle, LifecycleManager should check if current cluster have available workers, if cluster don't have available workers, fallback to default shuffle. |
0.5.0 |
celeborn.client.spark.shuffle.checkWorker.enabled |
celeborn.client.shuffle.compression.codec |
LZ4 |
false |
The codec used to compress shuffle data. By default, Celeborn provides three codecs: lz4 , zstd , none . none means that shuffle compression is disabled. Since Flink version 1.16, zstd is supported for Flink shuffle client. |
0.3.0 |
celeborn.shuffle.compression.codec,remote-shuffle.job.compression.codec |
celeborn.client.shuffle.compression.zstd.level |
1 |
false |
Compression level for Zstd compression codec, its value should be an integer between -5 and 22. Increasing the compression level will result in better compression at the expense of more CPU and memory. |
0.3.0 |
celeborn.shuffle.compression.zstd.level |
celeborn.client.shuffle.decompression.lz4.xxhash.instance |
<undefined> |
false |
Decompression XXHash instance for Lz4. Available options: JNI, JAVASAFE, JAVAUNSAFE. |
0.3.2 |
|
celeborn.client.shuffle.dynamicResourceEnabled |
false |
false |
When enabled, the ChangePartitionManager will obtain candidate workers from the availableWorkers pool during heartbeats when worker resource change. |
0.6.0 |
|
celeborn.client.shuffle.dynamicResourceFactor |
0.5 |
false |
The ChangePartitionManager will check whether (unavailable workers / shuffle allocated workers) is more than the factor before obtaining candidate workers from the requestSlots RPC response when celeborn.client.shuffle.dynamicResourceEnabled set true |
0.6.0 |
|
celeborn.client.shuffle.expired.checkInterval |
60s |
false |
Interval for client to check expired shuffles. |
0.3.0 |
celeborn.shuffle.expired.checkInterval |
celeborn.client.shuffle.manager.port |
0 |
false |
Port used by the LifecycleManager on the Driver. |
0.3.0 |
celeborn.shuffle.manager.port |
celeborn.client.shuffle.partition.type |
REDUCE |
false |
Type of shuffle's partition. |
0.3.0 |
celeborn.shuffle.partition.type |
celeborn.client.shuffle.partitionSplit.mode |
SOFT |
false |
soft: the shuffle file size might be larger than split threshold. hard: the shuffle file size will be limited to split threshold. |
0.3.0 |
celeborn.shuffle.partitionSplit.mode |
celeborn.client.shuffle.partitionSplit.threshold |
1G |
false |
Shuffle file size threshold, if file size exceeds this, trigger split. |
0.3.0 |
celeborn.shuffle.partitionSplit.threshold |
celeborn.client.shuffle.rangeReadFilter.enabled |
false |
false |
If a spark application have skewed partition, this value can set to true to improve performance. |
0.2.0 |
celeborn.shuffle.rangeReadFilter.enabled |
celeborn.client.shuffle.register.filterExcludedWorker.enabled |
false |
false |
Whether to filter excluded worker when register shuffle. |
0.4.0 |
|
celeborn.client.shuffle.reviseLostShuffles.enabled |
false |
false |
Whether to revise lost shuffles. |
0.6.0 |
|
celeborn.client.slot.assign.maxWorkers |
10000 |
false |
Max workers that slots of one shuffle can be allocated on. Will choose the smaller positive one from Master side and Client side, see celeborn.master.slot.assign.maxWorkers . |
0.3.1 |
|
celeborn.client.spark.fetch.cleanFailedShuffle |
false |
false |
whether to clean those disk space occupied by shuffles which cannot be fetched |
0.6.0 |
|
celeborn.client.spark.fetch.cleanFailedShuffleInterval |
1s |
false |
the interval to clean the failed-to-fetch shuffle files, only valid when celeborn.client.spark.fetch.cleanFailedShuffle is enabled |
0.6.0 |
|
celeborn.client.spark.push.dynamicWriteMode.enabled |
false |
false |
Whether to dynamically switch push write mode based on conditions.If true, shuffle mode will be only determined by partition count |
0.5.0 |
|
celeborn.client.spark.push.dynamicWriteMode.partitionNum.threshold |
2000 |
false |
Threshold of shuffle partition number for dynamically switching push writer mode. When the shuffle partition number is greater than this value, use the sort-based shuffle writer for memory efficiency; otherwise use the hash-based shuffle writer for speed. This configuration only takes effect when celeborn.client.spark.push.dynamicWriteMode.enabled is true. |
0.5.0 |
|
celeborn.client.spark.push.sort.memory.maxMemoryFactor |
0.4 |
false |
the max portion of executor memory which can be used for SortBasedWriter buffer (only valid when celeborn.client.spark.push.sort.memory.useAdaptiveThreshold is enabled |
0.5.0 |
|
celeborn.client.spark.push.sort.memory.smallPushTolerateFactor |
0.2 |
false |
Only be in effect when celeborn.client.spark.push.sort.memory.useAdaptiveThreshold is turned on. The larger this value is, the more aggressive Celeborn will enlarge the Sort-based Shuffle writer's memory threshold. Specifically, this config controls when to enlarge the sort shuffle writer's memory threshold. With N bytes data in memory and V as the value of this config, if the number of pushes, C, when using sort based shuffle writer C >= (1 + V) * C' where C' is the number of pushes if we were using hash based writer, we will enlarge the memory threshold by 2X. |
0.5.0 |
|
celeborn.client.spark.push.sort.memory.threshold |
64m |
false |
When SortBasedPusher use memory over the threshold, will trigger push data. |
0.3.0 |
celeborn.push.sortMemory.threshold |
celeborn.client.spark.push.sort.memory.useAdaptiveThreshold |
false |
false |
Adaptively adjust sort-based shuffle writer's memory threshold |
0.5.0 |
|
celeborn.client.spark.push.unsafeRow.fastWrite.enabled |
true |
false |
This is Celeborn's optimization on UnsafeRow for Spark and it's true by default. If you have changed UnsafeRow's memory layout set this to false. |
0.2.2 |
|
celeborn.client.spark.shuffle.fallback.numPartitionsThreshold |
2147483647 |
false |
Celeborn will only accept shuffle of partition number lower than this configuration value. This configuration only takes effect when celeborn.client.spark.shuffle.fallback.policy is AUTO . |
0.5.0 |
celeborn.shuffle.forceFallback.numPartitionsThreshold,celeborn.client.spark.shuffle.forceFallback.numPartitionsThreshold |
celeborn.client.spark.shuffle.fallback.policy |
AUTO |
false |
Celeborn supports the following kind of fallback policies. 1. ALWAYS: always use spark built-in shuffle implementation; 2. AUTO: prefer to use celeborn shuffle implementation, and fallback to use spark built-in shuffle implementation based on certain factors, e.g. availability of enough workers and quota, shuffle partition number; 3. NEVER: always use celeborn shuffle implementation, and fail fast when it it is concluded that fallback is required based on factors above. |
0.5.0 |
|
celeborn.client.spark.shuffle.forceFallback.enabled |
false |
false |
Always use spark built-in shuffle implementation. This configuration is deprecated, consider configuring celeborn.client.spark.shuffle.fallback.policy instead. |
0.3.0 |
celeborn.shuffle.forceFallback.enabled |
celeborn.client.spark.shuffle.getReducerFileGroup.broadcast.enabled |
false |
false |
Whether to leverage Spark broadcast mechanism to send the GetReducerFileGroupResponse. If the response size is large and Spark executor number is large, the Spark driver network may be exhausted because each executor will pull the response from the driver. With broadcasting GetReducerFileGroupResponse, it prevents the driver from being the bottleneck in sending out multiple copies of the GetReducerFileGroupResponse (one per executor). |
0.6.0 |
|
celeborn.client.spark.shuffle.getReducerFileGroup.broadcast.miniSize |
512k |
false |
The size at which we use Broadcast to send the GetReducerFileGroupResponse to the executors. |
0.6.0 |
|
celeborn.client.spark.shuffle.writer |
HASH |
false |
Celeborn supports the following kind of shuffle writers. 1. hash: hash-based shuffle writer works fine when shuffle partition count is normal; 2. sort: sort-based shuffle writer works fine when memory pressure is high or shuffle partition count is huge. This configuration only takes effect when celeborn.client.spark.push.dynamicWriteMode.enabled is false. |
0.3.0 |
celeborn.shuffle.writer |
celeborn.client.spark.stageRerun.enabled |
true |
false |
Whether to enable stage rerun. If true, client throws FetchFailedException instead of CelebornIOException. |
0.4.0 |
celeborn.client.spark.fetch.throwsFetchFailure |
celeborn.identity.provider |
org.apache.celeborn.common.identity.DefaultIdentityProvider |
false |
IdentityProvider class name. Default class is org.apache.celeborn.common.identity.DefaultIdentityProvider . Optional values: org.apache.celeborn.common.identity.HadoopBasedIdentityProvider user name will be obtained by UserGroupInformation.getUserName; org.apache.celeborn.common.identity.DefaultIdentityProvider user name and tenant id are default values or user-specific values. |
0.6.0 |
celeborn.quota.identity.provider |
celeborn.identity.user-specific.tenant |
default |
false |
Tenant id if celeborn.identity.provider is org.apache.celeborn.common.identity.DefaultIdentityProvider. |
0.6.0 |
celeborn.quota.identity.user-specific.tenant |
celeborn.identity.user-specific.userName |
default |
false |
User name if celeborn.identity.provider is org.apache.celeborn.common.identity.DefaultIdentityProvider. |
0.6.0 |
celeborn.quota.identity.user-specific.userName |
celeborn.master.endpoints |
<localhost>:9097 |
false |
Endpoints of master nodes for celeborn clients to connect. Client uses resolver provided by celeborn.master.endpoints.resolver to resolve the master endpoints. By default Celeborn uses org.apache.celeborn.common.client.StaticMasterEndpointResolver which take static master endpoints as input. Allowed pattern: <host1>:<port1>[,<host2>:<port2>]* , e.g. clb1:9097,clb2:9098,clb3:9099 . If the port is omitted, 9097 will be used. If the master endpoints are not static then users can pass custom resolver implementation to discover master endpoints actively using celeborn.master.endpoints.resolver. |
0.2.0 |
|
celeborn.master.endpoints.resolver |
org.apache.celeborn.common.client.StaticMasterEndpointResolver |
false |
Resolver class that can be used for discovering and updating the master endpoints. This allows users to provide a custom master endpoint resolver implementation. This is useful in environments where the master nodes might change due to scaling operations or infrastructure updates. Clients need to ensure that provided resolver class should be present in the classpath. |
0.5.2 |
|
celeborn.quota.enabled |
true |
false |
When Master side sets to true, the master will enable to check the quota via QuotaManager. When Client side sets to true, LifecycleManager will request Master side to check whether the current user has enough quota before registration of shuffle. Fallback to the default shuffle service when Master side checks that there is no enough quota for current user. |
0.2.0 |
|
celeborn.quota.interruptShuffle.enabled |
false |
false |
Whether to enable interrupt shuffle when quota exceeds. |
0.6.0 |
|
celeborn.storage.availableTypes |
HDD |
false |
Enabled storages. Available options: MEMORY,HDD,SSD,HDFS,S3,OSS. Note: HDD and SSD would be treated as identical. |
0.3.0 |
celeborn.storage.activeTypes |
celeborn.storage.hdfs.dir |
<undefined> |
false |
HDFS base directory for Celeborn to store shuffle data. |
0.2.0 |
|
celeborn.storage.oss.access.key |
<undefined> |
false |
OSS access key for Celeborn to store shuffle data. |
0.6.0 |
|
celeborn.storage.oss.dir |
<undefined> |
false |
OSS base directory for Celeborn to store shuffle data. |
0.6.0 |
|
celeborn.storage.oss.endpoint |
<undefined> |
false |
OSS endpoint for Celeborn to store shuffle data. |
0.6.0 |
|
celeborn.storage.oss.ignore.credentials |
true |
false |
Whether to skip oss credentials, disable this config to support jindo sdk . |
0.6.0 |
|
celeborn.storage.oss.secret.key |
<undefined> |
false |
OSS secret key for Celeborn to store shuffle data. |
0.6.0 |
|
celeborn.storage.s3.dir |
<undefined> |
false |
S3 base directory for Celeborn to store shuffle data. |
0.6.0 |
|
celeborn.storage.s3.endpoint.region |
<undefined> |
false |
S3 endpoint for Celeborn to store shuffle data. |
0.6.0 |
|
celeborn.tags.tagsExpr |
|
true |
Expression to filter workers by tags. The expression is a comma-separated list of tags. The expression is evaluated as a logical AND of all tags. For example, prod,high-io filters workers that have both the prod and high-io tags. |
0.6.0 |
|