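General IPC timeout control
Timeouts are not limited to the DataNode. Other communication, such as YARN's own RPC calls, can also time out, in which case the following error is reported.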
failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel
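To fix this, increase ipc.ping.interval in core-site.xml.
If ipc.client.ping is disabled, increase ipc.client.rpc-timeout.ms instead.
The following is taken from the Hadoop 3.1.1 source. It shows that when ipc.client.rpc-timeout.ms is not set to a positive value and the ipc.client.ping heartbeat is enabled, there is no RPC timeout (getTimeout returns -1), and the ping interval governs instead.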
/**
 * The time after which a RPC will timeout.
 * If ping is not enabled (via ipc.client.ping), then the timeout value is the
 * same as the pingInterval.
 * If ping is enabled, then there is no timeout value.
 *
 * @param conf Configuration
 * @return the timeout period in milliseconds. -1 if no timeout value is set
 * @deprecated use {@link #getRpcTimeout(Configuration)} instead
 */
@Deprecated
final public static int getTimeout(Configuration conf) {
  int timeout = getRpcTimeout(conf);
  if (timeout > 0) {
    return timeout;
  }
  if (!conf.getBoolean(CommonConfigurationKeys.IPC_CLIENT_PING_KEY,
      CommonConfigurationKeys.IPC_CLIENT_PING_DEFAULT)) {
    return getPingInterval(conf);
  }
  return -1;
}

/**
 * The time after which a RPC will timeout.
 *
 * @param conf Configuration
 * @return the timeout period in milliseconds.
 */
public static final int getRpcTimeout(Configuration conf) {
  int timeout =
      conf.getInt(CommonConfigurationKeys.IPC_CLIENT_RPC_TIMEOUT_KEY,
          CommonConfigurationKeys.IPC_CLIENT_RPC_TIMEOUT_DEFAULT);
  return (timeout < 0) ? 0 : timeout;
}
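To make the precedence in the source above easier to check, here is a minimal standalone sketch that mirrors the same decision order using a plain Map in place of Hadoop's Configuration. The class name IpcTimeoutSketch and the ASSUMED_* defaults are illustrative assumptions, not part of Hadoop; in particular, the defaults (RPC timeout 0, ping enabled, ping interval 60000 ms) are assumed values for this version and should be verified against your own core-default.xml.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the timeout resolution shown above; not Hadoop code. */
public class IpcTimeoutSketch {
    // Property names as used in the text.
    static final String RPC_TIMEOUT_KEY = "ipc.client.rpc-timeout.ms";
    static final String PING_KEY = "ipc.client.ping";
    static final String PING_INTERVAL_KEY = "ipc.ping.interval";

    // Assumed defaults (verify against core-default.xml for your version).
    static final int ASSUMED_RPC_TIMEOUT_DEFAULT = 0;
    static final boolean ASSUMED_PING_DEFAULT = true;
    static final int ASSUMED_PING_INTERVAL_DEFAULT = 60000;

    /** Mirrors getRpcTimeout: negative values are clamped to 0 (meaning "not set"). */
    public static int getRpcTimeout(Map<String, String> conf) {
        int timeout = Integer.parseInt(
            conf.getOrDefault(RPC_TIMEOUT_KEY, String.valueOf(ASSUMED_RPC_TIMEOUT_DEFAULT)));
        return (timeout < 0) ? 0 : timeout;
    }

    /** Mirrors getTimeout: RPC timeout first, then ping interval if ping is off, else -1. */
    public static int getTimeout(Map<String, String> conf) {
        int timeout = getRpcTimeout(conf);
        if (timeout > 0) {
            return timeout; // an explicit positive RPC timeout always wins
        }
        boolean pingEnabled = Boolean.parseBoolean(
            conf.getOrDefault(PING_KEY, String.valueOf(ASSUMED_PING_DEFAULT)));
        if (!pingEnabled) {
            // ping disabled: fall back to the ping interval as the timeout
            return Integer.parseInt(
                conf.getOrDefault(PING_INTERVAL_KEY, String.valueOf(ASSUMED_PING_INTERVAL_DEFAULT)));
        }
        return -1; // ping enabled and no RPC timeout set: no timeout value
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(getTimeout(conf)); // ping on, no RPC timeout: -1

        conf.put(PING_KEY, "false");
        System.out.println(getTimeout(conf)); // ping off: assumed ping interval, 60000

        conf.put(RPC_TIMEOUT_KEY, "120000");
        System.out.println(getTimeout(conf)); // explicit RPC timeout: 120000
    }
}
```

The three cases in main correspond to the advice above: with ping enabled only ipc.ping.interval matters, while a positive ipc.client.rpc-timeout.ms takes priority regardless of the ping setting.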
References
https://www.cnblogs.com/yjt1993/p/11164492.html
https://yq.aliyun.com/articles/476766
