Subspace troubleshooting: Q&A thread for Chinese-speaking users

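Before asking, please list all the details of your setup clearly; someone who sees your post will answer:
CPU model and count:
RAM modules (count and size):
Number of disks:
Network environment and region:
Public IP available:
Number of router layers in your network:
Node launch parameters: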

Farmer launch parameters:

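Can you provide log files:
For how to generate logs, see the post "Simple Ubuntu and Windows auto-restart scripts to keep the process from crashing" (#3 by z_W)
Problem description: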

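Example:
CPU model and count: dual-socket EPYC 7702
RAM modules (count and size): 16G 3200 * 4
Number of disks: Intel P4510 7.68T * 4
Network environment and region: China Telecom, Jiangsu
Public IP available: no
Number of router layers in your network: home broadband, connected directly to the optical modem
Log file: https://raw.githubusercontent.com/Allen9168/log/main/node_log.txt
Node launch parameters: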

  ./subspace-node-ubuntu-x86_64-skylake-gemini-3h-2024-mar-25 \
    run \
    --chain gemini-3h \
    --base-path NODE_DATA_PATH \
    --farmer \
    --name "INSERT_YOUR_ID"

Farmer launch parameters:

./subspace-farmer-ubuntu-x86_64-skylake-gemini-3h-2024-mar-25 farm --reward-address WALLET_ADDRESS path=PATH_TO_FARM,size=PLOT_SIZE

Problem description:
My node crashes frequently.

Please post your question here using the format above.


Farming error; the GUI shows:
F\ [3.8t:
Farm crashed: Background task plotting-0 panicked
Log:
2024-04-01T03:20:19.583368Z INFO Node: substrate: :zzz: Idle (27 peers), best: #883455 (0x4a48…29e9), finalized #798553 (0x330f…3d9d), :arrow_down: 29.7kiB/s :arrow_up: 66.4kiB/s
2024-04-01T03:20:20.104701Z INFO Node: substrate: :sparkles: Imported #883456 (0x66c7…0dd4)
2024-04-01T03:20:24.583673Z INFO Node: substrate: :zzz: Idle (27 peers), best: #883456 (0x66c7…0dd4), finalized #798553 (0x330f…3d9d), :arrow_down: 27.7kiB/s :arrow_up: 32.3kiB/s
2024-04-01T03:20:29.584487Z INFO Node: substrate: :zzz: Idle (27 peers), best: #883456 (0x66c7…0dd4), finalized #798553 (0x330f…3d9d), :arrow_down: 35.3kiB/s :arrow_up: 37.0kiB/s
2024-04-01T03:20:34.585859Z INFO Node: substrate: :zzz: Idle (27 peers), best: #883456 (0x66c7…0dd4), finalized #798553 (0x330f…3d9d), :arrow_down: 18.8kiB/s :arrow_up: 22.0kiB/s
2024-04-01T03:20:35.876958Z INFO Node: substrate: :sparkles: Imported #883457 (0x1e31…7c30)
2024-04-01T03:20:38.278128Z INFO Node: substrate: :sparkles: Imported #883458 (0x729a…a881)
2024-04-01T03:20:38.596244Z ERROR space_acres::backend::farmer: Farm errored and stopped farm_index=0 error=Background task plotting-0 panicked
2024-04-01T03:20:39.586884Z INFO Node: substrate: :zzz: Idle (27 peers), best: #883458 (0x729a…a881), finalized #798553 (0x330f…3d9d), :arrow_down: 27.4kiB/s :arrow_up: 62.8kiB/s
2024-04-01T03:20:39.986616Z INFO Node: substrate: :sparkles: Imported #883459 (0x9018…0f66)
2024-04-01T03:20:40.959982Z ERROR jsonrpsee_core::client::async_client: [backend]: Networking or low-level protocol error: WebSocket connection error: connection closed
2024-04-01T03:20:40.971093Z WARN Node: sc_proof_of_time::source::gossip: Failed to send incoming message error=send failed because receiver is gone
2024-04-01T03:20:40.971119Z ERROR Node: sc_proof_of_time::source::gossip: Gossip engine has terminated
2024-04-01T03:20:40.971159Z ERROR Node: sc_service::task_manager: Essential task pot-gossip failed. Shutting down service.
2024-04-01T03:20:40.975696Z ERROR space_acres::backend: Failed to send run error notification error=send failed because receiver is gone
2024-04-01T03:20:41.024602Z INFO space_acres: Exiting space-acres 0.1.12 exit_status_code=Exit
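Launch parameters: default
CPU model and count: 5950X * 1
RAM modules (count and size): 8G * 2
Number of disks: 3.8T * 1
Network environment and region: China Mobile, Fujian
Public IP available: no
Number of router layers in your network: 2
Problem description: farming crashes after running for a while; other machines with the same configuration on the same network work fine.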

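If other machines with the same configuration on the same network work fine, then check this machine's hardware. I see you also asked the CTO and he has already answered you: check the RAM first. Swap the RAM modules with those from a working machine and reinstall the OS; if there are adapter cards or similar parts, swap those too. Hardware comes first.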
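Before swapping hardware, it can help to pull only the ERROR/WARN lines and panics out of a long log so the first failure is easy to spot. The following is a minimal sketch, assuming the log lines follow the `timestamp LEVEL target: message` layout seen in the report above; `find_problems` is a hypothetical helper name, not part of any Subspace tool.

```python
import re

# Matches lines such as:
# 2024-04-01T03:20:38.596244Z ERROR space_acres::backend::farmer: Farm errored ...
LOG_LINE = re.compile(r"^(?P<ts>\S+Z)\s+(?P<level>ERROR|WARN)\s+(?P<rest>.*)$")


def find_problems(log_text):
    """Return (timestamp, level, message) tuples for ERROR/WARN lines and panics."""
    problems = []
    for line in log_text.splitlines():
        line = line.strip()
        match = LOG_LINE.match(line)
        if match:
            problems.append((match["ts"], match["level"], match["rest"]))
        elif "panicked" in line:
            # Rust panic messages are printed without a timestamp prefix.
            problems.append(("", "PANIC", line))
    return problems


if __name__ == "__main__":
    import sys

    # Usage: python find_problems.py farmer.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for ts, level, message in find_problems(handle.read()):
            print(ts, level, message)
```

In the log above, the earliest entry this would surface is the `Farm errored and stopped` line at 03:20:38, which comes before the node's gossip errors.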

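CPU model and count: Ampere Q80-30 / 2 CPUs
RAM modules (count and size): 32G82/DDR4 2933
Number of disks: SATA 1.92T*7, D3600 2T*2, SN630 3.84T*1
Network environment and region: China, Chongqing Mobile
Public IP available: no public IPv4; public IPv6 available
Number of router layers in your network: connected directly to the router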
Farmer launch parameters: ./farmer-3h farm --node-rpc-url ws://192.168.100.209:9944 --reward-address stxxxxx
path=/mnt/sda/polt,size=1787GiB path=/mnt/sdb/polt,size=1787GiB path=/mnt/sdc/polt,size=1787GiB path=/mnt/sdd/polt,size=1787GiB path=/mnt/sde/polt,size=1787GiB path=/mnt/sdf/polt,size=1787GiB path=/mnt/sdg/polt,size=1787GiB path=/mnt/nvme01/polt,size=1861GiB path=/mnt/nvme02/polt,size=1861GiB path=/mnt/nvme03/polt,size=3575GiB
--farm-during-initial-plotting true --listen-on /ip4/0.0.0.0/tcp/31633 --listen-on /ip4/0.0.0.0/udp/31633/quic-v1
The node is my own self-hosted node on the local network.

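Problem description:
With a single socket, one block took about 80 seconds; with dual sockets it now takes 2 minutes 30 seconds instead. CPU usage looks fairly high in the terminal, but monitoring CPU power draw via IPMI shows that a single socket reaches 180 W, while dual sockets only reach about 100 W + 100 W. Other programs get close to 2x the throughput on dual sockets, and their power draw reaches about 180 W + 160 W. I tested the [gemini-3h-2024-mar-29] and [gemini-3h-2024-mar-25] releases of [subspace-farmer-ubuntu-aarch64-gemini-3h-2024-mar-29], and the result was the same in both cases.

Currently only the sdg disk has not finished plotting. I also tested plotting 2 disks at the same time; plotting speed and CPU power draw did not change.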

Which operating system, Windows or Linux?

Linux. He can't get online right now, so I'm answering for him.

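I have never seen or used this CPU model.
I can only offer a few ideas based on similar behavior I have seen on EPYC.
First, check and tune the NUMA settings. Ideally each CPU accesses its own local disks; don't let CPU 1 access the disks attached to CPU 2.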

If that still doesn't help, try manually specifying the CPU cores.
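To pick cores by hand you first need to know which cores belong to which NUMA node. On Linux this is exposed under `/sys/devices/system/node/node*/cpulist`. The sketch below reads those files and expands ranges such as `0-7,16-23` into explicit core numbers; `parse_cpu_list` and `numa_nodes` are hypothetical helper names, and how the resulting core list is passed to the farmer depends on your build, so check its `--help` output rather than relying on this example.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Expand a Linux CPU list such as "0-7,16-23" into sorted core numbers."""
    cores = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty list, e.g. a memory-only NUMA node
        if "-" in part:
            first, last = part.split("-")
            cores.update(range(int(first), int(last) + 1))
        else:
            cores.add(int(part))
    return sorted(cores)


def numa_nodes(sysfs_root="/sys/devices/system/node"):
    """Map NUMA node names to their cores; returns {} if sysfs is unavailable."""
    nodes = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return nodes
    for cpulist in sorted(root.glob("node[0-9]*/cpulist")):
        nodes[cpulist.parent.name] = parse_cpu_list(cpulist.read_text())
    return nodes


if __name__ == "__main__":
    for name, cores in numa_nodes().items():
        print(name, cores)
```

On a dual-socket machine this typically prints at least one line per socket, which you can compare against where each disk's controller is attached.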

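See the relevant section of the official documentation.
If none of the above works, follow the benchmarking section of that documentation, take the benchmark reports for both the single-CPU and dual-CPU setups, and ask in English at https://forum.autonomys.xyz/c/support/5 ; the CTO will answer.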

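CPU model and count: 7950X3D
RAM modules (count and size): 16G * 2 / DDR5 6400
Number of disks: D3600 2T * 4
Network environment and region: China, Chongqing Mobile
Public IP available: no public IPv4; public IPv6 available
Number of router layers in your network: connected directly to the router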

It is currently throwing errors; I have already updated to the latest version.

Below is the CPU grouping information printed at startup:
2024-04-27T17:01:36.490090Z INFO subspace_farmer::commands::farm: Multiple L3 cache groups detected l3_cache_groups=2
2024-04-27T17:01:36.490111Z INFO subspace_farmer::commands::farm: Preparing plotting thread pools plotting_thread_pool_core_indices=[CpuCoreSet { cores: CpuSet(0-7,16-23), … }, CpuCoreSet { cores: CpuSet(8-15,24-31), … }] replotting_thread_pool_core_indices=[CpuCoreSet { cores: CpuSet(0-7), … }, CpuCoreSet { cores: CpuSet(8-15), … }]

The error output is as follows:
root@ry:/home/ry/sub# ./farmer-3h farm --node-rpc-url ws://192.168.100.209:9944 --reward-address sssssssssssssssssssssssxxxxxxxxxxxxxxx path=/mnt/nvme01/polt,size=1850GiB path=/mnt/nvme02/polt,size=1850GiB path=/mnt/nvme03/polt,size=1850GiB path=/mnt/nvme04/polt,size=1850GiB --listen-on /ip4/0.0.0.0/tcp/31633 --listen-on /ip4/0.0.0.0/udp/31633/quic-v1 > /home/farmerlog.txt
thread ‘thread ‘plotting-0.11plotting-0.1’ panicked at crates/subspace-proof-of-space/src/chiapos/table.rs’ panicked at thread ‘plotting-0.14’ panicked at crates/subspace-proof-of-space/src/chiapos/table.rs:310:49:
index out of bounds: the len is 15113 but the index is 18446744072383588291
:crates/subspace-proof-of-space/src/chiapos/table.rs310::49310:
:index out of bounds: the len is 15113 but the index is 18446744073615120069
49:

2024-04-27T17:21:47.716841Z INFO {farm_index=3}: subspace_farmer::single_disk_farm::plotting: Plotting sector (3.92% complete) sector_index=73
2024-04-27T17:21:51.363188Z ERROR subspace_farmer::commands::farm: Farm errored and stopped farm_index=2 error=Low-level plotting error: Plotting progress stream ended before plotting finished
2024-04-27T17:21:51.545375Z INFO {farm_index=3}: subspace_farmer::single_disk_farm::plotting: Plotting sector (3.98% complete) sector_index=74
2024-04-27T17:21:52.069016Z INFO {farm_index=3}: subspace_farmer::single_disk_farm::plotting: Plotting sector (4.03% complete) sector_index=75
2024-04-27T17:21:52.069060Z WARN {farm_index=3}: subspace_farmer::single_disk_farm::plotting: Failed to send sector index for initial plotting error=send failed because receiver is gone
2024-04-27T17:21:52.069061Z WARN subspace_farmer::plotter::cpu: Failed to send error progress update error=send failed because receiver is gone
2024-04-27T17:21:52.069934Z ERROR subspace_farmer::commands::farm: Farm exited with error farm_index=3 error=Low-level plotting error: Plotting progress stream ended before plotting finished
2024-04-27T17:21:55.137970Z WARN subspace_farmer::plotter::cpu: Failed to send error progress update error=send failed because receiver is gone
2024-04-27T17:21:55.709288Z WARN subspace_farmer::plotter::cpu: Failed to send error progress update error=send failed because receiver is gone
2024-04-27T17:22:21.365252Z ERROR subspace_farmer::commands::farm: Farm errored and stopped farm_index=2 error=Low-level plotting error: Plotting progress stream ended before plotting finished
index out of bounds: the len is 15113 but the index is 18446744072357195116
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
thread ‘tokio-runtime-worker’ panicked at /home/ubuntu/actions-runner/_work/subspace/subspace/crates/subspace-farmer/src/plotter/cpu.rs:188:84:
Number of table generators is the same as number of thread pools; qed

You are presumably using a remote node, i.e. the node is not on the same machine? Check that the connection between the two is stable.

Also, how stable are your CPU and memory? Are they overclocked?
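One detail in the panic messages is worth noting: the out-of-bounds indices are all just below 2^64, which is what a negative number looks like when it is reinterpreted as an unsigned 64-bit index. The sketch below does that conversion for the three values from the report; `as_signed_64` is a hypothetical helper name, not part of the farmer. It only shows that the values are wrapped negatives. It does not tell you whether the cause is unstable hardware or a software bug.

```python
def as_signed_64(value):
    """Reinterpret an unsigned 64-bit integer as a signed (two's complement) one."""
    if not 0 <= value < 2**64:
        raise ValueError("value does not fit in 64 bits")
    return value - 2**64 if value >= 2**63 else value


if __name__ == "__main__":
    # Indices copied from the panic messages in the report above.
    for index in (
        18446744072383588291,
        18446744073615120069,
        18446744072357195116,
    ):
        print(index, "->", as_signed_64(index))
```

Each call yields a different negative value, while the table length in the message stays at 15113.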