Node Shutdown Due to Essential Task Failure

Issue Report

Environment

  • Ubuntu Server 22.04.4 LTS
  • Advanced CLI

Problem

After migrating the system to a new disk, I get the following errors when I start the domain node. Without the domain node, everything works normally. Prior to the migration, the domain node worked without errors.

./subspace-node-ubuntu-x86_64-skylake-gemini-3h-2024-jul-29 run \
  --chain gemini-3h \
  --name anton \
  --base-path NODE_DATA_PATH \
  --blocks-pruning archive-canonical \
  --state-pruning archive-canonical \
  --sync full \
  -- \
  --domain-id 1 \
  --operator-id 53 \
  --listen-on /ip4/0.0.0.0/tcp/40333
2024-08-23T16:32:26.190198Z  INFO Domain: substrate: 💤 Idle (0 peers), best: #216089 (0x6201…8d81), finalized #0 (0x0a4d…0a83), ⬇ 2.3kiB/s ⬆ 0.9kiB/s    
2024-08-23T16:32:29.639559Z  INFO Consensus: substrate: ⚙️  Syncing  0.0 bps, target=#2964764 (40 peers), best: #2964701 (0x5394…1b5b), finalized #2896413 (0x416a…cd58), ⬇ 128.1kiB/s ⬆ 3.2kiB/s    
2024-08-23T16:32:31.190406Z  INFO Domain: substrate: 💤 Idle (0 peers), best: #216089 (0x6201…8d81), finalized #0 (0x0a4d…0a83), ⬇ 70 B/s ⬆ 0.2kiB/s    
2024-08-23T16:32:34.639882Z  INFO Consensus: substrate: ⚙️  Syncing  0.0 bps, target=#2964767 (40 peers), best: #2964701 (0x5394…1b5b), finalized #2896413 (0x416a…cd58), ⬇ 144.4kiB/s ⬆ 3.0kiB/s    
2024-08-23T16:32:36.190643Z  INFO Domain: substrate: 💤 Idle (0 peers), best: #216089 (0x6201…8d81), finalized #0 (0x0a4d…0a83), ⬇ 0.2kiB/s ⬆ 0.3kiB/s    
2024-08-23T16:32:39.640242Z  INFO Consensus: substrate: ⚙️  Preparing  0.0 bps, target=#2964767 (40 peers), best: #2964701 (0x5394…1b5b), finalized #2896413 (0x416a…cd58), ⬇ 86.1kiB/s ⬆ 2.4kiB/s    
2024-08-23T16:32:40.260383Z  INFO Domain: substrate: 🏆 Imported #242000 (0xe516…6f00 → 0x3e89…076c)    
2024-08-23T16:32:41.141174Z  WARN Domain: domain_client_operator::bundle_processor: Slow domain block execution, took 943ms consensus_block_info=(0x653d3907bbaa3d78a5ee2131289793a01de3d3e8e05b2127e42e0c95542db03d, 2964697) built_block_info=(0xa591f9d332814aea4cb9656a9d4c5188ae9ad6f39014d6b6df836b3cf6172680, 242001) reference_block_execution_duration_ms=0
2024-08-23T16:32:41.190845Z  INFO Domain: substrate: 💤 Idle (0 peers), best: #242001 (0xa591…2680), finalized #0 (0x0a4d…0a83), ⬇ 0.2kiB/s ⬆ 0.2kiB/s    
2024-08-23T16:32:41.375285Z  WARN Domain: domain_client_operator::bundle_processor: Slow domain block execution, took 233ms consensus_block_info=(0x8c6b0066e5af6dca91f680e7dd171e953b102c7cb7c486f23c5e8f05ea93c189, 2964698) built_block_info=(0x2145e8fe1ade5ba6515ab7d32d545c6a3119f272a901feaf982610da154bdf3b, 242002) reference_block_execution_duration_ms=0
2024-08-23T16:32:44.640526Z  INFO Consensus: substrate: ⚙️  Syncing  1.8 bps, target=#2964770 (40 peers), best: #2964710 (0x91a6…806e), finalized #2896413 (0x416a…cd58), ⬇ 83.0kiB/s ⬆ 2.6kiB/s    
2024-08-23T16:32:46.190990Z  INFO Domain: substrate: 💤 Idle (0 peers), best: #242011 (0x2be8…f7fc), finalized #0 (0x0a4d…0a83), ⬇ 43 B/s ⬆ 0.1kiB/s    
2024-08-23T16:32:49.640980Z  INFO Consensus: substrate: ⚙️  Preparing  2.8 bps, target=#2964770 (40 peers), best: #2964724 (0x6d43…5bda), finalized #2896413 (0x416a…cd58), ⬇ 90.3kiB/s ⬆ 3.0kiB/s    
2024-08-23T16:32:51.191526Z  INFO Domain: substrate: 💤 Idle (0 peers), best: #242020 (0xa05d…edeb), finalized #0 (0x0a4d…0a83), ⬇ 0 ⬆ 0    
2024-08-23T16:32:54.641445Z  INFO Consensus: substrate: ⚙️  Preparing  0.8 bps, target=#2964771 (40 peers), best: #2964728 (0xa982…aab6), finalized #2896413 (0x416a…cd58), ⬇ 90.3kiB/s ⬆ 3.4kiB/s    
2024-08-23T16:32:56.191698Z  INFO Domain: substrate: 💤 Idle (0 peers), best: #242024 (0xfc93…e91c), finalized #0 (0x0a4d…0a83), ⬇ 0.4kiB/s ⬆ 0.4kiB/s    
2024-08-23T16:32:59.641857Z  INFO Consensus: substrate: ⚙️  Preparing  1.2 bps, target=#2964773 (40 peers), best: #2964734 (0x5e3f…b7d1), finalized #2896413 (0x416a…cd58), ⬇ 90.4kiB/s ⬆ 4.7kiB/s    
2024-08-23T16:33:01.192421Z  INFO Domain: substrate: 💤 Idle (0 peers), best: #242027 (0x505e…2b17), finalized #0 (0x0a4d…0a83), ⬇ 43 B/s ⬆ 0.1kiB/s    
2024-08-23T16:33:04.642411Z  INFO Consensus: substrate: ⚙️  Syncing  0.4 bps, target=#2964775 (40 peers), best: #2964736 (0x86bb…4b13), finalized #2896413 (0x416a…cd58), ⬇ 95.2kiB/s ⬆ 3.8kiB/s    
2024-08-23T16:33:06.192585Z  INFO Domain: substrate: 💤 Idle (0 peers), best: #242031 (0x8435…2c27), finalized #0 (0x0a4d…0a83), ⬇ 0 ⬆ 0    
2024-08-23T16:33:09.643068Z  INFO Consensus: substrate: ⚙️  Syncing  1.0 bps, target=#2964776 (40 peers), best: #2964741 (0x6302…c1ad), finalized #2896413 (0x416a…cd58), ⬇ 89.5kiB/s ⬆ 4.3kiB/s    
2024-08-23T16:33:11.192712Z  INFO Domain: substrate: 💤 Idle (0 peers), best: #242035 (0x7264…60e8), finalized #0 (0x0a4d…0a83), ⬇ 55 B/s ⬆ 92 B/s    
2024-08-23T16:33:14.643366Z  INFO Consensus: substrate: ⚙️  Syncing  0.8 bps, target=#2964777 (40 peers), best: #2964745 (0x86e2…1a1d), finalized #2896413 (0x416a…cd58), ⬇ 132.1kiB/s ⬆ 5.6kiB/s    
2024-08-23T16:33:16.192832Z  INFO Domain: substrate: 💤 Idle (0 peers), best: #242039 (0x5f84…6b44), finalized #0 (0x0a4d…0a83), ⬇ 0.3kiB/s ⬆ 0.4kiB/s    
2024-08-23T16:33:19.643751Z  INFO Consensus: substrate: ⚙️  Preparing  1.0 bps, target=#2964777 (40 peers), best: #2964750 (0xce92…8dbf), finalized #2896413 (0x416a…cd58), ⬇ 106.8kiB/s ⬆ 21.0kiB/s    
2024-08-23T16:33:21.196033Z  INFO Domain: substrate: 💤 Idle (0 peers), best: #242043 (0x310c…11da), finalized #0 (0x0a4d…0a83), ⬇ 0.8kiB/s ⬆ 0.4kiB/s    
2024-08-23T16:33:24.643975Z  INFO Consensus: substrate: ⚙️  Preparing  0.8 bps, target=#2964778 (40 peers), best: #2964754 (0x9535…8bb0), finalized #2896413 (0x416a…cd58), ⬇ 112.7kiB/s ⬆ 5.5kiB/s    
2024-08-23T16:33:26.196159Z  INFO Domain: substrate: 💤 Idle (0 peers), best: #242047 (0xe1bf…f925), finalized #0 (0x0a4d…0a83), ⬇ 0.6kiB/s ⬆ 0.7kiB/s    
2024-08-23T16:33:29.644318Z  INFO Consensus: substrate: ⚙️  Preparing  1.0 bps, target=#2964780 (40 peers), best: #2964759 (0xb8b4…5e1a), finalized #2896413 (0x416a…cd58), ⬇ 67.0kiB/s ⬆ 4.5kiB/s    
2024-08-23T16:33:31.196412Z  INFO Domain: substrate: 💤 Idle (0 peers), best: #242051 (0x286a…299e), finalized #0 (0x0a4d…0a83), ⬇ 75 B/s ⬆ 0.2kiB/s    
2024-08-23T16:33:34.644868Z  INFO Consensus: substrate: ⚙️  Preparing  0.8 bps, target=#2964780 (40 peers), best: #2964763 (0x96d9…992e), finalized #2896413 (0x416a…cd58), ⬇ 52.5kiB/s ⬆ 2.7kiB/s    
2024-08-23T16:33:36.196634Z  INFO Domain: substrate: 💤 Idle (0 peers), best: #242054 (0xa4f3…720e), finalized #0 (0x0a4d…0a83), ⬇ 0.1kiB/s ⬆ 0.2kiB/s    
2024-08-23T16:33:39.645183Z  INFO Consensus: substrate: ⚙️  Preparing  1.0 bps, target=#2964780 (40 peers), best: #2964768 (0x84f0…66cc), finalized #2896413 (0x416a…cd58), ⬇ 65.9kiB/s ⬆ 2.7kiB/s    
2024-08-23T16:33:41.198202Z  INFO Domain: substrate: 💤 Idle (0 peers), best: #242058 (0xf6f8…14e1), finalized #0 (0x0a4d…0a83), ⬇ 3.0kiB/s ⬆ 1.5kiB/s    
2024-08-23T16:33:44.645519Z  INFO Consensus: substrate: ⚙️  Preparing  1.4 bps, target=#2964782 (40 peers), best: #2964775 (0xe0bb…f2eb), finalized #2896413 (0x416a…cd58), ⬇ 76.0kiB/s ⬆ 2.7kiB/s    
2024-08-23T16:33:46.198333Z  INFO Domain: substrate: 💤 Idle (0 peers), best: #242062 (0x98d0…eacc), finalized #0 (0x0a4d…0a83), ⬇ 1.3kiB/s ⬆ 0.8kiB/s    
2024-08-23T16:33:46.332549Z  INFO Domain: substrate: 🆕 Imported #242063 (0x98d0…eacc → 0xfc04…32ff)    
2024-08-23T16:33:49.646002Z  INFO Consensus: substrate: ⚙️  Preparing  0.4 bps, target=#2964783 (40 peers), best: #2964777 (0xc00f…9604), finalized #2896413 (0x416a…cd58), ⬇ 132.9kiB/s ⬆ 3.3kiB/s    
2024-08-23T16:33:49.869711Z ERROR Domain: domain_client_operator::domain_worker: Failed to process consensus block error=Application("Failed to generate invalid state transition fraud proof: Error at calling runtime api: Api called for an unknown Block: State already discarded for 0x19eced8a398741565ed99fb039843dfeeb4f1de5e6e2e29c4a17b2da0f55560a")
2024-08-23T16:33:49.870893Z ERROR Domain: sc_service::task_manager: Essential task `domain-operator-worker` failed. Shutting down service.    
2024-08-23T16:33:51.152523Z ERROR Domain: subspace_node::commands::run: Domain starter exited with an error err=Other: Essential task failed.
2024-08-23T16:33:51.152993Z ERROR sc_service::task_manager: Essential task `domain` failed. Shutting down service.    
Error: SubstrateService(Other("Essential task failed."))

How long will this last? Hmm.

Hey, sorry for missing this! Which release was the node running before the migration?

Before the migration it was running this same release, gemini-3h-2024-jul-29. I am now using gemini-3h-2024-sep-03.

Did you see any log containing “Submit fraud proof” before the migration?

Your domain node seems to have deviated from the domain chain because it derives a different domain block from the consensus block. You can check the hash of the domain block in the log and compare it with the block hash from the auto-id domain RPC node.
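One way to do that comparison is sketched below in Python. It relies on the `built_block_info=(hash, number)` field of the "Slow domain block execution" warnings in the log above, which carries the full domain block hash rather than the shortened form. It also builds a request body for `chain_getBlockHash`, the standard Substrate JSON-RPC method for looking up a hash by block number. The function names are hypothetical, and no RPC endpoint is assumed: the hash to compare against has to be fetched from the auto-id domain RPC node separately.

```python
import json
import re

# Matches the "built_block_info=(0x<64 hex chars>, <number>)" field found in
# the "Slow domain block execution" warnings of the domain node log.
BUILT_BLOCK_RE = re.compile(r"built_block_info=\((0x[0-9a-fA-F]{64}), (\d+)\)")


def local_domain_hashes(log_text):
    """Return {domain_block_number: full_hash} for every match in the log."""
    return {int(number): block_hash.lower()
            for block_hash, number in BUILT_BLOCK_RE.findall(log_text)}


def block_hash_request(block_number, request_id=1):
    """Build the JSON-RPC body asking a node for the hash at a given height."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "chain_getBlockHash",
        "params": [block_number],
    })


def has_diverged(local_hash, rpc_hash):
    """True when the locally derived hash differs from the RPC node's hash."""
    return local_hash.lower() != rpc_hash.lower()


# One warning line taken from the log above.
sample = (
    "WARN Domain: domain_client_operator::bundle_processor: Slow domain block "
    "execution, took 943ms consensus_block_info="
    "(0x653d3907bbaa3d78a5ee2131289793a01de3d3e8e05b2127e42e0c95542db03d, 2964697) "
    "built_block_info="
    "(0xa591f9d332814aea4cb9656a9d4c5188ae9ad6f39014d6b6df836b3cf6172680, 242001) "
    "reference_block_execution_duration_ms=0"
)
hashes = local_domain_hashes(sample)
```

If `has_diverged` returns True for a height that both nodes have imported, the local node has derived a different domain block at that height.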

We did observe a similar issue in one of our internal nodes. The root cause was that the mar-25 release contains a breaking change and we didn’t upgrade from the mar-22 release to mar-25 in time, so some domain blocks were produced with the old release and ended up with a different hash. The solution is to wipe the node and resync from genesis.

After erasing the node and synchronizing it, I get the following logs:

2024-09-13T21:48:13.705985Z ERROR Domain: message::relayer: Failed to submit messages from the chain Domain(DomainId(1)) at the block (3269726 err=FetchAssignedMessages
2024-09-13T21:48:13.709118Z  INFO Domain: substrate: 🆕 Imported #375197 (0x4b3a…f8e0 → 0xe9c2…a547)    
2024-09-13T21:48:14.237933Z  INFO Consensus: substrate: 💤 Idle (40 peers), best: #3269826 (0x7517…dc0b), finalized #3194700 (0xe74b…9bb7), ⬇ 109.9kiB/s ⬆ 84.6kiB/s    
2024-09-13T21:48:15.780342Z  INFO Domain: substrate: 💤 Idle (8 peers), best: #375197 (0xe9c2…a547), finalized #0 (0xbaf8…9d61), ⬇ 65 B/s ⬆ 0.4kiB/s    
2024-09-13T21:48:17.611856Z  INFO Consensus: substrate: 🏆 Imported #3269827 (0x7517…dc0b → 0xf5f7…ddab)    
2024-09-13T21:48:17.612453Z ERROR Domain: message::relayer: Failed to submit messages from the chain Domain(DomainId(1)) at the block (3269727 err=FetchAssignedMessages
2024-09-13T21:48:17.616103Z  INFO Domain: substrate: 🆕 Imported #375198 (0xe9c2…a547 → 0xbe9b…5491)    
2024-09-13T21:48:19.238228Z  INFO Consensus: substrate: 💤 Idle (40 peers), best: #3269827 (0xf5f7…ddab), finalized #3194700 (0xe74b…9bb7), ⬇ 95.5kiB/s ⬆ 134.6kiB/s    
2024-09-13T21:48:20.780586Z  INFO Domain: substrate: 💤 Idle (8 peers), best: #375198 (0xbe9b…5491), finalized #0 (0xbaf8…9d61), ⬇ 64 B/s ⬆ 64 B/s    
2024-09-13T21:48:24.238817Z  INFO Consensus: substrate: 💤 Idle (40 peers), best: #3269827 (0xf5f7…ddab), finalized #3194700 (0xe74b…9bb7), ⬇ 74.8kiB/s ⬆ 93.4kiB/s    
2024-09-13T21:48:24.978812Z  INFO Consensus: substrate: 🏆 Imported #3269828 (0xf5f7…ddab → 0x2a4f…0d76)    
2024-09-13T21:48:24.979399Z ERROR Domain: message::relayer: Failed to submit messages from the chain Domain(DomainId(1)) at the block (3269728 err=FetchAssignedMessages
2024-09-13T21:48:24.982698Z  INFO Domain: substrate: 🆕 Imported #375199 (0xbe9b…5491 → 0xffd2…b12f)    
2024-09-13T21:48:25.780828Z  INFO Domain: substrate: 💤 Idle (8 peers), best: #375199 (0xffd2…b12f), finalized #0 (0xbaf8…9d61), ⬇ 25 B/s ⬆ 25 B/s    
2024-09-13T21:48:29.239107Z  INFO Consensus: substrate: 💤 Idle (40 peers), best: #3269828 (0x2a4f…0d76), finalized #3194700 (0xe74b…9bb7), ⬇ 144.6kiB/s ⬆ 121.3kiB/s    
2024-09-13T21:48:29.863953Z  INFO Consensus: substrate: 🏆 Imported #3269829 (0x2a4f…0d76 → 0x4fb4…6724)    
2024-09-13T21:48:29.864546Z ERROR Domain: message::relayer: Failed to submit messages from the chain Domain(DomainId(1)) at the block (3269729 err=FetchAssignedMessages
2024-09-13T21:48:29.869574Z  INFO Domain: substrate: 🆕 Imported #375200 (0xffd2…b12f → 0x1f7a…3fb2)    
2024-09-13T21:48:30.781086Z  INFO Domain: substrate: 💤 Idle (8 peers), best: #375200 (0x1f7a…3fb2), finalized #0 (0xbaf8…9d61), ⬇ 1.3kiB/s ⬆ 92 B/s    
2024-09-13T21:48:34.239355Z  INFO Consensus: substrate: 💤 Idle (40 peers), best: #3269829 (0x4fb4…6724), finalized #3194700 (0xe74b…9bb7), ⬇ 130.1kiB/s ⬆ 83.5kiB/s    
2024-09-13T21:48:35.781282Z  INFO Domain: substrate: 💤 Idle (8 peers), best: #375200 (0x1f7a…3fb2), finalized #0 (0xbaf8…9d61), ⬇ 64 B/s ⬆ 64 B/s    
2024-09-13T21:48:39.239766Z  INFO Consensus: substrate: 💤 Idle (40 peers), best: #3269829 (0x4fb4…6724), finalized #3194700 (0xe74b…9bb7), ⬇ 111.5kiB/s ⬆ 114.5kiB/s    
2024-09-13T21:48:40.781497Z  INFO Domain: substrate: 💤 Idle (8 peers), best: #375200 (0x1f7a…3fb2), finalized #0 (0xbaf8…9d61), ⬇ 25 B/s ⬆ 25 B/s    
2024-09-13T21:48:42.097567Z  INFO Consensus: substrate: 🏆 Imported #3269830 (0x4fb4…6724 → 0xbc1c…3b39)    
2024-09-13T21:48:42.098082Z ERROR Domain: message::relayer: Failed to submit messages from the chain Domain(DomainId(1)) at the block (3269730 err=FetchAssignedMessages
2024-09-13T21:48:42.101826Z  INFO Domain: substrate: 🆕 Imported #375201 (0x1f7a…3fb2 → 0x234b…07c4)    
2024-09-13T21:48:44.240148Z  INFO Consensus: substrate: 💤 Idle (40 peers), best: #3269830 (0xbc1c…3b39), finalized #3194700 (0xe74b…9bb7), ⬇ 92.2kiB/s ⬆ 220.0kiB/s    
2024-09-13T21:48:45.781762Z  INFO Domain: substrate: 💤 Idle (8 peers), best: #375201 (0x234b…07c4), finalized #0 (0xbaf8…9d61), ⬇ 12 B/s ⬆ 12 B/s    
2024-09-13T21:48:49.240562Z  INFO Consensus: substrate: 💤 Idle (40 peers), best: #3269830 (0xbc1c…3b39), finalized #3194700 (0xe74b…9bb7), ⬇ 80.7kiB/s ⬆ 108.8kiB/s    
2024-09-13T21:48:50.782011Z  INFO Domain: substrate: 💤 Idle (8 peers), best: #375201 (0x234b…07c4), finalized #0 (0xbaf8…9d61), ⬇ 64 B/s ⬆ 64 B/s    
2024-09-13T21:48:53.115671Z  INFO Consensus: substrate: 🏆 Imported #3269831 (0xbc1c…3b39 → 0x28cb…a0b1)    
2024-09-13T21:48:53.116148Z ERROR Domain: message::relayer: Failed to submit messages from the chain Domain(DomainId(1)) at the block (3269731 err=FetchAssignedMessages
2024-09-13T21:48:53.119968Z  INFO Domain: substrate: 🆕 Imported #375202 (0x234b…07c4 → 0x2102…6d42)    
2024-09-13T21:48:54.240987Z  INFO Consensus: substrate: 💤 Idle (40 peers), best: #3269831 (0x28cb…a0b1), finalized #3194700 (0xe74b…9bb7), ⬇ 150.9kiB/s ⬆ 124.6kiB/s    
2024-09-13T21:48:55.782185Z  INFO Domain: substrate: 💤 Idle (8 peers), best: #375202 (0x2102…6d42), finalized #0 (0xbaf8…9d61), ⬇ 25 B/s ⬆ 25 B/s    
2024-09-13T21:48:59.241275Z  INFO Consensus: substrate: 💤 Idle (40 peers), best: #3269831 (0x28cb…a0b1), finalized #3194700 (0xe74b…9bb7), ⬇ 88.9kiB/s ⬆ 103.7kiB/s    
2024-09-13T21:49:00.782324Z  INFO Domain: substrate: 💤 Idle (8 peers), best: #375202 (0x2102…6d42), finalized #0 (0xbaf8…9d61), ⬇ 12 B/s ⬆ 12 B/s    
2024-09-13T21:49:01.695214Z  INFO Consensus: substrate: 🏆 Imported #3269832 (0x28cb…a0b1 → 0x1cf2…4061)    
2024-09-13T21:49:01.695759Z ERROR Domain: message::relayer: Failed to submit messages from the chain Domain(DomainId(1)) at the block (3269732 err=FetchAssignedMessages
2024-09-13T21:49:01.699171Z  INFO Domain: substrate: 🆕 Imported #375203 (0x2102…6d42 → 0x40dd…89bb)    
2024-09-13T21:49:04.014003Z  INFO Consensus: substrate: 🏆 Imported #3269833 (0x1cf2…4061 → 0x35f4…20ef)    
2024-09-13T21:49:04.014603Z ERROR Domain: message::relayer: Failed to submit messages from the chain Domain(DomainId(1)) at the block (3269733 err=FetchAssignedMessages
2024-09-13T21:49:04.241615Z  INFO Consensus: substrate: 💤 Idle (40 peers), best: #3269833 (0x35f4…20ef), finalized #3194700 (0xe74b…9bb7), ⬇ 150.4kiB/s ⬆ 77.3kiB/s    
2024-09-13T21:49:05.782490Z  INFO Domain: substrate: 💤 Idle (8 peers), best: #375203 (0x40dd…89bb), finalized #0 (0xbaf8…9d61), ⬇ 3.6kiB/s ⬆ 1.1kiB/s    
2024-09-13T21:49:09.241959Z  INFO Consensus: substrate: 💤 Idle (40 peers), best: #3269833 (0x35f4…20ef), finalized #3194700 (0xe74b…9bb7), ⬇ 95.4kiB/s ⬆ 97.6kiB/s    
2024-09-13T21:49:10.069093Z  INFO Consensus: substrate: 🏆 Imported #3269834 (0x35f4…20ef → 0xcd56…84b3)    
2024-09-13T21:49:10.069571Z ERROR Domain: message::relayer: Failed to submit messages from the chain Domain(DomainId(1)) at the block (3269734 err=FetchAssignedMessages
2024-09-13T21:49:10.073102Z  INFO Domain: substrate: 🆕 Imported #375204 (0x40dd…89bb → 0x184b…a5a1) 

How can I fix this error and what does it mean?

I noticed this with an Auto ID domain operator I was running. I discussed it internally with the engineers, and it’s due to domain blocks being pruned. The fix is to add these flags to the domain side of your arguments (after the --):

--blocks-pruning archive-canonical
--state-pruning archive-canonical

Unfortunately, this does mean you will need to re-sync from scratch. I did that (it took ~3 days), and I am now running an operator on the Auto ID domain without the FetchAssignedMessages errors.
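To make the placement concrete, here is a small Python sketch that takes the argument list from the original command and appends the two pruning flags to the domain side, that is, after the `--` separator. The original command already sets both flags on the consensus side, so only the part after `--` changes. The helper name `with_domain_archive_flags` is hypothetical and is only meant to illustrate where the flags go.

```python
# The two flags from the post above, as (flag, value) pairs.
DOMAIN_PRUNING_FLAGS = [
    ("--blocks-pruning", "archive-canonical"),
    ("--state-pruning", "archive-canonical"),
]


def with_domain_archive_flags(argv):
    """Return a copy of argv with the pruning flags added after the `--`."""
    if "--" not in argv:
        raise ValueError("no `--` separator: the command has no domain arguments")
    split = argv.index("--")
    consensus_side, domain_side = argv[:split], argv[split + 1:]
    extra = []
    for flag, value in DOMAIN_PRUNING_FLAGS:
        # Skip a flag the domain side already sets, so reruns change nothing.
        if flag not in domain_side:
            extra += [flag, value]
    return consensus_side + ["--"] + domain_side + extra


# Arguments of the command from the original report.
argv = [
    "run", "--chain", "gemini-3h", "--name", "anton",
    "--base-path", "NODE_DATA_PATH",
    "--blocks-pruning", "archive-canonical",
    "--state-pruning", "archive-canonical",
    "--sync", "full",
    "--",
    "--domain-id", "1", "--operator-id", "53",
    "--listen-on", "/ip4/0.0.0.0/tcp/40333",
]
result = with_domain_archive_flags(argv)
print(" ".join(result))
```

The printed argument list ends with the two pruning flags after `--listen-on /ip4/0.0.0.0/tcp/40333`, while everything before the `--` is unchanged.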
