RL-ACO: Reinforcement Learning Adaptive Consensus Optimization for Scalable Blockchain-Based Greenhouse Gas Monitoring

Alick Andrew Sakala, Yu Chen

doi:10.29322/IJSRP.16.05.2026.p17325

IJSRP, Volume 16, Issue 5, May 2026 Edition [ISSN 2250-3153]

Abstract: Byzantine Fault Tolerant (BFT) consensus protocols underpin data-integrity guarantees in permissioned blockchains, yet their O(N^2) message complexity renders them impractical for the large multi-stakeholder consortia required by industrial greenhouse-gas (GHG) Monitoring, Reporting, and Verification (MRV) systems. At N = 400 validators, a size representative of a pan-West-African climate coalition, classical PBFT throughput collapses from approximately 2,610 TPS to 337 TPS, violating the minimum viability threshold for continuous IoT-driven emissions tracking. This paper presents RL-ACO, an AI-driven consensus framework that embeds a Deep Q-Network (DQN) agent directly into the consensus control loop. The agent observes a ten-dimensional blockchain state vector and selects from 18 discrete parameter-adjustment actions to dynamically tune the cluster count k, the block interval I, and the emission-alert priority weight ω. A composite climate-aware reward function R(s, a) jointly optimizes throughput, P99 latency, Byzantine fault-tolerance margin, and GHG alert timeliness. Minimum Spanning Tree (MST) hierarchical cluster formation reduces message complexity from O(N^2) to O(N log N), while BLS threshold signature aggregation cuts per-round bandwidth by an order of magnitude. Security and liveness are formally proven under partial synchrony for f < N/3 Byzantine nodes. Evaluated on three public environmental datasets (EPA GHGRP, CDP Supply Chain, and OpenGHG), RL-ACO sustains 3,625 TPS at N = 400, a 10.8× improvement over PBFT and 3.0× over IBFT 2.0. The DQN agent converges in approximately 1,200 training episodes, raises anomaly-detection F1 from 65.3% to 91.2%, and achieves an ISO 14064-3 compliance score of 96/100. An 864-configuration sensitivity analysis confirms that the framework's throughput advantage over IBFT 2.0 never falls below +127%, irrespective of workload, Byzantine rate, or hyperparameter choice.
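The scaling figures quoted in the abstract can be checked with a few lines of arithmetic. The Python sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, constant factors are dropped, and the base-2 logarithm is an assumption, since O(N log N) does not fix a base. It compares the order of per-round message counts at N = 400, computes the largest f satisfying f < N/3, and recomputes the reported speedup over PBFT.

```python
import math


def flat_bft_messages(n: int) -> int:
    """Order of messages per round for all-to-all BFT voting: n^2.
    Constant factors are dropped (assumption for illustration)."""
    return n * n


def hierarchical_messages(n: int) -> float:
    """Order of messages per round for a hierarchical scheme: n * log2(n).
    The log base and the missing constants are assumptions."""
    return n * math.log2(n)


def max_byzantine(n: int) -> int:
    """Largest integer f with f < n/3, i.e. 3f < n."""
    return (n - 1) // 3


n = 400
# Ratio of the two message-count orders at N = 400.
reduction = flat_bft_messages(n) / hierarchical_messages(n)
# Throughput figures as reported in the abstract (TPS at N = 400).
speedup_vs_pbft = 3625 / 337

print(f"message-count ratio at N={n}: about {reduction:.0f}x")
print(f"max tolerated Byzantine nodes: {max_byzantine(n)}")
print(f"RL-ACO vs PBFT throughput: {speedup_vs_pbft:.1f}x")
```

Under these assumptions the message-count order shrinks by a factor of roughly 46 at N = 400, up to 133 of the 400 validators may be Byzantine, and the throughput ratio rounds to the 10.8× stated in the abstract.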

Reference this Research Paper (Copy):

Alick Andrew Sakala, Yu Chen (2026); RL-ACO: Reinforcement Learning Adaptive Consensus Optimization for Scalable Blockchain-Based Greenhouse Gas Monitoring; International Journal of Scientific and Research Publications (IJSRP) 16(5) (ISSN: 2250-3153), DOI: http://dx.doi.org/10.29322/IJSRP.16.05.2026.p17325
