Designing clustered unmanned aerial vehicle (UAV) communication networks based on cognitive radio (CR) and reinforcement learning can significantly improve the intelligence level of clustered UAV communication networks and the robustness of the system in a time-varying environment. Among them, designing smarter systems for spectrum sensing and access is a key research issue in CR. Therefore, we focus on the dynamic cooperative spectrum sensing and channel access in clustered cognitive UAV (CUAV) communication networks. Due to the lack of prior statistical information on the primary user (PU) channel occupancy state, we propose to use multi-agent reinforcement learning (MARL) to model CUAV spectrum competition and cooperative decision-making problem in this dynamic scenario, and a return function based on the weighted compound of sensing-transmission cost and utility is introduced to characterize the real-time rewards of multi-agent game. On this basis, a time slot multi-round revisit exhaustive search algorithm based on virtual controller (VC-EXH), a Q-learning algorithm based on independent learner (IL-Q) and a deep Q-learning algorithm based on independent learner (IL-DQN) are respectively proposed. Further, the information exchange overhead, execution complexity and convergence of the three algorithms are briefly analyzed. Through the numerical simulation analysis, all three algorithms can converge quickly, significantly improve system performance and increase the utilization of idle spectrum resources.