The advent of new special-purpose hardware such as FPGA or ASIC-based annealers and quantum processors has shown potential in solving certain families of complex combinatorial optimization problems more efficiently than conventional CPUs. We show that to address an industrial optimization problem, a hybrid architecture of CPUs and non-CPU devices is inevitable. In this paper, we propose problem decomposition as an effective method for designing a hybrid CPU--non-CPU optimization solver. We introduce the required algorithmic elements for making problem decomposition a viable approach in meeting the real-world constraints such as communication time and the potential higher cost of using non-CPU hardware. We then turn to the well-known maximum clique problem, and propose a new method of decomposition for this problem. Our method enables us to solve the maximum clique problem on very large graphs using non-CPU hardware that is considerably smaller than the size of the graph. As an example, we show that the maximum clique problem on the com-Amazon graph, with 334,863 vertices and 925,872 edges, can be solved with a single call to a device that can embed a fully connected graph of size at least 21 nodes, such as the D-Wave 2000Q. We also show that our proposed problem decomposition approach can improve the runtime of two of the best-known classical algorithms for large, sparse graphs, namely PMC and BBMCSP, by orders of magnitude. In the light of our study, we believe that new non-CPU hardware that is small in size could become competitive with CPUs if it could be either mass produced and highly parallelized, or able to provide high-quality solutions to specific, small-sized problems significantly faster than CPUs.