Subgraph isomorphism is a well-known NP-hard problem with many applications, such as social network analysis and queries over knowledge graphs. Because of its inherent hardness, it is often a performance bottleneck in real-world applications. We address this by designing an efficient subgraph isomorphism algorithm that leverages features of the GPU architecture, such as massive parallelism and the memory hierarchy. Existing GPU-based solutions adopt a two-step output scheme, performing the same join twice so that intermediate results can be written concurrently. They also lack the GPU architecture-aware optimizations needed to scale to large graphs. In this paper, we propose GSI, a GPU-friendly subgraph isomorphism algorithm. Unlike existing edge join-based GPU solutions, GSI uses a Prealloc-Combine strategy based on a vertex-oriented framework, which avoids joining twice. We also propose a GPU-friendly data structure, PCSR, to represent edge-labeled graphs. Extensive experiments on both synthetic and real graphs show that GSI outperforms state-of-the-art algorithms by up to several orders of magnitude and scales well, handling graphs with hundreds of millions of edges.