Load balancing at the transport layer is an important function in data centers, content delivery networks, and mobile networks, where per-connection consistency (PCC) must be maintained for optimal performance. Cloud-native L4 load balancers are commonly deployed as virtual network functions (VNFs) and are a critical forwarding element in modern cloud infrastructure. We identify load imbalance among service instances as the main cause of the additional processing delay introduced by transport-layer load balancers. Existing transport-layer load balancers rely on one of two methods: host-level traffic redirection, which may add as much as 12.48% additional traffic to the underlying network, or connection tracking, which consumes a considerable amount of memory in the load balancer. Both methods use network resources inefficiently. We propose the In-Network Congestion-Aware load Balancer (INCAB), which achieves even load distribution among service instances and efficient use of network resources while meeting the PCC requirement. We show that INCAB can identify and monitor each instance's most-utilized resource and that it improves the load distribution among all service instances. INCAB uses a Bloom filter and an ultra-compact connection table for in-network flow distribution, and it does not rely on end hosts for traffic redirection. Our flow-level simulations show that INCAB improves average flow completion time by 31.97% compared to stateless solutions.
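The abstract does not say how the Bloom filter and the connection table interact, so the following Python sketch is only one plausible arrangement, not INCAB's actual design. It assumes that flows follow a stateless hash by default, that a new flow is pinned to a less-loaded instance only when its hash-selected instance reports higher utilization, and that the Bloom filter screens lookups into the small table of pinned flows. The names `BloomFilter`, `SketchBalancer`, `report_load`, and `route` are hypothetical.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter (one byte per bit, for readability)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)

    def _positions(self, key):
        # Derive num_hashes positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def __contains__(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


class SketchBalancer:
    """Hypothetical load-aware balancer that preserves PCC (assumption)."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.load = {name: 0.0 for name in self.instances}
        self.pinned_filter = BloomFilter()  # screens table lookups
        self.table = {}                     # only flows that left the hash path

    def _hash_choice(self, flow):
        digest = hashlib.sha256(flow.encode()).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.instances)
        return self.instances[index]

    def report_load(self, instance, utilization):
        # Utilization of the instance's most-utilized resource, in [0, 1].
        self.load[instance] = utilization

    def route(self, flow, is_first_packet):
        default = self._hash_choice(flow)
        if is_first_packet:
            best = min(self.instances, key=lambda name: self.load[name])
            # Pin only when a strictly less-loaded instance exists.
            if self.load[best] < self.load[default]:
                self.pinned_filter.add(flow)
                self.table[flow] = best
                return best
            return default
        # Later packets: consult the table only if the filter says "maybe".
        if flow in self.pinned_filter and flow in self.table:
            return self.table[flow]
        return default
```

In this sketch, only flows steered away from their hash-selected instance occupy table memory, and a Bloom filter false positive costs one extra table lookup without breaking consistency. It assumes a fixed set of instances; handling instance additions and removals would need further state.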