Information exchange systems differ in many ways, but all share a common vulnerability to selfish behavior and free-riding. In this paper, we build incentive schemes based on social norms. A social norm prescribes a social strategy for the users in the system to follow and deploys a reputation scheme to reward or penalize users depending on their behavior. Because users in these systems often have only a limited ability to observe global system information, such as the reputation distribution of the participating users, their beliefs about the reputation distribution are heterogeneous and biased. Such belief heterogeneity causes a positive fraction of users to deviate from the social strategy. In such practical scenarios, the standard equilibrium analysis deployed in the economics literature is no longer directly applicable, and the system design must account for these differences. To investigate how the system design needs to change when the participating users have only limited observations, we focus on a simple social norm with binary reputation labels but allow the punishment severity to be adjusted through randomization. First, we model the belief heterogeneity using a suitable Bayesian belief function. Next, we formalize the users' optimal decision problems and derive the conditions under which they follow the prescribed social strategy. With this result, we then study the system dynamics and formally define an equilibrium in which the system is stable when users strategically optimize their decisions. By rigorously studying two specific cases, in which the users' belief distribution is either constant or linearly influenced by the true reputation distribution, we prove that the optimal reputation update rule is to choose the mildest possible punishment. Simulations further confirm this result for higher-order beliefs.
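
To make the idea of a binary reputation label with randomized punishment concrete, the following is a minimal sketch and not the update rule analyzed in the paper. It assumes, purely for illustration, that a user who complies with the social strategy is labeled good, while a user who deviates is labeled bad only with some probability. The names `update_reputation`, `good_fraction`, `punish_prob`, and `deviate_frac` are hypothetical; a smaller `punish_prob` corresponds to a milder punishment.

```python
import random

GOOD, BAD = 1, 0  # binary reputation labels


def update_reputation(complied, punish_prob, rng):
    """Return the next binary reputation label for one user.

    Assumption (illustrative only): a compliant user is labeled GOOD;
    a deviating user is labeled BAD with probability punish_prob and
    GOOD otherwise. punish_prob plays the role of punishment severity.
    """
    if not 0.0 <= punish_prob <= 1.0:
        raise ValueError("punish_prob must be in [0, 1]")
    if complied:
        return GOOD
    return BAD if rng.random() < punish_prob else GOOD


def good_fraction(n_users, deviate_frac, punish_prob, seed=0):
    """Fraction of users labeled GOOD after one update round.

    The first int(n_users * deviate_frac) users are assumed to deviate;
    all remaining users comply.
    """
    rng = random.Random(seed)
    n_deviators = int(n_users * deviate_frac)
    labels = [
        update_reputation(i >= n_deviators, punish_prob, rng)
        for i in range(n_users)
    ]
    return sum(labels) / n_users


if __name__ == "__main__":
    # Sweep the punishment severity for a fixed share of deviating users.
    for p in (0.0, 0.5, 1.0):
        print(p, good_fraction(1000, 0.2, p))
```

In this toy setting, the resulting share of good labels is the true reputation distribution that users with limited observations would only perceive through their heterogeneous beliefs; the sketch does not model those beliefs or the users' strategic decisions.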