On the Effectiveness of Self-supervised Pre-training for Modeling User Behavior Sequences
Modeling the temporal dependencies in users' historical behavior is crucial for improving conversion prediction in mobile game advertising. One common approach is to encode the time-dependent behavior sequence into meaningful representations that enhance the expressiveness of the conversion prediction model. In this work, we propose a self-supervised learning (SSL) scheme for pre-training such representations with a sequential network. An SSL pretext task is introduced to model the correlation between past and future events without requiring labels. The pre-trained sequential network can then be transferred to the downstream task, i.e., conversion prediction, along with a dense network that models the feature interactions between the target ads and their context. We assess the proposed models on a real-world dataset collected from our online advertising system. In the experiments, we observe that the models with the proposed pre-training scheme (1) achieve lower test log-losses and higher AUC values, and (2) require fewer labels to reach comparable prediction accuracy than those without it, in scenarios where the models have access to either limited or full labels. The proposed pre-training scheme thus improves the downstream models' generalization ability and label efficiency, facilitating large-scale deployment of the sequential model in the online advertising system.
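The two-stage scheme described above can be sketched in miniature. The snippet below is an illustrative toy, not the paper's actual architecture: the sequential network is stood in for by a simple event-embedding table, the pretext task is next-event prediction trained by SGD on unlabeled sequences, and the downstream conversion predictor is a logistic head over the frozen pre-trained representations. All names, dimensions, and the synthetic data are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM = 20, 8  # number of event types, embedding size (illustrative)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Stand-in "sequential network": an embedding table plus a pretext head.
E = rng.normal(scale=0.1, size=(VOCAB, DIM))  # event embeddings
W = rng.normal(scale=0.1, size=(DIM, VOCAB))  # pretext output head

def pretext_step(prev, nxt, lr=0.5):
    """One SGD step on the next-event cross-entropy (no labels needed)."""
    global E, W
    h = E[prev]
    p = softmax(h @ W)
    g = p.copy()
    g[nxt] -= 1.0            # d(cross-entropy)/d(logits)
    gh = W @ g               # gradient w.r.t. the representation h
    W -= lr * np.outer(h, g)
    E[prev] -= lr * gh

# Stage 1: pre-train on unlabeled behavior, where (in this toy) event
# t+1 deterministically follows event t.
for _ in range(2000):
    a = int(rng.integers(0, VOCAB))
    pretext_step(a, (a + 1) % VOCAB)

# After pre-training, the pretext model should beat the uniform
# baseline probability 1/VOCAB = 0.05 on the next-event task.
acc = float(np.mean([softmax(E[a] @ W)[(a + 1) % VOCAB]
                     for a in range(VOCAB)]))

# Stage 2: transfer the frozen representations to conversion prediction
# with a logistic head (a stand-in for the paper's dense network).
w_d, b_d = np.zeros(DIM), 0.0

def downstream_step(seq, y, lr=0.2):
    """One SGD step of logistic regression on pooled pre-trained embeddings."""
    global w_d, b_d
    h = E[seq].mean(axis=0)  # encoder output, kept frozen
    p = 1.0 / (1.0 + np.exp(-(h @ w_d + b_d)))
    w_d -= lr * (p - y) * h
    b_d -= lr * (p - y)

# Toy labels: a sequence "converts" if it contains event 0.
for _ in range(200):
    seq = rng.integers(0, VOCAB, size=4)
    downstream_step(seq, float(0 in seq))

h = E[np.array([0, 1, 2, 3])].mean(axis=0)
p_conv = float(1.0 / (1.0 + np.exp(-(h @ w_d + b_d))))  # predicted conversion probability
```

In this sketch the pretext step updates both the embeddings and the pretext head, so the representations absorb the past-to-future correlation; the downstream head then reuses them without further encoder training, mirroring the label-efficiency argument of the abstract.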