A Deep-learning Approach for Identifying Neuropeptides Based on Contrastive Learning
Introduction
Neuropeptides (NPs) are signaling molecules that regulate physiological functions and behaviors in the nervous system. They play essential roles in processes including neurotransmission, immune response, appetite control, and mood regulation. Identifying neuropeptides therefore provides important information for the early detection, precise treatment, and personalized management of associated diseases. Here we introduce NeuroCL, a deep learning architecture that identifies NPs by combining cross-attention with contrastive learning. NeuroCL integrates the protein language model ESM-2 with classical feature encoding methods, capturing both broad sequence context and fine-grained sequence detail in a multifaceted feature representation. Cross-attention links the language-model features to the handcrafted features to produce a joint representation. Self-attention enhances the model's ability to identify critical information within sequences. Contrastive learning improves separation between classes and consistency within each class, boosting classification accuracy and robustness on complex data. NeuroCL outperforms current state-of-the-art methods for NP identification across five evaluation metrics.
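The two mechanisms named above, cross-attention between language-model and handcrafted features and a contrastive objective, can be illustrated with a minimal NumPy sketch. It is not the NeuroCL implementation. The abstract does not specify the layer sizes, the number of heads, which feature set serves as the query, or the form of the contrastive loss, so the sketch assumes single-head scaled dot-product attention with language-model features as queries, a supervised contrastive loss with temperature 0.1, and random arrays standing in for ESM-2 embeddings and handcrafted encodings. All function and variable names are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, context, w_q, w_k, w_v):
    """Single-head scaled dot-product cross-attention (illustrative).

    queries: (L, d_q) array, e.g. per-residue language-model features
    context: (M, d_c) array, e.g. handcrafted feature vectors
    Returns the attended output (L, d) and the attention weights (L, M).
    """
    q = queries @ w_q
    k = context @ w_k
    v = context @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ v, weights


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss over one batch (assumed form).

    Samples sharing a label are pulled together and samples with
    different labels are pushed apart. Anchors with no same-label
    partner in the batch are skipped; at least one anchor must have one.
    """
    labels = np.asarray(labels)
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    self_mask = np.eye(len(labels), dtype=bool)
    sim = np.where(self_mask, -np.inf, sim)  # exclude self-pairs
    sim_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - sim_max - np.log(
        np.exp(sim - sim_max).sum(axis=1, keepdims=True)
    )
    pos_mask = (labels[:, None] == labels[None, :]) & ~self_mask
    pos_counts = pos_mask.sum(axis=1)
    valid = pos_counts > 0
    pos_log_prob = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    return float(np.mean(-pos_log_prob[valid] / pos_counts[valid]))


# Stand-in data: random arrays, not real ESM-2 or handcrafted features.
rng = np.random.default_rng(0)
lm_feats = rng.normal(size=(6, 16))   # 6 residues, 16-dim embeddings
hc_feats = rng.normal(size=(4, 10))   # 4 handcrafted vectors, 10-dim
w_q = rng.normal(size=(16, 8))
w_k = rng.normal(size=(10, 8))
w_v = rng.normal(size=(10, 8))
fused, weights = cross_attention(lm_feats, hc_feats, w_q, w_k, w_v)

# Toy embeddings forming two tight clusters.
emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
loss_separated = supervised_contrastive_loss(emb, [1, 1, 0, 0])
loss_mixed = supervised_contrastive_loss(emb, [1, 0, 1, 0])
```

In this toy setting the loss is lower when the labels agree with the clusters than when they cut across them, which is the sense in which a contrastive objective rewards class separation and within-class consistency. A trainable version would replace the fixed random projection matrices with learned parameters in a deep learning framework.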