Multi-modal pre-training models have been intensively explored to bridge vision and language in recent years. However, most of them explicitly model the cross-modal interaction between image-text pairs, by assuming that there exists strong semantic correlation between the text and image modalities. Since this strong assumption is often invalid in real-world scenarios, we choose to implicitly model the cross-modal correlation for large-scale multi-modal pre-training, which is the focus of the Chinese project 'Wen-Lan' led by our team. Specifically, with the weak correlation assumption over image-text pairs, we propose a two-tower pre-training model called BriVL within the cross-modal contrastive learning framework. Unlike OpenAI CLIP, which adopts a simple contrastive learning method, we devise a more advanced algorithm by adapting the latest method MoCo into the cross-modal scenario. By building a large queue-based dictionary, our BriVL can incorporate more negative samples with limited GPU resources. We further construct a large Chinese multi-source image-text dataset called RUC-CAS-WenLan for pre-training our BriVL model. Extensive experiments demonstrate that the pre-trained BriVL model outperforms both UNITER and OpenAI CLIP on various downstream tasks.
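The queue-based contrastive objective mentioned in this abstract can be illustrated with a small sketch. This is not the BriVL implementation: it is a generic InfoNCE loss in which one image embedding is scored against its paired text embedding and against negatives drawn from a fixed-size queue. The function names, the toy 2-D embeddings, the queue size of 4, and the temperature of 0.07 are all assumptions made for illustration, and the momentum-updated encoder that MoCo uses to fill the queue is omitted.

```python
import math
from collections import deque


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def info_nce_with_queue(query, positive_key, negative_queue, temperature=0.07):
    """InfoNCE loss for one query: the positive key competes with queued negatives.

    The temperature value is an assumption, not a setting taken from the abstract.
    """
    q = l2_normalize(query)
    # Index 0 holds the positive logit; the rest come from the negative queue.
    logits = [dot(q, l2_normalize(positive_key)) / temperature]
    logits += [dot(q, l2_normalize(k)) / temperature for k in negative_queue]
    # Numerically stable log-sum-exp over all logits.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]


# A fixed-size queue: appending beyond maxlen silently drops the oldest key,
# which is how a queue-based dictionary keeps many negatives at bounded memory cost.
queue = deque(maxlen=4)
for key in ([1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0], [0.6, 0.8]):
    queue.append(key)

image_emb = [0.9, 0.1]          # hypothetical image embedding
matched_text = [1.0, 0.0]       # hypothetical embedding of the paired text
mismatched_text = [-1.0, 0.0]   # hypothetical embedding of an unrelated text

loss_matched = info_nce_with_queue(image_emb, matched_text, queue)
loss_mismatched = info_nce_with_queue(image_emb, mismatched_text, queue)
```

In this toy setup the loss is small when the paired text is close to the image embedding and large when it is not, and the number of negatives is set by the queue length rather than by the batch size.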
Background This study was intended to review our management strategy for sinonasal squamous cell carcinomas (SNSCCs) with orbital invasion and to explore the role of radiotherapy in orbital preservation. Methods We retrospectively analyzed 93 SNSCC patients with orbital invasion who underwent radiotherapy with or without surgery over the past 15 years. The degree of orbital invasion was classified into 3 grades. Results Eighty‐eight patients presented with T4 tumors and 36 had grade III orbital invasion. Seventy‐two patients received surgery plus radiation and 67 received platinum‐based chemotherapy. The median follow‐up for surviving patients was 60 months. Five‐year overall survival (OS) for the whole group was 57.4%. The patients treated with surgery plus radiation had a 5‐year survival rate of 62.2% and orbital preservation was feasible in 90.3% of cases. Twenty‐one patients with SNSCCs that extended into the extraocular muscles or eye globe also underwent orbital preservation. Five‐year locoregional relapse‐free survival (LRFS) was 69.5% for patients treated with orbital preservation and 57.1% for those treated with orbital exenteration, indicating no statistical difference. Five‐year survival, 5‐year progression‐free survival (PFS), and 5‐year distant metastasis‐free survival (DMFS) were similar between groups. Grade III orbital invasion was independently associated with shorter OS, LRFS, PFS, and DMFS. Conclusion Orbital invasion in grade III was associated with the worst survival outcomes. Invasion of either the extraocular muscles or eye globe is not a contraindication for eye‐sparing surgery. Preoperative chemoradiation continues to offer hope to patients with a strong desire to preserve their eyes.
Background There is a scarcity of data about the prognostic value of orbital invasion in esthesioneuroblastoma (ENB), as well as about its management strategies. Indications for the preservation of orbital contents remain controversial, and the evaluation of orbital invasion has been ill defined. Methods This retrospective analysis included 60 ENB patients with orbital invasion who underwent radiotherapy with or without surgery over the past 14 years. Orbital invasion was classified into three grades. Results There were 52 patients at stage C and 8 at stage D, according to Foote classifications. Grade I, grade II and grade III orbital invasion was detected in 12, 23, and 25 patients, respectively. The median follow-up was 57 months (IQR 32–95 months). Fourteen patients received radical radiotherapy, with a 5-year overall survival (OS) of 63.5%; 46 received surgery plus radiation, with a 5-year OS of 70.7%; the difference was not statistically significant (p = 0.847). Orbital preservation was feasible in 100% of cases, including 18 cases that extended to extraocular muscles or the eye globe. Five-year locoregional relapse-free survival was 100% in patients with prophylactic elective neck irradiation (PENI) and 58.1% in patients without PENI (p = 0.004). Univariate analysis showed that grade II/III orbital invasion was associated with poorer OS and progression-free survival. Neck metastasis (with a Foote stage of D) was independently associated with shorter OS and distant metastasis–free survival in multivariate analysis. Conclusions Our data suggested that primary radiotherapy achieved comparable survival to surgery plus radiotherapy in advanced ENB. Invasion of either the extraocular muscles or the eye globe is not a contraindication for eye-sparing surgery. Orbital invasion in grade II/III was significantly associated with adverse survival outcomes. Prophylactic radiotherapy to the N0 neck significantly reduces the risk of regional recurrence.
Automatic emotion recognition is an active research topic with a wide range of applications. Due to the high manual annotation cost and inevitable label ambiguity, the development of emotion recognition datasets is limited in both scale and quality. Therefore, one of the key challenges is how to build effective models with limited data resources. Previous works have explored different approaches to tackle this challenge, including data enhancement, transfer learning, and semi-supervised learning. However, the weaknesses of these existing approaches include training instability, large performance loss during transfer, and marginal improvement. In this work, we propose a novel semi-supervised multi-modal emotion recognition model based on cross-modality distribution matching, which leverages abundant unlabeled data to enhance model training under the assumption that the inner emotional status is consistent at the utterance level across modalities. We conduct extensive experiments to evaluate the proposed model on two benchmark datasets, IEMOCAP and MELD. The experimental results show that the proposed semi-supervised learning model can effectively utilize unlabeled data and combine multiple modalities to boost emotion recognition performance, outperforming other state-of-the-art approaches under the same conditions. The proposed model also achieves competitive performance compared with existing approaches that take advantage of additional auxiliary information such as speaker and interaction context. CCS Concepts: • Computing methodologies → Semi-supervised learning settings; Semantic networks; • Human-centered computing → HCI design and evaluation methods.
Hypopharyngeal squamous-cell carcinoma (HSCC) is a relatively rare head and neck cancer, with great variation in patient outcomes. This study aimed to develop a prognostic nomogram for patients with HSCC. From the Surveillance, Epidemiology, and End Results (SEER) database, we retrieved the clinical data of 2198 patients diagnosed with HSCC between 2010 and 2016. The patients were randomly assigned at a 4:1 ratio to the training set or the validation set. An external validation was performed by a set of 233 patients with locally advanced HSCC treated at our center. A Cox proportional hazards regression model was used to assess the relationship between each variable and overall survival (OS). Cox multivariate regression analysis was performed, and the results were used to develop a prognostic nomogram. The calibration curve and concordance index (C-index) were used to evaluate the accuracy of the prognostic nomogram. With a median overall follow-up time of 41 months (interquartile range: 20 to 61), the median OS for the entire cohort of SEER database was 24 months. The 3-year and 5-year OS rates were 41.3% and 32.5%, respectively. The Cox multivariate regression analysis of the training set showed that age, marital status, race, T stage, N stage, M stage, TNM stage, local treatment, and chemotherapy were correlated with OS. The nomogram showed a superior C-index over TNM stage (training set: 0.718 vs 0.627; validation set: 0.708 vs 0.598; external validation set: 0.709 vs 0.597), and the calibration curve showed a high level of concordance between the predicted OS and the actual OS. The nomogram provides a relatively accurate and applicable prediction of the survival outcome of patients with HSCC.
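The concordance index used above to compare the nomogram with TNM stage can be illustrated with a short sketch. This is a simplified version of Harrell's C-index, not the authors' analysis code: a pair of patients is comparable when the one with the shorter follow-up time had an observed event, and it is concordant when that patient also has the higher predicted risk. The function name and the four-patient toy data are hypothetical, and pairs with tied follow-up times are simply skipped here.

```python
def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C-index.

    times:       follow-up time for each patient
    events:      1 if death was observed, 0 if the patient was censored
    risk_scores: model output, where a higher score means a worse predicted prognosis
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Patient i must have an observed event strictly before time j;
            # otherwise the ordering of the two survival times is unknown.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: months of follow-up and event indicators.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]

c_perfect = concordance_index(times, events, [0.9, 0.7, 0.4, 0.2])
c_reversed = concordance_index(times, events, [0.2, 0.4, 0.7, 0.9])
c_constant = concordance_index(times, events, [0.5, 0.5, 0.5, 0.5])
```

A value of 1.0 means the risk ordering matches the survival ordering on every comparable pair, 0.5 is what a constant or random predictor yields, and values such as 0.718 versus 0.627 are read on that scale.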