Chip-based photonic systems have undergone substantial progress over the last decade. However, the realization of photonic devices still depends largely on intuition-based trial-and-error methods, with limited focus on characteristic analysis. In this work, we present an in-depth investigation of photonic power splitters by considering the transmission properties of 16,000 unique ultra-compact silicon-based structures engraved with SiO₂, Al₂O₃, and Si₃N₄ nanoholes. The characterization has been performed using finite-difference time-domain (FDTD) simulations for each dielectric material and for both TE and TM polarizations at the fundamental modes over a wideband optical communication spectrum ranging from 1.45 to 1.65 µm. The corresponding transmission, splitting ratio, and reflection loss were calculated, generating a dataset that can be used for both forward and inverse modeling with Machine Learning (ML) and Deep Learning (DL) algorithms. With an optimized hole radius of 35 nm, the proposed device footprint of 2 µm × 2 µm is among the smallest reported to date with the best transmission. Si₃N₄ holes show excellent performance, offering 90% transmittance in 96% of the data while exhibiting maximum fabrication tolerance. Forward modeling to predict the transmission properties was performed using both a Linear Model (LM) and an Artificial Neural Network (ANN), where the LM showed marginally better accuracy than the ANN in predicting the transmittance. These observations will aid in achieving robust, optimized optical power splitters with a wide range of splitting ratios in less time.

INDEX TERMS Forward modeling, machine learning, optical power splitter, artificial neural network.
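The abstract above compares a Linear Model with an ANN for forward modeling of transmittance. Since the paper's code and feature encoding are not given here, the following is a minimal Python sketch under assumed conditions: the hole pattern is flattened into a binary feature vector (grid size, dataset, and target values below are illustrative stand-ins, not the paper's FDTD data).

```python
# Hedged sketch: feature layout, grid size, and targets are assumptions,
# not the paper's actual dataset or models.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)

# Stand-in for the FDTD dataset: 16,000 structures, each a hypothetical
# 20x20 grid of 35 nm holes flattened to 400 binary features, mapped to
# a synthetic transmittance target.
X = rng.integers(0, 2, size=(16000, 400)).astype(float)
y = 0.9 - 0.2 * X.mean(axis=1) + rng.normal(0, 0.01, size=16000)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Forward models: a plain linear regression vs. a small fully connected ANN.
lm = LinearRegression().fit(X_tr, y_tr)
ann = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500,
                   early_stopping=True, random_state=0).fit(X_tr, y_tr)

for name, model in [("LM", lm), ("ANN", ann)]:
    mse = mean_squared_error(y_te, model.predict(X_te))
    print(f"{name} test MSE: {mse:.5f}")
```

On data of this kind, comparing the two test errors is how one would observe the abstract's finding that the linear model can be marginally more accurate than the ANN.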
As there is a scarcity of large representative corpora for most languages, it is important for Multilingual Language Models (MLLMs) to extract the most from existing corpora. In this regard, script diversity presents a challenge to MLLMs by reducing lexical overlap among closely related languages. Therefore, transliterating closely related languages that use different writing scripts to a common script may improve the downstream task performance of MLLMs. In this paper, we pretrain two ALBERT models to empirically measure the effect of transliteration on MLLMs. We specifically focus on the Indo-Aryan language family, which has the highest script diversity in the world. Afterward, we evaluate our models on the IndicGLUE benchmark. We perform the Mann-Whitney U test to rigorously verify whether the effect of transliteration is significant. We find that transliteration benefits the low-resource languages without negatively affecting the comparatively high-resource languages. We also measure the cross-lingual representation similarity (CLRS) of the models using centered kernel alignment (CKA) on parallel sentences of eight languages from the FLORES-101 dataset. We find that the hidden representations of the transliteration-based model have higher and more stable CLRS scores. Our code and models are publicly available on GitHub and the Hugging Face Hub.
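To make the evaluation methodology above concrete, here is a minimal Python sketch of linear CKA and the Mann-Whitney U test as they might be applied; the representation matrices and score lists are random, illustrative stand-ins, not the paper's hidden states or benchmark results.

```python
# Hedged sketch: linear centered kernel alignment (CKA) between two
# representation matrices, plus a Mann-Whitney U test on score lists.
# All inputs are synthetic stand-ins.
import numpy as np
from scipy.stats import mannwhitneyu

def linear_cka(X: np.ndarray, Y: np.ndarray) -> float:
    """Linear CKA between two (n_samples, dim) representation matrices."""
    X = X - X.mean(axis=0)  # center each feature dimension
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2
    return hsic / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro"))

rng = np.random.default_rng(0)
# Stand-ins for hidden states of parallel sentences in two languages
# (e.g., a FLORES-101 split, hidden dimension 768).
reps_lang_a = rng.normal(size=(997, 768))
reps_lang_b = reps_lang_a @ rng.normal(size=(768, 768)) * 0.1 \
              + rng.normal(size=(997, 768))
print(f"CKA: {linear_cka(reps_lang_a, reps_lang_b):.3f}")

# Mann-Whitney U test on hypothetical per-task scores of the two models.
scores_translit = [0.71, 0.74, 0.69, 0.73]
scores_baseline = [0.66, 0.70, 0.65, 0.68]
stat, p = mannwhitneyu(scores_translit, scores_baseline, alternative="greater")
print(f"U = {stat}, p = {p:.3f}")
```

A CKA score near 1 indicates highly similar representations up to linear transformation, which is how "higher and more stable CLRS scores" across language pairs would be quantified.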