To facilitate a real-world, ever-evolving, and scalable autonomous driving system, we present a large-scale benchmark for standardizing the evaluation of self-supervised and semi-supervised approaches that learn from raw data; it is the first and largest such benchmark to date. Existing autonomous driving systems rely heavily on 'perfect' visual perception models (e.g., detection) trained on extensive annotated data to ensure safety. However, it is unrealistic to exhaustively label instances across all scenarios and circumstances (e.g., night, extreme weather, different cities) when deploying a robust autonomous driving system. Motivated by recent advances in self-supervised and semi-supervised learning, a promising direction is to learn a robust detection model by collaboratively exploiting large-scale unlabeled data together with a small amount of labeled data. Existing datasets (e.g., KITTI, Waymo) either provide only a small amount of data or cover limited domains with full annotation, hindering the exploration of large-scale pre-trained models. Here, we release a Large-Scale Object Detection benchmark for Autonomous driving, named SODA10M, containing 10 million unlabeled images and 20K images labeled with 6 representative object categories. To improve diversity, the images are collected at a rate of one frame every ten seconds across 32 different cities under varying weather conditions, periods, and location scenes. We provide extensive experiments and in-depth analyses of existing supervised state-of-the-art detection models and popular self-supervised and semi-supervised approaches, along with insights into how to develop future models. We show that SODA10M can serve as a promising pretraining dataset for different self-supervised learning methods, yielding superior performance when fine-tuning on autonomous driving downstream tasks. This benchmark will be used for the ICCV 2021 SSLAD challenge. 
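The self-supervised pretraining the abstract refers to typically learns representations from unlabeled images via a contrastive objective before any detection fine-tuning. As a minimal, hedged sketch (not the benchmark's evaluated methods), the widely used NT-Xent loss over two augmented views of the same batch can be written in plain NumPy; the function name and shapes here are illustrative assumptions:

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent (normalized temperature-scaled cross-entropy) loss.

    z1, z2: (n, d) embeddings of two augmented views of the same n images.
    Each row in z1 is a positive pair with the matching row in z2; all
    other rows in the 2n-sample batch act as negatives.
    """
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # L2-normalize
    n = len(z1)
    sim = z @ z.T / temperature                       # cosine similarities
    np.fill_diagonal(sim, -np.inf)                    # exclude self-pairs
    # the positive for sample i is its other view, at index (i + n) mod 2n
    targets = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    loss = -(sim[np.arange(2 * n), targets] - logsumexp)
    return loss.mean()
```

After pretraining an encoder with such an objective on the 10M unlabeled images, the learned weights would initialize a detector that is then fine-tuned on the 20K labeled images.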
The data and more up-to-date information have been released at https://soda-2d.github.io.
Single-image rain removal is a challenging ill-posed problem due to the varied shapes and densities of rain streaks. We present a novel incremental randomly wired network (IRWN) for single-image deraining. Unlike previous methods, most module structures in IRWN are generated by a stochastic network generator based on random graph theory, which eases the burden of manual design and further helps characterize more complex rain streaks. To reduce network parameters and extract finer details efficiently, an image pyramid is fused via the multi-scale network structure. An incremental rectified loss is proposed to better remove rain streaks under different rain conditions and recover the texture information of target objects. Extensive experiments on synthetic and real-world datasets demonstrate that the proposed method significantly outperforms state-of-the-art methods. In addition, an ablation study illustrates the improvements contributed by the different modules and loss terms in IRWN.
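A stochastic network generator of the kind the abstract describes samples a random directed acyclic graph whose nodes become modules and whose edges become data flow. As a minimal sketch only (an Erdős–Rényi-style sampler, not the authors' exact generator; the function name and the connectivity fallback are assumptions for illustration):

```python
import random

def random_dag(n_nodes, edge_prob=0.4, seed=0):
    """Sample a random DAG over nodes 0..n_nodes-1 as a module wiring.

    Edges only go from lower- to higher-indexed nodes, so the graph is
    acyclic by construction. Every node after the first is guaranteed at
    least one incoming edge, so each module receives input.
    """
    rng = random.Random(seed)
    edges = []
    for j in range(1, n_nodes):
        preds = [i for i in range(j) if rng.random() < edge_prob]
        if not preds:
            preds = [j - 1]  # fallback keeps the wiring connected
        edges.extend((i, j) for i in preds)
    return edges
```

Each sampled edge (i, j) would feed module i's output into module j, replacing hand-designed connectivity with a randomly wired one.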