Are There Sex Differences in Geographic Knowledge and Understanding?

Traditional software engineering programming paradigms are mostly object or procedure oriented, driven by deterministic algorithms. With the advent of deep learning and cognitive sciences, there is an emerging trend toward data-driven programming, creating a shift in the programming paradigm among the software engineering communities. Visualizing and interpreting the execution of current large-scale data-driven software is challenging. Further, for deep learning development there are many libraries in multiple programming languages, such as TensorFlow (Python), Caffe (C++), Theano (Python), Torch (Lua), and Deeplearning4j (Java), driving a huge need for interoperability across libraries. We propose a solution framework based on model-driven development that facilitates intuitive design of deep learning models in a platform-agnostic fashion. This framework could potentially generate library-specific code, perform program translation across languages, and debug the training process of a deep learning model from a fault localization and repair perspective. Further, we identify open research problems in this emerging domain and discuss some new software tooling requirements to serve this new-age data-driven programming paradigm.


Generate Your Counterfactuals: Towards Controlled Counterfactual Generation for Text

Madaan, Padhi, Panwar et al. 2020 (Preprint)


Machine learning has seen tremendous growth recently, which has led to wider adoption of ML systems for educational assessments, credit risk, healthcare, employment, and criminal justice, to name a few. Trustworthiness of ML and NLP systems is a crucial aspect and requires a guarantee that the decisions they make are fair and robust. Aligned with this, we propose a framework, GYC, to generate a set of counterfactual text samples, which are crucial for testing these ML systems. Our main contributions are: a) we introduce GYC, a framework to generate counterfactual samples such that the generation is plausible, diverse, goal-oriented, and effective; b) we generate counterfactual samples that can direct the generation towards a corresponding condition such as a named-entity tag, semantic role label, or sentiment. Our experimental results on various domains show that GYC generates counterfactual text samples exhibiting the above four properties. GYC generates counterfactuals that can act as test cases to evaluate a model and any text debiasing algorithm.


Sanskrit Sandhi Splitting using seq2(seq)^2

Aralikatte, Gantayat, Panwar et al. 2018


In Sanskrit, small words (morphemes) are combined to form compound words through a process known as Sandhi. Sandhi splitting is the process of splitting a given compound word into its constituent morphemes. Although rules governing word splitting exist in the language, it is highly challenging to identify the location of the splits in a compound word. Though existing Sandhi splitting systems incorporate these pre-defined splitting rules, they have low accuracy, as the same compound word might be broken down in multiple ways to provide syntactically correct splits. In this research, we propose a novel deep learning architecture called Double Decoder RNN (DD-RNN), which (i) predicts the location of the split(s) with 95% accuracy, and (ii) predicts the constituent words (learning the Sandhi splitting rules) with 79.5% accuracy, outperforming the state of the art by 20%. Additionally, we show the generalization capability of our deep learning model by showing competitive results on the problem of Chinese word segmentation.
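
The ambiguity this abstract describes can be illustrated with a toy rule enumerator. This is a sketch only and is not the DD-RNN: the rule table is a small hand-picked subset of vowel Sandhi rules (in IAST transliteration), and the names `REVERSE_RULES` and `candidate_splits` are hypothetical. It shows that even a single split location can yield several rule-consistent candidates, which is why choosing the right one is hard for purely rule-based systems.

```python
# Toy reverse vowel-Sandhi table (assumption: a tiny illustrative subset).
# Each surface vowel maps to the (left-final, right-initial) vowel pairs
# that can merge into it.
REVERSE_RULES = {
    "ā": [("a", "a"), ("a", "ā"), ("ā", "a"), ("ā", "ā")],
    "e": [("a", "i"), ("a", "ī"), ("ā", "i"), ("ā", "ī")],
}


def candidate_splits(word):
    """Return every (position, left, right) split allowed by REVERSE_RULES."""
    candidates = []
    # Skip the first and last characters so both parts keep some context.
    for pos in range(1, len(word) - 1):
        for left_end, right_start in REVERSE_RULES.get(word[pos], []):
            left = word[:pos] + left_end
            right = right_start + word[pos + 1:]
            candidates.append((pos, left, right))
    return candidates


# "vidyālaya" is formed from "vidyā" + "ālaya"; the rules alone cannot
# tell that split apart from the other candidates at the same position.
splits = candidate_splits("vidyālaya")
for pos, left, right in splits:
    print(pos, left, "+", right)
```

A learned model, as proposed in the abstract, would have to rank such candidates; the enumerator above only lists them.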


Generate Your Counterfactuals: Towards Controlled Counterfactual Generation for Text

Madaan, Padhi, Panwar et al. 2021 (AAAI)


Machine learning has seen tremendous growth recently, which has led to wider adoption of ML systems for educational assessments, credit risk, healthcare, employment, and criminal justice, to name a few. The trustworthiness of ML and NLP systems is a crucial aspect and requires a guarantee that the decisions they make are fair and robust. Aligned with this, we propose a novel framework, GYC, to generate an exhaustive set of counterfactual text samples, which are crucial for testing these ML systems. Our main contributions are: a) we introduce GYC, a framework to generate counterfactual samples such that the generation is plausible, diverse, goal-oriented, and effective; b) we generate counterfactual samples that can direct the generation towards a corresponding condition such as a named-entity tag, semantic role label, or sentiment. Our experimental results on various domains show that GYC generates counterfactual text samples exhibiting the above four properties. GYC generates counterfactuals that can act as test cases to evaluate a model and any text debiasing algorithm.


Sanskrit Sandhi Splitting using seq2(seq)^2

Aralikatte, Gantayat, Panwar et al. 2018 (Preprint)


DLPaper2Code: Auto-Generation of Code From Deep Learning Research Papers

Sethi, Sankaran, Panwar et al. 2018 (AAAI)


With an abundance of research papers in deep learning, reproducibility or adoption of the existing works becomes a challenge. This is due to the lack of open source implementations provided by the authors. Even if the source code is available, re-implementing research papers in a different library is a daunting task. To address these challenges, we propose a novel extensible approach, DLPaper2Code, to extract and understand deep learning design flow diagrams and tables available in a research paper and convert them to an abstract computational graph. The extracted computational graph is then converted into execution-ready source code in both Keras and Caffe, in real time. An arXiv-like website is created where the automatically generated designs are made publicly available for 5,000 research papers. The generated designs can be rated and edited using an intuitive drag-and-drop UI framework in a crowdsourced manner. To evaluate our approach, we create a simulated dataset with over 216,000 valid deep learning design flow diagrams using a manually defined grammar. Experiments on the simulated dataset show that the proposed framework provides more than 93% accuracy in flow diagram content extraction.
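
The step from an abstract computational graph to library-specific source code can be sketched as below. This is a minimal illustration under stated assumptions, not the DLPaper2Code implementation: the `Layer` schema, the `KERAS_TEMPLATES` table and the `to_keras_source` function are hypothetical, and only a linear chain of three layer kinds is handled. The emitted text uses standard Keras constructors (`keras.Sequential`, `keras.layers.Conv2D`, `keras.layers.Flatten`, `keras.layers.Dense`).

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """One node of a library-agnostic model description (hypothetical schema)."""
    kind: str                                   # e.g. "conv2d", "flatten", "dense"
    params: dict = field(default_factory=dict)  # layer hyperparameters


# Hypothetical mapping from abstract layer kinds to Keras constructor templates.
KERAS_TEMPLATES = {
    "conv2d": "keras.layers.Conv2D({filters}, {kernel_size}, activation={activation!r})",
    "flatten": "keras.layers.Flatten()",
    "dense": "keras.layers.Dense({units}, activation={activation!r})",
}


def to_keras_source(layers):
    """Emit Keras source text for a linear chain of abstract layers."""
    lines = ["import keras", "", "model = keras.Sequential(["]
    for layer in layers:
        template = KERAS_TEMPLATES[layer.kind]  # KeyError for unsupported kinds
        lines.append("    " + template.format(**layer.params) + ",")
    lines.append("])")
    return "\n".join(lines)


# A small abstract graph, as might be recovered from a paper's flow diagram.
graph = [
    Layer("conv2d", {"filters": 32, "kernel_size": 3, "activation": "relu"}),
    Layer("flatten"),
    Layer("dense", {"units": 10, "activation": "softmax"}),
]
print(to_keras_source(graph))
```

Keeping the graph separate from the templates is the design point: a second template table for another library could reuse the same `Layer` list unchanged.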



Naveen Panwar

Data Quality for Machine Learning Tasks

A visual programming paradigm for abstract deep learning model development

DARVIZ: Deep Abstract Representation, Visualization, and Verification of Deep Learning Models

Generate Your Counterfactuals: Towards Controlled Counterfactual Generation for Text

Sanskrit Sandhi Splitting using seq2(seq)^2

Generate Your Counterfactuals: Towards Controlled Counterfactual Generation for Text

Sanskrit Sandhi Splitting using seq2(seq)^2

DLPaper2Code: Auto-Generation of Code From Deep Learning Research Papers
