As researchers embrace open and transparent data sharing, they will need to provide information about their data that effectively helps others understand their data sets’ contents. Without proper documentation, data stored in online repositories such as OSF will often be rendered unfindable and unreadable by other researchers and indexing search engines. Data dictionaries and codebooks provide a wealth of information about variables, data collection, and other important facets of a data set. This information, called metadata, provides key insights into how the data might be further used in research and facilitates search-engine indexing to reach a broader audience of interested parties. This Tutorial first explains terminology and standards relevant to data dictionaries and codebooks. Accompanying information on OSF presents a guided workflow of the entire process from source data (e.g., survey answers on Qualtrics) to an openly shared data set accompanied by a data dictionary or codebook that follows an agreed-upon standard. Finally, we discuss freely available Web applications to assist this process of ensuring that psychology data are findable, accessible, interoperable, and reusable.
Semantic spaces are used as a representation of language, capturing the relations in meaning between linguistic units. These spaces are often built from large corpora, requiring advanced equipment, specialized computational skills, and considerable effort. This project note introduces and demonstrates an accessible Shiny graphical interface that allows users to create semantic space models easily. Shiny is an R package for programming interactive web applications that let others interact with data or analyses. The advantage of Shiny applications is that naïve users can explore data without understanding the programming, and openly sharing the code with the application can help users learn the programming for their own research. Within the application, users will be able to load popular semantic spaces or their own corpus for semantic space creation using their preferred modeling technique, including LSA and TOPICS. A variety of user-friendly graphical tools, such as n-nearest neighbors or topic-weighted graphs, will further aid visualization of the semantic network. Additionally, the application calculates cosine and simple co-occurrence, among other popular relatedness values. This tool is intended for researchers who may not be programming-savvy and as a teaching extension for psycholinguistics courses.
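To make the cosine and n-nearest-neighbor ideas concrete, the following is a minimal Python sketch, not the application's own code (the application is written in R with Shiny). The toy three-dimensional vectors and the function names `cosine` and `nearest_neighbors` are assumptions made only for illustration; a real semantic space would have far more words and dimensions.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Convention chosen for this sketch: a zero vector is unrelated to everything.
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_neighbors(space, word, n=2):
    """Return the n words whose vectors have the highest cosine with `word`."""
    target = space[word]
    scores = [
        (other, cosine(target, vec))
        for other, vec in space.items()
        if other != word
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:n]


# Hypothetical toy space: each word maps to a made-up 3-dimensional vector.
toy_space = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.9, 0.4],
    "truck": [0.1, 0.8, 0.5],
}

if __name__ == "__main__":
    for neighbor, score in nearest_neighbors(toy_space, "cat", n=2):
        print(f"{neighbor}: {score:.3f}")
```

In this toy space, words with similar vectors (here, the two animals) receive a cosine close to 1, while words with dissimilar vectors receive a cosine close to 0.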
As researchers embrace open and transparent data sharing, they will need to provide information about their data that effectively helps others understand their contents. Without proper documentation, data stored in online repositories such as OSF will often be rendered unfindable and unreadable by other researchers and indexing search engines. Data dictionaries and codebooks provide a wealth of information about variables, data collection, and other important facets of a dataset. This information, called metadata, provides key insights into how the data might be further used in research and facilitates search engine indexing to reach a broader audience of interested parties. This tutorial first explains the terminology and standards surrounding data dictionaries and codebooks. We then present a guided workflow of the entire process from source data (e.g., survey answers on Qualtrics) to an openly shared dataset accompanied by a data dictionary or codebook that follows an agreed-upon standard. Finally, we explain how to use freely available web applications to assist this process of ensuring that psychology data are findable, accessible, interoperable, and reusable (FAIR; Wilkinson et al., 2016).
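As a rough illustration of what a data dictionary records, the sketch below builds one entry per variable from a small CSV file using only the Python standard library. It is not the workflow or the web applications described in the tutorial, and it does not follow any particular metadata standard; the function names, the field names (`variable`, `description`, `type`, `n_missing`, `values`), and the sample data are all assumptions made for this example.

```python
import csv
import io
import json


def infer_type(values):
    """Label a column 'numeric' if every non-missing value parses as a number."""
    if not values:
        return "unknown"
    try:
        for value in values:
            float(value)
    except ValueError:
        return "text"
    return "numeric"


def build_data_dictionary(csv_text, descriptions):
    """Return one dictionary entry per column of the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    entries = []
    for name in reader.fieldnames:
        present = [row[name] for row in rows if row[name] != ""]
        kind = infer_type(present)
        entry = {
            "variable": name,
            # Descriptions must be supplied by the researcher; they cannot be inferred.
            "description": descriptions.get(name, ""),
            "type": kind,
            "n_missing": len(rows) - len(present),
        }
        if kind == "text":
            # For text variables, list the distinct observed values.
            entry["values"] = sorted(set(present))
        entries.append(entry)
    return entries


# Hypothetical survey export with one missing age.
sample_csv = """participant_id,age,condition
1,23,control
2,,treatment
3,31,control
"""

sample_descriptions = {
    "participant_id": "Anonymous participant identifier",
    "age": "Age in years at time of survey",
    "condition": "Experimental condition assigned",
}

if __name__ == "__main__":
    dictionary = build_data_dictionary(sample_csv, sample_descriptions)
    print(json.dumps(dictionary, indent=2))
```

Writing the result as JSON keeps the dictionary machine-readable, which is the property that lets repositories and search engines index it; a human-readable codebook could be generated from the same entries.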