Beyond the lab: Using big data to discover principles of cognition

Lupyan, G., & Goldstone, R. L. (2019). Introduction to special issue. Beyond the lab: Using big data to discover principles of cognition. Behavior Research Methods, 51, 1473-1476.

Like many other scientific disciplines, psychological science has felt the impact of the big-data revolution. This impact arises from the meeting of three forces: data availability, data heterogeneity, and data analyzability. In terms of data availability, consider that for decades, researchers relied on the Brown Corpus of about one million words (Kučera & Francis, 1969). Modern resources, in contrast, are larger by six orders of magnitude (e.g., Google’s 1T corpus) and are available in a growing number of languages. About 240 billion photos have been uploaded to Facebook, and Instagram receives over 100 million new photos each day. The large-scale digitization of these data has made it possible in principle to analyze and aggregate these resources on a previously unimagined scale. Heterogeneity refers to the availability of different types of data. For example, recent progress in automatic image recognition is due not just to improvements in algorithms and hardware, but arguably more to the ability to merge large collections of images with linguistic labels (produced by crowdsourced human taggers) that serve as training data for the algorithms. Making use of heterogeneous data sources often depends on their standardization. For example, the ability to combine demographic and grammatical data about thousands of languages led to the finding that languages spoken by more people have simpler morphologies (Lupyan & Dale, 2010). Combining these data types would have been substantially more difficult without standardized language and country codes that could be used to merge the different sources. Finally, analyzability must be ensured, for without appropriate tools to process and analyze different types of data, the “data” are merely bytes.
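The standardization point can be made concrete with a small sketch. The snippet below is a hypothetical illustration, not taken from the studies cited above: all speaker counts and morphology figures are invented. It joins a demographic table and a typological table on shared ISO 639-3 language codes, the kind of standardized key that makes merging heterogeneous sources straightforward:

```python
# Hypothetical sketch: merging two heterogeneous data sources on a shared
# standardized key (ISO 639-3 language codes). All figures are invented.

# Demographic source: language code -> speaker population (illustrative values)
speakers = {"eng": 1_500_000_000, "fin": 5_500_000, "haw": 24_000}

# Typological source: language code -> number of inflectional categories (invented)
morphology = {"eng": 4, "fin": 15, "haw": 2}

# The shared standardized code lets the two sources be joined directly,
# with no fuzzy matching of language names required.
merged = [
    (code, speakers[code], morphology[code])
    for code in sorted(speakers.keys() & morphology.keys())
]

for code, population, inflections in merged:
    print(code, population, inflections)
```

Without an agreed-upon code set, the same join would require reconciling free-text language names across sources, exactly the difficulty the standardized codes remove.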



Instruction in computer modeling can support broad application of complex systems knowledge

Tullis, J. G., & Goldstone, R. L. (2017). Instruction in computer modeling can support broad application of complex systems knowledge. Frontiers in Education, 2:4, 1-18. doi: 10.3389/feduc.2017.00004

Learners often struggle to grasp the important, central principles of complex systems, which describe how interactions between individual agents can produce complex, aggregate-level patterns. Learners have even more difficulty transferring their understanding of these principles across superficially dissimilar instantiations of the principles. Here, we provide evidence that teaching high school students an agent-based modeling language can enable students to apply complex system principles across superficially different domains. We measured student performance on a complex systems assessment before and after 1 week of training in how to program models using NetLogo (Wilensky, 1999a). Instruction in NetLogo helped two classes of high school students apply complex systems principles to a broad array of phenomena not previously encountered. We argue that teaching an agent-based computational modeling language effectively combines the benefits of explicitly defining the abstract principles underlying agent-level interactions with the advantages of concretely grounding knowledge through interactions with agent-based models.
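Although the study taught NetLogo, the agent-to-aggregate idea it targets can be sketched in a few lines of Python (a hypothetical illustration, not course material from the study). Each agent follows a purely local rule, yet the population tends to self-organize into homogeneous clusters, a Schelling-style segregation pattern:

```python
# Minimal agent-based sketch of emergence: agents of two types sit on a ring;
# an agent is "unhappy" if neither neighbor shares its type, and unhappy
# agents swap places at random. No rule mentions clusters, yet clusters of
# like-typed agents are what typically remain. Parameters are invented.
import random

random.seed(0)
N = 40
agents = [random.choice("AB") for _ in range(N)]  # two agent types on a ring

def unhappy(i):
    """An agent is unhappy if neither ring neighbor shares its type."""
    return agents[i] != agents[i - 1] and agents[i] != agents[(i + 1) % N]

# Local rule: two randomly chosen unhappy agents swap positions.
for _ in range(500):
    movers = [i for i in range(N) if unhappy(i)]
    if len(movers) < 2:
        break  # near-equilibrium: almost every agent is locally satisfied
    i, j = random.sample(movers, 2)
    agents[i], agents[j] = agents[j], agents[i]

print("".join(agents))  # runs of like agents reflect the emergent clustering
```

The mismatch between the agent-level description (a one-line happiness rule) and the aggregate-level pattern (segregated clusters) is exactly the gap the modeling instruction described above is designed to bridge.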


An experiment on the cognitive complexity of code

Hansen, M. E., Lumsdaine, A., & Goldstone, R. L. (2013). An experiment on the cognitive complexity of code. Proceedings of the Thirty-Fifth Annual Conference of the Cognitive Science Society. Berlin, Germany: Cognitive Science Society.

What simple factors impact the cognitive complexity of code? We present an experiment in which participants predict the output of ten small Python programs. Even with such simple programs, we find a complex relationship between code, expertise, and correctness. We use subtle differences between program versions to demonstrate that small notational changes can have profound effects on comprehension. We catalog common errors for each program, and perform an in-depth data analysis to uncover effects on response correctness and speed.
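The paper's actual stimuli are not reproduced here, but an invented pair of snippets illustrates the kind of notational variation at issue: two programs that compute the same value while differing only in surface form, one as an explicit loop and one as a comprehension.

```python
# Hypothetical illustration (not the study's stimuli): the same computation
# in two notations, of the kind whose comprehension the experiment probed.

# Version A: explicit loop with a conditional
total_a = 0
for x in [1, 2, 3, 4]:
    if x % 2 == 0:
        total_a += x * x

# Version B: one-line generator expression computing the same value
total_b = sum(x * x for x in [1, 2, 3, 4] if x % 2 == 0)

print(total_a, total_b)  # both are 20 (2*2 + 4*4)
```

Although the two versions are semantically identical, readers may trace them quite differently, which is why small notational changes can produce measurable differences in comprehension accuracy and speed.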


Cognitive Architectures: A Way Forward for the Psychology of Programming

Hansen, M. E., Lumsdaine, A., & Goldstone, R. L. (2012). Cognitive Architectures: A Way Forward for the Psychology of Programming. Onward! Workshop at the Third Annual SPLASH Conference 2012.

Programming language and library designers often debate the usability of particular design choices. These choices may impact many developers, yet scientific evidence for them is rarely provided. Cognitive models of program comprehension have existed for over thirty years, but lack quantitative definitions of their internal components and processes. To ease the burden of quantifying existing models, we recommend using the ACT-R cognitive architecture: a simulation framework for psychological models. In this paper, we provide a high-level overview of modern cognitive architectures while concentrating on the details of ACT-R. We review an existing quantitative program comprehension model, and consider how it could be simplified and implemented within the ACT-R framework. Lastly, we discuss the challenges and potential benefits associated with building a comprehensive cognitive model on top of a cognitive architecture.
