Learning How to Look as well as Listen: Building Capacity for Video Based Transcription and Analysis in Social and Educational Research

Principal investigator

Alfredo Artiles

Direct sponsor

Spencer Foundation

Award start date


Award end date


The challenge

Just as technology has revolutionized classroom instruction, so it promises rapid advancements for the field of education research. Despite that promise, many scholars have been slow to take advantage.

One example is video recording-based research, which, though accessible and inexpensive, is not as widely or effectively adopted as it could be. Mary Lou Fulton Teachers College researchers identified two challenges to its widespread implementation: new scholars are reluctant to use it in place of the traditional methods they were trained in, and they don’t know how to use the visual information in an audio-visual recording effectively.

In the latter instance, when video recordings are used, they are often employed in the same way audio-only recordings have always been, with a focus on transcription and later analysis. This discards the rich ecology of meaning found in real-time social interaction, including nonverbal behavior that accompanies speaking, participant relations with artifacts in the scene, and the mutual influence between listeners and speakers.

This additional data may hold critical information that benefits the research.

The approach

With funding from the Spencer Foundation, the “Learning How to Look and Listen Conference” was held in November 2016 to address the knowledge gap surrounding video-based research. Its particular aim was to document and examine the primary research processes of noticing that analysts employ while reviewing video footage: how they look and listen as they repeatedly view footage while preparing a formal transcription and analysis.

Participants were researchers from Vanderbilt University, University College London, George Washington University, New York University, University of Victoria, Northwestern University, University of Virginia, University of Texas at Austin, University of Southern California, and the University of California campuses in Los Angeles, Santa Barbara, San Diego and Santa Cruz. They represented a pioneer generation of video researchers as well as a second generation of researchers, from disciplines including education, anthropology, sociology, linguistics and psychology.

The conference began with attendees submitting a brief reflection paper and being videorecorded in a “talk aloud” session in which they analyzed a short video.

In subsequent conference sessions, presenters demonstrated how they analyze video footage and transcribe visual and auditory data together. Conference activities allowed attendees to explore and experience best-practice video-analysis methods as they relate to education research.

Findings and impact

All sessions of the conference were videorecorded and are available on the project’s free public website, learninghowtolookandlisten.com. One section of the site offers individual viewing sessions from the conference. Another includes a group viewing session in which all of the scholars collaboratively viewed and discussed the same two-minute video they had watched in their individual viewing sessions.

Presentations available on the site include videos of researchers describing how they use video-based analysis, as well as “Future Directions,” which includes recordings of group discussions and a written summary of key issues, needs and recommendations surrounding the advancement of video-based analysis in education and the social sciences.