Techniques of Language Documentation (710)

Spring 2023

Instructor: Bradley McDonnell
Time: M 10:30 – 1:15
Location: PHYSCI 317
Office hours: T R 12 – 1 (Virtual)
Virtual meeting:

Course description

Contemporary language documentation is dedicated to collecting, annotating, curating, and disseminating long-lasting, multipurpose records of the linguistic practices of language communities. This course will give you the skills you need to produce such documentation, with special attention given to digital data collection, data sustainability, and the documentation of language-in-use. The skills you develop in this class can be extended to future fieldwork, to community-based language work, and/or to bringing an existing documentation project in line with current practice. Students will (1) gain an understanding of current practices in digital language documentation; (2) develop skills in a prosody-based transcription system that can be applied to any spoken language; (3) become familiar with key software and hardware used in the field; and (4) develop skills to troubleshoot data management problems in a variety of fieldwork situations. By the end of the course, students will be able to plan a best-practice language documentation project of their own, from equipment purchase to recording language-in-use to data annotation to archiving and dissemination.

Learning objectives

At the end of this course, students will be able to:

  1. Develop a tailored Data Management Plan for a wide range of language documentation projects.
  2. Collect appropriate metadata for the documentation and manage it with the specialized collection software Lameta.
  3. Create, edit, and convert high-quality audio and video recordings for archiving and for dissemination to different audiences.
  4. Create and maintain a linked lexical and text corpus database using FieldWorks Language Explorer (FLEx) software that can be exported in various formats for further annotation and/or dissemination.
  5. Transcribe language in use using the time-aligned transcription software ELAN and a prosody-based transcription system that can be applied to any spoken language.

Required materials

Students are required to have (or have access to) the following:



  • You will need access to a computer with at least 1 GB of free space (or, failing that, an external hard drive), since we will be working with video and audio files that can be quite large.
    • If this is going to be an issue, please communicate it to me early on in the semester.
  • I recommend purchasing closed-back (‘circumaural’) monitoring headphones for transcription and monitoring recordings.
  • You do not need to purchase any video or audio equipment; I will lend out equipment for you to use for these activities.

Assignments & Grading

  1. Participation & reading responses 20%
  • Students are required to be at every class and participate in class discussions of the readings.
  • When readings are assigned, students are required to post two responses (approximately 3-5 sentences each) to the Laulima forum by 8pm on the Saturday before the reading is to be discussed. Responses should contain commentary, criticisms, or questions about the reading.
  • Each student is required to reply to two of their classmates’ responses (approximately 2-4 sentences each) by 10am on Monday.
  2. Homework assignments 20%
  • There will be a number of short assignments on the topics being discussed in class.
    1. IRB/CITI certification assignment
    2. Metadata assignment
    3. ELAN assignment
    4. FLEx lexicon assignment
    5. FLEx glossing assignment
    6. ELAN-FLEx-ELAN import/export assignment
    7. Data management plan
    8. Audio assignment
    9. Video assignment
    10. Transcription
  3. Presentations 10%
  4. Audiovisual recording & transcription 15%
  • Each student will make a recording of a conversation of at least 1/2 hour.
    • The recording must use video with a separate audio recorder.
    • The audio and video will be synced, edited, and exported in Adobe Premiere Pro for transcription in ELAN.
  • Students will transcribe a small portion of the recording using Discourse Transcription (Du Bois et al. 1993).
  5. Documentation enrichment project 15%
  • As a class, we will work together on enriching a documentation project, and each student will be responsible for enriching some portion of the documentation. In week four of the semester, we will begin discussing the areas of the documentation project on which each student would like to work. The goal of this project is to give students hands-on experience working with documentary materials, using the technical skills that they have learned in this class.
  6. Technology review 15%
  • Each student will write a technology review of 2,000-3,000 words in length that is targeted for publication in the journal Language Documentation & Conservation on one of the following topics:

    • Review of hardware, such as audio recorder(s), video cameras, microphones, that either provides an in-depth review of one piece of hardware or a comparison of a small set of like items. The purpose is to help interested parties decide on the best equipment to use.

    • Review of linguistics software or web resource, which may include those listed under Section 4.

    • Review of “non-linguistics” software that may be repurposed for language work.

    • Workflow:

Attendance and attentiveness

  • Attendance in this course is crucial. In order to be successful in the course, you really need to attend every class and be punctual. Excessive absences or tardiness may result in a grade reduction. If you know that you are going to be absent, please talk to me as soon as you can so that we can develop a plan.
  • Please be attentive during class. This means that you should not be multi-tasking (e.g., responding to emails, chatting on Discord, or posting on Facebook).
  • In each class be ready to discuss the readings and participate in the discussion.

Course readings

Course readings are posted on Laulima and linked to the Reading Discussions. If you are planning to read ahead more than a couple of weeks, please check with me first to make sure that Laulima is up-to-date.

Below is a complete list of the required and recommended readings for this course.

Arkhipov, Alexandre, and Nick Thieberger. 2018. “Reflections on Software and Technology for Language Documentation.” In Reflections on Language Documentation 20 Years After Himmelmann 1998, edited by Bradley McDonnell, Andrea L. Berez-Kroeker, and Gary Holton, 140–49. Language Documentation & Conservation Special Publication 15. Honolulu: University of Hawai’i Press.
Artis, Anthony Q. 2014. The Shut up and Shoot Documentary Guide: A Down & Dirty DV Production. 2nd edition. New York; London: Focal Press, Taylor & Francis Group.
Austin, Peter. 2016. “Language Documentation 20 Years On.” In Endangered Languages and Languages in Danger: Issues of Documentation, Policy, and Language Rights, edited by Luna Filipović and Martin Pütz, 147–70. IMPACT: Studies in Language and Society 42. Amsterdam: John Benjamins Publishing Company.
Bird, Steven, and Gary Simons. 2003. “Seven Dimensions of Portability for Language Documentation and Description.” Language 79 (3): 557–82.
Bow, Catherine, Baden Hughes, and Steven Bird. 2003. “Towards a General Model of Interlinear Text.” In Proceedings of EMELD Workshop 2003: Digitizing and Annotating Texts and Field Recordings, 1–47. Lansing MI, USA.
Bowern, Claire. 2010. “Fieldwork and the IRB: A Snapshot.” Language 86 (4): 897–905.
———. 2015. Linguistic Fieldwork: A Practical Guide. 2nd ed. New York: Palgrave Macmillan.
Burnard, Lou. 2005. “Metadata for Corpus Work.” In Developing Linguistic Corpora: A Guide to Good Practice, edited by Martin Wynne, 30–46. Oxford: Oxbow Books.
Dimmendaal, Gerrit J. 2010. “Language Description and ‘the New Paradigm’: What Linguists May Learn from Ethnocinematographers.” Language Documentation & Conservation 4: 152–58.
DiPersio, Denise. 2014. “Linguistic Fieldwork and IRB Human Subjects Protocols.” Language and Linguistics Compass 8 (11): 505–11.
Driem, George van. 2016. “Endangered Language Research and the Moral Depravity of Ethics Protocols.” Language Documentation & Conservation 10: 243–52.
Du Bois, John W., Susanna Cumming, Stephan Schuetze-Coburn, and Danae Paolino. 1992. Discourse Transcription. Santa Barbara Papers in Linguistics 4. Santa Barbara: Department of Linguistics, University of California, Santa Barbara.
Du Bois, John W., Stephan Schuetze-Coburn, Susanna Cumming, and Danae Paolino. 1993. “Outline of Discourse Transcription.” In Talking Data: Transcription and Coding in Discourse Research, edited by Jane Anne Edwards and Martin D. Lampert, 45–89. Hillsdale, NJ: Lawrence Erlbaum Associates, Publishers.
Good, Jeff. 2011. “Data and Language Documentation.” In The Cambridge Handbook of Endangered Languages, edited by Peter K. Austin and Julia Sallabank, 212–34. Cambridge: Cambridge University Press.
Himmelmann, Nikolaus P. 2006. “The Challenges of Segmenting Spoken Language.” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus P. Himmelmann, and Ulrike Mosel. Trends in Linguistics. Studies and Monographs [TiLSM]. Berlin, New York: Mouton de Gruyter.
Margetts, Anna, and Andrew Margetts. 2011. “Audio and Video Recording Techniques for Linguistic Research.” In The Cambridge Handbook of Endangered Languages, edited by Peter K. Austin and Julia Sallabank, 13–53. Cambridge: Cambridge University Press.
McDonnell, Bradley, Andrea L. Berez-Kroeker, and Gary Holton. 2018. “Introduction.” In Reflections on Language Documentation 20 Years After Himmelmann 1998, edited by Bradley McDonnell, Andrea L. Berez-Kroeker, and Gary Holton, 1–11. Language Documentation & Conservation Special Publication 15. Honolulu: University of Hawai’i Press.
Robinson, Laura C. 2010. “Informed Consent Among Analog People in a Digital World.” Language & Communication 30 (3): 186–91.
Seyfeddinipur, Mandana, and Felix Rau. 2020. “Keeping It Real: Video Data in Language Documentation and Language Archiving.” Language Documentation & Conservation 14: 503–19.
Thieberger, Nicholas, and Andrea L. Berez. 2012. “Linguistic Data Management.” In The Oxford Handbook of Linguistic Fieldwork, edited by Nicholas Thieberger, 90–118. Oxford: Oxford University Press.
Thieberger, Nick, Amanda Harris, and Linda Barwick. 2015. “PARADISEC: Its History and Future.” In Research, Records and Responsibility: Ten Years of PARADISEC, edited by Amanda Harris, Nick Thieberger, and Linda Barwick, 1–16. Sydney: Sydney University Press.
Woodbury, Anthony C. 2011. “Language Documentation.” In The Cambridge Handbook of Endangered Languages, edited by Peter K. Austin and Julia Sallabank, 159–76. Cambridge: Cambridge University Press.

Inclusivity policies

Diversity and Civility

I consider the classroom to be a place where you will be treated with respect, and I welcome individuals of all ages, backgrounds, beliefs, ethnicities, genders, gender identities, gender expressions, national origins, religious affiliations, sexual orientations, abilities, and other visible and nonvisible differences. All members of this class are expected to contribute to a respectful, welcoming, and inclusive environment for every other member of the class.

Preferred Names and Pronouns

I will gladly honor your request to address you by a preferred name or gender pronoun. Please advise me of this preference early in the semester so that I can make appropriate changes to my records.

Needs (ADA Statement)

If you have a disability for which you need accommodations in this class or any other special need (e.g. religious holidays), please inform the instructor as soon as possible. The KOKUA Program (Office for Students with Disabilities) can be reached at (808) 956-7511 or (808) 956-7612 (voice/text) in room 013 of the Queen Lili’uokalani Center for Student Services.