Text, Data, and the Infrastructure of Knowledge

This spring Andrew Piper and I are teaching a graduate seminar titled “Text, Data, and the Infrastructure of Knowledge.” Here’s the description:

In this seminar, we will consider a broad range of questions concerning the preservation, circulation, reproduction, and interpretation of texts in a digital realm from the perspective of what we call the historical concerns of philology. We will consider how philology, understood as a set of scholarly methods, concerns, and practices, might (or might not) help us critically engage with our contemporary media and epistemic environment, especially as it relates to contemporary data practices and computational forms of textual study. This seminar is thus organized as an exploration of an overarching hypothesis: namely, that the history of philology can help us better understand and actively address key concerns related to digitization, machine learning, and the preservation and production of knowledge.

Here is the rough schedule:

Wk. 1: Philology and the Future of Texts

– Boeckh, “The Idea of Philology”

– Dayeh, “The Potential of World Philology”

– Nietzsche, “We Philologists”

Wk. 2: Documentation: What is a text?

– Treharne, Text Technologies [Intro]

– Gitelman, Paper Knowledge [Intro]

– Rosenberg, “Data Before the Fact”

– Cordell, “Q i-jtb the Raven: Taking Dirty OCR Seriously”

Wk. 3: Collection: Curation and Loss

– Bode, “The Equivalence of Close and Distant Reading”

– Ernst, Stirrings in the Archive, 1-37

– Gavin, “How to think about EEBO”

– Gebru, “Datasheets for Datasets”

Wk. 5: Circulation: Intermediation and Care

– The Multigraph Collective, Interacting with Print [Intro]

– Erasmus, “On Method”

– Cassiodorus, “Introduction to Divine and Human Readings”

– Piper, Dreaming in Books, 85-97

– Piper, “Digitization”

Wk. 6: Indexation: Search and Accessibility

– “Index”, Interacting with Print, 155-169

– Jurafsky, “Vector Semantics”

– Salton, “A Vector Space Model for Automatic Indexing”

– Salton, Automatic Information Retrieval

– Brin and Page, “The Anatomy of a Large-Scale Hypertextual Web Search Engine”

Wk. 7: Interpretation

– Schleiermacher, Lectures on Hermeneutics

– Felski, Uses of Literature [Intro]

– Fish, “Interpreting the Variorum”

– Kramnick, “Criticism and Truth”

Wk. 8: The Artificial / Informational Turn

– Grishman, “Message Understanding Conference 6: A Brief History”

– Halevy, “The Unreasonable Effectiveness of Data”

– Alpaydin, Machine Learning [chap. 2]

– Kelly, “The End of Theory”

– Yonack, “A Non-Technical Introduction to Machine Learning”

03. 04 ***Study Break***

Wk. 9: Probably: Accuracy and Uncertainty

– Porter, The Rise of Statistical Thinking, pp. 18-39, 93-109

– Shapin, “The Great Civility”

– Shapin, “Is there a crisis of truth?”

– Open Science Collaboration, “Estimating the Reproducibility of Psychological Science”

Wk. 10: Openness: Visibility and Objectivity

– Daston/Galison, “Epistemologies of the Eye”

– Wellmon/Piper, “The Page Image”

Wk. 11: Bias and Inequality

– Caliskan, “Semantics derived automatically from language corpora contain human-like biases”

– Blodgett, “Language (Technology) is Power”

– Noble, Algorithms of Oppression

– Bode, “You can’t model away bias”

– Wellmon/Piper, “Publication, Power, Patronage”

– Hofstra, “The Diversity-Innovation Paradox”

Wk. 12: Case Studies: The Good, the Bad, and the Ugly

[Student Presentations]

– Piper, “Think Small: On Literary Modeling”

Wk. 13: Case Studies (cont’d)

chad wellmon
