While collecting a “big data” set of newspaper and magazine articles, government documents, and other public discourse on the humanities for study through computational text analysis, the 4Humanities WhatEvery1Says research project acted on an intriguing suggestion by team member Professor Jeremy Douglass (UC Santa Barbara) to broaden its notion of public discourse.
The project team collected a plain-text corpus of all U.S. patents related to the humanities issued since 1976 (the first year of full digital text available in the U.S. Patent Office’s searchable archive). Alan Liu scraped the patent descriptions as plain text (to facilitate study through computational text analysis) and classified them as either “Humanities Patents” (mentioning “humanities” or “liberal arts” in a notable way) or “Humanities Patents – Extended Set” (mentioning “humanities” or “liberal arts” in a peripheral manner or in citations). Because patent descriptions are in the public domain, 4Humanities can present them here as folders of plain text files downloadable in zip form. (Diagrams and other visuals are not included.)
Much of the leading-edge scholarly response to public discussion about the apparent decline of the humanities has revolved around translating the older notion of “general” and (more recently) “flexible” humanities knowledge into terms relevant to current socioeconomic change, such as “open access” and “collaboration.” Normally, these concepts are seen through the lens of copyright issues. Little attention has been paid to patents as part of the modernization of the humanities.
Gutenberg did not hold a patent on print. In thinking about the humanities in society today, however, researchers may need to evaluate critically how the humanities should adapt to that fact.
- Metadata (Excel spreadsheet)
- Humanities Patents (76 patents related to humanities or liberal arts) (zip file)
- Humanities Patents – Extended Set (336 additional patents that mention the humanities or liberal arts in a peripheral way, e.g., only in reference citations and institutional names of patent holders, as minor or arbitrary examples, etc.) (zip file)
- Humanities Patents – Total Set (412 patents; combined set of above “Humanities Patents” and “Humanities Patents – Extended Set”) (zip file)
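For readers who want a first look at the downloaded archives, the following is a minimal sketch of how the plain-text files in one of the zip folders might be scanned for the two terms used in the classification above. It only counts literal mentions; it does not reproduce the manual judgment of whether a mention is “notable” or “peripheral.” The function names, the demo file names, and the assumptions that the archive members end in `.txt` and are UTF-8 encoded are illustrative, not taken from the data set itself.

```python
import io
import re
import zipfile
from collections import Counter

# The two terms named in the description of the data set.
# \s+ lets "liberal arts" match across a line break in the plain text.
PATTERN = re.compile(r"humanities|liberal\s+arts", re.IGNORECASE)


def count_mentions(text):
    """Return a Counter of how often each term occurs in one patent text."""
    # Lowercase and collapse whitespace so every match maps to one key.
    return Counter(" ".join(m.group(0).lower().split())
                   for m in PATTERN.finditer(text))


def mentions_by_file(zip_source):
    """Map each .txt member of a zip archive to its term counts.

    zip_source may be a file path or a file-like object.
    Assumption: members are UTF-8 plain text; undecodable bytes are replaced.
    """
    results = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".txt"):
                continue
            text = archive.read(name).decode("utf-8", errors="replace")
            results[name] = count_mentions(text)
    return results


def build_demo_zip():
    """Build a tiny in-memory archive (hypothetical contents) for a dry run."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "patent_a.txt",
            "A system for teaching the humanities and the liberal\n"
            "arts. Humanities courses are indexed.",
        )
        archive.writestr("patent_b.txt", "A valve assembly.")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name, counts in mentions_by_file(build_demo_zip()).items():
        print(name, dict(counts))
```

To run the same tally on a real download, pass the path of the saved zip file to `mentions_by_file` in place of the demo archive.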
Topic Models of This Data Set
4Humanities has created initial topic models of the Humanities Patents (spreadsheet of 30-topic model) and the Humanities Patents – Total Set (spreadsheet of 30-topic model). The topic models seem to be highly legible (perhaps owing to the tightly constrained form of patent descriptions). In essence, each topic in the models is a kind of doorway through which the humanities are invited into the highly technical, formalized, monetized world of late modernity as envisioned in patents. From the point of view of patents as a form of knowledge, the main doorways today carry labels such as “information,” “documents,” “education,” and “search,” rather than such older humanistic portals as “philosophy,” “literature,” and “history.”
Topics visualized as word clouds (full set of word clouds):