Author: jeff

Readings on data and statistical services for a liberal arts college


I’m beginning to do some scenario planning on what the data and statistical services offered by the library will look like in 10-15 years. As part of that activity I’m compiling a list of articles, websites, and presentations that will help inform that perspective.

Many of these come from the article Teaching the next generation of statistics students to ‘think with data’: special issue on statistics and the undergraduate curriculum by Nicholas Horton and Johanna Hardin. That article has a nice section of key articles on statistics in the undergraduate curriculum, from which I’ve made some selections below.

Setting the stage for data science: integration of data management skills in introductory and second courses in statistics. Horton, Baumer, Wickham. 2015. (pdf)

Identifies 5 key elements that deserve greater emphasis in the undergrad curriculum:

  1. “Thinking creatively, but constructively, about data”…data cleaning, data storage
  2. working with data sets of varying sizes and understanding scalability issues…querying databases
  3. command-line skills. The authors mention R and Python. I would also include Unix. Command-driven environments “provide freedom from the un-reproducible point-and-click application paradigm”.
  4. “Experience wrestling with large, messy, complex, challenging data sets…these data are more similar to what analysts actually see in the wild.”
  5. “An ethos of reproducibility”

The article goes on to illustrate examples of using these 5 elements in coursework.

 

Tidy Data – slides of presentation by Wickham

Data acquisition and preprocessing in studies on humans: what is not taught in statistics classes?

Statistics and Science: A Report of the London Workshop on the Future of the Statistical Sciences 2014 (pdf)

Humanities Data in R

Implications of the Data Revolution for Statistics Education (pdf) 2015 calls for more emphasis on big data, data visualization, and developing an “aesthetic for data handling and modeling based on solving practical problems”.

A data science course for undergraduates: thinking with data (pdf)

A cognitive interpretation of data analysis

Teaching and learning data visualization: ideas and assignments (pdf) 2015

Meeting Student Needs for Multivariate Data Analysis: A Case Study in Teaching a Multivariate Data Analysis Course with No Pre-requisites Amy Wagaman, Amherst College

Curriculum Guidelines for Undergraduate Programs in Statistical Science

 

Linked Open Data & Literary Networks


We had the first meeting today of what we’re calling our Linked Open Data Working Group. In addition to myself, group members are Mackenzie Brooks (Digital Humanities Librarian), Jeff Knudson (Senior Technology Architect/ITS), and Brandon Walsh (Mellon Digital Humanities Fellow).

What is it we want to accomplish through this group?

  • Develop a better understanding of Linked Open Data (LOD) and how it might apply to projects at W&L.

We want to think about LOD in the context of our specific DH projects in order to avoid talking about it in the abstract. But we also want to make sure that we clearly identify what we want to accomplish with those projects instead of having a solution (e.g., LOD) that is looking for a problem to solve. In other words, we’re going to develop the vision for the project and then work backwards.

We have several potential projects but the easiest to get started on is literary networks. This project evolved out of archival research relating to the Shenandoah literary magazine published by W&L. While Shenandoah is partially indexed in MLA, a full index of Shenandoah has never been produced. (Also, the contents are not in JSTOR.)  A student worker in the library has compiled an index as part of her job in Special Collections and Archives. Our DH Librarian (previously our Metadata Librarian) identified the necessary fields and created the spreadsheet for the data entry. The Shenandoah index has over 6,400 entries.

Our first task in this project doesn’t actually involve LOD: creating a web-based index to Shenandoah. But we want to keep LOD principles in mind as we develop the index. The Shenandoah data set provides research material that goes far beyond merely an index to a journal.

While the research agenda of the literary networks project is the topic of a future post, the essence is that I want to examine the relationships and connections among authors and editors.

 

Top-level functionality

Here are the primary features involved in a LOD approach to this data set. Each feature is a different stage, or layer, of the project. A use case scenario describes the functionality enabled by each stage.

  • a web-based index to Shenandoah

Use case: Queries based on editor. From these queries we can form network graphs based on relationships among authors and editors and issues.
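As a rough illustration of how editor-based queries could feed a network graph, here is a minimal Python sketch. The field names (author, editor, issue) and the sample rows are invented for illustration; the actual spreadsheet columns may differ.

```python
from collections import Counter

# Hypothetical index rows; field names and values are placeholders,
# not the real Shenandoah spreadsheet schema.
index_rows = [
    {"author": "Author A", "editor": "Editor X", "issue": "1.1"},
    {"author": "Author B", "editor": "Editor X", "issue": "1.1"},
    {"author": "Author A", "editor": "Editor X", "issue": "1.2"},
    {"author": "Author A", "editor": "Editor Y", "issue": "2.1"},
]


def authors_for_editor(rows, editor):
    """Return the sorted, de-duplicated authors published under one editor."""
    return sorted({row["author"] for row in rows if row["editor"] == editor})


def editor_author_edges(rows):
    """Count index entries per (editor, author) pair.

    Each pair is an edge in the network; the count serves as its weight.
    """
    return Counter((row["editor"], row["author"]) for row in rows)


edges = editor_author_edges(index_rows)
for (editor, author), weight in sorted(edges.items()):
    print(editor, author, weight)
```

The weighted pairs form a simple edge list, which is the kind of input that network tools generally expect.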

  • expose the index/relationship data as LOD

Use case: Exposing this data set as a LOD triple store (with the possibility of generating query results in JSON and CSV) allows the data to be analyzed in a variety of tools such as Palladio, R, and Gephi. Plus, it allows other projects to integrate this data.
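To make the idea of triples and multiple output formats concrete, here is a small sketch using only the Python standard library. It is not a triple store. The example.org namespace and the work and person identifiers are placeholders, and using the Dublin Core creator property is an assumption for illustration, not a decision the group has made.

```python
import csv
import io
import json

BASE = "http://example.org/shenandoah/"       # placeholder namespace (assumption)
CREATOR = "http://purl.org/dc/terms/creator"  # Dublin Core "creator" property

# Hypothetical (subject, predicate, object) triples linking works to authors.
triples = [
    (BASE + "work/1", CREATOR, BASE + "person/author-a"),
    (BASE + "work/2", CREATOR, BASE + "person/author-b"),
]


def to_ntriples(rows):
    """Serialize URI-only triples as N-Triples, one statement per line."""
    return "".join(f"<{s}> <{p}> <{o}> .\n" for s, p, o in rows)


def to_json(rows):
    """Return the triples as a JSON array of subject/predicate/object objects."""
    records = [{"subject": s, "predicate": p, "object": o} for s, p, o in rows]
    return json.dumps(records, indent=2)


def to_csv(rows):
    """Return the triples as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["subject", "predicate", "object"])
    writer.writerows(rows)
    return buffer.getvalue()


print(to_ntriples(triples))
```

A real implementation would more likely rely on an RDF library and a SPARQL endpoint; the point here is only that the same statements can be handed to different tools in whatever format they accept.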

  • the web index incorporates additional data about authors

Use case: the web interface shows brief biographical information and publishing history about each author. Instead of gathering this data manually and entering it directly into the data set, we want to explore creating a process that connects external information about these authors with this data by consuming LOD.

  • expanding the Shenandoah data set by including relationship information identified by archival research

Use case: literary networks are influenced by friendships and social contacts. Publishing decisions are often made by brokers (e.g., Ezra Pound), whose influence is not formally represented in the index data. There’s a complex challenge in figuring out how to represent this information.

  • expanding the data set with data of other literary journals in order to create a broader data set of authors publishing in the mid-century

Use case: While the data set starts with one specific literary journal (Shenandoah), it’s the connections among authors and editors publishing in a larger set of journals during the same time period that are more interesting. Authors do not simply write for one publication. Ultimately, the literary networks project will create a data set of authors (and editors) who published in mid-century literary journals.

By starting with the Shenandoah data set, we gain experience with the process of using LOD. What we learn through this initiative can be applied to other projects, particularly those with biographical data.

Process of the LOD working group

As a group we will meet once a month. We will use a Slack channel in the W&L DHAT team for communication.

DH as a Trojan horse for information literacy


The digital humanities (DH) represent an academic library’s greatest opportunity for strengthening its role in the curriculum and research. DH provides a frame for understanding the creative process of scholarship.

The methods and tools within DH are not unique to disciplines within the humanities. The core activities of DH reflect the fundamental hallmarks of information literacy as expressed in the ACRL Framework for Information Literacy. DH does not exist without information literacy. Yet, a separation exists within the profession of librarianship between DH and information literacy. Our academic libraries are organized so that information literacy is the domain of subject liaisons/instructional librarians and DH emerges from R&D-type efforts. In most libraries these are entirely separate departments. In some universities DH is entirely separated from the library, even when a DH center is physically located within the library building.

DH and information literacy are on a collision path fighting for resources. Yet, DH can be a vehicle for strengthening the reach of information literacy in the curriculum. Opportunities exist for collaborative initiatives that bring the two together rather than leaving them siloed within organizational boundaries. Librarians must advocate for integrating these practices instead of competing for resources. Here are some steps for moving forward:

First, increase the dialogue between instruction librarians and DH specialists in order to move beyond the barrier of the term DH and recognize its essence as applicable to all disciplines.

Second, redefine the role of the subject liaison/specialist to incorporate a range of digital practices so that all academic librarians are digital scholarship librarians.

Third, take action and demonstrate through example by jumping in with both feet to figure out what works at a particular institution.

DH has the power to enhance an information literacy program. Within the confluence of DH and information literacy, a university can find the capacity to sustain digital scholarship.

First-year writing courses & DH


We’re half a year into our 4-year Mellon DH grant. On my way back from DLF in Vancouver at the end of October, I got stranded in the Chicago airport for most of the day. Those “opportunities” provide plenty of time to think. For a couple of years W&L has been issuing an open call to faculty to submit proposals for incentive grants in DH. As I was sitting in the airport, I started to reflect on how we could take a more systematic approach to ensuring that the grant money contributed to structural changes in the curriculum. In other words, what is it that we’re trying to incentivize?

One of our goals is to introduce more first-year students to DH. Students who encounter digital methods early in their academic careers are better equipped to handle the DH assignments and projects appropriate for upper-level courses. Our students are challenged to grasp the implications of a world mediated through technology. The digital environment is not in opposition to the critical thinking nurtured through the processes of close reading and composition. Rather, through software we find tools that are suitable for enhancing our understanding of the world around us and for presenting new forms of expression.

Our students have the opportunity in their lifetimes to creatively define how technology impacts not only their future but also that of succeeding generations.

As their careers progress into the mid-century, our graduates’ entrepreneurial instincts and leadership will identify solutions that can only be met through their critical understanding of digital information and technology.

The foundation for that digital mindset of addressing humanistic concerns starts in the first year of college.

Our initiative is to collaborate with faculty teaching the first-year writing courses and seminars to craft an introductory set of DH assignments. These assignments will relate the core concepts of those courses to analytical and creative methods within DH and establish a baseline of the critical understanding needed to thrive in a digital society.

What types of DH assignments are suitable for first-year writing courses? We’re not yet sure. It’s not appropriate for librarians and technologists to say, “This is what you should do.” Over the course of the coming year we want to define that with the faculty teaching those courses. We’re going to do that through a series of conversations. Plus, we’ll seek the advice of faculty at other institutions who have explored the concepts and are further along that path.

Readings on electronic literature, or conversations on digital narrative


I initially prepared the following list for a guest lecture in an upcoming creative writing (fiction) course that will introduce students to ways of telling a story in digital media that take forms other than linear prose.

First, why the term electronic literature (e-lit)? It’s a stiff-sounding term, a throwback to a time before digital became commonplace. But e-lit is the term that seems to have gained the most traction for narratives that make innovative use of digital media. I have a slight discomfort with the term electronic literature (and also with digital literature) but my unease with the terminology is a topic for another post.

Any discussion of e-lit must involve the Electronic Literature Organization (ELO) at eliterature.org. That website is quickly overwhelming but the summary page What is E-Lit? is a good place to start.

The foundations of e-lit

An essay by Katherine Hayles offers a broad survey of e-lit (up to 2007). Hayles is an important figure in media studies and this essay is a good opportunity to introduce students to her works. Hayles includes references to landmark writers such as Kittler and Manovich and positions e-lit in a larger framework of the modern digital society.

This essay by Hayles is the first chapter in the book Electronic Literature: New Horizons for the Literary, which includes a companion website. The book attempts to establish a canon of early e-lit. But Janet Murray, another key scholar in the field of media studies, points out that these early works “are useful experiments, necessary failures, and limited successes, full of interesting mistakes that if appropriately acknowledged can push practice forward.” (Murray, Janet H. Hayles, N. Katherine. Electronic literature: new horizons for the literary. Modern fiction studies 55.2 01 Jan 2009: 407. Johns Hopkins University Press.)

A good summary of Hayles’s book on e-lit is provided on The Quarterly Conversation site in an article subtitled How Electronic Literature Makes Printed Literature Richer. Anyone who finds Hayles even slightly interesting should read her book How We Became Posthuman: Virtual Bodies in Cybernetics, Literature, and Informatics.

Pathfinders: Documenting the Experience of Early Digital Literature is the best source for understanding where e-lit comes from through an examination of pre-web hypertext literature in the years between 1986 and 1995. The scholarly literature on e-lit often refers to seminal works such as Shelley Jackson’s Patchwork Girl (1995) and other hypermedia texts created through tools developed by Eastgate Systems. However, the technology needed to actually read those works of e-lit today is inaccessible to most people. While Pathfinders does not provide a simulation of Patchwork Girl, it offers an intriguing methodology for showcasing how Shelley Jackson and readers interact with Patchwork Girl.

The present state of e-lit

There’s something odd about e-lit: it appears to be mostly discussed within academia and it’s difficult to find good examples on the web of what is called e-lit. How could that be?

Writer Paul La Farge provides a great comment:

“I actually don’t think digital literature is suffering from a lack of theory at this point; if anything, it suffers from a lack of practice. We need more writers! And a more diverse and robust way of getting their work into the world: not just more competent critics (we have some), but more kinds of competent critics, and more places where conversations about digital literature can happen, and more avenues by which digital lit can reach readers. All of this will surely happen in time. What I think the medium needs now is encouragement, and perhaps rescue from the forbiddingly technical language in which it has been theorized. It depresses me to think of digital literature as being exclusively an academic specialty: it’s as if Film Studies departments had sprung into existence all over the world, before anyone had made any movies.”

This quote is from an excellent series of posts by author Illya Szilak that appeared on the Huffington Post.

Note: I will be updating this post with new readings.

 

E-lit postings by Illya Szilak


Over the course of a year (late 2012 – late 2013) author Illya Szilak wrote a series of articles on Huffington Post about electronic literature that are worth reading for anyone interested in the topic. Szilak is the author of Queerskins – A Novel and Reconstructing Mayakovsky – A Novel of the Future.

Unlike most people who write about e-lit, Szilak is a physician and not an academic. As a creator of contemporary e-lit she brings a perspective that is often absent from the conversation on this topic.

Due to the navigational features of the Huffington Post it isn’t easy to read her articles in the order they were written. So, I arranged the following links to each article in chronological order.

The Death of the Novel: How E-Lit Revolutionizes Fiction 11/08/2012

Video in the House of the Word: How e-Lit Intersects With Cinema 11/20/2012

What Does a Polar Bear Do in a Jungle? How E-Lit Expands the Habitat of Literature 12/11/2012

The Death of the Author: E-lit and Collective Creativity 12/27/2012

It’s Got a Good Beat and You Can Dance to It: E-lit Plays With Time 1/17/2013

New Wor(l)d Order: E-lit Plays With Language 2/7/2013

It’s All Fun Until Someone Loses: E-lit Plays Games 3/7/2013

Just Playing Around: Why E-lit Matters 3/15/2013

Killing the Literary: The Death of E-lit 3/19/2013

Books That Nobody Reads: E-lit at the Library of Congress 4/24/2013

Fleshly Data: E-lit and the Post-Human 5/10/2013

Remembering the Human: E-lit and the Art of Memory 5/15/2013

Reorienting Narrative: E-lit as Psychogeography 6/11/2013

The Silent History: E-lit Looks to the Future 7/1/2013

A Book Itself Is a Little Machine: Emily Short’s Interactive Fiction 10/30/2013

A Book Itself Is a Little Machine: Emily Short’s Interactive Fiction, pt 2 11/4/2013