Digital Curation
Version 1.3

Teleconference with Anna Perricci

Excerpt playing: (choose from the outline below)

Introduction

00:00 Introduction to the program (Jon Ippolito)

01:04 Introducing Anna Perricci

Varieties of web archiving

05:01 Introduction to web archiving

06:55 Automated versus human-scale archiving

"Automation still requires quality assurance"

09:35 Community collecting with Occupy Wall Street, Internet Archive, and Documenting the Now

Perricci's work archiving Occupy Wall Street

Introduction to Webrecorder

10:50 Introduction to Webrecorder

14:19 What is high-fidelity web archiving?

Special Webrecorder features

20:27 Built-in emulation of vintage browsers

You can choose a legacy browser for a site with obsolete content (e.g., Flash or Java applets)

21:19 Autopilot

Captures content for you, but depends on the structure of social media sites

21:26 How to get technical help

Questions about Webrecorder applications

23:17 Archiving representative samples (Matthew Revitt)

25:40 More on Autopilot (Meagan Doyle)

27:36 Browsertrix (Alex Kaelin)

28:25 How to patch missing content (Colin Windhorst)

Webrecorder demo

29:41 Demo of Webrecorder acting on a site

Questions about Webrecorder features

37:04 "Capture URL again" vs. "Patch this URL" (Sarah Danser)

37:50 Using emulated browsers for both capture and playback (John Bell)

39:11 Time frame for archiving a social media site like Facebook (Renee DesRoberts)

40:22 Capturing beyond images, e.g., iframes and hidden web structures

42:07 Capturing data-driven websites (Cynde Moya)

43:08 Capturing outgoing requests, like a query in a search box (John)

Carnegie Hall case study

44:02 Ease of patching compared to other tools (Sean Crawford)

44:18 Editing options (Kim Sawtelle)

"Trying to pull a thread out of a tapestry"

46:02 Capturing live content in real-time, like streaming radio (Alex)

47:43 Saving local backups (Colin)

48:44 Capturing dead links, e.g., in Scalar books (Colin)

50:24 Case study of media-rich journalism (Rhonda Carpenter)

The Snowfall project

51:45 Archive-It and capturing database content (Matthew)

The bigger picture

54:17 Top-down harvesters (such as OAIS) versus bottom-up, human-scale solutions

57:47 How to get more information

This teleconference is a project of the University of Maine's Digital Curation program. For more information, contact jippolito@maine.edu.

Timecodes are in minutes:seconds.

In this interactive discussion hosted by the University of Maine's Digital Curation program, Anna Perricci presents new tools and techniques for saving Internet culture for posterity.

As more of our work and entertainment moves online, the challenge of preserving websites and social media becomes increasingly urgent. Studies peg the average time before a website or mobile app is rewritten or lost at 50-100 days. And unlike the static HTML pages of the 1990s, modern websites are built dynamically from JavaScript as they load in the browser; without those external calls, saving a Twitter or Facebook page can leave you with a handful of floating text snippets flanked by gray rectangles.
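The failure mode described above can be sketched in a few lines. This is a hypothetical illustration, not Webrecorder's actual mechanism: the server ships a near-empty HTML shell, and a script fills in the content after the page loads, so a naive "Save Page As" capture keeps the shell but loses the posts.

```javascript
// Sketch: why a naive static capture loses dynamically loaded content.
// The server sends a near-empty shell; JavaScript injects the posts later.
const savedHtml = `
  <html><body>
    <div id="timeline"></div>
    <script src="https://example.com/app.js"></script>
  </body></html>`;

// Simulate what app.js does in the browser AFTER fetching data:
function render(shell, posts) {
  return shell.replace(
    '<div id="timeline"></div>',
    `<div id="timeline">${posts.map(p => `<p>${p}</p>`).join('')}</div>`
  );
}

const rendered = render(savedHtml, ['First post', 'Second post']);

console.log(savedHtml.includes('First post')); // false: the capture is empty
console.log(rendered.includes('First post'));  // true: content exists only after JS runs
```

A high-fidelity tool like Webrecorder sidesteps this by recording the network traffic a real browser generates while you interact with the page, rather than saving the initial HTML alone.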

Perricci has been working at the leading edge of web archiving for the better part of a decade. After positions at the New York Public Library and ArtStor, she served as a web archiving librarian at Columbia University and a digital archivist for Occupy Wall Street. She has taught web archiving for the Society of American Archivists and served as an associate director of Webrecorder, where she helped secure a million-dollar grant to produce and promote this innovative, browser-based tool used to create high-fidelity, interactive captures of websites.

Watch the entire video or choose an excerpt from the menu on this page.