(Roughly) Daily

Posts Tagged ‘Internet Archive’

“All that mankind has done, thought or been: it is lying as in magic preservation in the pages of books”*…

… But books (and their predecessors) are fragile, and need special archival care if they are to survive. That’s even truer, as Adrienne Bernhard explains in The Long Now Foundation’s newsletter, of digital data and documents…

The Dead Sea scrolls, made of parchment and papyrus, are still readable nearly two millennia after their creation — yet the expected shelf life of a DVD is about 100 years. Several of Andy Warhol’s doodles, created and stored on a Commodore Amiga computer in the 01980s, were forever stranded there in an obsolete format. During a data-migration in 02019, millions of songs, videos and photos were lost when MySpace — once the Internet’s leading social network — fell prey to an irreversible data loss.

A false sense of security persists surrounding digitized documents: because an infinite number of identical copies can be made of any original, most of us believe that our electronic files have an indefinite shelf life and unlimited retrieval opportunities. In fact, preserving the world’s online content is an increasing concern, particularly as file formats (and the hardware and software used to run them) become scarce, inaccessible, or antiquated, technologies evolve, and data decays. Without constant maintenance and management, most digital information will be lost in just a few decades. Our modern records are far from permanent.

Obstacles to data preservation are generally divided into three broad categories: hardware longevity (e.g., a hard drive that degrades and eventually fails); format accessibility (a 5 ¼ inch floppy disk formatted with a filesystem that can’t be read by a new laptop); and comprehensibility (a document with a long-abandoned file type that can’t be interpreted by any modern machine). The problem is compounded by encryption (data designed to be inaccessible) and abundance (deciding what among the vast human archive of stored data is actually worth preserving).

The looming threat of the so-called “Digital Dark Age”, accelerated by the extraordinary growth of an invisible commodity — data — suggests we have fallen from a golden age of preservation in which everything of value was saved. In fact, countless records of previous historical eras have all but disappeared. The first Dark Ages, shorthand for the period beginning with the fall of the Roman Empire and stretching into the Middle Ages (00500-01000 CE), weren’t actually characterized by intellectual and cultural emptiness but rather by a dearth of historical documentation produced during that era.

Even institutions built for the express purpose of information preservation have succumbed to the ravages of time, natural disaster or human conquest. The famous library of Alexandria, one of the most important repositories of knowledge in the ancient world, eventually faded into obscurity. Built in the fourth century B.C., the library flourished for some six centuries, an unparalleled center of intellectual pursuit. Alexandria’s archive was said to contain half a million papyrus scrolls — the largest collection of manuscripts in the ancient world — including works by Plato, Aristotle, Homer and Herodotus. By the fifth century A.D., however, the majority of its collections had been stolen or destroyed, and the library fell into disrepair.

Digital archives are no different. The durability of the web is far from guaranteed. Link rot, in which outdated links lead readers to dead content (or a cheeky dinosaur icon), sets in like a pestilence. Corporate data sets are often abandoned when a company folds, left to sit in proprietary formats that no one without the right combination of hardware, software, and encryption keys can access. Scientific data is a particularly thorny problem: unless it’s saved to a public repository accessible to other researchers, technical information essentially becomes unusable or lost. Beyond switching to analog alternatives, which have their own drawbacks, how might we secure our digital information so that it survives for generations? How can individuals, private corporations and public entities coordinate efforts to ensure that their data is saved in more resilient formats?…

Without maintenance, most digital information will be lost in just a few decades. How might we secure our data so that it survives for generations? “Shining a Light on the Digital Dark Age,” from @AdrienneEve and @longnow. Eminently worth reading in full.

Cf. also: “Very Long-Term Backup” by Kevin Kelly (@kevin2kelly).

* Thomas Carlyle

###

As we ponder preservation, we might recall that the #1 song in the U.S. and the U.K. (among other territories) was the Beatles’ “Help!” (their fourth of six #1 singles in a row on the American charts).


“If you want to understand today you have to search yesterday”*…

The redoubtable Brewster Kahle on the dangerous ephemerality of civil discourse in our digital times…

Many have now seen how, when someone deletes their Twitter account, their profile, their tweets, even their direct messages, disappear. According to the MIT Technology Review, around a million people have left so far, and all of this information has left the platform along with them. The mass exodus from Twitter and the accompanying loss of information, while concerning in its own right, shows something fundamental about the construction of our digital information ecosystem:  Information that was once readily available to you—that even seemed to belong to you—can disappear in a moment. 

Losing access to information of private importance is surely concerning, but the situation is more worrying when we consider the role that digital networks play in our world today. Governments make official pronouncements online. Politicians campaign online. Writers and artists find audiences for their work and a place for their voice. Protest movements find traction and fellow travelers. And, of course, Twitter was a primary publishing platform of a certain U.S. president.

If Twitter were to fail entirely, all of this information could disappear from their site in an instant. This is an important part of our history. Shouldn’t we be trying to preserve it?

I’ve been working on these kinds of questions, and building solutions to some of them, for a long time. That’s part of why, over 25 years ago, I founded the Internet Archive. You may have heard of our “Wayback Machine,” a free service anyone can use to view archived web pages from the mid-1990’s to the present. This archive of the web has been built in collaboration with over a thousand libraries around the world, and it holds hundreds of billions of archived webpages today–including those presidential tweets (and many others). In addition, we’ve been preserving all kinds of important cultural artifacts in digital form: books, television news, government records, early sound and film collections, and much more. 

The scale and scope of the Internet Archive can give it the appearance of something unique, but we are simply doing the work that libraries and archives have always done: Preserving and providing access to knowledge and cultural heritage…

While we have had many successes, it has not been easy… companies close, and change hands, and their commercial interests can cut against preservation and other important public benefits. Traditionally, libraries and archives filled this gap. But in the digital world, law and technology make their job increasingly difficult. For example, while a library could always simply buy a physical book on the open market in order to preserve it on their shelves, many publishers and platforms try to stop libraries from preserving information digitally. They may even use technical and legal measures to prevent libraries from doing so. While we strongly believe that fair use law enables libraries to perform traditional functions like preservation and lending in the digital environment, many publishers disagree, going so far as to sue libraries to stop them from doing so. 

We should not accept this state of affairs. Free societies need access to history, unaltered by changing corporate or political interests. This is the role that libraries have played and need to keep playing…

An important plea, eminently worth reading in full: “Our Digital History Is at Risk,” from @brewster_kahle @internetarchive.

* Pearl S. Buck

###

As we prioritize preservation, we might recall that it was on this date in 1940 that MGM released the first in what would be a long series of Tom and Jerry cartoons (though neither character was named in this inaugural outing, and one of the animators referred to them as Jasper and Jinx… Tom and Jerry were their monikers from the second cartoon, on). The basic premise was the one that would become familiar to audiences: “cat stalks and chases mouse in a frenzy of mayhem and slapstick violence.” Though studio executives were unimpressed, audiences loved the film, and it was nominated for an Academy Award.

Find Tom and Jerry at The Internet Archive.

Written by (Roughly) Daily

February 10, 2023 at 1:00 am

“I get slightly obsessive about working in archives because you don’t know what you’re going to find. In fact, you don’t know what you’re looking for until you find it.”*…

An update on that remarkable treasure, The Internet Archive…

Within the walls of a beautiful former church in San Francisco’s Richmond district [the facade of which is pictured above], racks of computer servers hum and blink with activity. They contain the internet. Well, a very large amount of it.

The Internet Archive, a non-profit, has been collecting web pages since 1996 for its famed and beloved Wayback Machine. In 1997, the collection amounted to 2 terabytes of data. Colossal back then, you could fit it on a $50 thumb drive now.

Today, the archive’s founder Brewster Kahle tells me, the project is on the brink of surpassing 100 petabytes – approximately 50,000 times larger than in 1997. It contains more than 700bn web pages.

The work isn’t getting any easier. Websites today are highly dynamic, changing with every refresh. Walled gardens like Facebook are a source of great frustration to Kahle, who worries that much of the political activity that has taken place on the platform could be lost to history if not properly captured. In the name of privacy and security, Facebook (and others) make scraping difficult. News organisations’ paywalls (such as the FT’s) are also “problematic”, Kahle says. News archiving used to be taken extremely seriously, but changes in ownership or even just a site redesign can mean disappearing content. The technology journalist Kara Swisher recently lamented that some of her early work at The Wall Street Journal has “gone poof”, after the paper declined to sell the material to her several years ago…

A quarter of a century after it began collecting web pages, the Internet Archive is adapting to new challenges: “The ever-expanding job of preserving the internet’s backpages” (gift article) from @DaveLeeFT in the @FinancialTimes.

* Antony Beevor

###

As we celebrate collection, we might recall that it was on this date in 2001 that the Polaroid Corporation– best known for its instant film and cameras– filed for bankruptcy. Its employment had peaked in 1978 at 21,000; its revenues, in 1991 at $3 billion.

Polaroid 80B Highlander instant camera made in the USA, circa 1959


Written by (Roughly) Daily

October 11, 2022 at 1:00 am

“I rather think that archives exist to keep things safe – but not secret”*…

Brewster Kahle, founder and head of The Internet Archive, couldn’t agree more, and for the last 25 years he’s put his energy, his money– his life– to work trying to make that happen…

In 1996, Kahle founded the Internet Archive, which stands alongside Wikipedia as one of the great not-for-profit knowledge-enhancing creations of modern digital technology. You may know it best for the Wayback Machine, its now quarter-century-old tool for deriving some sort of permanent record from the inherently transient medium of the web. (It’s collected 668 billion web pages so far.) But its ambitions extend far beyond that, creating a free-to-all library of 38 million books and documents, 14 million audio recordings, 7 million videos, and more…

That work has not been without controversy, but it’s an enormous public service — not least to journalists, who rely on it for reporting every day. (Not to mention the Wayback Machine is often the only place to find the first two decades of web-based journalism, most of which has been wiped away from its original URLs.)…

Joshua Benton (@jbenton) of @NiemanLab debriefs Brewster on the occasion of the Archive’s silver anniversary: “After 25 years, Brewster Kahle and the Internet Archive are still working to democratize knowledge.”

Amidst wonderfully illuminating reminiscences, Brewster goes right to the heart of the issue…

Corporations continue to control access to materials that are in the library, which is controlling preservation, and it’s killing us….

[The Archive and the movement of which it’s a part are] a radical experiment in radical sharing. I think the winner, the hero of the last 25 years, is the everyman. They’ve been the heroes. The institutions are the ones who haven’t adjusted. Large corporations have found this technology as a mechanism of becoming global monopolies. It’s been a boom time for monopolists.

* Kevin Young

###

As we love librarians, we might send carefully-curated birthday greetings to Frederick Baldwin Adams Jr.; he was born on this date in 1910. A bibliophile who was more a curator than an archivist, he was the director of the Pierpont Morgan Library in New York City from 1948 to 1969. His predecessor, Belle da Costa Greene, was responsible for organizing the results of Morgan’s rapacious collecting; Adams was responsible for broadening– and modernizing– that collection, adding works by Virginia Woolf, E. M. Forster, Willa Cather, Robert Frost, and E. A. Robinson, among many others, along with manuscripts and visual arts, and for enhancing the institution’s role as a research facility.

Adams was also an important collector in his own right.  He amassed two of the largest holdings of works by Thomas Hardy and Robert Frost, as well as one of the leading collections of writing by Karl Marx and left-wing Americana.

Adams


“Some folks look for answers, others look for fights”*…

Grateful Dead plays Red Rocks for the final time, August 13, 1987 [source]

Max Abelson takes a break from his (essential) coverage of money and power at Bloomberg News and Businessweek to appreciate the community that’s grown up around the Internet Archive’s Grateful Dead Archive (where one can find– among the over 15,000 concert recordings– not one, but two full takes of the show pictured above)…

On the Archive, the writing about the Dead’s live music often transcends the personal mode and approaches something closer to the galactic. Nothing brings out that cosmic style like “Dark Star,” a song that the band stretched from a three-minute studio single into its own solar system. Ginosega left a flight log for the same forty-three-minute 1973 version that played in the friend’s basement: “About 12 minutes in, Phil fires the engines and turns the ship out of orbit, until at 17 minutes we have arrived in the deepest, darkest part of the galaxy.” The trip isn’t half over. “Only at 21 minutes into the song do they actually start playing the song.” The post, which has a kind of sci-fi internal logic, describes interstellar wind and multicolored ooze, before, “at about 36 minutes, we start the return trip, passing through more familiar systems on our way back home.”

One of the magical things about how high the Dead flew is that they managed to do it without, say, Sly Stone’s rhythm, Joni Mitchell’s poetry, or Brian Wilson’s voice. The allure of this band—whatever it is that keeps sparking so much cosmic wonder and nostalgia—is foggy and mysterious. Paumgarten, in his New Yorker piece, identified a sprawling combination of factors, including Garcia’s soulful charisma and Appalachian gloom, the band’s 26,000-watt sound system, an ethos of group improvisation, and the “particular note of decay” in each cassette swapped from hand to hand. You can think about the Archive as not just the best tape rack of all, but as a collection of thousands of swings at saying the inexplicable. A user named Scottie78 was so moved by a half-hour version of “Dark Star” at the Spectrum in Philadelphia in 1972 that he not only came close to leaving a bullet point for each minute, but more or less created an identification system to differentiate the micro-micro-genres he heard, from “Space Jazz” and “Acid Jazz” to “Acid Jazzgrass.” It’s embarrassing and magnetic at the same time.

Others tip over from starry-eyed to freaked out. “So cacophonous, atonal and scary that it could potentially traumatize animals when played loud,” Phleshy said in 2004 about a version from Rotterdam in 1972. “If this explanation sounds stupid in words, then listen to the last half-hour of ‘Dark Star’ in a darkened room and see if you feel remotely secure.”

The line between the personal and astronomical is thin. Boboboy’s recollection of the 1989 show at JFK Stadium is what Didion might have described if she had witnessed more people sway: “I clearly remember seeing the swirling masses of thousands on the floor from my perch all the way back.” The dancers below looked like birds up above, “a flock of starlings cruising the sky, but in slow motion.”

Some of the writing aims even higher. “When you want to know what it is like being in heaven, cue up the second set,” Seedanrun wrote about the band’s beloved 1977 show at Cornell. “When you want to feel what it is like to be face to face with God, dim the lights and really focus on the ‘Morning Dew.’”

The glory of that show, performed inside the university’s Barton Hall on a snowy night in May, is perhaps the nearest the Dead Archives come to consensus. The thought of sullying it with a rating scale offended a user named GruUbic: “If this is five stars, is heaven a 4.5?” In 2004, BillDP went further, calling the show “the single best live performance I have ever heard from any group at any time.” His authoritativeness is only outdone by the dumbstruck. “Mere words cannot do justice,” Grateful Hillbilly posted in 2015. “Words like amazing and unbelievable and incomparable don’t capture the immensity of awe.”

[Brewster] Kahle, the Internet Archive’s founder, tells me that he wishes more of the web was shaped like the Dead Archive. “What you’re looking at,” he said, “is from an era of the Internet that I think is best typified by what Tim Berners-Lee called ‘pages.’” Today, he said, instead, what dominates is the “feed.” (“Horrible word,” he added.) Facebook and Twitter scroll by endlessly, unaccountably, and unpleasantly, but “it wasn’t always that way, and it was a choice.” Each Dead show, he said, is “something you can anchor to, it’s something you can revolve around.” He went on: “By making things endure, we can have people cherish them, use them, and invest in them. So the writing is fundamentally different. I think we should go back to it—or forward to it.”…

The way the internet was… and should be? “In the Dead Archives,” from @maxabelson.

* The Grateful Dead, “Playing In The Band” (written by Bob Weir, Robert Hunter, and Mickey Hart)

###

As we go hear Uncle John’s Band, we might send bluesy birthday greetings to Ronald Charles McKernan; he was born on this date in 1945. Better known by his stage name, “Pigpen,” he was a founding member of The Warlocks… which became the Grateful Dead. He was the band’s original frontman, playing harmonica and electric organ; but Jerry Garcia’s and Phil Lesh’s influences on the band grew increasingly strong as they embraced psychedelic rock. Pigpen’s contributions receded to vocals, harmonica, and percussion (though he continued to be a frontman in concert for some numbers, including his interpretations of Bobby Bland’s “Turn On Your Love Light” and the Rascals’ “Good Lovin'”).

Pigpen was unique among his bandmates in preferring alcohol to psychedelics, and sadly succumbed to alcoholism– from complications of which he died in 1973.

