(Roughly) Daily

Posts Tagged ‘Internet Archive’

“I get slightly obsessive about working in archives because you don’t know what you’re going to find. In fact, you don’t know what you’re looking for until you find it.”*…

An update on that remarkable treasure, The Internet Archive

Within the walls of a beautiful former church in San Francisco’s Richmond district [the facade of which is pictured above], racks of computer servers hum and blink with activity. They contain the internet. Well, a very large amount of it.

The Internet Archive, a non-profit, has been collecting web pages since 1996 for its famed and beloved Wayback Machine. In 1997, the collection amounted to 2 terabytes of data. Colossal back then, you could fit it on a $50 thumb drive now.

Today, the archive’s founder Brewster Kahle tells me, the project is on the brink of surpassing 100 petabytes – approximately 50,000 times larger than in 1997. It contains more than 700bn web pages.

The work isn’t getting any easier. Websites today are highly dynamic, changing with every refresh. Walled gardens like Facebook are a source of great frustration to Kahle, who worries that much of the political activity that has taken place on the platform could be lost to history if not properly captured. In the name of privacy and security, Facebook (and others) make scraping difficult. News organisations’ paywalls (such as the FT’s) are also “problematic”, Kahle says. News archiving used to be taken extremely seriously, but changes in ownership or even just a site redesign can mean disappearing content. The technology journalist Kara Swisher recently lamented that some of her early work at The Wall Street Journal has “gone poof”, after the paper declined to sell the material to her several years ago…

A quarter of a century after it began collecting web pages, the Internet Archive is adapting to new challenges: “The ever-expanding job of preserving the internet’s backpages” (gift article) from @DaveLeeFT in the @FinancialTimes.

* Antony Beevor

###

As we celebrate collection, we might recall that it was on this date in 2001 that the Polaroid Corporation– best known for its instant film and cameras– filed for bankruptcy. Its employment had peaked in 1978 at 21,000; its revenues, in 1991 at $3 billion.

Polaroid 80B Highlander instant camera made in the USA, circa 1959

source

Written by (Roughly) Daily

October 11, 2022 at 1:00 am

“I rather think that archives exist to keep things safe – but not secret”*…

Brewster Kahle, founder and head of The Internet Archive, couldn’t agree more, and for the last 25 years he’s put his energy, his money– his life– to work trying to make that happen…

In 1996, Kahle founded the Internet Archive, which stands alongside Wikipedia as one of the great not-for-profit knowledge-enhancing creations of modern digital technology. You may know it best for the Wayback Machine, its now quarter-century-old tool for deriving some sort of permanent record from the inherently transient medium of the web. (It’s collected 668 billion web pages so far.) But its ambitions extend far beyond that, creating a free-to-all library of 38 million books and documents, 14 million audio recordings, 7 million videos, and more…

That work has not been without controversy, but it’s an enormous public service — not least to journalists, who rely on it for reporting every day. (Not to mention the Wayback Machine is often the only place to find the first two decades of web-based journalism, most of which has been wiped away from its original URLs.)…

Joshua Benton (@jbenton) of @NiemanLab debriefs Brewster on the occasion of the Archive’s silver anniversary: “After 25 years, Brewster Kahle and the Internet Archive are still working to democratize knowledge.”

Amidst wonderfully illuminating reminiscences, Brewster goes right to the heart of the issue…

Corporations continue to control access to materials that are in the library, which is controlling preservation, and it’s killing us….

[The Archive and the movement of which it’s a part are] a radical experiment in radical sharing. I think the winner, the hero of the last 25 years, is the everyman. They’ve been the heroes. The institutions are the ones who haven’t adjusted. Large corporations have found this technology as a mechanism of becoming global monopolies. It’s been a boom time for monopolists.

* Kevin Young

###

As we love librarians, we might send carefully curated birthday greetings to Frederick Baldwin Adams Jr.; he was born on this date in 1910. A bibliophile who was more a curator than an archivist, he was the director of the Pierpont Morgan Library in New York City from 1948 to 1969. His predecessor, Belle da Costa Greene, was responsible for organizing the results of Morgan’s rapacious collecting; Adams was responsible for broadening– and modernizing– that collection, adding works by Virginia Woolf, E. M. Forster, Willa Cather, Robert Frost, and E. A. Robinson, among many others, along with manuscripts and visual arts, and for enhancing the institution’s role as a research facility.

Adams was also an important collector in his own right.  He amassed two of the largest holdings of works by Thomas Hardy and Robert Frost, as well as one of the leading collections of writing by Karl Marx and left-wing Americana.

Adams

source

“Some folks look for answers, others look for fights”*…

Grateful Dead plays Red Rocks for the final time, August 13, 1987 [source]

Max Abelson takes a break from his (essential) coverage of money and power at Bloomberg News and Businessweek to appreciate the community that’s grown up around the Internet Archive’s Grateful Dead Archive (where one can find– among the over 15,000 concert recordings– not one, but two full takes of the show pictured above)…

On the Archive, the writing about the Dead’s live music often transcends the personal mode and approaches something closer to the galactic. Nothing brings out that cosmic style like “Dark Star,” a song that the band stretched from a three-minute studio single into its own solar system. Ginosega left a flight log for the same forty-three-minute 1973 version that played in the friend’s basement: “About 12 minutes in, Phil fires the engines and turns the ship out of orbit, until at 17 minutes we have arrived in the deepest, darkest part of the galaxy.” The trip isn’t half over. “Only at 21 minutes into the song do they actually start playing the song.” The post, which has a kind of sci-fi internal logic, describes interstellar wind and multicolored ooze, before, “at about 36 minutes, we start the return trip, passing through more familiar systems on our way back home.”

One of the magical things about how high the Dead flew is that they managed to do it without, say, Sly Stone’s rhythm, Joni Mitchell’s poetry, or Brian Wilson’s voice. The allure of this band—whatever it is that keeps sparking so much cosmic wonder and nostalgia—is foggy and mysterious. Paumgarten, in his New Yorker piece, identified a sprawling combination of factors, including Garcia’s soulful charisma and Appalachian gloom, the band’s 26,000-watt sound system, an ethos of group improvisation, and the “particular note of decay” in each cassette swapped from hand to hand. You can think about the Archive as not just the best tape rack of all, but as a collection of thousands of swings at saying the inexplicable. A user named Scottie78 was so moved by a half-hour version of “Dark Star” at the Spectrum in Philadelphia in 1972 that he not only came close to leaving a bullet point for each minute, but more or less created an identification system to differentiate the micro-micro-genres he heard, from “Space Jazz” and “Acid Jazz” to “Acid Jazzgrass.” It’s embarrassing and magnetic at the same time.

Others tip over from starry-eyed to freaked out. “So cacophonous, atonal and scary that it could potentially traumatize animals when played loud,” Phleshy said in 2004 about a version from Rotterdam in 1972. “If this explanation sounds stupid in words, then listen to the last half-hour of ‘Dark Star’ in a darkened room and see if you feel remotely secure.”

The line between the personal and astronomical is thin. Boboboy’s recollection of the 1989 show at JFK Stadium is what Didion might have described if she had witnessed more people sway: “I clearly remember seeing the swirling masses of thousands on the floor from my perch all the way back.” The dancers below looked like birds up above, “a flock of starlings cruising the sky, but in slow motion.”

Some of the writing aims even higher. “When you want to know what it is like being in heaven, cue up the second set,” Seedanrun wrote about the band’s beloved 1977 show at Cornell. “When you want to feel what it is like to be face to face with God, dim the lights and really focus on the ‘Morning Dew.’”

The glory of that show, performed inside the university’s Barton Hall on a snowy night in May, is perhaps the nearest the Dead Archives come to consensus. The thought of sullying it with a rating scale offended a user named GruUbic: “If this is five stars, is heaven a 4.5?” In 2004, BillDP went further, calling the show “the single best live performance I have ever heard from any group at any time.” His authoritativeness is only outdone by the dumbstruck. “Mere words cannot do justice,” Grateful Hillbilly posted in 2015. “Words like amazing and unbelievable and incomparable don’t capture the immensity of awe.”

[Brewster] Kahle, the Internet Archive’s founder, tells me that he wishes more of the web was shaped like the Dead Archive. “What you’re looking at,” he said, “is from an era of the Internet that I think is best typified by what Tim Berners-Lee called ‘pages.’” Today, he said, instead, what dominates is the “feed.” (“Horrible word,” he added.) Facebook and Twitter scroll by endlessly, unaccountably, and unpleasantly, but “it wasn’t always that way, and it was a choice.” Each Dead show, he said, is “something you can anchor to, it’s something you can revolve around.” He went on: “By making things endure, we can have people cherish them, use them, and invest in them. So the writing is fundamentally different. I think we should go back to it—or forward to it.”…

The way the internet was… and should be? “In the Dead Archives,” from @maxabelson.

* The Grateful Dead, “Playing In The Band” (written by Bob Weir, Robert Hunter, and Mickey Hart)

###

As we go hear Uncle John’s Band, we might send bluesy birthday greetings to Ronald Charles McKernan; he was born on this date in 1945. Better known by his stage name, “Pigpen,” he was a founding member of The Warlocks… which became the Grateful Dead. He was the band’s original frontman, playing harmonica and electric organ; but Jerry Garcia’s and Phil Lesh’s influences on the band grew increasingly strong as they embraced psychedelic rock. Pigpen’s contributions receded to vocals, harmonica, and percussion (though he continued to be a frontman in concert for some numbers, including his interpretations of Bobby Bland’s “Turn On Your Love Light” and the Rascals’ “Good Lovin’”).

Pigpen was unique among his bandmates in preferring alcohol to psychedelics, and sadly succumbed to alcoholism– from complications of which he died in 1973.

source

“Doing research on the Web is like using a library assembled piecemeal by pack rats and vandalized nightly”*…

But surely, argues Jonathan Zittrain, it shouldn’t be that way…

Sixty years ago the futurist Arthur C. Clarke observed that any sufficiently advanced technology is indistinguishable from magic. The internet—how we both communicate with one another and together preserve the intellectual products of human civilization—fits Clarke’s observation well. In Steve Jobs’s words, “it just works,” as readily as clicking, tapping, or speaking. And every bit as much aligned with the vicissitudes of magic, when the internet doesn’t work, the reasons are typically so arcane that explanations for it are about as useful as trying to pick apart a failed spell.

Underpinning our vast and simple-seeming digital networks are technologies that, if they hadn’t already been invented, probably wouldn’t unfold the same way again. They are artifacts of a very particular circumstance, and it’s unlikely that in an alternate timeline they would have been designed the same way.

The internet’s distinct architecture arose from a distinct constraint and a distinct freedom: First, its academically minded designers didn’t have or expect to raise massive amounts of capital to build the network; and second, they didn’t want or expect to make money from their invention.

The internet’s framers thus had no money to simply roll out a uniform centralized network the way that, for example, FedEx metabolized a capital outlay of tens of millions of dollars to deploy liveried planes, trucks, people, and drop-off boxes, creating a single point-to-point delivery system. Instead, they settled on the equivalent of rules for how to bolt existing networks together.

Rather than a single centralized network modeled after the legacy telephone system, operated by a government or a few massive utilities, the internet was designed to allow any device anywhere to interoperate with any other device, allowing any provider able to bring whatever networking capacity it had to the growing party. And because the network’s creators did not mean to monetize, much less monopolize, any of it, the key was for desirable content to be provided naturally by the network’s users, some of whom would act as content producers or hosts, setting up watering holes for others to frequent.

Unlike the briefly ascendant proprietary networks such as CompuServe, AOL, and Prodigy, content and network would be separated. Indeed, the internet had and has no main menu, no CEO, no public stock offering, no formal organization at all. There are only engineers who meet every so often to refine its suggested communications protocols that hardware and software makers, and network builders, are then free to take up as they please.

So the internet was a recipe for mortar, with an invitation for anyone, and everyone, to bring their own bricks. Tim Berners-Lee took up the invite and invented the protocols for the World Wide Web, an application to run on the internet. If your computer spoke “web” by running a browser, then it could speak with servers that also spoke web, naturally enough known as websites. Pages on sites could contain links to all sorts of things that would, by definition, be but a click away, and might in practice be found at servers anywhere else in the world, hosted by people or organizations not only not affiliated with the linking webpage, but entirely unaware of its existence. And webpages themselves might be assembled from multiple sources before they displayed as a single unit, facilitating the rise of ad networks that could be called on by websites to insert surveillance beacons and ads on the fly, as pages were pulled together at the moment someone sought to view them.

And like the internet’s own designers, Berners-Lee gave away his protocols to the world for free—enabling a design that omitted any form of centralized management or control, since there was no usage to track by a World Wide Web, Inc., for the purposes of billing. The web, like the internet, is a collective hallucination, a set of independent efforts united by common technological protocols to appear as a seamless, magical whole.

This absence of central control, or even easy central monitoring, has long been celebrated as an instrument of grassroots democracy and freedom. It’s not trivial to censor a network as organic and decentralized as the internet. But more recently, these features have been understood to facilitate vectors for individual harassment and societal destabilization, with no easy gating points through which to remove or label malicious work not under the umbrellas of the major social-media platforms, or to quickly identify their sources. While both assessments have power to them, they each gloss over a key feature of the distributed web and internet: Their designs naturally create gaps of responsibility for maintaining valuable content that others rely on. Links work seamlessly until they don’t. And as tangible counterparts to online work fade, these gaps represent actual holes in humanity’s knowledge…

The glue that holds humanity’s knowledge together is coming undone: “The Internet Is Rotting.” @zittrain explains what we can do to heal it.

(Your correspondent seconds his call to support the critically important work of The Internet Archive and the Harvard Library Innovation Lab, along with the other initiatives he outlines.)

* Roger Ebert

###

As we protect our past for the future, we might recall that it was on this date in 1937 that Hormel introduced Spam. It was the company’s attempt to increase sales of pork shoulder, not at the time a very popular cut. While there are numerous speculations as to the “meaning of the name” (from a contraction of “spiced ham” to “Scientifically Processed Animal Matter”), its true genesis is known to only a small circle of former Hormel Foods executives.

As a result of the difficulty of delivering fresh meat to the front during World War II, Spam became a ubiquitous part of the U.S. soldier’s diet. It became variously referred to as “ham that didn’t pass its physical,” “meatloaf without basic training,” and “Special Army Meat.” Over 150 million pounds of Spam were purchased by the military before the war’s end. During the war and the occupations that followed, Spam was introduced into Guam, Hawaii, Okinawa, the Philippines, and other islands in the Pacific. Immediately absorbed into native diets, it has become a unique part of the history and effects of U.S. influence in the Pacific islands.

source

“Maps codify the miracle of existence”*…

… and almost always, something more…

“Now when I was a little chap I had a passion for maps”, says the seafaring raconteur Charles Marlow in Joseph Conrad’s Heart of Darkness (1899). “At that time there were many blank spaces on the earth, and when I saw one that looked particularly inviting on a map (but they all look that) I would put my finger on it and say, ‘When I grow up I will go there.’” Of course, these “blank spaces” were anything but. The no-man’s-lands that colonial explorers like Marlow found most inviting (the Congo River basin, Tasmania, the Andaman Islands) were, in fact, richly populated, and faced devastating consequences in the name of imperial expansion.

In the same troublesome vein as Marlow, Edward Quin’s Historical Atlas painted cartographic knowledge as a candle coruscating against the void of ignorance, represented in his unique vision by a broiling mass of black cloud. Each map represents the bounds of geographical learning at a particular point in history, from a specific civilizational perspective, beginning with Eden, circa “B.C. 2348”. In the next map titled “B.C. 1491. The Exodus of the Israelites”, Armenia, Assyria, Arabia, Aram, and Egypt form an island of light, pushing back the black clouds of unknowing. As history progresses – through various Roman dynasties, the reign of Charlemagne, and the Crusades – the foul weather retreats further. In the map titled “A.D. 1498. The Discovery of America”, the transatlantic exploits of the so-called Age of Discovery force Quin to employ a shift in scale – the luminescence of his globe now extends to include Africa and most of Asia, but North America hides behind cumulus clouds, with its “unnamed” eastern shores peeking out from beneath a storm of oblivion. In the Atlas’ last map, we find a world without darkness, not a trace of cloud. Instead, unexplored territories stretch out in the pale brown of vellum parchment, demarcating “barbarous and uncivilized countries”, as if the hinterlands of Africa and Canada are awaiting colonial inscription.

Looking back from a contemporary vantage, the Historical Atlas remains memorable for what is not shown. Quin’s cartography inadvertently visualizes the ideology of empire: a geographic chauvinism that had little respect for the knowledge of those beyond imperial borders. And aside from depicting the reach of Kublai Khan, his focus remains narrowly European and Judeo-Christian. While Quin strives for accuracy, he admits to programmatic omission. “The colours we have used being generally meant to point out and distinguish one state or empire from another. . . were obviously inapplicable to deserts peopled by tribes having no settled form of government, or political existence, or known territorial limits”. Instead of representing these groups, Quin, like his clouds, has erased them from view.

Clouds of Unknowing: Edward Quin’s (1830) Historical Atlas. From the David Rumsey Map Collection (via the Internet Archive), where you can view it all. Via the invaluable Public Domain Review (@PublicDomainRev).

* Nicholas Crane, Mercator: The Man Who Mapped the Planet

###

As we find our place, we might recall that it was on this date in 1971 that an epic mapping expedition began: NASA launched the Mariner 9 space probe, the first spacecraft to orbit Mars (or any other planet). Mariner 9 was designed to continue the atmospheric studies begun by Mariner 6 and 7, and to map over 70% of the Martian surface from the lowest altitude (930 mi) and at the highest resolutions (from 1,100 to 110 yards per pixel) of any Mars mission up to that point. After a spate of dust storms on the planet for several months following its arrival, the orbiter managed to send back clear pictures of the surface. Mariner 9 successfully returned 7,329 images over the course of its mission, which concluded in October 1972.

source
