(Roughly) Daily

Posts Tagged ‘data’

“A better world won’t come about simply because we use data; data has its dark underside.”*…

 

Data

 

Data isn’t the new oil, it’s the new CO2. It’s a common trope in the data/tech field to say that “data is the new oil”. The basic idea being – it’s a new resource that is being extracted, it is valuable, and is a raw product that fuels other industries. But it also implies that data is inherently valuable in and of itself and that “my data” is valuable, a resource that I really should tap into.

In reality, we are more impacted by other people’s data (with whom we are grouped) than we are by data about us. As I have written in the MIT Technology Review – “even if you deny consent to ‘your’ data being used, an organisation can use data about other people to make statistical extrapolations that affect you.” We are bound by other people’s consent. Our own consent (or lack thereof) is becoming increasingly irrelevant. We won’t solve the societal problems pervasive data surveillance is causing by rushing through online consent forms. If you see data as CO2, it becomes clearer that its impacts are societal, not solely individual. My neighbour’s car emissions, the emissions from a factory on a different continent, impact me more than my own emissions or lack thereof. This isn’t to abdicate individual responsibility or harm. It’s adding a new lens that we too often miss entirely.

We should not endlessly be defending arguments along the lines that “people choose to willingly give up their freedom in exchange for free stuff online”. The argument is flawed for two reasons. First, the reason that is usually given – people have no choice but to consent in order to access the service, so consent is manufactured. We are not exercising choice in providing data but are rather resigned to the fact that we have no choice in the matter.

The second, less well-known but just as powerful, argument is that we are not only bound by other people’s data; we are bound by other people’s consent. In an era of machine learning-driven group profiling, this effectively renders my denial of consent meaningless. Even if I withhold consent, say I refuse to use Facebook or Twitter or Amazon, the fact that everyone around me has joined means there are just as many data points about me to target and surveil. The issue is systemic; it is not one where a lone individual can make a choice and opt out of the system. We perpetuate this myth by talking about data as our own individual “oil”, ready to sell to the highest bidder. In reality I have little control over this supposed resource, which acts more like an atmospheric pollutant, impacting me and others in myriad indirect ways. There are more relations – direct and indirect – between data related to me, data about me, and data inferred about me via others than I can possibly imagine, let alone control with the tools we have at our disposal today.

Because of this, we need a social, systemic approach to deal with our data emissions. An environmental approach to data rights as I’ve argued previously. But first let’s all admit that the line of inquiry defending pervasive surveillance in the name of “individual freedom” and individual consent gets us nowhere closer to understanding the threats we are facing.

Martin Tisné argues for an “environmental” approach to data rights: “Data isn’t the new oil, it’s the new CO2.”
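To make the “statistical extrapolations” point concrete, here is a deliberately tiny sketch in Python. Everything in it is invented – the neighbourhoods, the income bands, the people – and it stands in for no particular company’s system; it simply illustrates how a pattern learned from consenting users can be projected onto someone who never shared anything.

```python
# Toy illustration (entirely made-up data): how information volunteered by
# others can be used to infer something about a person who never consented.
from collections import defaultdict

# Profiles volunteered by consenting users: an observable trait (the
# neighbourhood they list publicly) and a sensitive attribute they disclosed.
consenting_users = [
    {"neighbourhood": "riverside", "income_band": "high"},
    {"neighbourhood": "riverside", "income_band": "high"},
    {"neighbourhood": "riverside", "income_band": "middle"},
    {"neighbourhood": "old_town",  "income_band": "low"},
    {"neighbourhood": "old_town",  "income_band": "middle"},
]

# Tally income bands per neighbourhood, using the consenting group only.
counts = defaultdict(lambda: defaultdict(int))
for user in consenting_users:
    counts[user["neighbourhood"]][user["income_band"]] += 1

def infer_income_band(neighbourhood):
    """Guess the most likely income band for *anyone* in this neighbourhood."""
    bands = counts[neighbourhood]
    band, n = max(bands.items(), key=lambda kv: kv[1])
    return band, n / sum(bands.values())

# A person who never signed up for anything -- but their neighbourhood is
# knowable from a delivery address, a phone ping, a friend's tagged photo.
guess, share = infer_income_band("riverside")
print(f"Non-consenting riverside resident: likely '{guess}' income "
      f"({share:.0%} of the consenting group matched)")
```

The prediction about the opt-out resident is driven entirely by what the neighbours disclosed – which is the sense in which consent has become collective rather than individual.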

Lest one think that we couldn’t/shouldn’t have seen this (and related issues like over-dependence on algorithms, the digital divide, et al.) coming, see also Paul Baran’s prescient 1968 essay, “On the Future Computer Era,” one of the last pieces he did at RAND, before co-leading the spin-off of The Institute for the Future.

* Mike Loukides, Ethics and Data Science

###

As we ponder privacy, we might recall that it was on this date in 1981 that IBM released model number 5150– AKA the IBM PC– the original version and progenitor of the IBM PC compatible hardware platform. Since the machine was based on an open architecture, third-party suppliers of peripheral devices, expansion cards, and software proliferated within a short time of its introduction; the IBM PC’s influence on the personal computer market was substantial in standardizing a platform for personal computers (and in creating a market for Microsoft’s operating systems– first PC DOS, then Windows– which ran on that platform). “IBM compatible” became an important criterion for sales growth; after the 1980s, only the Apple Macintosh family kept a significant share of the microcomputer market without compatibility with the IBM personal computer.

IBM PC source

 

Written by LW

August 12, 2019 at 1:01 am

“Big Data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it”*…

 


 

You’ve probably heard of kilobytes, megabytes, gigabytes, or even terabytes.

These data units are common everyday amounts that the average person may run into. Units this size may be big enough to quantify the amount of data sent in an email attachment, or the data stored on a hard drive, for example.

In the coming years, however, these common units will begin to seem more quaint – that’s because the entire digital universe is expected to reach 44 zettabytes by 2020.

If this number is correct, it will mean there are 40 times more bytes than there are stars in the observable universe…
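The arithmetic is easy to check. A rough back-of-the-envelope version in Python – note that the star count is an assumption on my part (published estimates for the observable universe span several orders of magnitude, and the “40 times” figure only works near the low end):

```python
# Back-of-the-envelope check of the "40 times more bytes than stars" claim.
digital_universe_bytes = 44 * 10**21   # 44 zettabytes; 1 ZB = 10**21 bytes
stars_estimate = 1e21                  # assumed low-end estimate of star count

ratio = digital_universe_bytes / stars_estimate
print(f"{digital_universe_bytes:.2e} bytes vs {stars_estimate:.0e} stars "
      f"-> roughly {ratio:.0f}x more bytes than stars")
```

Take a higher star estimate (10^23, say) and the comparison flips – the headline depends entirely on which astronomy figure you borrow.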

The stuff of dreams, the stuff of nightmares: “How Much Data is Generated Each Day?”

* Dan Ariely

###

As we revel in really, really big numbers, we might spare a thought for Edgar Frank “Ted” Codd; he died on this date in 2003. A distinguished computer scientist who did important work on cellular automata, he is best remembered as the father of computer databases– as the person who laid the foundation for relational databases, for storing and retrieving information in computer records.

source

 

Written by LW

April 18, 2019 at 1:01 am

“I’ll let you be in my dreams if I can be in yours”*…

 

Lyrics

 

From Glenn McDonald (in his capacity as Spotify’s genre taxonomist– or as he puts it, “mechanic of the spiritual compasses of erratic discovery robots that run on love”)

This is a mapping of genres to words, and words to genres, using words that are used distinctively in the titles of songs. A genre’s words are ranked by how disproportionately they appear in that genre’s songs’ titles compared to all songs. A word’s genres are ranked by the position of that word in each genre’s word list. 1525 genres and 4712 words qualify.
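A minimal sketch of that kind of ranking, in Python, with invented genres and titles. The precise statistic behind “Genres in Their Own Words” isn’t spelled out above, so a simple frequency ratio – a word’s share of a genre’s titles divided by its share of all titles – stands in for it:

```python
# Rank words by how disproportionately they appear in one genre's song
# titles versus all titles (toy data, stand-in statistic).
from collections import Counter

songs = [
    ("black metal", "frost upon the throne"),
    ("black metal", "winter of the old gods"),
    ("tropical house", "summer nights on the beach"),
    ("tropical house", "beach sunrise"),
    ("outlaw country", "whiskey and old trucks"),
    ("outlaw country", "whiskey in the rain"),
]

def tokenize(title):
    return title.lower().split()

# Word frequencies overall and per genre.
all_words = Counter(w for _, t in songs for w in tokenize(t))
genre_words = {}
for genre, title in songs:
    genre_words.setdefault(genre, Counter()).update(tokenize(title))

total = sum(all_words.values())
for genre, counter in genre_words.items():
    genre_total = sum(counter.values())
    # A word's share of this genre's titles relative to its share of all titles.
    scored = {w: (c / genre_total) / (all_words[w] / total)
              for w, c in counter.items()}
    top = sorted(scored, key=scored.get, reverse=True)[:3]
    print(genre, "->", top)
```

Even on this toy data the ratio surfaces words like “frost” or “whiskey” over connective tissue like “the” – the same intuition, if not the same formula, as the mapping above.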

Visit “Genres in Their Own Words.” And while you’re there, explore the genre map and the other nifty resources at Glenn’s site, Every Noise at Once.

* Bob Dylan

###

As we slip on the headphones, we might spare a thought for Sir George Henry Martin; he died on this date in 2016. A record producer, arranger, composer, conductor, audio engineer, and musician, Martin began his career as a producer of comedy and novelty records in the early 1950s, working with Peter Sellers, Spike Milligan, and Bernard Cribbins, among others. In 1962, while working at EMI/Parlophone, Martin was so impressed by Brian Epstein’s enthusiasm that he agreed to record the Beatles before seeing or hearing them (and despite the fact that they’d been turned down by Decca).

Martin went on to produce 23 number ones on the Billboard Hot 100 chart, 19 of which were by The Beatles. Indeed, Paul McCartney referred to Martin as “the fifth Beatle.” He also produced chart-topping hits for McCartney (“Say Say Say” with Michael Jackson and “Ebony and Ivory” with Stevie Wonder), Elton John (“Candle in the Wind”) and America (“Sister Golden Hair”).

220px-Beatles_and_George_Martin_in_studio_1966

George Harrison, Paul McCartney, George Martin, and John Lennon in the studio in 1966

 

Written by LW

March 8, 2019 at 1:01 am

“Induction for deduction, with a view to construction”*…

 

Mushroom cloud from the world’s first successful hydrogen bomb test, Nov. 1, 1952

At RAND in 1954, Armen A. Alchian conducted the world’s first event study to infer the fissile fuel material used in the manufacturing of the newly-developed hydrogen bomb. Successfully identifying lithium as the fissile fuel using only publicly available financial data, the paper was seen as a threat to national security and was immediately confiscated and destroyed…

How a bench researcher used publicly-available market data to unlock the secret of the H Bomb: “The Stock Market Speaks: How Dr. Alchian Learned to Build the Bomb” (pdf).
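For anyone unfamiliar with the term, an “event study” asks which securities moved abnormally around a known date. The sketch below shows only the bare mechanics, with hypothetical tickers and invented prices – it is not Alchian’s data or his actual procedure, and a real study would benchmark each stock against its expected (market-model) return rather than raw price changes:

```python
# Bare-bones event-study mechanics with synthetic data: which candidate
# supplier's stock moved most after the event date? (Invented tickers/prices.)
event_index = 3  # position of the event date in each daily price series

prices = {
    "LITH (lithium)":   [10.0, 10.1, 10.0, 10.2, 11.5, 12.3, 12.8],
    "BERY (beryllium)": [20.0, 20.2, 20.1, 20.0, 20.3, 20.1, 20.4],
    "BORX (boron)":     [15.0, 15.1, 14.9, 15.0, 15.2, 15.1, 15.0],
}

def cumulative_return(series, start, end):
    """Total return between two positions in a price series."""
    return series[end] / series[start] - 1.0

# The supplier with the standout post-event return is the natural suspect.
for name, series in prices.items():
    post_event = cumulative_return(series, event_index, len(series) - 1)
    print(f"{name}: {post_event:+.1%} since the event")
```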

* Auguste Comte (attributed by John Arthur Thomson in a quote at the heading of the chapter “Scientific Method” in his Introduction to Science)

###

As we comb the columns, we might recall that it was on this date in 1883 that the S.S. Daphne sank moments after her launching at the shipyard of Alexander Stephen and Sons in Glasgow.  The 500-ton steamer went down with 200 men on board– all of them working to finish her before the shipyard closed for the Glasgow Fair.  Only 70 were saved.

 source

 

Written by LW

July 3, 2017 at 1:01 am

“Representation plus interpretation to develop an idea”*…

 

William Playfair’s trade-balance time-series chart, published in his Commercial and Political Atlas, 1786

We’ve celebrated before the formative contributions of Florence Nightingale to data visualization; as noted then, she was building on the earlier work of William Playfair. But as Playfair was pioneering new ways to communicate complex data, he was himself building on prior efforts…

The idea of visualizing data is old: After all, that’s what a map is—a representation of geographic information—and we’ve had maps for about 8,000 years. But it was rare to graph anything other than geography. Only a few examples exist: Around the 11th century, a now-anonymous scribe created a chart of how the planets moved through the sky. By the 18th century, scientists were warming to the idea of arranging knowledge visually. The British polymath Joseph Priestley produced a “Chart of Biography,” plotting the lives of about 2,000 historical figures on a timeline. A picture, he argued, conveyed the information “with more exactness, and in much less time, than it [would take] by reading.”

Still, data visualization was rare because data was rare. That began to change rapidly in the early 19th century, because countries began to collect—and publish—reams of information about their weather, economic activity and population. “For the first time, you could deal with important social issues with hard facts, if you could find a way to analyze it,” says Michael Friendly, a professor of psychology at York University who studies the history of data visualization. “The age of data really began.”

An early innovator was the Scottish inventor and economist William Playfair. As a teenager he apprenticed to James Watt, the Scottish inventor who perfected the steam engine. Playfair was tasked with drawing up patents, which required him to develop excellent drafting and picture-drawing skills. After he left Watt’s lab, Playfair became interested in economics and convinced that he could use his facility for illustration to make data come alive.

“An average political economist would have certainly been able to produce a table for publication, but not necessarily a graph,” notes Ian Spence, a psychologist at the University of Toronto who’s writing a biography of Playfair. Playfair, who understood both data and art, was perfectly positioned to create this new discipline…

“The Surprising History of the Infographic.”

* Francesco Franchi, defining infographics

###

As we make it clear, we might note that today begins National Canned Luncheon Meat Week, “celebrated” the first week of July each year.


 

“To clarify, ADD data”*…

 

1939 World’s Fair– The World of Tomorrow–  under construction, on the site of a former Queens (New York City) wetland

Who dreams of files? Well, I do, to be honest. And I imagine Herman Melville, Emily Dickinson, Franz Kafka, and Le Corbusier did, too. It’s not only the files and cabinets themselves that enchant, but their epistemological and political promise; just think of what you can do with all that data! The dream has survived as a collective aspiration for well over a century — since we had standardized cards and papers to file, and cabinets to put them in — and is now expressed in fetishized data visualization and fantasies about “smart cities” and “urban science.” Record-keeping and filing were central to the World of Tomorrow and its urban imaginary, too…

Shannon Mattern on the way in which the 1939 World’s Fair anticipated our current obsession with urban data science and “smart” cities: “Indexing the World of Tomorrow.”

[TotH to Rebecca Onion]

* Edward Tufte

###

As we dream of spires, we might spare a thought for Andreas Felix von Oefele; he died on this date in 1780. A historian and author (most notably of the 10-volume work Lebensgeschichten der gelehrtesten Männer Bayerns, “Life stories of the most learned men of Bavaria”), von Oefele was the first “Electoral Councillor, Bibliothecarius and Antiquarius”– the first head of the Bavarian Court and State Library and Secret Archives.

 source

 

Written by LW

February 17, 2016 at 1:01 am

“‘Begin at the beginning,’ the King said gravely, ‘and go till you come to the end; then stop'”*…

 

Index cards are mostly obsolete nowadays. We use them to create flash cards, write recipes, and occasionally fold them up into cool paper airplanes. But their original purpose was nothing less than organizing and classifying every known animal, plant, and mineral in the world. Later, they formed the backbone of the library system, allowing us to index vast sums of information and inadvertently creating many of the underlying ideas that allowed the Internet to flourish…

How Carl Linnaeus, the author of Systema Naturae and father of modern taxonomy, created index cards… and how they enabled libraries as we know them, and in the process, laid the groundwork for the Web: “How the Humble Index Card Foresaw the Internet.”

* Lewis Carroll (Alice’s Adventures in Wonderland)

###

As we do ’em up Dewey style, we might recall that it was on this date in 1991 that the handwritten script of the first half of the original draft of Huckleberry Finn, which included Twain’s own handwritten corrections, was recovered. Missing for over a hundred years, it was found by a 62-year-old librarian in Los Angeles, who discovered it as she sorted through her grandfather’s papers, sent to her from upstate New York. Her grandfather, James Gluck, a Buffalo lawyer and collector of rare books and manuscripts to whom Twain sent the manuscript in 1887, had requested it for the town’s library, now called the Buffalo and Erie County Public Library (where the second half of the manuscript has been all along).

Gluck apparently took the first half from the library, intending to have it bound, but failed to return it. He died the following year, and the manuscript, which had no library markings, was turned over to his widow by the executors of the estate. She eventually moved to California to be near her daughter, taking the trunk containing the manuscript with her. It was finally opened by her granddaughter, Barbara Testa.

 source

 

Written by LW

February 13, 2016 at 1:01 am
