(Roughly) Daily

Posts Tagged ‘statistics’

“Torture the data, and it will confess to anything”*…

Source: @piechartpirate

Add movement to a bar chart, and you’ve got yourself an audience-pleaser. These so-called “bar chart races” are not popular with data visualization experts– but what do experts know?…

I’m not a betting man. But I do enjoy a good bar chart race — a popular way to visually display and compare changing data over time. Bars lengthen and shorten as time ticks away; contenders accordingly hop over each other to switch places in the ranking. Will your favorite keep their lead? Look at that surprise challenger rush to the front! Meanwhile, furious battles are waged for the middle and even the lower spots on the list.

Bar chart races are a spectacular way to animate certain types of information, but the so-called dataviz community is skeptical. Many data visualization specialists complain that bar chart races are like a sugar rush: a lot of entertainment, but very little analysis. Big on grabbing attention, small on conveying causality. Instead of good seats at the data ballet, you get standing room only at the information dog track.

Well, all that may be true. But when is the last time you’ve been glued to a statistic about global coffee production? Bar chart races are fun to watch, not least because you can pick a favorite early on and get to see them win — or lose. In other words, you’re emotionally invested in the animation in a way that’s lacking from static stats.

Bar chart races are used for just about any dataset that can be quantified over time: best-selling game consoles, most trusted brands, highest grossing movies…

Any dataset that can be quantified over time can be turned into a contest that is both exciting and (a little bit) enlightening: from @VeryStrangeMaps, 10 examples of “Bar chart races: short on analysis, but fun to watch,” for example…

* Ronald Coase

###

As we ruminate on representation, we might check our watches: it was on this date in 1918 that the Standard Time Act (AKA, the Calder Act) became effective. Passed by Congress earlier in the year, it implemented across the U.S. both Standard time (the creation of time zones anchored in UTC, the successor to GMT) and Daylight Saving Time.

U.S. Time Zones (somewhat revised from the original division)

source

“No structure, even an artificial one, enjoys the process of entropy. It is the ultimate fate of everything, and everything resists it.”*…

A 19th-century thought experiment that motivates physicists– and information scientists– still…

The universe bets on disorder. Imagine, for example, dropping a thimbleful of red dye into a swimming pool. All of those dye molecules are going to slowly spread throughout the water.

Physicists quantify this tendency to spread by counting the number of possible ways the dye molecules can be arranged. There’s one possible state where the molecules are crowded into the thimble. There’s another where, say, the molecules settle in a tidy clump at the pool’s bottom. But there are uncountable billions of permutations where the molecules spread out in different ways throughout the water. If the universe chooses from all the possible states at random, you can bet that it’s going to end up with one of the vast set of disordered possibilities.
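A toy count makes that bet concrete. The sketch below assumes a deliberately tiny model that is not from the article: 100 dye molecules, each equally likely to sit in the left or the right half of the pool. The function name `fraction_of_states` is just for illustration.

```python
from math import comb


def fraction_of_states(n_molecules: int, n_left: int) -> float:
    """Fraction of all left/right arrangements with exactly n_left molecules on the left."""
    return comb(n_molecules, n_left) / 2 ** n_molecules


n = 100  # hypothetical number of dye molecules

# "Ordered" outcome: every molecule crowded into the left half.
all_on_one_side = fraction_of_states(n, n)

# "Disordered" outcomes: a roughly even split (40 to 60 molecules on the left).
near_even = sum(fraction_of_states(n, k) for k in range(40, 61))

print(f"all {n} molecules in the left half: {all_on_one_side:.3e}")
print(f"between 40 and 60 in the left half: {near_even:.3f}")
```

Even with only 100 molecules, nearly all arrangements are close to an even split, and the fully crowded arrangement is a single state out of 2^100. A real thimbleful of dye holds vastly more molecules, which makes the imbalance correspondingly more extreme.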

Seen in this way, the inexorable rise in entropy, or disorder, as quantified by the second law of thermodynamics, takes on an almost mathematical certainty. So of course physicists are constantly trying to break it.

One almost did. A thought experiment devised by the Scottish physicist James Clerk Maxwell in 1867 stumped scientists for 115 years. And even after a solution was found, physicists have continued to use “Maxwell’s demon” to push the laws of the universe to their limits…

A thorny thought experiment has been turned into a real experiment—one that physicists use to probe the physics of information: “How Maxwell’s Demon Continues to Startle Scientists,” from Jonathan O’Callaghan (@Astro_Jonny)

* Philip K. Dick

###

As we reconsider the random, we might send carefully-calculated birthday greetings to Félix Édouard Justin Émile Borel; he was born on this date in 1871. A mathematician (and politician, who served as French Minister of the Navy), he is remembered for his foundational work in measure theory and probability. He published a number of research papers on game theory and was the first to define games of strategy.

But Borel may be best remembered for a thought experiment he introduced in one of his books, proposing that a monkey hitting keys at random on a typewriter keyboard will – almost surely (that is, with probability one), given unlimited time – eventually type every book in the Bibliothèque nationale de France. This is now popularly known as the infinite monkey theorem.
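The arithmetic behind the theorem is short enough to sketch. The example assumes a simplified setup that is not Borel's own: a keyboard of 26 equally likely keys, a six-letter target word, and independent, non-overlapping attempts of six keystrokes each. The helper name `p_at_least_once` is hypothetical.

```python
import math


def p_at_least_once(keys: int, text_len: int, attempts: float) -> float:
    """Chance that at least one of `attempts` independent tries types the target text."""
    p_single = keys ** -text_len  # chance that one try is exactly right
    # 1 - (1 - p)^n, computed with log1p/expm1 so tiny probabilities are not lost.
    return -math.expm1(attempts * math.log1p(-p_single))


# Hypothetical target: a six-letter word on a 26-key keyboard.
for attempts in (1, 1e6, 1e9, 1e12):
    chance = p_at_least_once(26, 6, attempts)
    print(f"{attempts:>16,.0f} attempts: {chance:.6f}")
```

Any single attempt is hopeless, but the chance of never succeeding shrinks toward zero as the attempts pile up. The same reasoning applies to a whole library; only the number of attempts needed becomes unimaginably large.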

source

“Losing my religion”*…

Shifting religious affiliations in the U.S. have generated lots of comment (e.g., Friday’s New York Times: “The Christian Right Is in Decline, and It’s Taking America With It”). It’s worth taking a comprehensive look at the data on which those takes are based; there’s even more to see…

Seven in ten Americans (70%) identify as Christian, including more than four in ten who identify as white Christian and more than one-quarter who identify as Christian of color. Nearly one in four Americans (23%) are religiously unaffiliated, and 5% identify with non-Christian religions.

The most substantial cultural and political divides are between white Christians and Christians of color. More than four in ten Americans (44%) identify as white Christian, including white evangelical Protestants (14%), white mainline (non-evangelical) Protestants (16%), and white Catholics (12%), as well as small percentages who identify as Latter-day Saint (Mormon), Jehovah’s Witness, and Orthodox Christian. Christians of color include Hispanic Catholics (8%), Black Protestants (7%), Hispanic Protestants (4%), other Protestants of color (4%), and other Catholics of color (2%). The rest of religiously affiliated Americans belong to non-Christian groups, including 1% who are Jewish, 1% Muslim, 1% Buddhist, 0.5% Hindu, and 1% who identify with other religions. Religiously unaffiliated Americans comprise those who do not claim any particular religious affiliation (17%) and those who identify as atheist (3%) or agnostic (3%).

Over the last few decades, the proportion of the U.S. population that is white Christian has declined by nearly one-third. As recently as 1996, almost two-thirds of Americans (65%) identified as white and Christian. By 2006, that had declined to 54%, and by 2017 it was down to 43%. The proportion of white Christians hit a low point in 2018, at 42%, and rebounded slightly in 2019 and 2020, to 44%. That tick upward indicates the decline is slowing from its pace of losing roughly 11 percentage points per decade.
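The paragraph above mixes two ways of measuring the same decline: a change in percentage points (65% down to 54% is 11 points) and a relative change (a drop of "nearly one-third" of the starting share). A minimal sketch using only the figures quoted above keeps the two apart; the function names are illustrative.

```python
# White Christian share of the U.S. population, in percent (figures quoted above).
share = {1996: 65, 2006: 54, 2017: 43, 2020: 44}


def point_change(start: int, end: int) -> int:
    """Change in percentage points between two survey years."""
    return share[end] - share[start]


def relative_change(start: int, end: int) -> float:
    """Change as a fraction of the starting share."""
    return (share[end] - share[start]) / share[start]


print("1996 to 2006:", point_change(1996, 2006), "points")
print("2006 to 2017:", point_change(2006, 2017), "points")
print("1996 to 2020:", point_change(1996, 2020), "points,",
      f"{relative_change(1996, 2020):.1%} relative")
```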

The slight increase in white Christians between 2018 and 2020 was driven primarily by an uptick in the proportion of white mainline (non-evangelical) Protestants and a stabilization in the proportion of white Catholics. Since 2007, white mainline (non-evangelical) Protestants have declined from 19% of the population to a low of 13% in 2016, but the last three years have seen small but steady increases, up to 16% in 2020. White Catholics have also declined from a high point of 16% of the population in 2008, and their low point of 11% occurred in 2018. It is unclear if the bump back up to 12% in 2020 indicates a new trend.

Since 2006, white evangelical Protestants have experienced the most precipitous drop in affiliation, shrinking from 23% of Americans in 2006 to 14% in 2020. That proportion has generally held steady since 2017 (15% in 2017, 2018, and 2019).

Disaffiliating white Christians have fueled the growth of the religiously unaffiliated during this period. Only 16% of Americans reported being religiously unaffiliated in 2007; this proportion rose to 19% by 2012, and then gained roughly a percentage point each year from 2012 to 2017. Reflecting the patterns above, the proportion of religiously unaffiliated Americans hit a high point of 26% in 2018 but has since slightly declined, to 23% in 2020.

The increase in proportion of religiously unaffiliated Americans has occurred across all age groups but has been most pronounced among young Americans. In 1986, only 10% of those ages 18–29 identified as religiously unaffiliated. In 2016, that number had increased to 38%, and declined slightly in 2020, to 36%.

Americans ages 18–29 are the most religiously diverse age group. Although a majority (54%) are Christian, only 28% are white Christians (including 12% who are white mainline Protestants, 8% who are white Catholics, and 7% who are white evangelical Protestants), while 26% are Christians of color (including 9% who are Hispanic Catholics, 5% who are Hispanic Protestants, 5% who are Black Protestants, 2% who are multiracial Christians, 2% who are AAPI Christians, and 1% who are Native American Christians). More than one-third of young Americans (36%) are religiously unaffiliated, and the remainder are Jewish (2%), Muslim (2%), Buddhist (1%), Hindu (1%), or another religion (1%).

Americans ages 65 and older are the only group whose religious profile has changed significantly since 2013. Among Americans 65 and older, the proportion of white evangelical Protestants dropped from 26% in 2013 to 22% in 2020, and the proportion of white Catholics dropped from 18% in 2013 to 15% in 2020. By contrast, the proportion of religiously unaffiliated seniors increased from 11% in 2013 to 14% in 2020.

White evangelical Protestants are the oldest religious group in the U.S., with a median age of 56, compared to the median age in the country of 47. White Catholics and Unitarian Universalists have median ages of 54 and 53 years old, respectively. Black Protestants and white mainline Protestants have a median age of 50. All other groups have median ages below 50: Jehovah’s Witnesses (49), Jewish Americans (48), Latter-day Saints (47), Orthodox Christians (42), Hispanic Catholics (42), Hispanic Protestants (39), religiously unaffiliated people (38), Buddhists (36), Hindus (36), and Muslims (33). In the youngest groups, one-third of Hindu (33%) and Buddhist (34%) Americans and 42% of Muslim Americans are in the 18–29 age category.

Delving into the data of devotion: “The American Religious Landscape in 2020.”

* R.E.M.

###

As we ponder piety, we might send evangelical birthday greetings to Bardaisan; he was born on this date in 154. A scientist, scholar, astrologer, philosopher, hymnographer, and poet, he was the first known Syriac literary author. A key figure among the Gnostics, he founded the Bardaisanites and was central to the Christianization of Rome (indeed, he is said to have converted prince Abgar IX).

source

“An imbalance between rich and poor is the oldest and most fatal ailment of all republics”*…

… so, how we measure it matters…

In 2015, Greece, Thailand, Israel, and the UK were equally unequal. That is, all four countries had the same Gini coefficient, a common measure of income inequality.

The number suggests that the spread of incomes in the four nations was the same. However, a close look at the poorest and wealthiest in those societies reveals a very different picture. The ratio between income held by the richest 10% and the poorest 10% ranged significantly, from 13.8 in Greece to 4.2 in the UK. 

The fact is, just because the Gini coefficient is so well known doesn’t mean it’s a particularly useful measurement. Its appeal comes from its simplicity—a number between 0 and 1 that can encapsulate a complex distribution in a single figure—as well as its popularity. It is also regularly published and updated by powerful international organizations like the OECD, the World Bank, and the International Monetary Fund.

However, it has a number of serious limitations. So many, in fact, that the World Inequality Database, one of the world’s leading sources of income inequality data, steers clear. And it’s not alone. While some economists defend the Gini coefficient’s continued use, most agree that as a way to understand income inequality, it’s insufficient on its own…
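The point about equal Gini coefficients hiding different extremes can be reproduced with made-up numbers. The sketch below uses two hypothetical ten-person societies (not real country data) and one common rank-based formula for the Gini coefficient of a sample; `gini` and `decile_ratio` are illustrative names.

```python
def gini(incomes):
    """Gini coefficient of a sample, using the sorted-rank formula."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def decile_ratio(incomes):
    """Income held by the richest 10% divided by income held by the poorest 10%."""
    xs = sorted(incomes)
    k = max(1, len(xs) // 10)  # number of people in one decile
    return sum(xs[-k:]) / sum(xs[:k])


# Two hypothetical societies of ten people each (arbitrary income units).
society_a = [1] * 7 + [7] * 3    # a sizeable, moderately richer top group
society_b = [1] * 5 + [19] * 5   # half the population far ahead of the other half

for name, incomes in (("A", society_a), ("B", society_b)):
    print(f"society {name}: Gini = {gini(incomes):.2f}, "
          f"richest/poorest decile ratio = {decile_ratio(incomes):.1f}")
```

Both societies come out with a Gini coefficient of 0.45, yet the richest tenth holds 7 times the income of the poorest tenth in one and 19 times in the other. A single summary number cannot tell them apart, which is why the article suggests reading it alongside other measures.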

A primer on the dominant measure of economic inequality, and on some alternatives/supplements to it: “Gini coefficient: An introduction.”

* Plutarch

###

As we aim to understand, we might note that today is the Summer Solstice, the day on which the earth’s north pole is maximally tilted toward the sun, and there are more hours of daylight than on any other day of the year (in the Northern Hemisphere; in the Southern, it is the Winter Solstice, the shortest day). The June solstice is the only day of the year when all locations inside the Arctic Circle experience a continuous period of daylight for 24 hours. And perhaps more immediately, it is the “official” start of Summer.

(The 21st is the traditional date; in the event, the solstice falls on the 20th, 21st, or 22nd– this year, on the 20th… still, the traditional date is the one folks tend to mark.)

Not coincidentally, today is also National Daylight Appreciation Day.

source

“Gentlemen, you need to add armor-plate where the holes aren’t, because that’s where the holes were on the airplanes that didn’t return”*…

Diagram of bullet-holes in WWII bombers that returned

Allied bombers were key to Britain’s air offensive against Germany during the second world war. As such, the RAF wanted to armour their bombers to prevent them from being shot down. But armour is heavy – you cannot reinforce an entire bomber and still have it fly. So statistician Abraham Wald was asked to advise on where armour should be placed on a bomber.

After each wave of bombing, every returning aircraft was meticulously examined and a note was made of where each aircraft had sustained damage by the Germans. The image [above] conceptualises what Wald’s data might have looked like visually.

So what was Wald’s advice? Where should armour be added?

He essentially advised the RAF to add armour to places where you do not find bullet holes. Wait… what?!

Wald wisely understood that the data was based only on planes that survived. The planes that did not survive were likely to have sustained damage on the areas where we do not observe bullet holes – such as around the engine or cockpit…
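Wald's reasoning is easy to check with a small simulation. Everything in the sketch below is an assumption made for illustration, not Wald's data: each plane takes exactly one hit in a uniformly random section, and the loss probabilities per section are invented.

```python
import random
from collections import Counter

# Hypothetical chance that a single hit in each section brings the plane down.
LOSS_PROBABILITY = {"engine": 0.8, "cockpit": 0.7, "fuselage": 0.1, "wings": 0.1}


def simulate(n_planes: int = 10_000, seed: int = 0):
    """Return (hits on all planes, hits seen on returning planes) per section."""
    rng = random.Random(seed)
    sections = list(LOSS_PROBABILITY)
    all_hits, observed_hits = Counter(), Counter()
    for _ in range(n_planes):
        section = rng.choice(sections)  # hits are spread evenly by assumption
        all_hits[section] += 1
        if rng.random() >= LOSS_PROBABILITY[section]:
            observed_hits[section] += 1  # the plane made it back to be examined
    return all_hits, observed_hits


all_hits, observed_hits = simulate()
for section in LOSS_PROBABILITY:
    print(f"{section:>8}: {all_hits[section]:5d} hits overall, "
          f"{observed_hits[section]:5d} seen on returning planes")
```

Although hits are spread evenly across the whole fleet, the returning planes show far fewer holes around the engine and cockpit. Those are exactly the sections where a hit most often keeps the plane from coming back to be counted.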

Making better decisions: one of the most prevalent– and insidious– forms of selection bias, survivorship bias, illustrated: “How to armour a WWII bomber.”

See also: “How to avoid being duped by survivorship bias.”

###

As we think clearly, we might send productive birthday greetings to W. Edwards Deming; he was born on this date in 1900. An engineer, statistician, professor, author, lecturer, and management consultant, he helped develop the sampling techniques still used by the U.S. Census Bureau and the Bureau of Labor Statistics.

But he is better remembered as the champion of statistically-based production management techniques that first gained traction in post-WWII Japan, where many credit Deming as a key ingredient in what has become known as the Japanese post-war economic miracle of 1950 to 1960, when Japan rose from the ashes of war onto its path to becoming the second-largest economy in the world– through processes shaped by the ideas Deming taught. In 1951, the Japanese government established the Deming Prize in his honor.

While his impact in Japan (finally) brought him to the attention of business leaders in the U.S., he was only just beginning to win widespread recognition in the U.S. at the time of his death in 1993.

source
