6 - See the data
Silicon Valley diversity
Mark Zuckerberg, Evan Spiegel, Tim Cook. What have they all got in common? Yeah - apart from being rich enough to use iPhones as toilet paper.
White dudes the lot of them. But then again I just picked these three names. What do I know? Let’s see what the data says.
Getting the data
I want to show you how easy it is to get hold of data of this kind to play with yourself. You can download a CSV file from Kaggle and then load it into Python using a thing called a pandas DataFrame.
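Here is a rough sketch of that loading step, assuming pandas is installed. The column names and the handful of rows are made up for illustration (the real file's headers may well differ), and an in-memory string stands in for the downloaded file so the snippet runs on its own.

```python
import io

import pandas as pd

# Made-up sample rows standing in for the downloaded CSV. The column names
# and the numbers are assumptions, not values from the real dataset.
SAMPLE_CSV = """company,race,gender,job_category,count
ExampleCo,White,male,Professionals,40
ExampleCo,White,female,Professionals,35
ExampleCo,Asian,male,Technicians,10
OtherCo,White,male,Professionals,80
OtherCo,Asian,female,Technicians,20
"""

# With a real download you would pass the file path instead of the StringIO object.
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

print(df.shape)   # (number of rows, number of columns)
print(df.head())  # first few rows, to check the columns look sensible
```

Printing the shape and the first few rows is a quick sanity check that the file parsed the way you expected before doing anything cleverer with it.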
There are 3960 rows in this dataset, one row for each combination of our four hierarchical variables: company, race, gender and job category.
It’s hard to figure out from the data alone what the gender balance in Silicon Valley is.
Representing the data hierarchically
Rather than just having counts for each specific subset, we may want to find out how many men there are in total at 23andMe.
We can use a data type called a tree to represent how Silicon Valley workers break down into subcategories.
Let’s create a tree where the 354,964 workers are first divided by company, and then by gender. Here’s the start of the tree:
This begins to show us that 23andMe is unusual in having more female than male workers.
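One way to build such a tree, as a minimal sketch rather than the exact code behind the figures here: nest plain dictionaries as company, then gender, and add up the counts over race and job category. The rows below are invented purely to show the mechanics, and `build_tree` and `total` are hypothetical helper names.

```python
from collections import defaultdict

# Invented rows in the assumed order (company, race, gender, job category, count).
rows = [
    ("ExampleCo", "White", "male", "Professionals", 40),
    ("ExampleCo", "White", "female", "Professionals", 35),
    ("ExampleCo", "Asian", "male", "Technicians", 10),
    ("OtherCo", "White", "male", "Professionals", 80),
    ("OtherCo", "Asian", "female", "Technicians", 20),
]


def build_tree(rows):
    """Nest counts as company -> gender -> total, summing over race and job category."""
    tree = defaultdict(lambda: defaultdict(int))
    for company, _race, gender, _job, count in rows:
        tree[company][gender] += count
    # Convert to plain dicts so the result prints and compares cleanly.
    return {company: dict(genders) for company, genders in tree.items()}


def total(node):
    """Sum every leaf count beneath a node (a dict of children, or a leaf number)."""
    if isinstance(node, dict):
        return sum(total(child) for child in node.values())
    return node


tree = build_tree(rows)
print("all workers:", total(tree))
for company, genders in tree.items():
    print(company, total(genders), genders)
```

Because `total` works on any node, the same function answers "how many workers overall?", "how many at one company?" and "how many men at one company?" depending on which part of the tree you hand it.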
Representing the data visually
However, we can go one better. Humans aren’t great at comparing proportions of numbers. We can visually represent the numbers as areas using a treemap. Here is the treemap of the previous tree:
Because 23andMe’s 297 workers give them such a narrow strip along the horizontal axis, they don’t even warrant a name on the diagram. But they are the unusual 50/50 blue-red split second from the left.
This diagram says “patriarchy” much better than a list of data could.
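If you are curious how a treemap turns counts into areas, here is a small sketch of one simple layout, often called slice-and-dice. I am assuming this layout because it matches the picture described above (companies as vertical strips, each split by gender), and it is not necessarily what the plotting tool does internally. The counts are invented, and `treemap_rects` is a hypothetical helper name.

```python
def treemap_rects(tree, width=1.0, height=1.0):
    """Lay out company -> gender counts as rectangles.

    Each company becomes a vertical strip whose width is proportional to its
    headcount; each strip is then split from bottom to top by gender.
    Returns a list of (company, gender, x, y, w, h) tuples.
    """
    grand_total = sum(sum(genders.values()) for genders in tree.values())
    rects = []
    x = 0.0
    for company, genders in tree.items():
        company_total = sum(genders.values())
        w = width * company_total / grand_total
        y = 0.0
        for gender, count in genders.items():
            h = height * count / company_total
            rects.append((company, gender, x, y, w, h))
            y += h
        x += w
    return rects


# Invented counts: one small company with an even split, one larger uneven one.
tree = {
    "ExampleCo": {"male": 50, "female": 50},
    "OtherCo": {"male": 240, "female": 60},
}

rects = treemap_rects(tree)
for company, gender, x, y, w, h in rects:
    print(f"{company:10s} {gender:7s} x={x:.2f} y={y:.2f} w={w:.2f} h={h:.2f} area={w * h:.3f}")
```

Each rectangle's area works out to that group's share of all workers, which is why a small company ends up as a thin sliver no matter how balanced it is. The rectangles could then be handed to any plotting library to draw and colour.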
Now, off to find a Silicon Valley summer internship! Which company wants to expand their blue box?