Randy Au
Randy Au writes the weekly Counting Stuff newsletter, which covers topics in data science and quantitative UX research, with occasional excursions into other fun nerdy topics.
There’s been a new post every Tuesday morning since 2020. There are also posts for paid subscribers every couple of weeks towards the end of the week.
Randy can be found on Mastodon or Discord (see below) while Twitter is exploding.
Join a Discord for data folk to have fun
Approaching Significance is a Discord for data folk to hang out in, with the explicit goal of having a roughly 50/50 split between data conversations and non-data fun conversations like hobbies, cute pets, and food.
Come join and meet all the friendly people there.
Supporting the newsletter
The newsletter has a free post every Tuesday. Every 2-3 Thursdays there’ll be a post for paid subscribers. There are other ways to support my work if you do not want to subscribe.
- Become a paid subscriber. It’s only $5/mo or $50/yr
- Tweet me with comments, feedback, or questions
- A small donation at Ko-fi
- Buy some swag
Counting Stuff Posts by Topic
Here’s a manually curated list of archived posts by topic, ordered somewhat arbitrarily. The list is trying to organize the more evergreen posts for easy reference.
The full archives are available here, and they have a search feature.
Most interesting
- Data Cleaning IS Analysis, Not Grunt Work
- Go collect some $#(&% data
- Why’s it hard to teach data cleaning?
- Staying afloat as a new-ish solo data scientist
- What if you were an evil data scientist?
- Learning SQL 201: Optimizing Queries, Regardless of Platform
- Succeeding as a data scientist in small companies/startups
- It’s that DS meme plane with dots again
- False Discovery Rates in A/B tests
- Reflecting on how system complexity grows
- Celebrating everyone counting things
- That time I participated in Nielsen’s TV Ratings
- Hidden treasure in the timezone database
- You’re probably on a cutting edge
- Let’s talk a bit about giving interviews
- We should phase the “SQL Interview” out
- Stories from the last downturn
- Self-narrating some usability testing for others
- How to measure a subcontinent
- Want a DS project? There’s health insurance data out there
- Coming up with a talk proposal
- Staying Sharp in Data Science
- How do we actually “pull stories out of data”?
- Everything is on fire and you should contribute
- We need to calibrate our internal achievement scales too
- Data science has a tool obsession
- There’s dashboards all around us that are USED
- Learning to go from creating to editing
- Data scientist, working without data
- DS work doesn’t have to be purely discovery work
- The reasonable(?) effectiveness of data analysis
Paid subscriber posts
- Watching data science in a fraud lawsuit filing is fun!
- Ugh, the “ai war”(🤮) or whatever is upon us
- Mini Recap: Data Mishaps Night
- 😔 Everyone’ll have an LLM…
- Single panes of glass are horrible
- 🙄 The AI is coming for my UX job
Guest posts
Measures and Measurement
- How do we know our thermometers are correct?
- Hate leap seconds? Imagine a negative one
- What goes into this “Heat Index” thing?
- Paper measurements are an endless pit of mystery
- Measuring lengths, to somewhat absurd lengths
- Synchronicity
- Let’s Talk Rice Measurements
- Dates, Times, Calendars— The Universal Source of Data Science Trauma
- Measuring “here”, coordinate systems for the Earth
- What’s up with Readability formulas?
- A couple of weeks wading in contextless metrics
- Square footage is so broken and weird
- How the heck does one measure color?
Learning and Practice
- Go collect some $#(&% data
- You, too, should run a few machines
- Working With Moon Eclipses Part 2
- Challenge: Predicting [Lunar] Eclipses
- Practicing data prep with Wikipedia data
- More Learning in (Semi)-Public
- Learning as Performance
- Data Literacy Via COVID-19
- Counting is hard, 2019-nCoV edition
- Not everyone needs real-time analytics, including you
- Data Science Practice 101: Always Leave An Analysis Paper Trail
- Can we stop with the SQL JOINs venn diagrams insanity?
- Want a DS project? There’s health insurance data out there
- Staying Sharp in Data Science
- Learning from running out of memory all the time
Jobs, roles, careers
- Ways to become a Quantitative UX Researcher, w/o a PhD
- Interview season is upon us
- Muddling 1/3rd through a career
- You should pick your org chart when looking at a position
- The gap between data science, and UX research
- Working with a non-DS manager
- Old dog revisits the DS job market out of curiosity
Methods/Techniques/Tools
- Measuring broader impact is extremely hard
- Communicating changes with percentages is surprisingly hard
- Why we reluctantly work with regex
- Churn is hard
- Rejecting the null based on a chart
- My favorite file format is ~50 yrs old
- MapReduce for normal folk who don’t need it anymore
- Reading between the rows
- Real Workflows in Data Science
- Sessions for analysis, the eternal fiction
- Fighting Confirmation Bias
- The Many Ways of Learning Git
- Simple Visualizations Are Pretty Darned Great
- The Epic Data Fetch Quest
- Interpreting Email Analytics is Handwavy
- Email Analytics: More than you ever need to know
- The User-Agent — That Crazy String Underpinning a Bunch of Analytics
- Common Data Science Trap— Getting Systems To Agree
- Character Encoding, Part 3 of 3 — Gotchas while working with Unicode
- Character Encodings — The Pain That Won’t Go Away, Part 2/3: Unicode
- Character Encodings —The Pain That Won’t Go Away, Part 1/3: Non-Unicode
- Dates, Times, Calendars— The Universal Source of Data Science Trauma
- Two Stories About Labeling Data by Hand — It Still Works
- Not everyone needs real-time analytics, including you
- It’s OK to use spreadsheets in data science
- Trap DS Projects: Beware of “Easy” Segmentation Projects
- The First Question I Have For Every Data Request
- Data Science-ing for Insight — Embrace Your Domain Expert Partners
- Data Science foundations: Know your data. Really, really, know it
- Data Science in the Trenches: Living w/ Small n
- The Many Faces of Production
- Handling shifting survey questions
- We have the power to be wrong with extreme confidence
- False Discovery Rates in A/B tests
- Discussing AI is such a confusing mess
- What Sorts of Questions Quantitative UX Researchers get
- Comparing Quant UX Researchers w/ Data Scientists
- We take our units of analysis for granted
- Self-narrating some usability testing for others
- Data driven doesn’t mean data is driving
- Ways to data-drive yourself into the ground!
- There’s dashboards all around us that are USED
- Internalizing baselines is hard
- Data scientist, working without data
- DS work doesn’t have to be purely discovery work
- The mythical single source of truth
Meta about DS
- Data Cleaning IS Analysis, Not Grunt Work
- Research is a team sport, even the analysis bits
- DS jargon is just everyone else’s jargon
- Every org has a “literature”
- We should treat data science as a craft
- The delicate art of making ourselves wrong
- Being a broad-spectrum data scientist
- Why’s it hard to teach data cleaning?
- Just where is the minimal stats bar for data science?
- Games, a playground for learning DS fundamentals
- The only path to be a data scientist is to be human
- We all have our ways of picking up data to play with
- The best parts of data science isn’t even the tech
- (Maybe) Adopting New Tech
- Let’s Get Intentional About Documentation
- What if you were an evil data scientist?
- Smashing Dashboards and Ikea Together
- I’d like more people to join the broader data community
- In search of “Good Enough” data science
- Dashboards aren’t my job, until they are…
- SME Should Stand for Subject Matter Experience
- Why we are so tempted to go out of lane?
- 10x data scientist is luckily not “a thing” let’s all work to keep it this way
- Be Yourself: The Data Scientists You See In Public Are Not Representative
- Balancing Who Handles Data Inconsistency
- It’s All About Trust: Views on opening up data to your org
- Data Science foundations: Know your data. Really, really, know it
- Succeeding as a data scientist in small companies/startups
- What Data Folk Were Saying about Zillow
- Reflecting upon reflecting
- Skill Windows into the Data World
- Making room for mistakes
- Data Management is Context Management
- Building Discords and Community
- Paper reading time - Forgetting in Data Science
- Dashboards don’t break themselves
- Optimizing for personal portability
- Making do with the infra we got
- Data driven doesn’t mean data is driving
- The world isn’t as polished as it appears
- We need to help each other write better
- Product Work is Intention Hunting Work
- Seeing the data science work all around us
- Reinforcing our data friend connections while we can
- Heating systems, black boxes, and knowing things about systems
- How do we actually “pull stories out of data”?
- Everything is on fire and you should contribute
- It’s sorta odd that data science is as open as it is
- We all make for a good conference experience
- We need to calibrate our internal achievement scales too
- Data science has a tool obsession
- Everyone You’ll Ever Meet Knows Something You Don’t
- When are we just speaking for a model
- The reasonable(?) effectiveness of data analysis
Time
- Hate leap seconds? Imagine a negative one
- Synchronicity
- Time is annoying? Time durations are worse!
- Dates, Times, Calendars— The Universal Source of Data Science Trauma
- We might not see leap seconds after 2035 🤯
SQL
- Learning SQL 201: Optimizing Queries, Regardless of Platform
- Can we stop with the SQL JOINs venn diagrams insanity?
- We should phase the “SQL Interview” out
Organization, communication, soft skills
- When scaling yourself goes a bit too far
- Scaling yourself
- Staying afloat as a new-ish solo data scientist
- Navigating working with other teams
- Showing value as a support data scientist
- Helping others deal with uncertainty and risk
- Making the best of having too many meetings
- Learning to push back
- Being the Eyes of Your Organization
- Balancing Who Handles Data Inconsistency
- The utility of an unwatched dashboard
- Views on dashboards being answers, or not
- Surviving planning from the bottom up
- Planning quantitative work
- Let’s talk a bit about giving interviews
- The world isn’t as polished as it appears
- Working with the rhythms of the business
- Learning to go from creating to editing
Setting Metrics
- Measuring broader impact is extremely hard
- Where should a metric “be”?
- It’s Goodhart’s Law again
- The Metrics Meta-game
UX
- A Busted Kitchen and Breaking User Journeys
- Questions that make Quant UXRs excited
- Being new to the UX part of “Quant UXR” - Overall Process
- Becoming a Quantitative UX Researcher is messy
- How I wound up being a Quantitative UX Researcher
- Comparing Quant UX Researchers w/ Data Scientists
- Data Management is Context Management
- Self-narrating some usability testing for others
- When your product’s like a tax form
- We should watch people dissect and build products
Other
- Doing better with Excel
- Vacation is the power say “no” by not even being there
- Our lives are nested orders of operation puzzles
- Personal growth and brands
- Let’s play with some Library of Congress data!
- Practical benefits of knowing how obscure systems work
- Paper dive - Replication is even harder
- Hobby trains and actively using failure as a strategy
- Dumb Stonks, Prisoner’s Dilemma, and Artists
- The Complexity Makin’ Goods
- Path dependency is important to know
- Almost a year of data newslettering
- Running small conferences is quirky work
- Audio silliness in the era of videoconferencing
- Side gigs and avoiding them
- Some Gamedev and Shoddy Data Arguments
- Making Fair Games of Go
- Emulators, Machine Translation, Self-service skills
- It’s OK to use spreadsheets in data science
- Reflecting on how system complexity grows
- A weekend playing with DALL-E Mini
- Caching is our friend, until it isn’t
- Writing posts, maybe as a guest!
- New year, new chances to get together