Welcome to the eighty-first issue of Monday Morning Data Science from the Fred Hutch Data Science Laboratory. We are excited to show you what we have been working on (Fresh from the Lab), plus links that we think you would be interested in (Our Weekly Bookmarks Bar). Part of the purpose of this newsletter is to start conversations, so if you have a question or something you would like to share with us, please let us know by responding directly to this email.
Our Weekly Bookmarks Bar
[Blog Post: An Introduction to Python for R Users] Rebecca Barter discusses her transition from R to Python for data science, emphasizing that while R is excellent for data wrangling and visualization, Python is essential for machine learning and collaboration with software engineers. She offers a beginner-friendly guide on setting up Python and its key libraries, like pandas, comparing their functionalities with R's tidyverse. Her goal is to help R users adapt to Python efficiently, leveraging their existing knowledge to ease the learning curve.
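To give a flavor of the kind of pandas-to-tidyverse comparison the post describes, here is a minimal sketch of our own (not taken from the post). The data frame and its column names are hypothetical, and the dplyr lines in the comments are rough equivalents rather than exact translations.

```python
import pandas as pd

# Hypothetical data, invented purely for illustration.
df = pd.DataFrame({
    "species": ["a", "a", "b", "b"],
    "mass_g": [10.0, 14.0, 30.0, 34.0],
})

# Roughly: df |> filter(mass_g > 10) |> mutate(mass_kg = mass_g / 1000)
heavy = df[df["mass_g"] > 10].assign(mass_kg=lambda d: d["mass_g"] / 1000)

# Roughly: df |> group_by(species) |> summarise(mean_mass = mean(mass_g))
summary = (
    df.groupby("species", as_index=False)
      .agg(mean_mass=("mass_g", "mean"))
)

print(heavy)
print(summary)
```

Method chaining with `assign` and `groupby(...).agg(...)` is one common way to get a pipe-like style in pandas; other idioms work just as well.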
[Blog Post: PMFs and PDFs] Allen Downey addresses a common question about the differences between probability mass functions (PMFs) and probability density functions (PDFs). He explains that PMFs and PDFs are normalized differently, making them incomparable on the same scale, and demonstrates methods to compare them accurately by normalizing and using cumulative distribution functions (CDFs). Downey illustrates this with a Bayesian updating example involving baseball batting averages, showing both theoretical and numerical approaches to understand the distributions better.
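The normalization point can be sketched in a few lines of plain Python. This is our own illustration under stated assumptions, not code from the post: the Beta(8, 25) prior, the 3-hits-in-10-at-bats data, and the 200-point grid are all hypothetical choices. It evaluates a PDF on a grid, rescales it into a PMF, builds a CDF, and checks that a numerical Bayesian update agrees with the conjugate (theoretical) result once both are on the same scale.

```python
import math
from itertools import accumulate

def beta_pdf(x, a, b):
    """Density of a Beta(a, b) distribution at x, for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))

# Hypothetical prior over batting averages and hypothetical data.
a, b = 8.0, 25.0
hits, misses = 3, 7

# Midpoints of n equal-width bins on (0, 1).
n = 200
dx = 1.0 / n
xs = [(i + 0.5) / n for i in range(n)]

# A PDF integrates to 1, so density * dx sums to about 1 ...
densities = [beta_pdf(x, a, b) for x in xs]
integral = sum(d * dx for d in densities)

# ... while a PMF sums to 1, so we rescale the densities by their total.
total = sum(densities)
prior_pmf = [d / total for d in densities]

# CDFs share a 0-to-1 scale whichever normalization we started from.
prior_cdf = list(accumulate(prior_pmf))

# Numerical update: prior PMF times binomial likelihood, renormalized.
unnorm = [p * x**hits * (1 - x)**misses for p, x in zip(prior_pmf, xs)]
z = sum(unnorm)
posterior_numeric = [u / z for u in unnorm]

# Theoretical update: Beta(a + hits, b + misses), rescaled onto the same grid.
post_dens = [beta_pdf(x, a + hits, b + misses) for x in xs]
post_total = sum(post_dens)
posterior_theory = [d / post_total for d in post_dens]

max_diff = max(abs(p - q) for p, q in zip(posterior_numeric, posterior_theory))
print(f"integral of PDF: {integral:.4f}, sum of PMF: {sum(prior_pmf):.4f}")
print(f"peak density: {max(densities):.3f}, peak mass: {max(prior_pmf):.4f}")
print(f"max |numeric - theory|: {max_diff:.2e}")
```

The peak density is well above 1 while the peak mass is far below it, which is why plotting the raw PDF and PMF on one axis is misleading until they are normalized the same way.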
As always, you can contact us by replying directly to this email or by writing to the Data Science Lab at data@fredhutch.org, and you are welcome to join us on the Fred Hutch Data Slack Workspace. For more information about the Fred Hutch Data Science Lab, visit our website: https://hutchdatascience.org/. See you next week!
- The Fred Hutch Data Science Laboratory