As data scientists, we’re often most excited about the final layer of analysis. Once all the data are cleaned and stored in a format readable by our favorite programming language (Python, R, Stata, etc.), the most fun part of our work is uncovering counterintuitive causal relationships with statistical methods. If you can show that the mutual presence of McDonald’s really does prevent wars between countries, or that an increase in diversity really does boost business profitability, that is a good day.

When you import or load data into R, the data are stored in random-access memory (RAM). RAM is very fast but temporary: its contents are lost when you close R or shut off your computer. If you save your data, they are written to your hard drive, but when you open R again and load them, they are once again copied into RAM. While many newer computers come with plenty of RAM (such as 16 GB), the supply is finite, and R has to share it with everything else that is running. RStudio itself uses RAM even when no data are loaded, and so does your web browser or any other program you open.
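To make this concrete, here is a minimal sketch of the round trip between RAM and disk, using only base R. The data frame `df` and the temporary file `path` are made up for illustration, and the exact size reported will depend on your system.

```r
# Build an illustrative data frame with one million rows of random numbers.
# It lives in RAM as soon as it is created.
df <- data.frame(x = rnorm(1e6), y = rnorm(1e6))

# Report roughly how much memory the object occupies.
print(object.size(df), units = "MB")

# Write the data to disk (here a temporary file, as an assumption).
path <- tempfile(fileext = ".rds")
saveRDS(df, path)

# Remove the object from the session and ask R to release unused memory.
rm(df)
invisible(gc())

# Loading the file copies the data from disk back into RAM.
df <- readRDS(path)
print(object.size(df), units = "MB")
```

The file on disk survives after you close R, while the copy held in `df` exists only for as long as the session does.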