Python

Bootstrap Estimates of Confidence Intervals

Bootstrapping is a statistical procedure that utilizes resampling (with replacement) of a sample to infer properties of a wider population.
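
As a rough illustration of the resampling idea described above, here is a minimal sketch of a percentile bootstrap confidence interval for a mean, using only the standard library. The sample values, the `bootstrap_ci` helper, and its parameters are made up for illustration and are not taken from the article; the percentile method is only one of several ways to form a bootstrap interval.

```python
import random
import statistics

def bootstrap_ci(sample, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `stat` (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(sample)
    # Resample with replacement, same size as the original sample,
    # and record the statistic for each resample.
    estimates = sorted(
        stat(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    # Take the alpha/2 and 1 - alpha/2 percentiles of the estimates.
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]

# Invented data, purely for demonstration.
sample = [4.1, 5.3, 3.8, 6.0, 5.5, 4.7, 5.1, 4.9, 6.2, 5.0]
low, high = bootstrap_ci(sample)
print(f"Approximate 95% CI for the mean: ({low:.2f}, {high:.2f})")
```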

Logistic Regression Four Ways with Python

Logistic regression is a predictive analysis that models the probability of an event occurring based on a given dataset. This dataset contains both independent variables, or predictors, and their corresponding dependent variables, or responses.
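
To make the idea of modeling a probability from predictors concrete, here is a small from-scratch sketch with one predictor, fit by plain gradient descent on the log-loss. It is not one of the approaches the article itself covers; the `fit_logistic` helper, the learning rate, and the `hours`/`passed` data are all assumptions made up for illustration.

```python
import math

def sigmoid(z):
    """Map a real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, learning_rate=0.1, n_iter=5000):
    """Fit P(y=1 | x) = sigmoid(w*x + b) by gradient descent (hypothetical helper)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iter):
        # Difference between predicted probability and observed response.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Invented data: predictor (hours studied) and binary response (passed or not).
hours = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
passed = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = fit_logistic(hours, passed)
p_low = sigmoid(w * 0.5 + b)   # estimated probability at 0.5 hours
p_high = sigmoid(w * 4.0 + b)  # estimated probability at 4.0 hours
print(f"P(pass | 0.5 h) = {p_low:.2f}, P(pass | 4.0 h) = {p_high:.2f}")
```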

Getting Started with the Kruskal-Wallis Test

One of the most well-known statistical tests to analyze the differences between means of given groups is the ANOVA (analysis of variance) test. While ANOVA is a great tool, it assumes that the data in question follows a normal distribution. What if your data doesn’t follow a normal distribution or if your sample size is too small to determine a normal distribution? That’s where the Kruskal-Wallis test comes in.
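
For a sense of what the test computes, here is a minimal sketch of the Kruskal-Wallis H statistic, which ranks the pooled observations and compares rank sums across groups. The `kruskal_wallis_h` helper and the three groups are invented for illustration, and the sketch omits the tie correction and the p-value; in practice one would more likely call `scipy.stats.kruskal`, which handles both.

```python
def kruskal_wallis_h(*groups):
    """H statistic without tie correction (hypothetical helper)."""
    pooled = sorted(value for group in groups for value in group)
    n_total = len(pooled)

    # Assign each distinct value the average of the ranks it occupies.
    ranks = {}
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2  # mean of 1-based ranks i+1..j
        i = j

    # Sum of (rank sum squared / group size) over the groups.
    rank_term = sum(
        sum(ranks[v] for v in group) ** 2 / len(group) for group in groups
    )
    return 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)

# Invented groups, purely for demonstration.
group_a = [1, 2, 3]
group_b = [4, 5, 6]
group_c = [7, 8, 9]
h = kruskal_wallis_h(group_a, group_b, group_c)
print(f"H = {h:.2f}")
```

Under the null hypothesis, H is compared against a chi-squared distribution with (number of groups - 1) degrees of freedom, although that approximation is weak for groups this small.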

Getting Started with Web Scraping in Python

"Web scraping," or "data scraping," is simply the process of extracting data from a website. This can, of course, be done manually: You could go to a website, find the relevant data or information, and enter that information into some data file that you have stored locally. But imagine that you want to pull a very large dataset or data from hundreds or thousands of individual URLs. In this case, extracting the data manually sounds overwhelming and time-consuming.

Getting Started with pandas in Python

The pandas package is an open-source software library written for data analysis in Python. Pandas allows users to import data from various file formats (comma-separated values, JSON, SQL, FITS, etc.) and perform data manipulation operations, including cleaning and reshaping the data, summarizing observations, grouping data, and merging multiple datasets. In this article, we'll briefly explore some of the most commonly used functions and methods for understanding, formatting, and visualizing data with the pandas package.

A Guide to Python in QGIS

This post is something I’ve been thinking about writing for a while. I was inspired to write it by my own trials and tribulations, which are still ongoing, while working with the QGIS API, trying to programmatically do stuff in QGIS instead of relying on available widgets and plugins. I have spent, and will probably continue to spend, many hours scouring the internet, and especially Stack Overflow, looking for answers on how to use various classes, methods, attributes, etc.

How to Create and Export Print Layouts in Python for QGIS 3

I've been struggling off and on for literally months trying to create and export a print layout using Python for QGIS 3, or PyQGIS 3 for short. I have finally figured out many of the ins and outs of the process, and hopefully this will serve as a guide to save someone else a lot of effort and time.

How to Apply a Graduated Color Symbology to a Layer Using Python for QGIS 3

I was recently working on a project in QGIS 3 with a member of UVA Health's Oncology department. This person wanted to take a set of patient data (after identifying info had been removed) and, after doing some other stuff, apply a graduated color scheme to the results, shading them from light to dark based on intensity.

You can find a sample dataset for this project here:

https://github.com/epurpur/PyQGIS-Scripts/blob/master/TestZipCodes.zip

How to Use the Field Calculator in Python for QGIS 3

Recently, I have taken the dive into Python scripting in QGIS. QGIS is a really nice open-source (and free!) alternative to ESRI's ArcGIS. While QGIS is a little quirky and generally not quite as user-friendly as ArcGIS, it still provides nearly the same functionality. Personally, I've become a fan of it and have now even taught a short, one-credit course in the University of Virginia's Batten School of Public Policy titled GIS for Public Policy.