SQL Style for Data Scientists - Query Syntax

Here are some general rules that I try to follow when writing queries against a SQL database. There will always be exceptions, but you might want to keep these in mind when you're working in your own SQL environment.

SQL Style for Data Scientists - Introduction

Opinions on SQL style are freely available, but they often vary, and I sometimes find it hard to keep them all straight. For that reason, I have begun to compile a basic set of style guidelines that I hope to commit to memory and use when I'm coding in SQL.

Multiprocessing in Python

Parallel computing is a powerful tool, and Python makes it easy. If you are dealing with a long-running program, Python's multiprocessing package might be what you need.

Big Data, Little Laptop

Distributed computing is a powerful thing, but the use of traditional computing systems to store, process, and analyze data is often unavoidable. When the size of your data is so great that your laptop slows to a crawl, or when you need to develop and test your code locally before deploying it in a distributed environment, a few simple Python tools and methods might be all you need...

Connecting to a SQL Database with Python

SQL is everywhere, and if you are doing any sort of analysis in an enterprise setting, it is more likely than not that you will need to access a SQL database for at least some of your data. With the pandas library, extracting data from a SQL database in a Jupyter notebook is almost trivial, but before we can extract the data, we need to establish a connection to the database.