Datacamp has put out a really cool cheat sheet for Pandas — everybody’s favourite Python data science library.
The fast, flexible, and expressive Pandas data structures are designed to make real-world data analysis significantly easier, but that might not be immediately obvious to those who are just getting started, precisely because there is so much functionality built into the package that the options are overwhelming.
That’s where this cheat sheet might come in handy.
It’s a quick guide through the basics of Pandas that you will need to get started on wrangling your data with Python.
Google has just released a series of videos to teach machine learning.
The first step is, however, installing and playing with Anaconda — a completely free Python distribution (including for commercial use and redistribution). It includes more than 400 of the most popular Python packages for science, math, engineering, and data analysis.
Choose the command line installer (on OSX) — it will save you a LOT of bother.
Installing Anaconda also means getting to know and love Conda — a package manager application that quickly installs, runs, and updates packages and their dependencies. It seems to be like pip, but better?
Conda has a test drive, which I am now trying out. Notes as I go along —
- Step one failed, so I needed to reinstall using the command line installer. Chrome blocks the download as malicious, so I fetched the file with curl and ran the installation from there. I then had to edit .bash_profile so that the PATH variable includes the conda directory. Everything seems to be working now.
- I ran through the test drive in about half the suggested time. The most useful thing was this conda cheat sheet I downloaded. Key commands:
Create an environment
conda create -n snowflakes biopython
Switch to the environment
source activate snowflakes
Remove an environment
conda remove -n snowflakes --all
Install a new package to an environment
conda install -n snowflakes beautiful-soup
- Now creating an environment called datalab and installing the scikit-learn package:
conda create -n datalab scikit-learn
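For reference, the PATH edit from the first note might look roughly like the sketch below when added to .bash_profile. The install location is an assumption (a hypothetical ~/anaconda directory, held in a made-up CONDA_BIN variable); substitute wherever the installer actually placed Anaconda on your machine.

```shell
# Hypothetical install location; adjust to wherever the installer put Anaconda.
CONDA_BIN="$HOME/anaconda/bin"

# Prepend the directory to PATH only if it is not already present,
# so re-sourcing the profile does not pile up duplicate entries.
case ":$PATH:" in
  *":$CONDA_BIN:"*) ;;
  *) PATH="$CONDA_BIN:$PATH" ;;
esac
export PATH
```

After opening a new terminal (or sourcing .bash_profile), the conda command should resolve from that directory. Prepending rather than appending means the Anaconda Python is found before any system Python, which is usually what you want here.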