Key Points
- Introduction to Python
- Python basics
- Python control structures
- Creating re-usable code
- Processing data from a file
- Dates and Time
- Processing JSON data
- Reading data from a file using Pandas
- Extracting row and columns
- Data Aggregation using Pandas
- Joining Pandas Dataframes
- Wide and long data formats
- Data visualisation using Matplotlib
- Accessing SQLite Databases
Glossary
- 0-based indexing
- is a way of assigning indices to elements in a sequential, ordered data structure starting from 0, i.e. where the first element of the sequence has index 0.
- attribute
- a property of an object that can be viewed, accessed with a . but without (), for example df.dtypes
- boolean
- a data type that can be True or False
- cast
- the process of changing the type of a variable. In Python, the data type names operate as functions for casting, for example int(3.5)
- CSV (file)
- is an acronym for Comma-Separated Values. CSV files store tabular data (numbers, strings, or a combination of the two) in plain text, with columns separated by commas and rows separated by line breaks.
- database
- is an organized collection of data.
- DataFrame
- is a two-dimensional labeled data structure with columns of (potentially) different type.
- data structure
- is a particular way of organizing data in memory.
- data type
- is a particular kind of item that can be assigned to a variable, defined by the values it can take, the programming language in use and the operations that can be performed on it. Examples: int (integer), str (string), float, boolean, list.
- docstring
- is a recommended documentation string that describes what a Python function does.
- faceting
- is the act of plotting relationships between set variables in multiple subsets of the data with the results appearing as different panels in the same figure.
- float
- is a Python data type designed to store positive and negative decimal numbers by means of a floating point representation.
- function
- is a group of related statements that perform a specific task.
- integer
- is a Python data type designed to store positive and negative integer numbers.
- interactive mode
- is an online mode of operation in which the user writes commands directly on the command line one by one and executes them immediately by pressing a key on the keyboard, usually Enter.
- join key
- is a variable or an array representing the column names over which pandas.DataFrame.join() merges together columns of different data sets.
- library
- is a set of functions and methods grouped together to perform a specific kind of task.
- list
- is a Python data structure designed to contain sequences of integers, floats, strings and any combination of the previous. The sequence is ordered and indexed by integers, starting from 0. Elements of a list can be accessed by their index and can be modified.
- loop
- is a sequence of instructions that is continually repeated until a condition is satisfied.
- method
- a function that is specific to a type of data, accessed via . and requiring () to run, for example df.sum()
- NaN
- is an acronym for Not-a-Number and represents that either a value is missing or the calculation cannot output any meaningful result.
- None
- is an object that represents no value.
- scripting mode
- is an offline mode of operation in which the user writes the commands to be executed in a text file (with a .py extension for Python), which is then compiled or interpreted to run the program. Note that Python interprets scripts at run time and compiles a bytecode version of the program to speed up execution.
- sequential (data structure)
- is an ordered group of objects stored in memory which can be accessed by specifying their index, i.e. their position, in the structure.
- string
- is a Python data type designed to store sequences of characters.
- tuple
- is a Python data structure designed to contain sequences of integers, floats, strings and any combination of the previous. The sequence is ordered and indexed by integers, starting from 0. Elements of a tuple can be accessed by their index but cannot be modified.
- variable
- a named quantity that can store a value. A variable can store a value of any type, but a given value always has exactly one type.
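Several of the glossary terms above (cast, 0-based indexing, list, tuple, attribute, method, docstring, None and NaN) can be seen together in a short sketch. It uses only the Python standard library, and every name in it (describe, scores, point, z and so on) is made up for illustration rather than taken from the lessons.

```python
import math


def describe(value):
    """Return a short text naming the data type of value."""
    return f"{value!r} has type {type(value).__name__}"


# cast: data type names act as functions
whole = int(3.5)        # drops the fractional part
text = str(whole)       # the integer becomes a string

# 0-based indexing: the first element has index 0
scores = [10, 20, 30]   # a list can be modified
first = scores[0]
scores[0] = 99

point = (1.5, 2.5)      # a tuple cannot be modified
try:
    point[0] = 0.0
    tuple_changed = True
except TypeError:
    tuple_changed = False

# attribute (no parentheses) versus method (parentheses required)
z = complex(3, 4)
real_part = z.real          # attribute
conjugate = z.conjugate()   # method

# None represents no value; NaN is a float that is not a meaningful number
missing = None
not_a_number = float("nan")
is_nan = math.isnan(not_a_number)
nan_equals_itself = not_a_number == not_a_number  # NaN never equals itself
```

Because NaN does not compare equal to itself, a check such as math.isnan() is the safer way to test for it, whereas None is normally tested with "is None".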
Jupyter Notebook Hints
Esc will take you into command mode, where you can navigate around your notebook with the arrow keys.
While in command mode:
- A to insert a new cell above the current cell
- B to insert a new cell below the current cell
- M to change the current cell to Markdown
- Y to change it back to code
- D + D (press the key twice) to delete the current cell
- Enter will take you from command mode back into edit mode for the given cell.
While in edit mode:
- Ctrl + Shift + - will split the current cell into two from where your cursor is.
- Shift + Enter will run the current cell
Full Shortcut Listing
Cmd + Shift + P (or Ctrl + Shift + P on Linux and Windows)