pandas_notes
common pandas data types
data type | description | supports missing values |
---|---|---|
float | The NumPy float type | Yes |
int | The NumPy integer type | No |
'Int64' | pandas nullable integer type | Yes |
object | The NumPy type for storing strings (and mixed types) | |
'category' | pandas categorical type | Yes |
bool | The NumPy Boolean type | No. None becomes False, np.nan becomes True. |
'boolean' | pandas nullable Boolean type | Yes |
datetime64[ns] | The NumPy date type | Yes (NaT) |
Ref: Pandas 1.x Cookbook, second edition, by Matt Harrison and Theodore Petrou (published in 2020), Chapter 1, page 7
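A quick sketch to check a few rows of the table above. The variable names are mine, and the Boolean row is illustrated with a plain NumPy object-array cast, which I assume is the conversion the table refers to.

```python
import numpy as np
import pandas as pd

# NumPy float stores a missing value as NaN.
s_float = pd.Series([1, 2, np.nan])
print(s_float.dtype)

# The pandas nullable integer type keeps the integers and marks the gap as missing.
s_int = pd.Series([1, 2, None], dtype="Int64")
print(s_int.dtype, s_int.isna().sum())

# The pandas nullable Boolean type also supports missing values.
s_boolean = pd.Series([True, None], dtype="boolean")
print(s_boolean.isna().sum())

# NumPy bool has no missing value: each element goes through bool(),
# so None becomes False and np.nan becomes True.
as_bool = np.array([None, np.nan], dtype=object).astype(bool)
print(as_bool.tolist())
```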
What packages does pandas depend on?
- Recommended dependencies - https://pandas.pydata.org/pandas-docs/stable/getting_started/install.html#recommended-dependencies
- Optional dependencies - https://pandas.pydata.org/pandas-docs/stable/getting_started/install.html#optional-dependencies
links I came across
- https://github.com/pandas-dev/pandas/releases - pandas release history
- http://pandas.pydata.org/pandas-docs/stable/getting_started/install.html - pandas installation page. Contains instructions to install pandas in various ways.
- http://pandas.pydata.org/pandas-docs/stable/ - pandas official documentation
- https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html - good reference to learn about iloc, slicing ranges
- DuckDB can run SQL queries directly on Parquet files and automatically take advantage of the advanced features of the Parquet format.
- https://duckdb.org/2021/06/25/querying-parquet.html - gives more details
- tags | pandas, streaming
pandas.errors.EmptyDataError
Sample code that raises an EmptyDataError exception:
>>> import pandas as pd
>>> from io import StringIO
>>> empty = StringIO()
>>> pd.read_csv(empty)
Traceback (most recent call last):
...
pandas.errors.EmptyDataError: No columns to parse from file
Ref: https://pandas.pydata.org/pandas-docs/dev/reference/api/pandas.errors.EmptyDataError.html - Got the initial version of the code from here.
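One possible way to handle this exception is sketched below. The helper name read_csv_or_empty is hypothetical (not part of pandas), and returning an empty DataFrame is just one choice of fallback.

```python
from io import StringIO

import pandas as pd
from pandas.errors import EmptyDataError


def read_csv_or_empty(source):
    """Parse a CSV source; fall back to an empty DataFrame if it has no data."""
    try:
        return pd.read_csv(source)
    except EmptyDataError:
        # Hypothetical fallback: callers can check df.empty instead of catching.
        return pd.DataFrame()


df_empty = read_csv_or_empty(StringIO())
df_full = read_csv_or_empty(StringIO("a,b\n1,2\n"))
print(df_empty.empty, df_full.shape)
```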
pandas.io.common.EmptyDataError is deprecated
Instead of
pandas.io.common.EmptyDataError
use
pandas.errors.EmptyDataError
data point: as of pandas 1.1.2, pandas.io.common.EmptyDataError no longer works.
Ref:
- From the pandas 0.20.0 release notes: "We are adding a standard public module for all pandas exceptions & warnings pandas.errors. (GH14800). Previously these exceptions & warnings could be imported from pandas.core.common or pandas.io.common. These exceptions and warnings will be removed from the *.common locations in a future release. (GH15541)"
- https://pandas.pydata.org/docs/reference/api/pandas.errors.EmptyDataError.html - documentation from latest stable version
- https://pandas.pydata.org/pandas-docs/version/1.4/reference/api/pandas.errors.EmptyDataError.html - documentation from version 1.4
- https://pandas.pydata.org/pandas-docs/version/0.20/generated/pandas.errors.EmptyDataError.html - documentation from version 0.20
assert_frame_equal
Instead of
from pandas.util.testing import assert_frame_equal
use
from pandas.testing import assert_frame_equal
data point:
Using from pandas.util.testing import assert_frame_equal
in pandas 1.1.2, I get
FutureWarning: pandas.util.testing is deprecated. Use the functions in the public API at pandas.testing instead.
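A minimal usage sketch with the public import path. The DataFrames here are made-up sample data.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

left = pd.DataFrame({"a": [1, 2], "b": [0.1, 0.2]})
right = pd.DataFrame({"a": [1, 2], "b": [0.1, 0.2]})

# Equal frames: the call returns None and raises nothing.
assert_frame_equal(left, right)

# Differing frames: the call raises AssertionError describing the mismatch.
different = pd.DataFrame({"a": [1, 3], "b": [0.1, 0.2]})
try:
    assert_frame_equal(left, different)
    frames_match = True
except AssertionError:
    frames_match = False

print(frames_match)
```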
pandas_notes.txt · Last modified: 2024/10/19 20:31 by raju