Ref:
- pandas.DataFrame()
- pandas.Series()
- pandas.read_csv()
- pandas.DataFrame.shape
- pandas.DataFrame.head
- pandas.read_excel()
- pandas.DataFrame.to_csv()
- pandas.DataFrame.to_excel()
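As a rough illustration of how the entries above fit together, the sketch below writes a small frame to CSV text and reads it back. The frame contents and the in-memory `StringIO` buffer are assumptions made for the example; a file path would work in the same place.

```python
from io import StringIO

import pandas as pd

# A small illustrative frame (values are made up for this example).
df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# to_csv is a DataFrame method; here it writes into an in-memory buffer.
buffer = StringIO()
df.to_csv(buffer, index=False)

# Rewind the buffer and read the CSV text back with the top-level read_csv.
buffer.seek(0)
restored = pd.read_csv(buffer)

n_rows, n_cols = restored.shape   # shape is an attribute, not a method
first_row = restored.head(1)      # head(n) returns the first n rows
```

Passing `index=False` keeps the row labels out of the CSV, so the frame read back has the same columns as the original.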
Ref:
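- pandas.DataFrame.iloc[]: position-based indexing, similar to the Cell function in Excel; treat the DataFrame as a matrix
- pandas.DataFrame.loc[]: label-based indexing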
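A minimal sketch of the difference between the two indexers, assuming a small frame with the hypothetical labels `Row1`/`Row2` and `Col1`/`Col2`. Both are accessed with square brackets rather than called like functions.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2], [3, 4]],
    index=["Row1", "Row2"],
    columns=["Col1", "Col2"],
)

# iloc uses integer positions, like matrix coordinates (row, column).
first_cell = df.iloc[0, 0]
second_row = df.iloc[1]

# loc uses the row and column labels instead.
same_cell = df.loc["Row1", "Col1"]
second_col = df.loc[:, "Col2"]
```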
1. Basic concepts
- class pandas.DataFrame
Parameters:
data : numpy ndarray (structured or homogeneous), dict, or DataFrame
The main body of the data. A dict can contain Series, arrays, constants, or list-like objects.
Changed in version 0.23.0: If data is a dict, argument order is maintained for Python 3.6 and later.
index : Index or array-like
Row labels for the resulting frame. Defaults to RangeIndex (0, 1, 2, ..., n) if the input data carries no indexing information and no index is provided.
columns : Index or array-like
Column labels for the resulting frame. Defaults to RangeIndex (0, 1, 2, ..., n) if no column labels are provided.
dtype : dtype, default None
Data type to force. Only a single dtype is allowed. If None, the type is inferred.
copy : boolean, default False
Copy data from inputs. Only affects DataFrame / 2d ndarray input.
pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)
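In the example below, data[1:, 0] selects the first column without its header cell (used as the row labels), and data[0, 1:] selects the first row without its corner cell (used as the column labels).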
>>> import numpy as np
>>> import pandas as pd
>>> data = np.array([['', 'Col1', 'Col2'],
...                  ['Row1', 1, 2],
...                  ['Row2', 3, 4]])
>>> print(pd.DataFrame(data=data[1:, 1:], index=data[1:, 0], columns=data[0, 1:]))
     Col1 Col2
Row1    1    2
Row2    3    4
or
>>> data = np.array([[1, 2], [3, 4]])
>>> print(pd.DataFrame(data=data, index=['Row1', 'Row2'], columns=['Col1', 'Col2']))
      Col1  Col2
Row1     1     2
Row2     3     4
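The parameter list also allows a dict as `data`. A small sketch of that variant, reusing the labels from the examples above; the `dtype=float` choice is only an assumption to show the parameter in use.

```python
import pandas as pd

# With a dict, the keys become the column labels.
data = {"Col1": [1, 3], "Col2": [2, 4]}

# index supplies the row labels; dtype forces one type for all columns.
df = pd.DataFrame(data, index=["Row1", "Row2"], dtype=float)

# Without index, the rows fall back to the default RangeIndex 0, 1, ...
default_index_df = pd.DataFrame(data)
```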
2. Related methods:
DataFrame.apply(func, axis=0, broadcast=None, raw=False, reduce=None, result_type=None, args=(), **kwds)
Apply a function along an axis of the DataFrame. (Similar to applying some function to a whole column or a whole row of data in Excel.)
Objects passed to the function are Series objects whose index is either the DataFrame's index (axis=0) or the DataFrame's columns (axis=1).
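To make the axis behaviour concrete, here is a minimal sketch. The helper `value_range` and the frame contents are hypothetical, chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"Col1": [1, 3], "Col2": [2, 4]},
    index=["Row1", "Row2"],
)

def value_range(series):
    # Receives one column (axis=0) or one row (axis=1) as a Series.
    return series.max() - series.min()

# axis=0: the function is called once per column.
per_column = df.apply(value_range, axis=0)

# axis=1: the function is called once per row.
per_row = df.apply(value_range, axis=1)
```

The result of `axis=0` is indexed by the column labels and the result of `axis=1` by the row labels, since each call collapses one Series into a single value.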
Series.value_counts(normalize=False, sort=True, ascending=False, bins=None, dropna=True)
Returns object containing counts of unique values.
The resulting object will be in descending order so that the first element is the most frequently occurring element. Excludes NA values by default.
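A short sketch of the behaviour described above, using a made-up Series that contains one missing value.

```python
import pandas as pd

s = pd.Series(["a", "b", "a", None, "a", "b", "c"])

# Default: counts sorted in descending order, missing values excluded.
counts = s.value_counts()

# normalize=True returns relative frequencies instead of counts.
shares = s.value_counts(normalize=True)

# dropna=False also counts the missing value.
with_na = s.value_counts(dropna=False)
```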
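pandas.read_csv(): a str can be wrapped in io.StringIO to turn it into an in-memory file buffer, which read_csv() can then read directly.
>>> from io import StringIO
>>> a = '''
... A, B, C
... 1,2,3
... 4,5,6
... 7,8,9
... '''
>>> a
'\nA, B, C\n1,2,3\n4,5,6\n7,8,9\n'
>>> data = pd.read_csv(StringIO(a))
>>> data
   A   B   C
0  1   2   3
1  4   5   6
2  7   8   9
Because the header line has a space after each comma, the parsed column names are 'A', ' B' and ' C'; passing skipinitialspace=True to read_csv() strips those leading spaces.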