Get a list of pandas DataFrame columns based on data type


crosscheck 2020. 5. 31. 10:05


If I have a DataFrame with the following columns:

1. NAME                                     object
2. On_Time                                      object
3. On_Budget                                    object
4. %actual_hr                                  float64
5. Baseline Start Date                  datetime64[ns]
6. Forecast Start Date                  datetime64[ns]

I would like to be able to say: here is a DataFrame, give me a list of the columns that are of type object or of type datetime.

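I have a function that converts numbers (float64) to two decimal places. I would like to take the list of DataFrame columns of a particular type and run it through this function to convert them all to 2dp.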

Maybe:

col_list = []
for c in df.columns:
    if df[c].dtype == "float64":  # or "object", "datetime64[ns]", ...
        col_list.append(c)

If you want a list of columns of a certain type, you can use groupby:

>>> df = pd.DataFrame([[1, 2.3456, 'c', 'd', 78]], columns=list("ABCDE"))
>>> df
   A       B  C  D   E
0  1  2.3456  c  d  78

[1 rows x 5 columns]
>>> df.dtypes
A      int64
B    float64
C     object
D     object
E      int64
dtype: object
>>> g = df.columns.to_series().groupby(df.dtypes).groups
>>> g
{dtype('int64'): ['A', 'E'], dtype('float64'): ['B'], dtype('O'): ['C', 'D']}
>>> {k.name: v for k, v in g.items()}
{'object': ['C', 'D'], 'int64': ['A', 'E'], 'float64': ['B']}
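The transcript above comes from an older pandas release, and the exact shape of `.groups` can differ between versions. As a rough sketch of the same grouping idea (a plain-Python alternative, not the approach from the answer), the mapping can be built by iterating over `df.dtypes` directly. The name `cols_by_dtype` is made up for this illustration.

```python
from collections import defaultdict

import pandas as pd

df = pd.DataFrame([[1, 2.3456, 'c', 'd', 78]], columns=list("ABCDE"))

# Map each dtype name to the list of column labels that have that dtype.
cols_by_dtype = defaultdict(list)
for col, dtype in df.dtypes.items():
    cols_by_dtype[str(dtype)].append(col)

cols_by_dtype = dict(cols_by_dtype)
print(cols_by_dtype)
```

The keys are strings such as 'int64', so a lookup like `cols_by_dtype["float64"]` returns the matching column labels in their original order.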

As of pandas v0.14.1, you can use select_dtypes() to select columns by dtype.

In [2]: df = pd.DataFrame({'NAME': list('abcdef'),
    'On_Time': [True, False] * 3,
    'On_Budget': [False, True] * 3})

In [3]: df.select_dtypes(include=['bool'])
Out[3]:
  On_Budget On_Time
0     False    True
1      True   False
2     False    True
3      True   False
4     False    True
5      True   False

In [4]: mylist = list(df.select_dtypes(include=['bool']).columns)

In [5]: mylist
Out[5]: ['On_Budget', 'On_Time']
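For the original question (object or datetime columns), `select_dtypes` also accepts several dtypes at once, plus an `exclude` argument. The following is a minimal sketch; the frame is a hypothetical stand-in that only mirrors the dtypes listed in the question, and the string columns are created with `dtype=object` explicitly so the example does not depend on how a given pandas version infers strings.

```python
import pandas as pd

# Hypothetical data; only the dtypes matter here.
df = pd.DataFrame({
    "NAME": pd.Series(["a", "b"], dtype=object),
    "On_Time": pd.Series(["yes", "no"], dtype=object),
    "%actual_hr": [1.2345, 2.3456],
    "Baseline Start Date": pd.to_datetime(["2020-01-01", "2020-02-01"]),
})

# Several dtypes at once; "datetime" matches datetime64 columns.
obj_or_dt_cols = df.select_dtypes(include=["object", "datetime"]).columns.tolist()

# The complement: everything that is not float64.
non_float_cols = df.select_dtypes(exclude=["float64"]).columns.tolist()

print(obj_or_dt_cols)
print(non_float_cols)
```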

Using dtype gives you the data type of the column you want:

dataframe['column1'].dtype

If you want to know the data types of all columns at once, use the plural form, dtypes:

dataframe.dtypes
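Building on the per-column `dtype` attribute, one possible way to turn it into a column list is to test each column with the helper predicates in `pandas.api.types`. This is a sketch rather than part of the answer above; the frame and the names `float_cols` and `datetime_cols` are assumptions made for illustration.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype, is_float_dtype

# Hypothetical frame using column names from the question.
df = pd.DataFrame({
    "%actual_hr": [1.2345],
    "Forecast Start Date": pd.to_datetime(["2020-05-31"]),
    "NAME": pd.Series(["a"], dtype=object),
})

# Keep the labels of columns whose dtype passes each predicate.
float_cols = [c for c in df.columns if is_float_dtype(df[c].dtype)]
datetime_cols = [c for c in df.columns if is_datetime64_any_dtype(df[c].dtype)]

print(float_cols)
print(datetime_cols)
```

The predicates match whole families of dtypes (for example float32 as well as float64), which may or may not be what you want compared with an exact `==` check.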

You can use boolean mask on the dtypes attribute:

In [11]: df = pd.DataFrame([[1, 2.3456, 'c']])

In [12]: df.dtypes
Out[12]: 
0      int64
1    float64
2     object
dtype: object

In [13]: msk = df.dtypes == np.float64  # or object, etc.

In [14]: msk
Out[14]: 
0    False
1     True
2    False
dtype: bool

You can look at just those columns with the desired dtype:

In [15]: df.loc[:, msk]
Out[15]: 
        1
0  2.3456

Now you can use round (or whatever) and assign it back:

In [16]: np.round(df.loc[:, msk], 2)
Out[16]: 
      1
0  2.35

In [17]: df.loc[:, msk] = np.round(df.loc[:, msk], 2)

In [18]: df
Out[18]: 
   0     1  2
0  1  2.35  c
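The same rounding step can also be written without NumPy, combining `select_dtypes` with `DataFrame.round`. The sketch below assumes only float64 columns should be rounded; the column labels A, B, C are added here for readability and are not in the transcript above.

```python
import pandas as pd

df = pd.DataFrame([[1, 2.3456, 'c']], columns=list("ABC"))

# Labels of the float64 columns.
float_cols = df.select_dtypes(include="float64").columns

# Round just those columns to two decimal places and assign back.
df[float_cols] = df[float_cols].round(2)

print(df)
```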

Use df.info(), where df is a pandas DataFrame.

df.select_dtypes(['object'])

This should do the trick

If you want a list of only the non-numeric columns (object, and anything else that is not float64 or int64), you could do:

non_numerics = [x for x in df.columns
                if not (df[x].dtype == np.float64
                        or df[x].dtype == np.int64)]

and then if you want to get another list of only the numerics:

numerics = [x for x in df.columns if x not in non_numerics]
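One caveat with the comprehension above is that it only recognises float64 and int64, so a float32 or int32 column would land in the non-numeric list. A possible alternative, sketched here with a made-up frame, is to let `select_dtypes` match the generic "number" family.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with narrower numeric dtypes.
df = pd.DataFrame({
    "a": np.array([1, 2], dtype="int32"),
    "b": np.array([1.5, 2.5], dtype="float32"),
    "c": pd.Series(["x", "y"], dtype=object),
})

# "number" covers all integer and floating-point widths.
numerics = df.select_dtypes(include="number").columns.tolist()
non_numerics = df.select_dtypes(exclude="number").columns.tolist()

print(numerics)
print(non_numerics)
```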

The most direct way to get a list of columns of certain dtype e.g. 'object':

df.select_dtypes(include='object').columns

For example:

>>> df = pd.DataFrame([[1, 2.3456, 'c', 'd', 78]], columns=list("ABCDE"))
>>> df.dtypes

A      int64
B    float64
C     object
D     object
E      int64
dtype: object

To get all 'object' dtype columns:

>>> df.select_dtypes(include='object').columns

Index(['C', 'D'], dtype='object')

For just the list:

>>> list(df.select_dtypes(include='object').columns)

['C', 'D']

I came up with this three liner.

Essentially, here's what it does:

Fetch the column names and their respective data types.
I am optionally outputting it to a csv.

import pandas as pd

inp = pd.read_csv('filename.csv')  # read input; add read_csv arguments as needed
columns = pd.DataFrame({'column_names': inp.columns, 'datatypes': inp.dtypes})
# write the column/dtype table to its own CSV file
columns.to_csv('columns_list.csv', encoding='utf-8')  # encoding is optional

This made my life much easier in trying to generate schemas on the fly. Hope this helps

for yoshiserry;

def col_types(x):
    # Map each column name of DataFrame x to its dtype.
    dtypes = x.dtypes
    dtypes_col = dtypes.index
    dtypes_type = dtypes.values
    column_types = dict(zip(dtypes_col, dtypes_type))
    return column_types
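A shorter route to the same kind of mapping is `Series.to_dict()` on `df.dtypes`. This is an alternative sketch, not the answerer's code, and the two-column frame is invented for the example.

```python
import pandas as pd

df = pd.DataFrame({"A": [1], "B": [2.5]})

# dtypes is a Series indexed by column label, so to_dict() gives
# a {column label: dtype} mapping directly.
column_types = df.dtypes.to_dict()

print(column_types)
```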

I use infer_objects()

Docstring: Attempt to infer better dtypes for object columns.

Attempts soft conversion of object-dtyped columns, leaving non-object and unconvertible columns unchanged. The inference rules are the same as during normal Series/DataFrame construction.

df.infer_objects().dtypes
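To illustrate what the docstring describes, here is a small sketch with a frame that is deliberately built with `dtype=object`, so the integer column starts out as object. The frame and the names `before` and `after` are assumptions for this example.

```python
import pandas as pd

# Force every column to object dtype, even the one holding integers.
df = pd.DataFrame({"A": [1, 2, 3], "B": ["x", "y", "z"]}, dtype=object)

before = {col: str(dtype) for col, dtype in df.dtypes.items()}
after = {col: str(dtype) for col, dtype in df.infer_objects().dtypes.items()}

print(before)
print(after)
```

Column A is reported as object before the call and as int64 after it, which matters when you then select columns by dtype.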

Reference URL: https://stackoverflow.com/questions/22470690/get-list-of-pandas-dataframe-columns-based-on-data-type
