python - DataFrames with RangeIndex vs. Int64Index - Why?

2024/10/3 8:16:45

EDIT: I have just found a line in my code that changes my df from a RangeIndex to a numeric Int64Index. How and why does this happen?

Before this line, all my DataFrames have a RangeIndex. After this line, df_new has an Int64Index (a numeric index) instead of a RangeIndex.

    # remove rows with DMT, no luminance data
    df_new = df_new[df_new.Person != 'DMT']

Can anyone explain the following?

Int64Index and RangeIndex

"Warning: Indexing on an integer-based Index with floats has been clarified in 0.18.0, for a summary of the changes, see here. Int64Index is a fundamental basic index in pandas. This is an Immutable array implementing an ordered, sliceable set. Prior to 0.18.0, the Int64Index would provide the default index for all NDFrame objects. RangeIndex is a sub-class of Int64Index added in version 0.18.0, now providing the default index for all NDFrame objects. RangeIndex is an optimized version of Int64Index that can represent a monotonic ordered set. These are analogous to Python range types." [from https://pandas.pydata.org/pandas-docs/stable/advanced.html#int64index-and-rangeindex]

  1. Why does the index type change from RangeIndex to Int64Index?
  2. What are the key or important differences between working with the dataframes with the two different types of indexes? (RangeIndex & Int64Index)

    type(df_val.index)

    pandas.core.indexes.range.RangeIndex

    type(df_new.index)

    pandas.core.indexes.numeric.Int64Index

Answer

As per the pandas documentation:

RangeIndex is a memory-saving special case of Int64Index limited to representing monotonic ranges. Using RangeIndex may in some instances improve computing speed.

Parameters: start : int (default: 0), or other RangeIndex instance. If int and “stop” is not given, interpreted as “stop” instead.

stop : int (default: 0)

step : int (default: 1)

Int64Index is a special case of Index with purely integer labels.

Parameters: data : array-like (1-dimensional)
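To make these parameter lists concrete, here is a minimal sketch (not taken from the documentation) that builds both kinds of index and compares their memory footprint. The size n is only illustrative, and `pd.Index` over an explicit int64 array stands in for Int64Index, because the `Int64Index` class itself was removed in pandas 2.0.

```python
import numpy as np
import pandas as pd

n = 4622  # illustrative size, an assumption for this sketch

# A single int is interpreted as "stop", so these two are equivalent.
range_idx = pd.RangeIndex(n)
same_idx = pd.RangeIndex(start=0, stop=n, step=1)

# An integer index that stores every label explicitly.
explicit_idx = pd.Index(np.arange(n, dtype="int64"))

print(range_idx.equals(same_idx))
print(range_idx.memory_usage(), "bytes for the RangeIndex")
print(explicit_idx.memory_usage(), "bytes for the explicit integer index")
```

The RangeIndex only needs to remember start, stop and step, so its size does not grow with the number of rows, while the explicit index stores one 8-byte integer per label.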

Output of RangeIndex from my own code:

    RangeIndex(start=0, stop=4622, step=1)

My program has 4622 observations.

Output of Int64Index:

    Int64Index([  1,   2,   3,   4,   5,   6,   7,   8,   9,  10,
                ...
                934, 935, 936, 937, 938, 939, 940, 941, 942, 943],
               dtype='int64', name='user_id', length=943)

Number of observations: 943
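As for the first question, here is a minimal sketch with made-up data (the Person values other than 'DMT' and the Luminance column are hypothetical). Boolean filtering keeps the original labels of the surviving rows. Once those labels have gaps, they can no longer be described by start/stop/step, so pandas stores them explicitly. In the pandas version used in the question this shows up as Int64Index; from pandas 2.0 on it is a plain Index with dtype int64.

```python
import pandas as pd

# Hypothetical stand-in for the question's df_new.
df_new = pd.DataFrame({
    "Person": ["A", "B", "DMT", "C", "D"],
    "Luminance": [0.1, 0.2, None, 0.4, 0.5],
})
print(type(df_new.index).__name__)

# Same filter as in the question: row 2 is dropped, labels 0, 1, 3, 4 remain.
df_new = df_new[df_new.Person != "DMT"]
print(list(df_new.index))
print(isinstance(df_new.index, pd.RangeIndex))

# reset_index(drop=True) renumbers the rows and yields a RangeIndex again.
df_reset = df_new.reset_index(drop=True)
print(isinstance(df_reset.index, pd.RangeIndex))
```

For everyday use the two index types behave the same with label-based selection. The practical difference is that after filtering, labels and positions no longer coincide, so `df_new.loc[2]` raises a KeyError here while `df_new.iloc[2]` returns the row labelled 3. If the original labels are not needed, `reset_index(drop=True)` restores a gap-free RangeIndex.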

