Keeping rows with a given value in a column

2024/10/15 1:24:36

In my dataset I have three columns: x, y, and value. It looks like this (already sorted):

x , y ,value
1 , 1 , 12
2 , 2 , 12
4 , 3 , 12
1 , 1 , 11
2 , 2 , 11
4 , 3 , 11
1 , 1 , 33
2 , 2 , 33
4 , 3 , 33

I need to get the rows where the distance between them (in the x and y columns) is <= 1; let's call that my radius. At the same time, I need to group the rows and keep only those where value is equal. I had trouble comparing rows within one dataset because there was only one header, so I created a second dataset with Python commands:

x , y ,value
1 , 1 , 12
2 , 2 , 12
4 , 3 , 12
x , y ,value
1 , 1 , 11
2 , 2 , 11
4 , 3 , 11
x , y ,value
1 , 1 , 33
2 , 2 , 33
4 , 3 , 33

I have tried to use this code:

def dist_value_comp(row):
    # Rows of df whose x and y are both within 1 of this row
    x_dist = abs(df['x'] - row['x']) <= 1
    y_dist = abs(df['y'] - row['y']) <= 1
    xy_dist = x_dist & y_dist
    max_value = df.loc[xy_dist, 'value'].max()
    return row['value'] == max_value

df['keep_row'] = df.apply(dist_value_comp, axis=1)
df.loc[df['keep_row'], ['x', 'y', 'value']]


filtered_df = df[df.apply(lambda line: abs(line['x'] - line['y']) <= 1, axis=1)]
for i in filtered_df.groupby('value'):
    print(i)

Earlier I got errors caused by a malformed data frame. I have fixed that, but I still get no results in the output. Below is how I create my new data frame df from df1; if you have a better idea, please post it. It has one big drawback: it always prints the table. I tested it again, and this function gives me an empty DataFrame.

VALUE1 = df1.VALUE.unique()

def separator():
    lst = []
    for VALUE in VALUE1:
        abc = df1[df1.VALUE == VALUE]
        print(abc)
        lst.append(abc)  # collect each subset so the returned list is not empty
    return lst

ab = separator()
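
One possible way to split the data by value without printing anything is pandas' groupby. The sketch below is only an illustration: it rebuilds the sample table from the question, assumes the column is named 'value' (the snippet above uses VALUE), and the name groups is hypothetical.

```python
import pandas as pd

# Sample data copied from the question; the column name 'value' is an assumption.
df1 = pd.DataFrame({
    "x": [1, 2, 4, 1, 2, 4, 1, 2, 4],
    "y": [1, 2, 3, 1, 2, 3, 1, 2, 3],
    "value": [12, 12, 12, 11, 11, 11, 33, 33, 33],
})

# One sub-frame per distinct value, keyed by that value.
# sort=False keeps the groups in their order of first appearance.
groups = {value: sub for value, sub in df1.groupby("value", sort=False)}

# Each sub-frame can then be inspected or filtered separately,
# for example groups[12] holds the three rows with value 12.
```

Nothing is printed unless you ask for it, and each sub-frame keeps the single shared header, so no second dataset with repeated headers is needed.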

When I try this on the normal dataset df1, the output contains all the data, without taking the radius of 1 into account.

I need an output table like this one:

x , y ,value
1 , 1 , 12
2 , 2 , 12
x , y ,value
1 , 1 , 11
2 , 2 , 11
x , y ,value
1 , 1 , 33
2 , 2 , 33


I am currently working with this code:

filtered_df = df[df.apply(lambda line: abs(line['x'] - line['y']) <= 1, axis=1)]
for i in filtered_df.groupby('value'):
    print(i)

It seems to be OK (I am using df1 as input), but when I look at the output, it does nothing. I think that is because it doesn't know which row the +/-1 radius should be measured from. My dataset has more columns, so let's take into account my 4th and 5th columns, 'D' and 'E': the radius will be measured from the row where columns D and E are both at their minimum.

x , y ,value ,D ,E
1 , 1 , 12 , 1 , 2
2 , 2 , 12 , 2 , 3
4 , 3 , 12 , 3 , 4
1 , 1 , 11 , 2 , 1
2 , 2 , 11 , 3 , 2
4 , 3 , 11 , 5 , 3
1 , 1 , 33 , 1 , 3
2 , 2 , 33 , 2 , 3
4 , 3 , 33 , 3 , 3

So the output should be the same as the one I want above, but now it is known which row the +/-1 radius should start from in this case. Can anyone help me? Sorry for the confusion!


From what I understand, the order in which you perform your operations (filtering the rows with distance <= 1 and grouping them) does not matter.

Here is my take:

# First, select the lines with the right distance
filtered_df = df[df.apply(lambda line: abs(line['x'] - line['y']) <= 1, axis=1)]
# Then group
for i in filtered_df.groupby('value'):
    print(i)  # Or do whatever you want
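
A quick check of what this filter computes may help. It compares x with y inside each row, not one row against another. The sketch below rebuilds the sample table from the question and applies a vectorised form of the same test; the name within_row is hypothetical.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "x": [1, 2, 4, 1, 2, 4, 1, 2, 4],
    "y": [1, 2, 3, 1, 2, 3, 1, 2, 3],
    "value": [12, 12, 12, 11, 11, 11, 33, 33, 33],
})

# Same test as the lambda, vectorised: |x - y| within each single row.
within_row = (df["x"] - df["y"]).abs() <= 1
filtered_df = df[within_row]

# |1-1| = 0, |2-2| = 0 and |4-3| = 1, so every row passes the test.
print(len(filtered_df), "of", len(df), "rows kept")
```

On this sample every row satisfies |x - y| <= 1, which would explain why the output looks unfiltered. Measuring the distance from a chosen reference row requires comparing each row's x and y with that row's x and y instead.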

Let me know if you want an explanation of how any part of the code works.
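
For the version of the question that adds columns D and E, one possible reading is: within each value group, take the row with the smallest D and E as the reference, then keep the rows whose x and y are both within 1 of it. The sketch below follows that reading and is not a definitive implementation. The tie-break (smallest D first, then smallest E), the per-axis distance test, and the names RADIUS and keep_near_reference are all assumptions.

```python
import pandas as pd

# Sample data with the extra D and E columns, copied from the question.
df = pd.DataFrame({
    "x": [1, 2, 4] * 3,
    "y": [1, 2, 3] * 3,
    "value": [12] * 3 + [11] * 3 + [33] * 3,
    "D": [1, 2, 3, 2, 3, 5, 1, 2, 3],
    "E": [2, 3, 4, 1, 2, 3, 3, 3, 3],
})

RADIUS = 1  # the +/-1 window from the question


def keep_near_reference(group, radius=RADIUS):
    """Keep rows whose x and y both lie within `radius` of the reference row."""
    # Assumption: the reference row has the smallest D, ties broken by smallest E.
    ref = group.sort_values(["D", "E"]).iloc[0]
    near_x = (group["x"] - ref["x"]).abs() <= radius
    near_y = (group["y"] - ref["y"]).abs() <= radius
    return group[near_x & near_y]


# Filter each value group separately, then stitch the pieces back together.
parts = [keep_near_reference(g) for _, g in df.groupby("value", sort=False)]
result = pd.concat(parts)
print(result[["x", "y", "value"]])
```

On the sample data the reference row in every group is (1, 1), so the rows (1, 1) and (2, 2) are kept and (4, 3) is dropped, which matches the output table requested in the question.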

