Numpy repeat for 2d array

2024/9/27 19:23:54

Given two arrays, say

arr = array([10, 24, 24, 24,  1, 21,  1, 21,  0,  0], dtype=int32)
rep = array([3, 2, 2, 0, 0, 0, 0, 0, 0, 0], dtype=int32)

np.repeat(arr, rep) returns

array([10, 10, 10, 24, 24, 24, 24], dtype=int32)

Is there any way to replicate this functionality for a set of 2D arrays?

That is, given

arr = array([[10, 24, 24, 24,  1, 21,  1, 21,  0,  0],
             [10, 24, 24,  1, 21,  1, 21, 32,  0,  0]], dtype=int32)
rep = array([[3, 2, 2, 0, 0, 0, 0, 0, 0, 0],
             [2, 2, 2, 0, 0, 0, 0, 0, 0, 0]], dtype=int32)

Is it possible to write a vectorized function that does this?

PS: The total number of repeats in each row need not be the same. I'm padding each result row so that they are all the same size.

import numpy as np

def repeat2d(arr, rep):
    # Find the max length of repetitions in all the rows.
    max_len = rep.sum(axis=-1).max()
    # Create a common array to hold all results. Since each repeated array will have
    # different sizes, some of them are padded with zero.
    ret_val = np.zeros((arr.shape[0], max_len), dtype=arr.dtype)
    for i in range(arr.shape[0]):
        # Repeated array will not have same num of cols as ret_val.
        temp = np.repeat(arr[i], rep[i])
        ret_val[i, :temp.size] = temp
    return ret_val

I do know about np.vectorize and I know that it does not give any performance benefits over the normal version.

Answer

So you have a different repeat array for each row? But the total number of repeats per row is the same?

Just do the repeat on the flattened arrays, and reshape back to the correct number of rows.

In [529]: np.repeat(arr,rep.flat)
Out[529]: array([10, 10, 10, 24, 24, 24, 24, 10, 10, 24, 24, 24, 24,  1])
In [530]: np.repeat(arr,rep.flat).reshape(2,-1)
Out[530]: 
array([[10, 10, 10, 24, 24, 24, 24],
       [10, 10, 24, 24, 24, 24,  1]])
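
A self-contained version of that session, as a sketch. Note the output above has 7 items in each row, so the rep used there cannot be the one from the question (whose rows total 7 and 6). The second row of rep below is an assumption chosen to reproduce the output shown; with unequal row totals the reshape would raise a ValueError instead.

```python
import numpy as np

arr = np.array([[10, 24, 24, 24, 1, 21, 1, 21, 0, 0],
                [10, 24, 24, 1, 21, 1, 21, 32, 0, 0]], dtype=np.int32)
# Assumed rep: second row picked so that both rows total 7 repeats.
rep = np.array([[3, 2, 2, 0, 0, 0, 0, 0, 0, 0],
                [2, 2, 2, 1, 0, 0, 0, 0, 0, 0]], dtype=np.int32)

# The flatten-and-reshape trick only works when every row has the same total.
row_totals = rep.sum(axis=1)
assert (row_totals == row_totals[0]).all()

# With axis=None, np.repeat works on the flattened arr, so rep is flattened too.
out = np.repeat(arr, rep.ravel()).reshape(arr.shape[0], -1)
print(out)
```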

If the total repetitions per row vary, we have the problem of padding variable-length rows. That's come up in other SO questions. I don't recall all the details, but I think the solution is along these lines:

Change rep so the numbers differ:

In [547]: rep
Out[547]: 
array([[3, 2, 2, 0, 0, 0, 0, 0, 0, 0],
       [2, 2, 2, 1, 0, 2, 0, 0, 0, 0]])
In [548]: lens=rep.sum(axis=1)
In [549]: lens
Out[549]: array([7, 9])
In [550]: m=np.max(lens)
In [551]: m
Out[551]: 9

create the target:

In [552]: res = np.zeros((arr.shape[0],m),arr.dtype)

create an indexing array - details need to be worked out:

In [553]: idx=np.r_[0:7,m:m+9]
In [554]: idx
Out[554]: array([ 0,  1,  2,  3,  4,  5,  6,  9, 10, 11, 12, 13, 14, 15, 16, 17])
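
The np.r_ call above hard-codes the row lengths 7 and 9. One possible way to derive the same flat indices from lens, sketched here with a boolean mask (the name mask is introduced only for this illustration), is to mark the first lens[i] columns of row i and take the flat positions of the marked cells:

```python
import numpy as np

lens = np.array([7, 9])   # repeats per row, as in the session above
m = lens.max()            # padded row width

# mask[i, j] is True for the first lens[i] columns of row i.
mask = np.arange(m) < lens[:, None]

# Flat (row-major) positions of the True cells.
idx = np.flatnonzero(mask)
print(idx)
```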

flat indexed assignment:

In [555]: res.flat[idx]=np.repeat(arr,rep.flat)
In [556]: res
Out[556]: 
array([[10, 10, 10, 24, 24, 24, 24,  0,  0],
       [10, 10, 24, 24, 24, 24,  1,  1,  1]])
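
Putting the steps together, a loop-free function might look like the sketch below. The name repeat2d_padded and the fill parameter are made up for illustration, and it assumes arr and rep are 2D arrays of the same shape with at least one row. It assigns through a boolean mask rather than res.flat[idx]; both fill the same cells in row-major order, which matches the order in which np.repeat emits the repeated values.

```python
import numpy as np

def repeat2d_padded(arr, rep, fill=0):
    """Row-wise np.repeat; rows are right-padded with `fill` to equal width."""
    arr = np.asarray(arr)
    rep = np.asarray(rep)
    lens = rep.sum(axis=1)              # output length of each row
    m = int(lens.max())                 # width of the padded result
    res = np.full((arr.shape[0], m), fill, dtype=arr.dtype)
    # True for the first lens[i] columns of row i.
    mask = np.arange(m) < lens[:, None]
    # Repeated values come out row by row, the same order the mask is filled in.
    res[mask] = np.repeat(arr.ravel(), rep.ravel())
    return res

arr = np.array([[10, 24, 24, 24, 1, 21, 1, 21, 0, 0],
                [10, 24, 24, 1, 21, 1, 21, 32, 0, 0]], dtype=np.int32)
rep = np.array([[3, 2, 2, 0, 0, 0, 0, 0, 0, 0],
                [2, 2, 2, 1, 0, 2, 0, 0, 0, 0]], dtype=np.int32)
res = repeat2d_padded(arr, rep)
print(res)
```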