list of lists to list of tuples without loops or list comprehensions [closed]

2024/11/15 15:21:15

I have a list of lists, say [[1,2], [2,3], [1,3]], that I want to convert to a list of tuples, [(1,2), (2,3), (1,3)]. This can be done easily with a list comprehension:

[tuple(l) for l in list]

This, however, will be slow for large lists, so I would like to do the same using pure numpy operations.

Edit 1: I will try to make this clearer.

I have a function, say foo(), which returns a Python list of lists:

def foo(*args):
    # Do something
    return arr

arr has a list-of-lists structure, arr = [[a,b], [c,d],...]. Each inner list (e.g. [a, b]) is 2 elements long, and arr contains a large number of such lists (typically more than 90,000).

However, I require each inner list to be a tuple, for immutability, like

arr = [(a,b), (c, d),...]

This can be done with a list comprehension:

def result(arr):
    return [tuple(l) for l in arr]

However, since the list is large, I would like to avoid this and use pure numpy functions instead (@hpaulj suggested arr.view(); see also his other method using dict() and zip() in his answer below).

I would like to know if this is feasible or not. If feasible, please tell me how.

Answer

Your sample list, and an array made from it:

In [26]: alist = [[1,2], [2,3], [1,3]]
In [27]: arr = np.array(alist)
In [28]: arr
Out[28]: 
array([[1, 2],
       [2, 3],
       [1, 3]])

tolist is a relatively fast way of 'unpacking' an array, but it produces a list of lists - just like we started with:

In [29]: arr.tolist()
Out[29]: [[1, 2], [2, 3], [1, 3]]

So converting that to a list of tuples requires the same list comprehension:

In [30]: [tuple(x) for x in arr.tolist()]
Out[30]: [(1, 2), (2, 3), (1, 3)]
In [31]: [tuple(x) for x in alist]
Out[31]: [(1, 2), (2, 3), (1, 3)]
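
If the goal is only to avoid writing the comprehension, the built-in map gives the same result. This is a minimal sketch, not part of the original session; it still calls tuple() once per inner list, so it should not be expected to change the cost in any fundamental way.

```python
alist = [[1, 2], [2, 3], [1, 3]]

# map applies tuple() to each inner list; list() collects the results
tuples = list(map(tuple, alist))
print(tuples)
```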

Now if the array has a compound (structured) dtype, tolist does produce a list of tuples. Conversely, to create a structured array from a list, we need a list of tuples:

In [33]: arr1 = np.array([tuple(x) for x in alist], dtype='i,i')
In [34]: arr1
Out[34]: array([(1, 2), (2, 3), (1, 3)], dtype=[('f0', '<i4'), ('f1', '<i4')])
In [35]: arr1.tolist()
Out[35]: [(1, 2), (2, 3), (1, 3)]

Constructing a structured array from a 2d array is kind of tricky:

In [37]: arr.view('i,i')
Out[37]: 
array([[(1, 0), (2, 0)],
       [(2, 0), (3, 0)],
       [(1, 0), (3, 0)]], dtype=[('f0', '<i4'), ('f1', '<i4')])

astype isn't much better. In fact, more than once I've recommended going the tolist route:

np.array([tuple(x) for x in arr.tolist()],'i,i')

In[33] is one case where a list of tuples matters. That's because numpy developers have chosen to interpret the tuple as a structured array 'marker'.

I can't think of regular Python cases where a list of tuples is required and a list of lists won't do. Usually the significant difference between tuples and lists is that tuples are immutable. OK, that immutability does matter when constructing dictionary keys (or set elements).

In [42]: dict(zip(alist,['a','b','c']))
....
TypeError: unhashable type: 'list'
In [43]: dict(zip([tuple(x) for x in alist],['a','b','c']))
Out[43]: {(1, 2): 'a', (1, 3): 'c', (2, 3): 'b'}

corrected view conversion to structured array

My earlier attempt at using view was wrong because I used the wrong dtype:

In [45]: arr.dtype
Out[45]: dtype('int64')
In [46]: arr.view('i8,i8')
Out[46]: 
array([[(1, 2)],
       [(2, 3)],
       [(1, 3)]], dtype=[('f0', '<i8'), ('f1', '<i8')])
In [47]: arr.view('i8,i8').tolist()
Out[47]: [[(1, 2)], [(2, 3)], [(1, 3)]]

Better - though now I have tuples within lists.

In [48]: arr.view('i8,i8').reshape(3).tolist()
Out[48]: [(1, 2), (2, 3), (1, 3)]
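
The same idea can be written so that it does not hard-code the row count or the 'i8' item size. The sketch below is one way to do that, under the assumption that the input is a 2-column array of a plain numeric dtype; the function name rows_to_tuples and the field names 'f0' and 'f1' are illustrative choices, not anything numpy requires.

```python
import numpy as np

def rows_to_tuples(arr):
    """Convert an (n, 2) array to a list of n tuples via a structured view."""
    # view() reinterprets the underlying bytes, so make the rows contiguous first
    arr = np.ascontiguousarray(arr)
    # Build the two-field dtype from the array's own dtype so the item sizes match
    pair_dtype = np.dtype([('f0', arr.dtype), ('f1', arr.dtype)])
    # The view has shape (n, 1); reshape(-1) flattens it to (n,) for any n
    return arr.view(pair_dtype).reshape(-1).tolist()

alist = [[1, 2], [2, 3], [1, 3]]
result = rows_to_tuples(np.array(alist))
print(result)
```

Deriving the field dtype from arr.dtype avoids the mistake made in In [37], where a 4-byte field type was applied to 8-byte integers.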

This avoids the list comprehension, but it isn't faster:

In [49]: timeit arr.view('i8,i8').reshape(3).tolist()
21.4 µs ± 51.1 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [50]: timeit [tuple(x) for x in arr]
6.26 µs ± 5.51 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Time tests for creating dictionary from list of lists vs. list of tuples:

In [51]: timeit dict(zip([tuple(x) for x in alist],['a','b','c']))
2.67 µs ± 21.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [52]: timeit dict(zip(Out[48],['a','b','c']))
1.31 µs ± 5.96 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Obviously you need to do time tests on realistic problems, but this small example suggests the way that those will go. Despite all the talk about numpy operations being fast(er), list comprehensions aren't that bad, especially if the result is going to be a list of Python objects anyways.
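
To repeat the comparison at the size mentioned in the question, something like the following could be used. It is only a sketch: the 90,000-row random integer array stands in for the real data, the helper names are made up for this example, and no timings are shown because they depend on the machine and the numpy version.

```python
import timeit
import numpy as np

n = 90_000  # row count taken from the question; the values are random stand-ins
rng = np.random.default_rng(0)
arr = rng.integers(0, 1000, size=(n, 2))
alist = arr.tolist()

def with_comprehension(lst):
    # Plain Python: one tuple() call per inner list
    return [tuple(x) for x in lst]

def with_view(a):
    # Structured-array view followed by tolist(), as in In [48]
    pair_dtype = np.dtype([('f0', a.dtype), ('f1', a.dtype)])
    return a.view(pair_dtype).reshape(-1).tolist()

for name, fn, arg in [('comprehension on list', with_comprehension, alist),
                      ('view + tolist on array', with_view, arr)]:
    seconds = timeit.timeit(lambda: fn(arg), number=5) / 5
    print(f'{name}: {seconds * 1e3:.2f} ms per call')
```

Note that the view approach starts from an array; if the data arrives as a list of lists, as foo() returns it, the cost of np.array(alist) has to be counted as well.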
