list of lists to list of tuples without loops or list comprehensions [closed]
2024/11/15 15:21:15
I have a list of lists, say [[1,2], [2,3], [1,3]], that I want to convert to a list of tuples, [(1,2), (2,3), (1,3)]. This can be accomplished easily with a list comprehension:
[tuple(l) for l in list]
This, however, will be slow for large lists, so I would like to do the same using pure numpy operations.
Edit 1: I will try to make this clearer.
I have a function say foo() which will return a python list of lists
def foo(*args):
    # Do something
    return arr
arr will have a list of lists structure arr = [[a,b], [c,d],...].
Each inner list (e.g. [a, b]) will be 2 elements long, and arr will contain a large number of such lists (typically more than 90,000).
However, I require each inner list to be a tuple, for immutability, like
arr = [(a,b), (c, d),...]
This can be performed using list comprehensions as
def result(arr):
    return [tuple(l) for l in arr]
However, since the list is large, I would like to avoid this and use pure numpy functions instead (@hpaulj suggested arr.view(); see also his other method using dict() and zip() in his answer below).
I would like to know if this is feasible or not. If feasible, please tell me how.
Answer
Your sample list, and an array made from it:
In [26]: alist = [[1,2], [2,3], [1,3]]
In [27]: arr = np.array(alist)
In [28]: arr
Out[28]:
array([[1, 2],
       [2, 3],
       [1, 3]])
tolist is a relatively fast way of 'unpacking' an array, but it produces a list of lists - just like we started with:
In [29]: arr.tolist()
Out[29]: [[1, 2], [2, 3], [1, 3]]
So converting that to a list of tuples requires the same list comprehension:
In [30]: [tuple(x) for x in arr.tolist()]
Out[30]: [(1, 2), (2, 3), (1, 3)]
In [31]: [tuple(x) for x in alist]
Out[31]: [(1, 2), (2, 3), (1, 3)]
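If the aim is only to avoid writing the comprehension explicitly, the built-in map does the same job. This is a minimal sketch, not a speed-up: tuple is still called once per inner list, so the overall cost should be expected to stay in the same range.

```python
alist = [[1, 2], [2, 3], [1, 3]]

# map applies tuple to each inner list; list() materialises the result
tuples = list(map(tuple, alist))
print(tuples)  # [(1, 2), (2, 3), (1, 3)]
```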
Now if the array has a compound dtype, tolist does produce a list of tuples. Conversely, to create a structured array from a list, we need a list of tuples:
In [33]: arr1 = np.array([tuple(x) for x in alist], dtype='i,i')
In [34]: arr1
Out[34]: array([(1, 2), (2, 3), (1, 3)], dtype=[('f0', '<i4'), ('f1', '<i4')])
In [35]: arr1.tolist()
Out[35]: [(1, 2), (2, 3), (1, 3)]
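As a script rather than an interactive session, the same round trip might look like the sketch below. It assumes the default field names (f0, f1) that numpy assigns for the 'i,i' dtype string.

```python
import numpy as np

alist = [[1, 2], [2, 3], [1, 3]]

# A structured array is built from a list of tuples, one tuple per record
arr1 = np.array([tuple(x) for x in alist], dtype='i,i')

# Fields get default names when the dtype is given as a comma string
print(arr1.dtype.names)      # ('f0', 'f1')
print(arr1['f0'].tolist())   # [1, 2, 1]

# tolist on a structured array yields one tuple per record
records = arr1.tolist()
print(records)               # [(1, 2), (2, 3), (1, 3)]
```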
Constructing a structured array from a 2d array is kind of tricky:
astype isn't much better. In fact, more than once I've recommended going the tolist route:
np.array([tuple(x) for x in arr.tolist()],'i,i')
In[33] is one case where a list of tuples matters. That's because numpy developers have chosen to interpret the tuple as a structured array 'marker'.
I can't think of regular Python cases where a list of tuples is required and a list of lists won't do. Usually the significant difference between tuples and lists is that tuples are immutable. OK, that immutability does matter when constructing dictionary keys (or set elements).
In [42]: dict(zip(alist,['a','b','c']))
....
TypeError: unhashable type: 'list'
In [43]: dict(zip([tuple(x) for x in alist],['a','b','c']))
Out[43]: {(1, 2): 'a', (1, 3): 'c', (2, 3): 'b'}
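The set case behaves the same way as the dictionary case. A small sketch, using a made-up list with one repeated pair, of removing duplicates once the inner lists are tuples:

```python
pairs = [[1, 2], [2, 3], [1, 2]]  # illustrative data with a duplicate

# Lists are unhashable, so they cannot be set elements
try:
    set(pairs)
    lists_hashable = True
except TypeError:
    lists_hashable = False

# Tuples are hashable, so the duplicate collapses
unique_pairs = set(map(tuple, pairs))
print(lists_hashable)        # False
print(sorted(unique_pairs))  # [(1, 2), (2, 3)]
```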
corrected view conversion to structured array
My earlier attempt at using view was wrong because I used the wrong dtype:
In [45]: arr.dtype
Out[45]: dtype('int64')
In [46]: arr.view('i8,i8')
Out[46]:
array([[(1, 2)],
       [(2, 3)],
       [(1, 3)]], dtype=[('f0', '<i8'), ('f1', '<i8')])
In [47]: arr.view('i8,i8').tolist()
Out[47]: [[(1, 2)], [(2, 3)], [(1, 3)]]
Better - though now I have tuples within lists.
In [48]: arr.view('i8,i8').reshape(3).tolist()
Out[48]: [(1, 2), (2, 3), (1, 3)]
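The view trick hard-codes both the dtype ('i8') and the length (3). A possible generalisation is sketched below; rows_to_tuples is a hypothetical helper name, and it assumes a 2d array with at least one row whose columns all share one dtype.

```python
import numpy as np

def rows_to_tuples(arr):
    """Hypothetical helper: turn an (n, k) array into a list of n k-tuples."""
    arr = np.ascontiguousarray(arr)  # view needs a contiguous last axis
    k = arr.shape[1]
    # One field per column, each using the array's own dtype
    struct_dtype = np.dtype([(f'f{i}', arr.dtype) for i in range(k)])
    # The view collapses each row into one record, giving shape (n, 1);
    # reshape(-1) flattens that to (n,) before unpacking
    return arr.view(struct_dtype).reshape(-1).tolist()

arr = np.array([[1, 2], [2, 3], [1, 3]])
print(rows_to_tuples(arr))  # [(1, 2), (2, 3), (1, 3)]
```

Taking the field dtype from arr.dtype avoids the mismatch described above, where the view dtype did not match the array's actual integer size.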
This avoids the list comprehension, but it isn't faster:
In [49]: timeit arr.view('i8,i8').reshape(3).tolist()
21.4 µs ± 51.1 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [50]: timeit [tuple(x) for x in arr]
6.26 µs ± 5.51 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Time tests for creating dictionary from list of lists vs. list of tuples:
In [51]: timeit dict(zip([tuple(x) for x in alist],['a','b','c']))
2.67 µs ± 21.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [52]: timeit dict(zip(Out[48],['a','b','c']))
1.31 µs ± 5.96 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
Obviously you need to do time tests on realistic problems, but this small example suggests the way that those will go. Despite all the talk about numpy operations being fast(er), list comprehensions aren't that bad, especially if the result is going to be a list of Python objects anyways.
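To run that kind of test at the size mentioned in the question, something like the following sketch could be used. The random data, the function names and the repeat count are illustrative assumptions; no timings are quoted here because they depend on the machine and numpy version.

```python
import timeit
import numpy as np

n = 90_000  # size mentioned in the question
rng = np.random.default_rng(0)
arr = rng.integers(0, 1000, size=(n, 2), dtype=np.int64)  # illustrative data
alist = arr.tolist()

def with_comprehension():
    return [tuple(x) for x in alist]

def with_view():
    # 'i8,i8' matches the int64 dtype requested above
    return arr.view('i8,i8').reshape(-1).tolist()

# Both routes should produce the same list of tuples
assert with_comprehension() == with_view()

for fn in (with_comprehension, with_view):
    seconds = timeit.timeit(fn, number=20) / 20
    print(f'{fn.__name__}: {seconds * 1e3:.2f} ms per call')
```

Note that with_view starts from an existing array. If the data arrives as a Python list, as in the question, the cost of np.array(alist) would have to be added to that side of the comparison.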