Best practice for using common subexpression elimination with lambdify in SymPy

2024/10/12 2:21:39

I'm currently attempting to use SymPy to generate and numerically evaluate a function and its gradient. For simplicity, I'll use the following function as an example (keeping in mind that the real function is much lengthier):

import sympy as sp
def g(x):return sp.cos(x) + sp.cos(x)**2 + sp.cos(x)**3

It's easy enough to numerically evaluate this function and its derivative:

import numpy as np
g_expr = sp.lambdify(x,g(x),modules='numpy')
dg_expr = sp.lambdify(x,sp.diff(g(x)),modules='numpy')print g_expr(np.linspace(0,1,50))
print dg_expr(np.linspace(0,1,50))

For my real function, however, lambdify is slow, both in terms of generating the numerical function and in its evaluation. As many elements in my function are similar, I'd like to use common subexpression elimination (cse) within lambdify to speed this process up. I know that SymPy has a built-in function to perform cse,

>>> print sp.cse(g(x))
([(x0, cos(x))], [x0**3 + x0**2 + x0])

but don't know what syntax to use in order to utilize this result in my lambdify function (where I'd still like to use x as my input argument):

>>> g_expr_fast = sp.lambdify(x,sp.cse(g(x)),modules='numpy')
>>> print g_expr_fast(np.linspace(0,1,50))
Traceback (most recent call last):File "test3.py", line 34, in <module>print g_expr1(nx1)File "<string>", line 1, in <lambda>
NameError: global name 'x0' is not defined

Any help on how to use cse in lambdify would be appreciated. Or, if there are better ways to speed up my gradient calculations, I'd appreciate hearing those as well.

In case it's relevant, I'm using Python 2.7.3 and SymPy 0.7.6.

Answer

The speed of the calculations can be increased:

  • A bit (36x improvement in my example) by extracting the common factors and not repeating calculations.
  • A bigger speedup (up to 100x-1000x) can be obtained with for example by creating an extension module with numba.

Foreword

I assume that this is "calculate function in sympy once and use it many times in different project later" type of case. Therefore, there is some manual copy-pasting and creating of files included. However it is possible to automate the creation of the new files with the functions, and also the compilation step, but I leave that out from this now.

I had a similar problem and I did some benchmarking on different methods. The function I've used is quite long (len(str(expr)) = 45857) and cse(expr) decomposes it into 72 subexpressions. It is too long to copy-paste here but here are the steps to make up to 100x-1000x speed increase to functions created with sympy.

Benchmarks

A) Evaluating single float
Time to evaluate the function with one float value for each parameter. Using timeit myfunc(*params).

  • Baseline: lambdify with modules="numpy": 277µs
  • (1) copy-paste of str(expr) to function definition: 275µs (no difference)
  • (2) Copy-paste of expression after cse: 8.2 µs (33x improvement)
  • (3) Copy-paste of expression after cse(optimizations="basic"): 7.6µs (36x improvement)
  • (4) Use numba to compile the code to func_numba_f(): 0.25µs (1090x improvement)
  • (5) Use the sympy autowrap: 0.47 µs (589x improvement)

B) Evaluating np.array of 1000 floats

  • (1) copy-paste of str(expr) to function definition: 15100 µs | 15.1µs per value
  • (2) Copy-paste of expression after cse: 493 µs | 0.49µs per value (31x improvement)
  • (3) Copy-paste of expression after cse(optimizations="basic"): 413µs | 0.41µs per value (37x improvement)
  • (4) Use numba to compile the code to func_numba_arr(): 114µs | 0.11µs per value (132x improvement)
  • (5) sympy autowrap with np.vectorize: 480µs | 0.48µs per value (31x improvement)

(1) Copy-paste of str(expr)

  • Just copy paste the expression string into new function, and return the value.
  • Save the function in another file.

(2) Copy-paste of expression after cse

  • Idea: make the code much shorter by identifying common parts.
  • First, copy-paste the common parts:
repl, redu = cse(K)
for variable, expr in repl:print(f"{variable} = {expr}")
  • Then, copy-paste the return value: print(redu[0]).
  • Create another file, and paste into function definition

(3) Copy-paste of expression after cse(optimizations="basic")

  • Same as (2) but with optimizations="basic"
  • This creates slightly shorted code than (2)

(4) Use numba to compile the code

  • Use the numba.pycc.CC to compile the code. So, create a function with copy-pasting as in (3)
  • Then, create src_mymodule.py with code
from numba.pycc import CCcc = CC("my_numba_module")@cc.export("func_numba_f", "f8(f8, f8, f8, f8, f8)")
@cc.export("func_numba_arr", "f8[:](f8[:],f8[:],f8[:],f8[:],f8[:])")
def myfunc(x1, x2, x3, x4, x5):# your function definition herereturn valueif __name__ == "__main__":cc.compile()
  • In function func_numba_f() there are five float valued input variables and one float valued output variable. The f8 means float.
  • The func_numba_arr() is the version that handle np.arrays with dtype="float64" or dtype="float32", depending what you use to compile it.
  • You then compile the code once by running python src_mymodule.py. This will create my_numba_module.cp38-win_amd64.pyd or similar. It can be only used with the same python version and bitness as in the filename.
  • Then, in another python file, you would import the functions and use them, like:
from my_numba_module import func_numba_f, func_numba_arrout = func_numba_f(4,3,2,1,100)# or:
args = [np.array([x]*N, dtype='float64') for x in (4,3,2,1,100)]
out_arr = func_numba_arr(*args)

(5) Use the sympy autowrap

  • This was easy. After installing and configuring Cython, I needed two lines of code to create a function with autowrap.
from sympy.utilities.autowrap import autowrap
func = autowrap(expr, backend='cython')
  • By specifying the temp_dir argument, it saves all the source files (.c, .h, .pyx) and a .pyd (win) / .so (unix) file that can be used to import the function later on with (assuming the temp_dir is in sys.path):
from wrapper_module_1 import autofunc_c
  • This was by far the easiest method although the produced C-code is not highly optimized. The output of autowrap could be renamed in additional steps, if needed.
  • The function accepts scalars only but can be vectorized with np.vectorize
func = np.vectorize(func)
https://en.xdnf.cn/q/69699.html

Related Q&A

Determine if a text extract from spacy is a complete sentence

We are working on sentences extracted from a PDF. The problem is that it includes the title, footers, table of contents, etc. Is there a way to determine if the sentence we get when pass the document t…

Drawing labels that follow their edges in a Networkx graph

Working with Networkx, I have several edges that need to be displayed in different ways. For that I use the connectionstyle, some edges are straight lines, some others are Arc3. The problem is that eve…

randomly choose 100 documents under a directory

There are about 2000 documents under the directory. I want to randomly select some documents and copy them to a new directory automatically.Some relevant information about generating one document name …

Oauth client initialization in python for tumblr API using Python-oauth2

Im new to Oauth. In the past for twitter applications written in Python i used python-oauth2 library to initialize client like this:consumer = oauth.Consumer(key = CONSUMER_KEY, secret = CONSUMER_SECRE…

Model description in django-admin

Is it possible to put a model description or description on the list display page of a certain model in django-admin?Im talking about something like when you click a model name link on the homepage of…

Print underscore separated integer

Since python3.6, you can use underscore to separate digits of an integer. For examplex = 1_000_000 print(x) #1000000This feature was added to easily read numbers with many digits and I found it very u…

What does (numpy) __array_wrap__ do?

I am diving into the SciPy LinAlg module for the first time, and I saw this function:def _makearray(a):new = asarray(a)wrap = getattr(a, "__array_prepare__", new.__array_wrap__)return new, wr…

SqlAlchemy TIMESTAMP on update extra

I am using SqlAlchemy on python3.4.3 to manage a MySQL database. I was creating a table with:from datetime import datetimefrom sqlalchemy import Column, text, create_engine from sqlalchemy.types import…

Is it possible to pass a dictionary with extraneous elements to a Django object.create method?

I am aware that when using MyModel.objects.create in Django, it is possible to pass in a dictionary with keys which correspond to the model fields in MyModel. This is explained in another question here…

When should I use varargs in designing a Python API?

Is there a good rule of thumb as to when you should prefer varargs function signatures in your API over passing an iterable to a function? ("varargs" being short for "variadic" or …