How to Save io.BytesIO pdfrw PDF into Django FileField

2024/9/23 12:18:07

What I am trying to do is basically:

  1. Get PDF from URL
  2. Modify it via pdfrw
  3. Store it in memory as a BytesIO obj
  4. Upload it into a Django FileField via Model.objects.create(form=pdf_file, name="Some name")

My issue is that when the create() method runs, it saves all of the fields except for the form.

helpers.py

import io
import tempfile
from contextlib import contextmanager

import requests
import pdfrw


@contextmanager
def as_file(url):
    with tempfile.NamedTemporaryFile(suffix='.pdf') as tfile:
        tfile.write(requests.get(url).content)
        tfile.flush()
        yield tfile.name


def write_fillable_pdf(input_pdf_path, output_pdf_path, data_dict):
    template_pdf = pdfrw.PdfReader(input_pdf_path)

    ## PDF is modified here

    buf = io.BytesIO()
    print(buf.getbuffer().nbytes)  # Prints "0"!
    pdfrw.PdfWriter().write(buf, template_pdf)
    buf.seek(0)
    return buf

views.py

from django.core.files import File


class FormView(View):
    def get(self, request, *args, **kwargs):
        form_url = 'http://some-pdf-url.com'
        with as_file(form_url) as temp_form_path:
            submitted_form = write_fillable_pdf(temp_form_path, temp_form_path, {"name": "John Doe"})
            print(submitted_form.getbuffer().nbytes)  # Prints "994782"!
            FilledPDF.objects.create(form=File(submitted_form), name="Test PDF")

        return render(request, 'index.html', {})

As you can see, print() outputs two different values as the BytesIO is populated, which leads me to believe that the increase in size means there is actually data in it. Is there a reason it is not being saved properly into my Django model instance? Also, if anyone knows a more efficient way to do this, please let me know!

Answer

You can use the ContentFile class instead. I modified your view accordingly so that the file is saved into the FileField.

from django.core.files.base import ContentFile


class FormView(View):
    def get(self, request, *args, **kwargs):
        form_url = 'http://some-pdf-url.com'
        with as_file(form_url) as temp_form_path:
            submitted_form = write_fillable_pdf(temp_form_path, temp_form_path, {"name": "John Doe"})
            pdf_content = ContentFile(submitted_form.getvalue(), 'sample.pdf')
            FilledPDF.objects.create(form=pdf_content, name="Test PDF")

        return render(request, 'index.html', {})
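
A likely reason the original File(submitted_form) call stored nothing: Django's File takes its name from the wrapped object's name attribute when none is passed explicitly, and a File without a name evaluates as empty, so the FileField skips it. The sketch below uses only the standard library to show the relevant BytesIO behaviour. Here fake_pdf_bytes is a made-up stand-in for the pdfrw output, not a real PDF.

```python
import io

# Hypothetical stand-in for the bytes pdfrw would write; not a real PDF.
fake_pdf_bytes = b"%PDF-1.4\n" + b"x" * 100

buf = io.BytesIO()
size_before = buf.getbuffer().nbytes  # nothing written yet
buf.write(fake_pdf_bytes)
size_after = buf.getbuffer().nbytes   # buffer now holds the written bytes

# After a write the position sits at the end, so read() returns nothing
# until the buffer is rewound. getvalue() ignores the position entirely.
tail = buf.read()
whole = buf.getvalue()
buf.seek(0)
rewound = buf.read()

# Unlike a file opened from disk, a BytesIO object has no name attribute,
# so a wrapper that looks for one finds nothing.
has_name = hasattr(buf, "name")
```

So the data is in the buffer, as the growing size suggests, but there is no name for Django to save it under. Under that reading, File(submitted_form, name='sample.pdf') should also work, since what matters is that the wrapper carries a name.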

You can also use the field's save() method to store the file, again using the ContentFile class.

from django.core.files.base import ContentFile


class FormView(View):
    def get(self, request, *args, **kwargs):
        form_url = 'http://some-pdf-url.com'
        with as_file(form_url) as temp_form_path:
            submitted_form = write_fillable_pdf(temp_form_path, temp_form_path, {"name": "John Doe"})
            pdf_content = ContentFile(submitted_form.getvalue())
            filled_pdf = FilledPDF()
            filled_pdf.name = "Test PDF"
            filled_pdf.form.save("sample.pdf", pdf_content, save=False)
            filled_pdf.save()

        return render(request, 'index.html', {})
