Loading JSON file in BigQuery using Google BigQuery Client API

2024/9/24 19:22:28

Is there a way to load a JSON file from the local file system into BigQuery using the Google BigQuery Client API?

The only options I have found are:

1- Streaming the records one by one.

2- Loading JSON data from GCS.

3- Using raw POST requests to load the JSON (i.e. not through Google Client API).

Answer

I'm assuming from the python tag that you want to do this from Python. There is a load example here that loads data from a local file. It uses CSV, but it is easy to adapt to JSON, and there is another JSON example in the same directory.
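One detail worth checking before adapting the CSV example: the load configuration in this answer sets sourceFormat to NEWLINE_DELIMITED_JSON, which expects one JSON object per line rather than a single top-level JSON array. If the local file is an array of records, it can be converted first. The following is a minimal sketch using only the standard library; the function names to_newline_delimited_json and convert_file are hypothetical helpers, not part of any Google API.

```python
import json


def to_newline_delimited_json(records):
    """Serialize an iterable of dicts as newline-delimited JSON text."""
    # json.dumps without an indent emits a single line per record, and any
    # newline inside a string value is escaped, so each record stays on one line.
    return ''.join(json.dumps(record) + '\n' for record in records)


def convert_file(src_path, dst_path):
    """Rewrite a file holding a JSON array as a newline-delimited JSON file."""
    with open(src_path) as src:
        records = json.load(src)
    with open(dst_path, 'w') as dst:
        dst.write(to_newline_delimited_json(records))
```

For example, convert_file('foo_array.json', 'foo.json') would produce a file suitable for the upload step, assuming 'foo_array.json' holds a list of objects whose keys match the schema field names.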

The basic flow is:

import time

from googleapiclient.http import MediaFileUpload

# PROJECT_ID, DATASET_ID and TABLE_ID are placeholders for your own values.
# `jobs` is assumed to be the jobs() collection of an authorized BigQuery
# service object.

# Load configuration with the destination specified.
load_config = {
    'destinationTable': {
        'projectId': PROJECT_ID,
        'datasetId': DATASET_ID,
        'tableId': TABLE_ID,
    }
}
load_config['schema'] = {
    'fields': [
        {'name': 'string_f', 'type': 'STRING'},
        {'name': 'boolean_f', 'type': 'BOOLEAN'},
        {'name': 'integer_f', 'type': 'INTEGER'},
        {'name': 'float_f', 'type': 'FLOAT'},
        {'name': 'timestamp_f', 'type': 'TIMESTAMP'},
    ]
}
load_config['sourceFormat'] = 'NEWLINE_DELIMITED_JSON'

# This tells it to perform a resumable upload of a local file
# called 'foo.json'
upload = MediaFileUpload('foo.json',
                         mimetype='application/octet-stream',
                         # This enables resumable uploads.
                         resumable=True)
start = time.time()
job_id = 'job_%d' % start
# Create the job.
result = jobs.insert(
    projectId=PROJECT_ID,
    body={
        'jobReference': {'jobId': job_id},
        'configuration': {'load': load_config},
    },
    media_body=upload).execute()
# Then you'd also want to wait for the result and check the status (check out
# the example at the link for more info).
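The last step above, waiting for the result and checking the status, could look roughly like the sketch below. It assumes `jobs` is the same jobs() collection used for jobs.insert, and that the job resource returned by jobs.get carries a status object with a state field and, on failure, an errorResult field. The function name wait_for_job and its polling parameters are hypothetical.

```python
import time


def wait_for_job(jobs, project_id, job_id, poll_seconds=5.0, timeout_seconds=600.0):
    """Poll a load job until its state is DONE, then return the job resource.

    Raises RuntimeError if the finished job reports an error and
    TimeoutError if it does not finish within timeout_seconds.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        job = jobs.get(projectId=project_id, jobId=job_id).execute()
        status = job.get('status', {})
        if status.get('state') == 'DONE':
            # A finished job may still have failed; errorResult is assumed
            # to be present only in that case.
            if 'errorResult' in status:
                raise RuntimeError(
                    'Load job %s failed: %s' % (job_id, status['errorResult']))
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError('Load job %s did not finish in time' % job_id)
        time.sleep(poll_seconds)
```

It would be called right after the insert, for example wait_for_job(jobs, PROJECT_ID, job_id). Taking the jobs collection as a parameter keeps the helper easy to exercise with a stand-in object instead of a live service.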
