I have a Dataflow job that writes to BigQuery. It works for a non-nested schema but fails for a nested schema.
Here is my Dataflow pipeline:
    pipeline_options = PipelineOptions()
    p = beam.Pipeline(options=pipeline_options)
    wordcount_options = pipeline_options.view_as(WordcountTemplatedOptions)

    schema = 'url: STRING,' \
             'ua: STRING,' \
             'method: STRING,' \
             'man: RECORD,' \
             'man.ip: RECORD,' \
             'man.ip.cc: STRING,' \
             'man.ip.city: STRING,' \
             'man.ip.as: INTEGER,' \
             'man.ip.country: STRING,' \
             'man.res: RECORD,' \
             'man.res.ip_dom: STRING'

    first = p | 'read' >> ReadFromText(wordcount_options.input)
    second = (first
              | 'process' >> (beam.ParDo(processFunction()))
              | 'write' >> beam.io.WriteToBigQuery(
                  'myBucket:tableFolder.test_table',
                  schema=schema))
I created the BigQuery table using the following schema:
    [
      {"mode": "NULLABLE", "name": "url", "type": "STRING"},
      {"mode": "NULLABLE", "name": "ua", "type": "STRING"},
      {"mode": "NULLABLE", "name": "method", "type": "STRING"},
      {
        "mode": "REPEATED", "name": "man", "type": "RECORD",
        "fields": [
          {
            "mode": "REPEATED", "name": "ip", "type": "RECORD",
            "fields": [
              {"mode": "NULLABLE", "name": "cc", "type": "STRING"},
              {"mode": "NULLABLE", "name": "city", "type": "STRING"},
              {"mode": "NULLABLE", "name": "as", "type": "INTEGER"},
              {"mode": "NULLABLE", "name": "country", "type": "STRING"}
            ]
          },
          {
            "mode": "REPEATED", "name": "res", "type": "RECORD",
            "fields": [
              {"mode": "NULLABLE", "name": "ip_dom", "type": "STRING"}
            ]
          }
        ]
      }
    ]
I am getting the following error:
    BigQuery creation of import job for table "test_table" in dataset "tableFolder" in project "myBucket" failed., BigQuery execution failed., HTTP transport error:
    Message: Invalid value for: url is not a valid value
    HTTP Code: 400
Question: Can someone please guide me? What am I doing wrong? Also, if there is a better way to iterate through the nested schema and write to BigQuery, please suggest it.
Additional info: my data file is:
{"url":"xyz.com","ua":"Mozilla/5.0 Chrome/63","method":"PUT","man":{"ip":{"cc":"IN","city":"delhi","as":274,"country":"States"},"res":{"ip_dom":"v1"}}}
{"url":"xyz.com","ua":"Mozilla/5.0 Chrome/63","method":"PUT","man":{"ip":{"cc":"DK","city":"munlan","as":4865,"country":"United"},"res":{"ip_dom":"v1"}}}
{"url":"xyz.com","ua":"Mozilla/5.0 Chrome/63","method":"GET","man":{"ip":{"cc":"BS","city":"sind","as":7655,"country":"India"},"res":{"ip_dom":"v1"}}}
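
One thing that can be checked locally, without running the pipeline, is whether the records above have the shape the table schema declares. The table schema marks man, man.ip and man.res as REPEATED, and BigQuery expects a REPEATED field to hold an array, while the data file has plain objects in those positions. The sketch below walks the nested schema and a parsed record together and lists every such field. The helper name find_mode_mismatches is hypothetical, and whether this mismatch is what causes the 400 error is an assumption, not something the error message confirms.

```python
import json

# Table schema copied from the post (modes included).
TABLE_SCHEMA = [
    {"mode": "NULLABLE", "name": "url", "type": "STRING"},
    {"mode": "NULLABLE", "name": "ua", "type": "STRING"},
    {"mode": "NULLABLE", "name": "method", "type": "STRING"},
    {"mode": "REPEATED", "name": "man", "type": "RECORD", "fields": [
        {"mode": "REPEATED", "name": "ip", "type": "RECORD", "fields": [
            {"mode": "NULLABLE", "name": "cc", "type": "STRING"},
            {"mode": "NULLABLE", "name": "city", "type": "STRING"},
            {"mode": "NULLABLE", "name": "as", "type": "INTEGER"},
            {"mode": "NULLABLE", "name": "country", "type": "STRING"},
        ]},
        {"mode": "REPEATED", "name": "res", "type": "RECORD", "fields": [
            {"mode": "NULLABLE", "name": "ip_dom", "type": "STRING"},
        ]},
    ]},
]

# First record from the data file in the post.
SAMPLE_LINE = (
    '{"url":"xyz.com","ua":"Mozilla/5.0 Chrome/63","method":"PUT",'
    '"man":{"ip":{"cc":"IN","city":"delhi","as":274,"country":"States"},'
    '"res":{"ip_dom":"v1"}}}'
)


def find_mode_mismatches(row, fields, prefix=""):
    """Return dotted paths where a REPEATED field holds a non-list value.

    Hypothetical helper for illustration; it is not part of Beam or BigQuery.
    """
    mismatches = []
    for field in fields:
        name = field["name"]
        if name not in row:
            continue
        path = prefix + name
        value = row[name]
        if field.get("mode") == "REPEATED" and not isinstance(value, list):
            mismatches.append(path)
        if field["type"] == "RECORD":
            # Treat a single object like a one-element list so the walk continues.
            items = value if isinstance(value, list) else [value]
            for item in items:
                if isinstance(item, dict):
                    mismatches.extend(
                        find_mode_mismatches(item, field.get("fields", []), path + ".")
                    )
    return mismatches


row = json.loads(SAMPLE_LINE)
print(find_mode_mismatches(row, TABLE_SCHEMA))
```

If the check reports fields, either the table modes or the record shape would need to change so that the two agree. Separately, the Beam Python documentation for WriteToBigQuery notes that the single-string schema form does not support nested or repeated fields, so a structured schema (a TableSchema object, or a dictionary in SDK versions that accept one) may be needed for the man record.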