I would like to try and load a csv data in python and stream each row spark via SPark Streaming.
Im pretty new to network stuff. Im not exactly if Im supposed to create a server python script that once it establishes a connection(with spark streaming) it will start sending each row. In the Spark Streaming Documentation they do a nc -l 9999 which is a netcat server listening on port 9999 if im correct. So I tried creating a python script similar that parses a csv and sends on port 60000
import socket # Import socket module
import csvport = 60000 # Reserve a port for your service.s = socket.socket() # Create a socket objecthost = socket.gethostname() # Get local machine names.bind((host, port)) # Bind to the ports.listen(5) # Now wait for client connection.print('Server listening....')while True:conn, addr = s.accept() # Establish connection with client.print('Got connection from', addr)csvfile = open('Titantic.csv', 'rb')reader = csv.reader(csvfile, delimiter = ',')for row in reader:line = ','.join(row)conn.send(line)print(line)csvfile.close()print('Done sending')conn.send('Thank you for connecting')conn.close()
SPark Streaming Script -
from pyspark import SparkContext
from pyspark.streaming import StreamingContext
ssc = StreamingContext(sc, 1)# Create a DStream that will connect to hostname:port, like localhost:9999
lines_RDD = ssc.socketTextStream("localhost", 60000)# Split each line into words
data_RDD = lines_RDD.flatMap(lambda line: line.split(","))data_RDD.pprint()ssc.start() # Start the computation
ssc.awaitTermination() # Wait for the computation to terminate
When run the spark script(This is in Jupyter Notebooks btw) I get this error - IllegalArgumentException: 'requirement failed: No output operations registered, so nothing to execute'
I' dont think I am doing my socket script properly but im not really sure what to do Im basically trying to replicate what nc -lk 9999 does so I can send text data over the port and then spark streaming is listening to it and receives the data and processes it.
Any help would be greatly appreciated