How to read data from a URL in Python using pandas?

2024/10/9 8:37:34

I am trying to read text data from the URL shown in the code below, but it throws an error:

ParserError: Error tokenizing data. C error: Expected 1 fields in line 4, saw 2

import pandas as pd

url = "https://cdn.upgrad.com/UpGrad/temp/d934844e-5182-4b58-b896-4ba2a499aa57/companies.txt"
c = pd.read_csv(url, encoding='utf-8')
Answer

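It seems pd.read_csv() had trouble parsing this file (possibly an encoding issue): it never split the lines into columns. Downloading the text with requests and splitting it on tabs by hand works: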

#!/usr/bin/env python3
import sys

import requests
import pandas as pd

url = "https://cdn.upgrad.com/UpGrad/temp/d934844e-5182-4b58-b896-4ba2a499aa57/companies.txt"
r = requests.get(url)
df = None
if r.status_code == 200:
    # Split the response body into lines, then each line into tab-separated fields.
    rows = r.text.split('\r\n')
    header = rows[0].split('\t')
    data = []
    for n in range(1, len(rows)):
        cols = rows[n].split('\t')
        data.append(cols)
    df = pd.DataFrame(columns=header, data=data)
else:
    print("error: unable to load {}".format(url))
    sys.exit(-1)
print(df.shape)
print(df.head(2))

Output:

$ ./test.py
(66369, 10)
                permalink      name            homepage_url                                      category_list     status country_code state_code      region           city  founded_at
0     /Organization/-Fame     #fame      http://livfame.com                                              Media  operating          IND         16      Mumbai         Mumbai
1  /Organization/-Qounter  :Qounter  http://www.qounter.com  Application Platforms|Real Time|Social Network...  operating          USA         DE  DE - Other  Delaware City  04-09-2014
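
As an alternative to splitting the text by hand, pd.read_csv() takes a sep argument, and its default separator is a comma. The sketch below assumes the file is tab-separated, as the split('\t') calls in the answer suggest. It runs on a small made-up sample in the same layout rather than on the real URL, and the helper name read_tab_separated is hypothetical. Whether the real file also needs a different encoding argument is not something this sketch checks.

```python
import io

import pandas as pd


def read_tab_separated(source):
    """Read tab-separated text from a URL, a path, or a file-like object.

    Hypothetical helper for illustration; it only wraps pd.read_csv with sep="\t".
    """
    return pd.read_csv(source, sep="\t")


# Made-up sample in the same layout as the file (only four columns, for brevity).
sample = io.StringIO(
    "permalink\tname\tstatus\tcountry_code\n"
    "/Organization/-Fame\t#fame\toperating\tIND\n"
    "/Organization/-Qounter\t:Qounter\toperating\tUSA\n"
)

df = read_tab_separated(sample)
print(df.shape)
print(df.head(2))
```

With the real file, the same helper could be called with the url string instead of the StringIO object, since pd.read_csv() accepts URLs directly.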
