I'm trying incrementally to build a financial statement database. The first steps center around collecting 10-Ks from the SEC's EDGAR database. I have code for pulling the relevant 8-Ks, 10-Ks, and 10-Qs by CIK number and accession number, and retrieving the relevant excel spreadsheet. The code below is now centering on trying to create a folder within a directory, then name the folder with the CIK code, then pull the spreadsheet from the EDGAR database, and save the spreadsheet to the folder with the CIK code. My example is a csv file I'm calling "accessionnumtest.csv", which has headings:
company_name,report_type,cik,date,cik_accession
and data:
4Less Group, Inc.,10K/A,1438901,11/27/2019,edgar/data/1438901/000121390019024801.txt
AB INTERNATIONAL GROUP CORP.,10K,1605331,10/22/2019,edgar/data/1605331/000166357719000384.txt
ABM INDUSTRIES INC /DE/,10K,771497,12/20/2019,edgar/data/771497/000162828019015259.txt
ACTUANT CORP,10K,6955,10/29/2019,edgar/data/6955/000000695519000033.txt
my code is below
import os
import pandas as pdpath = os.getcwd()folder_path = "C:/metricdatadb/"df = pd.read_csv("accessionnumtest.csv")folder_name = df['cik']
print(folder_name)for row in df.iterrows():dir = df.path.join(folder_path, folder_name)os.makedirs(dir)
This code is giving me, AttributeError: 'DataFrame' object has no attribute 'path' error. I have renamed the path, checked for whitespace in the headers. Any suggestions are appreciated.