Convert several YAML files to CSV

2024/7/7 6:39:01

I am very new to Python and have several YAML files that I need to convert into csv. These are notes, comments and emails that came from our CRM (Highrise). I ONLY need the Notes and Comments, not the emails. Here are a few examples.

Test_Co_1.txt

---
- ID: 273679215
  Name: Test Company 1
  Tags:
  - Sample tag 1
  - Sample tag 2
  - Sample tag 3
  - Sample tag 4
- Contact:
  - - Addresses
    - - "123 W Elm Street, Anywhere, FL, 11111, United States"
  - - Phone_numbers
    - - 555-111-2222
- Background: sample text
- Note 424169327:
  - Author: Diane S.
  - Written: "May 16, 2017 19:32"
  - About: Jeff Smith
  - Body: Called 5/16/17 - Receptionist indicated Jeff was unavailable. She said they are not interested in attending any webinars hung up.
- Note 424598243:
  - Author: Jenny S.
  - Written: "May 18, 2017 15:45"
  - About: Test Company 1
  - Body: |-
      email sent to TM: Pete
      Pete,
      Can you help us with this prospective customer to determine if he is interested?We made some outbound calls this week, inviting dealers to the prospective dealer Summer Series webinars, with the first one being this Friday.  Can you see if Jeff is interested?  We do not have an email for him.  Do you have that?
      This is the note from earlier this week:
      Called 5/16/17 - Receptionist indicated Jeff was unavailable. She said they are not interested in attending any webinars hung up.
      Thanks for your help.
      photo
      Jenny
- Comment 424601588:
  - Author: Jenny S.
  - Written: "May 18, 2017 15:56"
  - About: Test Company 1
  - Body: |-
      email back from TM: Jenny,
      Yes.  I will reach out to them. Thanks!
      Pete

Another Example: Fake_Co_2

---
- ID: 306184746
  Name: Fake Company 2
  Tags:
  - Sample Tag 1
- Contact:
  - - Addresses
    - - "444 N Oak St, Faketon City, MI, 22222, United States"
  - - Phone_numbers
    - - 333-333-3333
- Note 473905168:
  - Author: Robin S.
  - Written: "February 20, 2018 22:19"
  - About: Fake Company 2
  - Body: "1:1 with Steven 2/27/18"
- Email 476444812:
  - Author: Aaron N.
  - Written: "March 06, 2018 16:30"
  - About: Jose Viago
  - Subject: Welcome Call
  - Body: |-
      Hello Jose,
      We just talked and we scheduled your welcome call.  I noticed after we hungup that time changes this weekend.  Unfortunately Arizona
      doesn't change time and we will now be 2 hours behind you.  Are you
      available on at 10:30 AM CST on Tuesday, March 13th?  Otherwise I will need
      to schedule at a different time.  I apologize for the error and inconvenience. <http://fakedomain.com/> Support Team Lead D: xxx-xxx-xxxx | C: xxx-xxx-xxxx | F: xxx-xxx-xxxx <mailto:[email protected]> [email protected] <http://fakedomain.com/> Website |<https://www.youtube.com/watch?v=xxx> Our Story
      Confidentiality Disclaimer: This email may contain confidential and/or
      private information. If you received this email in error please delete and notify
      sender.
- Note 476458623:
  - Author: Jamie H.
  - Written: "March 06, 2018 17:12"
  - About: Fake Company 2
  - Body: ""
- Note 476460268:
  - Author: Aaron N.
  - Written: "March 06, 2018 17:18"
  - About: Fake Company 2
  - Body: |-
      Called and talked to Jose and scheduled the Welcome Call for Tuesday, March 13 at 9:30 AM.  After I hung up I realized that time changes this weekend.  I left him a voice mail and emailed to see if doing the appointment at 10:30 AM would be ok.  Prep for appointment: Monday, March 12 at 2:30 PM Welcome Call: Tuesday, March 13 at 10:30 AM CST
      Jose emailed back and said that 10:30 is fine.  Michael H has been scheduled
- Comment 476460532:
  - Author: Aaron N.
  - Written: "March 06, 2018 17:18"
  - About: Jose Viago
  - Body: |-
      From: Jose Viago [mailto:[email protected]] Sent: Tuesday, March 6, 2018 10:01 AM
      To: [email protected]
      Subject: Re: Welcome Call
      Yes that is fine.  Thank you! Jose Viago
      Fake Company 2
      xxx-xxx-xxxx
- Note 477585004:
  - Author: Laura H.
  - Written: "March 12, 2018 23:46"
  - About: Fake Company 2
  - Body: |-
      Welcome call prep complete. Roadmap & workbook have been saved to their profile in BOX, and updated per their provided information. 03/12/18 (LH)
- Note 477740716:
  - Author: Michael H.
  - Written: "March 13, 2018 16:47"
  - About: Fake Company 2
  - Body: |-
      03-13-2018. Did a welcome call with Jose. Jose now has access to the box. We will have a follow up call for Dashboard roll out.
      03-13-2018. Did a follow up with Jose. He now has owner and tech role to the App and Dashboard. We also reviewed Online portal and help center. (MH)
- Note 502997603:
  - Author: Laura H.
  - Written: "August 06, 2018 17:14"
  - About: Fake Company 2
  - Body: |-
      Received a text from Jose letting me know there is a leak in his office, and he needs to reschedule our call today. I moved him to Thursday 08/09/18 @ 9:00AM CDT. 08/06/18 (LH)

Some of these text files are thousands of lines long, containing every internal note, comment, and email ever recorded for that specific customer (or for a contact who works for that customer).

We are moving to a different CRM and need to import the Notes and Comments only. I would like to generate a csv (or multiple csv files if needed) like this:

output.csv

Name,Author,Written,About,Body
"Fake Company 2"|"Robin S."|"February 20, 2018 22:19"|"Fake Company 2"|"1:1 with Steven 2/27/18"
"Fake Company 2"|"Aaron N."|"March 06, 2018 17:18"|"Fake Company 2"|"Called and talked to Jose and scheduled the Welcome Call for Tuesday, March 13 at 9:30 AM.  After I hung up I realized that time changes this weekend.  I left him a voice mail and emailed to see if doing the appointment at 10:30 AM would be ok.  Prep for appointment: Monday, March 12 at 2:30 PM Welcome Call: Tuesday, March 13 at 10:30 AM CST Jose emailed back and said that 10:30 is fine.  Michael H has been scheduled"

I found the code in "Need a script that extracts from a yaml file content and output as a csv file", but I do not know enough about Python to get it to work without syntax errors.

Answer

I would use a Python YAML library such as PyYAML to do this. It can be installed using:

pip install pyyaml

The files you have given could then be converted to CSV as follows:

import csv
import yaml

fieldnames = ['Name', 'Author', 'Written', 'About', 'Body']

with open('output.csv', 'w', newline='') as f_output:
    csv_output = csv.DictWriter(f_output, fieldnames=fieldnames)
    csv_output.writeheader()

    for filename in ['Test_Co_1.txt', 'Fake_Co_2.txt']:
        with open(filename) as f_input:
            data = yaml.safe_load(f_input)

        # The first entry holds the company details (ID, Name, Tags)
        name = data[0]['Name']

        for entry in data:
            # Each entry is a dictionary with a single key, e.g. 'Note 424169327'
            key = next(iter(entry))

            # Keep notes and comments, skip emails and everything else
            if key.startswith('Note') or key.startswith('Comment'):
                row = {'Name': name}

                # The fields are stored as a list of one-item dictionaries
                for d in entry[key]:
                    for get in ['Author', 'Written', 'About', 'Body']:
                        try:
                            row[get] = d[get]
                        except KeyError:
                            pass

                csv_output.writerow(row)

This assumes a standard CSV format (i.e. commas between fields and quotes are used if a field contains a newline or commas).
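The sample output in the question uses `|` between fields and quotes every value. If that layout is actually required, the writer can probably be configured for it. The following is a minimal sketch, not part of the script above: `write_rows` and `sample_rows` are hypothetical names used only for illustration, and the output is written to an in-memory buffer so it can be inspected. Note that the header row is also written with `|` separators.

```python
import csv
import io

fieldnames = ['Name', 'Author', 'Written', 'About', 'Body']

def write_rows(f_output, rows):
    """Write rows as pipe-separated values with every field quoted."""
    writer = csv.DictWriter(
        f_output,
        fieldnames=fieldnames,
        delimiter='|',          # use | instead of the default comma
        quoting=csv.QUOTE_ALL,  # quote every field, not only those that need it
    )
    writer.writeheader()
    writer.writerows(rows)

# One row taken from the sample data in the question
sample_rows = [{
    'Name': 'Fake Company 2',
    'Author': 'Robin S.',
    'Written': 'February 20, 2018 22:19',
    'About': 'Fake Company 2',
    'Body': '1:1 with Steven 2/27/18',
}]

buffer = io.StringIO()
write_rows(buffer, sample_rows)
print(buffer.getvalue())
```

In the main script, the same `delimiter` and `quoting` arguments could be passed to the existing `csv.DictWriter(...)` call. Check first which format the new CRM's importer expects.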

To understand this, I would recommend adding some print statements to see what things look like. For example, data holds the entire file contents as nested lists and dictionaries. It is then a case of extracting the bits you need.
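As a rough illustration of that kind of inspection, the sketch below walks a hand-written stand-in for `data`. The literal is an assumption about what `yaml.safe_load()` returns for the first sample file (abbreviated, with a shortened body), so the snippet runs without any input files. `first_keys` is a hypothetical helper name.

```python
# Hand-written stand-in for the result of yaml.safe_load() on the first
# sample file (assumed shape, abbreviated): a list of one-key dictionaries.
data = [
    {'ID': 273679215, 'Name': 'Test Company 1',
     'Tags': ['Sample tag 1', 'Sample tag 2']},
    {'Background': 'sample text'},
    {'Note 424169327': [
        {'Author': 'Diane S.'},
        {'Written': 'May 16, 2017 19:32'},
        {'About': 'Jeff Smith'},
        {'Body': 'Called 5/16/17 - Receptionist indicated Jeff was unavailable.'},
    ]},
]

def first_keys(entries):
    """Return the first key of each top-level entry."""
    return [next(iter(entry)) for entry in entries]

# The top level is a list
print(type(data).__name__, len(data))

# Each entry is a dictionary; show its first key and the type of its value
for entry in data:
    key = next(iter(entry))
    print(key, '->', type(entry[key]).__name__)

# A note's value is a list of one-item dictionaries, one per field
for field in data[2]['Note 424169327']:
    print(field)
```

Seeing that each note is a list of one-item dictionaries explains why the script loops over `entry[key]` and looks up one field at a time.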

To apply this to all of your YAML files, I would replace the list of filenames with a call to glob.glob('*.txt').
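A minimal sketch of that lookup is shown below. It assumes all of the exported files sit in one folder and end in `.txt`; `find_yaml_files` is a hypothetical helper name, not something from the script above.

```python
import glob
import os

def find_yaml_files(folder='.', pattern='*.txt'):
    """Return the matching files in folder, sorted for a stable order."""
    return sorted(glob.glob(os.path.join(folder, pattern)))

# List what would be processed before running the full conversion
for filename in find_yaml_files():
    print(filename)
```

In the main script, the loop would then become `for filename in find_yaml_files():`. Because the output file is named `output.csv`, the `*.txt` pattern does not pick it up as an input.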

https://en.xdnf.cn/q/120637.html
