I have a Python file that reads a file given by the user, processes it, and asks questions in flash card format. The program works fine with an English txt file, but I encounter errors when trying to process a French file.
When I first encountered the error, I was running python cards.py in the Windows Command Prompt. As soon as I input the French file, I got a UnicodeEncodeError. After digging around, I found that it might have something to do with the fact that I was using the cmd window, so I tried IDLE instead. I didn't get any errors there, but I would get weird characters like œ, Ã and ®.
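To illustrate the kind of garbling I mean, here is a minimal sketch. It rests on two unconfirmed assumptions: that the file on disk is UTF-8, and that it is being decoded as cp1252 (a common Windows default). The sample word is made up and is not from my actual file.

```python
# Hypothetical sample text; not taken from the real flash card file.
text = "île"

# Assumption: the file is stored as UTF-8, so "î" becomes the bytes C3 AE.
raw = text.encode("utf-8")

# Assumption: the bytes are then decoded as cp1252, which maps
# C3 to "Ã" and AE to "®", producing two wrong characters for one.
garbled = raw.decode("cp1252")

# ascii() is used so the demo itself cannot fail on a limited console.
print(ascii(garbled))

# Reversing the two steps recovers the original text.
recovered = garbled.encode("cp1252").decode("utf-8")
print(ascii(recovered))
```

If this is what is happening, it would mean the file is being read with the wrong encoding rather than being damaged.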
Upon further research, I found some documentation that says to pass encoding='insert encoding type' to the open(file) call in my code. After running the program again in IDLE, this seemed to reduce the problem, but I would still get some weird characters. When running it in cmd, it wouldn't break IMMEDIATELY, but it eventually would when it encountered a character it could not handle.
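For reference, this is roughly what the change looks like, as a minimal sketch rather than my real code. The names read_cards and unencodable, the file name, and the sample words are all hypothetical. The second helper assumes the console uses cp437 (a common Command Prompt code page on US Windows), which I have not verified for my machine.

```python
import os
import tempfile


def read_cards(path, encoding="utf-8"):
    """Return the non-empty lines of a text file, decoded with `encoding`."""
    with open(path, encoding=encoding) as f:
        return [line.strip() for line in f if line.strip()]


def unencodable(text, codec):
    """Return the characters in `text` that `codec` cannot represent."""
    bad = []
    for ch in text:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            bad.append(ch)
    return bad


# Write a throwaway UTF-8 file so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "french.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write("été\ncœur\n")
    cards = read_cards(path)

# Assumed console code page; "é" exists in cp437 but "œ" does not.
problem_chars = [unencodable(card, "cp437") for card in cards]

# ascii() keeps the demo printable on any console.
print([ascii(c) for c in cards])
print([[ascii(ch) for ch in chars] for chars in problem_chars])
```

Under that assumption, a line containing only "é" would print fine in cmd while a later line containing "œ" would raise the error, which would match the program not breaking immediately.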
My question: what do I need to implement to ensure the program can handle ALL of the characters in the file (in any language), and why do IDLE and the Command Prompt handle the file differently?
EDIT: I forgot to mention that I ended up using utf-8 as the encoding, which gave the results I described.