The current implementation of the DOI ingestion function reads the input file line by line, regardless of file format or structure, so any file type is accepted. Current code:
with open(args.list_of_dois, "r") as csv_file:
    for line in csv_file:
        list_of_dois.append(line.strip())
Problem: This code does not validate the file type, so it will try to process any input file (e.g., .txt, .csv, .json, .yaml). It works for line-based formats, but a file with a different format or structure is still read line by line, and each line is treated as a DOI.
Also, if an invalid .csv file is passed, the pipeline gives no failure feedback: it still reports a Success message.
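One possible direction, shown only as a minimal sketch: check the file extension and the shape of each line before accepting the input, and raise an error rather than reporting success. The function name load_dois, the allowed extensions, and the loose DOI pattern are all assumptions made for illustration and are not part of the existing pipeline. The sketch also assumes one DOI per line with no header row.

```python
import re
from pathlib import Path

# Assumption: only these line-based extensions are acceptable inputs.
ALLOWED_SUFFIXES = {".csv", ".txt"}
# Assumption: a loose shape check ("10.<registrant>/<suffix>"), not a full DOI validator.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def load_dois(path):
    """Return the DOIs listed in `path`; raise ValueError on unusable input."""
    path = Path(path)
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"Unsupported file type: {path.suffix or '(none)'}")

    dois, bad_lines = [], []
    with open(path, "r", encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            entry = line.strip()
            if not entry:
                continue  # skip blank lines
            if DOI_PATTERN.match(entry):
                dois.append(entry)
            else:
                bad_lines.append(number)  # remember where the input looks wrong

    if bad_lines:
        raise ValueError(f"Lines that do not look like DOIs: {bad_lines}")
    if not dois:
        raise ValueError("No DOIs found in input file")
    return dois
```

The caller could catch ValueError, print the message, and exit with a non-zero status so that an invalid file no longer ends in a Success message.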