garmentiq.utils.check_filenames_metadata
import os
import shutil


def check_filenames_metadata(output_dir, file_dir, metadata_df):
    """
    Validates that the filenames in a directory match those listed in a metadata DataFrame.

    This function compares the filenames found in the specified directory with the filenames
    listed in the provided metadata DataFrame. If the filenames do not match exactly (after sorting),
    the output directory is deleted and a `ValueError` is raised. If they match, a confirmation
    message is printed.

    Args:
        output_dir (str): Path to the output directory that will be deleted if filenames do not match.
        file_dir (str): Path to the directory containing the files to check.
        metadata_df (pandas.DataFrame): A pandas DataFrame containing a 'filename' column with expected filenames.

    Raises:
        ValueError: If the filenames in the directory do not match those in the metadata.

    Returns:
        None
    """
    file_list = os.listdir(file_dir)
    metadata_filenames = metadata_df["filename"].tolist()
    file_list.sort()
    metadata_filenames.sort()

    if file_list != metadata_filenames:
        shutil.rmtree(output_dir)
        raise ValueError(
            f"Mismatch between directory filenames and metadata filenames. Maybe try again."
        )
    else:
        print(f"\nAll filenames in {file_dir} match the metadata.\n")
def check_filenames_metadata(output_dir, file_dir, metadata_df):
Validates that the filenames in a directory match those listed in a metadata DataFrame.
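This function compares the entries returned by os.listdir(file_dir) with the values in the 'filename' column of metadata_df. Both lists are sorted and then compared for exact equality, so a missing file, an extra file or subdirectory in file_dir, or a duplicated name in the metadata all count as a mismatch. On a mismatch, output_dir is deleted recursively with shutil.rmtree and a ValueError is raised. If the lists match, a confirmation message is printed.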
Arguments:
- output_dir (str): Path to the output directory that will be deleted if filenames do not match.
- file_dir (str): Path to the directory containing the files to check.
- metadata_df (pandas.DataFrame): A pandas DataFrame containing a 'filename' column with expected filenames.
Raises:
- ValueError: If the filenames in the directory do not match those in the metadata.
Returns:
None
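Because a mismatch deletes output_dir and the error message does not say which names differ, it may help to inspect the difference before calling check_filenames_metadata. The following is a minimal sketch under stated assumptions: diff_filenames is a hypothetical helper, not part of garmentiq, and it takes a plain list of expected names (for example metadata_df["filename"].tolist()) so that it runs with the standard library alone.

```python
import os
import tempfile


def diff_filenames(file_dir, expected_filenames):
    """Hypothetical helper (not part of garmentiq).

    Reports how the directory listing differs from the expected
    filenames, without deleting anything.
    """
    actual = set(os.listdir(file_dir))      # includes subdirectories and hidden files
    expected = set(expected_filenames)
    missing = sorted(expected - actual)     # listed in the metadata, absent on disk
    unexpected = sorted(actual - expected)  # present on disk, absent from the metadata
    return missing, unexpected


# Demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.jpg", "b.jpg"):
        open(os.path.join(tmp, name), "w").close()
    missing, unexpected = diff_filenames(tmp, ["a.jpg", "c.jpg"])
    print(missing)      # ['c.jpg']
    print(unexpected)   # ['b.jpg']
```

Note that this sketch compares sets, so unlike the sorted-list comparison in check_filenames_metadata it does not flag duplicated names in the metadata; it is meant as a diagnostic aid rather than a replacement.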