garmentiq.utils.check_filenames_metadata

import os
import shutil


def check_filenames_metadata(output_dir, file_dir, metadata_df):
    """
    Validates that the filenames in a directory match those listed in a metadata DataFrame.

    This function compares the filenames found in the specified directory with the filenames
    listed in the provided metadata DataFrame. If the filenames do not match exactly (after sorting),
    the output directory is deleted and a `ValueError` is raised. If they match, a confirmation
    message is printed.

    Args:
        output_dir (str): Path to the output directory that will be deleted if filenames do not match.
        file_dir (str): Path to the directory containing the files to check.
        metadata_df (pandas.DataFrame): A pandas DataFrame containing a 'filename' column with expected filenames.

    Raises:
        ValueError: If the filenames in the directory do not match those in the metadata.

    Returns:
        None
    """
    # Sort both lists so the comparison does not depend on listing order.
    file_list = os.listdir(file_dir)
    metadata_filenames = metadata_df["filename"].tolist()
    file_list.sort()
    metadata_filenames.sort()

    if file_list != metadata_filenames:
        # Remove the output directory before signalling the mismatch.
        shutil.rmtree(output_dir)
        raise ValueError(
            "Mismatch between directory filenames and metadata filenames. Maybe try again."
        )
    else:
        print(f"\nAll filenames in {file_dir} match the metadata.\n")
def check_filenames_metadata(output_dir, file_dir, metadata_df):

Validates that the filenames in a directory match those listed in a metadata DataFrame.

This function compares the filenames found in the specified directory with the filenames listed in the provided metadata DataFrame. If the filenames do not match exactly (after sorting), the output directory is deleted and a ValueError is raised. If they match, a confirmation message is printed.

Arguments:
  • output_dir (str): Path to the output directory that will be deleted if filenames do not match.
  • file_dir (str): Path to the directory containing the files to check.
  • metadata_df (pandas.DataFrame): A pandas DataFrame containing a 'filename' column with expected filenames.
Raises:
  • ValueError: If the filenames in the directory do not match those in the metadata.
Returns:
  • None
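
The `ValueError` raised on a mismatch does not say which filenames differ. As a rough sketch of how one might inspect the difference before calling `check_filenames_metadata`, the snippet below uses a hypothetical helper, `filename_mismatch`, which is not part of garmentiq. It assumes the expected names are passed as a plain list (for example `metadata_df["filename"].tolist()`), and it compares sets, so unlike the sorted-list comparison above it ignores duplicate entries in the metadata.

```python
import os
import tempfile


def filename_mismatch(file_dir, expected_filenames):
    """Hypothetical helper (not part of garmentiq): report how a directory
    differs from a list of expected filenames."""
    actual = set(os.listdir(file_dir))
    expected = set(expected_filenames)
    missing = sorted(expected - actual)      # expected, but absent on disk
    unexpected = sorted(actual - expected)   # on disk, but not expected
    return missing, unexpected


# Build a throwaway directory holding two empty files, then compare it
# against an expected list that differs by one name.
with tempfile.TemporaryDirectory() as file_dir:
    for name in ("a.jpg", "b.jpg"):
        open(os.path.join(file_dir, name), "w").close()
    missing, unexpected = filename_mismatch(file_dir, ["a.jpg", "c.jpg"])

print(missing)     # names listed as expected but not found in the directory
print(unexpected)  # names found in the directory but not listed as expected
```

Running a check like this first can be useful because `check_filenames_metadata` deletes `output_dir` as soon as the lists differ.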