python -- Matching paths, removing specified paths, and saving the result


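When processing nc (NetCDF) data, some files end up with too many zero values after interpolation, which causes an error for those time steps when the labels are generated. Measured against a full year of data, losing a few time steps does not matter, so I only record the paths of the time steps that raised an error. This makes it easy to exclude them later when the other variables are read, and keeps the processing of the remaining files running.
The main steps of the code are recorded below, in three parts:
1. Record the paths of the files that raise errors
2. Remove the failing paths from the original path lists
3. Match the filtered paths of the other dataset and save them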

1. Record the paths of the files that raise errors

import time
import numpy as np

# `data` (an iterable of file paths) and `process_cloud` are defined elsewhere
skipped_files = []  # paths of the files that were skipped
cloud_label = []
start = time.time()
for filename in data:
    print(filename)
    try:
        cloud_data = process_cloud(filename)
        cloud_label.append(cloud_data)
    except Exception as e:
        # Log the failure and remember the path instead of stopping the loop
        print(f"Error occurred while processing {filename}: {e}")
        skipped_files.append(filename)

cloud_label = np.array(cloud_label)
np.savez_compressed('cloud_label', cloud_label=cloud_label)
if skipped_files:
    # One path per line
    with open("skipped_files.txt", "w") as f:
        f.write("\n".join(skipped_files))
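To reuse the skip list in a later script, the text file can be read back into a plain Python list. The following is a minimal sketch, assuming the file holds one path per line as written above; the helper name `read_skip_list` is hypothetical and not part of the original code.

```python
from pathlib import Path


def read_skip_list(file_path):
    """Return the non-empty lines of a text file as a list of paths."""
    text = Path(file_path).read_text()
    # Strip whitespace and ignore blank lines (e.g. a trailing newline)
    return [line.strip() for line in text.splitlines() if line.strip()]
```

Calling `read_skip_list("skipped_files.txt")` would then give a list that can be used directly in membership or substring tests.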

2. Remove the failing paths from the original path lists

The original path lists still contain the paths that raised errors.

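import pandas as pd
import pickle


def read_pickle_file(file_path):
    """Load a pickled list of paths and return it sorted."""
    with open(file_path, 'rb') as file:
        data = sorted(pickle.load(file))
    return data


sate_path = read_pickle_file(r'./match_sate_list_2018_2018.pkl')
gpm_path = read_pickle_file(r'./match_gpm_list_2018_2018.pkl')
# read_csv no longer accepts squeeze=True (removed in pandas 2.0),
# so take the single column as a list instead
skip_path = pd.read_csv('./skipped_files.txt', header=None)[0].tolist()

# Indices in sate_path whose path matches one of the skipped entries
removed_indices = set()

for index, path in enumerate(sate_path):
    print(index, path)
    if any(skip in path for skip in skip_path):
        removed_indices.add(index)

# sate_path and gpm_path are assumed to be aligned by index,
# so the same indices are dropped from both lists
remaining_sate_path = [path for index, path in enumerate(sate_path) if index not in removed_indices]
remaining_gpm_path = [path for index, path in enumerate(gpm_path) if index not in removed_indices]

# Print the paths that remain after removal
print("Remaining sate_path after removal:", remaining_sate_path)
print("Remaining gpm_path after removal:", remaining_gpm_path)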

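Checking the recorded indices against the original paths shows that they line up: the first entry in skip_path corresponds to index 12 in the original path list, so the result is correct.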
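This check can also be done in code rather than by eye. The sketch below looks up the position of every skipped entry in a path list, using the same substring test as the loop above. The function name `locate_skipped` and the sample paths are made up for illustration and are not the real file lists.

```python
def locate_skipped(paths, skipped):
    """Map each skipped entry to the indices of the paths that contain it."""
    return {
        skip: [i for i, path in enumerate(paths) if skip in path]
        for skip in skipped
    }


# Hypothetical sample data, not the real file lists
paths = ["/data/sate/a.nc", "/data/sate/b.nc", "/data/sate/c.nc"]
skipped = ["b.nc"]
print(locate_skipped(paths, skipped))  # {'b.nc': [1]}
```

An entry that maps to an empty list was never found in the original paths, and an entry that maps to several indices matched more than one path; both cases are worth inspecting before removing anything.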

The code above works as follows:

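1. Read the target path files: the two original path lists, sate and gpm, plus skip, the list of paths that raised errors
2. Loop over the original paths and record the index of every path that raised an error
3. Loop again to drop the paths at those indices from the original lists, then save the filtered paths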
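The three steps can also be sketched as a single function that filters both lists in one pass. This is an alternative formulation under the same assumption as the original code, namely that the two lists are aligned by index. The name `filter_paired_paths` and the sample paths are hypothetical.

```python
def filter_paired_paths(sate_paths, gpm_paths, skip_paths):
    """Drop every index whose satellite path contains a skipped entry."""
    if len(sate_paths) != len(gpm_paths):
        raise ValueError("sate_paths and gpm_paths must have the same length")
    skip_paths = list(skip_paths)
    # Keep a (sate, gpm) pair only if no skipped entry occurs in the sate path
    kept = [
        (sate, gpm)
        for sate, gpm in zip(sate_paths, gpm_paths)
        if not any(skip in sate for skip in skip_paths)
    ]
    remaining_sate = [sate for sate, _ in kept]
    remaining_gpm = [gpm for _, gpm in kept]
    return remaining_sate, remaining_gpm


# Hypothetical sample data
sate = ["/sate/0100.nc", "/sate/0200.nc", "/sate/0300.nc"]
gpm = ["/gpm/0100.nc", "/gpm/0200.nc", "/gpm/0300.nc"]
kept_sate, kept_gpm = filter_paired_paths(sate, gpm, ["/sate/0200.nc"])
print(kept_sate)  # ['/sate/0100.nc', '/sate/0300.nc']
print(kept_gpm)   # ['/gpm/0100.nc', '/gpm/0300.nc']
```

Filtering the pairs together means the two lists cannot drift out of step, and the length check catches the case where they were never aligned in the first place.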

3. Save the filtered paths

Saving method 1:

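def save_paths_to_file(file_path, data):
    """Write the paths to a text file, one per line, in sorted order."""
    with open(file_path, 'w') as file:
        for path in sorted(data):
            file.write(path + '\n')

# Arguments are (file_path, data). The output is plain text,
# so a .txt name is used here instead of .pkl
save_paths_to_file('seafog_sate_path.txt', remaining_sate_path)
save_paths_to_file('seafog_gpm_path.txt', remaining_gpm_path)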

Saving method 2:

import pickle

def save_to_pickle(data, file_path):
    """Serialize the list of paths to a pickle file."""
    with open(file_path, 'wb') as f:
        pickle.dump(data, f)
        print(file_path, 'saved.')

save_to_pickle(remaining_sate_path, '2018_seafog_sate_path.pkl')
save_to_pickle(remaining_gpm_path, '2018_seafog_gpm_path.pkl')
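To confirm that a pickle file was written correctly, it can be loaded back and compared with the list in memory. The following is a small sketch; `load_from_pickle` and `verify_saved` are hypothetical helpers that mirror `save_to_pickle` and are not part of the original code.

```python
import pickle


def load_from_pickle(file_path):
    """Load and return the object stored in a pickle file."""
    with open(file_path, 'rb') as f:
        return pickle.load(f)


def verify_saved(data, file_path):
    """Return True if the pickle file holds exactly the same list as `data`."""
    return load_from_pickle(file_path) == list(data)
```

For example, `verify_saved(remaining_sate_path, '2018_seafog_sate_path.pkl')` should return True right after saving. Pickle files should only be loaded when they come from a trusted source, because unpickling can execute arbitrary code.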