pandas: returning unique values from a function called in a for loop

mklgxw1f  posted on 2023-03-06  in  Other

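I have a logic problem that I'm trying to get my head around. I'm currently processing some data on a specimen-by-specimen basis. Each specimen has a DataFrame of raw data associated with it. The number of specimens processed at once varies (i.e., one run of the code might process two specimens, another four, another only one). It currently gives me the outputs for each specimen, but I want to be able to return certain values from the function so I can perform further calculations (taking the average across all specimens, etc.). So far, my code snippet looks like this: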

import process_data
import rafu
import pandas as pd

# Placeholder specimen IDs (strings, since they are used to build file names)
specimen_ids = ['spec_1', 'spec_2', 'spec_3']
directory = 'some_folder'
cycle_rise_values = [1, 2, 3]
cycle_fall_values = [3, 2, 1]
stage_type = 'cycling'
target_growths = [0.1, 0.2, 0.3]

for y in range(len(specimen_ids)):

    # Read in dcpd data for specimen - this
    dcpd_df = rafu.dcpd_data_input(directory, specimen_ids[y])

    # If data is there, process it
    if dcpd_df is not None:

        # delta_k_values and crack_length_values are overwritten on every iteration
        specimen_df_csv, summary_df, delta_k_values, crack_length_values = process_data.main(dcpd_df, cycle_rise_values, cycle_fall_values, stage_type, target_growths)

        specimen_df_csv.to_csv(directory + '\\' + specimen_ids[y] + '.csv')

        summary_df.to_csv(directory + '\\' + specimen_ids[y] + ' Summary.csv', index=False)

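My problem here is with the two outputs delta_k_values and crack_length_values: I need to associate them with the specimen ID for future calculations (right now, my code just overwrites the values for each specimen). Is there a way to attach a unique specimen identifier to them? I've heard of eval, but I'm not sure whether that's the right approach. Any help would be great, cheers!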

7cjasjjr1#

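If you want to keep track of the specimen ID for each delta_k_values and each crack_length_values, a dict would probably work.
I would also iterate over specimen_ids directly, since you don't use y for anything other than indexing into specimen_ids.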

import process_data
import rafu
import pandas as pd

import os.path

# Placeholder specimen IDs (strings, since they are used to build file names)
specimen_ids = ['spec_1', 'spec_2', 'spec_3']
directory = 'some_folder'
cycle_rise_values = [1, 2, 3]
cycle_fall_values = [3, 2, 1]
stage_type = 'cycling'
target_growths = [0.1, 0.2, 0.3]

# One entry per specimen, keyed by specimen ID
delta_k_values = {}
crack_length_values = {}
for specimen_id in specimen_ids:

    # Read in dcpd data for specimen - this
    dcpd_df = rafu.dcpd_data_input(directory, specimen_id)

    # If data is there, process it
    if dcpd_df is not None:

        specimen_df_csv, summary_df, delta_k_values[specimen_id], crack_length_values[specimen_id] = process_data.main(dcpd_df, cycle_rise_values, cycle_fall_values, stage_type, target_growths)

        specimen_df_csv.to_csv(os.path.join(directory, f"{specimen_id}.csv"))

        summary_df.to_csv(os.path.join(directory, f"{specimen_id} Summary.csv"), index=False)

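(I changed the path/filename bit, but the result is the same. Either version works; using os.path.join is probably a bit safer where directory separators are concerned.)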
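As a small illustration of the path handling, here is a sketch that builds the two output paths with os.path.join instead of concatenating '\\' by hand. The directory and specimen ID are placeholder values, not real data.

```python
import os.path

directory = "some_folder"
specimen_id = "spec_1"  # placeholder ID for illustration

# os.path.join inserts the separator that is correct for the current OS
csv_path = os.path.join(directory, f"{specimen_id}.csv")
summary_path = os.path.join(directory, f"{specimen_id} Summary.csv")

print(csv_path)
print(summary_path)
```

The same code then runs unchanged on Windows and on systems that use '/' as the separator.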
Then, once the loop has finished, you can iterate over the dicts, for example:

for specimen_id, value in delta_k_values.items():
    print(specimen_id, ':', value)

Or access a value directly, if you know the ID:

specific_value = delta_k_values[known_id]
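
Since the question mentions averaging over all specimens, here is a rough sketch of how the collected dict could feed that calculation. It assumes each entry is a list of numbers of equal length, which the question does not confirm; the values below are made up.

```python
from statistics import mean

# Hypothetical results, standing in for the dict filled by the loop
delta_k_values = {
    "spec_1": [10.0, 12.0, 14.0],
    "spec_2": [11.0, 13.0, 15.0],
    "spec_3": [12.0, 14.0, 16.0],
}

# Mean of each specimen's own values, still keyed by specimen ID
per_specimen_mean = {sid: mean(vals) for sid, vals in delta_k_values.items()}

# Mean across specimens at each position (zip stops at the shortest list)
mean_across_specimens = [mean(group) for group in zip(*delta_k_values.values())]

print(per_specimen_mean)
print(mean_across_specimens)
```

Specimens whose data was missing never get a key in the dict, so they are simply left out of both calculations.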

qfe3c7zg2#

It sounds like you need a dict keyed by specimen ID:

import process_data
import rafu
import pandas as pd

# Placeholder specimen IDs (strings, since they are used to build file names)
specimen_ids = ['spec_1', 'spec_2', 'spec_3']
directory = 'some_folder'
cycle_rise_values = [1, 2, 3]
cycle_fall_values = [3, 2, 1]
stage_type = 'cycling'
target_growths = [0.1, 0.2, 0.3]

# Maps specimen ID -> (delta_k_values, crack_length_values)
values = {}

for specimen_id in specimen_ids:
    # Read in dcpd data for specimen - this
    dcpd_df = rafu.dcpd_data_input(directory, specimen_id)

    # If data is there, process it
    if dcpd_df is not None:

        specimen_df_csv, summary_df, delta_k_values, crack_length_values = process_data.main(dcpd_df, cycle_rise_values, cycle_fall_values, stage_type, target_growths)
        values[specimen_id] = delta_k_values, crack_length_values

        specimen_df_csv.to_csv(directory + '\\' + specimen_id + '.csv')

        summary_df.to_csv(directory + '\\' + specimen_id + ' Summary.csv', index=False)

Now you can access the saved values like this (given a specimen ID):
delta_k_values, crack_length_values = values[specimen_id]
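
If the later calculations need all specimens at once, one possible next step is to flatten values into one row per measurement, tagged with its specimen ID. This is only a sketch: it assumes both outputs are equal-length lists of numbers, and the contents of values below are invented for illustration.

```python
# Hypothetical contents: specimen ID -> (delta_k_values, crack_length_values)
values = {
    "spec_1": ([10.0, 12.0], [0.1, 0.2]),
    "spec_2": ([11.0, 13.0], [0.3, 0.4]),
}

# One row per measurement, each carrying the ID of the specimen it came from
rows = [
    {"specimen_id": sid, "delta_k": dk, "crack_length": length}
    for sid, (dk_list, length_list) in values.items()
    for dk, length in zip(dk_list, length_list)
]

for row in rows:
    print(row)
```

A list of dicts like rows can be passed straight to pd.DataFrame(rows), after which grouping by the specimen_id column gives per-specimen statistics.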
