Converting a data frame to a ped object in R (pedtools)

Posted by c9qzyr3d on 2023-04-09 in Other

I have this data frame in R. It has the structure of a pedigree data frame, with the columns id, fid, mid and sex.

pedigree <- structure(list(id = c(212, 214, 263, 266, 273, 274, 275, 279, 
280, 281, 286, 287, 312, 313, 314, 315, 316, 317, 318, 319, 320, 
321, 322, 323, 324, 325, 326, 327, 332, 333, 334, 335, 336, 337, 
338, 339, 340, 341, 346, 347, 348, 349, 389, 390, 391, 392, 413, 
414, 415, 416, 466, 475, 476, 477, 478, 479, 480, 483, 486, 487, 
491, 492, 493, 494, 498, 501, 502, 506, 507, 508, 509, 510, 511, 
512, 513, 514, 518, 519, 542, 543, 544, 545, 546, 547, 551, 552, 
553, 554, 555, 556, 564, 565, 568, 569, 570, 575, 576, 579, 580, 
584, 585, 586, 589, 590, 593, 595, 596, 597, 598, 599, 614, 615, 
616, 617, 618, 619, 620, 621, 622, 623, 624, 625, 653, 654, 662, 
663, 671, 672, 673, 674, 675, 676, 681, 682, 683, 684, 688, 689, 
693, 694, 695, 696, 697, 698, 701, 702, 703, 704, 709, 710, 715, 
716, 718, 720, 721, 722, 723, 724, 725, 726, 727, 730, 731, 736, 
737, 738, 739, 740, 744, 745, 842, 843, 874, 875, 884, 885, 886, 
887, 889, 890, 894, 895, 896, 897, 898, 903, 905, 906, 907, 908, 
909, 910, 911, 912, 913, 914, 915, 917, 925, 926, 927, 928, 929, 
931, 932, 936, 965, 999, 1000, 1006, 1007, 1041, 1043, 1044, 
1046, 1068, 1069, 1070, 1071, 1072, 1073, 1074, 1075, 1099, 1100, 
1101, 1321, 1322, 1368, 1551, 1552, 1553, 1554, 1555), fid = c(0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 326, 326, 326, 326, 279, 320, 320, 320, 320, 320, 320, 
320, 320, 320, 324, 324, 324, 324, 322, 322, 322, 324, 324, 324, 
324, 324, 324, 324, 324, 324, 318, 318, 326, 326, 326, 326, 326, 
326, 326, 326, 326, 326, 326, 326, 332, 332, 287, 287, 287, 287, 
287, 286, 286, 346, 346, 346, 348, 348, 348, 326, 326, 326, 326, 
326, 332, 332, 320, 320, 320, 320, 320, 287, 346, 346, 346, 346, 
273, 273, 273, 273, 266, 334, 334, 334, 334, 334, 336, 336, 336, 
336, 336, 336, 334, 334, 334, 334, 334, 334, 338, 338, 338, 338, 
340, 340, 340, 338, 338, 334, 334, 334, 334, 334, 334, 334, 334, 
314, 314, 314, 314, 314, 314, 314, 312, 312, 0, 0, 286, 286, 
314, 314, 314, 314, 314, 314, 334, 334, 334, 334, 334, 389, 389, 
389, 389, 389, 389, 389, 389, 389, 389, 389, 389, 338, 332, 332, 
332, 332, 332, 332, 332, 346, 274, 391, 391, 391, 391, 0, 0, 
0, 0, 316, 316, 316, 316, 316, 316, 316, 316, 842, 842, 842, 
1041, 1041, 1041, 1043, 1043, 1043, 1043, 1043), mid = c(0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 327, 327, 327, 327, 275, 321, 321, 321, 321, 321, 321, 
321, 321, 321, 325, 325, 325, 325, 323, 323, 323, 325, 325, 325, 
325, 325, 325, 325, 325, 325, 319, 319, 327, 327, 327, 327, 327, 
327, 327, 327, 327, 327, 327, 327, 333, 333, 212, 212, 212, 212, 
212, 214, 214, 347, 347, 347, 349, 349, 349, 327, 327, 327, 327, 
327, 333, 333, 321, 321, 321, 321, 321, 212, 347, 347, 347, 347, 
281, 281, 281, 281, 263, 335, 335, 335, 335, 335, 337, 337, 337, 
337, 337, 337, 335, 335, 335, 335, 335, 335, 339, 339, 339, 339, 
341, 341, 341, 339, 339, 335, 335, 335, 335, 335, 335, 335, 335, 
315, 315, 315, 315, 315, 315, 315, 313, 313, 0, 0, 214, 214, 
315, 315, 315, 315, 315, 315, 335, 335, 335, 335, 335, 390, 390, 
390, 390, 390, 390, 390, 390, 390, 390, 390, 390, 339, 333, 333, 
333, 333, 333, 333, 333, 347, 280, 392, 392, 392, 392, 0, 0, 
0, 0, 317, 317, 317, 317, 317, 317, 317, 317, 843, 843, 843, 
1044, 1044, 1044, 1046, 1046, 1046, 1046, 1046), sex = structure(c(1L, 
1L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 
2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 
2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 
2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 
2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 
1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 
2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 
1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 
2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L), levels = c("1", "2"), class = "factor")), row.names = c(NA, 
-234L), class = c("tbl_df", "tbl", "data.frame"))

Here is its structure, with 234 individuals:

str(pedigree)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   234 obs. of  4 variables:
 $ id : num  212 214 263 266 273 274 275 279 280 281 ...
 $ fid: num  0 0 0 0 0 0 0 0 0 0 ...
 $ mid: num  0 0 0 0 0 0 0 0 0 0 ...
 $ sex: Factor w/ 2 levels "1","2": 1 1 1 2 2 2 1 2 1 1 ...

I am trying to use pedtools for pedigree analysis.
To convert this data frame to a ped object, I called as.ped(pedigree).
However, I get this "Malformed pedigree" error:

as.ped(pedigree)
Error: Malformed pedigree.
 Individual 287 is female, but appear as the father of 568
 Individual 212 is male, but appear as the mother of 568

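I checked ids 568, 287 and 212, and all of them are assigned correctly. That is, 287 is the mother of 568 (it is included in fid) and, similarly, 212 is the father of 568 (and is included in mid).
By convention, 1 means male and 2 means female.
What is going on?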

Answer 1 (kpbwa7wx)

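"I checked ids 568, 287 and 212, and all of them are assigned correctly. That is, 287 is the mother of 568 (it is included in fid), and similarly for 212."
Looking at your dataset, here is the record for id 568: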

# A tibble: 1 x 4
     id   fid   mid sex  
  <dbl> <dbl> <dbl> <fct>
1   568   287   212 1

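287 is in the fid column, not the mid column as you state (in pedtools, fid is the father's id and mid is the mother's id). There is an error somewhere in the data: either fid and mid have been swapped here, or the sex values for 287 and 212 have been swapped.
Edit: on further inspection, several records list 287 as the father and 212 as the mother: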

# A tibble: 6 x 4
     id   fid   mid sex  
  <dbl> <dbl> <dbl> <fct>
1   568   287   212 1    
2   569   287   212 1    
3   570   287   212 2    
4   575   287   212 1    
5   576   287   212 2    
6   621   287   212 2

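This suggests that the sex values for 287 and 212 are incorrect (rather than fid and mid being swapped in several records), but you will need to check the data source (or the processing pipeline) to confirm.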
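One way to check this across the whole table, rather than one individual at a time, is to list every id that is used as a father but coded female, or used as a mother but coded male. The sketch below uses base R only and a small made-up data frame (`toy`) standing in for `pedigree`; it assumes fid is the father, mid is the mother, 1 is male and 2 is female.

```r
# Toy stand-in for `pedigree` (hypothetical data, same column layout).
toy <- data.frame(
  id  = c(1, 2, 3, 4),
  fid = c(0, 0, 2, 2),   # 2 is listed as the father ...
  mid = c(0, 0, 1, 1),   # ... and 1 as the mother,
  sex = factor(c(1, 2, 1, 2), levels = c("1", "2"))  # but 1 is male and 2 is female
)

# Compare on the factor labels ("1"/"2"), not the internal integer codes.
sex_chr <- as.character(toy$sex)

# Ids that appear as a father but are coded female (sex == 2).
female_fathers <- intersect(toy$fid, toy$id[sex_chr == "2"])
# Ids that appear as a mother but are coded male (sex == 1).
male_mothers <- intersect(toy$mid, toy$id[sex_chr == "1"])

print(female_fathers)
print(male_mothers)
```

If every father turns up in the first list and every mother in the second, the fid and mid columns (or the sex coding) are probably swapped for the whole file rather than for a few records.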

Answer 2 (3duebb1j)

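The problem is that males (1) are assigned as mothers and females (2) are assigned as fathers. R only reports the error for the first case it evaluates.
You can rename the columns with colnames and then run the code: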

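library(pedtools)
# Swap the "fid" and "mid" labels so that fathers are male and mothers are female
colnames(pedigree) <- c("id", "mid", "fid", "sex")
as.ped(pedigree)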

You can also change the names directly in the data frame.
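As a rough sketch of that last suggestion, the two column names can be swapped by name rather than by position, so the result does not depend on the column order. The data frame `toy` is made up for illustration, and the final check assumes the coding 1 = male, 2 = female.

```r
# Toy stand-in for `pedigree` (hypothetical data): 2 is listed as father, 1 as mother.
toy <- data.frame(
  id  = c(1, 2, 3),
  fid = c(0, 0, 2),
  mid = c(0, 0, 1),
  sex = c(1, 2, 1)
)

# Swap the labels "fid" and "mid", using the saved original names
# so the second assignment is not affected by the first.
nm <- names(toy)
names(toy)[nm == "fid"] <- "mid"
names(toy)[nm == "mid"] <- "fid"

# After the swap, every listed father should be male (1)
# and every listed mother female (2); 0 marks a missing parent.
fathers_ok <- all(toy$sex[match(setdiff(toy$fid, 0), toy$id)] == 1)
mothers_ok <- all(toy$sex[match(setdiff(toy$mid, 0), toy$id)] == 2)
print(c(fathers_ok, mothers_ok))
```

Once both checks pass on the real data, as.ped(pedigree) should no longer report a parent of the wrong sex.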
