python-3.x 将重复的键值对文本文件转换为JSON

2ic8powd  于 2023-05-19  发布在  Python
关注(0)|答案(3)|浏览(145)

我有一个文件,其中包含以下格式:

  1. Cus Id: 1234
  2. Cus Name: 10:/John Parks
  3. Cus Type: temporary
  4. Cus client: tesla;toyota
  5. Dept Id: 111
  6. Cus Id: 1235
  7. Cus Name: 10:/William Parks
  8. Cus Type: temporary
  9. Cus client:
  10. Dept Id: 222

如何将其转换为JSON格式?任何方法bash,jq或python都可以。

  1. [
  2. {
  3. "Cus Id": 1234,
  4. "Cus Name": "10:/John Parks",
  5. "Cus Type": "temporary",
  6. "Cus client": "tesla;toyota",
  7. "Dept Id": 111
  8. },
  9. {
  10. "Cus Id": 1235,
  11. "Cus Name": "10:/William Parks",
  12. "Cus Type": "temporary",
  13. "Cus client": "null",
  14. "Dept Id": 222
  15. }
  16. ]
tcomlyy6

tcomlyy61#

  1. jq -Rs ' # 1. -Rs = read the input into a single string
  2. split("\n{2,}"; "") # 2. split on sequences of blank lines
  3. | map( # 3. transform each paragraph into an object
  4. split("\n")
  5. | map(scan("^([^:]+)(: (.*))?") | {key: first, value: last})
  6. | from_entries
  7. )
  8. ' data.file

产出

  1. [
  2. {
  3. "Cus Id": "1234",
  4. "Cus Name": "10:/John Parks",
  5. "Cus Type": "temporary",
  6. "Cus client": "tesla;toyota",
  7. "Dept Id": "111"
  8. },
  9. {
  10. "Cus Id": "1235",
  11. "Cus Name": "10:/William Parks",
  12. "Cus Type": "temporary",
  13. "Cus client": null,
  14. "Dept Id": "222"
  15. }
  16. ]
展开查看全部
xxb16uws

xxb16uws2#

Python解决方案:

  1. import json
  2. def try_to_convert_to_int(val):
  3. try:
  4. val = int(val)
  5. except:
  6. pass
  7. return val
  8. data, group = [], {}
  9. with open('your_file.txt', 'r') as f_in:
  10. for line in map(str.strip, f_in):
  11. if line == "":
  12. if group:
  13. data.append(group)
  14. group = {}
  15. else:
  16. k, v = map(str.strip, line.split(':', maxsplit=1))
  17. group[k] = try_to_convert_to_int(v) if v else None
  18. if group:
  19. data.append(group)
  20. print(json.dumps(data, indent=4))

图纸:

  1. [
  2. {
  3. "Cus Id": 1234,
  4. "Cus Name": "10:/John Parks",
  5. "Cus Type": "temporary",
  6. "Cus client": "tesla;toyota",
  7. "Dept Id": 111
  8. },
  9. {
  10. "Cus Id": 1235,
  11. "Cus Name": "10:/William Parks",
  12. "Cus Type": "temporary",
  13. "Cus client": null,
  14. "Dept Id": 222
  15. }
  16. ]
展开查看全部
zz2j4svz

zz2j4svz3#

您可以通过迭代地构建输出数组来创建原始文本流reduce(flag -Rinputs(flag -n)。从包含一个空对象([{}])的数组开始,然后使用正则表达式计算capture每行的内容,并用它填充当前最后一个数组项。如果捕获在结构上失败(测试键的存在),则添加另一个空对象。

  1. jq -Rn 'reduce (inputs | capture("(?<k>[^:]+):\\s*(?<v>.*)|")) as $in (
  2. [{}]; if $in.k then last[$in.k] = $in.v else . + [{}] end
  3. )'
  1. [
  2. {
  3. "Cus Id": "1234",
  4. "Cus Name": "10:/John Parks",
  5. "Cus Type": "temporary",
  6. "Cus client": "tesla;toyota",
  7. "Dept Id": "111"
  8. },
  9. {
  10. "Cus Id": "1235",
  11. "Cus Name": "10:/William Parks",
  12. "Cus Type": "temporary",
  13. "Cus client": "",
  14. "Dept Id": "222"
  15. }
  16. ]

Demo
更进一步:

  • 如果您希望以不同的方式处理特殊值,请在赋值之前相应地调整$in.v。例如,首先测试值是否像数字并将其转换为1,然后测试空字符串并将其替换为特殊字符串("null"),或者使用给定的字符串,您可以使用($in.v | tonumber? // (select(. == "") | "null") // .)
  • 如果您想以不同的方式对待特殊块,请在缩减后处理输出。例如,为了防止在输入块被多于一个空行分隔时生成空对象,您可以使用类似map(select(. != {}))的代码。
展开查看全部

相关问题