MongoDB aggregate: group by date and count values in subdocuments

zaq34kh6  posted on 2023-06-05  in  Go

I know there are other versions of this question, but I'm having a hard time adapting them to my case. I have documents like these:

[
{_id: 1, date: "2022-04-08T23:30:12.000Z", books: [{author: "Johnson", title: "First Title"}, {author: "Smith", title: "Second Title"}]},
{_id: 2, date: "2022-04-22T23:30:12.000Z", books: [{author: "Johnson", title: "Some Other Title"}]},
{_id: 3, date: "2022-05-05T23:30:12.000Z", books: [{author: "Smith", title: "Title Round 2"}]},
{_id: 4, date: "2022-05-15T23:30:12.000Z", books: [{author: "Johnson", title: "Found a Title"}, {author: "Smith", title: "Wrote again"}, {author: "Brooks", title: "New Title"}]}
]

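I'm trying to group the documents by month-year and then count how many times each distinct value appears in the author field. So far I have a pipeline that looks like this: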

[
  {
    "$unwind": "$books"
  },
  {
    $project: {
      _id: 1,
      books: 1,
      month: {
        "$month": "$date"
      },
      year: {
        "$year": "$date"
      }
    }
  },
  {
    $project: {
      _id: 1,
      books: 1,
      date: {
        $concat: [
          {
            $substr: ["$year", 0, 4]
          },
          "-",
          {
            $substr: ["$month", 0, 2]
          }
        ]
      }
    }
  },
  {
    $group: {
      _id: {
        date: "$date",
        books: {
          freq: {
            $sum: 1
          }
        }
      }
    }
  },
  {
    $project: {
      "_id": 1,
      "date": 1,
      "books": 1
    }
  }
]

My goal is a final output that looks like this:

[
{date: "2022-04", authors: { "Johnson": 2, "Smith": 1}},
{date: "2022-05", authors: {"Johnson": 1, "Smith": 2, "Brooks": 1}}
]

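I've seen ways to count subdocuments, but when I try to implement them I either lose the date of my group or just get errors. I've seen enough to know it's doable; I just can't get it quite right. Any help is appreciated.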

sr4lhrrt  #1

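The first 3 stages can stay the same (assuming the dates are stored as Date objects rather than strings). After that:

  1. Group by both the $date and $books.author fields and count the occurrences in each group. This gives you the counts you need in the final answer.
  2. Then group by $date only, pushing each count into an authors array as a key-value pair of the form {k: key, v: value}, so that it can be converted to an object in the next stage.
  3. Apply $arrayToObject to the authors array to convert it to an object.

If you also want the results sorted by date, add a { $sort: { date: 1 } } stage.
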
db.collection.aggregate([
  { $unwind: "$books" },
  { $project: { _id: 1, books: 1, month: { "$month": "$date" }, year: { "$year": "$date" } } },
  { $project: { _id: 1, books: 1, date: { $concat: [ { $substr: [ "$year", 0, 4 ] }, "-", { $substr: [ "$month", 0, 2 ] } ] } } },
  {
    $group: {
      _id: { date: "$date", author: "$books.author" },
      count: { $sum: 1 }
    }
  },
  {
    $group: {
      _id: "$_id.date",
      authors: { $push: { k: "$_id.author", v: "$count" } }
    }
  },
  {
    $project: {
      _id: 0,
      date: "$_id",
      authors: { $arrayToObject: "$authors" }
    }
  }
])
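
To make the two grouping steps easier to follow, here is a rough in-memory trace of the same idea in plain JavaScript (runnable in Node, no MongoDB needed). It is only a sketch: the variable names are made up for illustration, the sample dates are assumed to be real Date objects, and the month key is built with toISOString, so it is zero-padded ("2022-04"), which may differ from what the $substr-based key produces.

```javascript
// Sample documents from the question, with dates as Date objects (assumption).
const docs = [
  { _id: 1, date: new Date("2022-04-08T23:30:12.000Z"), books: [{ author: "Johnson", title: "First Title" }, { author: "Smith", title: "Second Title" }] },
  { _id: 2, date: new Date("2022-04-22T23:30:12.000Z"), books: [{ author: "Johnson", title: "Some Other Title" }] },
  { _id: 3, date: new Date("2022-05-05T23:30:12.000Z"), books: [{ author: "Smith", title: "Title Round 2" }] },
  { _id: 4, date: new Date("2022-05-15T23:30:12.000Z"), books: [{ author: "Johnson", title: "Found a Title" }, { author: "Smith", title: "Wrote again" }, { author: "Brooks", title: "New Title" }] },
];

// Like $unwind plus the two $project stages: one row per book with a "YYYY-MM" key (UTC).
const rows = docs.flatMap((doc) =>
  doc.books.map((book) => ({
    date: doc.date.toISOString().slice(0, 7),
    author: book.author,
  }))
);

// Like the first $group: one counter per (date, author) pair.
const pairCounts = new Map();
for (const { date, author } of rows) {
  const key = JSON.stringify([date, author]);
  pairCounts.set(key, (pairCounts.get(key) || 0) + 1);
}

// Like the second $group: collect { k, v } pairs per date.
const byDate = new Map();
for (const [key, count] of pairCounts) {
  const [date, author] = JSON.parse(key);
  if (!byDate.has(date)) byDate.set(date, []);
  byDate.get(date).push({ k: author, v: count });
}

// Like the final $project with $arrayToObject: turn the { k, v } pairs into one object.
const result = [...byDate].map(([date, authors]) => ({
  date,
  authors: Object.fromEntries(authors.map(({ k, v }) => [k, v])),
}));
result.sort((a, b) => a.date.localeCompare(b.date)); // optional, like { $sort: { date: 1 } }

console.log(JSON.stringify(result, null, 2));
```

The point of counting per (date, author) pair first is that the date survives as part of the group key, so the second grouping can still use it.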

Playground: you can view the intermediate results from the Stage dropdown at the top.
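
A possible variant, in case the dates are stored as ISO strings (as in the sample documents) or a zero-padded month such as "2022-04" is wanted: build the key with $dateToString inside the first $group instead of the two $project stages. This is a sketch rather than a tested drop-in. It assumes MongoDB 4.0 or later for $toDate, and the name `pipeline` is just an illustrative variable.

```javascript
// Sketch only: assumes MongoDB 4.0+ and a `date` field holding a Date or an ISO-8601 string.
const pipeline = [
  { $unwind: "$books" },
  {
    $group: {
      _id: {
        // $toDate passes Date values through and parses ISO-8601 strings;
        // "%Y-%m" yields a four-digit year and a two-digit month (UTC by default).
        date: { $dateToString: { format: "%Y-%m", date: { $toDate: "$date" } } },
        author: "$books.author",
      },
      count: { $sum: 1 },
    },
  },
  {
    $group: {
      _id: "$_id.date",
      authors: { $push: { k: "$_id.author", v: "$count" } },
    },
  },
  {
    $project: {
      _id: 0,
      date: "$_id",
      authors: { $arrayToObject: "$authors" },
    },
  },
  { $sort: { date: 1 } },
];

// In mongosh this would be run as: db.collection.aggregate(pipeline)
```

If the dates are already Date objects, the $toDate wrapper can be dropped and "$date" passed to $dateToString directly.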
