MongoDB aggregation (pipeline)

1. Learning how to use MongoDB aggregation

A plain find query makes it hard to reshape data into the form you want.
Working with big data calls for a different way of processing data.
With MongoDB aggregation, you can apply grouping, filtering, and many other operations to documents.
Basic concepts of MongoDB aggregation:
- Shards store the big data, and the Aggregation Framework processes it
- You can think of MongoDB aggregation as a framework for efficiently processing and aggregating data, including data spread across shards
- It applies operations such as grouping and filtering to documents and returns the computed result
  - Main MongoDB aggregation operators:
    - e.g. filtering, like operation, transforming
    - https://docs.mongodb.com/manual/meta/aggregation-quick-reference/#aggregation-expressions

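Aggregation Framework Pipeline (how MongoDB aggregation works)
- Processes data in the same way as a Unix pipe
- Documents pass through a multi-stage pipeline that processes and aggregates the data
- What is an aggregation pipeline? A way to shape data from the database into the form you want and receive it as output
- What is a pipeline? In computer science, a structure in which the output of one data-processing stage becomes the input of the next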
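
To make the "output of one stage feeds the next" idea concrete, here is a minimal in-memory sketch in plain JavaScript. It does not use MongoDB at all: `runPipeline`, `keepExpensive`, and `addTotal` are hypothetical names made up for this illustration, and each stage is assumed to be a function that takes an array of documents and returns a new array.

```javascript
// Each stage is a plain function: array of documents in, array of documents out.
// These names are hypothetical and are not part of any MongoDB API.
const keepExpensive = (docs) => docs.filter((d) => d.price >= 10);
const addTotal = (docs) =>
  docs.map((d) => ({ ...d, total: d.price * d.quantity }));

// Run the stages in order, feeding each stage's output into the next one.
function runPipeline(docs, stages) {
  return stages.reduce((current, stage) => stage(current), docs);
}

const inputDocs = [
  { item: "abc", price: 10, quantity: 2 },
  { item: "xyz", price: 5, quantity: 10 },
];

const piped = runPipeline(inputDocs, [keepExpensive, addTotal]);
console.log(piped); // [ { item: 'abc', price: 10, quantity: 2, total: 20 } ]
```

Swapping the order of the two stages would give the same result here, but in general the order matters, which is why a pipeline is an ordered list.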

1.1. Aggregation Framework Pipeline syntax

Structure of the aggregate function

db.collection.aggregate([{stage}, ...], options)

Image source: https://docs.mongodb.com/manual/aggregation/#aggregation-framework

1.2. Main Aggregation Framework Pipeline commands (easier to understand when compared with SQL)

Source: 잔재미코딩 blog (https://www.fun-coding.org/mongodb_advanced1.html)

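Stage

A pipeline can be seen as a sequence of stages.

The official documentation states:

"In the db.collection.aggregate method and db.aggregate method, pipeline stages appear in an array. Documents pass through the stages in sequence."

In short, every aggregate call is built as an ordered sequence of stages.

There are many stages, but the most basic ones are

{ $match: { <query> } }
// Selects only the documents that match the query; plays the same role as find

{ $group: { _id: <expression>, <field1>: { <accumulator1>: <expression1> }, ... } }
// Groups documents by the value given for _id; each field you define can hold a value computed (accumulated) over the documents in the group.

and so on; the rest are all described in the official documentation.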
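
As a rough illustration of what $match followed by $group does, here is an in-memory sketch in plain JavaScript. `matchStage` and `groupStage` are hypothetical stand-ins written for this example, not MongoDB APIs, and they are deliberately limited: the match supports only exact equality on top-level fields, and the group only sums one field per key. The sample documents are the sales data used later in this post.

```javascript
// Hypothetical stand-in for $match: exact equality on top-level fields only.
function matchStage(query) {
  return (docs) =>
    docs.filter((d) => Object.entries(query).every(([k, v]) => d[k] === v));
}

// Hypothetical stand-in for $group: group by one field and sum another,
// roughly { $group: { _id: "$<keyField>", total: { $sum: "$<sumField>" } } }.
function groupStage(keyField, sumField) {
  return (docs) => {
    const totals = new Map();
    for (const d of docs) {
      totals.set(d[keyField], (totals.get(d[keyField]) || 0) + d[sumField]);
    }
    return [...totals].map(([_id, total]) => ({ _id, total }));
  };
}

const sales = [
  { _id: 1, item: "abc", price: 10, quantity: 2 },
  { _id: 2, item: "jkl", price: 20, quantity: 1 },
  { _id: 3, item: "xyz", price: 5, quantity: 10 },
  { _id: 4, item: "xyz", price: 5, quantity: 20 },
  { _id: 5, item: "abc", price: 10, quantity: 10 },
];

// Stage 1 keeps documents with price 10; stage 2 sums quantity per item.
const matched = matchStage({ price: 10 })(sales);
const grouped = groupStage("item", "quantity")(matched);
console.log(grouped); // [ { _id: 'abc', total: 12 } ]
```

The real stages accept far richer queries and accumulators; the sketch only mirrors the filter-then-group shape.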

Stage reference (source: https://docs.mongodb.com/manual/reference/operator/aggregation-pipeline/)

Stage	Description
$addFields	Adds new fields to documents. Similar to $project, $addFields reshapes each document in the stream; specifically, by adding new fields to output documents that contain both the existing fields from the input documents and the newly added fields. $set is an alias for $addFields.
$bucket	Categorizes incoming documents into groups, called buckets, based on a specified expression and bucket boundaries.
$bucketAuto	Categorizes incoming documents into a specific number of groups, called buckets, based on a specified expression. Bucket boundaries are automatically determined in an attempt to evenly distribute the documents into the specified number of buckets.
$collStats	Returns statistics regarding a collection or view.
$count	Returns a count of the number of documents at this stage of the aggregation pipeline.
$facet	Processes multiple aggregation pipelines within a single stage on the same set of input documents. Enables the creation of multi-faceted aggregations capable of characterizing data across multiple dimensions, or facets, in a single stage.
$geoNear	Returns an ordered stream of documents based on the proximity to a geospatial point. Incorporates the functionality of $match, $sort, and $limit for geospatial data. The output documents include an additional distance field and can include a location identifier field.
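
To give a feel for one of the stages listed above, here is a small in-memory sketch of the bucketing idea behind $bucket, in plain JavaScript. `bucketStage` is a hypothetical helper written for this example, not a MongoDB API. It assumes each bucket includes its lower boundary and excludes its upper boundary, counts documents per bucket, and sends values outside all boundaries to a default bucket; it makes no attempt to reproduce the output ordering of the real stage.

```javascript
// Hypothetical stand-in for $bucket: count documents per [lower, upper) range.
// Values that fit no range go to the bucket named by defaultId.
function bucketStage(field, boundaries, defaultId) {
  return (docs) => {
    const counts = new Map();
    for (const d of docs) {
      let id = defaultId;
      for (let i = 0; i < boundaries.length - 1; i++) {
        if (d[field] >= boundaries[i] && d[field] < boundaries[i + 1]) {
          id = boundaries[i]; // a bucket is identified by its lower boundary
          break;
        }
      }
      counts.set(id, (counts.get(id) || 0) + 1);
    }
    return [...counts].map(([_id, count]) => ({ _id, count }));
  };
}

const priced = [
  { price: 10 }, { price: 20 }, { price: 5 }, { price: 5 }, { price: 10 },
];

const buckets = bucketStage("price", [0, 10, 20], "other")(priced);
// Look up each bucket by its _id rather than relying on array order.
const countOf = (id) => (buckets.find((b) => b._id === id) || { count: 0 }).count;
console.log(countOf(0), countOf(10), countOf("other")); // 2 2 1
```

Price 20 lands in "other" because the upper boundary of the last bucket is exclusive.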

Example problem

{accumulator}
$sum: sum
$avg: average
$first: first value
$last: last value
$max: maximum
$min: minimum
$push: collect into an array
$addToSet: collect into an array of unique values
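
The accumulators above can be approximated in plain JavaScript as functions over the list of values in one group. This is only a sketch for building intuition: the `accumulators` object is a hypothetical construct, not how MongoDB implements them, and it assumes a non-empty list of numbers. In MongoDB itself, $first and $last are only meaningful when the documents arrive in a defined order, and the element order of $addToSet is not guaranteed.

```javascript
// Hypothetical in-memory versions of the accumulators, keyed by operator name.
// Each takes the array of values collected for one group.
const accumulators = {
  $sum: (vs) => vs.reduce((a, b) => a + b, 0),
  $avg: (vs) => vs.reduce((a, b) => a + b, 0) / vs.length,
  $first: (vs) => vs[0],
  $last: (vs) => vs[vs.length - 1],
  $max: (vs) => Math.max(...vs),
  $min: (vs) => Math.min(...vs),
  $push: (vs) => [...vs],
  $addToSet: (vs) => [...new Set(vs)],
};

// The quantity values from the sample sales data in this post.
const quantities = [2, 1, 10, 20, 10];

console.log(accumulators.$sum(quantities));      // 43
console.log(accumulators.$avg(quantities));      // 8.6
console.log(accumulators.$first(quantities));    // 2
console.log(accumulators.$last(quantities));     // 10
console.log(accumulators.$max(quantities));      // 20
console.log(accumulators.$min(quantities));      // 1
console.log(accumulators.$push(quantities));     // [ 2, 1, 10, 20, 10 ]
console.log(accumulators.$addToSet(quantities)); // [ 2, 1, 10, 20 ]
```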

Sample data for the example

{"_id":1,"item":"abc","price":10,"quantity":2} 
{"_id":2,"item":"jkl","price":20,"quantity":1} 
{"_id":3,"item":"xyz","price":5,"quantity":10} 
{"_id":4,"item":"xyz","price":5,"quantity":20} 
{"_id":5,"item":"abc","price":10,"quantity":10}

db.sales.aggregate(
 [
  {
   $group: {
    // The _id value does not matter here; null puts every document into one group
    _id: null,
    // totalPrice is the sum of price * quantity over all documents
    totalPrice: { $sum: { $multiply: ["$price", "$quantity"] } },
    // averageQuantity is the average of quantity
    averageQuantity: { $avg: "$quantity" },
    // Adds 1 for each document, i.e. counts the documents
    count: { $sum: 1 }
   }
  }
 ]
)

Result

{"_id":null, "totalPrice":290,"averageQuantity":8.6,"count":5}
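
The numbers in this result can be checked by hand, or with a few lines of plain JavaScript over the same five documents. This sketch does not talk to MongoDB; it just repeats the arithmetic that the $group stage performs (the variable name `saleDocs` is an assumption for this example).

```javascript
// The five sample documents from this post.
const saleDocs = [
  { _id: 1, item: "abc", price: 10, quantity: 2 },
  { _id: 2, item: "jkl", price: 20, quantity: 1 },
  { _id: 3, item: "xyz", price: 5, quantity: 10 },
  { _id: 4, item: "xyz", price: 5, quantity: 20 },
  { _id: 5, item: "abc", price: 10, quantity: 10 },
];

// totalPrice: 10*2 + 20*1 + 5*10 + 5*20 + 10*10
const totalPrice = saleDocs.reduce((sum, d) => sum + d.price * d.quantity, 0);
// averageQuantity: (2 + 1 + 10 + 20 + 10) / 5
const averageQuantity =
  saleDocs.reduce((sum, d) => sum + d.quantity, 0) / saleDocs.length;
// count: one per document
const count = saleDocs.length;

console.log({ _id: null, totalPrice, averageQuantity, count });
// { _id: null, totalPrice: 290, averageQuantity: 8.6, count: 5 }
```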

jeongjunho94
