Introduction

In our previous blog post we saw how to update / delete a MongoDB array item. Now let’s look at how to upsert MongoDB array items (i.e. insert an item if it is not found in the array, otherwise update the existing item). Upserting into a nested MongoDB array is a two-step process; unfortunately there is no easy way to do it in a single step. As the screenshot below shows, the first step updates existing MongoDB array items and the second step inserts the missing records. So let’s get started.

Prerequisites

Before performing the steps listed in this article, make sure the following prerequisites are met:
  1. SSIS designer installed. It is sometimes referred to as BIDS or SSDT (download it from the Microsoft site).
  2. Basic knowledge of SSIS package development using Microsoft SQL Server Integration Services.
  3. ZappySys SSIS PowerPack installed (download it).

MongoDB Array Upsert – Update / Insert using Custom JOIN condition (such as $ne )

By default the SSIS MongoDB Destination performs lookups using the $eq condition (rows are matched with the Equal operator). However, there will be times when you want to look up and update target rows using custom join criteria (e.g. a Not Equal condition – $ne). Version 2.7.6 and later provide a property called EnableCustomLookupQuery. When you enable this setting, you can supply a document that contains both the Data and the Condition. You map this XML-formatted document to the __DOCUMENT__ input column (see below).

Here is the description of how to use a custom join with this property.

Enables use of a custom lookup query for the Update / Upsert or Delete operation. 
By default the JOIN condition used to match target records is generated automatically from the columns you supply in the ColumnsForLookup property. 
However, in some cases you need to supply a custom lookup condition to perform complex operations; in such cases enable this option. 
When you enable this option you must map the __DOCUMENT__ input column. The string you supply to this column should be in this format:

<command>
  <query>YOUR_LOOKUP_QUERY</query>
  <document>YOUR_JSON_DOC</document>
</command>

The lookup query in the <query> tag can be either in Mongo JSON format (e.g. { "CustomerID" : "AAA", "Orders.OrderID" : { "$ne" : "1000" } }) 
OR
you can use ZappySys SQL query (e.g. select * from mytable where CustomerID='AAA' and [Orders.$.OrderID] != '1000' )
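
For readers who generate this string outside of SSIS, the format described above can be sketched in Python. This is only an illustration of the <command> layout: the function name build_command is hypothetical, and wrapping the JSON in CDATA follows the Template Transform example later in this post rather than any documented requirement.

```python
import json


def build_command(query: dict, document: dict) -> str:
    """Return the <command> text that would be mapped to __DOCUMENT__.

    Hypothetical helper for illustration; it is not part of SSIS PowerPack.
    """
    # CDATA keeps characters such as < and & inside the JSON from breaking
    # the XML. (Assumes the JSON never contains the sequence "]]>".)
    return (
        "<command>"
        f"<query><![CDATA[{json.dumps(query)}]]></query>"
        f"<document><![CDATA[{json.dumps(document)}]]></document>"
        "</command>"
    )


# Lookup: customer AAA with no order 1000 yet; action: add that order.
cmd = build_command(
    {"CustomerID": "AAA", "Orders.OrderID": {"$ne": "1000"}},
    {"$addToSet": {"Orders": {"OrderID": "1000"}}},
)
print(cmd)
```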

Now let’s go through it step by step. In the example below we have a collection called CustomerTest. We will load it with 3 records (two with no orders and one with a single order). Then we will update array items using a custom lookup condition, and in a later step we will insert records into the array using a custom lookup condition.

SSIS MongoDB Array Upsert Example (Update / Insert Array Items based on custom lookup condition)

Update Array using Custom Lookup Condition
  1. Let’s create a few sample rows in a MongoDB collection. You can use the following command in the SSIS MongoDB Execute SQL Task
    {
     scope: 'database',
     db: 'Northwind',
     command: 'eval',
     args: 
     {
     code: 'db.CustomerTest.insert( [ {CustomerID:"AAA", Orders:[]}, {CustomerID:"BBB", Orders:[]}, {CustomerID:"CCC", Orders:[{OrderID:"1004","OrderDate" : "2008-02-05","Qty" : "5"} ]} ] )' 
     
     } 
    }
  2. Now drag a Data Flow Task onto the designer.
  3. Drag ZS CSV Source or any other source. For this example we will configure the CSV Source using the Direct Value option with the following sample data. On the first run of the package only the order for customer CCC is updated, because only that customer has an order right after the sample rows are created, so just one row matches. On the 2nd run (once the insert step described later has added the missing orders) all rows are updated.
    CustomerID,Company,OrderID,OrderDate,Qty,Address
    AAA,Anthony Inc,1000,2008-01-01,90,Po Box 111\4612
    AAA,Anthony Inc,1001,2010-02-01,91,Po Box 111 / 4612
    BBB,Bob Inc,1002,2008-02-01,92,Po Box 222 / 4612
    BBB,Bob Inc,1003,2010-01-01,93,Po Box 222 / 4612
    CCC,Cindy Inc,1004,2010-01-07,2,Po Box 555 / 2345
     
  4. Click OK to save Source UI
  5. Now drag ZS Template Transform to build XML document (Data + Query). Connect Source to Template transform. You can also use ZS XML Generator Transform but for simplicity we will use Template Transform.
  6. Double click to open the Template Transform and enter the text below. Notice that the MongoDB query and the data are entered in two separate XML nodes. The first node is the custom query for the lookup. The second node is the operation we want to perform ($set in our case, to update existing data). You can use Insert Variable and then select the Columns option to insert placeholders. In some fields we also used the JSONENCODE function to make sure double quotes and slashes are escaped correctly.
    <command>
    	<query><![CDATA[
    {
    	"CustomerID": "<%CustomerID%>", 
    	"Orders.OrderID": { "$eq" : "<%OrderID%>"}
    }
    ]]></query>
    	<document><![CDATA[
    {
      $set :
       { "CompanyName":"<%Company,JSONENCODE%>", 
         "Address":"<%Address,JSONENCODE%>", 
         "Orders.$.OrderID": "<%OrderID%>", 
         "Orders.$.OrderDate": "<%OrderDate%>", 
         "Orders.$.Qty": "<%Qty%>" 
       } 
    }
    ]]></document>
    </command>

    Using ZS Template Transform to specify custom JOIN condition for MongoDB Update (Array Items)

  7. Click OK to save Template Transform
  8. Drag ZS MongoDB Destination. Connect Template Transform to Destination
  9. Double click MongoDB Destination to edit. Set Connection, Table name, Operation=Update, EnableCustomLookupQuery=True

    Configure MongoDB Destination for Custom Join Condition (EnableCustomLookupQuery Setting)

  10. Go to the Mappings tab and map TemplateOutput to __DOCUMENT__

    MongoDB Mappings – Loading Document for Array Update (Custom Join Criteria)

  11. Click OK to Save
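
To see why the template in step 6 applies JSONENCODE to the Company and Address columns, here is a small Python sketch of the same kind of escaping. It uses json.dumps as a stand-in; the assumption that JSONENCODE behaves comparably for quotes and backslashes is ours, and the company value with quotes is made up for the example.

```python
import json


def json_encode(value: str) -> str:
    """Escape a string for use inside a JSON string literal.

    Stand-in for the template's JSONENCODE function (assumed behavior).
    The outer quotes added by json.dumps are stripped because the template
    already supplies them.
    """
    return json.dumps(value)[1:-1]


address = "Po Box 111\\4612"    # raw text: Po Box 111\4612 (first CSV row)
company = 'Anthony "Tony" Inc'  # hypothetical value containing quotes

fragment = '{"Address": "%s", "CompanyName": "%s"}' % (
    json_encode(address),
    json_encode(company),
)

# Without escaping, the backslash and the inner quotes would make the
# document invalid JSON; with escaping it round-trips cleanly.
parsed = json.loads(fragment)
print(parsed["Address"])
```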
Insert into Array using Custom Lookup Condition

Now let’s look at slightly modified steps to insert new records into the array. In this example we will insert a source record when the CustomerID is found but the OrderID is not found (see the Template Transform step). We will use the Not Equal condition ($ne) this time.

  1. Drag another Data Flow Task and rename it to something like [Add records to array]
  2. Drag ZS CSV Source or any other source. For example purpose we will configure CSV Source using Direct Value option with following Sample data
    CustomerID,Company,OrderID,OrderDate,Qty
    AAA,Anthony Inc,1000,2008-01-01,30
    AAA,Anthony Inc,1001,2010-02-01,4
    BBB,Bob Inc,1002,2008-02-01,30
    BBB,Bob Inc,1003,2010-01-01,4
     
  3. Click OK to save Source UI
  4. Now drag ZS Template Transform to build XML document (Data + Query). Connect Source to Template transform. You can also use ZS XML Generator Transform but for simplicity we will use Template Transform.
  5. Double click to open the Template Transform and enter the text below. Notice that the MongoDB query and the data are entered in two separate XML nodes. The first node is the custom query for the lookup. The second node is the operation we want to perform ($addToSet in our case, to insert into the array)
    <command>
    <query>
    {"CustomerID": "<%CustomerID%>", "Orders.OrderID":{ "$ne" : "<%OrderID%>"}}
    </query>
    <document>{
      $addToSet :
       { Orders : {"OrderID": "<%OrderID%>", "OrderDate": "<%OrderDate%>", "Qty": "<%Qty%>" } }    
     }
    </document>
    </command>
  6. Click OK to save Template Transform
  7. Drag ZS MongoDB Destination. Connect Template Transform to Destination
  8. Double click MongoDB Destination to edit. Set Connection, Table name, Operation=Update, EnableCustomLookupQuery=True
  9. Goto Mappings Tab and attach TemplateOutput to __DOCUMENT__
  10. Click OK to Save
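
The two data flows above can be summarized with a short in-memory Python sketch. It does not talk to MongoDB; it only mimics the effect of the two lookups ($eq with the positional $ operator, then $ne with $addToSet) on plain dictionaries, so the function names and the simplified matching are illustrative assumptions.

```python
def update_existing(doc: dict, order: dict) -> bool:
    """Step 1: mimic $set on Orders.$ when an item with this OrderID exists."""
    for item in doc["Orders"]:
        if item["OrderID"] == order["OrderID"]:
            item.update(order)  # positional $ touches the first match only
            return True
    return False


def insert_missing(doc: dict, order: dict) -> bool:
    """Step 2: mimic the $ne lookup plus $addToSet (append only if absent)."""
    if all(item["OrderID"] != order["OrderID"] for item in doc["Orders"]):
        doc["Orders"].append(dict(order))
        return True
    return False


customers = {
    "AAA": {"CustomerID": "AAA", "Orders": []},
    "CCC": {"CustomerID": "CCC", "Orders": [
        {"OrderID": "1004", "OrderDate": "2008-02-05", "Qty": "5"}]},
}
rows = [
    ("AAA", {"OrderID": "1000", "OrderDate": "2008-01-01", "Qty": "90"}),
    ("CCC", {"OrderID": "1004", "OrderDate": "2010-01-07", "Qty": "2"}),
]

for customer_id, order in rows:   # first data flow: update existing items
    update_existing(customers[customer_id], order)
for customer_id, order in rows:   # second data flow: insert missing items
    insert_missing(customers[customer_id], order)
```

In this sketch the existing order 1004 for CCC is updated in place, while order 1000 for AAA is appended because the first step found nothing to update.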

Download Sample SSIS Package (2012 format) – MongoDB_Upsert_ArrayItem

Conclusion

MongoDB integration can be challenging if you are new to the NoSQL world. If SSIS is your primary ETL tool there is no need to worry: SSIS PowerPack provides drag-and-drop, high-performance connectors that can help you complete your project on time.

Keywords: ssis mongodb upsert array item | ssis mongodb update array item | ssis mongodb update array elements | mongodb update array documents | MongoDB $set operator | MongoDB $addToSet operator | MongoDB update sub document items | MongoDB CRUD operations | MongoDB Bulk Updates | MongoDB bulk updates to array items

The post Update or Insert – Upsert MongoDB Array Items using SSIS appeared first on ZappySys Blog.

Introduction

In our previous post we explained how to read/load MongoDB data in SSIS. This post covers specifically how to parse date/time values stored inside your MongoDB documents. By default any well-known date format (e.g. ISO date) is parsed as a valid datetime (e.g. DT_DBTIMESTAMP) when you use the SSIS MongoDB Source. But if your date is stored in another format (e.g. MM-dd-yyyy), the system will not parse it as a date (it stays a string) unless you specify a custom date format on the JSON Datetime tab. If you want to learn how to query dates in MongoDB then read this article.

Parse MongoDB Date time using SSIS MongoDB Source

Perform the following steps to enable custom date format parsing for dates not stored as ISODate in MongoDB. The steps below assume you have downloaded and installed SSIS PowerPack.

NOTE: The steps below are not necessary if your date is stored as ISODate in MongoDB, e.g. { "OrderDate" : ISODate("2015-12-31T00:00:00Z") }

  1. Open the MongoDB Source. Change Mode to Query Mode
  2. Check the Enable JSON option. You don’t have to type a filter (it’s optional)
  3. Go to JSON Options and then the Datetime options tab
  4. Enter the custom date format (the way your date is stored in MongoDB)

Now go to the Columns tab and click Refresh Columns. The column will be detected as DT_DBTIMESTAMP (i.e. date/time)
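
As a rough analogy for what the custom date format does, the Python sketch below parses a date stored as a MM-dd-yyyy string. The mapping of MM-dd-yyyy to the Python pattern %m-%d-%Y is our assumption for illustration; the SSIS component uses its own format syntax.

```python
from datetime import datetime

# Assumed Python equivalent of the custom format MM-dd-yyyy.
CUSTOM_FORMAT = "%m-%d-%Y"


def parse_custom_date(value: str, fmt: str = CUSTOM_FORMAT) -> datetime:
    """Turn a date stored as a plain string into a real datetime value."""
    return datetime.strptime(value, fmt)


order_date = parse_custom_date("12-31-2015")
print(order_date.isoformat())
```

A value that does not match the format raises ValueError here, which is a reminder that the format has to describe the stored strings exactly.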

SSIS MongoDB Source- Date Time Options ( Parse MongoDB Date)

SSIS MongoDB Source – Query and Parse MongoDB Date with Custom format

The post How to parse MongoDB Date time in SSIS appeared first on ZappySys Blog.

Introduction

MongoDB is one of the most popular NoSQL databases out there. In this article you will learn how to use the ZappySys MongoDB Destination Connector to perform various bulk operations on MongoDB such as Insert, Update, Delete and Upsert. If you wish to learn how to update MongoDB array items then check this blog post instead.

MongoDB data read/write Example SSIS Package

Before we talk more about loading data into MongoDB, let’s look at the screenshot and the steps involved in our sample package for this tutorial. The sample SSIS package below uses 4 custom SSIS tasks

  1. SSIS MongoDB Source Connector (Used to read data from MongoDB Collection)
  2. SSIS MongoDB Destination Connector (Used to write data into MongoDB Collection also you can do other bulk operations such as MongoDB update, MongoDB upsert and MongoDB delete)
  3. SSIS MongoDB ExecuteSQL Task (Used to call any MongoDB Shell commands including MongoDB DDL or DML type operations such as CREATE/DROP, Insert, Update, Delete, Read, you can also call server side JavaScript)
  4. Dummy Data Source (Used to generate sample JSON Documents which can be loaded in MongoDB)
MongoDB Examples

Here is the list of various MongoDB examples included in attached sample package (see at the end for download links)

  • MongoDB example – How to read data from MongoDB collection using SQL query language
  • MongoDB example – How to load raw JSON document into MongoDB Collection (Bulk Insert with Reload)
  • MongoDB example – How to do MongoDB Upsert (Bulk Update or Insert) for specific fields or entire document
  • MongoDB example – How to perform MongoDB Delete ( Bulk Delete based on Matching Rows)
  • MongoDB example – How to create new MongoDB collection (Only if not exist – Use safe option)
  • MongoDB example – How to fetch MongoDB collection record count and save to SSIS variable
  • MongoDB example – How to fetch MongoDB collection size and save to SSIS variable
  • MongoDB example – How to get collection names and loop through them using ForEach Loop Task
  • MongoDB example – How to save query output into Recordset variable and loop through records using ForEach Loop Task
Video Tutorial – Update/Delete/Write/Bulk Load data into MongoDB from SQL Server/ORACLE

In the video tutorial below you will see how easy it is to load data into MongoDB from any data source such as SQL Server, Oracle, MySQL, flat files etc. You will also learn how to produce nested JSON from multiple tables using the JSON Generator Transform. You can also use the JSON Source with Output as Document mode to load raw JSON data.

SSIS MongoDB Destination - Generate and load JSON Documents into MongoDB Collection - YouTube
Video Tutorial – Read data from MongoDB

In the following video tutorial you will learn how to consume data from a MongoDB collection using the SSIS MongoDB Source Connector. You will notice that it uses SQL query syntax rather than MongoDB’s native query syntax (JSON query), which makes it easy to query data without a learning curve. Data stored in MongoDB is in JSON document format, but data coming out of the MongoDB Source Connector is a flat table structure (it de-normalizes nested nodes). You can also query an inner hierarchy using a JSON Path expression (e.g. query Orders from a Customer document using $.Customer.Orders[*] ).
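
To make the idea of de-normalizing nested nodes concrete, here is a minimal Python sketch that flattens the Orders array of one customer document into flat rows, roughly what selecting $.Customer.Orders[*] implies. The sample document and the Orders.* column naming are assumptions for illustration, not the connector's actual output schema.

```python
def flatten_orders(document: dict):
    """Yield one flat row per order, repeating the parent customer fields."""
    customer = document["Customer"]
    parent = {k: v for k, v in customer.items() if k != "Orders"}
    for order in customer.get("Orders", []):
        # Prefix child fields so they cannot collide with parent fields.
        yield {**parent, **{f"Orders.{k}": v for k, v in order.items()}}


# Hypothetical document shaped like the JSON Path in the text.
doc = {"Customer": {"CustomerID": "AAA",
                    "Orders": [{"OrderID": "1000"}, {"OrderID": "1001"}]}}
rows = list(flatten_orders(doc))
for row in rows:
    print(row)
```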

SSIS MongoDB Source - Use SQL Like Query Language for reading MongoDB Collection - YouTube

SSIS Example – Loading data into MongoDB, Read from MongoDB, Update, Upsert, Delete, Insert JSON Documents, Execute Shell Commands

MongoDB Insert – Bulk Loading data into MongoDB

Using MongoDB Destination Connector you can easily Insert JSON documents in Bulk. There are two input modes for inserting records into MongoDB.

Simple Mode – Loading data in simple mode (array not allowed)

In simple loading mode you can map source columns to target columns in the MongoDB destination. If you insert data using this mode then you can’t load complex documents that contain arrays.

Document Mode – Loading JSON documents into MongoDB

In this mode you can insert/update/delete documents by mapping the built-in __DOCUMENT__ input column, which appears on the Mappings tab (Target columns). When you perform an Insert you have two options: Insert (i.e. Append) and Reload (i.e. Truncate + Insert). When you select Operation=Reload, the collection is first truncated and then the new records are loaded.

In JSON document load mode, if you don’t supply an _id column as part of your input JSON then a new _id is automatically generated by MongoDB for each newly inserted document.

Loading JSON Documents into MongoDB Collection (Map Raw Document Column)

Loading JSON files into MongoDB

Another scenario is loading JSON files into MongoDB. You can load JSON files into MongoDB in two ways.

  • Use a Flat File Source with just one column of DT_NTEXT datatype (use this method when each JSON document is on its own line and documents are separated by a new line, as in the screenshot below)
  • Use the JSON Source Connector with the Output as Document option checked. You can map the __DOCUMENT__ source column to the target MongoDB collection as below. The advantage of this method is that you can extract JSON from an array too and specify many other advanced options.

    SSIS Extract JSON Documents or Sub Documents from JSON file or JSON Array
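
The one-document-per-line layout mentioned in the first option can be illustrated with a short Python sketch. It only shows how such a file is read as a stream of documents; the function name and the sample data are hypothetical, and in SSIS the Flat File Source does this work for you.

```python
import io
import json


def read_json_lines(stream):
    """Yield one parsed document per non-empty line of the stream."""
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between documents
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(
                f"line {line_number} is not valid JSON: {exc}") from exc


# io.StringIO stands in for an open file in this sketch.
sample = io.StringIO('{"CustomerID": "AAA"}\n\n{"CustomerID": "BBB"}\n')
docs = list(read_json_lines(sample))
print(len(docs))
```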

Specifying LoadOptions

MongoDB Destination Connector supports following LoadOptions property (see Component Properties Tab on UI) which controls how target data gets modified. LoadOptions property is ignored for Reload and Insert operations.

  • op : This option specifies the update operator (valid for the Update and Upsert (i.e. UpdateOrInsert) operations). The most common operators are listed below. Refer to the official MongoDB help for more information on the update operators available in MongoDB.
    • op=none : Replace the entire document
    • op=$set  : Update only those fields specified in the mapping. If a mapped field is not available in the target then it is added to the target document.
    • op=$push : Add a new item to the destination array. See the following examples (how to insert single or multiple documents into a MongoDB array by supplying JSON to the __DOCUMENT__ column)
      /* Insert single item into array - Construct input JSON as below */
      {YourArrayField : 111 }
      
      /* Insert multiple items into array - Construct document as below */
      {YourArrayField : { $each: [ 111, 222, 333 ] } }
      
      /* Insert multiple documents into array */
      {YourArrayField : { $each: [ {OrderID:1, Total:20.00}, {OrderID:2, Total:12.00} ] } }
  • multi : This option controls whether you want to update or delete multiple records matching the condition for each upstream input record. For example, if you are joining by the Country and City columns for an update and the target system finds 5 records, they will all be updated if multi=true is set in the LoadOptions property. If you don’t want to update multiple records, set multi=false.

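The difference between the three op values can be sketched in Python on plain dictionaries. This is an in-memory approximation of the MongoDB behavior described above, not the connector's code: apply_op is a hypothetical name, and dotted field paths are deliberately not handled.

```python
import copy


def apply_op(target: dict, op: str, payload: dict) -> dict:
    """Return a copy of target after applying op=none, $set or $push."""
    result = copy.deepcopy(target)
    if op == "none":
        # Replace the whole document; MongoDB keeps the existing _id.
        replaced = copy.deepcopy(payload)
        if "_id" in result:
            replaced["_id"] = result["_id"]
        return replaced
    if op == "$set":
        # Only the supplied fields change; missing fields are added.
        result.update(copy.deepcopy(payload))
        return result
    if op == "$push":
        for field, value in payload.items():
            has_each = isinstance(value, dict) and "$each" in value
            items = value["$each"] if has_each else [value]
            result.setdefault(field, []).extend(copy.deepcopy(items))
        return result
    raise ValueError(f"unsupported op: {op}")


doc = {"_id": 1, "Name": "A", "Tags": [1]}
print(apply_op(doc, "$set", {"City": "X"}))
print(apply_op(doc, "none", {"City": "X"}))
print(apply_op(doc, "$push", {"Tags": {"$each": [2, 3]}}))
```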
MongoDB Update

The MongoDB Destination Connector supports Batch Update and Batch Upsert operations. There are a few things to remember about how the Update operation works with the MongoDB Destination Connector.

  • To perform an Update operation you have to specify the join criteria for record lookup on the target (see the ColumnsForLookup property).
  • Specify JOIN columns in the ColumnsForLookup property. Use a comma to specify multiple columns (e.g. CustomerID,OrderID). If you don’t specify columns in this property then any mapped column other than __DOCUMENT__ will be considered a JOIN column.
  • With the MongoDB destination you can only specify a simple Update condition by mapping JOIN input columns. Behind the scenes it constructs the search query for the update. For example, if you map 3 input columns (__DOCUMENT__ , Country, State) and you specify LoadOptions : op=$set;multi=true then it’s similar to the SQL statement below.
    UPDATE MyCollection
    SET [properties specified in __DOCUMENT__]
    WHERE Country=[input value] AND State=[input value]
  • Join columns can be individually mapped or can be supplied as part of JSON document mapped to __DOCUMENT__ column.
  • If __DOCUMENT__ column is not mapped then you must specify at least one column name in ColumnsForLookup property to identify search column.
  • If the __DOCUMENT__ column is mapped and no column name is specified in the ColumnsForLookup property then you must map at least one more input column which can be treated as a JOIN column

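The lookup rules in the list above can be sketched as a small Python function that derives the search filter from an input row. It is a simplified reading of those rules (the name build_lookup_filter is hypothetical), and it does not cover the case where join columns are supplied inside the JSON mapped to __DOCUMENT__.

```python
def build_lookup_filter(row: dict, columns_for_lookup: str = "") -> dict:
    """Build an equality filter for the target lookup from one input row."""
    if columns_for_lookup.strip():
        # Explicit comma-separated list, e.g. "CustomerID,OrderID".
        names = [name.strip() for name in columns_for_lookup.split(",")]
    else:
        # Otherwise every mapped column except __DOCUMENT__ is a JOIN column.
        names = [name for name in row if name != "__DOCUMENT__"]
    if not names:
        raise ValueError("at least one JOIN column is required")
    return {name: row[name] for name in names}


row = {"__DOCUMENT__": '{"Lat": 33.7}', "Country": "USA", "State": "GA"}
lookup = build_lookup_filter(row)
print(lookup)
```

The resulting filter plays the role of the WHERE clause in the SQL analogy above.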
Update specific fields  ($set operator)

To update specific fields in the target MongoDB collection you can map the JOIN column(s) and then map the fields you want to update, or map __DOCUMENT__ if you are supplying a JSON document with the fields you want to update.

The screenshot below shows Customers records being updated with new coordinates based on City and Country information.

SSIS MongoDB Update Bulk – $set update operator – Update specific fields – Update single or multiple documents

Update single or multiple target records for single input row

If you want to update all matching records in target then set LoadOptions property to op=$set,multi=true
If you want to update only one record per matching criteria in target then set LoadOptions property to op=$set,multi=false
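
The effect of the multi flag can be sketched in Python on an in-memory list. This only mirrors the described behavior (one match versus all matches); set_fields is a hypothetical name and the sample customers are made up.

```python
def set_fields(collection: list, lookup: dict, fields: dict, multi: bool) -> int:
    """Apply a $set-style update to matching documents; return the count."""
    updated = 0
    for doc in collection:
        if all(doc.get(key) == value for key, value in lookup.items()):
            doc.update(fields)
            updated += 1
            if not multi:
                break  # multi=false stops after the first match
    return updated


customers = [
    {"City": "Atlanta", "Country": "USA"},
    {"City": "Atlanta", "Country": "USA"},
    {"City": "Paris", "Country": "France"},
]
lookup = {"City": "Atlanta", "Country": "USA"}

first_only = set_fields(customers, lookup, {"Lat": 33.7}, multi=False)
all_matches = set_fields(customers, lookup, {"Lon": -84.4}, multi=True)
print(first_only, all_matches)
```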

Update entire document (i.e. Overwrite/Replace documents)

If you use the $set operator in the LoadOptions property then only the fields specified in the mappings are updated. If you want to overwrite the entire document (replace it), set the LoadOptions property to op=none,multi=true, or to op=none,multi=false if you want to update only one record per matching condition.

Insert single item into MongoDB Array using $push operator

If you want to insert new items into an array, set the LoadOptions property to op=$push,multi=true, or to op=$push,multi=false if you want to update only one record per matching condition. You have to map the __DOCUMENT__ column on the target Mappings tab to supply the item(s) or document(s) you want to insert into the target array.

/* Insert single number or string into MongoDB array - Construct input JSON as below */
{YourArrayField : 111 }

/* Insert single document into MongoDB array - Construct input JSON as below */
{YourArrayField : {OrderID:1, Total:20.00} }

Insert multiple items into MongoDB Array using $push operator along with $each

Inserting multiple items into an array requires the $each modifier. Follow the instructions above, except that the input document (i.e. the data passed to the __DOCUMENT__ column) will look like the examples below (note how $each is used).

/* Insert multiple items into MongoDB array - Construct document as below */
{YourArrayField : { $each: [ 111, 222, 333 ] } }

/* Insert multiple documents into MongoDB array */
{YourArrayField : { $each: [ {OrderID:1, Total:20.00}, {OrderID:2, Total:12.00} ] } }

Other MongoDB Update Operators

So far we have seen the $set and $push update operators. To learn about other operators, refer to the official MongoDB documentation on update operators.

MongoDB Update Array Items ($pull, $push, $addToSet)

If you wish to update items found inside a nested array of a document then check the full-length article on that topic.

MongoDB Upsert – Bulk Update or Insert JSON documents

If you wish to perform a Bulk Upsert using the MongoDB Destination connector then select the Upsert action from the dropdown (Action=UpdateOrInsert). In an Upsert, if no document is found for the matching criteria then a new document is inserted into the MongoDB collection. If a document is found then an Update operation occurs. To set specific fields use op=$set in LoadOptions, and if you wish to overwrite the entire document then use op=none in LoadOptions.
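
The update-or-insert decision described above can be sketched in a few lines of Python. It models only the op=$set case on an in-memory list, with the inserted document built from the lookup values plus the new fields (which is how MongoDB behaves for an equality filter with $set); the function name upsert is illustrative.

```python
def upsert(collection: list, lookup: dict, fields: dict) -> str:
    """Update the first matching document, or insert a new one if none match."""
    for doc in collection:
        if all(doc.get(key) == value for key, value in lookup.items()):
            doc.update(fields)  # document found: behave like op=$set
            return "updated"
    # No match: the new document gets the lookup values and the new fields.
    collection.append({**lookup, **fields})
    return "inserted"


customers = [{"CustomerID": "AAA", "City": "Atlanta"}]
first = upsert(customers, {"CustomerID": "AAA"}, {"City": "Boston"})
second = upsert(customers, {"CustomerID": "BBB"}, {"City": "Chicago"})
print(first, second, customers)
```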
