You’re going to need the following:
- An ObjectRocket MongoDB instance.
- An ObjectRocket Elasticsearch (2.x) instance.
- Both instances in the same zone.
- Existing data within your MongoDB instance.
Select your source MongoDB collection and the target Elasticsearch index
This part is straightforward, with three easy steps.
- Give your Data Connector a name.
- Select the MongoDB instance, database, and collection to read from.
- Select the Elasticsearch instance and either create a new index or select an existing one.
Once you click “Add Connector”, your schema analysis request is queued and you should expect your results back within a few moments. The workflow state is persisted, so you can close your browser or walk away between steps and resume at any time.
Choose the fields to index
This is where we’ve tried to make the process very easy and pain-free for you. There are no APIs to learn or script against, and no need to learn the latest pipeline tool’s syntax. We’ve abstracted that away.
By default, the schema analysis checks the most recent 1000 documents within your MongoDB collection. In the coming months we will introduce a number of options for the depth and breadth of the schema analysis, as well as the ability to schedule the scan for off-peak hours.
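To make the idea of a schema analysis concrete, here is a minimal sketch in Python. It is not the platform's implementation: the function name `analyze_schema`, the sample documents, and the choice to look only at top-level fields are all assumptions made for illustration. It counts, for each field, which value types were seen, how often the field occurred, and in what percentage of the sampled documents.

```python
from collections import Counter, defaultdict


def analyze_schema(documents):
    """Summarize top-level fields across a sample of documents.

    Hypothetical helper for illustration only; nested fields are not expanded.
    """
    docs = list(documents)
    type_counts = defaultdict(Counter)

    # Tally the Python type name of every value, per field.
    for doc in docs:
        for field, value in doc.items():
            type_counts[field][type(value).__name__] += 1

    summary = {}
    for field, counts in type_counts.items():
        occurrences = sum(counts.values())
        summary[field] = {
            "types": dict(counts),
            "occurrences": occurrences,
            "percent": round(100.0 * occurrences / len(docs), 1),
        }
    return summary


# Made-up sample standing in for documents read from a collection.
sample = [
    {"_id": 1, "name": "alice", "age": 30},
    {"_id": 2, "name": "bob", "age": "31"},
    {"_id": 3, "name": "carol"},
]

report = analyze_schema(sample)
for field, stats in report.items():
    print(field, stats)
```

In this sample the `age` field appears in two of three documents, once as an integer and once as a string, which is the kind of mixed-type field the analysis is meant to surface before you pick a mapping.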
So what are we seeing on the results screen? This is a simple schema with no variety in the data types across the 1000 documents examined. Let’s see what these various fields mean.
- Types - the data types found within the field. The number next to each type tells you how many times that type was detected. For example, a field with [String (100), Int (900)] tells us that there were 100 documents with strings and 900 documents with integers.
- Occurrences - how many times this field appeared in the documents examined.
- Percent - the percentage of the documents examined in which this field appears.
- Map Key Target - the type of the field within Elasticsearch
- Analyzed - analyze the string and index it for full-text search.
- Not Analyzed - index this field so it is searchable, but index the value exactly as specified, without analyzing it.
- No - do not index, store, or analyze this field.
- Raw - this creates a second indexed field with “.raw” appended to its name, set to “Not Analyzed”. This is used for display purposes within Kibana and other reporting tools.
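The index options above line up with the `index` setting of a string field in an Elasticsearch 2.x mapping (`analyzed`, `not_analyzed`, `no`), and the Raw option with a `raw` multi-field. The sketch below shows roughly what such a mapping could look like. The helper `string_field_mapping`, the type name `my_type`, and the field names are hypothetical, and the mapping the connector actually generates may differ.

```python
import json

VALID_INDEX_OPTIONS = ("analyzed", "not_analyzed", "no")


def string_field_mapping(index_option, raw=False):
    """Build an Elasticsearch 2.x string field mapping (illustrative helper)."""
    if index_option not in VALID_INDEX_OPTIONS:
        raise ValueError("index_option must be one of %r" % (VALID_INDEX_OPTIONS,))

    mapping = {"type": "string", "index": index_option}
    if raw:
        # Second, untouched copy of the value, addressable as "<field>.raw".
        mapping["fields"] = {"raw": {"type": "string", "index": "not_analyzed"}}
    return mapping


# Hypothetical selections for three fields of a collection.
index_body = {
    "mappings": {
        "my_type": {
            "properties": {
                "title": string_field_mapping("analyzed", raw=True),
                "sku": string_field_mapping("not_analyzed"),
                "internal_notes": string_field_mapping("no"),
            }
        }
    }
}

print(json.dumps(index_body, indent=2))
```

Here `title` is full-text searchable and also available verbatim as `title.raw`, `sku` is searchable only by its exact value, and `internal_notes` is not searchable at all.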
Now that we know that, take a few moments to review the shape of your schema, make your selections, and confirm your mapping. Once you confirm and continue, the platform will begin to provision your connector and start the initial bulk sync. After the bulk sync, your updates will move at near-real-time speeds between MongoDB and Elasticsearch.
Now you can begin to query Elasticsearch or create your own custom Kibana dashboards.
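As a starting point for querying, the sketch below builds a search request body that runs a full-text `match` query against an analyzed field and a `terms` aggregation against a “.raw” field. The helper `build_search_body` and the field names `title` and `category` are assumptions for illustration; substitute the fields you chose to index. The resulting JSON could be sent to your index's `_search` endpoint with whichever HTTP client you prefer.

```python
import json


def build_search_body(text_field, text, facet_field, size=10):
    """Build an Elasticsearch search body (illustrative helper)."""
    return {
        "size": size,
        # Full-text search uses the analyzed field.
        "query": {"match": {text_field: text}},
        # Exact-value bucketing uses the not-analyzed ".raw" sub-field.
        "aggs": {
            "by_" + facet_field: {"terms": {"field": facet_field + ".raw"}}
        },
    }


body = build_search_body("title", "wireless keyboard", "category")
print(json.dumps(body, indent=2))
```

Aggregating on the “.raw” field keeps multi-word values such as "Home Office" together as one bucket, instead of splitting them into separate analyzed tokens.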