Elastic MongoDB

A few months ago I had some spare time and wanted to combine my experience from the last year with MongoDB and SaltStack. The result is Elastic MongoDB, an example of orchestration/automation that sets up a MongoDB cluster with replication and sharding using Salt. The instructions to launch it are in the README; it should be running in a few steps. There is only one issue: you have to run the salt \* state.highstate command three or four times to get everything working. That's because there are some dependencies between the boxes, so the setup has to happen in a certain order. By default, Salt applies all the states at the same time, generating failures. To fix this I would have to use a feature of Salt called overstate, but I haven't had time to implement it.

I'll fix it when I have some time, but you can try it anyway. Have a look here.

Feel free to ask any questions!

Another couple of notes about MongoDB

I forgot to mention another couple of thoughts about MongoDB in my last post. 

  1. MongoDB is fast, but there is no magic. When I first implemented it, I didn’t take care of indexes. I thought: “MongoDB is fast, I’ll set up the indexes later.” Big mistake! Without indexes the performance was horrible, and I had to roll back my deployment in production. (See the first sketch after this list.)
  2. Padding. When you insert documents into your database, MongoDB stores them next to each other. If a document grows (for example, if you have an array and you add an element to it), Mongo must move the document to a new place. A growing document therefore implies two tasks: moving the document and updating its indexes (of course). If you add elements to a lot of documents at once, there is a lot of work: locking, reads that have to wait, etc. To avoid this, there is something called padding. When Mongo detects that documents in a collection tend to grow, it adds extra space to them, so that when you add a new element the document won’t need to be moved. This padding is calculated using the padding factor, which is clearly explained in the official documentation. In my case, to be sure that performance would be good, I force padding after my first bulk import by running db.runCommand( { compact: 'collection', paddingFactor: 1.1 } ). That command compacts the collection (eliminating wasted space) and adds a padding of 10% to each document. (The second sketch after this list shows it in context.)
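
As a minimal sketch of the index point above (collection and field names are hypothetical, and the syntax is the old mongo shell’s ensureIndex, which later versions renamed to createIndex):

    // Hypothetical collection: without this index, the query below scans every document.
    db.events.ensureIndex( { userId: 1, createdAt: -1 } )

    // Check that the query really uses the index: in old shells, explain()
    // should report a "BtreeCursor" instead of a "BasicCursor".
    db.events.find( { userId: 42 } ).sort( { createdAt: -1 } ).explain()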

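And a sketch of the padding step, using a hypothetical collection name; note that paddingFactor belongs to the old MMAPv1 storage engine era and was dropped in later versions:

    // Force padding right after the initial bulk import:
    // compact the collection and leave 10% of extra space per document.
    db.runCommand( { compact: "events", paddingFactor: 1.1 } )

    // The current padding factor is reported in the collection stats.
    db.events.stats().paddingFactor
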
Have a look at the log file; by default MongoDB writes a lot of interesting information there. For example, if you see update log entries with nmoves:1, it means that the update operation required the document to be moved.

A couple of notes from my first steps with MongoDB

I’m starting the process of migrating from MySQL to MongoDB. I’ve heard about and tried a lot of interesting things, and the migration is worth it. Async writes sound very interesting for my app, and the schemaless design is fantastic (goodbye “alter table” and migration scripts). I faced two problems: the first one is documented, the second is not (or I wasn’t able to find it).

  1. Multikey indexes only support one array field. Mongo supports array fields, a special type in a document (a record, in the SQL world) that holds a list of items of any type. You can use array fields in an index, but only one per index. (See the first sketch after this list.)
  2. Writing aggressively to indexed fields kills performance. I have a huge collection with more than 10M documents (rows, in the SQL world). In my application, I iterate over the docs every minute to do some processing and then write the results to a field of the same doc. This field was part of an index (it’s an array), and I had to remove it from the index; the second sketch after this list shows the idea. After that, processing time dropped from more than 1 minute to less than 10 seconds. It looks like the index update locks the collection, blocking read operations too. I’m waiting for confirmation from the mailing list about this.
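
A minimal sketch of the multikey limitation (collection and field names are hypothetical): you can declare a compound index over two fields, but MongoDB refuses to index a document in which both of those fields are arrays.

    // Compound index over two fields.
    db.items.ensureIndex( { tags: 1, categories: 1 } )

    // This works: only one of the indexed fields is an array.
    db.items.insert( { tags: [ "a", "b" ], categories: "x" } )

    // This fails with a "cannot index parallel arrays" error,
    // because both indexed fields are arrays in the same document.
    db.items.insert( { tags: [ "a", "b" ], categories: [ "x", "y" ] } )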

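And a sketch of the fix for the second point, again with hypothetical names: drop the index that contains the frequently rewritten array field and rebuild it without that field, so the per-minute writes no longer have to maintain multikey index entries.

    // The original index included the array field that is rewritten every minute.
    db.items.dropIndex( { status: 1, results: 1 } )

    // Rebuild the index without the hot array field.
    db.items.ensureIndex( { status: 1 } )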