 Diego Woitasen | Infrastructure developer, DevOps Engineer, Linux and Open Source expert | Page 2

New page in my wiki knowledge base: 389 DS errors and solutions

In the last two or three years I worked a lot with 389 DS. I set up small installations and really big ones (more than 10,000 objects and 150 replica servers around the world). I decided to open a new page in my wiki to write down the errors and the solutions that I’ve found along the way with DS. You can have a look here. Right now there is only one error and one solution, but I’ll try to fill it soon with more information gathered from my experience.


Another couple of notes about MongoDB

I forgot to mention another couple of thoughts about MongoDB in my last post. 

  1. MongoDB is fast, but there is no magic. When I first implemented it, I didn’t take care of indexes. I thought: “MongoDB is fast, I’ll set up indexes later.” Big mistake! Without indexes the performance was horrible, and I had to roll back my deployment in production.
  2. Padding. When you insert documents in your database, MongoDB stores them next to each other. If a document grows (for example, if it has an array and you add an element), Mongo must move the document to a new place. So a growing document implies two tasks: moving the document and, of course, updating the indexes. If you add elements to a lot of documents at once, that means a lot of work, locking, reads that have to wait, etc. To avoid this, there is something called padding. When Mongo detects that documents tend to grow, it allocates extra space for them, so when you add a new element, no move is needed. This padding is calculated using the padding factor, which is clearly explained in the official documentation. In my case, to be sure that performance will be good, I force padding after my first bulk import. I did it by running "db.runCommand({ compact: 'collection', paddingFactor: 1.1 })". That command adds padding of 10% to each document and compacts the collection (eliminating wasted space).

Have a look at the log file: by default MongoDB writes a lot of interesting information there. For example, if you see update log entries with nmoves:1, that means the update operation required the document to be moved.
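
The padding arithmetic from point 2 can be sketched in a few lines of Python. This is only a simplified model under my own assumptions (the function names are hypothetical and the real storage engine is more involved): a record reserves the document size multiplied by the padding factor, and an update avoids a move as long as the grown document still fits in that space.

```python
def padded_record_size(doc_bytes, padding_factor):
    """Space reserved for a document in this simplified model:
    its size multiplied by the padding factor, rounded to whole bytes."""
    return round(doc_bytes * padding_factor)


def fits_in_place(doc_bytes, growth_bytes, padding_factor):
    """True if the document can grow by growth_bytes without being moved."""
    return doc_bytes + growth_bytes <= padded_record_size(doc_bytes, padding_factor)


if __name__ == "__main__":
    # Hypothetical 1000-byte document that grows by 80 bytes.
    print(padded_record_size(1000, 1.1))
    print(fits_in_place(1000, 80, 1.1))  # padding of 10%
    print(fits_in_place(1000, 80, 1.0))  # no padding at all
```

With a factor of 1.1 the 80-byte growth fits in the reserved space; with a factor of 1.0 the same update would force a move.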


A couple of notes from my first steps with MongoDB

I’m starting the process of migrating from MySQL to MongoDB. I’ve heard about and tried a lot of interesting things, and it’s worth the migration. Async writes sound very interesting for my app and being schemaless is fantastic (goodbye “alter table” and migration scripts). I faced two problems: the first one is documented, the second is not (or I wasn’t able to find it).

  1. Multi-key indexes only support one array field. Mongo supports array fields, a special type in the document (record in the SQL world) that holds a list of items of any type. You can use them in an index, but only one array field per index.
  2. Writing aggressively to indexed fields kills performance. I have a huge collection with more than 10M documents (rows in the SQL world). In my application I iterate over the docs every minute to do some processing and then I write the results to a field of the same doc. This field was part of an index (it’s an array). I had to remove it from the index. After that, processing time dropped from more than 1 minute to less than 10 seconds. It looks like the index update locks the collection, blocking read operations too. I’m waiting for confirmation from the mailing list about this.
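
The restriction in point 1 can be illustrated without a database. The sketch below is plain Python, not MongoDB code, and the field names (tags, results, owner) are hypothetical: it just checks whether a document has more than one array among the fields of a proposed index, which is the situation the multi-key restriction rejects.

```python
def array_fields_in_index(doc, index_fields):
    """Return the indexed fields whose value in this document is an array."""
    return [field for field in index_fields if isinstance(doc.get(field), list)]


def can_be_indexed(doc, index_fields):
    """A document is acceptable if at most one indexed field holds an array."""
    return len(array_fields_in_index(doc, index_fields)) <= 1


if __name__ == "__main__":
    # Hypothetical document with two array fields.
    doc = {"owner": "diego", "tags": ["a", "b"], "results": [1, 2, 3]}
    print(can_be_indexed(doc, ["owner", "tags"]))    # one array field
    print(can_be_indexed(doc, ["tags", "results"]))  # two array fields
```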


Desktop Virtualization and IP addressing issue

My client’s problem

A bank, one of my clients, bought new hardware for their desktops some time ago. They have to run Windows 2000 in the branches because there is a critical application that doesn’t work on newer versions. After they received the computers (around 700 units), they found that Windows 2000 isn’t compatible with the hardware: the machines are really new and the vendor doesn’t provide drivers for this old version of the Microsoft operating system. Porting the application to a newer version is difficult and, above all, requires a lot of time. Meanwhile, the old computers break from time to time, so provisioning new hardware is urgent.

My client’s solution

The only solution they found was to run Windows 2000 virtualized on the new machines. They install Linux and KVM and run the end-user operating system on top of them. KVM’s hardware abstraction solves the problem, because Windows never sees the real hardware. This workaround works perfectly. It may not be the best solution, but the other ones require a lot of time.

The IP addressing problem

After finding this solution, they faced a new problem: addressing. They use /24 subnets in the branches, and the big ones have more than 100 desktops running. If they deploy the virtualized desktops, every machine needs one IP for the Linux host and another for the Windows guest, which doubles the required IPs per branch. One option is to change the IP addressing to support more IPs per branch, but that’s another big modification that requires time (IPs hardcoded in some apps, router and firewall configuration, etc.). It isn’t an option.
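
To make the address budget concrete, here is a rough calculation with Python’s ipaddress module. The desktop counts are hypothetical examples, and the sketch ignores the addresses taken by routers, printers and servers, so the real margin is smaller than what it shows.

```python
import ipaddress

# A branch subnet of the size described above (a /24).
branch = ipaddress.ip_network("10.60.130.0/24")

# Usable host addresses: everything except network and broadcast address.
USABLE = branch.num_addresses - 2


def fits_in_branch(desktops, ips_per_desktop=2):
    """True if every desktop can get ips_per_desktop addresses
    (one for the Linux host, one for the Windows guest)."""
    return desktops * ips_per_desktop <= USABLE


if __name__ == "__main__":
    print(USABLE)
    for desktops in (100, 130):  # hypothetical branch sizes
        print(desktops, fits_in_branch(desktops))
```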

The Linux hosts require an IP address because support techies will need to access them to fix issues.

The solution to the IP addressing issue

The first measure to fix this issue was to configure every Linux host with an IP address within the range 169.254.0.0/16, the link-local range reserved for communication between computers in the same network segment. All the branches use the same subnet for the hosts. This address solves connections between the computers inside each branch network, but connections from headquarters are impossible, because this network isn’t routable.
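
The difference between the two kinds of addresses can be checked with Python’s ipaddress module. This is only an illustration: 169.254.0.1 is an example host address inside the link-local range, and the function name is my own.

```python
import ipaddress

LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")


def reachable_from_headquarters(address):
    """Link-local addresses are never routed, so only the other
    addresses can be reached from outside the branch segment."""
    return not ipaddress.ip_address(address).is_link_local


if __name__ == "__main__":
    host = "169.254.0.1"   # example address for a Linux host
    vm = "10.60.130.100"   # address of a Windows VM in the bank network
    print(ipaddress.ip_address(host) in LINK_LOCAL)
    print(reachable_from_headquarters(host))
    print(reachable_from_headquarters(vm))
```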

So, another problem appears… how are support techies able to access the Linux hosts from the headquarters?

KVM uses a bridge to connect virtual machines to the physical network. With ebtables and iptables I’ve found a trick that permits connections to port 22 of the host using the IP address of the virtual machine. Let’s say that the VM uses the address 10.60.130.100, which is a valid address in the bank network. The VM also has its own MAC address, for example 52:54:00:bf:57:bb. Have a look at this ebtables rule:

ebtables -t nat -A PREROUTING -p arp --arp-opcode 1 --arp-ip-dst 10.60.130.100 -j arpreply --arpreply-mac 52:54:00:bf:57:bb

This rule captures all the ARP requests asking for the IP address of the Windows machine and generates a reply with the MAC address of the VM. So ARP requests are answered whether Windows is running or not, which means the packets always go through Linux.

Now check this rule: 

iptables -t nat -A PREROUTING -i virbr0 -p tcp -d 10.60.130.100 --dport 22 -j REDIRECT

This is a typical REDIRECT rule: all the packets addressed to the IP of the Windows machine with destination port 22 are redirected to the Linux host.
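
With around 700 machines, the two rules have to be produced per VM. A minimal sketch of a generator in Python follows; the function name and the validation are my own additions, and the bridge name defaults to the virbr0 used in the example above. It only builds the command strings and does not execute anything.

```python
import ipaddress
import re

MAC_RE = re.compile(r"^([0-9a-f]{2}:){5}[0-9a-f]{2}$", re.IGNORECASE)


def vm_rules(vm_ip, vm_mac, bridge="virbr0"):
    """Return the ebtables and iptables commands for one virtual machine."""
    ipaddress.ip_address(vm_ip)  # raises ValueError on a malformed address
    if not MAC_RE.match(vm_mac):
        raise ValueError("malformed MAC address: %s" % vm_mac)
    arp_reply = (
        "ebtables -t nat -A PREROUTING -p arp --arp-opcode 1 "
        "--arp-ip-dst %s -j arpreply --arpreply-mac %s" % (vm_ip, vm_mac)
    )
    ssh_redirect = (
        "iptables -t nat -A PREROUTING -i %s -p tcp "
        "-d %s --dport 22 -j REDIRECT" % (bridge, vm_ip)
    )
    return [arp_reply, ssh_redirect]


if __name__ == "__main__":
    for rule in vm_rules("10.60.130.100", "52:54:00:bf:57:bb"):
        print(rule)
```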

Looks easy, right? But there is more work to be done. In the default gateway of your network, you have to insert these rules:

 

iptables -t nat -A POSTROUTING -o eth0 -d 10.60.130.0/24 -p tcp --dport 22 -j SNAT --to 169.254.0.1 
iptables -t nat -A POSTROUTING ! -o eth0 -s 169.254.0.0/24 -d 10.0.0.0/8 -j SNAT --to 10.60.130.254

The first one is needed because the workstation only accepts traffic from the link-local network, and the second one allows the machine to communicate with the rest of the world.


What does someone who uses proprietary software do in this case?

Today I was writing an application in Python using Twisted. I had a bug that took me a while to find because it was in the library, not in my application. It was a simple error in a method of a class, which I rewrote in a derived class by cutting and pasting. It was simple to solve, because I had the code right there. And if I hadn’t had it? That’s why I ask what someone who works with a closed library does in these cases. To find the problem, I edited the code of the Python module, added a couple of prints and soon knew what was going on. If the library had been closed, I would probably still be going around in circles…

By the way, I already reported the bug: http://twistedmatrix.com/trac/ticket/6212


Convert Rackspace Cloud Server images to OVA

I wrote a script to convert Rackspace Cloud Server images to OVA files. These files can be imported into VMware and VirtualBox (and maybe other hypervisors).

You have to get a copy of the .tgz files generated from Cloud Servers snapshots and then pass it as the first argument of the script.

The script is here.

In the comments you can see the requirements and how to use it.

Feedback is welcome.


Load balance between source IPs in Linux

Today I received a question about how to distribute outgoing connections between several IP addresses attached to an interface. Suppose that you have 3 IPs on the eth0 interface and you want to do round robin between those IPs for outgoing connections. With regular iproute commands you can’t. Tricks with fwmarks, ip rule and ip route don’t do it either.

The only way that I’ve found to do it is using SNAT with the statistic match to get a real round-robin balance:

 

iptables -t nat -A POSTROUTING  -m statistic --mode nth --every 3 -j SNAT --to 192.168.1.201

iptables -t nat -A POSTROUTING  -m statistic --mode nth --every 2 -j SNAT --to 192.168.1.202

iptables -t nat -A POSTROUTING  -m statistic --mode nth --every 1 -j SNAT --to 192.168.1.203

The IPs used in the example must be addresses configured locally on the host.
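
Why the values 3, 2 and 1 give an even split is easier to see in a small simulation. The Python sketch below is a simplified model, not the kernel code: each rule keeps its own counter, only sees the connections that earlier rules did not match, and matches one out of every N of them. The exact packet on which a counter fires is an assumption of the model. Since the nat table is consulted for the first packet of each connection, the unit here is a connection.

```python
from collections import Counter

# (value of --every, SNAT address), in the same order as the rules above.
RULES = [(3, "192.168.1.201"), (2, "192.168.1.202"), (1, "192.168.1.203")]


def simulate(connections, rules=RULES):
    """Return the source IP chosen for each new connection."""
    seen = [0] * len(rules)  # per-rule counter of connections evaluated
    chosen = []
    for _ in range(connections):
        for i, (every, ip) in enumerate(rules):
            hit = seen[i] % every == 0  # one match out of every `every`
            seen[i] += 1
            if hit:
                chosen.append(ip)
                break  # SNAT is a terminating target
    return chosen


if __name__ == "__main__":
    print(Counter(simulate(9)))
```

The first rule takes one connection out of three, the second takes half of the remaining two thirds, and the last one takes whatever is left, so each address ends up with a third of the connections.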
