Remote office technology
With office build-out done, my next task was to implement the necessary technology to integrate with the main San Francisco office. Step 1 was easy -- I put together detailed specifications for my desired hardware and got quotes from the local computer stores. One vendor not only got me the right equipment but also delivered and installed everything (including the server cabinet). Afterwards, I spent roughly 2 weeks installing and fine-tuning software for optimal remote office integration.
Internet
In China, you can get internet from the telephone company, cable company or the mobile phone companies.
- From the telephone company, the internet is standard ADSL using Point-to-Point Protocol over Ethernet (PPPoE). This means almost any operating system, router or firewall will properly support it. You can also get T1 internet for roughly 1200rmb/mo.
- From the cable company, the hardware looks just like what you might see from Comcast. Unfortunately, they require special Network Address Translation (NAT) software that only runs on Windows XP or Vista. Upon plugging in the cable modem, you get an internal IP address in the 172.x.x.x range. The custom software then logs in and the NAT box at the cable company central office opens up access for you. This setup not only means you must run Windows XP or Vista but also that some applications and services do not work behind NAT without special configuration.
- From the mobile phone companies, you can get a USB dongle that provides 3G internet access. Again, as far as I know, the software they provide only works with Windows XP or Vista, as a friend running Windows 7 could not get it to work. In addition, you pay for 3G internet usage by the minute.
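An address in the 172.x.x.x range from the cable modem is a hint that you are sitting behind the provider's NAT rather than on a public address. Here is a minimal Python sketch of that check, assuming the provider hands out addresses from the RFC 1918 private block 172.16.0.0/12. The function name and sample addresses are made up for illustration.

```python
import ipaddress

# RFC 1918 private block covering 172.16.x.x through 172.31.x.x.
# Assumption: the cable company's internal addresses come from this block.
CABLE_NAT_BLOCK = ipaddress.ip_network("172.16.0.0/12")


def behind_provider_nat(addr: str) -> bool:
    """Return True if addr falls inside the 172.16.0.0/12 private block."""
    return ipaddress.ip_address(addr) in CABLE_NAT_BLOCK


if __name__ == "__main__":
    # Hypothetical sample addresses, not taken from a real connection.
    for sample in ("172.20.5.9", "172.32.0.1", "8.8.8.8"):
        print(sample, behind_provider_nat(sample))
```

Note that not every 172.x.x.x address is private. Only 172.16 through 172.31 are reserved, so the check uses the network block rather than the first octet.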
Virtual Private Network (VPN)
To create a private computer network on top of a public transport (the Internet in most cases), the standard technology is the Virtual Private Network (VPN). The VPN solution we have standardized on is OpenVPN. It is an open source project based on Secure Sockets Layer (SSL), the same software and algorithms that are used to encrypt web transactions. The price was free and the implementation easier than the other two major competitors -- Point-to-Point Tunneling Protocol (PPTP) and Internet Protocol Security (IPSec). Both PPTP and IPSec use far more complex software and depend on IP protocols other than TCP and UDP (GRE for PPTP, ESP for IPSec), meaning they either must run at the firewall or the firewall must have special support for them. By comparison, OpenVPN uses a single port (either TCP or UDP) which allows for straightforward firewall configuration.
Now, buying a hardware firewall with built-in VPN would have been easier to implement but costlier and less flexible. Because we have multiple data centers across the U.S., each site has point-to-point VPN connections to several of the others. We can then change traffic routing dynamically if one connection goes down. While it will be slower to zig-zag to a farther data center first before heading back to your closer one, at least the connections can still be maintained. We looked at hardware firewalls and none of the cheaper firewalls (under $10K) had the flexibility to support dynamic routing over multiple VPN networks. So for our requirements, it was better to use a software solution running on rock-solid hardware.
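The zig-zag idea can be sketched in a few lines of Python. This is only an illustration of picking the cheapest path over whichever VPN links are currently up, not the actual failover scripts. The site names, latencies and the `best_path` helper are all hypothetical.

```python
import heapq


def best_path(links, src, dst):
    """Dijkstra's shortest path over the VPN links that are currently up.

    links maps (site_a, site_b) -> (latency_ms, is_up); links are two-way.
    Returns (total_latency_ms, [sites...]) or None if dst is unreachable.
    """
    graph = {}
    for (a, b), (latency, is_up) in links.items():
        if is_up:
            graph.setdefault(a, []).append((b, latency))
            graph.setdefault(b, []).append((a, latency))

    queue = [(0, src, [src])]
    seen = set()
    while queue:
        cost, site, path = heapq.heappop(queue)
        if site == dst:
            return cost, path
        if site in seen:
            continue
        seen.add(site)
        for nxt, latency in graph.get(site, []):
            if nxt not in seen:
                heapq.heappush(queue, (cost + latency, nxt, path + [nxt]))
    return None


if __name__ == "__main__":
    # Hypothetical sites and latencies, for illustration only.
    links = {
        ("china", "dc_west"): (180, False),  # direct link is down
        ("china", "dc_east"): (250, True),
        ("dc_west", "dc_east"): (70, True),
    }
    print(best_path(links, "china", "dc_west"))
```

With the direct link marked down, the only remaining route to the nearer data center goes through the farther one. It is slower, but the connection stays up.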
There is a problem in OpenVPN with either slower remote connections or asymmetric upload/download transfer rates -- or a combination of both. Certain protocols like SMTP (email send) and IMAP (email read) make all OpenVPN traffic slow down to a crawl. This means when I download big attachments off our email server across the world, the VPN link seems to "lock" up for about 30 seconds. However, it does resume after a while. Disabling compression alleviates some of the problems but not all. I've googled up a few links saying it might be a packet fragmentation issue but have not narrowed it down yet. Moving our SMTP server to our local network solves half the problem but moving IMAP to our remote office cannot be done without making an email sub-domain. It's annoying but not fatal. (If anybody has similar experience with OpenVPN, please post a reply with your observations and solutions.)
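If fragmentation does turn out to be involved, the arithmetic looks roughly like the Python sketch below. The PPPoE, IPv4 and TCP header sizes are standard. The `tunnel_overhead` value is a placeholder assumption, not a measured OpenVPN figure, since the real overhead depends on the cipher, transport and options in use.

```python
ETHERNET_MTU = 1500
PPPOE_OVERHEAD = 8   # 6-byte PPPoE header + 2-byte PPP protocol field
IPV4_HEADER = 20     # without IP options
TCP_HEADER = 20      # without TCP options

# MTU of an ADSL link that uses PPPoE.
ADSL_MTU = ETHERNET_MTU - PPPOE_OVERHEAD


def max_tcp_payload(link_mtu: int, tunnel_overhead: int = 0) -> int:
    """Largest TCP payload that fits in one packet without fragmentation.

    tunnel_overhead is the number of bytes the VPN adds around each
    inner packet (outer headers plus framing). Treat it as an assumption.
    """
    return link_mtu - tunnel_overhead - IPV4_HEADER - TCP_HEADER


if __name__ == "__main__":
    print("ADSL MTU:", ADSL_MTU)
    print("Payload without tunnel:", max_tcp_payload(ADSL_MTU))
    # 60 bytes is a made-up overhead used only to show the effect.
    print("Payload inside tunnel:", max_tcp_payload(ADSL_MTU, tunnel_overhead=60))
```

Any inner packet larger than the tunnel can carry has to be split or dropped somewhere along the path, which is one way large transfers such as email attachments could stall.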
Services
Besides seamless network connections between offices, every office needs the typical range of network services. The following is the list implemented and the packages used:
- Firewall - iptables
- DHCP - dhcpd
- Name Services (DNS) - bind
- File Sharing - samba
- Login Authentication - samba (domain controller)
- FTP - vsftpd
- Web Server - apache
- Email - exim
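A quick way to confirm that services like the ones listed above are reachable is to try a TCP connection to each well-known port. The Python sketch below assumes the conventional port numbers. The `SERVICE_PORTS` table and both helper names are my own, and any host you pass in is up to you.

```python
import socket

# Conventional TCP ports for some of the services listed above.
SERVICE_PORTS = {"dns": 53, "samba": 445, "ftp": 21, "web": 80, "smtp": 25}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_services(host: str, ports=SERVICE_PORTS) -> dict:
    """Map each service name to whether its port accepted a connection."""
    return {name: port_open(host, port) for name, port in ports.items()}
```

Calling `check_services("192.168.1.10")` (a hypothetical server address) returns a dictionary such as `{"dns": True, "ftp": False, ...}`. This only covers TCP, so DHCP and UDP-only DNS would need a different check.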
In the past, people would separate out services into separate computers to avoid having a single point of failure. The current practice is to use a single beefy machine whose primary purpose is to host other "machines". The overall technology is called Virtual Machines while emulating hardware to host operating systems is usually referred to as Hardware Virtualization (HVM). By using HVM software, separate servers can run under a single host as if they were separate machines.
This means for the China office, the server hardware consists of a single machine with RAID storage running Linux KVM. I then created separate VM guests running CentOS 5.4 to handle internet/firewall/VPN and office services. Not only does this save on computer hardware costs but it makes maintenance far easier. If the hardware ever died, all I'd need to do is get any machine running a newish version of Linux with KVM and copy over the VM images. (KVM even has an option to freeze and transfer over a live VM guest -- however, you really need 10 Gigabit Ethernet for images to transfer fast enough.)
Note that even the internet connection is hooked up to a VM guest -- the VM host exposes the underlying network interfaces as bridges and routes traffic from/to the VM guests. In theory, I could use this technique to host a Windows XP VM running the custom cable modem or 3G USB dongle software with internet sharing enabled. Then I would add a default route on my router VM to send traffic to the Windows XP VM. This would allow my remote office computers to share a single internet connection even though those cable/3G alternatives require custom software.
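The default route trick relies on how routers pick a next hop: the most specific matching route wins, and the default route (0.0.0.0/0) catches everything else. Here is a small Python sketch of that lookup. All of the networks and hop labels are hypothetical and are not the real office addressing.

```python
import ipaddress


def next_hop(routes, destination):
    """Return the hop of the most specific route that matches destination.

    routes is a list of (network, hop_label) pairs; longest prefix wins.
    Returns None when no route matches at all.
    """
    dest = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routes if dest in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Hypothetical routing table for the router VM.
ROUTES = [
    (ipaddress.ip_network("192.168.10.0/24"), "local LAN"),
    (ipaddress.ip_network("10.8.0.0/16"), "OpenVPN tunnel"),
    (ipaddress.ip_network("0.0.0.0/0"), "Windows XP VM"),  # default route
]

if __name__ == "__main__":
    for addr in ("192.168.10.25", "10.8.3.4", "93.184.216.34"):
        print(addr, "->", next_hop(ROUTES, addr))
```

Local and VPN traffic keep their own routes, while anything bound for the wider internet falls through to the Windows XP VM that holds the actual connection.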
Communications
For communications, I first experimented with using Ekiga. From my home laptop, it worked fine but from the new office, I'd see error messages about ports being blocked. While I could have found out the ports to unblock, this probably meant only a single computer behind a firewall could connect to an Ekiga server on the outside. However, since Ekiga uses open standards, I was able to install OpenSER on an internal server to act as our own Voice over Internet Protocol (VoIP) server. While this solution worked well at a technology level for voice and video, the practical issue of putting Ekiga on every machine was hard to overcome. When calling internal staff, we'd have to use Ekiga but then use regular phones for external communications. Hence, we limit Ekiga to only video conferencing.
For voice communications, the San Francisco office continues to use regular phone lines but our China office uses Skype with the annual pre-paid U.S. calling package. For about $60 annually per line, employees at our China office have unlimited outbound calling. For another $30, we can give them U.S. phone numbers for inbound calling.
The majority of our communications though is via text messaging. Instead of using the typical instant messaging clients that connect one person to another, we instead have an internal Internet Relay Chat (IRC) server where everybody "chats" in the same "room". We have 2 public rooms, one for work topics and one for non-work topics. Instead of calling people and possibly interrupting whatever they're doing, we usually leave questions and comments on IRC for them to get around to when they have time. In addition, because these messages are public to all employees, everybody can contribute to the discussion if they have knowledge on that topic.
Printing/Scanning
For printing and scanning, I obtained a Brother MFC-7840N. (The Brother link shows the 7840W which includes wireless -- only the wired version is sold in China.) It is the typical multi-function laser printer that does printing, scanning and faxing. The feature I liked most was the ability to scan to PDFs and upload to an FTP server. Most MFC competitors only support scanning to email. This means the printer actually sends the PDF to an outside email server and then you pull that file off your email back down. If your email server is close by on a fast connection, this system works ok but with an email server across the Pacific, it is a total hassle. Instead, I looked for MFCs that could scan directly either to FTP or SMB. I found a list of several and this printer was the best and cheapest option available.
Work flow
The final piece of the puzzle is the custom work flow application we have developed internally. By breaking up work into steps, the China office can look for tasks and projects that have hit the "ready to be tested" stage, do their work and either send back for more changes or mark as done. Without such an application, the work flow would be much more intrusive requiring far more coordination between offices.
Summary
All in all, it took me roughly a week to put together the initial software packages and then another week of monitoring and fine-tuning the VPN failover scripts. By sticking with open source solutions wherever possible, the cost of the technology was minimal. Now some people will say "open source is only free if your time is free". I've found though that working with open source not only keeps my skills sharp but gives me the practice needed to be prepared for real emergencies. In addition, knowing all the nitty-gritty details gives me a full understanding of the technology versus simply parroting a white paper. And it is the rapid pace of technology that makes remote offices possible even for small companies. A decade ago, a small company would not have even considered a remote office, much less an overseas one, due to work flow and integration issues.
While I haven't figured out the source of the OpenVPN problems, I came upon a workaround. By requiring TLS encryption and routing IMAP traffic over the external interface, the traffic that triggers the problems is no longer tunnelled.