Planet Sysadmin

          blogs for sysadmins, chosen by sysadmins...

September 01, 2014

Everything Sysadmin

TPOCSA Book Tour announcement!

I'm excited to announce my "book tour" to promote The Practice of Cloud System Administration, which starts shipping on Friday, September 5!

I'll be speaking and/or doing book signings at the following events. More dates to be announced soon.

This book is the culmination of 2 years of research on the best practices for modern IT / DevOps / cloud / distributed computing. It is all new material. We're very excited by the early reviews and hope you find the book educational as well as fun to read.

I'd be glad to autograph your copy if you bring it to any of these events. (I have something special planned for ebook owners.)

Information about the book is available online; you can read a rough draft on Safari Books Online. For a limited time, save 35% by using discount code TPOSA35.

I look forward to seeing you at one or more of these events!

September 01, 2014 09:00 PM

Rands in Repose

The Wolf

You’ve heard of the 10x engineer, but I am here to tell you about the Wolf. They are an engineer and they consistently exhibit the following characteristics:

  • They appear to exist outside of the well-defined process we’ve established to get things done, but they appear to suffer no consequences for not following these rules.
  • Everyone knows they’re the Wolf, but no one ever calls them the Wolf.
  • They have a manager, but no one really knows who it is.
  • They have a lot of meetings, but none of them are scheduled. Inviting them to your meeting is a crap shoot.
  • They understand how “the system” works, they understand how to use “the system” to their advantage, they understand why “the system” exists, but they think “the system” is a bit of a joke.
  • You can ask a Wolf to become a manager, but they’ll resist it. If you happen to convince them to do it, they will do a fine job, but they won’t stay in that role long. In fact, they’ll likely quit managing when you least expect it.
  • Lastly, and most importantly, the Wolf generates disproportionate value for the company with their unparalleled ability to identify and rapidly work on projects essential to the future of the company.

The Wolf moves fast because he or she is able to avoid the encumbering necessities of a group of people building at scale. This avoidance of most things process-related, combined with exceptional engineering ability, allows them to move at a speed that makes them unusually productive. It’s this productivity that the rest of the team can… smell. It’s this scent of pure productivity that allows them to further skirt documentation, meetings, and annual reviews.

It’s easy to hate the Wolf when you’ve just spent the day writing integration tests, but it’s also easy to admire the fact that they appear to be dictating their own terms.

In my career, I’ve had the pleasure of working with a handful of Wolves. They appreciate that I have identified them as such and we have interesting ongoing conversations regarding their Wolf-i-ness. Two times now, I’ve attempted to reverse engineer Wolves and then hold up the results to other engineers. See? Here is a well-defined non-manager very technical track. Both attempts have mostly failed. The reason was the same both times: the influence earned by the Wolf can never ever be granted by a manager.

The Wolf doesn’t really need me. In fact, the Wolf is reading this right now and grinning because he or she knows that I’ve done an ok job describing them – there is a chance this description may help inspire future Wolves, but what really matters… is what they’re working on right now.

by rands at September 01, 2014 04:26 PM

Ubuntu Geek

Ubuntu 14.10 (Utopic Unicorn) beta-1 released

The first beta of the Utopic Unicorn (to become 14.10) has now been released!

This beta features images for Kubuntu, Lubuntu, Ubuntu GNOME, Ubuntu Kylin, Xubuntu and the Ubuntu Cloud images.

Pre-releases of the Utopic Unicorn are *not* encouraged for anyone needing a stable system or anyone who is not comfortable running into occasional, even frequent breakage. They are, however, recommended for Ubuntu flavor developers and those who want to help in testing, reporting and fixing bugs as we work towards getting this release ready.
Read the rest of Ubuntu 14.10 (Utopic Unicorn) beta-1 released (312 words)

© ruchi for Ubuntu Geek, 2014. | Permalink

by ruchi at September 01, 2014 02:53 PM

Yellow Bricks

VMware / ecosystem / industry news flash… part 2

There we go, part two of the VMware / ecosystem / industry news flash. I expected a lot of news around VMworld, as is traditionally the case. I hope the below is a good summary; these are the articles / announcements I read and found interesting. It is the Monday after VMworld and I figured I would get this out there, as I will be out for most of this week to recover.

  • Maginatics: A Virtual Filer for VMware’s Virtual SAN
    Last week I mentioned the Nexenta solution for VSAN… this week Maginatics is up. They also announced it last week, but somehow it fell through the cracks so I figured I would list it this week. MSCP offers a distributed file system with global deduplication, multiple caching layers and Content Distribution Network logic built in.
  • VMware EVO:RAIL was of course all over the news, with my fav posts being those from Chris Wahl, Julian Wood, Dell, and Chad Sakac
    Do I really need to comment on this one? I am hoping everyone read my blog… Also, make sure to watch the demo!
  • Infinio announced version 2.0 of their acceleration platform
    A whole bunch of announcements around the 2.0 version of Infinio Accelerator. Support for Fibre Channel, iSCSI and FCoE is probably the biggest piece of functionality added. On top of that, the extended monitoring / reporting section is very handy: those who want to tweak based on latency / IO information will be able to do so. There are some more features announced; make sure to read the announcement for the full details.
  • VMware joins Open Compute Project
    I was surprised about this announcement, did not know it was coming… but I am very excited. The OCP solution is interesting as it is highly optimized around efficiency / power consumption / rack units etc. I have looked at some of the configurations for Virtual SAN but the problem I saw was hardware compatibility / support. Hopefully with this announcement these constraints will be lifted soon! Definitely one I will be following with a lot of interest!
  • Nutanix announced a new round of funding: $140 million
    What more can I say than: Congratulations! Hyper-converged infrastructure is hot, and Nutanix has a compelling solution for sure. $140 million (Series E) is significant, and I guess they are on their way to an IPO (rumours have been floating around for months now).

That was it for now.

"VMware / ecosystem / industry news flash… part 2" originally appeared on Yellow Bricks. Follow me on twitter - @DuncanYB.

Pre-order my upcoming book Essential Virtual SAN via Pearson today!

by Duncan Epping at September 01, 2014 07:32 AM

Chris Siebenmann

We don't believe in DHCP for (our) servers

I understand that in some places it's popular for servers to get their IP addresses on boot through DHCP (presumably or usually static IP addresses). I understand the appeal of this for places with large fleets of servers that are frequently being deployed or redeployed; to put it one way I imagine that it involves touching machines less. However it is not something that we believe in or do in our core network, for at least two reasons.

First and lesser, it would add an extra step when setting up a machine or doing things like changing its network configuration (for example to move it to 10G-T). Not only would you have to rack it and connect it up but you'd also have to find out and note down the Ethernet address and then enter it into the DHCP server. Perhaps someday we will all have smartphones and all servers that we buy will come with machine readable QR codes of their MACs, but today neither is at all close to being true (and never mind the MACs of ports on expansion cards).

(By the way, pretty much every vendor who prints MACs on their cases uses way too small a font size.)
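(For concreteness, the per-host bookkeeping being described looks something like the following in ISC dhcpd's configuration syntax; the hostname, MAC and IP here are invented for illustration.)

```
host srv-example {
  # The Ethernet address you had to read off the machine and type in:
  hardware ethernet 00:16:3e:aa:bb:cc;
  # The 'static' IP that the DHCP server will always hand this machine:
  fixed-address 192.168.10.50;
}
```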

Second and greater, it would add an extra dependency to the server boot process. In fact it would add several extra dependencies; we'd be dependent not just on the DHCP server being up and having good data, but also on the network path between the booting server and the DHCP server (here I'm thinking of switches, cables, and so on). The DHCP server would become a central point of near total failure, and we don't like having those unless we're forced into it.

(Sure, we could have two or more DHCP servers. But then we'd have to make very sure their data stayed synchronized and we'd be devoting two servers to something that we don't see much or any advantage from. And two servers with synchronized data doesn't protect against screwups in the data itself. The DHCP server data is a real single point of failure where a single mistake has great potential for serious damage.)

A smaller side issue is that we label our physical servers with what host they are, so assigning IPs (and thus hostnames) through DHCP creates new and exciting ways for a machine's label to not match the actual reality. We also have their power feeds labeled in the network interfaces of our smart PDUs, which would create similar possibilities for exciting mismatches.

by cks at September 01, 2014 01:41 AM

August 31, 2014

Rands in Repose

You are Bad at Giving Technical Interviews

Laurie Voss via Quartz:

You are looking for grasp of complex topics and the ability to clearly communicate about them, which are the two jobs of the working engineer.


by rands at August 31, 2014 04:12 PM

Yellow Bricks

Liking the VMware EVO:RAIL look? How about a desktop / phone wallpaper?

Dave Shanley (lead engineer for VMware EVO:RAIL) dropped me an email with an awesome looking wallpaper for desktops and smart phones. I asked him if I could share it with the world and I guess it is needless to say what the answer was. Grab ‘em below while they are still hot :). Thanks Dave! Note that each pic below links (so click it) to Flickr with various resolutions available!

Desktop wallpaper:

Smart phone (optimized for iPhone 5s):

"Liking the VMware EVO:RAIL look? How about a desktop / phone wallpaper?" originally appeared on Yellow Bricks. Follow me on twitter - @DuncanYB.

Pre-order my upcoming book Essential Virtual SAN via Pearson today!

by Duncan Epping at August 31, 2014 10:52 AM

Chris Siebenmann

The downside of expanding your storage through bigger disks

As I mentioned recently, one of the simplest ways of expanding your storage space is simply to replace your current disks with bigger disks and then tell your RAID system, file system, or volume manager to grow into the new space. Assuming that you have some form of redundancy so you can do this on the fly, it's usually the simplest and easiest approach. But it has some potential downsides.

The simplest way to put the downsides is that this capacity expansion is generally blind and not so much inflexible as static. Your storage systems (and thus your groups) get new space in proportion to however much space (or how many disks) they're currently using, and that's it. Unless you already have shared storage, you can't reassign this extra new space from one entity to another because (for example) one group with a lot of space doesn't need more but another group with only a little space used now is expanding a lot.

This is of course perfectly fine if all of your different groups or filesystems or whatever are going to use the extra space that you've given them, or if you only have one storage system anyways (so all space flowing to it is fine). But in other situations this rigidity in assigning new space may cause you heartburn and make you wish you could reshape the storage to a lesser or greater degree.

(One assumption I'm making is that you're going to do basically uniform disk replacement and thus uniform expansion; you're not going to replace only some disks or use different sizes of replacement disks. I make that assumption because mixed disks are as much madness as any other mixed hardware situation.)
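As an illustrative sketch of the replace-and-grow approach under Linux md RAID with ext4 on top (the device names and the choice of mdadm/resize2fs here are assumptions for the example, not something from the entry), the uniform disk replacement looks roughly like this:

```shell
# Repeat for each member: add the new, bigger disk and let --replace
# copy the data over while the array stays fully redundant.
mdadm /dev/md0 --add /dev/sdc1
mdadm /dev/md0 --replace /dev/sdb1 --with /dev/sdc1

# Once every member is a bigger disk, grow the array into the new space...
mdadm --grow /dev/md0 --size=max

# ...and then grow the filesystem on top of it (online for ext4).
resize2fs /dev/md0
```

Other RAID systems, volume managers and filesystems have equivalent steps; the shape is always the same: swap disks one at a time, then grow each layer from the bottom up.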

by cks at August 31, 2014 04:35 AM

August 30, 2014

Everything Sysadmin

Tom speaking at LOPSA-NJ September meeting (near Princeton/Trenton)

I'll be the speaker at the September LOPSA-NJ meeting. My topic will be the more radical ideas in our new book, The Practice of Cloud System Administration. This talk is (hopefully) useful whether you are legacy enterprise, fully cloud, or anywhere in between.

  • Topic: Radical ideas from The Practice of Cloud System Administration
  • Date: Thursday, September 4, 2014
  • Time: 7:00pm (social), 7:30pm (discussion)

This is material from our newest book, which starts shipping the next day. Visit for more info.

August 30, 2014 04:28 PM

Aaron Johnson

Three Peaks Challenge

The family and I moved to England (Reading, which is a 25 minute train ride west of London) back in December of last year (2013) and one of the things on my list for our time here was a climb of some sort. Somehow I stumbled across the Three Peaks Challenge (which you can read about on wikipedia) and decided that I had to do it before we left and went back to the US. I tried to get a group of people at work to sign up as a team, since that’d be the cheapest route and it would have been more fun, but it turned out to be pretty difficult to get people to commit, and for good reason as it turns out. I ended up signing up last Tuesday night for a trip that happened this past weekend (August 22-23) through a site that offers trips once every couple weeks, bought a plane ticket to Glasgow and got (I think) the last hotel room in Fort William for Friday night.

The organizers picked me up in Glasgow at the airport with one other guy (Nervil?) and then we piled into the bus for a couple hour ride to Fort William. We drove through Glencoe (which is beautiful; some day I’d love to go back and just walk around the mountains there) and then, after tailing a really really really slow driver for about 30 minutes, finally made it into Fort William. I got a taxi to my hotel (Croit Anna Hotel, which was very quaint), had a nice dinner and got ready for the next day.

I took another taxi back to the Morrison’s parking lot in the morning after taking a panorama of the lake that the hotel was sitting right next to:

and then picked up (IIRC) about 10 liters of water and sat in the parking lot waiting for my van to come. I was the first one there so when the van arrived I got to pick out my seat, which was important because the van was packed to the gills with people. I got a single seat in the very back with a bunch of room to stretch out my legs and then, after a brief walk-through of the next 24 hours and introductions, we were off.

A short 10 minute drive later and we arrived at the base of Ben Nevis, where our first climb would start and end. I ended up bringing up the rear and chatting with what the team called a “trailer” or a “sweeper”, basically a guy that they had hired locally to make sure that any stragglers were taken care of. He was a super nice guy, worked for the local fire department and did this type of work on the weekends for fun. 15 minutes into the hike he pointed out a small valley where apparently a bunch of the movie Braveheart was filmed, which of course I had to take a picture of because, freedom, or rather “FREEDOM!”:

Ben Nevis is a beautiful mountain and because it was a bank holiday weekend, the mountain was packed with people, which made for slow going at some points, but I don’t think we (as a group) stopped a single time on the way up. The leaders set a fantastic pace and for the most part every single one of the 16 people on the trip was in great shape. I barely had time to break into my backpack for food and for a jacket when it started getting cold but man… the views:

were like that most of the way up (if you took a breather and looked behind you on the trail). The weather was perfect: cool, but not too cold and the only rain I got on this mountain was a sprinkle when I was about 10 minutes away from the van and then it stopped and got sunny. But then the scenery started to change and the weather changed with it:

It got a bit cold up there, in fact cold enough that as we were starting our descent it started to snow, just teeny little flakes barely floating out of the sky. I stayed at the top for about 10 minutes, took a couple pictures, ate some dried fruit and granola and then headed back down. The views going down were just as amazing:

At this point I wasn’t really with anyone in the group so I figured I’d try and go faster on the descent which worked out alright for about a mile or two and then I kicked a rock and took a nosedive, luckily not breaking a wrist in the process. From there on out I slowed it down a bit but I think I ended up the 3rd or 4th person off the mountain. Total time: 4:38 for 10.5 miles and 4435 feet elevation gain.

The tour company made everyone some coffee or tea and we were encouraged to stretch out and change clothes before heading back into the van. I took my boots and socks off and immediately changed into my slippers (one of the guys on the South Sister climb I did last year recommended having a pair of flip flops or slippers that you can change into after a hike and YES, highly recommended), which saved my feet from being cooped up in boots for the 6 hour drive down to the next mountain. Turns out my feet, at least the bottoms of my feet, weren’t the thing I needed to worry about; more on that later.

The 6 hour drive to Scafell was brutal; I didn’t have enough food. At one of the stops I saw one of the guys at Burger King getting a Whopper, no cheese. At first I thought he might be crazy, but then thought, “hey, I need food in my body badly”, and so I ordered a Double Whopper, no cheese, no fries… and man that tasted good.

Some number of barely tolerable hours later we arrived at the base of Scafell in the pitch black. I thought my legs were going to be really sore but I’m pretty sure the adrenaline started taking over because I wasn’t all that sore (except for my ankles, see below) and hiking again actually felt good. I think we started sometime around 10:30 or 10:45pm but I was tired enough that I forgot to write it down. :) This hike was kind of a blur, mostly because it was pitch black (we all had our “torches” on) and then once we got to the top, really windy and cold. I got one picture from this hike (that wasn’t completely black):

and finished the hike in 3:50, but we did this one as a group since the chance of one person getting lost on the mountain, going down by themselves in the dark at 2am, was pretty high. Best estimates for this one say that it’s about a 6 mile hike (in and back) with a 3200 foot elevation gain.

After another change of clothes and some coffee (fresh brewed by our crew) we were back in the van for another multiple hour drive down to Snowdon. I lost track of time (3am I think?) but I know that we eventually got to Snowdon around 8am and got started up the mountain at 8:30am. I also remember stopping at a rest stop somewhere in the middle of nowhere, barely being able to walk, and buying a box of Ibuprofen and something to drink, which did help the aches in my knees and ankles a bit.

Snowdon was / is a beautiful mountain as well, just 15 minutes into the hike (we started from the Pen-y-Pass car park and took the Pyg Track) the view looked like this looking back to the car park:

and just got better from there. We ended up getting to the peak by 10:30am (we didn’t complete all the peaks top to bottom in 24 hours, mostly because our van was pretty slow and we got delayed on the first hike by someone that didn’t make it down the mountain as quickly as everyone else), but then the 24 hour time limit is completely arbitrary anyway. The important thing… well, the important thing is that you try and then, if possible, complete the challenge, no matter what the time. As with most things in life, it’s a mental challenge even more than a physical one, although it really helps to be in shape if you attempt to climb all three in 24 hours. There were multiple times when I was sitting in the back of the bus, with my ankles and knees and feet aching, thinking that it’d be so much easier to be back at home, drinking a beer, watching TV. But in hindsight, and I think I’ve realized this more and more as I get older, the mental rewards from attempting and then hopefully completing something like this vastly outweigh any mental benefits of watching TV for 2 hours on a Saturday night. The feeling that you get when you’re standing on a trail looking up at a mountain like this:

that you know you’re only about 30 minutes from being done with is pretty amazing. Our group got the top, like I mentioned, at about 10:30am, fully 24 hours after starting Ben Nevis the day before and we were beat but not beaten:

A grueling hike down that felt like it would never end and then we were back at the van, each going our separate ways. I was and still am thrilled that I got to go and would love to find something like this to do once or twice a year once we get back to the US.

All in, the trip is supposed to be 27 miles and 9800 feet of elevation gain although adding up the numbers above I don’t see it. Either way, I’d do it again in a heartbeat. You should too.

Full set of pictures here:

Oh yeah, one last thing: I’ve got an amazing pair of hiking boots from Asolo but for some reason my ankles are NOT a big fan of said boots. I’ve been on a number of hikes with them now, so I think they’re broken in (but maybe not?), yet after each hike the area around the outside of my ankles (both sides) hurts a bunch. It definitely was not the result of a twist; pretty sure there’s some kind of reinforcement in the boot that’s not jiving with the shape of my ankle, which looked like this on Sunday:

Not sure what to do on my next hike… tape? Different boots? I had / have good thick wool hiking socks, not sure how to fix this.

by ajohnson at August 30, 2014 03:53 PM

Chris Siebenmann

How to change your dm-cache write mode on the fly in Linux

Suppose that you are using a dm-cache based SSD disk cache, probably through the latest versions of LVM (via Lars, and see also his lvcache). Dm-cache is what I'll call an 'interposition' disk read cache, where writes to your real storage go through it; as a result it can be in either writethrough or writeback modes. It would be nice to be able to find out what mode your cache is and also to be able to change it. As it happens this is possible, although far from obvious. This procedure also comes with a bunch of caveats and disclaimers.

(The biggest disclaimer is that I'm fumbling my way around all of this stuff and I am in no way an expert on dm-cache. Instead I'm just writing down what I've discovered and been able to do so far.)

Let's assume we're doing our caching through LVM, partly because that's what I have and partly because if you're dealing directly with dm-cache you probably already know this stuff. Our cache LV is called testing/test and it has various sub-LVs under it (the original LV, the cache metadata LV, and the cache LV itself). Our first job is to find out what the device mapper calls it.

# dmsetup ls --tree
testing-test (253:5)
 |-testing-test_corig (253:4)
 |  `- (9:10)
 |-testing-cache_data_cdata (253:2)
 |  `- (8:49)
 `-testing-cache_data_cmeta (253:3)
    `- (8:49)

We can see the current write mode with 'dmsetup status' on the top level object, although the output is what they call somewhat hard to interpret:

# dmsetup status testing-test
0 10485760 cache 8 259/128000 128 4412/81920 16168 6650 39569 143311 0 0 0 1 writeback 2 migration_threshold 2048 mq 10 random_threshold 4 sequential_threshold 512 discard_promote_adjustment 1 read_promote_adjustment 4 write_promote_adjustment 8

The important bit is the 'writeback' near the end. This says that the cache is in the potentially dangerous writeback mode. To change the write mode we must redefine the cache, which is a little bit alarming and also requires temporarily suspending and unsuspending the device; the latter may have impacts if, for example, you are actually using it for a mounted filesystem at the time.

DM devices are defined through what the dmsetup manpage describes as 'a table that specifies a target for each sector in the logical device'. Fortunately the tables involved are relatively simple and better yet, we can get dmsetup to give us a starting point:

# dmsetup table testing-test
0 10485760 cache 253:3 253:2 253:4 128 0 default 0

To change the cache mode, we reload an altered table and then suspend and resume the device to activate our newly loaded table. For now I am going to just present the new table; the change is the '1 writethrough' feature section:

# dmsetup reload --table '0 10485760 cache 253:3 253:2 253:4 128 1 writethrough default 0' testing-test
# dmsetup suspend testing-test
# dmsetup resume testing-test

At this point you can rerun 'dmsetup status' to see that the cache device has changed to writethrough.
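If you'd rather not retype the whole table by hand, the altered table can be built from the current one with a bit of shell. This is just a sketch; the table string is hardcoded from the example above, and the actual dmsetup calls need root:

```shell
# The current table for the device; normally you would capture it with
#   old=$(dmsetup table testing-test)
old='0 10485760 cache 253:3 253:2 253:4 128 0 default 0'

# Swap the trailing '0 default 0' (no feature arguments) for
# '1 writethrough default 0' (one feature argument: writethrough).
new="${old% 0 default 0} 1 writethrough default 0"
echo "$new"

# Then, as root:
#   dmsetup reload --table "$new" testing-test
#   dmsetup suspend testing-test
#   dmsetup resume testing-test
```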

So let's talk about the DM table we (re)created here. The first two numbers are the logical sector range and the rest of it describes the target specification for that range. The format of this specification is, to quote the big comment in the kernel's devices/md/dm-cache-target.c:

cache <metadata dev> <cache dev> <origin dev> <block size>
      <#feature args> [<feature arg>]*
      <policy> <#policy args> [<policy arg>]*

The original table's ending of '0 default 0' thus meant 'no feature arguments, default policy, no policy arguments'. Our new version of '1 writethrough default 0' is a change to '1 feature argument of writethrough, still the default policy, no policy arguments'. Also, if you're changing from writethrough back to writeback you don't end the table with '1 writeback default 0' because it turns out that writeback isn't a feature, it's just the default state. So you write the end of the table as '0 default 0' (as it was initially here).

Now it's time for the important disclaimers. The first disclaimer is that I'm not sure what happens to any dirty blocks on your cache device if you switch from writeback to writethrough mode. I assume that they still get flushed back to your real device and that this happens reasonably fast, but I can't prove it from reading the kernel source or my own tests and I can't find any documentation. At the moment I would call this caveat emptor until I know more. In fact I'm not truly confident what happens if you switch between writethrough and writeback in general.

(I do see some indications that there is a flush and it is rapid, but I feel very nervous about saying anything definite until I'm sure of things.)

The second disclaimer is that at the moment the Fedora 20 LVM cannot set up writethrough cache LVs. You can tell it to do this and it will appear to have succeeded, but the actual cache device as created at the DM layer will be writeback. This issue is what prompted my whole investigation of this at the DM level. I have filed Fedora bug #1135639 about this, although I expect it's an upstream issue.

The third disclaimer is that all of this is as of Fedora 20 and its 3.15.10-200.fc20 kernel (on 64-bit x86, in specific). All of this may change over time and probably will, as I doubt that the kernel people consider very much of this to be a stable interface.

Given all of the uncertainties involved, I don't plan to consider using LVM caching until LVM can properly create writethrough caches. Apart from the hassle involved, I'm just not happy with converting live dm-cache setups from one mode to the other right now, not unless someone who really knows this system can tell us more about what's really going on and so on.

(A great deal of my basic understanding of dmsetup usage comes from Kyle Manna's entry SSD caching using dm-cache tutorial.)

Sidebar: forcing a flush of the cache

In theory, if you want to be more sure that the cache is clean in a switch between writeback and writethrough you can explicitly force a cache clean by switching to the cleaner policy first and waiting for it to stabilize.

# dmsetup reload --table '0 10485760 cache 253:3 253:2 253:4 128 0 cleaner 0' testing-test
# dmsetup wait testing-test

I don't know how long you have to wait for this to be safe. If the cache LV (or other dm-cache device) is quiescent at the user level, I assume that you should be okay when IO to the actual devices goes quiet. But, as before, caveat emptor applies; this is not really well documented.

Sidebar: the output of 'dmsetup status'

The output of 'dmsetup status' is mostly explained in a comment in front of the cache_status() function in devices/md/dm-cache-target.c. To save people looking for it, I will quote it here:

<metadata block size> <#used metadata blocks>/<#total metadata blocks>
<cache block size> <#used cache blocks>/<#total cache blocks>
<#read hits> <#read misses> <#write hits> <#write misses>
<#demotions> <#promotions> <#dirty>
<#features> <features>*
<#core args> <core args>
<policy name> <#policy args> <policy args>*

Unlike with the definition table, 'writeback' is considered a feature here. By cross-referencing this with the earlier 'dmsetup status' output we can discover that mq is the default policy and it actually has a number of arguments, the exact meanings of which I haven't researched (but see here and here).
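Given that layout, the feature list (and thus the write mode) can be pulled out of 'dmsetup status' mechanically. Here is a small awk sketch using the status line from earlier; treating field 15 as <#features> is my own hand count for this particular layout, so consider it an assumption that may shift across kernel versions:

```shell
# The example 'dmsetup status' output from earlier, hardcoded here;
# normally you would use: status=$(dmsetup status testing-test)
status='0 10485760 cache 8 259/128000 128 4412/81920 16168 6650 39569 143311 0 0 0 1 writeback 2 migration_threshold 2048 mq 10 random_threshold 4 sequential_threshold 512 discard_promote_adjustment 1 read_promote_adjustment 4 write_promote_adjustment 8'

# Fields 1-3 are the sector range and target type, fields 4-14 are the
# sizes and counters, field 15 is <#features>, and the features follow it.
features=$(echo "$status" | awk '{ n = $15; s = ""; for (i = 16; i < 16 + n; i++) s = s (s == "" ? "" : " ") $i; print s }')
echo "$features"
```

An empty result would mean no feature flags were reported for the device.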

by cks at August 30, 2014 04:56 AM

August 29, 2014

Yellow Bricks

Day 3 and 4 report #VMworld

I was planning on writing up a Day 3 at the end of Wednesday but ended up at the VMworld party instead. Let’s start off with Wednesday by itself though. I had a couple of sessions scheduled: a session on VMware EVO:RAIL in a slightly bigger room than the session on Tuesday (6x as big) and a panel session. The EVO:RAIL session went great in my opinion. Lots of good feedback after the session and a couple of great questions. It looks to me that everyone understands the value of a Hyper-Converged Infrastructure Appliance offering from VMware and OEM partners. The simplicity of management and configuration, the “one stop shop” approach for procurement and support, and the auto-scale out functionality are just some of the key things that were mentioned as value add! Judging by the number of people that showed up to the session and did the lab and visited the zone… VMware EVO:RAIL will be a success.

After this session I walked around the solutions exchange for a bit just to wind down and then I had the panel session with Rick, William, Scott and Chad. Funnily enough, the category of questions we received was different this time. Still many EVO:RAIL questions, but a bit less NSX and a bit more generic questions around home labs etc. Derek was there again and he did a great job of capturing most of the Q&A, make sure to read that. Good attendance, good audience participation… Let’s hope it will be the same in EMEA!

VMworld party… What can I say other than: I am not a big fan of the Black Keys. All songs sounded the same, and I only really liked Lonely Boy. Judging though by the crowd going wild it probably is just me.

Thursday was a busy day for me with customer meetings. Some great conversations around the future of the datacenter, the state of converged infrastructures, practical discussions around distributed switch failure domains, stretched clustering and of course containers. I also managed to sit in on a session by Christian Dickmann on troubleshooting Virtual SAN. If you missed that one, make sure to watch the video as it had some interesting tips. Especially some of the troubleshooting commands for RVC. At the end of the day I hung out with Cormac, Fred Nix and Rawlinson, had a couple of drinks and somehow ended up at the DNA Lounge watching Corrosion of Conformity. Nice way to end VMworld to be honest, away from everything related to technology… just some nice noisy music to unwind :)

Now what? Well I am going to take a couple of days off just to get some rest… and then the preparation for VMworld EMEA will start I guess. If you have not registered yet, do so before it is too late. I promise it will be an interesting event!

I want to thank everyone who took the time to say “hi” at VMworld, I truly appreciate it and always love to hear feedback on material I wrote or just have a general conversation about what challenges people are having in their datacenters. It is always nice to know who reads your stuff. Also, it was great meeting up with many friends whom I had not seen in a long time.

Once again, great event… Thanks to the VMworld team for putting on another great show, and let's repeat this in Barcelona!

"Day 3 and 4 report #VMworld" originally appeared on Follow me on twitter - @DuncanYB.

Pre-order my upcoming book Essential Virtual SAN via Pearson today!

by Duncan Epping at August 29, 2014 06:31 PM

Everything Sysadmin

Save up to 55% on our new book for a short time

The Practice Of Cloud System Administration is featured in the InformIT Labor Day sale. Up to 55% off!

Here is the link

Buy 3 or more and save 55% on your purchase. Buy 2 and save 45%, or buy 1 and save 35% off list price on books, video tutorials, and eBooks. Enter code LABORDAY at checkout.

You will also receive a 30-day trial to Safari Books Online, free with your purchase. Plus, get free shipping within the U.S.

The book won't be shipping until Sept 5th, but you can read a recent draft via Safari Online.

August 29, 2014 04:28 PM

Google Blog

Through the Google lens: search trends August 22-28

It was a busy week for entertainment junkies with the Emmys and VMAs, and the cat was out of the bag for Sanrio fans after a surprising piece of news. Read on for more on the last week in search:

And the Emmy goes to…
Though Breaking Bad took home the top honors at Monday’s Emmy Awards, people searched less for the acclaimed drama than for some of the event’s other, more unexpected happenings. American Horror Story’s Jessica Lange proved she’s still got it—she was the top search of the night. Meanwhile, Hayden Panettiere accidentally revealed the gender of her forthcoming baby, leading people to search for information about the actress and her fiancé Wladimir Klitschko. And it was a night of funny women: Julia Louis-Dreyfus did justice to her award for best actress in a comedy with a Seinfeld-inspired bit on stage… and a Seinfeld-throwback kiss just offstage; and Sarah Silverman won an award for best variety special (and showed off some unusual accessories). Other popular Emmys searches included HBO’s The Normal Heart, which was nominated for 16 awards and won two, and True Detective, which won for directing but did not capture the acting awards some expected.
I want my MTV
The other awards show making news this week was MTV’s Video Music Awards. As can only be expected at this point, Beyoncé’s performance was the highlight of the night; the day after the show, there were more than 50,000 searches for [beyonce vma performance] as people scrambled to re-live (or catch up with) the spectacle. But part of Bey’s appeal this time was actually her daughter, Blue Ivy, who appeared on stage (as well as in multiple GIFs, natch) to steal the show like only an adorable child can. Searchers were dazzled by performances by Ariana Grande (in a crystal onesie), Rita Ora (with diamonds in her manicure) and Iggy Azalea. Finally, Katy Perry and Riff Raff’s double denim red carpet tribute to that VMA power couple of the past, Justin Timberlake and Britney Spears, had people giggling—and searching.
Trouble out west
After a nine-year-old in Arizona accidentally shot and killed her shooting instructor with an Uzi, people came to Google to learn more about the incident, which has sparked debates throughout the country. And the largest earthquake to hit the San Francisco Bay Area in 20+ years shook up Napa and surrounding counties this weekend, leading people to the web to learn more about the damage.

Raining [searches for] cats and dogs
Sanrio fans worldwide got some startling news this week: Hello Kitty is not a kitty. According to the Japanese company, she is a little girl. Whatever her species, she was a top trend in search this week. And for those of you who aren’t cat fans (in which case, do you even like the Internet?), there was National Dog Day, Tuesday’s top search and—if you ask us—a great excuse for thousands of people to share photos of their own favorite man’s best friend.

Tip of the week
Don’t let delays ruin your long weekend. To help you decide whether it’s faster to bike or take transit to your Labor Day destination, Google Search can show you all of your transportation options and estimated travel times on a single card. Just tap the mic and say “Ok Google, what’s the traffic like to AT&T Park” and easily switch between transportation modes to determine which route works best for you.

by Emily Wood ( at August 29, 2014 02:33 PM


Distributed JMeter testing using Docker

Distributed performance testing using JMeter has been around for a while.  The recipe goes something like this: A JMeter client (the green box referred to here as JMeter master) drives the process.  It has the test script (the JMX file).  The actual testing is done by n JMeter Server instances (blue boxes above).  When the test is initiated, the JMX script and the necessary data files are

by Srivaths Sankaran ( at August 29, 2014 12:56 PM


How ASLR exposed an old bug

Recently I was investigating an issue where OpenAFS server processes were crashing on start-up if ASLR (Address Space Layout Randomization) is enabled. All of them were crashing in the same place. Initially I enabled ASLR globally and restarted AFS services:
$ sxadm enable -c model=all aslr
$ svcadm restart ms/afs/server
This resulted in core files from all daemons, let's look at one of them:
$ mdb core.fileserver.1407857175
Loading modules: [ ]
> ::status
debugging core file of fileserver (64-bit) from XXXX
file: /usr/afs/bin/fileserver
initial argv: /usr/afs/bin/fileserver -p 123 -pctspare 20 -L -busyat 50 -rxpck 2000 -rxbind
threading model: native threads
status: process terminated by SIGSEGV (Segmentation Fault), addr=ffffffffb7a94b20
> ::stack
libc.so.1`memset+0x31c()
All the other daemons crashed in the same place. Let's take a closer look at the core.
> afsconf_OpenInternal+0x965::dis
afsconf_OpenInternal+0x930: movl $0x0,0xfffffffffffffce4(%rbp)
afsconf_OpenInternal+0x93a: movq 0xfffffffffffffce0(%rbp),%r9
afsconf_OpenInternal+0x941: movq -0x8(%rbp),%r8
afsconf_OpenInternal+0x945: movq %r9,0x18(%r8)
afsconf_OpenInternal+0x949: movq -0x8(%rbp),%rdi
afsconf_OpenInternal+0x94d: movl $0x0,%eax
afsconf_OpenInternal+0x952: call +0x1619
afsconf_OpenInternal+0x957: movq -0x8(%rbp),%rdi
afsconf_OpenInternal+0x95b: movl $0x0,%eax
afsconf_OpenInternal+0x960: call +0x8ecb <_afsconf_LoadRealms>
afsconf_OpenInternal+0x965: movl $0x0,-0x24(%rbp)
afsconf_OpenInternal+0x96c: jmp +0x6
afsconf_OpenInternal+0x96e: nop
afsconf_OpenInternal+0x970: jmp +0x2
afsconf_OpenInternal+0x972: nop
afsconf_OpenInternal+0x974: movl -0x24(%rbp),%eax
afsconf_OpenInternal+0x977: leave
afsconf_OpenInternal+0x978: ret
0x4e4349: nop
0x4e434c: nop
ParseHostLine: pushq %rbp
It actually crashes in _afsconf_LoadRealms(), so we need a little bit more debug info:
$ truss -u a.out -u :: -vall /usr/afs/bin/fileserver $args
/1: -> _afsconf_LoadRealms(0x831790290, 0x1, 0x3, 0x0, 0x5bbbe8, 0x8317952bc)
/1: -> libc:malloc(0x28, 0x1, 0x3, 0x0, 0x28, 0x8317952bc)
/1: - -="" 0x8317965c0="" libc:malloc=""> libc:memset(0x317965c0, 0x0, 0x28, 0x0, 0x28, 0x8317952bc)
/1: Incurred fault #6, FLTBOUNDS %pc = 0x7FFD55C802CC
/1: siginfo: SIGSEGV SEGV_MAPERR addr=0x317965C0
/1: Received signal #11, SIGSEGV [default]
/1: siginfo: SIGSEGV SEGV_MAPERR addr=0x317965C0
It fails just after the first malloc() followed by memset() in _afsconf_LoadRealms(); the corresponding source code is:
local_realms = malloc(sizeof(struct afsconf_realms));
if (!local_realms) {
    code = ENOMEM;
    goto cleanup;
}
memset(local_realms, 0, sizeof(struct afsconf_realms));
The code looks fine... but notice in the above truss output that memset() is using a different pointer to what malloc() returned. It might be a bug in truss, but since this is where it crashes it is probably real. Let's confirm it with another tool, and maybe we can also spot a pattern.
$ dtrace -n 'pid$target::_afsconf_LoadRealms:entry{self->in=1}' \
-n 'pid$target::memset:entry/self->in/{printf("%p %d %d", arg0, arg1, arg2);}' \
-n 'pid$target::malloc:entry/self->in/{trace(arg0);}' \
-n 'pid$target::malloc:return/self->in/{printf("%p, %p", arg0,arg1);}' \
-c "/usr/afs/bin/fileserver $args"

3 99435 malloc:entry 40
3 99437 malloc:return 54, c62324a50
3 99433 memset:entry 62324a50 0 40

$ dtrace -n 'pid$target::_afsconf_LoadRealms:entry{self->in=1}' \
-n 'pid$target::memset:entry/self->in/{printf("%p %d %d", arg0, arg1, arg2);}' \
-n 'pid$target::malloc:entry/self->in/{trace(arg0);}' \
-n 'pid$target::malloc:return/self->in/{printf("%p, %p", arg0,arg1);}' \
-c "/usr/afs/bin/fileserver $args"

3 99435 malloc:entry 40
3 99437 malloc:return 54, 10288d120
3 99433 memset:entry 288d120 0 40

$ dtrace -n 'pid$target::_afsconf_LoadRealms:entry{self->in=1}' \
-n 'pid$target::memset:entry/self->in/{printf("%p %d %d", arg0, arg1, arg2);}' \
-n 'pid$target::malloc:entry/self->in/{trace(arg0);}' \
-n 'pid$target::malloc:return/self->in/{printf("%p, %p", arg0,arg1);}' \
-c "/usr/afs/bin/fileserver $args"

3 99435 malloc:entry 40
3 99437 malloc:return 54, de9479a10
3 99433 memset:entry ffffffffe9479a10 0 40
It looks like the lowest 4 bytes of the pointer returned from malloc() and passed to memset() are always preserved, while the top 4 bytes are mangled. I was curious how it looks when ASLR is disabled:
$ elfedit -e 'dyn:sunw_aslr disable' /usr/afs/bin/fileserver
$ dtrace -n 'pid$target::_afsconf_LoadRealms:entry{self->in=1}' \
-n 'pid$target::_afsconf_LoadRealms:return{self->in=0}' \
-n 'pid$target::memset:entry/self->in/{printf("%p %d %d", arg0, arg1, arg2);}' \
-n 'pid$target::malloc:entry/self->in/{trace(arg0);}' \
-n 'pid$target::malloc:return/self->in/{printf("%p, %p", arg0,arg1);}' \
-c "/usr/afs/bin/fileserver $args"

1 99436 malloc:entry 40
1 99438 malloc:return 54, 5bd170
1 99434 memset:entry 5bd170 0 40
... [ it continues as it doesn't crash ]
Now the pointer passed to memset() is the same as the one returned from malloc() - notice however that it is 32-bit (all daemons are compiled as 64-bit). Let's have a look at the core again where it actually fails:
_afsconf_LoadRealms+0x59: call -0xa7c36 <malloc>
_afsconf_LoadRealms+0x5e: movl %eax,%eax
_afsconf_LoadRealms+0x60: cltq
_afsconf_LoadRealms+0x62: movq %rax,%r8
_afsconf_LoadRealms+0x65: movq %r8,-0x20(%rbp)
_afsconf_LoadRealms+0x69: movq -0x20(%rbp),%r8
_afsconf_LoadRealms+0x6d: cmpq $0x0,%r8
_afsconf_LoadRealms+0x71: jne +0xd <_afsconf_LoadRealms+0x80>
_afsconf_LoadRealms+0x73: movl $0xc,-0x18(%rbp)
_afsconf_LoadRealms+0x7a: jmp +0x1f5 <_afsconf_LoadRealms+0x274>
_afsconf_LoadRealms+0x7f: nop
_afsconf_LoadRealms+0x80: movl $0x28,-0x48(%rbp)
_afsconf_LoadRealms+0x87: movl $0x0,-0x44(%rbp)
_afsconf_LoadRealms+0x8e: movq -0x48(%rbp),%r8
_afsconf_LoadRealms+0x92: movq %r8,%rdx
_afsconf_LoadRealms+0x95: movl $0x0,%esi
_afsconf_LoadRealms+0x9a: movq -0x20(%rbp),%rdi
_afsconf_LoadRealms+0x9e: movl $0x0,%eax
_afsconf_LoadRealms+0xa3: call -0xa7d10

Bingo! See the movl and cltq instructions just after the return from malloc(). malloc() is returning a 64-bit address, but the compiler thinks it returns a 32-bit value, so it clears the top 4 bytes and then sign-extends the result back to 64 bits, and that is what gets passed to memset(). With ASLR disabled we just happen to get an address low enough that the lowest 4 bytes suffice to represent it, so we don't see the issue; with ASLR we usually end up with a much higher address, where you can't just chop off the top four bytes.

Compilers do this when a function is called without a declaration (an implicit function declaration): they assume it returns an int, which on x86_64 means 32 bits. The fix was trivial - all that was required was to add #include <stdlib.h> and recompile. Now the compiler knows that malloc() returns a 64-bit pointer, the movl/cltq instructions are gone, and we get no more crashes.

by milek ( at August 29, 2014 11:27 AM

Evaggelos Balaskas

Web Proxy Autodiscovery Protocol with dnsmasq

It seems that you can push a WPAD file to desktops via DHCP.

My proxy is based on squid running on 8080.

I've built a WPAD file similar to the one below:


function FindProxyForURL(url, host) {
        return "PROXY; DIRECT";
}

The next thing is to publish it via a web server.
I am using thttpd for static pages/files:

how to test it:

# curl -L

After that, a simple entry in dnsmasq:


and restart your dnsmasq

Don't forget to do a DHCP release on your Windows machine.
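For reference, the dnsmasq entry (stripped out of the post above) is most likely the conventional WPAD mechanism: DHCP option 252 pointing at the wpad.dat served by thttpd. A sketch, with a placeholder hostname:

```
# /etc/dnsmasq.conf -- hand out the WPAD URL via DHCP option 252
# (the URL below is a placeholder for your own web server)
dhcp-option=252,"http://wpad.example.com/wpad.dat"
```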

Tag(s): dnsmasq, squid, WPAD

August 29, 2014 08:28 AM

Chris Siebenmann

A hazard of using synthetic data in tests, illustrated by me

My current programming enthusiasm is a 'sinkhole' SMTP server that exists to capture incoming email for spam analysis purposes. As part of this it supports matching DNS names against hostnames or hostname patterns that you supply, so you can write rules like:

@from reject host .boring.spammer with message "We already have enough"

Well, that was the theory. The practice was that until very recently this feature didn't actually work; hostname matches always failed. The reason I spent so much time not noticing this is that the code's automated tests passed. Like a good person I had written the code to do this matching and then written tests for it, in fact tests for it even (and especially) in the context of these rules. All of these tests passed with flying colours, so everything looked great (right up until it clearly failed in practice while I was lucky enough to be watching).

One of the standard problems of testing DNS-based features (such as testing matching against the DNS names of an IP address) is that DNS is an external dependency and a troublesome one. If you make actual DNS queries to actual Internet DNS servers, you're dependent on both a working Internet connection and the specific details of the results returned by those DNS servers. As a result people often mock out DNS query results in tests, especially low level tests. I was no exception here; my test harness made up a set of DNS results for a set of IPs.

(Mocking DNS query results is especially useful if you want to test broken things, such as IP addresses with predictably wrong reverse DNS.)

Unfortunately I got those DNS results wrong. The standard library for my language puts a '.' at the end of all reverse DNS results, e.g. the result of looking up the name of is (currently) '' (note the trailing dot). Most standard libraries for most languages don't do that, and while I knew that Go's was an exception I had plain overlooked this while writing the synthetic DNS results in my tests. So my code was being tested against 'DNS names' without the trailing dot and matched them just fine, but it could never match actual DNS results in live usage because of the surprise final '.'.

This shows one hazard of using synthetic data in your tests: if you use synthetic data, you need to carefully check that it's accurate. I skipped doing that and I paid the price for it here.

(The gold standard for synthetic data is to make it real data that you capture once and then use forever after. This is relatively easy in languages with a REPL but is kind of a pain in a compiled language where you're going to have to write and debug some one-use scratch code.)

Sidebar: how the Go library tests deal with this

I got curious and looked at the tests for Go's standard library. It appears that they deal with this by making DNS and other tests that require external resources optional (and by hardcoding some names and e.g. Google's public DNS servers). I think that this is a reasonably good solution to the general issue, although it wouldn't have solved my testing challenges all by itself.

(Since I want to test results for bad reverse DNS lookups and so on, I'd need a DNS server that's guaranteed to return (or not return) all sorts of variously erroneous things in addition to some amount of good data. As far as I know there are no public ones set up for this purpose.)

by cks at August 29, 2014 04:33 AM

Build a $35 Time Capsule - Raspberry Pi Time Machine Backup Server

This is a simple guide on building a $35 Time Capsule with a Raspberry Pi. A Time Capsule is a network-attached storage device from Apple for use with their Time Machine. Time Machine gives users a very easy and user-friendly way to automatically create and restore (encrypted) backups. A Time Capsule is basically an expensive NAS that only talks the AFP/netatalk protocol. The 2 TB version costs $299 at this time; a Raspberry Pi, only $35.

August 29, 2014 12:00 AM

August 28, 2014

Everything Sysadmin

I'm giving away 1 free ticket to PuppetConf 2014!

PuppetConf 2014 is the 4th annual IT automation event of the year, taking place in San Francisco September 20-24. Join the Puppet Labs community and over 2,000 IT pros for 150 track sessions and special events focused on DevOps, cloud automation and application delivery. The keynote speaker lineup includes tech professionals from DreamWorks Animation, Sony Computer Entertainment America, Getty Images and more.

If you're interested in going to PuppetConf this year, I will be giving away one free ticket to a lucky winner who will get the chance to participate in educational sessions and hands-on labs, network with industry experts and explore San Francisco. Note that these tickets only cover the cost of the conference (a $970+ value), but you'll need to cover your own travel and other expenses (discounted rates available). You can learn more about the conference at:

To enter, click here.

Even if you aren't selected, you can save 20% off registration by reading the note at the top of the form.

ALL ENTRIES MUST BE RECEIVED BY Tue, September 3 at 8pm US/Eastern time.

August 28, 2014 06:28 PM

Evaggelos Balaskas

dnsmasq with custom hosts file

aka: ban sites with dnsmasq

I've already said it too many times, but dnsmasq is a beautiful project for a SOHO (small office/home office) environment.

I am using it as a DNS caching server, DHCP server & tftpd (PXE) server and it's amazing.

One thing I do with the DNS section is "ban" URLs I don't like. Think something like AdBlock on Firefox.
Two configuration changes:


as root

wget -O /etc/hosts.txt && 



in /etc/dnsmasq.conf


You can also put the wget command in your crontab with the @monthly scheduler, but then you need to restart dnsmasq every month!

Another amazing thing is that you can add your own entries:

echo >> /etc/hosts.txt

restart your dnsmasq service and check it:

# dig @localhost +short
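For anyone reconstructing the stripped config above: the dnsmasq side of this recipe is most likely the documented addn-hosts option, with the downloaded hosts file mapping unwanted names to (hostnames here are placeholders):

```
# /etc/dnsmasq.conf -- serve extra host records from the downloaded file
addn-hosts=/etc/hosts.txt

# Entries in /etc/hosts.txt look like standard hosts-file lines, e.g.:
#   ads.example.com
```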
Tag(s): dnsmasq

August 28, 2014 05:29 PM

Rich Bowen

Apache httpd at ApacheCon Budapest

tl;dr - There will be a full day of Apache httpd content at ApacheCon Europe, in Budapest, November 17th -


* ApacheCon website -
* ApacheCon Schedule -
* Register -
* Apache httpd -

I'll be giving two talks about the Apache http server at ApacheCon, in a little over 2 months.

On Monday morning (November 17th) I'll be speaking about Configurable Configuration in httpd. New in Apache httpd 2.4 is the ability to put conditional statements in your configuration file which are evaluated at request time rather than at server startup time. This means that you can have the configuration adapt to the specifics of the request - like, where in the world it came from, what time of day it is, what browser they're using, and so on. With the new If/ElseIf/Else syntax, you can embed this logic directly in your configuration.
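To give a flavor of the syntax, here is a small illustrative fragment of my own (not taken from the talk; the X-Layout header is invented and mod_headers is assumed to be loaded):

```
# Adapt the response to the request, evaluated per request (httpd 2.4+)
<If "%{HTTP_USER_AGENT} =~ /Mobile/">
    Header set X-Layout "mobile"
</If>
<ElseIf "%{TIME_HOUR} -lt 6">
    Header set X-Layout "night"
</ElseIf>
<Else>
    Header set X-Layout "desktop"
</Else>
```

Note that unlike most programming languages, If/ElseIf/Else are separate paired container sections, each with its own closing tag.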

2.4 also includes mod_macro, and a new expression evaluation engine, which further enhance httpd's ability to have a truly flexible configuration language.

Later in the day, I'll be speaking about mod_rewrite, the module that lets you manipulate requests using regular expressions and other logic, also at request time. Most people who have some kind of website are forced to use mod_rewrite now and then, and there's a lot of terrible advice online about ways to use it. In this session, you'll learn the basics of regular expression syntax, and how to correctly craft rewrite expressions.

There's other httpd content throughout the day, and the people who created this technology will be on hand to answer your questions, and teach you all of the details of using the server. We'll also have a hackathon running the entire length of the conference, where people will be working on various aspects of the server. In particular, I'll be working on the documentation. If you're interested in participating in the httpd docs, this is a great time to learn how to do that, and dive into submitting your first patch.

See you there!

by rbowen at August 28, 2014 02:18 PM

Chris Siebenmann

One reason why we have to do a major storage migration

We're currently in the process of migrating from our old fileservers to our new fileservers. We're doing this by manually copying data around instead of anything less user-visible and sysadmin-intensive. You might wonder why, since the old and new architectures are the same (with ZFS, iSCSI backends, mirroring, and so on). One reason is that the ZFS 4K sector issue has forced our hand. But even if that wasn't there, it turns out that we probably would have had to do a big user-visible migration because of a collision of technology and politics (and ZFS limitations). The simple version of the organizational politics is that we can't just give people free space.

Most storage migrations will increase the size of disks involved, and ours was no exception; we're going from a mixture of 750 GB and 1 TB drives to 2 TB drives. If you just swap drives out for bigger ones, pretty much any decent RAID or filesystem can expand into the newly available space without any problems at all; ZFS is no exception here. If you went from 1 TB to 2 TB disks in a straightforward way, all of your ZFS pools would or could immediately double in size.

(Even if you can still get your original sized disks during an equipment turnover, you generally want to move up in size so that you can reduce the overhead per GB or TB. Plus, smaller drives may actually have worse cost per GB than larger ones.)

Except that we can't give space away for free, so if we did this we would have had to persuade people to buy all this extra space. There was very little evidence that we could do this, since people weren't exactly breaking down our doors to buy more space as it stood.

If you can't expand disks in place and give away (or sell) the space, what you want (and need) to do is to reshape your existing storage blobs. If you were using N disks (or mirror pairs) for a storage blob before the disk change, you want to move to using N/2 disks afterwards; you get the same space but on fewer disks. Unfortunately ZFS simply doesn't support this sort of reshaping of pool storage with existing pools; once a pool has so many disks (or mirrors), you're stuck with that many. This left us with having to do it by hand, in other words a manual migration from old pools with N disks to new pools with N/2 disks.

(If you've decided to go from smaller disks to larger disks you've already implicitly decided to sacrifice overall IOPs per TB in the name of storage efficiency.)

Sidebar: Some hacky workarounds and their downsides

The first workaround would be to double the raw underlying space but then somehow limit people's ability to use it (ZFS can do this via quotas, for example). The problem is that this will generally leave you with a bit less than half of your space allocated but unused on a long term basis unless people wake up and suddenly start buying space. It's also awkward if the wrong people wake up and want a bunch more space, since your initial big users (who may not need all that much more space) are holding down most of the new space.

The second workaround is to break your new disks up into multiple chunks, where one chunk is roughly the size of your old disks. This has various potential drawbacks but in our case it would simply have put too many individual chunks on one disk as our old 1 TB disks already had four (old) chunks, which was about as many ways as we wanted to split up one disk.

by cks at August 28, 2014 03:52 AM

August 27, 2014

Adrian C.

E-mail infrastructure you can blog about

The "e" in eCryptfs stands for "enterprise". Interestingly in the enterprise I'm in its uses were few and far apart. I built a lot of e-mail infrastructure this year. In fact it's almost all I've been doing, and "boring old e-mail" is nothing interesting to tell your friends about. With inclusion of eCryptfs and some other bits and pieces I think it may be something worth looking at, but first to do an infrastructure design overview.

I'm not an e-mail infrastructure architect (even if we make up that term for a moment), or in other words I'm not an expert in MS Exchange, IBM Domino and some other "collaborative software", and most importantly I'm not an expert in all the laws and legal issues related to E-mail in major countries. I consult with legal departments, and so should you. Your infrastructure designs are always going to be driven by corporate e-mail policies and local law - which can, for example, require from you to archive mail for a period of 7-10 years, and do so while conforming with data protection legislation... and that makes a big difference on your infrastructure. I recommend this overview of the "Climategate" case as a good cautionary tale. With that said I now feel comfortable describing infrastructure ideas someone may end up borrowing from one day.

E-mail is critical for most business today. Wait, that sounds like a stupid generalization. As a fact I can say this for types of businesses I've been working with; managed service providers and media production companies. They all operate with teams around the world and losing their e-mail system severely degrades their ability to get work done. That is why:

The system must be highly-available and fault-tolerant

Before I go on to the pretty pictures I have to note that good network design and engineering I am taking as a given here. The network has to be redundant well in advance of services. Network engineers I worked with were very good at their jobs and I had it easy, inheriting good infrastructure.

The first layer deployed on the network is the MX frontend. If you already have, or rent, an HA frontend that can sustain abuse traffic it's an easy choice to pull mail through it too. But your mileage may vary, as it's not trivial to proxy SMTP for a SPAM filter. If the filter sees connections only from the LB cluster it would be impossible for it to perform well; no rate limiting, no reputation scoring... I prefer HAProxy. People making it are great software engineers and their software and services are superior to anything else I've used (it's true I consulted for them once as a sysadmin but that has nothing to do with my endorsements). The HAProxy PROXY protocol, or TPROXY mode can be used in some cases. Or if you are a Barracuda Networks customer instead you might have their load balancers which are supposed to integrate with their SPAM firewalls, but I've been unable to find a single implementation detail to verify their claim. Without load balancers using the SPAM filtering cluster as the MX, and load balancing across it with round-robin DNS is a common deployment:

Network diagram

I wouldn't say much about the SPAM filter; obviously it's supposed to do a very good job at rating and scanning incoming mail, and everyone has their favorites. My own favorite classifier component for many years has been the crm114 discriminator, but you can't expect (many) people to train their own filters, and it takes 3-6 months to achieve >99% accuracy; Gmail has spoiled the world. The important thing in the context of the diagram above is that the SPAM filter needs to be redundant, and that it must have the capability to spool incoming mail if all the Mailstore backends fail.

The system must have backups and DR fail-over strategy

For building the backend, the "Mailstores", some of my favorites are Postfix, sometimes Qmail, and Dovecot. It's not relevant, but I guess someone would want to hear that too.

eCryptfs (stacked) file-system runs on top of the storage file-system, and all the mailboxes and spools are stored on it. The reasons for using it are not just related to data protection legislation. There are other solutions and faster too, block-level or hardware-based solutions for doing full disk encryption. But, being a file-system eCryptfs allows us to manipulate mail on the individual (mail) file or (mailbox) directory level. Encrypted mail can be transferred over the network to the remote backup backend very efficiently because of it. If you require, or are allowed to do, snapshots they don't necessarily have to be done at the (fancy) file-system or volume level. Common ext4/xfs and a little rsync hard-links magic work just as well (up to about 1TB on cheap slow drives).

When doing backup restores or a backend fail-over eCryptfs keys can be inserted into the kernel keyring, and data mounted on the remote file-system to take over.

The system must be secure

Everyone has their IPS and IDS favorites, and implementations. But those, together with firewalls, application firewalls, virtual private networks, access controls, two-factor authentication and file-system encryption... still do not make your private and confidential data safe. E-mail is not confidential as SMTP is a plain-text protocol. I personally think of it as being in the public domain. The solution to authenticating correspondents and to protecting your data and intellectual property of your company, both in transit and stored on the Mailstore, is PGP/GPG encryption. It is essential.

Even then, confidential data and attachments from mailboxes of employees will find their way onto your project management suite, bug tracker, wiki... But that is another topic entirely. Thanks for reading.

by anrxc at August 27, 2014 08:27 PM

Yellow Bricks

Day 2 #VMworld Report

Day 2 at VMworld. Today was going to be the first of two EVO:RAIL sessions for me… but before we get there, first the keynote. The keynote was great in my opinion. It felt a bit more loosened up than last year; it looked like they were having fun up on stage, and that seemed to resonate well with the crowd. There were a couple of things which stood out to me: CloudVolumes, Project Fargo (VMFork), EVO:RAIL and of course the awesome integration of vCAC and NSX / VVOL / VSAN.

After the keynote I had to go straight to my session on EVO:RAIL. I presented with Dave Shanley, the lead engineer, and the room was packed. This session was a repeat which unfortunately ended up being scheduled in a room which was way too small. They managed though to let 50 extra people in, standing room only! However, there were still people waiting in the hallway all the way to the end and around the corner. I hope that those who did not get in will be able to make the session today in Salon 7 in the Marriott at 11:30, as it fits way more people. It was a good session in my opinion and we received some excellent feedback. The best feedback came from two of our direct competitors, who both acknowledged they loved the user experience and the simplicity that EVO:RAIL offers.

After my session I went straight to the EVO ZONE. Wow, that place is packed every minute of the day, and when I say packed I mean packed. Great interest from customers and partners around what it is, what it does, and how they can buy one. Some awesome conversations with a customer who had a use case for ROBO deployments, 1500 sites. He said: no longer will I need skilled IT people to manage those sites, because of the simplicity of this interface but also the deployment mechanism. You inject a JSON file with all configuration attributes, then click "just go" and you are done in minutes.

At the end of my booth visit I walked around the solutions exchange and met a lot of great folks. After that it was time for the Office of CTO party. It was great to see a lot of people at the party I had not seen at the event yet. It was definitely a well organized event, with great food, music and people.

"Day 2 #VMworld Report" originally appeared on Yellow Bricks. Follow me on twitter - @DuncanYB.

Pre-order my upcoming book Essential Virtual SAN via Pearson today!

by Duncan Epping at August 27, 2014 06:21 PM

Everything Sysadmin


The schedule for LISA 2014 has been published and OMG it looks like an entirely new conference. By "new" I mean "new material"... I don't see slots filled with the traditional topics that used to get repeated each year. By "new" I also mean that all the sessions are heavily focused on forward-thinking technologies instead of legacy systems. This conference looks like the place to go if you want to move your career forward.

LISA also has a new byline: LISA: Where systems engineering and operations professionals share real-world knowledge about designing, building, and maintaining the critical systems of our interconnected world.

The conference website is here:

I already have approval from my boss to attend. I can't wait!

Disclaimer: I helped recruit for the tutorials, but credit goes to the tutorial coordinator and Usenix staff who did much more work than I did.

August 27, 2014 04:28 PM

How sysadmins can best understand Burning Man

So, you've heard about Burning Man. It has probably been described to you as a hippie festival, a place where rich white people go to act cool, naked chicks, or a drug-crazed dance party in the desert. All of those descriptions are wrong... ish. Burning Man is a lot of things and can't be summarized in a sound bite. I've never been to Burning Man, but a lot of my friends are burners, most of whom are involved in organizing their own group that attends, or volunteer directly with the organization itself.

Imagine 50,000 people (really!) showing up for a 1-week event. You essentially have to build a city and then remove it. As it is federal land, they have to "leave no trace" (leave it as clean as they found it). That's a lot of infrastructure to build and projects to coordinate.

We, as sysadmins, love infrastructure and are often fascinated by how large projects are managed. Burning Man is a great example.

There is a documentary called Burning Man: Beyond Black Rock which not only explains the history and experience that is Burning Man, but it goes into a lot of "behind the scenes" details of how the project management and infrastructure management is done.

You can watch the documentary many ways:

There is a 4 minute trailer on (warning: autoplay)

The reviews on IMDB are pretty good. One noteworthy review says:

I cannot say enough about the job these filmmakers did and the monumental task they took on in making this film. First, Burning Man is a topic that has been incredibly marginalized by the media to the point of becoming a recurring joke in The Simpsons (although the director of the Simpsons is at Burning Man every year), and second, to those who DO know about it, its such a sensitive topic and so hard to deliver something that will please the core group.

Well, these guys did it, and in style I must say. This doc is witty, fast moving, and most importantly profoundly informational and moving without seeming too close to the material.

I give mad props to anyone that can manage super huge projects so well. I found the documentary a powerful look at an amazing organization. Plus, you get to see a lot of amazing art and music.

I highly recommend this documentary.

Update: The post originally said the land is a national park. It is not. It is federal land. (Thanks to Mike for the correction!)

August 27, 2014 02:28 PM

Chris Siebenmann

The difference between Linux and FreeBSD boosters for me

A commentator on my entry about my cultural bad blood with FreeBSD said, in very small part:

I'm surprised that you didn't catch this same type of bad blood from the linux world. [...]

This is a good question and unfortunately my answers involve a certain amount of hand waving.

First off, I think I simply saw much less of the Linux elitism than I did of the FreeBSD elitism, partly because I wasn't looking in the places where it probably mostly occurred and partly because by the time I was looking at all, Linux was basically winning and so Linux people did less of it. To put it one way, I'm much more inclined towards the kinds of places where you found *BSD people in the early 00s than the kinds of places that were overrun by bright-eyed Linux idiots.

(I don't in general hold the enthusiasms of bright-eyed idiots of any stripe against a project. Bright-eyed idiots without enough experience to know better are everywhere and are perfectly capable of latching on to anything that catches their fancy.)

But I think a large part of it was that the Linux elitism I saw was both of a different sort than the *BSD elitism and also in large part so clearly uninformed and idiotic that it was hard to take seriously. To put it bluntly, the difference I can remember seeing between the informed people on both sides was that the Linux boosters mostly just argued that it was better while the *BSD boosters seemed to have a need to go further and slam Linux, Linux users, and Linux developers while wrapping themselves in the mantle of UCB BSD and Bell Labs Unix.

(Linux in part avoided this because it had no historical mantle to wrap itself in. Linux was so clearly ahistorical and hacked together (in a good sense) that no one could plausibly claim some magic to it deeper than 'we attracted the better developers'.)

I find arguments and boosterism about claimed technical superiority to be far less annoying and offputting than actively putting down other projects. While I'm sure there were Linux idiots putting down the *BSDs (because I can't imagine that there weren't), they were at most a fringe element of what I saw as the overall Linux culture. This was not true of FreeBSD and the *BSDs, where the extremely jaundiced views seemed to be part of the cultural mainline.

Or in short: I don't remember seeing as much Linux elitism as I saw *BSD elitism and the Linux boosterism I saw irritated me less in practice for various reasons.

(It's certainly possible that I was biased in my reactions to elitism on both sides. By the time I was even noticing the Linux versus *BSD feud we had already started to use Linux. I think before then I was basically ignoring the whole area of PC Unixes, feuds and all.)

by cks at August 27, 2014 06:26 AM

Yellow Bricks

Call to action: #VMworld attendees, please give back and throw a paper airplane!

Yesterday at VMworld I went to the hangspace as my friends from the VMware Foundation were there. They have a great section in the hangspace where you can give back to the community by simply throwing a paper airplane. The program is called “Destination Give Back” and is a great opportunity for VMware to share our values through YOU, the community / customers /partners!

Before I explain how it works: the VMware Foundation has committed to donating a maximum of $250,000.

How It Works:

  • We invite attendees, as members of the VMware community, to have some fun and Give Back with us by flying a paper airplane.
  • Attendees learn about a few sample causes (Children, Education, Environment, Human Rights, Women and Girls) selected by VMware employees through the stories on display, and choose the cause they are passionate about.
  • People get to test their ingenuity by building and flying a paper airplane for the cause they have chosen. The distance the plane travels determines the amount that will be donated by the VMware Foundation to the cause of choice.

So to be clear, you as the person throwing the plane will not make the donation… throwing the plane as far as you possibly can is all you need to do. The farther it goes, the higher the donation will be. 83 feet is the record so far. So make sure to stop by the hangspace, spend 2 minutes of your VMworld time, and give back!

"Call to action: #VMworld attendees, please give back and throw a paper airplane!" originally appeared on Yellow Bricks. Follow me on twitter - @DuncanYB.

Pre-order my upcoming book Essential Virtual SAN via Pearson today!

by Duncan Epping at August 27, 2014 12:57 AM

August 26, 2014

Ubuntu Geek

MysqlReport – Makes a friendly report of important MySQL status values

Mysqlreport makes a friendly report of important MySQL status values. Actually, it makes a friendly report of nearly every status value from SHOW STATUS. Unlike SHOW STATUS, which simply dumps over 100 values to screen in one long list, mysqlreport interprets and formats the values and presents the basic values and many more inferred values in a human-readable format.

The benefit of mysqlreport is that it allows you to very quickly see a wide array of performance indicators for your MySQL server which would otherwise need to be calculated by hand from all the various SHOW STATUS values. For example, the Index Read Ratio is an important value but it’s not present in SHOW STATUS; it’s an inferred value (the ratio of Key_reads to Key_read_requests).
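As a sketch of the kind of inferred value described above (hypothetical code, not mysqlreport's actual implementation), the Index Read Ratio can be derived from two standard SHOW STATUS counters like this:

```python
# Hypothetical sketch: deriving the "Index Read Ratio" from the standard
# SHOW STATUS counters Key_reads and Key_read_requests, as described above.
def index_read_ratio(status):
    """status: a dict mapping SHOW STATUS variable names to integers."""
    requests = status["Key_read_requests"]
    if requests == 0:
        return None  # no index read requests yet; the ratio is undefined
    # Key_reads counts index reads that had to hit disk, so a low ratio
    # means most index lookups were served from the key cache.
    return status["Key_reads"] / requests

sample = {"Key_read_requests": 1_000_000, "Key_reads": 4_200}
print(f"Index read ratio: {index_read_ratio(sample):.4f}")  # 0.0042
```

The same pattern applies to the other inferred values: take two or more raw counters, combine them, and present the result with a label a human can act on.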
Read the rest of MysqlReport – Makes a friendly report of important MySQL status values (53 words)

© ruchi for Ubuntu Geek, 2014.

by ruchi at August 26, 2014 11:40 PM

Rich Bowen

LinuxCon NA 2014

Last week I attended LinuxCon North America in Chicago.

As always when I go to a conference, there are about 5 things going on at any moment; one has to decide where to be and what to do, and then wish you'd done the other thing.

I spent most of the time working the Red Hat booth, talking to people about RDO, OpenShift, Atomic, and, of course, 3D printing.

I also spent a little time over at the OpenStack booth, although it was mostly staffed pretty well without me. The cool thing about the OpenStack booth was the representation from many different companies, all working together to make OpenStack successful, and the ability to be cordial - even friendly - in the process.

While I didn't attend very many talks, there were a few that I made it to, and some of these were really great.

Rikki Endsley's talk You Know, for Kids! 7 Ideas for Improving Tech Education in Schools was largely a story about an unfortunate experience in a high school programming class, and the lessons learned from it. I'm very interested in stories like this, primarily because I want to teach my daughters, and also my son, how to deal with gender discrimination in their various interests, although it seems particularly troublesome in geekly pursuits.

Guy Martin's talk Developing Open Source Leadership was brilliant. He talked about how to participate in Open Source projects, and encourage your employees to do so, for the specific goal of establishing your company as a leader in a particular field. While this sounds like it may be about subverting the character of Open Source for your own financial benefit, it didn't go that direction at all. Instead, he talked about being a good community citizen, and truly establishing leadership by participating, not merely by gaming the system. This was a great talk, and well worth attending if you happen to see him giving it again.

The 3D printing keynotes on Friday were very high in geek factor, and, as we had a 3D printer at the Red Hat booth, I probably learned more about 3D printing last week than about anything else.

A large part of the value of the conference (as with most tech conferences these days) was the evening and hallway conversations, evening events, dinner with various people, and conversations with people stopping by the booth. The technical content is always useful. The personal connections and stories are absolutely the most valuable thing. Running into old friends and making new ones is also always a highlight of these events.

Wednesday evening, I participated in an event where we talked with the folks from Chicago CTO Forum about The Apache Software Foundation. That was a lot of fun, and I learned at least as much as I taught.

Thursday was superhero day, with various people dressing up as their favorite heroes. Alas, I didn't take a costume, but several of my coworkers did.

A final highlight of the conference (and of which I have no photos) was the running tour of the city. Friday morning, City Running Tours took me and 20 or so other runners on a historical and architectural tour of downtown Chicago. We ran about 4 miles, stopping every half mile or so for a history lesson. It was fascinating, as well as being a good run.

by rbowen at August 26, 2014 06:35 PM

Evaggelos Balaskas

[old] GPG key

I have decided to expire my current PGP key:


0x5882be3def6dc21a is the long version!

in 30 days from now, on 25 Sep 2014.

You can still use it to send me encrypted messages, and I will use it to digitally sign emails (and other stuff) until that day.

After the 25th of Sep you may assume that this key is no longer valid.

I haven’t decided yet if I want to upload or advertise my new GPG key.

August 26, 2014 10:10 AM

Yellow Bricks

Day 1 #VMworld Report

It is Monday evening and this morning the madness started: VMworld 2014. Pat Gelsinger’s keynote was of course up first, and the highlight for me was definitely the unveiling of the project (EVO:RAIL) I have worked on for the last 18 months or so. It is just amazing to see everything come together: a huge engineering effort, the architectural aspects, business development and alliances work, etc. So many things going on, and a truly unique experience to take something from conception to release. I am glad I was provided the opportunity to have this experience and be part of this team. What I personally found interesting about the keynote, and also Carl’s talk, was the customer angle. Many different testimonies from customers who have been deploying SDDC in their datacenters, with explanations of how it simplified their lives. The vCloud Air announcements were also interesting; that is definitely a space I will be watching in the future. I can’t wait for Ben Fathi’s keynote tomorrow, as we will get more tech detail and cool demos. I am hoping I will have a wifi connection tomorrow, so I can do some tweeting or live blogging.

After the keynote I had my first session… well, not really “my” session but a nice collaborative effort of many people, though the father of it all was Vaughn Stewart. Vaughn invited me to be part of the VMware team on a game show. The game show was a bit chaotic to be honest; there were some challenges with the slide deck, which is a shame, but I still hope people enjoyed it. I thought it was entertaining, but there is definitely room for improvement.

Second session of the day was a panel with fellow bloggers/vExperts: Rick Scheerer, Chad Sakac, William Lam, Scott Lowe and me. As expected, a fair amount of questions on NSX and EVO:RAIL. If you weren’t there, Derek Seaman managed to write down most questions and answers, thanks Derek! I always like these sessions as you do not know what people will ask, and it is always a mix of technical and even personal questions. Early scores for this session are off the charts! I hope our repeat on Wednesday will be just as good!

I am still heavily jetlagged, but looking forward to tomorrow, although I hope I will at least get a couple hours of sleep today.

"Day 1 #VMworld Report" originally appeared on Yellow Bricks. Follow me on twitter - @DuncanYB.

Pre-order my upcoming book Essential Virtual SAN via Pearson today!

by Duncan Epping at August 26, 2014 05:47 AM

Chris Siebenmann

Why I don't like HTTP as a frontend to backend transport mechanism

An extremely common deployment pattern in modern web applications is not to have the Internet talk HTTP directly to your application but instead to put it behind a (relatively lightweight) frontend server like Apache, nginx, or lighttpd. While this approach has a number of advantages, it does leave you with the question of how you're going to transport requests and replies between your frontend web server and your actual backend web application. While there are a number of options, one popular answer is to just use HTTP; effectively you're using a reverse proxy.

(This is the method of, for example, Ruby's Unicorn and clones of it in other languages such as Gunicorn.)

As it happens, I don't like using HTTP for a transport this way; I would much rather use something like SCGI or FastCGI or basically any protocol that is not HTTP. My core problem with using HTTP as a transport protocol can be summarized by saying that standard HTTP does not have transparent encapsulation. Sadly that is jargon, so I'd better explain this better.

An incoming HTTP request from the outside world comes with a bunch of information and details; it has a source IP, a Host:, the URL it was requesting, and so on. Often some or many of these are interesting to your backend. However, many of them are going to be basically overwritten when your frontend turns around and makes its own HTTP request to your backend, because the new HTTP request has to use them itself. The source IP will be that of the frontend, the URL may well be translated by the frontend, the Host: may be forced to the name of your backend, and so on. The problem is that standard HTTP doesn't define a way to take the entire HTTP request, wrap it up intact and unaltered, and forward it off to some place for you to unwrap. Instead, pieces of the original request have to be reused and overwritten in the frontend-to-backend HTTP request, so what your backend sees is a mixture of the original request plus whatever changes the frontend had to make in order to make a proper HTTP request to it.

You can hack around this; for example, your frontend can add special headers that contain copies of the information it has to overwrite and the backend can know to fish the information out of these headers and pretend that the request had them all the time. But this is an extra thing on top of HTTP, not a standard part of it, and there are all sorts of possibilities for incomplete and leaky abstractions here.
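To illustrate the hack (a hypothetical sketch, not a standard: the X-Forwarded-* names are a widespread convention rather than part of HTTP, and the reconstruction logic here is illustrative only), a WSGI-style backend might fish the original details back out like this:

```python
# Sketch of the header hack described above: the frontend copies the
# information it is about to overwrite into conventional (nonstandard)
# X-Forwarded-* headers, and the backend pretends the request always
# carried those values. The WSGI environ keys are standard; the
# unwrapping policy is an assumption, not part of any spec.
def unwrap_forwarded(environ):
    """Restore the client's original details from X-Forwarded-* headers."""
    # Real client IP: first entry of X-Forwarded-For, if the frontend set it.
    xff = environ.get("HTTP_X_FORWARDED_FOR")
    if xff:
        environ["REMOTE_ADDR"] = xff.split(",")[0].strip()
    # Original Host:, overwritten because the proxy's own request needed it.
    xfh = environ.get("HTTP_X_FORWARDED_HOST")
    if xfh:
        environ["HTTP_HOST"] = xfh
    # Original scheme (http/https), lost because the backend hop is plain HTTP.
    xfp = environ.get("HTTP_X_FORWARDED_PROTO")
    if xfp:
        environ["wsgi.url_scheme"] = xfp
    return environ

env = {"REMOTE_ADDR": "127.0.0.1",
       "HTTP_X_FORWARDED_FOR": "203.0.113.7, 10.0.0.2",
       "HTTP_X_FORWARDED_HOST": "example.org"}
print(unwrap_forwarded(env)["REMOTE_ADDR"])  # prints "203.0.113.7"
```

Note the leaky-abstraction danger baked into even this tiny sketch: if the frontend ever forgets to strip an X-Forwarded-For header supplied by a malicious client, the backend will happily trust a spoofed address.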

A separate transport protocol avoids all of this by completely separating the client's HTTP request to the frontend from the way it's transported to the backend. There's no choice but to completely encapsulate the HTTP request (and the reply) somehow and this enforces a strong separation between HTTP request information and transport information. In any competently designed protocol you can't possibly confuse one for the other.

Of course you could do the same thing with HTTP by defining an HTTP-in-HTTP encapsulation protocol. But as far as I know there is no official or generally supported protocol for this, so you won't find standard servers or clients for such a thing the way you can for SCGI, FastCGI, and so on.
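For contrast, here is a rough sketch of how SCGI frames a request: every piece of request metadata travels as an explicit name/value pair inside a netstring, so nothing has to be overloaded onto the transport hop itself. This follows the SCGI spec's framing, but the helper and the sample values are illustrative, not a production client:

```python
# Minimal sketch of SCGI request encapsulation (per the SCGI protocol):
# headers are NUL-terminated name/value pairs inside a netstring, with
# CONTENT_LENGTH required first and an SCGI=1 marker header, followed
# by the raw request body.
def scgi_encode(headers, body=b""):
    pairs = [("CONTENT_LENGTH", str(len(body))), ("SCGI", "1")]
    pairs += list(headers.items())
    blob = b"".join(k.encode() + b"\0" + v.encode() + b"\0" for k, v in pairs)
    # Netstring framing: "<length>:<blob>," then the body bytes.
    return str(len(blob)).encode() + b":" + blob + b"," + body

# Hypothetical request: the client's source IP and Host: ride along as
# ordinary pairs, completely separate from the socket the bytes use.
req = scgi_encode({"REQUEST_METHOD": "GET",
                   "REMOTE_ADDR": "203.0.113.7",
                   "HTTP_HOST": "example.org"})
```

Because the original REMOTE_ADDR and HTTP_HOST are just data inside the envelope, the backend cannot confuse them with anything about the frontend-to-backend connection itself, which is exactly the strong separation argued for above.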

(I feel that there are other pragmatic benefits of non-HTTP transport protocols, but I'm going to defer them to another entry.)

Sidebar: another confusion that HTTP as a transport causes

So far I've talked about HTTP requests, but there's an issue with HTTP replies as well because they aren't encapsulated either. In a backend server you have two sorts of errors: errors for the client (which should be passed through to them) and errors for the frontend server that tell it, for example, that something has gone terribly wrong in the backend. Because replies are not encapsulated you have no really good way of telling these apart. Is a 404 error a report from the web application to the client, or an indication that your frontend is trying to talk to a missing or misconfigured endpoint on the backend server?

by cks at August 26, 2014 04:14 AM

Administered by Joe. Content copyright by their respective authors.