Friday, December 31, 2010

Where, oh where, is that configuration set?

Did I say last time that the most difficult thing about transitioning to Linux systems was figuring out how to get basic system information? I did? Well, let me change that. I will now say one of the biggest challenges is determining where to find configuration information pertaining to programs you want to run on Linux. There are 101 different places to stick things, and while this is undoubtedly one of the features that makes Linux so appealing to do-it-yourselfers, it makes it a little challenging for those of us who like order.

I'm working through Apache and Tomcat 6 again. I've found that there's no better way to get some of this stuff under my hat than to do it over and over again until I pretty much don't need to look at a guide or notes or anything like that. Really, I want to obtain a very deep understanding of the technology and not just be able to type in commands that someone else provided. I really want to know the why of what I'm doing in addition to the how. But I digress.

So, I'm working through Apache and reading through their documentation as I go along to better explain some of the directives available and the inner architecture of the service. I got to the point where Tomcat can be used to serve up Java applications. Since the Tomcat container exists as its own app, you have to use a plug-in to get Apache to talk to it. For this you install mod_jk, which Apache uses to talk to Tomcat over the AJP13 protocol. When you install mod_jk, it creates a file called workers.properties. In all of the setups we have in-house, workers.properties is located in /etc/apache2. My install had no such file, which led me to believe that maybe I hadn't installed the module properly, since it's supposed to be created by default. I checked, and sure enough jk.load existed in /etc/apache2/mods-enabled, indicating that the installation was successful, so I went off to discover why this file was missing.

I found that workers.properties is by default actually created in /etc/libapache2-mod-jk; whoever configured these systems moved that file to /etc/apache2. I checked and there is indeed a workers.properties file in the default location as well, but it's not being used, because the JkWorkersFile directive dictates which file gets read. More interesting news is that some of the properties I have previously seen outlined in apache2.conf or httpd.conf, such as the directive that actually loads this module and other directives that control logging, can be defined pretty much anywhere. Some installations have this set up in the jk.load file, others in apache2.conf, others in ports.conf, others in httpd.conf...the sky's the limit. As long as it's included from the main configuration file, you could put it in a file named peanutbutterjellytime if you wanted.
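For reference, here's a minimal sketch of the two pieces, assuming Tomcat's AJP connector is listening on its default port 8009 (the worker name and mount path below are made up for illustration):

    # /etc/libapache2-mod-jk/workers.properties (default location on Ubuntu)
    worker.list=worker1
    worker.worker1.type=ajp13
    worker.worker1.host=localhost
    worker.worker1.port=8009

    # Apache side; these can live in jk.load, apache2.conf, or any included file
    JkWorkersFile /etc/libapache2-mod-jk/workers.properties
    JkLogFile     /var/log/apache2/mod_jk.log
    JkLogLevel    info
    JkMount       /myapp/* worker1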

What this means to me is a sharper learning curve: not only do you need to understand the concepts, you also have to play detective to find where the configuration files and directives even live on a given system. I have two servers here running the same combination of applications, and they don't even match in that regard. My desire for standardization is fighting with the flexibility of Linux.

Saturday, December 18, 2010

Gathering Hardware Information in Linux

Undoubtedly one of the biggest struggles in getting used to working in Linux (specifically a server OS) is learning how to do the things you did with Windows and the GUI in a non-GUI environment. You really have to work on retraining your brain and your eyes.

I've started working on a risk management assessment exercise for my company. I've been using the documentation and structure provided by NIST (www.nist.gov), and one of the first steps is to properly and fully identify your systems. I had already gone through and inventoried all of the servers and switches and made note of things like the service tags, warranty expirations, number of drives, etc., but this exercise called for more in-depth information. Most of our servers, particularly the client-serving ones, are in a colocation facility, so it's not possible to simply look at the hardware and ascertain things like whether the RAID controller is embedded or an HBA, what kinds of drives are installed, and so on. Windows provides the Device Manager as a very quick and easy way to get hardware information, down to some pretty good detail. In Linux I found the same task to be more challenging: there are a variety of tools and utilities to use; they're not all included in your base install; the output can be cryptic and difficult to get through.

Here's a quick list of the resources I used/tried out to get the information I needed, starting with the ones with which I was already familiar and progressing to new finds.

/proc
This virtual directory has a lot of information about the current running system configuration, including some hardware components. Reading meminfo or cpuinfo gives good detail about the memory and the processor(s) installed, as well as CPU capabilities from the flags field.
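For example, these are straight reads with nothing to install (the grep patterns just pull out the fields mentioned above):

    head -2 /proc/meminfo               # MemTotal and MemFree
    grep 'model name' /proc/cpuinfo     # processor model, one line per core/thread
    grep -m1 flags /proc/cpuinfo        # CPU capability flags (vmx, lm, etc.)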

dmidecode
I like the amount of information you get from this tool. I wish it were a standard part of a base install because I think it's just easy to quickly get the info you need, especially if you make use of the flags or regular expressions to pull out the specific info you're looking for.
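A few invocations along those lines; -t selects a hardware type and -s pulls out a single string, which is handy for scripting (the install line assumes a Debian/Ubuntu box where the tool isn't already present):

    sudo apt-get install dmidecode           # if it's not already on the system
    sudo dmidecode -t system                 # make, model, serial number
    sudo dmidecode -t memory                 # DIMM slots, sizes, speeds
    sudo dmidecode -s system-serial-number   # just the serial; on Dells this is the service tag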


lspci
Very quick way to get basic information about the peripherals installed in the system. It essentially reads the kernel's PCI device data exposed under /proc/bus/pci (or sysfs on newer kernels).
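For example, to answer the embedded-controller-versus-HBA question from above:

    lspci                                # one line per PCI device
    lspci | grep -Ei 'raid|sas|scsi'     # quick way to spot the storage controller
    lspci -v                             # verbose: capabilities, IRQs, memory ranges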

mpt-status
This tool turned out to be a lifesaver. I could not figure out how to get information on the physical drives in my system since it was using hardware RAID, which masks the underlying physical structure of the drives from the OS. There is also apparently an added challenge when using LSI Logic RAID controllers, which is the default embedded controller installed in a lot of these cheaper Dell 1U servers. I had a pretty good guess that it was a 2-drive RAID 1 configuration since it was the same make/model as another onsite system, and I also had the info from Dell's site thanks to the service tag. I never like to let things like that lie, though. What would happen if I didn't have all of that information? I should definitely know how to get it. The mpt-status tool (http://freshmeat.net/projects/mptstatus/) does not come installed either, but it's a fairly straightforward process to get it through package management. This tool lets you monitor the health and status of your RAID setup.
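Getting it going on Ubuntu was roughly this; the mptctl kernel module has to be loaded before the tool can query the controller (the probe flag is there in case the default SCSI ID doesn't match your setup):

    sudo apt-get install mpt-status
    sudo modprobe mptctl        # kernel interface to the LSI controller
    sudo mpt-status             # volume state (e.g. RAID 1 OPTIMAL) and member disks
    sudo mpt-status -p          # probe for the right SCSI ID if the default fails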

So, loads of options really for gathering the info I needed. The challenge again was simply figuring out which tools were best suited for my tasks, and then deciphering the output when I used them. That seems to be one of the main aspects of Linux in general, though. I would love to be able to use Dell's OMSA tool, but they don't officially support/make it for Ubuntu. There are various articles out there on compiling it from source and tweaking it to get it to work, but that's not the kind of thing I'm psyched to do on a production box. Even if I got it working on a test machine I would still be loath to try it out because our servers are all different. They don't all have the same kernel versions, packages, etc. I think I will use the roundabout method for now.



Thursday, December 9, 2010

Bad Websites and Crappy Images

www.cisco.com

I don't know what the problem is. Such a huge company; surely they have some QA folks who review the site for usability. Right? This is another of those little things I miss about working for a large company, the kind that creeps up on you. If I had a question about a service contract, we had a team of talented product professionals who took care of that for us. I had no idea how squirrely it is to navigate those waters until this week, when I began trying to track down the warranty information for our firewalls. I was looking for two simple pieces of information: did we have a contract, and when did it expire?

Cisco's site is difficult to navigate, and that's putting it mildly. I wound up sending a request yesterday to have my login tied to my company so that I could view service contracts, if there were any. That happened relatively quickly thanks to the magic of email and dealing with a person. However, when I tried to use that access today to actually get any information about the contract, it failed. Miserably. I was unable to gather anything from the main page. There were no obvious links to any kind of warranty status. The only way I was able to even verify that I had coverage was by starting the process of creating a TAC request, a little hint that I picked up from searching the web.

I finally wound up calling them and asking for the information. The rep I spoke with first was pretty rude to me, going so far as to remark that she'd never had to call a woman by a guy's name before (I suppose she's never heard of Stevie Nicks or Billie Holiday). Thanks for sharing. She then directed me to a website, www.cisco.com/go/cscc. Now, I'm the first person to second-guess myself and wonder if I simply missed something, so I asked if this was linked to on their website and I had simply overlooked it. Nope, she said. They'd taken it down on purpose. I was too confused and surprised to even ask why in the world they would take down a link to something so useful. Maybe to make room for more Flash animation.

After ending our unpleasant conversation I went to that site and attempted to use the Tool link to access my contract info. I got an error page instead. I tried again. Same error. Wound up calling Cisco again. This time I spoke to a much more pleasant woman who both gave me the support end date for our firewalls and answered my questions about the site in a professional manner. Turns out whoever set up my access forgot to tick the box that actually allows me to view the contracts online. Sweet.

Last but not least, I wanted to see if we'd had any support calls on these devices since they'd been covered. Seemed like a pretty typical request to me. She was unable to help me but transferred me to the support team. Wow. Seriously, you'd think I'd asked the guy there if I could get a tech onsite to help me jump my car. He seemed confused by the request, and the end result was that you apparently can only view a support history by the name of the person who logged the incident, and not by the support contract. Seems weird, right? Every incident would be, I imagine, linked to my support contract (otherwise how would you know I had a right to the support), but you can't simply look at my support contract and tell me what tickets have been opened. I can do this through my ISP, but I can't do this through my world-renowned network equipment provider. Mind-boggling. If I can come up with the login credentials for my predecessor, though, they'll give me all the info I want about past cases. Again I say: sweet.

On to imaging. For whatever reason, the simple imaging solution offered by one Curtis Preston alas does not work. At this point I have spent more time trying to get it to work than it deserves, and I have officially called it quits on this experiment. I have moved on to a more promising alternative, Mondo Rescue. Also free, also able to image a live server. We'll see how this pans out.
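I haven't settled on a final invocation yet, but a basic live-server image to ISO looks something like this (the paths are placeholders; -E excludes the destination directory so the archive doesn't try to swallow itself):

    # -O back up this system, -i to ISO images, -d destination, -s media size
    sudo mondoarchive -Oi -d /var/backup/mondo -E /var/backup/mondo -s 4480m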

Sunday, December 5, 2010

Fusion Startup Issue

I tried to set up Ubuntu as a virtual machine on my Mac using VMware Fusion, and was getting error messages about not being able to open /dev/vmmon and such every time. Did some searching and found this post about the background process not starting properly. I tried that command and it worked. I ultimately intend to find out why this happened and fix the root issue, but this at least gets me off the ground.
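For the record, the commonly cited fix for the /dev/vmmon error is restarting Fusion's background services by hand, something along these lines (the path may vary by Fusion version):

    sudo "/Library/Application Support/VMware Fusion/boot.sh" --restart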

Backup Provider Showdown

My research into a backup solution led me to two final contestants: i365 (formerly EVault) and Venyu (formerly AmeriVault). I had worked with both products in my previous life as a consultant, so I knew a bit about the basics of the offerings going in. I had decided vaulting was the way to go because I absolutely, positively had no interest in dealing with tape, and it seemed that it was financially a better decision as well. Tape has a pretty hefty upfront cost: the licensing and support agreements for the software, the purchase of a tape drive and tapes, and the recurring hardware warranty for the drive and replacing tapes as they age. In addition, tape backups have a much higher rate of failure than vaulting. My old company did a report on the backup success of our clients using tape, and man was it ugly! Lastly, we have two sites, and we were definitely not going to put a tape solution in place at both, so we would still have been copying content from one location to the other, which is exactly the kind of complexity I wanted to avoid.

I got it down to these two vendors. I had researched as many cheap or free solutions as I could find, but in all honesty nothing fit the bill for an infrastructure like ours. If we had straight Linux across the board I think we would have been able to go with Amanda or something like that, although not without a lot of time spent by me figuring it out, since I'm new to Ubuntu specifically and have never used Amanda before. However, we have an SBS 2003 box that needs to be backed up, and doing anything with Exchange beyond basic flat-file backups is a tall order with open source tools.

Both had similar pricing (Venyu was $1.70 per compressed GB, i365 was $1.00 per uncompressed GB), the product was essentially the same since Venyu is apparently a reseller of i365's software, and the general offering differed in only a couple of minor ways.
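To put those prices side by side: which one is cheaper depends entirely on your compression ratio. Assuming, purely for illustration, 100 GB of native data compressing 2:1, i365 would run $100 while Venyu would run $1.70 × 50 GB = $85; at any ratio worse than about 1.7:1, i365 comes out ahead.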

I requested a trial with Venyu and worked with them for ~2-3 weeks getting a vault sent to us for the initial seed, installing the agents, and setting up the Windows central control software and the web console. Everything was pretty straightforward and worked, with the exception of the web console and the Linux agents. For whatever reason, some of the Linux boxes would not talk to the web console, and this was an important feature for us. The web console allows you to check the backup jobs, change them, and do restores from anywhere using the website. You can also technically do all of your backups through this web console, so you don't have to install a Windows box somewhere just to do backups, which is what we had to do for the colo. One Linux box was running Fedora and that worked fine, but the Ubuntu boxes did not. Upon trying to get this to work and failing, I was informed that Ubuntu was not officially supported as a distro. This wasn't the first time I'd told them that we ran Ubuntu, but it was the first time I was hearing that they only supported certain distros of Linux. They sent me to the release notes for the Linux agent for reference. Thanks. That may have been more helpful when we were initially discussing our environment.

The real problem, though, was that while they saw the agents were not registering online, they also seemed unconcerned about it. As long as it was checking in to the Windows-based console, it was not a big deal. This despite the fact that I had told the sales rep, who was my liaison, very clearly that the web component was a big factor for us. The tech with whom I was working was very laissez-faire about the whole thing. He saw the agents weren't working and was like, "Yeah, we'll come back to that." He never did. It was interesting because in general I enjoy being casual with vendors and other folks with whom I work, but his demeanor often left me feeling like he was unprepared and rushing through things. Lots of "Why isn't this working? Oh, I forgot to do x" and that sort of thing. I didn't need to hear that kind of running commentary because, although I know it happens, it's not a confidence-builder to hear it happening. It also happened that this guy admitted during our first conversation that he wasn't a Linux guy, wouldn't be able to do much troubleshooting if something went wrong, and would need to send issues off to someone else. Why would this company assign us a non-Linux guy when our environment is 90% Linux?

I tried i365 out as well after this all came to pass, and they were better in every way on the customer service side. Their trial was not as comprehensive; in short, they didn't want to arrange a free complete run-through of our environment, which is what Venyu did. I wound up getting the go-ahead to test a web agent, and in all honesty that was enough. I installed the agent on one of our Ubuntu boxes that had not worked with Venyu, and it showed up online instantly. No problems there. I was unable to configure the agents at first, though, because they wouldn't communicate. I worked with one of their tech support people for less than 10 minutes, and they said it was likely the firewall. Open ports blah and blah to our servers, which are blah and blah, and let's test it. Did that, and it worked. Just like that. That pretty much sold me on i365 right then and there.

My breakdown of the experience is as follows:

Venyu

The good
  • good pricing
  • easy-to-use interface
  • generous trial policy
  • flexible vaulting (size grows with demand, which could be seen as bad since it's easy to go over)
The bad
  • communication
  • online agents for Linux don't work
  • preparation (lack of knowledge from my technical contact)
  • lack of drive/investment

i365

The good
  • great communication
  • Linux agents work
  • strong technical support with good follow-through
  • easy-to-use product
The bad
  • higher cost
  • vaulting size is fixed and based on native size (even though it's compressed on their end)

In the end, i365 is the one that will get our business.