BlogicBlog: View from the trenches

A blog about Java and XML, with a focus on troubleshooting issues and tools.

Friday, October 29, 2004

Can't afford Weblogic AppServer. Check out the discussion of the free ones.

A discussion on JavaLobby is comparing AppServers on features and ease of development. Only free AppServers need apply.

BlogicBlogger Over and Out

Wednesday, October 27, 2004

Using rsync to synchronize with remote Weblogic

TheLastBlogger showed us that not every vendor limitation needs to be solved by a vendor's fix.

Hurray for flexible utilities and solutions! I hope TheLastBlogger will keep recording the interesting solutions discovered during development/migration.

BlogicBlogger Over and Out

Thursday, October 21, 2004

Perfect multicast storm

This is a story of a well-meaning default causing more problems than a randomly picked value would. Read it if you are running a BEA Weblogic cluster on a switched network, especially with a CISCO switch. Read it even if you do not run a WLS cluster, but are interested in TCP/IP voodoo.

Weblogic Server is a combination of many technologies. Quite a number of these technologies used to require a dedicated professional to configure and maintain. I am talking about such subsystems as SSL, transactional JDBC and - case in point - multicast. Back then, the user interfaces and procedures were arcane and nobody was surprised that a specialist was required to make sure things were working.

Nowadays, people seem to believe that the default settings for a subsystem they are not familiar with will fulfill their need - whatever that need may be. And most of the time, the defaults do seem to work, especially when tested in an environment that does not match the production setup in throughput, domain, or network layout (e.g. cluster). However, when the code is promoted to production, the unexpected happens.

For this example, let's talk about multicasting. By itself, multicasting is fairly heavy voodoo, and applications that require multicast are usually set up and configured by gurus/vendors who ensure that all the i's are dotted and t's are crossed.

But BEA Weblogic ships with and uses a multicast implementation as part of its clustering technology. As it is used in a limited scenario (single network, no switch cross-over, etc.), multicast administration is suddenly reduced to a small set of properties: primarily the multicast address and port (the others are Send Delay, TTL and buffer size). Look at the configuration page: all values are preset to defaults. In fact, you don't even need to see that page when you create your cluster; you have to go to it deliberately if anything needs to be changed. One way or another, the values are usually left at the defaults even for very large deployments.

Now, Weblogic's default multicast address is 237.0.0.1. If you cringed at this point, do not continue reading. You know way too much about TCP/IP, MAC addresses and multicast special cases. You are a guru.

For the rest of us, here is an explanation of why this default is a bad one.

If you read the multicast RFCs - and there are many of them (3170, 2236, 1112, 2365, etc.) - you will find that not all multicast addresses are equal. There is a magic address, 224.0.0.1, on which all IP hosts will listen, and this includes gateways (which covers CISCO routers and switches).

Now, 224.0.0.1 is NOT 237.0.0.1, so there should be no issue. And at the IP layer, there is none. The problem happens below the IP layer. OSI Layer 2 switches (of which a CISCO switch is an example) do not actually look at IP addresses directly. Instead they listen for the multicast MAC address. There is a direct mapping from multicast IP address to multicast MAC address, but it is not unique: only the low 23 bits of the IP address are copied into the MAC address. As a result, there are 32 different IP addresses corresponding to the same MAC address (as described in this book excerpt, section 1.6.3).

So, to follow the specification, a CISCO switch has to listen for the multicast address 224.0.0.1, which is MAC address 01-00-5e-00-00-01. But that is the MAC address for 237.0.0.1 as well. Therefore, every time a packet for 237.0.0.1 passes by, the CISCO switch firmware triggers a match.
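
For the curious, here is a minimal sketch of that IP-to-MAC mapping in plain JavaScript (meant to be run with Node.js). It is only an illustration: the function name multicastMac is made up for this example and has nothing to do with Weblogic or the switch firmware. It keeps the low 23 bits of the IP address and puts them behind the fixed 01-00-5e prefix.

```javascript
// Map an IPv4 multicast address to its Ethernet multicast MAC address.
// multicastMac is a hypothetical helper name used only for this illustration.
function multicastMac(ip) {
  var parts = ip.split('.');
  var valid = parts.length === 4 && parts.every(function (p) {
    return /^\d{1,3}$/.test(p) && Number(p) <= 255;
  });
  if (!valid) {
    throw new Error('Not an IPv4 address: ' + ip);
  }
  var octets = parts.map(Number);
  if (octets[0] < 224 || octets[0] > 239) {
    throw new Error('Not a multicast address: ' + ip);
  }
  // Fixed prefix 01-00-5e, then the low 23 bits of the IP address:
  // the top bit of the second octet is dropped, the last two octets are kept.
  var macBytes = [0x01, 0x00, 0x5e, octets[1] & 0x7f, octets[2], octets[3]];
  return macBytes.map(function (b) {
    return (b < 16 ? '0' : '') + b.toString(16);
  }).join('-');
}

console.log(multicastMac('224.0.0.1')); // 01-00-5e-00-00-01
console.log(multicastMac('237.0.0.1')); // 01-00-5e-00-00-01
```

The first octet only decides whether the address is multicast at all; it never reaches the MAC address, which is why the two addresses end up identical at Layer 2.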

Once the match is triggered, the software part of the switch examines the packet's full IP address and figures out that it was not 224.0.0.1, but just one of the other 31 IP addresses sharing that MAC. The packet is discarded and nobody is hurt.

Or so it seems. It turns out that these interrupts are relatively expensive. When a packet is supposed to be sent to a different network, this cost is part of doing business. But when it happens every time for fairly frequent internal cross-server chit-chat (e.g. JNDI tree updates), the switch gets interrupted too often. And a switch that is interrupted unexpectedly often may be sluggish to respond to legitimate requests.

This, under high enough load, can cause unexpected network timeouts that are nearly impossible to diagnose. After all, how often do you check your switch's CPU utilisation when your DNS misbehaves?

So, what's the solution? Easy! Switch to a different multicast address that does not map to the same MAC as 224.0.0.1. Weblogic will do exactly that in the next release.

The proposed new address is 239.192.0.0. If this makes you cringe, please let BEA know NOW! :-)
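
If you would rather check a candidate address yourself than take anybody's word for it, here is a rough sketch in plain JavaScript (Node.js). The helper names low23, sharesMac and macAliases are hypothetical, made up for this illustration. It only tests against 224.0.0.1, the one address discussed here; your network may have other special addresses worth checking as well.

```javascript
// Low 23 bits of an IPv4 address: all that the multicast MAC mapping keeps.
function low23(ip) {
  var o = ip.split('.').map(Number);
  return ((o[1] & 0x7f) << 16) | (o[2] << 8) | o[3];
}

// True when two multicast IP addresses map to the same Ethernet MAC address.
function sharesMac(a, b) {
  return low23(a) === low23(b);
}

// List the 32 multicast IP addresses that share a MAC address with the given one:
// 16 possible first octets (224-239) times 2 values of the dropped bit.
function macAliases(ip) {
  var o = ip.split('.').map(Number);
  var aliases = [];
  for (var first = 224; first <= 239; first++) {
    [o[1] & 0x7f, o[1] | 0x80].forEach(function (second) {
      aliases.push([first, second, o[2], o[3]].join('.'));
    });
  }
  return aliases;
}

console.log(sharesMac('237.0.0.1', '224.0.0.1'));   // true: the old default collides
console.log(sharesMac('239.192.0.0', '224.0.0.1')); // false: the proposed one does not
console.log(macAliases('224.0.0.1').length);        // 32
```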

BlogicBlogger Over and Out

Saturday, October 16, 2004

I want my Scopo

Scopo - a wearable display from Mitsubishi - is not going to be here until at least next year. I am really looking forward to it being available to the public. It would go a long way towards making wearable computing a reality.

They do not say what resolution it is, but the projection is equivalent to a 10-inch screen. I wonder if it is good enough to actually write programs. Surely, I could fit a vi session into it, even if it is at a 24x80 screen size.

I can barely wait.

BlogicBlogger Over and Out

Friday, October 15, 2004

Vegetable or Mineral: support's point of view

The article on Hacknot talks about testers and the grief developers face from them due to incorrect problem reporting practices. Of course, developers do exactly the same things when they are the ones reporting problems.

I am talking about Technical Support requests. When a developer (or an operations person, or - worse yet - an operations manager) calls with a support case, they do exactly the same things that Hacknot described for the testers.

As a very common example, the support case description will read 'Weblogic crashed without reason'. After extensive probing, the following things emerge:
  • The Weblogic process hasn't actually exited, but is still around. It just stopped responding. This means Weblogic hung, rather than crashed. Equally annoying, yet caused by a completely different set of problems.
  • 'No reason' turns out to be a new code drop that, in their opinion, changed nothing important and therefore was not worth mentioning.
  • Doing thread dumps and log/config analysis often shows a bottleneck and/or incorrect configuration. At this point, it usually emerges that the server load at the hang time was higher than ever before and no, it was not load tested in the same configuration (e.g. cluster with proxy) anywhere else.
And of course, if it is not an overly general description, it is 'Diagnosing Instead of Reporting' exactly as described in the article.

I have to admit that when I was a developer, I committed all those mistakes and more myself. I feel that working in BEA support has opened my eyes and makes the code I produce now more user friendly, with better reporting.

As a result, my thought on the QA/Dev separation is to actually fold developers into the testing team for short rotational periods.

The developer will then discover whether there are real causes behind the reporting problems (insufficient logs, misleading dialogs or just unclear logic), causes that will only be acknowledged as problems when a developer walks a mile in a tester's shoes.

The final benefit is that developers and testers have different ways of doing similar things, and an exchange of techniques is always beneficial.

BlogicBlogger Over and Out

Tuesday, October 12, 2004

Thinking of WebSphere 6 supportability features

News that IBM is releasing the next version of WebSphere real soon now is propagating through the websites. As a Weblogic support person, I was interested to see what new supportability features IBM has included. Unfortunately, the details are very vague.

From the general description (failover, failure detection and recovery, etc.), it sounds like features we have already had in Weblogic for a while. Some we have even had for a very long while (e.g. the WLS 6.1 proxy plugin that supports smart failover).

However, I did find one article that provided more serious details.

Specifically,

  • ClassLoaderViewer
  • Hang Detection code in 5.03, 5.1.1 and 6.0
  • LeakBot for memory leak detection
  • Session Data Crossover detection facility in 5.03, 5.1.1 and 6.0
  • Connection Leak detection
  • ThreadAnalyzer for deadlock debugging
  • Event Alerter for notification


Some of the items above Weblogic already has. JDBC leak detection has been there for a long time; a deadlock detector of sorts is built into the latest JRockit and there are several free-standing tools; an event alerter is possible via JMX, SNMP or custom loggers, etc.
Others, we probably should look into.

I will not be revealing any secrets by saying that BEA has recently started to pay serious attention to the supportability of its products, with Dev2Dev utils, support patterns and new support seminars. Even more interesting announcements in this area should come out soon.

However, seeing IBM put some real effort into the supportability area may spur yet more effort from the BEA side and will not do any harm to either company. The real benefits will of course be reaped by system administrators and operations staff.

Long live the healthy competition. As long as we win. :-)

BlogicBlogger Over and Out

Saturday, October 09, 2004

Podcasting loop: evolution continues

Sometimes the only good outcome of a post is to provoke somebody else to do one better. It seems that I have caused exactly that.

About a week ago, I wrote about extending the podcasting loop using Windows Scripting Host and the iTunes COM interface. In a comment, Scyro wrote that he disliked my approach and explained how to achieve the same effect using smart playlists instead.

I have read his article and actually like his approach more than my own. It is more automatic, does not require running any scripts and is cross-platform. I still prefer an explicit rating as feedback rather than play count, because I switch away from some tracks before the end or come back to them after finishing. It is easy to modify the solution either way.

Of course without my post, he may not have been motivated to write his. Between us, I believe, we explain how to make iPodding easier.

So, is my script still useful? Probably not for its original purpose. But if you want to delete the old tracks or do any other complex processing, you may still find it helpful as a base.

BlogicBlogger Over and Out

Monday, October 04, 2004

Who is blogging about Weblogic

There are some people out there blogging about Weblogic. Here is a short list, in the hope that this article will make them a bit more noticeable:
  • Cid Danis. Weblogic Senior support engineer
  • Unknown, but thorough in documenting test problems
  • Vinny Carpenter, who is using Weblogic and blogs good links and articles
  • News from BEA Dev2Dev. From the horse's mouth so to speak
  • Moazam Rajas. He is actually a SUN senior support engineer, but if you run Weblogic - or any other J2EE server - you will do well to read his articles

All of the blogs above have RSS feeds.

There are also a couple of interesting individual links:

If you use Weblogic, check these out.

BlogicBlogger Over and Out

Sunday, October 03, 2004

The power of GraphViz

Ever felt the need to extract some relations from configuration or data and present them in a visual form, nicely laid out? Ever given that up as too hard because of the problem of laying out the elements? If you did, then check out GraphViz.

While GraphViz by itself is not Java, it is cross-platform and is fairly easy to set up and invoke from Java. It is also open-source, if that is important to you.

So, what does GraphViz do? Well, it does fancy graph layouts. Sounds boring until you realize that graphs - or data that could become graphs - are actually all over the Java world.

Let me give you a few examples:
  • Ant task dependencies are graphs. If your Ant file is rather large, check out ant2dot, Vizant or AntGraph.
  • Spring dependencies are graphs. If your Spring project is getting complex, get an overview of it with SpringViz.
  • If your code is already deployed, but you experience hangs or slowdowns, Graphviz will help to visualize the thread dumps.
  • If you are not happy with the GIF/SVG/PNG/PS output that Graphviz is capable of generating, you could always try a more interactive approach with ZGRViewer (a bit raw, but a very promising project)
  • Many other uses exist, some of which are described at the GraphViz resource page
  • Finally, if you are able to generate a text file that says a->c; b->c; c->d, you can generate GraphViz input (a dot file) and let the program itself do the magic of layout. And once you master a simple dot file, you can start adding such bells and whistles as colors, clusters and structures.
If you have an itch to scratch, let GraphViz help you with it.
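
To show how little is needed to get started, here is a minimal sketch in plain JavaScript (Node.js) that turns the a->c; b->c; c->d example from the list above into a dot file. The function names toDot and quoteId are made up for this illustration; the only thing borrowed from GraphViz is the dot syntax itself.

```javascript
// Quote a node name for dot; a double quote inside the name is escaped.
// quoteId and toDot are hypothetical helper names used only for this illustration.
function quoteId(name) {
  return '"' + String(name).replace(/"/g, '\\"') + '"';
}

// Build the source of a directed graph from a list of [from, to] pairs.
function toDot(edges, graphName) {
  var lines = ['digraph ' + (graphName || 'G') + ' {'];
  edges.forEach(function (edge) {
    lines.push('  ' + quoteId(edge[0]) + ' -> ' + quoteId(edge[1]) + ';');
  });
  lines.push('}');
  return lines.join('\n');
}

console.log(toDot([['a', 'c'], ['b', 'c'], ['c', 'd']]));
```

Save the output as, say, graph.dot and run 'dot -Tpng graph.dot -o graph.png' to let GraphViz do the layout. The edge list could just as well come from an Ant file, a Spring config or a thread dump.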

BlogicBlogger Over and Out

Saturday, October 02, 2004

Extending the podcasting loop

So, you already use iPodder to listen to your ITConversations on the iPod with 1-click ease.

All is well, except that it is starting to get difficult to remember which shows you already listened to and which ones are still new. You can of course go to iTunes, find the already-listened-to track and manually delete it or move it to the 'archived' section. But that is very much against the 1-click spirit.

It would be very nice to be able to mark from inside the iPod which tracks you are no longer interested in. Except that the iPod is a read-only device.

Except that it is not! You can rate a track with 0-5 stars and that information will get synchronized to the iTunes on the next connection.

Introducing ArchiveTrack. Inspired by one of the iPodder implementations, it uses Windows Scripting Host to archive the tracks (in a given genre) using the rating value as a trigger.

The script is given here in its entirety. Feel free to modify it or incorporate it into other setups as long as this article is attributed.


/*
Run this with 'cscript ArchiveTracks.js'

The script goes through the songs in sourceGenre and
moves/archives the ones that have ratings under the threshold
into targetGenre.

Usage scenario:
1) Download podcasting content using iPodder (1-click)
2) Listen to it on your iPod
3) While you are listening, rate the track 1 or 2 stars (default script threshold)
4) After the next synchronization, run this script with iPod still connected

Version: 1.0b1 (Oct 2, 2004)
Author: Blogic Blogger
License: Use in any way desired, but mention the source and author
*/

try
{
    var sourceGenre = 'Speech'; // The genre used by ITConversations
    var targetGenre = 'Speech-Archive'; // The genre to collect all the listened content in
    var archiveThreshold = 50; // 1 or 2 stars
    var progressNoticeRepeatCount = 250; // Print a progress line every N tracks

    WScript.Echo('Connecting to iTunes');
    var iTunesApp = WScript.CreateObject("iTunes.Application");
    var mainLibrary = iTunesApp.LibraryPlaylist;

    var tracks = mainLibrary.Tracks;
    var songCount = tracks.Count;
    WScript.Echo('Total Song Count: ' + songCount);

    // Walk the list backwards, so that switching to 'aTrack.Delete();' below
    // does not shift the positions of the tracks that are still to be checked.
    for (var i = songCount; i >= 1; i--)
    {
        var checkedCount = songCount - i + 1;
        if ((checkedCount % progressNoticeRepeatCount) == 0)
        {
            WScript.Echo('Checked ' + checkedCount + ' items');
        }

        var aTrack = tracks.Item(i);
        if (aTrack.Genre == sourceGenre)
        {
            // Rating 0 means 'not rated', so unrated tracks are always kept
            if ((aTrack.Rating > 0) && (aTrack.Rating < archiveThreshold))
            {
                WScript.Echo();
                WScript.Echo(' Archiving: ' + aTrack.Name + ' by ' + aTrack.Artist + ' based on rating ' + aTrack.Rating);
                WScript.Echo();
                aTrack.Genre = targetGenre; // use 'aTrack.Delete();' to delete instead of archiving
            }
            else
            {
                WScript.Echo(' Keeping: ' + aTrack.Name + ' by ' + aTrack.Artist + ' based on rating ' + aTrack.Rating);
            }
        }
    }
}
catch (e)
{
    WScript.Echo('Problem archiving: ' + e);
}

try
{
    WScript.Echo('Requesting iPod update');
    iTunesApp.UpdateIPod();
    WScript.Echo('Successful iPod update request');
}
catch (e)
{
    WScript.Echo('Problem updating iPod ' + e);
}
WScript.Echo('Done!');
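
A note on the threshold value. As far as I can tell, iTunes reports ratings on a 0-100 scale with 20 points per star, which is why a threshold of 50 catches 1 and 2 stars but not 3. The little sketch below (plain JavaScript, runnable with Node.js) spells that out. Treat the scale as an assumption to verify against your iTunes version; the helper names starsToRating and shouldArchive are made up for this illustration and are not part of the script above.

```javascript
// Assumption: iTunes exposes ratings as 0-100, with 20 points per star.
var RATING_PER_STAR = 20;

// Convert a star count (0-5) into the assumed iTunes rating value.
function starsToRating(stars) {
  return stars * RATING_PER_STAR;
}

// Same condition as in the archiving script: the track has been rated
// (rating above 0) and the rating is below the threshold.
function shouldArchive(rating, threshold) {
  return rating > 0 && rating < threshold;
}

for (var stars = 0; stars <= 5; stars++) {
  console.log(stars + ' stars -> archive: ' + shouldArchive(starsToRating(stars), 50));
}
```

With the default threshold of 50, only 1-star and 2-star tracks are archived; unrated tracks are left alone. To archive 3-star tracks as well, the threshold would have to be anything above 60.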


BlogicBlogger Over and Out

Connecting the Microsoft Dots

When Win98 uptime was revealed to be never more than 49.7 days, Slashdot laughed. They could not imagine anybody actually wanting - or managing - to keep their Windows machine up for that long. I wonder if they would laugh now.

I wonder what other funny Microsoft problems are going to bite us well after their discovery.

BlogicBlogger Over and Out