Modem madness update


Path: sun.sirius.com!usenet
From: Don Hurter
Newsgroups: sirius.tech
Subject: Modem madness update
Date: 14 Sep 1995 05:14:29 GMT
Organization: Sirius Connections
Lines: 116

28.8k Modem status after a messy upgrade attempt

When it rains, it pours. We have been planning to add new 28.8k modems to the pool for a while, and at the same time decommission the early USR modems that have over time given us trouble. I came in early Tuesday morning, when I'd have a window of a few hours while only about 30% of the modems would see any use, and changed the hunting so that some of the modems could be taken down without any traffic on them. (I was even planning to announce a partial downtime, but then figured out a clever way to re-arrange the phone lines and equipment without anyone even knowing, which is how we normally handle minor repairs.) We have had a number of Multitech units online for a month, although they were at the tail of the hunting which means they didn't get hit all day long. Nonetheless they have been performing fairly reliably all this time, and when new units arrived we figured we could set them up, test each unit, then redirect the phone lines their way.

The new Multitechs installed without any trouble (other than one unit arriving DOA), and each one tested fine when we dialed in and confirmed speed, port address, etc. However, I could only install some of them, since we needed to remove the USR modems and their com servers to free up rack space and phone lines for the switch-over, hence the early morning operation. Nonetheless, everything switched over fine and I ran one last battery of tests before turning the 28.8k calls loose on the new Multitechs.

Later that morning the news server reported disk errors, which required a complete shutdown while Peter tried to recover the data on the disk. This event was discouraging, particularly since the new machine has been fairly thoroughly tested and the disks haven't even been in use for a month, and also because that eliminated a potential avenue to report any modem problems to sirius.announce. But hardware failure is part of the territory when networks and servers are involved.

At around 2 P.M. I saw that the new modems were under heavy use, without any obvious behavioral strangeness, and a few of us went out for lunch. When we returned, some of the tech support staff were frantically trying to figure out why everyone had started calling in with dialup problems, particularly endless ringing on 4728. We dialed in and found that a call placed to the beginning of the hunt group would ring and forward past half a dozen idle modems until one of them bothered to pick up the call, which had us a bit baffled since the earlier round of Multitechs never exhibited this behavior. We checked the cables, phone lines, portmasters, and the configurations of the modems themselves, and nothing stood out as the source of the trouble. Then we discovered that the new shipment of modem cards was of a later ROM revision, which suggested that something had changed for the worse from the more reliable older units. Unfortunately the Multitech offices had closed by that time, so we were on our own for any further troubleshooting.
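The symptom is easier to picture with a toy model. What follows is only a rough sketch in Python, not a description of how the phone switch is actually programmed: the Line class, the place_call function, and the ten-port layout are all made up for illustration. It assumes the switch hunts past busy lines and forwards a ringing call onward when an idle modem fails to pick up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Line:
    """One phone line and modem in the hunt group (hypothetical model)."""
    port: int
    busy: bool = False     # already carrying a call
    answers: bool = True   # False models an idle modem that ignores the ring


def place_call(hunt_group: List[Line]) -> Optional[int]:
    """Return the port that picks up, or None if nobody ever answers."""
    for line in hunt_group:
        if line.busy:
            continue            # the switch hunts past busy lines
        if line.answers:
            line.busy = True    # this modem takes the call
            return line.port
        # Idle but not answering: the caller hears ringing while the
        # call forwards on to the next line in the group.
    return None


# Ports 1-6 are idle but not answering; ports 7-10 behave normally.
group = [Line(port=p, answers=(p > 6)) for p in range(1, 11)]
first = place_call(group)
```

In this model a caller rings past six idle ports before port 7 picks up, and once every healthy port is busy a call rings without end, which matches what customers were reporting.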

We reset each modem and then dialed into it to see what might be wrong, but each time it performed fine. Only after the modems were put back in service for another few hours did they once again stop answering incoming calls. Again we compared the newer units against the old, and couldn't find a difference that explained the intermittent problems. Since it would have been a tremendous rewiring job to re-install the USR modems, we reluctantly forwarded the 28.8k calls to the 14.4 pool and called it a night.

Early Wednesday morning I called Multitech and after a 45-minute hold session talked with a support person. He hadn't heard about problems with the newer ROM revisions, but suggested a few AT commands we might try to force a reset after each call. Resetting after each call is standard modem-pool configuration, although the factory default on the Multitechs was set, rather ambiguously, to a different variation than one would assume for proper resets. It turned out that there is an optional management/controller interface that fits in each rack chassis, which we have for one unit, and that controller is what Multitech uses to fully reset the modems. We never put ours into service, because we found out after we received it that it uses an out-of-date ArcNet interface while we strictly run Ethernet among our machines, and we also don't use PC software for network management. (Multitech is yielding to pressure from all their ISP customers and will release an IP/Ethernet controller later this year.) The modems can and do work fine without the controller (at least the earlier ROM models, that is), but from the factory the newer units depend on the Multitech controller to accomplish a reset upon DTR drop. We hardwired the reset behavior using an AT string, and had to reconfigure each card all over again.
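To show why a reset on DTR drop matters in a modem pool, here is a deliberately simplified Python sketch. It is not Multitech firmware behavior: the Modem class, its state names, and the reset_on_dtr_drop flag are hypothetical, and it assumes a modem that is not reset after a call is left in a stale state that ignores the next ring. The real units failed only after a few hours in service, so the first-call failure here is an exaggeration for clarity.

```python
from dataclasses import dataclass


@dataclass
class Modem:
    """Toy modem with an assumed per-call reset option (hypothetical)."""
    reset_on_dtr_drop: bool
    state: str = "idle"

    def answer(self) -> bool:
        """Pick up an incoming call; only an idle modem answers."""
        if self.state != "idle":
            return False
        self.state = "connected"
        return True

    def dtr_drop(self) -> None:
        """The terminal server drops DTR when the session ends."""
        if self.state != "connected":
            return
        # With the reset enabled the modem returns to a clean idle state;
        # without it, this model leaves the modem stale (an assumption).
        self.state = "idle" if self.reset_on_dtr_drop else "stale"


good = Modem(reset_on_dtr_drop=True)
bad = Modem(reset_on_dtr_drop=False)
for modem in (good, bad):
    modem.answer()      # first caller connects on both modems
    modem.dtr_drop()    # first caller hangs up
```

After that first call, good.answer() succeeds again while bad.answer() does not, which is the kind of difference a per-card AT string has to remove when no controller is present to do the reset.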

In the meantime one of the 14.4k modems crapped out (only the second one in a year!), which caused a minor annoyance. But then a customer's dedicated ISDN circuit also dropped out of service, which required a drawn-out exchange with PacBell to rectify. During all this we needed to get _something_ back up for the 28.8k service, since reconfiguring and testing all the Multitechs would be a couple of hours of work, ignoring all the other interruptions that pepper our days. I had to mount the USRs back on the racks and wire them up, and then switch all the phone lines over to get them working before we could tackle the Multitechs. But of course the fun didn't stop there.

The SCSI controller on the web server decided to take a brief vacation, so that became a minor brush fire. Also, we had to submit an application for another round of IP addresses to InterNIC, placate an anxious vendor who was making week-long phone tag into an Olympic event, and figure out why no customer could reach certain Barrnet sites during that morning. The usual noise level of tech support, account billing, responding to mail, and everything else that clogs the arteries of everyday life only added to the general chaos that keeps us from dying of boredom here at Sirius.

We managed to reconfigure, rewire, and test all the Multitechs over the course of the afternoon, and hesitantly put a few back online, watching like nervous parents as they send their four-year-old across a divided highway for a cup of sugar. Twelve of the modems lit up in less than two minutes, and we monitored their activity looking for dropped lines or incomplete resets. As of this writing they seem to be working fine, so I'll leave them for the night before taking the USRs back down. Dealing with modem problems under full load is an experience that makes one yearn for more relaxed occupations like welterweight prize fighting or maybe air traffic control.

If you encounter problems with the 28.8k modems please report them here, but I ask that you try to identify which port you were connected to, if possible, and what time the trouble occurred. In the meantime I think I'll crawl into a grass hut somewhere and keep myself amused by clacking a few rocks together while not thinking about computers.

-- Don

This page is copyrighted 1993-2008 by Lila, Isaac, Rose, and Mickey Sattler. All rights reserved.