GCN July 13, 1998
Software glitches leave Navy Smart Ship dead in the water
By Gregory Slabodkin GCN Staff
The Navy's Smart Ship technology may not be as smart as the service contends.
Although PCs have reduced workloads for sailors aboard the Aegis missile cruiser USS Yorktown, software glitches resulted in system failures and crippled ship operations, according to Navy officials.
Navy brass have called the Yorktown Smart Ship pilot a success in reducing manpower, maintenance and costs. The Navy began running shipboard applications under Microsoft Windows NT so that fewer sailors would be needed to control key ship functions.
But the Navy last fall learned a difficult lesson about automation: The very information technology on which the ships depend also makes them vulnerable. The Yorktown last September suffered a systems failure when bad data was fed into its computers during maneuvers off the coast of Cape Charles, Va.
The ship had to be towed into the Naval base at Norfolk, Va., because a database overflow caused its propulsion system to fail, according to Anthony DiGiorgio, a civilian engineer with the Atlantic Fleet Technical Support Center in Norfolk.
"We are putting equipment in the engine room that we cannot maintain and, when it fails, results in a critical failure," DiGiorgio said. It took two days of pierside maintenance to fix the problem.
The Yorktown has been towed into port after other systems failures, he said.
Atlantic Fleet officials acknowledged that the Yorktown last September experienced what they termed "an engineering local area network casualty," but denied that the ship's systems failure lasted as long as DiGiorgio said. The Yorktown was dead in the water for about two hours and 45 minutes, fleet officials said, and did not have to be towed in.
"This is the only time this casualty has occurred and the only propulsion casualty involved with the control system since May 2, 1997, when software configuration was frozen," Vice Adm. Henry Giffin, commander of the Atlantic Fleet's Naval Surface Force, reported in an Oct. 24, 1997, memorandum.
Giffin wrote the memo to describe "what really happened in hope of clearing the scuttlebutt" surrounding the incident, he noted.
The Yorktown lost control of its propulsion system because its computers were unable to divide by the number zero, the memo said. The Yorktown's Standard Monitoring Control System administrator entered zero into the data field for the Remote Data Base Manager program. That caused the database to overflow and crash all LAN consoles and miniature remote terminal units, the memo said.
The program administrators are trained to bypass a bad data field and change the value if such a problem occurs again, Atlantic Fleet officials said.
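The memo does not describe the software's internals, so what follows is only an illustrative sketch in Python, not the Yorktown's code. It shows one way a program might reject a zero in an operator-entered data field before using it as a divisor, instead of letting the error spread. The names `BadFieldError`, `validate_divisor` and `compute_rate` are hypothetical.

```python
import math


class BadFieldError(ValueError):
    """Raised when an operator-entered field cannot be used safely."""


def validate_divisor(raw):
    """Convert operator input to a float; reject zero, NaN, infinity and non-numbers."""
    try:
        value = float(raw)
    except (TypeError, ValueError):
        raise BadFieldError(f"not a number: {raw!r}")
    if not math.isfinite(value) or value == 0.0:
        raise BadFieldError(f"value not allowed in this field: {raw!r}")
    return value


def compute_rate(total, raw_divisor, fallback=None):
    """Divide total by an operator-entered value; return fallback on bad input."""
    try:
        return total / validate_divisor(raw_divisor)
    except BadFieldError:
        # Hypothetical policy: keep running and hand back a safe default.
        return fallback
```

In this sketch a bad entry yields the caller's fallback value rather than an unhandled exception, which is roughly the automated equivalent of the manual bypass fleet officials described.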
But "the Yorktown's failure in September 1997 was not as simple as reported," DiGiorgio said.
"If you understand computers, you know that a computer normally is immune to the character of the data it processes," he wrote in the June U.S. Naval Institute's Proceedings Magazine. "Your $2.95 calculator, for example, gives you a zero when you try to divide a number by zero, and does not stop executing the next set of instructions. It seems that the computers on the Yorktown were not designed to tolerate such a simple failure."
The Navy reduced the Yorktown crew by 10 percent and saved more than $2.8 million a year using the computers. The ship uses dual 200-MHz Pentium Pros from Intergraph Corp. of Huntsville, Ala. The PCs and server run NT 4.0 over a high-speed, fiber-optic LAN.
Blame it on the OS
But according to DiGiorgio, who in an interview said he has serviced automated control systems on Navy ships for the past 26 years, the NT operating system is the source of the Yorktown's computer problems.
NT applications aboard the Yorktown provide damage control, run the ship's control center on the bridge, monitor the engines and navigate the ship when under way.
"Using Windows NT, which is known to have some failure modes, on a warship is similar to hoping that luck will be in our favor," DiGiorgio said.
Pacific and Atlantic fleets in March 1997 selected NT 4.0 as the standard OS for both networks and PCs as part of the Navy's Information Technology for the 21st Century initiative. Current guidance approved by the Navy's chief information officer calls for all new applications to run under NT.
Ron Redman, deputy technical director of the Fleet Introduction Division of the Aegis Program Executive Office, said there have been numerous software failures associated with NT aboard the Yorktown.
"Refining that is an ongoing process," Redman said. "Unix is a better system for control of equipment and machinery, whereas NT is a better system for the transfer of information and data. NT has never been fully refined and there are times when we have had shutdowns that resulted from NT."
The Yorktown has been towed into port several times because of systems failures, Redman said.
"Because of politics, some things are being forced on us that without political pressure we might not do, like Windows NT," Redman said. "If it were up to me I probably would not have used Windows NT in this particular application. If we used Unix, we would have a system that has less of a tendency to go down."
Although Unix is more reliable, Redman said, NT may become more reliable with time.
The Navy is moving the service's command and control applications from Unix to NT as part of IT-21. Under IT-21, the Navy also plans to modernize ships in the Atlantic and Pacific fleets with asynchronous transfer mode LANs. Large ATM networks running NT have already been installed on the USS Abraham Lincoln and USS Essex.
But DiGiorgio said the LANs might experience a chain reaction of computer failures like those experienced on the Yorktown. That domino effect is inherent to the system design of shipboard LANs, he said.
"There is very little segregation of error when software shares bad data," DiGiorgio said. "Instead of one computer knocking off on the Yorktown, they all did, one after the other. What if this happened in actual combat?"
Although the Yorktown did not have backup systems, Redman said that future Smart Ships will have systems redundancy to ensure that ships can continue to operate.
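The chain reaction DiGiorgio describes can be pictured with a small Python sketch. It is a hypothetical model, not the shipboard design: each console handles a shared record inside its own error boundary, so a bad record is rejected locally and the console keeps its last good reading. The `Console` class and the `load` and `capacity` field names are assumptions made for illustration.

```python
class Console:
    """Hypothetical LAN console that keeps its last good reading."""

    def __init__(self, name):
        self.name = name
        self.last_good = None
        self.rejected = 0

    def apply(self, record):
        """Process one shared record; on bad data, count it and keep the old value."""
        try:
            self.last_good = record["load"] / record["capacity"]
        except (ZeroDivisionError, KeyError, TypeError):
            # The error stays inside this console and is not re-raised.
            self.rejected += 1
        return self.last_good


def broadcast(consoles, record):
    """Deliver the same record to every console and collect each reading."""
    return {c.name: c.apply(record) for c in consoles}
```

Broadcasting a record with a zero `capacity` to several such consoles leaves every one of them running on its previous value, rather than failing one after the other.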
But DiGiorgio said that the Smart Ship project needs to do more engineering up front.
"Installing a control system on a warship and resolving problems as the project progresses is a costly and naive process," DiGiorgio wrote in the Proceedings article. "Now, with the top people rotated off the Smart Ship Project, it would be wise for the Navy to investigate this fiasco more fully."
Redman has a different perspective. "If it were me, I wouldn't say all the things that Tony [DiGiorgio] has said out of discretion and consideration for being a long-term employee," he said. "But I will say this about Tony, he's a very bright engineer."
"Everybody plays the obedience role where you cannot criticize the system," said DiGiorgio, a self-described whistle-blower. "I'm not that kind of guy."